Interobserver agreement for detecting Hill-Sachs lesions on magnetic resonance imaging
Abstract
Background
Our aim was to determine the interobserver reliability of surgeons detecting Hill-Sachs lesions on magnetic resonance imaging (MRI), the certainty of their judgement, and the effects of surgeon characteristics on agreement.
Methods
MRIs of 29 patients with Hill-Sachs lesions or other lesions of similar appearance were presented to 20 surgeons without any patient characteristics. The surgeons answered questions on the presence of Hill-Sachs lesions and the certainty of their diagnosis. Interobserver agreement was assessed using Fleiss' kappa (κ) and the percentage of agreement. Agreement between surgeons was compared using a technique similar to the pairwise t-test for means, based on a large-sample linear approximation of Fleiss' kappa, with Bonferroni correction.
Results
The agreement between surgeons in detecting Hill-Sachs lesions on MRI was fair (69% agreement; κ, 0.304; p<0.001). In 84% of the cases, surgeons were certain or highly certain about the presence of a Hill-Sachs lesion.
Conclusions
Although surgeons reported high levels of certainty for their ability to detect Hill-Sachs lesions, there was only a fair amount of agreement between surgeons in detecting Hill-Sachs lesions on MRI. This indicates that clear criteria for defining Hill-Sachs lesions are lacking, which hampers accurate diagnosis and can compromise treatment.
INTRODUCTION
During anterior shoulder dislocation, the head of the humerus can be pressed against the antero-inferior part of the glenoid rim and cause an impression fracture of the posterior superior lateral humeral head, known as a Hill-Sachs lesion [1]. The incidence of these Hill-Sachs lesions is reported to be between 40% and 90% for patients with anterior instability and could be as high as 100% for patients with recurrent dislocation [2]. Furthermore, humeral bone loss associated with a Hill-Sachs lesion can increase the risk of recurrent dislocation depending on the size and location of the lesion [1]. Treatment algorithms, such as the instability severity index score and glenoid track instability management score, have been developed to assess whether instability could be treated with a soft-tissue procedure or a bony procedure [3]. In these treatment algorithms, a more aggressive approach is recommended based on the presence of factors associated with a higher rate of recurrent instability, and a Hill-Sachs lesion is one of these factors. Since the presence of a Hill-Sachs lesion informs the choice of treatment, healthcare providers must agree on whether one is present.
A Hill-Sachs lesion can be detected on radiographic imaging, but computed tomography (CT) and magnetic resonance imaging (MRI) are more sensitive [4,5]. Traditionally, CT scans were obtained to assess humeral and glenoid bone loss. In contrast to CT, MRI does not expose patients to radiation, and assessment of the soft tissue can be more accurate [6]. Therefore, MRI is the imaging modality preferred by orthopedic shoulder surgeons [7]. Saqib et al. [8] recently reported high sensitivity and specificity of magnetic resonance arthrography reviewed by experienced radiologists in detecting Hill-Sachs lesions compared to arthroscopy by a single surgeon. Although the accuracy of MRI in detecting Hill-Sachs lesions is documented (Table 1) [8-17], insight into its reliability is limited.
This gap in the literature is critical, as discordant diagnoses by healthcare professionals can have detrimental impacts on patient care and recovery. If reliability is low, healthcare providers do not agree on the presence of Hill-Sachs lesions, meaning that patients with (and without) Hill-Sachs lesions can be diagnosed and treated differently depending on the surgeon. Additionally, the reported incidence of Hill-Sachs lesions in the literature varies, largely due to differences in clinical judgement. We are specifically interested in the radiological judgement of treating surgeons rather than that of expert radiologists, because surgeons always assess MRIs before discussing treatment options with the patient.
Halma et al. [18] reported fair interobserver agreement among surgeons and radiologists assessing Hill-Sachs lesions, but only 3 of the 50 MRIs in their study included a Hill-Sachs lesion. Therefore, concrete conclusions on the reliability of detecting Hill-Sachs lesions could not be drawn. Beason et al. [19] evaluated interobserver agreement for detecting Hill-Sachs lesions among shoulder/sports medicine fellowship-trained orthopedic surgeons based only on coronal and axial T2-weighted MRI series. However, the surgeons' level of expertise was not taken into account, and the overall agreement was fair. van Grinsven et al. [20] assessed the agreement between radiologists and orthopedic surgeons for instability-related shoulder lesions on MRI, but the study did not report the number of Hill-Sachs lesions in the population. Furthermore, they reported agreement for all instability-related shoulder lesions together, without specifying the agreement for Hill-Sachs lesions.
This is the fourth study on this important topic, and we aimed to provide further insight into the role of MRI as a diagnostic instrument used by surgeons. Specifically, we aimed to determine: (1) the interobserver reliability of surgeons detecting Hill-Sachs lesions on MRI, (2) the certainty of surgeons regarding their judgement, and (3) the effects of surgeon characteristics on agreement. To achieve this, a large group of surgeons with varying levels of expertise assessed multiple MRIs with and without Hill-Sachs lesions, with no additional patient characteristics for context. We hypothesized that agreement would be fair, that certainty would be high, and that agreement would increase with level of expertise.
METHODS
This study has been approved by the Institutional Review Board of the OLVG Hospital (IRB No. WO 16.052).
Patients
Our hospital database was screened for available shoulder MRIs of patients with shoulder instability based on diagnosis codes. The medical records of these patients were manually screened by two researchers (HA and AS) for MRIs with Hill-Sachs lesions (n=19) or other defects with a similar appearance (n=10). These other defects were visible at the typical location for a Hill-Sachs lesion but were not a Hill-Sachs lesion as reported by the musculoskeletal radiologist. Such lesions included bone cysts, cartilage erosion, small grooves, or the bare area of the humeral head [21]. The majority of MRIs were performed without intra-articular contrast, and the Hill-Sachs lesions varied in size (Fig. 1). Proton density turbo spin echo MRIs were performed with a Siemens Magnetom Aera device (Siemens Healthineers, Erlangen, Germany). All MRIs were performed with the same MRI device and using the same protocol, positioning, and slice thickness.
Methods and Assessment
The MRI results were uploaded to a secure online survey platform (http://www.shoulderelbowcenter.com/) offering additional tools to perform measurements including lengths, angles, multiplanar reconstruction, and areas of surfaces. Experienced orthopedic surgeons with a specialization in shoulder pathology were invited to assess the MRIs and answer two questions based on the images: whether there was a Hill-Sachs lesion (yes/no) and how certain they were about the presence of a Hill-Sachs lesion (absolutely certain/certain/some doubts/very uncertain). General information about the assessing surgeons included the geographical location of their practice, years of clinical experience, scope of clinical interest, and whether they were involved in resident or fellowship training.
We did not provide any patient characteristics to isolate and assess the role of the MRI, which is just one of the available diagnostic tools. Because age, sex, and history of recurrent instability can predispose patients toward a Hill-Sachs or other diagnosis in regular clinical practice, not providing this information allowed assessment of the research question based purely on MRI.
Statistical Analysis
Sample size was based on expert opinion, the numbers of MRIs and respondents in previous studies [19,20,22], and feasibility in terms of the time needed to complete the survey for the set of MRIs. All analyses were performed with Stata ver. 14 (StataCorp., College Station, TX, USA). Fleiss' kappas were compared using the Stata package kappaetc [23].
The interobserver variability was determined using Fleiss' kappa, a statistical measure for assessing agreement among a fixed number of more than two observers. The kappa (κ) value is interpreted as poor (<0), slight (0.01–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.00) agreement. The overall kappa values were calculated for each MRI and indicated the extent to which surgeons agreed on the presence or absence of a Hill-Sachs lesion. All surgeon characteristics were presented as absolute numbers and percentages, and surgeons were grouped according to these characteristics. A technique similar to the classical pairwise t-test for means, based on a large-sample linear approximation of Fleiss' kappa, was used to test differences in interobserver agreement [24]. For clarity, we also present the percentage of (observed) agreement, calculated as the average agreement between all possible pairs of raters [23]. Statistical significance was set at p<0.05; when comparing three groups, we applied the Bonferroni correction. For each MRI, overall certainty was calculated by dividing the number of responses given as absolutely certain, certain, some doubts, or very uncertain by the total number of surgeons.
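To make the agreement statistics concrete, the following is a minimal sketch (our illustration, not the authors' Stata/kappaetc code) of how Fleiss' kappa and the mean observed percentage of agreement can be computed from per-case rating counts; the example counts are hypothetical.

```python
# Fleiss' kappa for N cases rated by a fixed number of raters into
# two categories (Hill-Sachs lesion: yes/no). Illustrative sketch only.

def fleiss_kappa(counts):
    """counts: one row per case, e.g. [n_yes, n_no].
    Every row must sum to the same number of raters."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    total_ratings = n_cases * n_raters

    # p_j: overall proportion of ratings falling in each category
    p = [sum(row[j] for row in counts) / total_ratings
         for j in range(n_categories)]

    # P_i: observed agreement on case i (proportion of agreeing rater pairs)
    P = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
         for row in counts]

    p_obs = sum(P) / n_cases          # mean observed (percentage) agreement
    p_exp = sum(pj * pj for pj in p)  # chance-expected agreement
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return kappa, p_obs

# Hypothetical data: each row is [n_yes, n_no] out of 20 surgeons per MRI
example = [[20, 0], [17, 3], [12, 8], [3, 17], [0, 20]]
kappa, agreement = fleiss_kappa(example)
print(f"kappa={kappa:.3f}, percent agreement={agreement:.0%}")
```

Note that the percentage of agreement reported alongside kappa in this study corresponds to `p_obs`, the average agreement over all rater pairs, which is why it can look high even when kappa, which corrects for chance agreement, is only fair.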
RESULTS
Surgeon Characteristics
We invited 106 surgeons in total, and 20 surgeons completed the survey (19%). The majority was employed in Europe and specialized in shoulder and elbow surgery. Among the three surgeons with another specialty, two specialized in orthopedic traumatology.
Interobserver Agreement for Presence of Hill-Sachs Lesions
The observer answers are summarized in Table 2; there were only two cases with complete agreement between all surgeons. For eight of the 29 MRIs (28%), the responses were almost evenly split: 40%–60% of the surgeons identified a Hill-Sachs lesion, while the remainder did not. Together, all answers resulted in fair overall interobserver agreement for the presence of a Hill-Sachs lesion (69% agreement; κ=0.304; p<0.001).
Certainty
Responses for evaluating the presence of a Hill-Sachs lesion indicated that 32% of the answers were absolutely certain, 52% were certain, 16% had some doubts, and 0% were very uncertain.
Effect of Characteristics on Interobserver Variability
Surgeons with 11–20 years of experience had better agreement than surgeons with 6–10 years of experience (90% agreement, κ=0.703 vs. 66% agreement, κ=0.235; p=0.005). Agreement among surgeons with 0–5 years of experience did not differ significantly from that among surgeons with 6–10 years (71% agreement, κ=0.363 vs. 66% agreement, κ=0.235; p=0.046) or 11–20 years (71% agreement, κ=0.363 vs. 90% agreement, κ=0.703; p=0.05) after Bonferroni correction. Country of specialty, shoulder and elbow specialty, and involvement in resident or fellowship training did not affect the level of agreement within subgroups of surgeons, as detailed in Table 3.
DISCUSSION
This study showed fair interobserver reliability to detect Hill-Sachs lesions on MRI, indicating that MRI alone should be interpreted with caution in clinical decision making. Although the surgeons were mostly (84%) certain or very certain regarding their decision about the presence of a Hill-Sachs lesion, the degree of agreement between surgeons in detecting a Hill-Sachs lesion on MRI was only fair. In this sample of 20 surgeons, agreement was not affected consistently by surgeon’s country of specialty, years of experience, specialty, or fellowship training.
The fair agreement for the presence of Hill-Sachs lesions could be attributed to differences in interpretation of the transition zone between cartilage and bone. Lack of cartilage can have the same appearance as an impression fracture and could be mistaken for a Hill-Sachs lesion, or vice versa. Moreover, the articular surface of the humeral head is smallest in the superior-posterior segment, which is the typical location of a Hill-Sachs lesion [25]. The anatomical humeral groove could also be mistaken for a Hill-Sachs lesion [26]. Furthermore, detecting a Hill-Sachs lesion is difficult even on arthroscopic video, although arthroscopy is the gold standard: Sasyniuk et al. [27] reported that only 35% of surgeons assessing videotapes of arthroscopic procedures agreed on the presence of a Hill-Sachs lesion. Additionally, a previous study showed fair agreement between radiologists and fair to poor agreement between radiologists and an orthopedic surgeon in detecting Hill-Sachs lesions [18]. However, that study included only two radiologists and one orthopedic surgeon.
The fact that the two surgeons with 11–20 years of experience had better agreement when assessing the presence of a Hill-Sachs lesion supports the value of subspecialties. Our results show slightly higher agreement among surgeons with 0–5 years of experience than among those with 6–10 years, but both agreements were fair, with a difference of only 5%, which limits the clinical relevance of this finding. The combination of fair agreement with a high level of confidence about the presence of a Hill-Sachs lesion indicates that surgeons cannot rely on their personal sense of certainty for these diagnostic and treatment decisions.
We included a representative mix of MRIs consisting of smaller and larger Hill-Sachs lesions as well as lesions of similar appearance to simulate the clinical setting. Adding cases of lesions that resemble a Hill-Sachs lesion likely limits agreement between surgeons, but we deemed their inclusion important for adequately assessing agreement, as these cases realistically simulate the clinical population. Agreement on individual MRIs varied from poor to good, but the overall agreement was fair. We think that the overall agreement best represents the clinical setting, which does not consist only of cases in which lesions are easily distinguished from one another.
There are some limitations to interpreting the results of this study. First, the response rate was only 19%, which limits generalizability to all surgeons. Second, we did not confirm the Hill-Sachs lesions by arthroscopy. However, the accuracy of MRI and its correlation with arthroscopic findings have been documented in previous studies [8,28]. Additionally, only 35% of surgeons agreed on the presence of a Hill-Sachs lesion when assessing videotapes of arthroscopic procedures [27]. More importantly, MRI typically guides the decision for conservative or operative treatment; therefore, it is important to reliably assess Hill-Sachs lesions on MRI prior to arthroscopic or other surgery. Given the lack of a true gold standard, we did not intend to standardize or confirm the presence or absence of the lesions, but instead to provide evidence of a substantial lack of consensus, which needs to be addressed.
Another limitation is that we considered surgeons' years of experience rather than the volume of shoulder and elbow procedures they had performed. Years of experience might be biased because young, subspecialized shoulder surgeons may perform many more shoulder procedures than older surgeons with a wider scope of interest. Finally, some of the MRIs were performed with intravascular contrast. To our knowledge, there is no known difference in assessing Hill-Sachs lesions between MRIs with and without contrast.
A strength of this study is that a widely used interobserver agreement method (kappa) was used to assess the degree of consensus between surgeons regarding the presence and treatment of Hill-Sachs lesions, augmented with the percentage of agreement, which is easier to interpret. Moreover, we assessed consensus based on MRIs, which are most commonly used to detect pathology causing glenohumeral instability [7]. In addition, we deliberately withheld patient characteristics from the reviewers to isolate the role of MRI in detecting a Hill-Sachs lesion without confounding factors. Our finding of limited agreement supports the need for international criteria and guidelines for diagnosing Hill-Sachs lesions.
Future research could address the observed disagreements by evaluating and defining criteria for surgeons to use when diagnosing Hill-Sachs lesions. These criteria could then be considered for inclusion in guideline development. Furthermore, an important emerging topic is identifying the most reliable measurement of glenoid and humeral bone loss [29,30]. Finally, the interobserver agreement of surgeons or radiologists could be measured for other imaging techniques, such as CT scans. Although surgeons are highly confident in their ability to detect Hill-Sachs lesions, in the absence of patient characteristics there is only fair agreement between surgeons for detecting Hill-Sachs lesions on MRI.
Acknowledgements
Shoulder and Elbow Center (collaborators): Gregory R. Waryasz; Matthijs R. Krijnen; Pierre Mansat; Sven A.F. Tulner; Christian M. Fortanier; Carola F. van Eck; Ruud P. van Hove; Christiaan J.A. van Bergen; John N. Trantalis; Paul Hoogervorst; Tjarco D.W. Alta; Guus J.M. Janus; Alexander van Tongel; Diederik J.W. Meijer; Ronald N. Wessel; Mark Schnetzke; John Cheung; Derek F.P. van Deurzen.
Notes
Financial support
None.
Conflict of interest
None.