Qualitative Research, volume 16, issue 2, pages 198-212

Process guidelines for establishing Intercoder Reliability in qualitative studies

Publication type: Journal Article
Publication date: 2015-04-20
scimago Q1
wos Q1
SJR: 1.839
CiteScore: 8.1
Impact factor: 3.2
ISSN: 14687941, 12298417, 17413109
Social Sciences (miscellaneous)
History and Philosophy of Science
Abstract

Qualitative interviews are increasingly being utilized within the context of intervention trials. While there is emerging assistance for conducting and reporting qualitative analysis, there are limited practical resources available for researchers engaging in a group coding process and interested in ensuring adequate Intercoder Reliability (ICR), the amount of agreement between two or more coders for the codes applied to qualitative text. Assessing the reliability of the coding helps establish the credibility of qualitative findings. We discuss our experience calculating ICR in the context of a behavioural HIV prevention trial for young women in South Africa which involves multiple rounds of longitudinal qualitative data collection. We document the steps that we took to improve ICR in this study, the challenges to improving ICR, and the value of the process to qualitative data analysis. As a result, we provide guidelines for other researchers to consider as they embark on large qualitative projects.
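
As an illustration of what "amount of agreement between two or more coders" can mean in practice, the following is a minimal sketch of two common two-coder measures: simple percent agreement and Cohen's kappa, which corrects observed agreement for agreement expected by chance. It is not the procedure used in the article, which this abstract does not specify. The function names, the code labels ("barrier", "support") and the sample data are hypothetical, and the sketch assumes each coder applies exactly one code to each of the same text segments.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of segments on which the two coders applied the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty list of segments")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability that both coders pick the same code
    # if each assigned codes independently at their own observed rates.
    codes = freq_a.keys() | freq_b.keys()
    p_expected = sum(freq_a[c] * freq_b[c] for c in codes) / (n * n)
    if p_expected == 1.0:
        # Both coders used one identical code throughout; kappa is undefined,
        # so this sketch reports full agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes applied by two coders to the same four segments.
coder_a = ["barrier", "barrier", "support", "support"]
coder_b = ["barrier", "support", "support", "support"]

print(percent_agreement(coder_a, coder_b))  # 0.75
print(cohen_kappa(coder_a, coder_b))        # 0.5
```

The gap between the two numbers shows why chance-corrected coefficients are often reported alongside raw agreement: with only two codes in use, a sizeable share of the 75% agreement would be expected by chance alone.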

Campbell J.L., Quincy C., Osserman J., Pedersen O.K.
2013-08-23 citations by CoLab: 1770 Abstract  
Many social science studies are based on coded in-depth semistructured interview transcripts. But researchers rarely report or discuss coding reliability in this work. Nor is there much literature on the subject for this type of data. This article presents a procedure for developing coding schemes for such data. It involves standardizing the units of text on which coders work and then improving the coding scheme’s discriminant capability (i.e., reducing coding errors) to an acceptable point as indicated by measures of either intercoder reliability or intercoder agreement. This approach is especially useful for situations where a single knowledgeable coder will code all the transcripts once the coding scheme has been established. This approach can also be used with other types of qualitative data and in other circumstances.
MacPhail C., Adato M., Kahn K., Selin A., Twine R., Khoza S., Rosenberg M., Nguyen N., Becker E., Pettifor A.
AIDS and Behavior scimago Q1 wos Q2
2013-02-23 citations by CoLab: 27 Abstract  
Women are at increased risk of HIV infection in much of sub-Saharan Africa. Longitudinal and cross-sectional studies have found an association between school attendance and reduced HIV risk. We report feasibility and acceptability results from a pilot of a cash transfer intervention conditional on school attendance paid to young women and their families in rural Mpumalanga, South Africa for the prevention of HIV infection. Twenty-nine young women were randomised to intervention or control and a cash payment based on school attendance made over a 2-month period. Quantitative (survey) and qualitative (focus group and interview) data collection was undertaken with young women, parents, teachers and young men in the same school. Qualitative analysis was conducted in Atlas.ti using a framework approach and basic descriptive analysis in Excel was conducted on the quantitative data. Results indicate it was both feasible and acceptable to introduce such an intervention among this population in rural South Africa. There was good understanding of the process of randomisation and the aims of the study, although some rumours developed in the study community. We address some of the changes necessary to ensure acceptability and feasibility of the main trial.
Friese S.
2012-01-01 citations by CoLab: 205
Artstein R., Poesio M.
Computational Linguistics scimago Q1 wos Q1 Open Access
2008-09-30 citations by CoLab: 677 Abstract  
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks—but that their use makes the interpretation of the value of the coefficient even harder.
Burla L., Knierim B., Barth J., Liewald K., Duetz M., Abel T.
Nursing Research scimago Q1 wos Q1
2008-03-13 citations by CoLab: 454 Abstract  
High intercoder reliability (ICR) is required in qualitative content analysis for assuring quality when more than one coder is involved in data analysis. The literature is short of standardized procedures for ICR procedures in qualitative content analysis. To illustrate how ICR assessment can be used to improve codings in qualitative content analysis. Key steps of the procedure are presented, drawing on data from a qualitative study on patients' perspectives on low back pain. First, a coding scheme was developed using a comprehensive inductive and deductive approach. Second, 10 transcripts were coded independently by two researchers, and ICR was calculated. A resulting kappa value of .67 can be regarded as satisfactory to solid. Moreover, varying agreement rates helped to identify problems in the coding scheme. Low agreement rates, for instance, indicated that respective codes were defined too broadly and would need clarification. In a third step, the results of the analysis were used to improve the coding scheme, leading to consistent and high-quality results. The quantitative approach of ICR assessment is a viable instrument for quality assurance in qualitative content analysis. Kappa values and close inspection of agreement rates help to estimate and increase quality of codings. This approach facilitates good practice in coding and enhances credibility of analysis, especially when large samples are interviewed, different coders are involved, and quantitative results are presented.
Tong A., Sainsbury P., Craig J.
2007-09-16 citations by CoLab: 24162 Abstract  
Qualitative research explores complex phenomena encountered by clinicians, health care providers, policy makers and consumers. Although partial checklists are available, no consolidated reporting framework exists for any type of qualitative design. To develop a checklist for explicit and comprehensive reporting of qualitative studies (in depth interviews and focus groups). We performed a comprehensive search in Cochrane and Campbell Protocols, Medline, CINAHL, systematic reviews of qualitative studies, author or reviewer guidelines of major medical journals and reference lists of relevant publications for existing checklists used to assess qualitative studies. Seventy-six items from 22 checklists were compiled into a comprehensive list. All items were grouped into three domains: (i) research team and reflexivity, (ii) study design and (iii) data analysis and reporting. Duplicate items and those that were ambiguous, too broadly defined and impractical to assess were removed. Items most frequently included in the checklists related to sampling method, setting for data collection, method of data collection, respondent validation of findings, method of recording data, description of the derivation of themes and inclusion of supporting quotations. We grouped all items into three domains: (i) research team and reflexivity, (ii) study design and (iii) data analysis and reporting. The criteria included in COREQ, a 32-item checklist, can help researchers to report important aspects of the research team, study methods, context of the study, findings, analysis and interpretations.
Hayes A.F., Krippendorff K.
2007-04-01 citations by CoLab: 2788 Abstract  
In content analysis and similar methods, data are typically generated by trained human observers who record or transcribe textual, pictorial, or audible matter in terms suitable for analysis. Conclusions from such data can be trusted only after demonstrating their reliability. Unfortunately, the content analysis literature is full of proposals for so-called reliability coefficients, leaving investigators easily confused, not knowing which to choose. After describing the criteria for a good measure of reliability, we propose Krippendorff's alpha as the standard reliability measure. It is general in that it can be used regardless of the number of observers, levels of measurement, sample sizes, and presence or absence of missing data. To facilitate the adoption of this recommendation, we describe a freely available macro written for SPSS and SAS to calculate Krippendorff's alpha and illustrate its use with a simple example.
Guest G., Bunce A., Johnson L.
Field Methods scimago Q1 wos Q2
2006-02-01 citations by CoLab: 10936 Abstract  
Guidelines for determining nonprobabilistic sample sizes are virtually nonexistent. Purposive samples are the most commonly used form of nonprobabilistic sampling, and their size typically relies on the concept of “saturation,” or the point at which no new information or themes are observed in the data. Although the idea of saturation is helpful at the conceptual level, it provides little practical guidance for estimating sample sizes, prior to data collection, necessary for conducting quality research. Using data from a study involving sixty in-depth interviews with women in two West African countries, the authors systematically document the degree of data saturation and variability over the course of thematic analysis. They operationalize saturation and make evidence-based recommendations regarding nonprobabilistic sample sizes for interviews. Based on the data set, they found that saturation occurred within the first twelve interviews, although basic elements for metathemes were present as early as six interviews. Variability within the data followed similar patterns.
Hruschka D.J., Schwartz D., St.John D.C., Picone-Decaro E., Jenkins R.A., Carey J.W.
Field Methods scimago Q1 wos Q2
2004-08-01 citations by CoLab: 522 Abstract  
Analysis of text from open-ended interviews has become an important research tool in numerous fields, including business, education, and health research. Coding is an essential part of such analysis, but questions of quality control in the coding process have generally received little attention. This article examines the text coding process applied to three HIV-related studies conducted with the Centers for Disease Control and Prevention considering populations in the United States and Zimbabwe. Based on experience coding data from these studies, we conclude that (1) a team of coders will initially produce very different codings, but (2) it is possible, through a process of codebook revision and recoding, to establish strong levels of intercoder reliability (e.g., most codes with kappa ≥ 0.8). Furthermore, steps can be taken to improve initially poor intercoder reliability and to reduce the number of iterations required to generate stronger intercoder reliability.
Lombard M., Snyder-Duch J., Bracken C.C.
Human Communication Research scimago Q1 wos Q1
2002-10-01 citations by CoLab: 1774
Kurasaki K.S.
Field Methods scimago Q1 wos Q2
2000-08-01 citations by CoLab: 292 Abstract  
Intercoder reliability is a measure of agreement among multiple coders for how they apply codes to text data. Intercoder reliability can be used as a proxy for the validity of constructs that emerge from the data. Popular methods for establishing intercoder reliability involve presenting predetermined text segments to coders. Using this approach, researchers run the risk of altering meanings by lifting text from its original context, or making interpretations about the length of codable text. This article describes a set of procedures that was used to develop and assess intercoder reliability with free-flowing text data, in which the coders themselves determined the length of codable text segments. Content analysis of open-ended interview data collected from twenty third-generation Japanese American men and women generated an intercoder reliability of more than .80 for fifteen of the seventeen themes, an average agreement of .90 across all themes, and consistency among the coders in how they segmented coded text. The findings suggest that these procedures may be useful for validating the conclusions drawn from other qualitative studies using text data.
Pope C.
BMJ scimago Q1 wos Q1
2000-01-08 citations by CoLab: 4253 Abstract  
This is the second in a series of three articles.
Contrary to popular perception, qualitative research can produce vast amounts of data. These may include verbatim notes or transcribed recordings of interviews or focus groups, jotted notes and more detailed “fieldnotes” of observational research, a diary or chronological account, and the researcher's reflective notes made during the research. These data are not necessarily small scale: transcribing a typical single interview takes several hours and can generate 20–40 pages of single spaced text. Transcripts and notes are the raw data of the research. They provide a descriptive record of the research, but they cannot provide explanations. The researcher has to make sense of the data by sifting and interpreting them.
#### Summary points
  • Qualitative research produces large amounts of textual data in the form of transcripts and observational fieldnotes
  • The systematic and rigorous preparation and analysis of these data is time consuming and labour intensive
  • Data analysis often takes place alongside data collection to allow questions to be refined and new avenues of inquiry to develop
  • Textual data are typically explored inductively using content analysis to generate categories and explanations; software packages can help with analysis but should not be viewed as short cuts to rigorous and systematic analysis
  • High quality analysis of qualitative data depends on the skill, vision, and integrity of the researcher; it should not be left to the novice
In much qualitative research the analytical process begins during data collection as the data already gathered are analysed and shape the ongoing data collection. This sequential analysis1 or interim analysis2 has the advantage of allowing the researcher to go back and refine questions, develop hypotheses, and pursue emerging avenues of inquiry in further depth. Crucially, it also enables the researcher to look for deviant or negative cases; that is, …
Morse J.M.
Qualitative Health Research scimago Q1 wos Q1
1999-11-01 citations by CoLab: 37
Lacy S., Riffe D.
1996-12-01 citations by CoLab: 138 Abstract  
This study views intercoder reliability as a sampling problem. It develops a formula for generating sample sizes needed to have valid reliability estimates. It also suggests steps for reporting reliability. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of items is representative of the pattern that would occur if all content items were coded by all coders.
Ricci L., Fery C., Tubach F., Agrinier N., Gagneux-Brunon A.
Vaccine scimago Q1 wos Q2
2025-04-01 citations by CoLab: 0
Dabestani R., Solaimani S., Ajroemjan G., Koelemeijer K.
2025-04-01 citations by CoLab: 0
Hsu P., Wang X.
2025-03-11 citations by CoLab: 0 Abstract  
Background: Science internships have been suggested as a powerful way to engage high school students in conducting authentic science inquiry. However, despite the recognized significance of high school science internships, little research is done to examine how these experiences affect high school students’ career choices. Purpose: Our study drew on the theoretical framework of social cognitive career theory to examine how a 7-month science internship might shape high school students’ career choices. Method: 88 students were interviewed 6–8 months after their internship graduation. Findings: The analysis suggests that the science internships altered more than 90% of the participating students’ career choices by either enhancing, expanding, narrowing down, or even replacing their original career choices. Students reported that the science internships boosted their self-efficacy through their first-hand mastery of authentic STEM practices, by directly observing scientists’ STEM performance, by hearing scientists’ opinions on students’ capabilities and potential in STEM, and by the impact of the students’ own physiological and affective states on the STEM practices. Implications: These findings help educators better understand how a unique learning environment like science internship may influence high school students’ career choices; they have important implications for internship design, career counseling, and education policy.
Kim D., Shin D.
PLoS ONE scimago Q1 wos Q1 Open Access
2025-03-05 citations by CoLab: 0 PDF Abstract  
This study empirically analyzes the evolution of cultural products based on theoretical cultural discourse and evolutionary processes. We use data from 116 survival auditions aired in Korea between 2006 and 2017 to examine the cultural memes that shape the continued appeal of survival audition programs. Specifically, we discuss the influence of “memes” in cultural codes, namely, audience empowerment, experts’ involvement, fair rewards, and career opportunities. The results of probit regression analysis with survival audition program reproduction as the dependent variable show that audience empowerment, experts’ involvement, fair rewards, and career opportunities in survival audition programs influence the reproduction of cultural goods. The findings confirm all four hypotheses. The findings of this study have theoretical and practical implications. First, it enriches the theoretical discourse on the evolution of cultural goods by offering a meme-based explanation for their reproduction. Second, it has implications for industry practitioners involved in planning and producing cultural goods by identifying normative cultural codes that affect the longevity of these products.
Aeschlimann A., Heim E., Killikelly C., Mahmoud N., Haji F., Stoeckli R.T., Aebersold M., Thoma M., Maercker A.
Internet Interventions scimago Q1 wos Q1 Open Access
2025-03-01 citations by CoLab: 0
Pohlmann J.R., ten Caten C.S., Ribeiro J.L.
Journal of Technology Transfer scimago Q1 wos Q1
2025-02-21 citations by CoLab: 0
Iyer K.V., Dani K.
Corporate Communications scimago Q2 wos Q2
2025-02-12 citations by CoLab: 0 Abstract  
Purpose: Although women have been represented in advertising since WWII, the themes were laden with stereotypes – from working roles in the 1940s to superwomen in the 1970s and 1980s, second-wave feminism. Contemporary women-centric advertising (or femvertising) strives towards women empowerment and gender equality by stripping down stereotypes. However, through closer inspection, this study examines if this femvertising by brands nowadays is a gimmick to sell their products and further the neoliberal, postfeminist perspective. Design/methodology/approach: Semiotic content analysis (SCA) explored the post-feminist discourses, as categorised by Windels et al. (2020) – in the internationally awarded 80 advertisements produced from 2013 to 2023 in the global West and South. Codes generated from SCA were then quantitatively analysed using chi-square and p-values, comparing the three themes: post-feminist elements and discourses, the form of self-surveillance and product ads and measuring the changes in post-feminist discourses in recent years. Findings: After 2018, advertisements used more post-feminist discourse, especially commodity feminism, self-surveillance and love-your-body parameters. Brands reacted in their campaigns, conforming to gender stereotypes under empowerment and modifying feminist values. Research limitations/implications: The study lacked a phenomenological understanding of the perspective of the consumers, the advertisers and the panel judges of these awards through a qualitative study on the post-feminist aspects of the femvertisements, the importance of depoliticising the women’s struggle or the feminist movement in communicating with the audience and how such a strategy has helped in empowering (or disempowering) real women. Practical implications: The study highlights the need for inclusive marketing communication and also outlines implications for the brand owners, advertisers and the creative team. The research emphasises determining the fit between brands and the social issue, eventually leading to positive brand attitude and purchase intention among consumers. Social implications: The research helps inform the young consumers about gender equity, the role played by the social, cultural, political, environmental and structural elements in shaping women’s empowerment and how their identity and experiences affect their empowerment. An inclusive communication approach would enable projects with real people with whom consumers, irrespective of gender, can resonate. Originality/value: The study highlighted the femvertising issue from an inclusive marketing communication spectrum, implying its importance for brands’ attempts to connect with feminist and women consumers authentically.
Liu G., Chen Y., Ko W.
2025-02-10 citations by CoLab: 0 Abstract  
This study explores how inter‐organizational justice and formal contracts influence new product development (NPD) collaboration in supply chain networks. Challenging traditional transaction cost economics (TCE), the research focuses on collaborative NPD in hub‐and‐spoke supply chain structures. Data from 183 Chinese suppliers and 22 executive interviews reveal unexpected patterns in NPD collaboration. Procedural justice exhibits an inverted U‐shaped relationship with NPD collaboration, linking higher fairness to improved collaboration up to a point, beyond which further increases may associate with diminishing returns. In contrast, distributive justice shows a U‐shaped relationship with NPD collaboration, where higher equity initially relates to reduced collaboration but later correlates with renewed engagement. Notably, formal contracts amplify the negative interactions between these justice dimensions. This contradicts the conventional view of their complementary roles. These findings contribute to theoretical advancements by illustrating how inter‐organizational justice mechanisms function differently in complex network structures compared to simple dyadic relationships. Careful calibration of inter‐organizational justice dimensions and formal contracts proves essential for fostering productive NPD collaboration. These governance insights offer directions for enhancing supply chain relationship management.
Porter S., Pitt T., Eubank M., Butt J., Thomas O.
2025-02-10 citations by CoLab: 0 Abstract  
Following Moshe Talmon's (1990) ground-breaking work on single-session therapy, the philosophy and practice of single-session therapy has expanded across the world. Critically, and regardless of context and approach, single-session therapists support adopting a “single-session mindset” (Cannistrà, 2022; Hoyt et al., 2020). Our study sought to clarify experts' understanding of this mindset with empirical evidence. Ten world leading figures in single-session therapy were interviewed against this aim. Reflexive thematic analysis highlighted that a single-session mindset is founded upon nine core beliefs and seventeen attitudes that are intentionally embraced, before and during single-session work. This mindset aligns the therapist and client towards the possibility of creating change within a single session. The findings provide empirical clarity on the concept of the single-session mindset, offering valuable insights for practitioners attempting to implement brief methods into practice and for trainers who are helping others to do so.
Williamson C., van Rooyen A., Dry R.
2025-02-01 citations by CoLab: 0 PDF Abstract  
As a leading qualitative researcher, Norman Denzin advocated for a bigger tent to expand and deepen qualitative inquiry. The “tent” metaphor has therefore been used by scholars to advocate for considerations of diverse options around philosophies of inquiry. With the advent of Artificial Intelligence (AI), researchers face additional options around research decisions, philosophically, practically and ethically. Researchers table numerous research questions around AI, given its rapid uptake. These include re-visiting the established notions of human-centered coding, as well as exploring the potential of AI coding, within these notions. There is therefore room to expand “the tent” to delineate 1) the specific use of AI within qualitative data analysis, specifically AI coding, which is now an established function within bespoke qualitative data analysis software (QDAS). Additionally, with AI providing a stream of independent coding, 2) researchers may well deliberate the need for multiple, second or independent coders. This paper responds to this two-fold aim of the study using action research that involved researchers doing coding and using independent coding, in tandem with AI. The contribution of the paper is an outline of these action steps, reflectively culminating in a practical framework. This may be used by researchers, both curious and confident, in seeking novel ways to broaden the scope of their coding practices.
Roy A., Das M., Lim W.M., Kalai A.
2025-01-17 citations by CoLab: 0
McHugh G.A., Lavender E.C., Bennell K.L., Kingsbury S.R., Conaghan P.G., Hinman R.S., Comer C., Conner M., Nelligan R.K., Groves‐Williams D.
Musculoskeletal Care scimago Q1 wos Q3
2025-01-09 citations by CoLab: 0 Abstract  
Introduction: Persistent knee pain often due to knee osteoarthritis (OA) is a highly prevalent and disabling condition. Electronic‐rehabilitation (e‐rehab) programmes have the potential to support self‐management of knee OA. This study aimed to evaluate user engagement and acceptability of two e‐rehab programmes, Group e‐rehab, a remote physiotherapy‐led programme and My Knee UK, a self‐directed web‐based exercise programme. Methods: Descriptive qualitative study nested within a feasibility trial. In‐depth interviews were conducted remotely. Data were analysed using inductive thematic analysis. Results: Eighteen participants from the feasibility trial took part in the interviews, 10 who received Group e‐rehab and eight My Knee UK. Two key themes were engagement with exercise and impact of programme. Despite initial challenges with doing the exercises, most participants found both programmes acceptable and beneficial in improving symptoms and knowledge in managing their knee pain. Multiple factors contributed to motivation to exercise. Discussion: Understanding more about users' perception and acceptability of both programmes was important to ascertain, both from people who engaged and those who did not engage with the programmes, to make improvements for the future delivery of the e‐rehab programmes. Conclusion: Group e‐rehab and My Knee UK can support people to self‐manage their persistent knee pain due to knee OA. The e‐rehab programmes have the potential to improve health services by providing two new models of service delivery enabling more patients to receive support and training to equip them to effectively manage their knee OA.
Lu Y., Zhu Y., Huang W., Zhou C., Shen Z., Qiu Y., Ke Y., Zong X., Guo Q., Yashengjiang M., Man J.
2025-01-07 citations by CoLab: 0 Abstract  
Cultural mismatch theory attempts to explain how the independent, middle-class culture of U.S. higher education may inadvertently contribute to creating and maintaining social class inequalities for students from interdependently oriented, working-class backgrounds. The current research was the first attempt to examine the theory’s cross-cultural applicability in the Chinese context. Validating the main claims of cultural mismatch theory, more selective Chinese universities endorsed significantly more independent values but similar interdependent values as their less-selective counterparts (Study 1), and first-generation college students endorsed significantly more interdependent motives for college than their continuing-generation counterparts (Studies 2, 3). Somewhat different from prior findings in the U.S., which found holding interdependent models of self to robustly predict negative outcomes, results revealed a nuanced cultural advantage for holding independent models of self for Chinese students in terms of lower depression (Study 2) and higher daily sense of fit over a 2-week span (Study 3). The Chinese university culture did not seem to exert unequal cultural barriers through a lack of interdependent norms, but through an overemphasis of independent norms. The cross-cultural perspective helps contextualize institutionalized cultural mismatch in the culture of the larger society.
Sklar E., Chodur G.M., Kemp L., Fetter D.S., Scherr R.E.
2025-01-01 citations by CoLab: 1

