Content Analysis: A Flexible Methodology

White, Marilyn Domas and Emily E Marsh. "Content Analysis: A Flexible Methodology." Library Trends, vol. 55 no. 1, 2006, p. 22-45. Project MUSE, doi:10.1353/lib.2006.0053.

Content analysis is a "systematic, rigorous approach to analyzing documents obtained or generated in the course of research" (pg. 22). It is flexible in that it accommodates quantitative, qualitative, and mixed methods research; it can also be used in tandem with other methods. Its roots lie in 1950s mass communications research, which was built on a "sender / message / receiver" model and relied on quantified analysis of recurring text content, called "manifest content." The broad cluster of analysis methods applied to text is generally referred to as "textual analysis." Variants of textual analysis include content analysis, "conversational analysis, discourse analysis, ethnographic analysis, functional pragmatics, rhetorical analysis, and narrative semiotics" (pg. 23).


A broad definition from Krippendorff (2004) is: "a research technique for making replicable and valid inferences from texts (or other meaningful matter) to the contexts of their use" (pg. 23, 28). The researcher makes inferences (analytical constructs) from the text to answer research questions. "The two domains, the texts and the context, are logically independent, and the researcher draws conclusions from one independent domain (the texts) to the other (the context) ... The analytical constructs may be derived from (1) existing theories or practices; (2) the experience or knowledge of experts; and (3) previous research" (pg. 27). One might use a model of communication to make inferences about the sender of a message, the message itself, its effect, and the context surrounding its creation. Quantitative models of content analysis also allow for replication.


Data include any content that conveys a message or meaning to a receiver. This may be text, but also pictures in combination with text. Pictures and texts may be analyzed independently or alongside one another. However, text is the most common data for content analysis. Beaugrande and Dressler define seven criteria for what makes a text: "cohesion, coherence, intentionality, acceptability, informativity, situationality, and intertextuality" (pg. 27-28). Each criterion is defined more clearly by example on page 28.

Neuendorf created a typology of messaging in texts: "individual messaging, interpersonal and group messaging, organizational messaging, and mass messaging" (pg. 28). Once more, the authors give clearer examples on pages 28-29.

Unitizing Data

Content analysis necessitates breaking texts into units "for sampling, collecting, and analysis and reporting" (pg. 29). The three kinds of unit may coincide or differ. The methodology follows a pragmatic paradigm, where sampling and data collection are informed by the question at hand.

Sampling units: used to identify the population.
Data collection units: used for measuring variables.
Units of analysis: used for reporting the analysis.
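
To make the three kinds of unit concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration and does not come from the article: documents serve as sampling units, sentences as data collection units, and per-document code counts as the unit of analysis. The names `Document`, `draw_sample`, `split_sentences`, `code_sentence`, and `report_by_document` are hypothetical, and the coding rule is deliberately trivial.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Sampling unit (assumed): one document from the population."""
    doc_id: str
    text: str


def draw_sample(population, k, seed=0):
    """Draw a simple random sample of k documents; seeded so it can be repeated."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def split_sentences(doc):
    """Data collection unit (assumed): sentences, split naively on periods."""
    return [s.strip() for s in doc.text.split(".") if s.strip()]


def code_sentence(sentence):
    """Hypothetical coding rule: does the sentence mention 'library'?"""
    return "mentions_library" if "library" in sentence.lower() else "other"


def report_by_document(sample):
    """Unit of analysis (assumed): code counts aggregated per document."""
    return {
        doc.doc_id: Counter(code_sentence(s) for s in split_sentences(doc))
        for doc in sample
    }


population = [
    Document("d1", "The library opened. It rained. Library hours changed."),
    Document("d2", "Budgets were cut. Staff left."),
    Document("d3", "A new library card was issued."),
]
sample = draw_sample(population, k=2)
report = report_by_document(sample)
```

In this sketch the variable is measured on sentences but reported per document, which is one way the units can differ; setting all three to the document would be the case where they are the same.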


The authors define the steps of a hypothesis-driven (quantitative) content analysis as:

1. Establish hypothesis or hypotheses
2. Identify appropriate data (text or other communicative material)
3. Determine sampling method and sampling unit
4. Draw sample
5. Establish data collection unit and unit of analysis
6. Establish coding scheme that allows for testing hypothesis
7. Code data
8. Check for reliability of coding and adjust coding process if necessary
9. Analyze coded data, applying appropriate statistical test(s)
10. Write up results
(pg. 30)
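
Step 8, checking coding reliability, can be sketched in code. These notes do not record which reliability statistic the authors recommend, so the example below swaps in Cohen's kappa for two coders as an assumption. It is one common choice, not necessarily the article's. Kappa compares observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the sample codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who coded the same units, in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(coder_a)

    # Observed agreement: share of units given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each code, the product of the two coders'
    # marginal proportions, summed over all codes.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code throughout; kappa is
        # undefined here, and this sketch reports full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes assigned by two coders to the same four units.
coder_a = ["x", "x", "y", "y"]
coder_b = ["x", "x", "y", "x"]
kappa = cohens_kappa(coder_a, coder_b)  # observed 0.75, expected 0.5
```

A low value would send the analyst back to adjust the coding scheme or coder instructions before moving on to step 9.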


The authors also describe the process under three broader headings:

1. Formulating Research Questions
2. Sampling
3. Coding / Method of Analysis