Ways of Knowing in HCI

Olson, J. S., & Kellogg, W. A. (Eds.). (2014). Ways of Knowing in HCI. New York, NY: Springer.


Prologue

“The field of HCI grew from the field of human factors applied to computing, with strong roots in cognitive psychology” (pg. ix). Aspects of cognitive psychology were quantitatively applied to computing in some of the most famous theories in HCI: Fitts’ law, Gestalt laws, recall vs. recognition.

Cognitive psychology is still present in HCI today, guiding the design of computing systems, but the field has grown beyond cognitive science and human factors. Beyond the design of interfaces, HCI researchers also study the settings, practices, opinions, experiences, and perceptions of what has commonly become known as “the user.” Suchman introduced ethnographic methods to the field in 1987 with Plans and Situated Actions, moving the field beyond positivist measures. The field has now become so diverse in terms of epistemologies, methodologies, and methods that “what counts as good research” and “how do we assess research fairly” have become common questions.

Concepts, Values, and Methods for Technical Human–Computer Interaction Research - Scott E. Hudson and Jennifer Mankoff

Technical Research

Technical HCI research seeks to solve social problems and improve the world through technical artifacts.

Invention

Invention seeks to “use technology to expand what can be done or to find how best to do things that can already be done” (pg. 69). It is about devising new solutions to social problems, increasing capabilities and advancing technologies, and enabling others to do the same. This fascination with invention, with the newness and enhancement of technologies, drives technical HCI work.

Discovery

Discovery is more closely aligned with an interpretivist mode of research. The authors argue that invention, as a research approach in HCI, differs from discovery, the other form of research, which aims at understanding the world. Technical HCI work aims to produce knowledge that others can reuse to build and advance artifacts. They write that technical research “emphasizes knowledge about how to create something (invention) but also knowledge that might be reused to assist in the creation of a whole class of similar things or even multiple types of different things” (pg. 70). Invention also differs from development, which may involve the creation of knowledge but does not require that knowledge to be reusable. Yet, though there may be “purer” versions of both research and development, there is no clear dividing line between the two.

Valuable and Trustworthy

Value and trustworthiness are intertwined properties in technical HCI work. Discovery research must be trustworthy in order to be viewed as valuable (and this also seems to lean more into the positivist way of doing things). Trustworthiness is often defined through high confidence as defined in statistics. The desire is to build “confidence in the results, building consensus, and causal attribution” about a slice of reality (pg. 73). The work is also valued if it works practically for real-world situations, and is considered less valuable if it applies only to tightly scoped situations.

Types of Creations

  • Direct creation of things: Creating a new artifact, new capabilities, or introducing an artifact to a new population.
  • Enabling the creation of things: Enabling others to address an end-user need by making it possible, less expensive, or faster to do so.
  • Tools: Make it easier to make new things or meet certain needs for development.
  • Systems: “bring together a set of capabilities into a single working whole” (pg. 75).
  • Basic capabilities: An enabling advancement that might include algorithms, circuit boards, sensors, or input drivers.

Secondary Forms of Evaluation

These are more traditional UX validation methods, like usability testing (how easy it is for intended users to use), human performance tests (testing the performance of “typical users”), and machine performance tests (the performance of the artifact or the algorithm).

Reading and Interpreting Ethnography - Paul Dourish

Interestingly, despite ethnography’s early foray into HCI, it has been largely viewed as new, not part of HCI’s earlier history, given its difference from cognitive science. The focus of this chapter is for readers to be able to understand and interpret ethnographic work in HCI. Dourish provides a historical account of ethnography in HCI.

“Ethnography, then, is data production rather than data gathering, in the sense that it is only the ethnographer’s presence in the field and engagement with the site—through action and interaction—that produces the data that is then the basis of analysis” (pg. 3).

3rd Wave HCI

He contributes to what has come to be known as the “third wave” or “third paradigm” in HCI, “an approach that focuses on technology not so much in utilitarian terms but more in experiential … ones” (pg. 2). Such a turn would be interested in ethnography, given the interest in the “lifeworld” of experiences that ethnography illuminates. It is not utilitarian in its focus on the design of an artifact, or even on usefulness for designing an artifact.

Structuralist Anthropology

Associated most with Lévi-Strauss and the 1960s, structural anthropology had its roots in linguistics, particularly semiotics (how words carry meanings). Specifically, structural anthropologists look for meaning in relationships of difference. He describes structuralism as having two consequences for ethnography.

(1) It focuses not on a single event, but on a system of events. It searches for moments of difference in a system of events, which make particular events meaningful and distinct.

(2) It “focuses ethnographic attention on the decoding of patterns of meaning and the symbolic nature of culture and paves the way for further examinations of cultural life (and ethnography itself) as an interpretive process” (pg. 7).

Hermeneutic Turn

The hermeneutic turn occurred in the 1970s. It encompasses the following: “The hermeneutic turn, then, is one that places interpretation at its core, in at least two ways—first, it focuses on the work of the ethnographer as essentially interpretive, and second, it draws attention to the interpretive practices that participants themselves are engaged in as they go about everyday life” (pg. 7). If culture is a text to be read, then people are reading culture in their everyday lives.

Ethnography no longer simply provides an explanation; it provides an interpretation. There is no underlying “fact of the matter” or objective truth (pg. 8). This is in line with interpretivist ontologies and epistemologies, given that this worldview embraces the idea that the world is socially constructed and is perceived through the lens of one’s own experiences.

Thick Description

Thick description is a way of describing to capture “multiple levels of understanding … different frames of interpretation, layers of meaning, contradictions and elaborations woven together” (pg. 8). It is not simply writing what was seen or captured, but providing rich writing that “allows for multiple, repeated, indefinite processes of interpretation” (pg. 8).

Generative vs Taxonomic Culture

Taxonomic view: “attempts to differentiate one cultural practice from another and to be able to set out a framework of cultural classification” (pg. 8). This view is often geographically bound.

Generative view: “culture is produced as a continual, ongoing process of interpretation” (pg. 9). We participate in many cultures. Culture is not bound to place, but evolving.

Reflexivity Turn

Popularized in the 1980s, the reflexivity turn sought to understand how both researcher and participant/subject shaped ethnographic accounts. Reflexive practice sought to address power relations, identity-based positionality (e.g., gender), and impacts on participant cultures. Reflexivity also “spoke to the importance of subject position as both a tool and a topic of ethnographic work” (pg. 10), hence the shift from subject to participant.

Multi-Sited Ethnography

In the 1990s, multi-sited ethnography arose from increases in both digital media and globalization (in terms of production). The lack of boundedness to a specific physical space gave rise to ethnographies that occurred across multiple physical sites, across digital ones, or both. It aims to understand cultural practices that may not be bound to a specific place, but instead exist across shared spaces and objects.

In describing what multi-sited ethnography is NOT, Dourish writes: “Multi-sited ethnography is not explicitly a comparative project; the goal of the incorporation of multiple sites is not to line them up next to each other and see what differs. Nor is it an attempt to achieve some kind of statistical validity by leaning towards the quantitative and amassing large data sets” (pg. 11).

Distinguishing Generalizability and Abstraction

Traditional HCI seeks generalization, so that designs can be “generalized” to a wide audience. He distinguishes between generalization and abstraction by writing: “Generalization concerns making statements that have import beyond the specific circumstances from which they are generated. Abstraction concerns the creation of new entities that operate on a conceptual plane rather than a plane of actualities and that have generalized reach through the removal of specifics and particulars” (pg. 13). He makes a distinction between these two concepts because, in ethnography, there may be forms of generalization that do not depend on abstraction. Ethnography often generalizes through highlighting and contrasting patterns, but it “does not imagine specific observations to be particularized instances of abstract entities” (pg. 13).

Ethnomethodology

He describes ethnomethodology as the particular analytic position common in ethnographies in HCI. He contrasts ethnography and ethnomethodology: “ethnography advocates an approach to understanding social phenomena through participation. Ethnomethodology, on the other hand, is a particular analytic position on the organization of social action and in turn on the role of analysis and theorization within sociology” (pg. 14).

Ethnography and Design

The basic ideal is to be able to formulate design insights or interventions from a generalized ethnographic study. Ethnography is often destabilizing and raises more questions than answers, but this can be useful to design.

Curiosity, Creativity, and Surprise as Analytic Tools: Grounded Theory Method - Michael Muller

Grounded theory is a method (or series of methods) for developing theory from empirical work. The goal of grounded theory is “to explore a domain, with an emphasis on discovering new insights, testing those insights, and building partial understandings into a broader theory of the domain” (pg. 25). The method is aimed at “the ability to make sense of diverse phenomena, to construct an account of those phenomena that is strongly based in the data (“grounded” in the data), to develop that account through an iterative and principled series of challenges and modifications, and to communicate the end result to others in a way that is convincing and valuable to their own research and understanding” (pg. 25). Grounded theory is concerned with the creation of new theory, linked to the data. It is not concerned with existing theory.

GT differs from many positivistic “objective” methods in that it is not defined as a procedural series of steps. Instead, its methods are “derived … from the philosophy of pragmatism” (pg. 27). GT relies on an iterative approach to interpretation and theory development using the principle of “constant comparison,” which prompts the researcher to continuously compare data to data and data to theory. Data is iteratively collected, often in a way that purposefully challenges the emerging theory (e.g., “is our finding universal, or does it hold only in our sample of the population?” would lead to sampling a different subset of the population to test the emerging theory).

It came out of a rejection of positivist sociology and of conventional approaches that began with a theory, collected data uniformly, and then tested the theory.

Abduction

It is a “logic of discovery” that seeks surprising findings and then ways to explain them. It involves forming a best conclusion, or in this case, theory, from the analysis of the findings. In other words, abduction is inference to the best explanation: from a surprising observation, one proposes the hypothesis that would best account for it. I found the definition lacking here, and the definitions I found online were often confusing.

Coding in GT

“A code is a descriptor of some aspect of a particular situation (a site, informant or group of informants, episode, conversational turn, action, etc.). When codes are reused across more diverse situations, they gain explanatory power” (pg. 31). The researcher starts with very rich and specific descriptions of the data, and moves to more explanatory abstractions of larger themes in the data.
  • Open coding: The initial phase of coding used for rich descriptions of data instances. They are “open minded” and not governed by prior knowledge or assumptions.
  • Axial coding: Organizing open codes into broader abstractions, moving from describing to knowing. Axial codes are collections of related open codes, which may also be used to interrogate the open codes, leading to more data collection and more open coding.
  • Categories: “a well-understood set of attributes of known relation to one another” (pg. 33). When we feel the axial codes have become sufficiently comprehensive to describe an overall phenomenon, we describe that phenomenon in the form of a category.
  • Core Concept: “emerges through this kind of intense comparison of data to data, and data to emerging theory (some grounded theorists make reference to selective coding, which is approximately the choice of the core concept) … The core concept that we have chosen now will be the basis for one report of the work. We may want to revisit the data and our memos later, for additional insights, and perhaps additional papers” (pg. 35).
  • Memoing: Not exactly a form of coding, but memoing is a constant process where the researcher constructs the knowledge they develop over the course of their theory building. Memo-writing is the process of making the knowledge known to oneself and others, crystallizing it in writing.

Substantive Theory

Substantive theory: “Our intense thinking, sampling, and theorizing about the core concept has resulted in what grounded theorists call a substantive theory—that is, a well-developed, well-integrated set of internally consistent concepts that provide a thorough description of the data” (pg. 36). This does not mean the work is over; the next step is generally to relate the theory to prior work in the report.

Usage Patterns in HCI/CSCW

  • Using GTM to Structure Data Collection and Analysis: “iterative episodes of data collection and theorizing, guided by theoretical sampling, and the use of constant comparison as a way to think about and develop theory during ongoing data collection” (pg. 40).
  • Using GTM to Analyze a Completed Dataset: “applies deep and iterative coding to a complete set of data that have already been collected, gradually building theory from the data, often through explicit use of concepts of open coding, axial coding, categories, and core concepts” (pg. 41). New data may no longer be collected, but GT guides reorganizing and rethinking the existing data.
  • Using GTM to Signal a Deep and Iterative Coding Approach: Some employ the term GT simply to describe careful data coding, which is problematic in that it does not follow a true GT method. It is often difficult to understand these papers’ coding processes and how they led to building any theory out of the data. Often, the data were gathered through highly specific questions, which is uncommon in GT.

Knowing by Doing: Action Research as an Approach to HCI - Gillian R. Hayes

“Action research (AR) is an approach to research that involves engaging with a community to address some problem or challenge and through this problem solving to develop scholarly knowledge” (pg. 49). It is method-agnostic, meaning any method can be used, though HCI often designs and deploys technologies to assess community impact. AR is collaborative and interdisciplinary and is focused on conducting research with a community, not for one.

The goal is to achieve intervention and understanding, and it is empirical and cyclical. AR work unpacks the setting itself and the effectiveness of interventions. Research questions continuously evolve so researchers can capitalize on the knowledge they gain through the research process. “The research team then must ask the following: What happened? Did the intervention work (as planned)? What do we know about the site, our theories, and the empirical data that can explain why or why not? Now what?” (pg. 52).

It is thought to have emerged from Kurt Lewin’s 1946 work “Action research and minority problems.” This work made the intervention of the researcher in research settings acceptable.

Three Approaches

The three approaches are: scientific-technical, practical-deliberative, and critical-emancipatory. The first is more computational, while the latter two draw more on the humanities and critical theory. Practical-deliberative and critical-emancipatory are both interpretivist.

  • Scientific-Technical: A naturalist approach which believes in a single reality. The intervention is meant to involve some change in settings, like “new practices and approaches, different power structures or group dynamics, altered patterns of action, or simply the incorporation of a new piece of technology into daily practice” (pg. 51). A limitation to this approach is that it may benefit the scientific or HCI community more than the intended community, and change may not last.
  • Practical-Deliberative: Focused on “understanding local practices and solving locally identified problems” (pg. 51).
  • Critical-Emancipatory: “promotes a kind of consciousness raising and criticality that seeks to empower partners to identify and rise up against problems they may not have identified initially on their own” (pg. 51).

Community Partnership

Community partners are those in the community the researcher is working with, and who will hopefully take full control of any ongoing interventions or change the researcher leaves. Researchers must establish trust and an ongoing relationship with community partners before research work begins.

Researchers are also tasked with ensuring community partners understand the academic scholarship that is published based on the AR work, including language translation, explaining the venue, etc. This is because the entire team should be included in every aspect of the project. I could imagine, and have seen (e.g., in Lindsay Blackwell’s HeartMob work), that community collaborators also become authors. I could also imagine that there are sometimes institutional barriers to this.

Outside of academic scholarship, AR researchers are tasked with writing reports made with a community partner audience in mind. This can allow the team to come together and reflect deeply on the entirety of the work. Reports can also be used to update gatekeepers and local sponsors on the progress and benefits of the project.

Due to the relationship-building process and intensive collaboration with community partners, leaving the site can be much more painful than in other research types, for both community members and researchers. Though leaving is often expected and happens at a project’s end, it can also happen suddenly: funding issues, job movements, student graduations. Yet the researcher must prepare community members for sustainable and positive change and cannot just suddenly leave when the research is done, regardless of the circumstances. Everything from organizational to IT support should be addressed.

Science and Design: The Implications of Different Forms of Accountability - William Gaver

The purpose of this chapter is to differentiate between the structure and goals of research projects in “science” and in “design.” He notes that disciplines characterized as “science” are often extremely diverse, ranging from positivist to interpretivist, quantitative to qualitative. Similarly, design is diverse in how it is done, individually or with teams, in commercial settings or not. Canonical examples of science and design are used not to constrain what makes “real science” or “real design” but for the sake of comparing the two.

He defines accountability as “the expectations of what activities must be defended and how, and by extension the ways narratives (accounts) are legitimately formed about” both science and design (pg. 147).

He argues that a core difference between science and design is how one defends the work against criticism. Science is often critiqued and questioned through the thesis of “how do you know what you have said is true?” Questions are aimed at research questions, methods, and data analysis, with a focus on thoroughness over novelty. To summarize, “science is defined by epistemological accountability, in which the essential requirement is to be able to explain and defend the basis of one’s claimed knowledge” (pg. 147).

In design, the question is often “does it work?” Whether something works is not purely technical, but also social, cultural, aesthetic, and ethical. Questions are aimed at better interpreting and understanding a design and its intended purpose or contribution. Meticulous and thorough methodology does not redeem bad design. To summarize, design works with “aesthetic accountability, where ‘aesthetic’ refers to how satisfactory the composition of multiple design features are (as opposed to how ‘beautiful’ it might be). The requirement here is to be able to explain and defend—or, more typically, to demonstrate—that one’s design works” (pg. 147).

He writes: “Scientific activities seek to discover, explain and predict things that are held to pre-exist in the world, whereas design is fundamentally bent on creating the new” (pg. 151).

Values of Science

Scientific work follows the iterative process of applying or testing theory and building new theory. In doing this work, Gaver argues, science values replicability, objectivity, generalizability, causality, explanation, prediction, and definiteness. He cites replicability, the ability to reproduce a study to evaluate it and build on it, as highly important. Objectivity refers to the truthfulness of the work being independent of the researchers who conducted it. Generalizability refers to the value of work being able to be abstracted to broader contexts. Scientific theories are also ideally causal, meaning that they explain the relationship between related phenomena rather than merely noting correlations or chance. Theory should also be able to explain phenomena and predict new ones. Finally, he defines definiteness as the most important value: “Being able to say what you know—precisely, and ideally quantifiably—and how you know, and when or under what conditions what you know is known to be true—these are the hallmarks of science” (pg. 149). He points out that these values are idealized and that most research does not fully meet them; rather, they guide scientific efforts.

Values of Design

Because design is driven by the creation of new things, its values differ from science’s. Design values working things, individuality, resonance, evocation, and illumination. Working, in this case, means that the designed thing functions effectively and efficiently, solves a problem neatly, reconfigures a problem insightfully, uses materials elegantly, and so on. Individuality is not necessarily novelty alone, but also a specific character or style, resulting in resonance with users or viewers and some cultural impact. Such designs are also evocative in that they inspire new designs, and illuminating if they change how we view the world.

Design Methods

  • Exploring Context: Understanding information about the specific setting of a design to better ensure designs will be applicable and appropriate to the setting. Further, it promotes empathy for those interacting with the settings. Designers will collect a diverse array of materials as “data” when exploring context. This might include design probes, which participants directly interact with. Design probes are meant to inspire and force new ways of thinking.
  • Developing a Design Space: Designers externalize through sketching then move towards more polished or communicative diagrams, prototypes, renders, etc. Design proposals are rarely elaborate and rarely contain technological details, but instead communicate motivations, emotions, functionalities, etc. “Collections of proposals allow a design space to emerge, making clear a bounded range of possibilities characterised by a range of dimensions we are interested in exploring” (pg. 157). Proposals are provisional and allow for elaboration, development, and changes.
  • Refinement and Making: After developing a design space via contextual research and technical experimentation, the designer then focuses on which directions to move forward with. The design is often settled through consensus or clear requirements being met through the design space. Design process to artifact is a slow materialization, starting once again from sketches of the artifact before moving into physical prototypes and then final specifications. “In the end, the final design, if well made, is the result of a tightly woven web of judgements that are contingent and situated, and shaped by an indefinite mix of practical, conceptual, cultural and personal considerations. Yet the result, a highly finished product, is an ‘ultimate particular’ (Stolterman, 2008), as definite and precise as any scientific theory” (pg. 158).
  • Assessment and Learning: Though a design may not be intended to be usable, the “theory” of a design is still often tested with the intended users. This might occur through lab-based user testing or naturalistic field tests. This is not hypothesis testing. Further, different opinions or uses may be uncovered. “Given that designs can be appreciated from a number of different perspectives, and that different people may find different ways to engage and make meaning with them—or fail to do so—multiple, inconsistent and even incompatible accounts may all be equally true” (pg. 159).

Productive Indiscipline

Productive indiscipline is “borrowing from all disciplines or none to claim extraordinary methodological freedom” (pg. 162). Design does not need to inherit any specific disciplinary discourse.

Study, Build, Repeat: Using Online Communities as a Research Platform - Loren Terveen, John Riedl, Joseph A. Konstan, and Cliff Lampe

Access to Communities

They describe “access” to online communities in four ways (a hedged data-pull sketch follows the list):

  • Access to usage data: This enables behavioral analysis, modeling, simulation, and evaluation of algorithms.
  • Access to users: This enables random assignment experiments, surveys, and interviews.
  • Access to APIs/plug-ins: This enables the empirical evaluation of new social interaction algorithms and user interfaces, as long as they can be implemented within the available APIs; systematic methods of subject recruitment may or may not be possible.
  • Access to software infrastructure: This allows for the introduction of arbitrary new features, full logging of behavioral data, systematic recruitment of subjects, and random assignment experiments.
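
As a concrete illustration of the first and third kinds of access, here is a hedged Python sketch of pulling usage data through a community API; the endpoint, field names, and pagination scheme are all invented for illustration:

    # Hypothetical sketch: paging through a community's posts API and
    # computing a simple behavioral measure (posts per member).
    import collections
    import requests

    BASE_URL = "https://example-community.org/api/posts"  # invented endpoint

    def fetch_posts(max_pages=10):
        """Page through the (hypothetical) posts endpoint."""
        posts = []
        for page in range(1, max_pages + 1):
            resp = requests.get(BASE_URL, params={"page": page}, timeout=10)
            resp.raise_for_status()
            batch = resp.json()
            if not batch:  # an empty page signals the end of the data
                break
            posts.extend(batch)
        return posts

    posts = fetch_posts()
    activity = collections.Counter(p["author_id"] for p in posts)
    print(activity.most_common(10))  # the ten most active members

Real platforms differ in authentication, rate limits, and the fields they expose, which is part of what makes the authors’ distinction between API access and full infrastructure access consequential.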

Field Deployments: Knowing from Using in Context - Katie A. Siek, Gillian R. Hayes, Mark W. Newman, and John C. Tang

Field deployment involves deploying “robust prototypes in the wild” (pg. 119). Field deployments can be expensive, resource intensive, and time consuming, but provide researchers with data about real world usage of a system.

Context of use is critical to HCI work. Field deployments allow researchers to collect empirical data in naturalistic settings. Compared to other HCI techniques, field deployments focus on two things: “(1) They seek to evaluate the impacts novel technologies and particular populations, activities, and tasks have on each other” and “(2) They seek to perform such evaluations within the intended context of use” (pg. 120). They address the limitation of lab-based studies being in a controlled environment and can uncover unaccounted-for aspects of the general environment. They can inform future design, develop stakeholder buy-in, and provide empirical evidence about the emergence of systems in everyday life.

Types of Field Deployment

  • Convenience Deployment: Like “convenience sampling,” convenience deployment involves deploying the technology in a setting where deployment is easy, often due to familiarity and existing social connections (e.g., with family and friends). The caveat is that this sample is likely not representative of a general or broader population. These deployments can be useful for assessing the study design before a full deployment.
  • Semi-controlled Studies: These studies are done with participants whom the research team does not initially know, but with whom the team keeps in regular contact throughout the study. Participants are generally recruited for the purpose of the study and are only allowed to use the prototype for the duration of the study. These studies can suffer from issues of non-generalizability and selection/acquisition bias, but they put the research team in a better position to argue that findings generalize to a wider audience than convenience deployments do.
  • In the Wild: Deployments that are as naturalistic as possible. The technology is deployed to people unknown by the research team, who are not invested in the project. The prototypes used are often of commercial beta quality. Researchers can collect real world data often directly from the prototype itself. These are rarer in non-commercial research.

Methods of Field Deployment

  1. Finding a field setting in which to deploy the system: Choosing a setting impacts what can be learned from the deployment. Researchers should choose a setting motivated by their research questions. In some cases, researchers may need to develop relationships with community partners for deployment. Researchers must often be dedicated to maintaining a long-term relationship with stakeholders.
  2. Defining the goals of a field deployment study: Field deployment studies have a wide variety of possible goals. The authors outline a few guiding principles:
  3. Separating Adoption from Use: Ideally, participants will want to continue using the system. Guiding questions are: “(1) Will people use this prototype? (2) If they do, will they enjoy it, will they see benefits from it, etc.?”
  4. The Users’ Needs or Research Questions: As deployment research is iterative and often necessitates close relationships with participants, researchers should iteratively revisit research questions. User needs may guide additional questions.
  5. The Context of the Researcher’s Affiliation: Researchers must be aware of how their position shapes perceptions of the work. Users may respond with a bias towards affirming the researcher’s goal. This can be especially challenging in work settings where workers feel mandated to use the system.
  6. Recruiting participants and ethical considerations: Given the time commitment and intensity of deployments, retention may be difficult. Researchers may need to consider rolling deployment if they have limited resources. Researchers should avoid disrupting participant or partner daily life. They should also consider whether the population is oversampled and may not appreciate being continuously studied. Researchers should consider the ethical obligations outlined in an IRB protocol as well as appropriate compensation.
  7. Designing data collection instruments: What to measure and how. Researchers might use qualitative methods like interviews, surveys, or observation. Quantitatively, they might also use surveys, as well as log data and sensor data (a minimal logging sketch follows this list). It is worth being mindful of how burdensome data collection over long periods can be on participants (e.g., many interviews).
  8. Conducting the field study: Deployment studies are messy and unpredictable, and the researcher may need to make changes during the study. Incremental analysis is necessary to understand what is happening and how to shift goals. The data may be messy.
  9. Ending the deployment: The research team must consider the impact, if any, on participants and community partners. They should consider the impacts the technology has had on participants and the ethics of removing the technology. It can often be awkward to remove the prototype if participants found it useful and are attached. Researchers might consider post-study support for some duration.
  10. Analyzing the data: Data is rich, messy, and there will likely be a lot of it. The authors don’t offer much concrete advice on data analysis, but they do say developing a theory can be challenging.
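
To make item 7 concrete, here is a minimal sketch of a logging instrument a deployed prototype might embed; the event names, fields, and file path are invented for illustration:

    # Append one timestamped JSON record per user event to a local file,
    # to be collected and analyzed later in the deployment.
    import json
    import time

    LOG_PATH = "usage_log.jsonl"  # one JSON object per line (invented path)

    def log_event(participant_id, event, **details):
        record = {"t": time.time(), "participant": participant_id,
                  "event": event, "details": details}
        with open(LOG_PATH, "a") as f:
            f.write(json.dumps(record) + "\n")

    # Example calls sprinkled through the prototype's code:
    log_event("P07", "app_opened")
    log_event("P07", "reminder_dismissed", reminder_id=42)

An append-only, line-delimited format like this tolerates the crashes and interruptions the authors warn are common in deployments, since each record is written independently.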

Research Through Design in HCI - John Zimmerman and Jodi Forlizzi

Research through design is “an approach to conducting scholarly research that employs the methods, practices, and processes of design practice with the intention of generating new knowledge” (pg. 167). Design researchers focus on how design produces new and valuable knowledge.

The authors discuss the disconnect between design and research, and how they often seem to be going in “opposite” directions. RtD does not believe that research is inherently science, and adopts design’s strengths of reflexive practice and reinterpreting problems through making and critiquing artifacts. However, it is more systematic than pure design practice. It is also reflective in its assumptions about the world and requires detailed documentation of decisions and actions made throughout the process. Design research focuses on valuable knowledge gained through design, such as new design methods for designers or artifacts that sensitize a community to a specific space or domain.

The authors state that RtD is “connected to events in the design research community and to the emergence of interaction design as a design discipline distinct from architecture, industrial design, and communication design” (pg. 109). They draw distinctions between Research For Design (research intended to advance design practices) and Research Into Design (research on design practices in practice). Specifically, they trace RtD to researchers at Technical Universities in the Netherlands in the 1990s, who developed a new research space called Rich Interaction based on designing new methods for how people interact with things.

RTD and PD

The origins of the participatory design movement “began in Scandinavia as a reaction to the disruptive force of information technology as it entered the workplace and caused breakdowns in traditional roles and responsibilities” (pg. 172). Participatory design embraces Marxist philosophy and focuses on building new tools and practices through democratic processes. Like RtD, participatory design can be used to generate new processes, practices, methods, and artifacts. However, participatory design seems to have an inherent commitment to democratic teams and rapid prototyping that RtD might not necessitate.

RTD Model

The model developed through interviews and workshops by the authors contributes three forms of knowledge gleaned through RtD work: how, true, and real. They describe these three knowledges as the following: “From engineers, design researchers take “how” knowledge; the latest technical possibilities. From behavioral scientists, they take “true” knowledge, models and theories of human behavior. From anthropologists they take “real” knowledge; thick descriptions of how the world currently works” (pg. 176). Using these knowledges as inputs, design researchers ideate possible visions for inventions that address challenges, opportunities, and gaps in the current world.

The authors’ RtD model illustrates four research outputs:

  1. Produce technical opportunities for engineers to implement and improve on technical advancement. For example, technical aspects users might benefit from, like routine pick-up and drop-off times for parents.
  2. Reveal gaps in current behavioral theory. For example, why parents don’t develop attachments to stories read on an e-Reader, despite developing them with physical books.
  3. Create new situations/practices. This may be done through real-world deployments, for example. The example they provide is the deployment of a real-time transit system which changed riders’ engagement.
  4. Reveal design patterns by making multiple designs around the same problem. For example, testing six different designs around self perception can reveal perspectives other designers can use in their own designs.

RTD Steps

  1. Select: choosing a research project, and whether to focus on a problem or design opportunity. The authors write that the researcher must “select a new material to play with, a context and target population to understand and empathize with, a societal issue or insight, and/or a theoretical framing they wish to apply to interaction” (pg. 185).
  2. Design: After making a selection, the researcher or team moves on to design activities. They should consider what design methodologies to use (e.g., lab, field, showroom). This stage also includes a literature review on existing examples relevant to the design.
  3. Evaluate: After finishing the design, the team can begin an iterative process of critiquing and redesigning. This process involves continuous evaluation and rethinking of the framing of the design. They should document their changes and rationale at every step.
  4. Reflect and disseminate: The team should reflect on what they learned and disseminate the research, whether through peer-reviewed academic articles or through videos or demonstrations. If the work contains a working system, the researchers might deploy it for broader use.
  5. Repeat: Repeat the same process or investigation over again for best results.

Experimental Research in HCI - Darren Gergle and Desney S. Tan

Experimental research is most associated with positivist research, as described in Moses and Knutsen. The purpose is to show how the manipulation of one variable has causal effects on a controlled variable. The elements of experimental research are as follows (a random-assignment sketch follows the list):

  • Causality: Changes in X lead to changes in Y
  • Variables: What the researcher manipulates or measures. Independent variables are manipulated; dependent variables are not. We measure the effects of independent variables on dependent variables.
  • Hypothesis: The predicted relationship between the independent and dependent variables.
  • Random assignment: Participants are randomly assigned to experimental conditions, as an attempt to control for selection bias and increase the likelihood that differences across groups result from the treatment they are exposed to.
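
A minimal sketch of random assignment, assuming a simple two-condition, between-subjects experiment (participant IDs and condition names are illustrative):

    import random

    participants = [f"P{i:02d}" for i in range(1, 21)]
    conditions = ["control", "treatment"]

    random.shuffle(participants)
    # Balanced assignment: alternate conditions over the shuffled list,
    # so each group ends up the same size while order stays random.
    assignment = {p: conditions[i % len(conditions)]
                  for i, p in enumerate(participants)}

    for p, cond in sorted(assignment.items()):
        print(p, cond)

Shuffling and then alternating is one common way to get equal group sizes; independent coin flips per participant would also be random assignment but can produce unbalanced groups.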

Advantages

One advantage is internal validity, which means “the extent to which the experimental approach allows the researcher to minimize biases or systematic error and demonstrate a strong causal connection” (pg. 194). It also allows a researcher to isolate variables to test their effects, as well as build up multiple variable models to test different interactions. Few other methods can claim this.

It is also useful for applying statistical methods for analysis. Further, through systematic testing, experimental methods can be used to advance theory. Replicating experiments builds confidence in findings and can build towards theory or more universal principles.

Limitations

It requires well-defined variables and hypotheses, which can be difficult when there are many uncontrollable factors (as in human data). Also, poor external validity can be an issue. External validity is how valid the results are when extended to other contexts. Experiments can lead to artificial lab settings that do not translate well outside the lab. Statistical tests are often applied erroneously. And finally, a hypothesis cannot be truly proven absolutely; evidence is simply gathered to support it.

Components

  • Hypothesis formation: Development of a statement about the relationship between two or more variables. It must define both variables and the predicted relationship between them, and it must be precise in this defining. Further, it must be meaningful, in that it extends or adds new knowledge. It needs to be testable, such that one variable can be manipulated and confounding factors can be controlled. Finally, it must be falsifiable; you must be able to disprove the hypothesis with empirical evidence.
  • Estimation techniques: Focused on establishing the magnitude of an effect through confidence intervals and effect sizes (a worked sketch follows this list). Bayesian statistics may also be employed. Estimation techniques can be used to answer more sophisticated questions, such as: “How well does it work across a range of settings and contexts?” (pg. 199).
  • Variables: As stated above, you need a dependent variable (or variables) and an independent variable. You might also use control variables and covariates.
  • Independent variables: Manipulated by the researcher; these constitute the key factor being examined. One must be able to establish well-controlled conditions, provide a clear operational definition, and confirm intended effects. Manipulation checks should be conducted to ensure the desired influence on participants. The range of values is also important to consider. Finally, one must choose meaningful variables to study, in terms of being theoretically or practically interesting.
  • Dependent variables: The variable for which outcomes are predicted. Reliability is important to ensure you get the same results every time an experiment is repeated. You must clearly specify the rules for quantifying your measurement and clearly define the scope and boundaries of what will be measured. One must also consider the validity of the variable, of which there are multiple forms. Face validity is the weakest (your measure appears to measure what it is supposed to). Concurrent validity “demonstrates a correlation between the two measures at the same point in time” (pg. 201). Predictive validity “is a validation approach where the DV is shown to accurately predict some other conceptually related variable later in time” (pg. 201). The DV must be sensitive enough to detect differences based on the IV.
  • Control variables: A potential IV is held constant, meant to mitigate fluctuations in an unmeasured variable.
  • Covariate: Additional variables that may influence the value of a dependent variable but are not controlled by the researcher and naturally vary.
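
A worked sketch of the estimation techniques mentioned above: computing Cohen's d and a 95% confidence interval for a difference in means, on simulated task-time data (all numbers are invented):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    control = rng.normal(loc=12.0, scale=2.0, size=30)    # task time (s), simulated
    treatment = rng.normal(loc=10.5, scale=2.0, size=30)

    diff = treatment.mean() - control.mean()

    # Cohen's d: mean difference scaled by the pooled standard deviation.
    n1, n2 = len(control), len(treatment)
    pooled_sd = np.sqrt(((n1 - 1) * control.var(ddof=1) +
                         (n2 - 1) * treatment.var(ddof=1)) / (n1 + n2 - 2))
    d = diff / pooled_sd

    # 95% CI for the mean difference (simple pooled-df approximation).
    se = np.sqrt(control.var(ddof=1) / n1 + treatment.var(ddof=1) / n2)
    t_crit = stats.t.ppf(0.975, n1 + n2 - 2)
    ci = (diff - t_crit * se, diff + t_crit * se)

    print(f"d = {d:.2f}, 95% CI for the difference = ({ci[0]:.2f}, {ci[1]:.2f})")

Reporting an interval and an effect size answers “how big is the effect?” rather than only “is there an effect?”, which is the point of estimation over bare significance testing.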

Classes of Experiments

  • Randomized experiments: Participants are randomly assigned to conditions to avoid bias and ensure parity between experiment groups. These experiments can be single-factor (single IV and single DV) or multi-factor (multiple IVs and a single DV). Between-subjects designs are most common, and the goal is to measure differences between the control group and the treatment group. Within-subjects designs assign participants to all conditions (all levels of the IV) or give them repeated exposure to a single condition. Factorial designs observe multiple independent variables at the same time (a factorial-analysis sketch follows this list).
  • Quasi-experiments: Since true randomization is often hard in HCI research, quasi-experiments are meant to address internal validity threats due to a lack of randomization. “The designs tend to vary along two primary dimensions: those with or without control or comparison groups; and those with or without pre- and post-intervention measures” (pg. 211). The non-equivalent groups design is most common and attempts to measure the effect of some intervention (e.g., in a classroom). Interrupted time-series designs infer effects by taking measures before and after an intervention.
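
To illustrate a multi-factor (factorial) randomized design, here is a hedged sketch of a 2x2 between-subjects analysis with statsmodels; the factors ("device", "layout"), the dependent variable, and the data are all invented:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(0)
    rows = []
    for device in ["phone", "tablet"]:
        for layout in ["grid", "list"]:
            base = 10 - (device == "tablet") + 0.5 * (layout == "list")
            for _ in range(15):  # 15 simulated participants per cell
                rows.append({"device": device, "layout": layout,
                             "task_time": rng.normal(base, 1.5)})
    df = pd.DataFrame(rows)

    # Two-way ANOVA: main effect of each IV plus their interaction.
    model = ols("task_time ~ C(device) * C(layout)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))

The interaction term is what a factorial design buys you: it tests whether the effect of one IV depends on the level of the other.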

MAGIC Criteria

  • Magnitude: Understanding the size of effects being reported and whether it is large enough to have real world impact.
  • Articulation: The degree of detail in the reported research findings. Articulation would support replicability.
  • Generality: The extent to which results expand beyond the context of the study. This includes external validity.
  • Interestingness: Importance of the research findings in terms of theoretical, practical, or novel implications.
  • Credibility: Convinces reviewers and readers the work is trustworthy and performed competently.

Crowdsourcing in HCI Research - Serge Egelman, Ed H. Chi, and Steven Dow

Crowdsourcing is recruiting many people online to contribute small amounts of effort to a larger goal. Many crowdsourcing platforms have arisen that researchers use, which pay small amounts of money for crowdworker participation (e.g., Amazon Mechanical Turk). Wikipedia would also be considered a crowdsourced platform, with many users contributing labor towards factual wiki entries.
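
As one hedged illustration of how such platforms are used programmatically, here is a sketch of posting a micro-task to Amazon Mechanical Turk with boto3; the question XML, reward, and counts are illustrative, and AWS credentials/region configuration is assumed to be set up separately:

    import boto3

    mturk = boto3.client("mturk", region_name="us-east-1")

    # A single free-text question in MTurk's QuestionForm XML schema.
    question_xml = """<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
      <Question>
        <QuestionIdentifier>label</QuestionIdentifier>
        <QuestionContent><Text>Does this screenshot show an error message?</Text></QuestionContent>
        <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
      </Question>
    </QuestionForm>"""

    hit = mturk.create_hit(
        Title="Label a screenshot (one question)",
        Description="Answer one short question about a screenshot.",
        Reward="0.05",                     # payment per assignment, in USD
        MaxAssignments=3,                  # redundant judgments for quality control
        LifetimeInSeconds=24 * 60 * 60,    # how long the task stays available
        AssignmentDurationInSeconds=5 * 60,
        Question=question_xml,
    )
    print(hit["HIT"]["HITId"])

Requesting several assignments per item and aggregating the answers is a common quality-control tactic, echoing the authors’ question about ensuring good results.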

Researchers should also consider which platforms are most appropriate, given the scale of the project or the quality sought. Crowdsourcing is most appropriate for tasks that require little supervision, and directions should be clear. Crowdsourcing may also work less well for qualitative research. The authors offer a number of questions to guide whether crowdsourcing is useful for specific research:

  • Are the tasks well suited for crowdsourcing?
  • If it is a user study, what are the tradeoffs between having participants perform the task online versus in a laboratory?
  • How much should crowd workers earn for the task?
  • How can researchers ensure good results from crowdsourcing?

Sensor Data Streams - Stephen Voida, Donald J. Patterson, and Shwetak N. Patel

Sensors are used to “collect streams of low-level data about people and their environments … Their locations, physiological states, contact with other people, situated uses of devices, and other digital traces can potentially be recorded and analyzed” (pg. 291). This can be done with or without participant knowledge and over the course of time. It can be done quickly and with low overhead, especially as sensors have become cheaper. Sensors can be used in mixed-methods approaches, as the data collected is often quantitative in nature but interpreting it requires qualitative analysis as well.

The use of sensor data in HCI stems from experimental psychology, which attempts to measure and quantify human behaviors. Computer and information sciences have adopted sensor data collection alongside the improvement of sensing devices that can collect data on the researchers’ behalf. The authors compare sensor data collection to “in situ survey methods” such as the experience sampling method (ESM), which involves collecting data as a participant goes about their day.

Sensors are meant to be used to answer questions about activities, behaviors, and practices. Sensors can be deployed on a person, in an environment (like a workplace), or in a home. They can be used to assess frequency of behaviors, duration of technology use, interaction points, etc. The authors offer the following example questions for sensor data collection:

  • Where do people travel over the course of a day?
  • With whom do they normally communicate or collaborate?
  • What tools or information resources do they use at various points during the day?
  • What routines help to define a “typical” or “atypical” day?

Types of Sensors

  • Egocentric Sensor Data Streams: An egocentric unit of analysis focuses on an individual’s data. Example RQ: “How is a person’s mental state or mood affected by real-world stimuli?”
  • Group-Centric Sensor Data Streams: Aimed at answering questions about groups of people; the same signals are captured for multiple individuals. Example RQ: “How often do members of this group interact with one another?”
  • Space-Centric Sensor Data Streams: How spaces are used, regardless of their inhabitants. This is often done in semi-public spaces like museums. It can also be done in private spaces, like homes or living facilities. A space must be outfitted with specific types of sensors for this method. Example RQ: “How are the occupants of a home spending their time throughout the day and night?”

Limitations

Sensor data is not good for answering “why” questions. It cannot explain why people engage in certain behaviors, why spaces are used in certain ways, or how individuals think about their actions. The authors recommend triangulating with other methods to better understand such questions. The gap between the data and what happened in the real world will likely require researcher interpretation.

Further, the phenomenon being measured must be well understood and well defined. Much like surveys, not understanding or defining the constructs to be measured can lead to wasted resources and the data may need to be discarded. Care should also be taken when selecting the type of sensor for the type of data sought.

Also, sensors are limited in the data they can collect. They can be expensive, unreliable, noisy, and limited in range. Setting up sensors may carry a large overhead in time, labor, and in storing, maintaining, and analyzing data. Many sensors also produce a great deal of data, even over short periods of time, so managing large datasets can be challenging.

Best Practices

  • Generating the research questions and planning how to analyze the data streams: Define RQs, units of analysis (e.g., groups), and sensor capabilities. Determine ethical considerations of collecting data with or without participant knowledge or consent. If participants know about the collection, be upfront about scope and duration of the study and try to share data with them throughout. Provide participants with an ability to revoke consent.
  • Building, acquiring, or provisioning the sensors: Consider “cost, availability, technological capability, intrusiveness, or methodological needs” (pg. 302). Researchers must “either choose a sensor(s) that can reliably and accurately sense the desired phenomena or they must construct (and validate) their own custom sensor for this purpose” (pg. 302). Decide whether sensors will be worn by participants or deployed to a space. Intrusiveness of the sensors should be considered.
  • Determining how frequently and at what level of fidelity to collect data samples: The research should assess “how a balance is maintained between the sampling rate of the data streams and storage/bandwidth, processing, and power requirements” (pg. 304). Aggressive sensing can exhaust phone batteries, lead to large data costs, and fill storage.
  • Installing the sensors: Time, labor, and the number of sensors are determined by the size of the building, if the sensors are to be deployed in a space. An alternative to deploying sensors in a large building or across many buildings is a Wizard of Oz approach, where a researcher acts as a human sensor.
  • Storing the data representation: How and where to store the data, driven by the type and fidelity of the data. One can also use multiple storage streams and later aggregate the data into a single stream.
  • Making sense of the collected corpus of data: They state that the first step is usually to classify the data streams into segments or identify particular events in the data to be analyzed. In user modeling, the data is treated much as it would be in machine learning: the researcher identifies which data to use as inputs to a classifier. If the researcher has access to gold-standard data (ground truth), validating the model is easier. Either way, the data should be divided into training, validation, and testing sets (a minimal classification sketch follows this list).
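
A minimal sketch of that classification step with scikit-learn, on simulated windowed features; real work would extract features from the sensor streams and label windows against gold-standard annotations:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.normal(size=(600, 8))     # e.g., per-window accelerometer features (simulated)
    y = rng.integers(0, 3, size=600)  # e.g., {still, walking, cycling} labels (simulated)

    # Hold out a test set, then carve a validation set from the remainder.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)
    print("validation accuracy:", clf.score(X_val, y_val))
    print("test accuracy:", clf.score(X_test, y_test))

(With random features the accuracies hover near chance; with real, informative sensor features the same pipeline yields a usable activity classifier.)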

Survey Research in HCI - Hendrik Müller, Aaron Sedley, and Elizabeth Ferrall-Nunge

Surveys are used to ask a sample of a population questions that can then be generalized to the larger population. There are a wide variety of surveys and ways to collect survey responses. Many surveys are now deployed online, particularly in HCI.

Surveys are an old research method that has been used for censuses, land usage, and military conscription, to name a few. The 19th century’s use of political polls made survey research an attractive method, and it began being used in research in the 20th century. However, scrutiny over major polling prediction failures, as early as 1936, led to doubt in the method. The Likert scale, developed in the 1920s, was meant to minimize questionnaire bias and optimize data collection. Surveys in HCI have been employed to test attitudes, behaviors, and experiences with technologies, even before the advent of the internet.

The authors list a multitude of ways surveys are used in HCI:

  • Gather information about people’s habits, interaction with technology, or behavior
  • Get demographic or psychographic information to characterize a population
  • Get feedback on people’s experiences with a product, service, or application
  • Collect people’s attitudes and perceptions toward an application in the context of use
  • Understand people’s intents and motivations for using an application
  • Quantitatively measure task success with specific parts of an application
  • Capture people’s awareness of certain systems, services, theories, or features
  • Compare people’s attitudes, experiences, etc. over time and across dimensions

Steps

  1. Research goals and constructs: Researchers should ask what they want to measure and how the data could meet research goals. When research goals are deemed survey-appropriate, they should be matched to constructs (unidimensional attributes that cannot be directly observed). Cognitive pretesting can be used to pilot whether participants understand the constructs as intended.
  2. Population and sampling: Determine who and how many people will take the survey. Reaching every person in the population is not only virtually impossible, it is unnecessary. Researchers can approximate the larger population through a sampling frame that is representative of it.
  3. Questionnaire design and biases: Poor questionnaire design can lead to measurement error, a deviation of respondents’ answers from what they truly meant. Any data collected in error would have to be discarded and the survey re-deployed. Further, survey questions can be designed in leading ways, encouraging respondents to answer in a certain direction (e.g., asking how frustrating something is). Some questions also combine multiple variables that do not align with one another, which yields misleading data. The order of questions can also introduce bias. Researchers should try to make questions as simple as possible for ease of understanding.
  4. Review and survey pretesting: At this point, a pilot survey to test the survey constructs is necessary before full deployment. This allows the researcher to ensure the constructs are answered as intended and the questions are clear. Cognitive pretesting involves in-person interviews to go over questions, while field testing involves deploying the survey with the intended sample to test the sampling approach and drop-off rates.
  5. Implementation and launch: When questions are finalized, the survey is ready to be launched using whatever sampling method has also been finalized. The authors also discuss methods for maximizing response rates.
  6. Data analysis and reporting: The final step involves preparing the data for analysis, including cleaning it if necessary. This includes removing duplicate responses, responses that seemed too speedy, and unfinished responses, among others (a minimal cleaning sketch follows). An analysis of closed-ended responses might use descriptive statistics or statistical tests. An analysis of open-ended responses might use thematic coding. Once analysis is completed, the researcher organizes the findings into a write-up.
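
A minimal cleaning-and-description sketch in pandas, assuming a CSV export with invented column names:

    import pandas as pd

    df = pd.read_csv("survey_responses.csv")  # hypothetical export

    df = df.drop_duplicates(subset="respondent_id")  # remove duplicate responses
    df = df[df["duration_seconds"] >= 60]            # drop suspiciously speedy responses
    df = df.dropna(subset=["q1_satisfaction"])       # drop unfinished responses

    # Descriptive statistics for a closed-ended (e.g., Likert) item.
    print(df["q1_satisfaction"].describe())
    print(df["q1_satisfaction"].value_counts(normalize=True))

The 60-second threshold is an invented cutoff; in practice it would be chosen relative to the survey's median completion time.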