PhD Student in Information Science at University of Colorado Boulder
Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor
Eubanks, Virginia. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. United States, St. Martin's Publishing Group, 2018.
Eubanks focuses on the predictive tools used to monitor groups of people, for everything from insurance fraud to loan approvals, and how those predictions can have profound effects, particularly on the most disadvantaged. She points out that marginalized groups - people of color, migrants, minority religious groups, sexual minorities, and the poor - are monitored more closely by practices of data collection as they try to access public benefits, cross borders, get healthcare, and exist in highly policed neighborhoods. The data collected becomes a means of reinforcing marginality, which she refers to as "collective red-flagging, a feedback loop of injustice" (pg. 7). This book specifically focuses on how digital tools surveil and police the resources the poor need to survive. Through years of empirical work with both poor recipients of state benefits and those who administer these benefits in courts and offices, she illuminates how automated tools are being implemented quickly, with little to no political discussion about the repercussions.
The majority of the book discusses specific examples of how digital tools fail the poor and put their health and livelihoods at risk. Chapter 2 deals with the loss of healthcare as states roll out automated tools for processing claims. Chapter 3 deals with LA's coordinated entry system for the homeless, which matches unhoused people with appropriate resources by algorithmically ranking their needs. She critiques the invasive questioning used to gather predictive data and the retention of that data, which leaves it available for surveillance. People are asked to disclose risky and illegal behavior, while that same behavior is actively criminalized by police. The model also treats prison as housing, pushing people who have been imprisoned further down the needs list. Further, the system is now the only entry point for obtaining a home in LA. Chapter 4 focuses on the Allegheny Family Screening Tool (AFST), a predictive risk tool for child abuse and neglect. Risk scores are associated with long documented histories with public services, rather than immediate risk. Further, while scores are not meant to be a black-and-white decision tool, they might influence the perspectives of human investigators. She criticizes the notion that a model is less biased than a human caseworker, homeless service provider, or intake caller: "I find the philosophy that sees human beings as unknowable black boxes and machines as transparent deeply troubling" (pg. 168).
Chapter 1: From Poorhouse to Database
This chapter focuses on how the physical poorhouses of the 19th century have morphed into data-based versions today. Eubanks recounts the history of poorhouses, institutions where poor people were sent to do hard labor and were kept in tiny cells with no sanitation or proper beds. Poorhouses were plagued with physical and sexual abuse; the House of Industry in New York even sold the bodies of the deceased to physicians for dissection. Poorhouses required "inmates" to swear a "pauper's oath" that stripped them of the basic civil rights awarded to white men at the time: voting, marriage, and holding office. Poorhouses were inherently designed to eliminate the poor, often through death, and thus instill fear in those who might seek public services.
Given the rise of working-class rebellions in the early 19th century, elites sought ways to attack welfare. This gave rise to a scientific charity movement that sought data-driven methods for separating the deserving from the undeserving. The classification of undeserving was also rooted in eugenic ideology, under which undeserving people were discouraged from reproducing. Eubanks writes that "eugenics created the first database of the poor" where social scientists gathered information about their sex lives, behavior, and intelligence, collecting fingerprints, skull measurements, and descriptive labels. We still rely on classifications of "deserving" versus "undeserving" poor.
She argues that while poorhouses have been demolished since the 50s, modern digital systems of poverty management "retain a remarkable kinship with the poorhouses of the past" (pg. 16). This resemblance involves "punitive, moralistic views of poverty and ... a system of high-tech containment and investigation ... [that] deters the poor from accessing public resources; polices their labor, spending, sexuality, and parenting; tries to predict their future behavior; and punishes and criminalizes those who do not comply with its dictates" (pg. 16). She highlights the moment that the digital poorhouse was born: during the Reagan era, when constitutional welfare rights made it more difficult to apply discriminatory practices, new technologies were instead developed to distribute aid "more efficiently."
Chapter 4: The Allegheny Algorithm
In this chapter, she outlines how human choices in the design process reflect values in different components of the AFST.
Chosen outcome variables reflect what the tool is trying to predict. For the AFST, the outcome of real interest is potential fatalities due to child abuse. However, there is very little statistically meaningful data, because the number of child maltreatment fatalities in Allegheny County is so low. With data this sparse, a model that predicted fatalities directly could not be statistically meaningful. Substantiation from caseworkers is not a great metric either, because such evidence states whether the caseworker believes a child might be harmed, not that they were harmed. Substantiation also varies from case to case: it can reflect anything from a caseworker with a strong suspicion but no concrete evidence to a caseworker trying to get a family access to resources. Instead, the AFST uses two proxy variables - community re-referral and child placement. Therefore, the AFST actually predicts decisions made by the community and decisions made by the agency and family courts, but not harm.
Predictive variables are data within a dataset that are correlated with the outcome variables. For the AFST, the design team used stepwise probit regression to eliminate all candidate predictive variables that were not correlated strongly enough with the outcome variables for the prediction to reach statistical significance. This meant the AFST model went from 287 variables to 131 believed to predict harm.
Validation data is then used to see how well the model performs. For the AFST, the model was tested on over 76 thousand referrals. However, since 70% of those referrals were used to determine the weights of the predictive variables, the model's predictions were actually validated on only the remaining 30% of the cases.
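The 70/30 division described above is a standard holdout split. Below is a minimal sketch of the idea, not the AFST team's actual procedure: the function name `holdout_split`, the random shuffling, and the round figure of 76,000 referrals (standing in for "over 76 thousand") are all assumptions made for illustration.

```python
import random


def holdout_split(n_records, train_fraction=0.7, seed=0):
    """Split record indices into a training set and a held-out test set.

    Hypothetical helper: shuffling with a fixed seed is an assumption,
    not something the book describes.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = round(n_records * train_fraction)
    # The first part fits the variable weights; the rest is never seen
    # during fitting and is used only to check predictions.
    return indices[:cut], indices[cut:]


# 76,000 is an illustrative round number, not the exact referral count.
train_idx, test_idx = holdout_split(76_000)
print(len(train_idx), len(test_idx))  # 53200 22800
```

The point of the split is that performance is only measured on the roughly 22,800 held-out cases, so the headline referral count overstates how much data the validation actually rests on.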
A perfectly predictive model would have a 100% fit, measured as the area under the receiver operating characteristic (ROC) curve. A model with no predictive power would have a 50% fit, meaning its chances of predicting something correctly are no better than a coin toss. The AFST's current fit is 76%, and as of 2016, that would mean about 3,633 incorrect predictions. Eubanks critiques the model for design flaws that limit its accuracy, mainly that it predicts referrals, which are only hypothetical harms, and not real harms. Further, the mere documentation of hypothetical harms in the referral process is subject to bias against the poor, such as a person calling a hotline because they were angry at a parent who was not actually neglectful or abusive (pg. 146). She writes that call referral "is a deeply problematic proxy for maltreatment" because call referrals "introduce the most racial bias into the system [as] the very way the model defines maltreatment" (pg. 155).
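To make the fit figures above concrete, here is a minimal sketch of how the area under the ROC curve can be computed: it equals the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `roc_auc` and the toy scores are hypothetical, and this is not the AFST's own evaluation code. The last lines reproduce the summary's arithmetic by treating 76% as a simple accuracy rate, which is a simplification, since area under the curve is not strictly an error rate.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs where the
    positive example is scored higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): a model that ranks every positive above every negative.
perfect = roc_auc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
# A model that gives everyone the same score carries no information.
coin_toss = roc_auc([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1])
print(perfect, coin_toss)  # 1.0 0.5

# Reading 76% as plain accuracy, 3,633 errors imply roughly this many
# screened referrals (an inference from the two figures in the text).
implied_referrals = 3633 / (1 - 0.76)
print(round(implied_referrals, -2))
```

Under that reading, a fit of 76% sits about halfway between a coin toss and perfect prediction, and the 3,633 errors correspond to roughly 15,000 referrals in a year.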
She uses the term "poverty profiling" for the way poor people are forced to trade their privacy and due process for the resources they need to protect their children. Poor people are targeted not for their behaviors but for the personal characteristic of living in poverty. The AFST "confuses parenting while poor with poor parenting" (pg. 158).
Chapter 5: The Digital Poorhouse
Eubanks outlines the particular characteristics of the digital poorhouse:
1. It's hard to understand.
This is because the models are complex and often secret and proprietary - either to prevent gaming of the system or to protect its targets. Eubanks argues that transparency is crucial to democracy, to understanding why you may be denied benefits, and ideally to obtaining due process.
2. It's massively scalable.
The time and cost needed to scale digital programs are lower than for physical infrastructure.
3. It's persistent.
As digital systems scale, they become harder to decommission, especially as they become entangled in everyday life (and are owned by huge corporations).
4. It's eternal.
Digital data persists in a way that paper records do not, because the physicality of paper itself limits how long records last. The storage of digital data also increases the risk of data breaches and leaks. "The eternal record is punishment and retribution, not justice," she writes (pg. 187).
5. We all live in it.
Neoliberal values that base human worth on the ability to earn a wage mean we are all living in the digital poorhouse. As the middle class shrinks, it is likely that many more people will suffer from digital tools designed to shape our futures eternally. It will make us all less able to recover from poor decisions, bad luck, or targeted discrimination.