How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms

Burrell, Jenna. “How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms.” Big Data & Society, June 2016, doi:10.1177/2053951715622512.

This article focuses on algorithmic opacity as an issue for consequential decision making (e.g., loan approvals). Opacity means that the "how" and "why" of an algorithmic decision are hidden; the affected party receives only an output. The inputs may also be hidden: which features are used to make decisions, and what data the model was originally trained on. The article defines three forms of opacity: "(1) opacity as intentional corporate or state secrecy, (2) opacity as technical illiteracy, and (3) an opacity that arises from the characteristics of machine learning algorithms and the scale required to apply them usefully."
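To make this input/output asymmetry concrete, here is a minimal sketch (my own illustration, not from the article) of a hypothetical loan-approval model: the applicant receives only an approve/deny output, while the features, training data, and internal logic stay with the institution. The feature count and decision rule below are invented for illustration.

```python
# Hypothetical sketch of the opacity Burrell describes: the decision subject
# sees only an output; features, training data, and model internals stay hidden.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Known only to the institution: 20 undisclosed features and a hidden rule.
X_train = rng.normal(size=(1000, 20))
y_train = (X_train[:, 0] + X_train[:, 3] > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

def decide(applicant_features):
    """All the applicant ever receives: a bare approve/deny output."""
    return "approved" if model.predict([applicant_features])[0] == 1 else "denied"

print(decide(rng.normal(size=20)))  # the 'how' and 'why' remain opaque
```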

Burrell points out that machine learning algorithms are not the only algorithms of concern with respect to opacity. ML may play a central or only a peripheral role in many socially consequential systems.

Forms of Opacity

Opacity as intentional corporate or state secrecy

As mentioned in other articles on transparency and auditing, there are two common reasons for intentionally obscuring an underlying algorithm. One is trade secrecy for business purposes. The other is to prevent gaming or malicious behavior. Pasquale argues that keeping algorithms proprietary for either reason is not necessary but rather a result of lax regulation, and he is skeptical of claims that companies do not embrace opacity precisely for control, manipulation, discrimination, and evasion of regulation. The proposed solution is to make algorithmic code and models available for scrutiny through regulatory means, with the underlying assumption that the possibility of scrutiny would pressure companies to avoid intentional manipulation.

Opacity as technical illiteracy

Writing code and designing ML models are specialized skills largely inaccessible to the general populace. While there have been increasing efforts to diversify computer science education, Diakopoulos also proposes journalists as a potential avenue for reverse engineering algorithms and reporting on them in plain language to the public. Of course, there is also the barrier of educating journalists in code literacy.
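The kind of reverse engineering Diakopoulos has in mind can proceed without access to source code. Below is a minimal sketch of one black-box probing strategy (the `probe` function and the stand-in scoring system are my illustrations, not Diakopoulos's method): vary one input at a time and watch how the output moves.

```python
# Black-box probing: perturb one input at a time to estimate which features
# drive an opaque system's output, without ever seeing its code.
import numpy as np

def probe(black_box, baseline, n_features, delta=1.0):
    """Estimate each feature's influence by nudging it around a baseline."""
    base_out = black_box(baseline)
    influences = []
    for i in range(n_features):
        perturbed = baseline.copy()
        perturbed[i] += delta
        influences.append(black_box(perturbed) - base_out)
    return influences

# Hypothetical opaque scoring system standing in for a real one.
secret_weights = np.array([2.0, 0.0, -1.0, 0.0])
black_box = lambda x: float(secret_weights @ x)

print(probe(black_box, baseline=np.zeros(4), n_features=4))
# [2.0, 0.0, -1.0, 0.0] -- features 0 and 2 matter; 1 and 3 do not
```

Real systems are nonlinear and high-dimensional, which is exactly where this style of probing breaks down; that difficulty is the subject of the third form of opacity.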

Opacity as the way algorithms operate at the scale of application

"Algorithms" are often multi-component systems made up of multiple algorithms, and even those engineers working on one component may be unaware of the workings of another. Audits are increasingly difficult when one must entangle a complex system. ML is often labelled with "the curse of dimensionality" due to the vast number of features used to make systems more accurate, and often less interpretable. The mass scale of data and features often leads engineers to utilize dimensionality-reduction methods like PCA, making the model even more opaque.