How AI Bias Actually Occurs (and Why It’s Difficult to Fix)
May 22, 2019 · 8-minute read

Today, the overwhelming majority of artificial intelligence (AI) applications are made possible through deep learning.
This subset of machine learning relies on the process of applying deep neural network architectures to make decisions or solve problems. Basically, multiple algorithm layers are applied to analyze a problem and produce a “probability vector,” which might say something like: “95% confident the object is a human, 25% confident the object is a fruit”.
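To make the "probability vector" idea concrete, here's a toy sketch (not a real network): a multi-label classifier's final layer produces one raw score per label, and an independent sigmoid squashes each into a confidence. The logit values below are made up to roughly reproduce the article's "95% human, 25% fruit" example; because each confidence stands alone, they need not sum to 100%.

```python
import math

def sigmoid(x: float) -> float:
    """Squash a raw score (logit) into a confidence between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw scores from a network's final layer, one per label.
logits = {"human": 2.9, "fruit": -1.1}

# Independent sigmoids yield a multi-label "probability vector":
# each entry is a standalone confidence, so the values need not sum to 1.
probability_vector = {label: sigmoid(score) for label, score in logits.items()}

print(probability_vector)  # roughly 95% human, 25% fruit
```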
You can think of deep learning as machine learning on steroids. Deep neural networks enhance machines’ abilities to identify even the smallest of patterns. And this pattern-finding augmentation is behind the biggest advancements in AI today.
But with great power comes great responsibility. For all the benefits that deep learning offers, it also has vast potential to cause catastrophic damage across industries due to bias. And being aware of this bias isn’t enough; we must understand the mechanics of how it occurs to properly address it.
How AI Bias Creeps In
Bias can enter the equation during numerous stages of the deep learning process. It doesn’t always boil down to using biased training data. In reality, bias can actually start forming before the data is even collected! Let’s cover two key stages where bias can creep in.
Framing Your Problem
When a team of machine learning developers sets out to create a deep learning model, they first have to decide upon what they actually want to achieve with it. For example, a credit card company may want to use AI to help predict a customer’s creditworthiness.
But “creditworthiness” is a nebulous (and subjective) concept. To translate it into a discrete category that can be computed, the credit card company uses its priorities to guide the solution. Maybe the company wants to maximize the number of loans getting repaid. Or maybe it wants to optimize its profit margins. Regardless of which, this context informs how the company defines creditworthiness.
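Here's a minimal sketch of how the same fuzzy concept can be translated into two different computable labels, depending on the company's priorities. The field names, thresholds, and helper functions are all hypothetical, purely for illustration.

```python
# Two hypothetical ways to turn "creditworthiness" into a computable label.

def label_by_repayment(customer: dict) -> bool:
    # Priority: maximize the number of loans that get repaid.
    return customer["loans_repaid"] / max(customer["loans_taken"], 1) >= 0.9

def label_by_profit(customer: dict) -> bool:
    # Priority: maximize margin (interest and fees collected).
    return customer["lifetime_profit"] > 500

# A reliable repayer who generates little revenue: "creditworthy" under
# one definition, not under the other.
customer = {"loans_taken": 5, "loans_repaid": 5, "lifetime_profit": 120}
print(label_by_repayment(customer), label_by_profit(customer))  # True False
```

The point of the sketch: the training labels themselves encode a business priority before any data is collected, and a model trained on one definition can behave very differently from one trained on the other.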
Data Preparation
Bias can also be introduced when you’re preparing the data for training. This stage relies on your AI team selecting the attributes you want the algorithm to consider and prioritize. This selection of attributes is known as the “art” of deep learning, and it can have a substantial impact on your model’s prediction accuracy.
In terms of creditworthiness, attributes could be the customer’s income, age, or the number of loans successfully paid off. In the case of Amazon’s recruiting debacle, attributes could have been the job candidate’s gender, experience, and education.
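Attribute selection often amounts to choosing which columns the model is allowed to see. The sketch below uses an invented applicant record; note that dropping an explicitly sensitive field doesn't guarantee fairness, since a seemingly neutral field can act as a proxy for it.

```python
# Hypothetical applicant record; field names are illustrative only.
applicant = {
    "income": 54_000,
    "age": 31,
    "loans_repaid": 3,
    "zip_code": "98101",  # looks neutral, but can proxy for race or income
    "gender": "F",
}

# The "art" of data preparation: deciding which attributes the model sees.
selected_attributes = ["income", "age", "loans_repaid"]

feature_vector = [applicant[name] for name in selected_attributes]
print(feature_vector)  # [54000, 31, 3]
```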
While an attribute’s impact on accuracy is easy to see and measure, its effect on bias is difficult to identify.
Why Bias Is So Hard to Address
So, why is bias so darn difficult to identify and address? Let’s cover three of the biggest challenges in mitigating it.
The Root of the Problem Isn’t Readily Apparent
During your deep learning model’s construction, the downstream impact of your data and choices isn’t easy to see. Thus, the introduction of bias isn’t always obvious, and retroactively identifying where it came from (and how to get rid of it) can be difficult.
When Amazon’s engineers initially discovered their recruiting tool’s sexist behavior, they tried to fix it by reprogramming it to ignore explicitly gendered words like “men’s” or “women’s.” But soon after, the team found that the updated system was still using implicitly gendered words to make decisions.
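A toy sketch of why that kind of fix falls short (the word list and résumé text are invented, and Amazon's actual system was far more complex): scrubbing explicit gender terms from the input leaves correlated proxy signals untouched, so a model can rediscover gender indirectly.

```python
# Hypothetical word filter -- the kind of surface-level fix described above.
EXPLICIT_GENDER_TERMS = {"men's", "women's", "male", "female"}

def scrub(resume_text: str) -> str:
    """Drop explicitly gendered words from a resume."""
    words = resume_text.lower().split()
    return " ".join(w for w in words if w not in EXPLICIT_GENDER_TERMS)

resume = "captain of the women's chess club; executed projects; softball team"
print(scrub(resume))
# The explicit term is gone, but proxies (activity names, verb choices,
# college names) still correlate with gender, so the signal survives.
```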
Flawed Training & Testing Processes
Deep learning models are heavily tested for performance before deployment. In theory, this would seem like the perfect chance to catch any present biases. But in practice, the standard testing process can let bias slip right through.
After collecting data, AI teams typically split this information into two groups: one for training and one for testing. So if bias is present in the data, it’s present in both of these groups. The data the team uses to test the model’s performance contains the same biases that the model was trained with. So the test will not pick up on skewed results.
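This is easy to demonstrate with a sketch. The dataset below is fabricated to mirror a historical skew (90% of past "hired" examples are men); a standard random train/test split preserves that skew in both halves, so evaluating on the test set can't reveal it.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Fabricated resume dataset with a strong historical skew:
# 90% of the positive ("hired") examples are men.
data = [{"gender": "M", "hired": True}] * 900 + \
       [{"gender": "F", "hired": True}] * 100

# The standard procedure: shuffle, then split 80/20 into train and test.
random.shuffle(data)
split = int(len(data) * 0.8)
train, test = data[:split], data[split:]

def male_share(rows):
    return sum(r["gender"] == "M" for r in rows) / len(rows)

# Both halves inherit roughly the same 90% skew, so a model that learns
# "prefer men" from `train` will look accurate when scored on `test`.
print(round(male_share(train), 2), round(male_share(test), 2))
```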
How Is Fairness Defined?
Because we know what bias looks like, we also know what fairness looks like, right? Well, not exactly. It’s not clear what the absence of bias should look like. And this topic isn’t confined to computer science — it’s a debate that has raged on for centuries in law, philosophy, and social science.
But things get more complicated when we come back to computer science. In this case, bias must be defined mathematically. And researchers have found that there are numerous mathematical fairness definitions that happen to be mutually exclusive.
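Here's a small fabricated example of two such definitions pulling in opposite directions. "Demographic parity" asks for equal approval rates across groups; "equal opportunity" asks for equal approval rates among the people who would actually repay. With the made-up decisions below, the second criterion is satisfied perfectly while the first is badly violated — and when the groups' underlying repayment rates differ, you generally cannot satisfy both at once.

```python
# Fabricated loan decisions for two groups, A and B.
# Each row: (group, actually_repaid, model_approved)
rows = [
    ("A", True,  True),  ("A", True,  True),
    ("A", False, True),  ("A", False, False),
    ("B", True,  True),  ("B", False, False),
    ("B", False, False), ("B", False, False),
]

def approval_rate(group: str) -> float:
    """Demographic parity compares this across groups."""
    g = [r for r in rows if r[0] == group]
    return sum(r[2] for r in g) / len(g)

def true_positive_rate(group: str) -> float:
    """Equal opportunity compares this: approvals among actual repayers."""
    qualified = [r for r in rows if r[0] == group and r[1]]
    return sum(r[2] for r in qualified) / len(qualified)

print(approval_rate("A"), approval_rate("B"))            # 0.75 vs 0.25
print(true_positive_rate("A"), true_positive_rate("B"))  # 1.0 vs 1.0
```

Every repayer in both groups gets approved (equal opportunity holds), yet group A's overall approval rate is triple group B's (demographic parity fails) — purely because the fabricated base rates differ.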
To make matters worse, other fields understand that the definition of fairness can change with time and context. But in computer science, many industry players believe that fairness should be fixed. “By fixing the answer, you’re solving a problem that looks very different than how society tends to think about these issues,” explains Andrew Selbst, a Postdoctoral Scholar at the Data & Society Research Institute.
What Now?
If you’ve come to the conclusion that bias is extremely difficult to fix, you’re not alone. “Fixing discrimination in algorithmic systems is not something that can be solved easily,” says Selbst. “It’s a process that is ongoing, just like discrimination in any other aspect of society.”
Fortunately, many AI developers and researchers are hard at work on various solutions. Some are even building algorithms to help detect and mitigate biases that may be hidden away in training data.
Do you think this big problem of AI bias can be properly addressed? And if so, what do you think the best solution is? Let us know in the comments!