Sumser: Google’s AI ethics problem just got even worse
In early December 2020, Timnit Gebru was fired from her job. The co-lead of Google’s AI Ethics Research Department was one of the authors of a paper that questioned some of the company’s most valued business models. Two-and-a-half months later, the other leader of the ethics department, Margaret Mitchell, was also fired.
The team Gebru built at Google is one of the most diverse in AI. It includes leading experts from different backgrounds. Their work often challenges mainstream AI practices. In all sorts of ways, Gebru is pioneering the field of AI ethics.
Gebru earned her credentials as a leading voice in AI ethics research with disciplined hard work. She co-authored a groundbreaking paper that showed facial recognition to be less accurate at identifying women and people of color (Google Photos has had bias issues since at least 2015, when it identified a Black woman as a gorilla).
The firings of Gebru and Mitchell tore the fabric of the tight-knit team—and have reverberated throughout the AI space.
Google’s changing narrative
Google’s story about the firings evolved over time. At first, the company referred to Gebru’s supposed transgressions as “being undiplomatic.” By the time of Mitchell’s firing, the language changed to “insubordination.” As the tech titan pivoted, an undercurrent of “angry Black woman” morphed slightly to “angry women” once Mitchell was included.
In the rapidly evolving Google narrative, new management, a reorganization and a heightened emphasis on improving diversity in the company are the fig leaves being used to try to move the story out of the news cycle. However, the problem Gebru identified involves implicit bias in Google’s core language models. While realigning the culture at Google may be a good thing, it’s a smokescreen for the deeper problems Gebru identified.
The firings have also been an unmitigated PR disaster for the tech giant. From a distance, the story sounds like “Google fires the leadership of its AI ethics team for doing their jobs.” Over the course of about 90 days, the company methodically shredded its reputation in the AI community.
Nearly 7,000 members of the AI community, including about 2,000 Google employees, signed a letter protesting the company’s treatment of its AI Ethics leadership. As a result, it will be harder to recruit and retain people in the AI function and within Google as a whole.
Mistakes always have the capacity to evolve into ever-larger messes. Instead of admitting “we screwed this one up,” Google appears to be masterfully executing its own explosion of negative publicity as it seeks to protect itself rather than adhere to its own code of conduct that mandates it “do the right thing” (formerly “don’t be evil”). It is evidence of a fragile male-dominated technical culture with a bad case of certainty. This is how the Silicon Valley “bro culture” does itself in.
What Gebru actually said
The title of the paper Gebru co-authored with Mitchell and two University of Washington researchers is “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” The paper investigated the language models that underpin Google’s search engine. It explored the risks of the models and approaches to mitigating them.
In its recommendations, the paper suggests:
- executing pre-development testing to evaluate whether the planned approach fits a specific research need;
- carefully examining the impact on stakeholders;
- weighing the impact of models as a part of any design; and
- developing alternatives to ever-larger language models.
Importantly, the researchers caution that current approaches to language modeling do not create systems that understand language and meaning. Rather, they exhibit dexterity in the manipulation of linguistic form. Basically, they produce utterances that resemble those of someone with a mouthful of buzzwords and no understanding of the problem.
Since language models tend to amplify underlying biases, the authors warn of the significant risk of reinforcing harmful language and ideation. The risk ties directly to their concern about training data that is simply hoovered up from everywhere. And the internet is not known for its reasoned discourse, clear explanations and consistent matches between words and meaning. Humans are messy. Language reflects that.
The first and most powerful of the paper’s recommendations concerns the impact of ever-larger language models. The authors conclude that current research patterns have a disproportionate impact on the communities that are least likely to receive benefit from the new tools. There is a strong undertone that language model development is a contemporary form of strip mining—vandalizing less privileged communities with resource extraction. As a result, the system’s underlying data is fraught with all of the biases of the online world, only now the bias becomes part of the rules and models, which guarantees more bias, only faster.
The core of Google’s culture is to thrust ahead and discover operational flaws as the result of execution. This “move quickly and break things” approach is at the heart of many Silicon Valley firms. The paper urges restraint, research discipline, preplanning, consideration of all stakeholders and investigation of alternative approaches.
It’s hard to imagine a more sensible approach. It’s also hard to imagine a more substantive critique of the culture at Google. It is easy to understand why the company’s leadership responded as they did.
Identifying the problems nobody wants to have
AI ethics teams inevitably challenge the status quo. Their job is to ask: What is fair? What could go wrong? What is the right thing to do? In contrast, design teams are constrained by the organization’s culture. They are limited to what is imaginable within their operational framework. The whole point of having an ethics function is to counteract the inevitable groupthink that comes with being a member of an organization.
Well-designed ethics teams, like the one that Gebru and Mitchell established at Google, depend on having an array of perspectives from people who represent lived experiences that are different from those in the rest of the organization, as well as different from each other. This is an attempt to ensure that the organization as a whole does not miss harmful design features simply because they are unimaginable by the design team.
It is inevitable that an ethics team will come into conflict with the organization’s hierarchy. While the function is absolutely essential to the delivery of high-quality AI (particularly in the HR realm), the role can be contentious and the consequences severe.
When an AI ethics team discovers a significant problem, exactly as it is supposed to, the organization faces a challenge of its own. One of the inherent limitations of ML and AI is that, once a system starts learning, you can’t simply extract the bias and move forward. It’s like trying to remove the butter from a cookie. It’s not possible. If the butter is bad, you have to start over from scratch. By identifying a significant ethical concern, the team is likely to create financial consequences for the overall organization.
Google’s kerfuffle is just one of the early examples of a new kind of management problem. Since avoiding the ethics question exposes companies to potential liability for not considering the consequences of their products, they will have to start implementing an oversight function of some kind. Inevitably, those teams will find something. This will be a real test of the integrity of the company.
Hiding from the problem and firing the team doesn’t work. It simply tells the world that you have an ethics problem you can’t manage.
Why we need ethics and oversight boards
AI ethics teams are an emerging necessity. Companies need them to understand how machine intelligence is shaping their business. HR departments need them because decision-making about people needs human oversight. Vendors need them because it’s so easy to miss the sorts of problems that can cause serious harm to people and put companies out of business.
Upon installing your AI ethics team, you can expect to face the same sorts of discoveries that Google did. After all, the job of the ethics team is to discover those sorts of things. Facebook, it turns out, is providing a good model of how to move forward with this new form of oversight.
If you search for “Facebook’s response to the oversight board’s first set of recommendations,” you’ll find a list of the board’s findings. Each recommendation is assessed and the company’s response and action are described. Not all discoveries are acted on, but all of them are assessed and the company’s response is clear.
Developed to question, inform and guide rather than govern, ethics boards are essential as we move forward with AI systems that will make predictions, categorize people and amplify the biases inherent in data, designers, organizations and our world. We need them to help prevent real problems that affect all aspects of our lives. We’re going to have to learn to handle the truth and move forward with integrity and care. With Google, we’re watching what happens when you don’t.