This is Atlantic Intelligence, a limited-run series in which our writers help you wrap your mind around artificial intelligence and a new machine age.
Generative AI is famously data-hungry. The technology requires huge troves of digital information—text, photos, video, audio—to “learn” how to produce convincingly humanlike material. The most powerful large language models have effectively “read” just about everything; when it comes to content mined from the open web, this means that AI is especially well versed in English and a handful of other languages, to the exclusion of thousands more that people speak around the world.
But Matteo also explores how generative AI might be used as a tool to preserve languages. The grassroots efforts to create such applications move slowly. Meanwhile, tech giants charge ahead to deploy ever more powerful models on the web—crystallizing a status quo that doesn’t work for all.
Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon—a language spoken by Dossou’s mother and millions of others in Benin and neighboring countries—as “a fictional language.”
This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs between Fon and French, the language in which he is more fluent, to help him communicate with his mother. “When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood,” Dossou told me.
The rise of the internet, alongside decades of American hegemony, made English into a common tongue for business, politics, science, and entertainment. More than half of all websites are in English, yet more than 80 percent of people in the world don’t speak the language. Even basic aspects of digital life—searching with Google, talking to Siri, relying on autocorrect, simply typing on a smartphone—have long been closed off to much of the world. And now the generative-AI boom, despite promises to bridge languages and cultures, may only further entrench the dominance of English in life on and off the web.