
There is a quiet crisis unfolding in the global rollout of artificial intelligence, and it has Africa’s name written all over it, even if AI itself cannot quite read it.
A 2025 study on large language models (LLMs) and African languages found that of the continent’s more than 2,000 languages, only 42 appear in any meaningful way across the AI systems reviewed, and just four (Amharic, Swahili, Afrikaans, and Malagasy) are handled with any degree of consistency. That leaves the overwhelming majority of African voices essentially invisible to the systems now being deployed in hiring, healthcare, finance, content moderation, and public services across the continent. For a region that has the world’s youngest population and is one of its fastest-growing tech markets, this is not a minor footnote. It is a structural problem, and it demands urgent attention from African technologists, policymakers, and entrepreneurs.
The United Nations flagged this dynamic in a recent piece on the rights of indigenous peoples in the AI age, noting that AI systems often reflect biases embedded in the data they are trained on. Unfortunately, this data frequently excludes or misrepresents marginalized communities’ voices and knowledge. The warning applies with particular force in Africa, where the gap between who builds AI and who is affected by it has rarely been wider.
The Language Problem Is Actually a Power Problem
Large language models work well for the 1.52 billion people who speak English, but underperform for speakers of other languages. The main culprit is data: non-English languages lack digital material in the quantity and quality needed to build effective models. In Africa, this is compounded by what researchers call the “low-resource” problem. Most large language models rely on huge volumes of digital data, and the vast majority of this data is in English or a handful of other widely spoken global languages. This leaves AI models struggling to recognise, generate, or meaningfully “see” African languages, no matter how many people speak them.
The downstream consequences are real and immediate. Between January and March 2025, TikTok removed more than 450,000 videos from Kenya alone and banned over 43,000 accounts, with most removals attributed to automated systems that cannot properly process Swahili or other local languages. The same dynamic plays out in AI-powered hiring tools, credit scoring, and health diagnostics. Systems trained on Western data are being deployed for African users, and are quietly failing those users.
The dominance of English, Chinese, and French on search engines and social media limits access to information for speakers of local languages, threatening linguistic diversity and marginalising African voices in an increasingly AI-dependent society. There is also a technical dimension: using non-Latin scripts in popular AI tools costs more. Software breaks sentences down into smaller parts called tokens, and it takes more tokens to write the same sentence in languages that do not use the Latin alphabet. Because usage is typically billed per token, the users who can least afford it end up paying more to process the same amount of text.
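One way to see where the extra cost comes from is to look at how text is stored before it is ever split into tokens. The short Python sketch below is an illustration only, not a measurement of any particular AI product. It assumes a tokenizer that starts from UTF-8 bytes, which is a common design but not a universal one, and the helper name utf8_bytes_per_char and the sample words are chosen here purely for demonstration. It compares how many bytes a five-letter Latin-script word and a three-character Ethiopic-script word occupy.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character in `text` (0.0 if empty)."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


# Illustrative sample words only (hypothetical choices, not from the study).
samples = {
    "English (Latin script)": "peace",
    "Swahili (Latin script)": "amani",
    "Amharic (Ethiopic script)": "ሰላም",
}

for label, word in samples.items():
    raw_bytes = len(word.encode("utf-8"))
    # A byte-level tokenizer begins with one unit per byte and then merges
    # frequent sequences, so scripts that are rare in the training data
    # tend to stay closer to this raw byte count.
    print(
        f"{label}: {len(word)} characters, {raw_bytes} bytes, "
        f"{utf8_bytes_per_char(word):.1f} bytes per character"
    )
```

Each Ethiopic character takes three bytes where each Latin letter takes one, so the Amharic word starts out with nearly twice the raw material of a longer English word. Latin-script languages such as Swahili start from the same byte count as English, so any extra token cost there would come from how little of the language the tokenizer saw during training rather than from the script itself.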
Data Sovereignty Is the New Oil
If data is the new oil, Africa is again at risk of watching its reserves extracted with minimal local benefit. A key concern raised at the inaugural Global AI Summit on Africa in Rwanda is that current large language models are not built on African data, and that much of the continent’s data is owned by foreign entities, especially big tech firms.
Nigeria is pushing back. The Nigeria Digital Sovereignty and Fair Data Compensation Bill, 2025, seeks to establish a framework for Nigeria’s digital sovereignty and AI governance, requiring foreign digital companies to contribute fairly to the Nigerian economy and preventing the unregulated extraction of Nigerian data (IAPP). It is a bold move, framing data not as a commercial by-product but as a sovereign resource, and it has implications for the entire continent.
The African Union’s 2024 AI Strategy outlines fifteen recommendations for member states, from the creation of national roadmaps to investment in shared infrastructure and regulatory harmonisation. At a ministerial dialogue held alongside the 2025 Global AI Summit, African leaders delivered a clear message: AI must not simply be imported, but built with Africa, by Africa, and for Africa.

The Opportunity Is Real, If Africa Moves Now
The picture is not entirely bleak. There is a growing movement of African researchers and developers who are refusing to wait for Silicon Valley’s attention. The African Next Voices project, funded by a $2.2 million grant from the Gates Foundation, represents the largest AI-ready language data creation initiative for multiple African languages to date, recording 9,000 hours of speech across 18 languages in Nigeria, Kenya, and South Africa. This is exactly the kind of initiative that reframes the continent’s role from passive data source to active architect.
Research groups like AfricaNLP are producing multilingual datasets, benchmarks, and models for African languages, with recent work including hate speech detection in Hausa and Igbo, Swahili news classification, and speech recognition for low-resource languages (Global Voices). These efforts matter not just technically, but politically: they demonstrate that African communities can define the terms of their own digital representation.
The UN’s framing of this as a rights issue is worth taking seriously. Just as indigenous peoples globally have asserted the right to control their knowledge, culture, and data in the face of extractive technologies, African communities — many of whom are themselves indigenous peoples — must demand the same. Meaningful inclusion in AI development is not charity; it is a matter of digital self-determination.
Africa stands at a pivotal juncture: it can move from being largely a consumer of technologies to being an active architect of its own digital destiny. The continent has the talent pipeline, the policy momentum, and increasingly the data infrastructure to change the equation. What it cannot afford is to sleepwalk into another era of extraction dressed up as partnership.
The AI age is being written right now. The question for African technologists, regulators, and entrepreneurs is whether they will be authors of that story or merely characters in someone else’s.