Nik Dholakia wrote in his recent research proposal, AI and Global South: “India will be a great market for LLMs because India has 22 major languages.”
These kinds of statements seem promising in tech because tech has conventionally evolved for developed countries, where "ACCESS" isn't a big issue. But when we approach the Global South (Africa, India, or any other developing region, which together hold about 70% of the world's population), it's essential to think carefully about ACCESS.
Let's understand this issue in the context of India.
Smartphone users = 70 crore (37 crore on 5G connections)
WhatsApp users = 50 crore
YouTube users = 40 crore
Wired Wi-Fi connections = 4.5 crore
Estimated data usage (based on TRAI, Nokia, Ericsson, etc.):
1 GB per day = 12 crore users
2 GB per day = 6 crore users
People who scroll feeds or watch YouTube regularly number no more than 6 crore.
That's why even the most popular Indian YouTubers, like Dhruv Rathee, average only up to 1 crore views.
Data consumption isn't uniform: the top 10 cities of India consume most of the data.
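To put the figures above on one scale, here is a minimal sketch that expresses each group as a share of smartphone users. It assumes, for illustration only, that every group is a subset of the 70 crore smartphone users; the source figures don't state this, and the labels are mine.

```python
# Figures in crore (1 crore = 10 million), copied from the list above.
# Assumption: each group is treated as a subset of smartphone users.
funnel = [
    ("Smartphone users", 70),
    ("WhatsApp users", 50),
    ("YouTube users", 40),
    ("1 GB/day data users", 12),
    ("2 GB/day data users", 6),
]

base = funnel[0][1]  # smartphone users are the reference group
shares = {label: count / base for label, count in funnel}

for label, count in funnel:
    print(f"{label:<20} {count:>3} crore  {shares[label]:6.1%} of smartphone users")
```

On these numbers, the 2 GB/day group comes to under 9% of smartphone users.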
Around 40% of Instagram creators (with ~50K followers, whose reels get recommended ~80% of the time) come from Delhi NCR and Mumbai. The top 5 cities account for ~65% of creators, and their content largely targets the top 20 cities, which drive ~60% of India’s e-commerce sales.
If you look at how language speakers are distributed in India:
Hindi : 53 crore (India), 5 crore (Top 20 cities)
Tamil : 7 crore (India), 1 crore (Top 20 cities)
English : 8 crore (India), 6 crore (Top 20 cities)
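The contrast is easier to see as a ratio. The sketch below computes what fraction of each language's speakers live in the top 20 cities, using only the figures listed above; the dictionary layout and function name are illustrative.

```python
# Speaker counts in crore, copied from the list above.
speakers = {
    "Hindi": {"india": 53, "top20": 5},
    "Tamil": {"india": 7, "top20": 1},
    "English": {"india": 8, "top20": 6},
}


def top20_share(counts):
    """Fraction of a language's speakers who live in the top 20 cities."""
    return counts["top20"] / counts["india"]


shares = {lang: top20_share(counts) for lang, counts in speakers.items()}

for lang, share in shares.items():
    print(f"{lang}: {share:.0%} of speakers live in the top 20 cities")
```

This works out to roughly 9% for Hindi, 14% for Tamil, and 75% for English, so English speakers are far more concentrated in the cities where data consumption and e-commerce are concentrated.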
For most queries, functional English is enough to use ChatGPT effectively.
ChatGPT can also handle Indian languages reasonably well, especially when paired with RAG for internal or non–consumer-facing use cases. Full-scale custom language training is usually only needed for consumer-facing apps like ChatGPT, where users expect fluent, natural, and culturally relevant responses in their native languages.
Let's look at ChatGPT's usage statistics:
Daily Active Users = 3.5 Crore
Average Prompts per User per Day = 15+
The concept of "DECOLONIAL TECH" rests on a fundamental assumption: that ACCESS has no COLONIAL structure.
When it comes to good-quality education, whether medical, engineering, MBA, etc., almost 80% of the top 200 colleges are in the top 20 cities, so if you want social mobility, you need to follow their cultural norms.
M. N. Srinivas, an early Indian sociologist, defined this issue as:
Sanskritisation: adopting upper-caste language
Brahminisation: adopting upper-caste cultural practices
Westernisation: learning global Western norms
Technology and modern education allow lower castes to skip Sanskritisation/Brahminisation and go straight to Westernisation, just like the upper castes, which spares them a double cultural oppression.
Technology can’t decolonise anything unless political, economic, educational, and other “key institutions” are sufficiently decentralised. There can’t be any inclusion without equity.