Does the net speak Dogri?

Not yet, but government and industry are trying to make the cornucopia of information on the net available in Indian languages


Taru Bhatia | January 9, 2017 | New Delhi

A 2016 report of the Internet and Mobile Association of India (IAMAI) is a bit of a misnomer: it’s called ‘Proliferation of Indian Languages on Internet’, but these languages account for hardly 0.1 percent of internet content. The internet, as the Chinese and speakers of other languages would agree, is dominated by English, which accounts for as much as 56 percent of the content.

This has serious consequences for India, where barely 10 percent of the people can use English. “We are leaving 90 percent of the people behind,” says Rajat Moona, director-general of the Centre for Development of Advanced Computing (C-DAC), a government research and development organisation. “We’ll not be able to move ahead without them.”

What’s worth noting, though, is that, despite the scarcity of local online content, the country has witnessed a sharp rise in internet usage in rural parts. The number of people with mobile internet access reached 87 million in December 2015 – 99 percent growth over the previous year, according to the IAMAI report. Around 75 percent of the growth in new internet users is expected to come from rural India, and these users will prefer content in local languages, says a Nasscom-Akamai Technologies 2016 report. The growth in internet usage in non-metros and rural areas is owed largely to people buying more mobile phones; though not everyone is going for smartphones, even simple phones these days allow internet usage. Data from the Telecom Regulatory Authority of India (September 2016) indicates growth in wireless subscriptions in Haryana (4.04%), Odisha (2.67%), West Bengal (2.39%) and Assam (2.33%). Delhi, on the other hand, saw its growth rate decline, with only a 0.93% increase in subscribers.

“When you look at usage in local languages, the number is growing 45 percent year-on-year,” says Rakesh Deshmukh, co-founder and CEO of Indus OS, which developed the eponymous operating system in 12 Indian languages. “With initiatives like providing language support, this number will certainly grow in terms of rural versus urban population.”

Recognising this shift, the government, in September, mandated that manufacturers ensure that cellphones support all 22 official Indian languages. The phones should also allow typing in English, Hindi and one more Indian language of the user’s choice. Implementation is expected from July 2017. But what use is that capability without online content in a language the user knows? “Having content in local languages is an important objective of Digital India,” says Ajay Kumar, additional secretary, ministry of electronics and information technology (MeitY). “Unless we have that part in place, most people will remain excluded.”

Hindi, Ahomia anyone?

As the medium of international communication, English dominates the internet and is likely to continue doing so. Equalling that will be impossible, but translation is a way out. Technology Development for Indian Languages (TDIL), a Rs 50 crore-100 crore government programme, is working on machine-translation technology, for both text-to-text and text-to-speech. But machine translation so far is of patchy quality. “It has to be supplemented with human endeavour, and that’s a costly affair,” says Kumar. “But once technology improves, human effort can be minimised.”

C-DAC, for instance, is crowdsourcing the text-to-text translation of central government websites to volunteers. The Rs 15 crore project began in 2013, and so far, 60 websites have been translated. The focus largely remains on Hindi, though. “Some 100 websites have been identified to be made available in Indian languages. But right now we are talking only about Hindi. We are in the process of translating into other Indian languages, for which we have to identify volunteers,” says Moona.

Public contributors register on C-DAC’s localisation project management system application and suggest word meanings. There are also C-DAC-appointed members who log in and contribute. A specialist group then goes through the growing corpus, validates the contributions and finalises choices. “Crowdsourced data helps us create a parallel corpus [of words and meanings] that will improve machine translation,” says Moona. Government agencies – and private players too – are free to use the corpus: the IRCTC and Yatra portals of the railways, Bank of Maharashtra, and Snapdeal are already using it.

“The quality of translation really depends upon the data resource you create, so that the machine can understand every use of the word and what it means in each context. This is a huge task, but as the corpus you build gets better, machine translation will keep on improving,” says Kumar. However, there’s no deadline yet to make all government websites available in Indian languages.

Capital venture

Business has already seen an opportunity and is preparing for the expected surge from rural India. A recent Google ad shows a bespectacled man in shirt and trousers reading a newspaper at a railway station. From among a group of workers squatting a little away comes a voice reading out the very headline the man is scanning. He turns around, irritated, only to have all the headlines read out one after the other. It turns out that one of the workers is getting all his news updates on his smartphone via Google in Hindi. The internet giant’s initiative is one of many that online businesses are taking to attract more users and eventually profit from them.

Snapdeal, an e-commerce platform, has developed a user interface that supports 11 regional languages, including Hindi, Telugu, Gujarati, Tamil and Marathi. “A significant part of Snapdeal’s users come from tier-2 and tier-3 cities and the multilingual interface, developed on the basis of feedback from buyers and sellers, gives us better access to a larger audience and enables everyone across India to explore and transact without any language constraints,” says a Snapdeal spokesperson in an email reply.

Digital wallet companies, gaining popularity post demonetisation, are also seeking to expand their footprint through regional outreach: Paytm, which leads the market with 150 million users, is set to launch its application in ten Indian languages. “Our goal is to make payments and commerce more inclusive, and this new feature will help us expand the market to include users who would prefer their native languages,” says Deepak Abbot, senior vice-president of Paytm. MobiKwik, another digital wallet, has also customised its application for Indian languages. “Apps in regional languages will help them understand the wallet user interface better and form the habit of using wallets,” says Mrinal Sinha, MobiKwik’s chief operating officer.

New writing

Content-based startups are recognising that localisation will gain them a following. In Shorts, an app that compiles and distributes news, was launched in English in 2013. Sensing that growth lay in regional languages, Azhar Iqubal, its founder, decided to add a Hindi version in 2015. Today, the app has been downloaded five million times, with the Hindi version accounting for more than 10 percent. “As we see it, Hindi is going to be the big player in regional languages. Looking at its growth rate, we see our Hindi users overtaking English users,” he says. “I have been getting requests for applications in other regional languages too. In 2017, we will probably explore other languages.”

But the most popular websites continue to use English as the main language. Nikhil Pahwa, founder of MediaNama, a mobile and digital news portal, says content development in regional languages is held up because there’s no revenue model. “The kind of money local content developers are making is far less than what English content developers are making. Advertising is less because fewer people are advertising in local languages on the internet,” he adds. IAMAI notes that of the Rs 179 crore digital advertisement market, only five percent goes into local language ads. By 2020, though, with growth in local content online, that share is expected to grow to 30 percent.

The government’s e-bhasha initiative, under MeitY, is soon to be declared a mission mode project (MMP), which means it will be fast-tracked. “Once that happens, it will be about best practices for web developers to follow. There will be guidelines on how localisation of content is to be achieved and how mobile platforms have to be developed,” says a C-DAC spokesperson.

The hard graft

One stumbling block for content developers and users alike is the non-availability of user-friendly keyboards (and fonts) in Indian languages. Virtual keyboards are being used to bridge the gap, but they can be cumbersome. “Most Indian languages have 55 characters or more. They have many half-letters, diphthongs and so on. How does one type them? Combining them with a smartphone ecosystem is a challenge. At Indus OS, we have largely addressed the issue,” says Deshmukh. But he connects this to the problem of content development: “When you build a technology in Indic languages, lack of content becomes a roadblock. We need a lot of data and information in other languages as well.”

Another is the Sisyphean endeavour of updating translations and following up on changes that websites make from time to time. “Most of the government websites are dynamic – which means they keep changing their content frequently. It’s not as if you translate once and the work is done. It requires continuous updates and continuous management,” says Moona. He also mentions websites that use language as advertisers practise it, with indifferent grammar and wayward punctuation, which is especially difficult to machine-translate. Artificial intelligence and self-learning programmes are expected to reach a level of sophistication that will make human intervention unnecessary. But till then, crowdsourcing is the only way out.

To overcome the hurdle posed by widespread illiteracy, the government partnered with Indus OS in 2015 to develop text-to-speech technology for smartphones for eight languages that in future can work even without an internet connection. Indus OS already has a wide range of applications in English and 12 Indian languages. It has also teamed up with five mobile makers, including Micromax, Karbonn and Intex, to make the software available. But for this to succeed, memory and operating speeds have to be optimised for low-cost smartphones.

Beyond all divides

The trend towards internet usage in regional languages has been recognised as an opportunity by business. The government has to equally recognise that policy must be geared not only to ease things for business, but also to ensure an equitable and inclusive internet for people from across all languages and, needless to say, classes.

(The story appears in the January 1-15, 2016 issue)


