Google For India 2022: Using AI To Improve Indian Languages


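From the most extensive attempt to understand several Indian dialects, to better comprehending farmlands, to digitising prescriptions with only a smartphone: here is how Google Research in India is applying AI to some of the country’s most pressing problems.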

When Google Research in India was founded three years ago, we set out to advance fundamental computer science and artificial intelligence research by assembling a strong team and collaborating with the country’s research community, and then to apply this research to some of India’s most pressing problems.

It’s been an interesting and fulfilling journey with our partners, from developing solutions for Indian language speakers, to enabling improved maternal healthcare, to assisting in the more efficient diagnosis of preventable diabetic blindness.

We are highly motivated by the impact of this work in each of these sectors. Solving for India means solving for areas as diverse as our country of over a billion people. Today, we are delighted to share how AI and machine learning are helping deliver more relevant and scalable solutions in three important areas: (i) Language, (ii) Agriculture, and (iii) Health.

Using artificial intelligence to make the web more accessible to Indians in their language

Language technology is one such area that can significantly narrow the information divide, and we are on a mission to remove barriers so that every Indian can benefit from the life-changing potential of the internet in their preferred Indian language.

However, language poses a greater challenge in the Indian context, where linguistic diversity lies at the very core of our civilization. Even the same language can sound very different depending on where you are: in Muzaffarpur, for example, five dialects are spoken! To build an effective AI-based language model that understands all of these intricacies, researchers must account for these linguistic variations.

A few months ago, we announced an ambitious worldwide research initiative to build a model that handles the world’s top 1,000 languages, to which Google Research India will contribute substantially alongside the global research team.

We have partnered with the Indian Institute of Science (IISc), whose team will gather anonymized speech data from people in 773 districts, reflecting variations in gender and age, as well as a range of educational and socioeconomic backgrounds.

This collaboration, known as Project Vaani, aims to gather and transcribe open-source speech data from all 773 districts in India, with the goal of making it available in the future through the Government of India’s Bhashini initiative. This means that anybody developing language solutions for India, whether companies, developers, or students, will be able to use this highly diverse speech data to create technology that reflects how each Indian speaks their native language.

We’re also excited to announce that our team at Google Research in India, in collaboration with our global teams, has already begun work on the next significant step in developing more effective Indian language models. We’ve set an ambitious objective of creating a single, unified model that can handle over 100 Indian languages in both speech and text. This will pave the way for many more Indian language speakers to have a significantly more inclusive experience, and we look forward to sharing more in the coming years.

Creating fundamental technologies for India’s Digital Agriculture vision

With agriculture providing a livelihood for about half of India’s population, technological breakthroughs in this sector can have far-reaching consequences. However, several challenges remain, owing primarily to a lack of actionable information, as well as the sheer size of our country and the relatively small farms that characterise its agricultural landscape.

Access to agricultural data at the field level will allow decision-makers to better help farmers and make agriculture more sustainable, both financially and in terms of climate change. Gathering this data is currently a manual process that is sometimes prone to errors and bias, from determining the extent of access to water resources to calculating the area under cultivation for a given crop.


We announced today that we are developing a model that can help build a holistic picture of India’s agricultural landscape by combining our advanced AI and ML capabilities with remote sensing technologies. In the coming months, the model will be able to identify specific events for each field, such as when a crop was sown or harvested.

This project will also support AgriStack and other agricultural ecosystem solutions in India, with an emphasis on identifying farm-level landscapes and farm boundaries, as well as potentially identifying the crops cultivated in each field. This data will help to establish a publicly available dataset for enabling digital public goods and services, as well as encourage innovation throughout the agriculture value chain.

Using AI to improve healthcare one prescription at a time

The basic prescription is the common building block that underpins much of public and private healthcare. While prescriptions are often handwritten and difficult to read, the information they contain is critical for both patients and healthcare practitioners, whether for early diagnosis or self-management.

You may be wondering: we’ve had the technology to extract text from photographs for decades, so what’s new, and what sets prescriptions apart? Ironically, the same thing that makes prescriptions difficult for computers to digitise also makes them difficult for you and me to comprehend: they’re unstructured, written in shorthand, and full of cues that pharmacists must decode.

Today we revealed a cutting-edge AI and machine learning model that can recognise and highlight medications within handwritten prescriptions. This will serve as an assistive technology for digitising handwritten medical records, complementing humans in the loop such as pharmacists; no decision will be made solely on the basis of the output of this technology.

While this technology is still in development, we look forward to providing more information about its wider adoption.

The transformational power of AI is obvious, but even more important is the need to develop it ethically. While we published our AI principles in 2018, we believe that getting this right will require a collaborative effort with the entire ecosystem.

As a result, we are pleased to announce that Google has made a $1 million donation to the Indian Institute of Technology, Madras, to build the first interdisciplinary centre for Responsible AI. This centre will encourage a collaborative effort involving not just researchers but also domain experts, developers, community members, policymakers, and others to get AI right and localise it to the Indian context.
