I was first introduced to neural networks in 1990 as an undergraduate. At the time, many people in the AI field were excited about their potential; neural networks were remarkable, but they could not yet accomplish important, real-world tasks. I was excited too! My senior thesis focused on using parallel computation to train neural networks, on the assumption that we only needed 32x more computing power to get there. I was very wrong. As it turned out, we needed about a million times more computing power at the time.
Twenty-one years later, with exponentially more computing power available, it was time to try neural networks again. In 2011, my colleagues and I at Google began training very large neural networks on millions of randomly selected video frames. The results were stunning: the system learned to detect specific objects without any explicit training (most famously cats; the internet is full of cats). This was one of many breakthroughs in artificial intelligence that continue to be made, at Google and elsewhere.
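The core idea behind that result, learning useful visual features from unlabeled data, can be sketched in a few lines. The toy below trains a small autoencoder on unlabeled images in PyTorch; it is a minimal illustration of unsupervised feature learning under assumed toy dimensions, not the original system, which used a vastly larger network and dataset.

```python
# Toy illustration of unsupervised feature learning: an autoencoder
# learns to compress and reconstruct unlabeled images, so its
# bottleneck units come to respond to recurring visual patterns.
# A sketch of the general idea, not Google's 2011 system.
import torch
import torch.nn as nn

# Stand-in for millions of unlabeled video frames: random 32x32 grayscale images.
frames = torch.rand(1024, 1, 32, 32)

encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32, 256), nn.ReLU(),
    nn.Linear(256, 64),            # 64 learned "feature detectors"
)
decoder = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 32 * 32),
)

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5):
    for i in range(0, len(frames), 128):
        batch = frames[i:i + 128]
        recon = decoder(encoder(batch)).view_as(batch)
        loss = loss_fn(recon, batch)   # no labels anywhere: purely self-supervised
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: reconstruction loss {loss.item():.4f}")
```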
I share this personal history to show that while progress in AI may appear unusually rapid right now, it is the product of a long arc of advancement. Indeed, prior to 2012, computers struggled to see, hear, or understand spoken or written language; most of our remarkably fast progress in AI has come in the last ten years.
Today, we’re excited about several recent developments in AI that Google has led, not only on the technical side but also in applying the technology responsibly in ways that benefit people worldwide. That includes incorporating AI into Google Cloud and into our products, from Pixel phones to Google Search, as well as into many fields of research and other human endeavors.
As with any emerging technology, we are mindful of the problems and risks that AI brings. We were the first major company to publish and operationalize a set of AI Principles, and adhering to them has allowed us to focus on making rapid progress on technologies that can benefit everyone (paradoxical as that may seem to some). Getting AI right requires a collective effort involving not just researchers but also domain experts, developers, community members, businesses, governments, and citizens.
Today, we’re excited to share announcements in three areas of AI: first, using AI to make technology accessible in many more languages; second, exploring how AI can boost creativity; and third, advancing AI for Social Good, including climate adaptation.
AI will support 1,000 languages
Language is central to how people communicate and make sense of the world, so it’s no surprise that it’s also the most natural way people interact with technology. Yet roughly 7,000 languages are spoken worldwide, and only a few are well represented online today. This means that standard approaches to training language models on text from the web fail to capture the diversity of how people communicate around the world, and it has long been an obstacle to our goal of making the world’s information broadly accessible and useful.
That is why today we are introducing the 1,000 Languages Initiative, an ambitious commitment to build an AI model that supports the 1,000 most widely spoken languages, bringing greater inclusion to billions of people in marginalised communities worldwide. This will be a multi-year undertaking (some may even call it a moonshot), but we are already making meaningful progress and can see the path ahead.
Technology is evolving rapidly, both in how people use it and in what it can do. People increasingly access and share information through new modalities such as images, video, and speech, and our most advanced language models are multimodal, meaning they can unlock information across these forms. These seismic shifts bring enormous opportunities.
As part of this programme and our focus on multimodality, we have built a Universal Speech Model (USM) trained on more than 400 languages, the broadest language coverage of any speech model to date. As we expand this work, we are partnering with communities around the world to source representative speech data.
For example, by working closely with African researchers and organisations to create and publish data, we recently launched voice typing for nine more African languages on Gboard. In South Asia, we are actively working with local governments, NGOs, and academic institutions to collect representative audio samples covering the region’s many dialects and languages.
AI is helping creators and artists
AI-powered generative models can unlock creativity, helping people from all cultures express themselves through video, images, and design in ways they previously could not.
Our researchers have been hard at work on models that lead the field in quality, producing images that human raters prefer over those from other models. We recently shared important breakthroughs, including applying our diffusion model to video sequences and generating long, coherent videos from a series of text prompts. We can combine these approaches, and today, for the first time, we’re sharing AI-generated super-resolution video.
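One common way such approaches are combined in the research literature is as a cascade: a base text-to-video model produces low-resolution frames, and a separate super-resolution stage upscales them. The sketch below shows only that pipeline shape; the function names and interfaces are illustrative assumptions, with stubs standing in for the actual learned models, not a description of Google’s internal APIs.

```python
# Illustrative cascade: base text-to-video model -> super-resolution model.
# Both "models" below are stubs standing in for learned diffusion models;
# the pipeline shape, not the models, is the point of this sketch.
import torch
import torch.nn.functional as F

def base_text_to_video(prompt: str, num_frames: int = 16) -> torch.Tensor:
    """Stand-in for a base diffusion model: returns low-res RGB frames (T, 3, 64, 64)."""
    gen = torch.Generator().manual_seed(hash(prompt) % (2**31))
    return torch.rand(num_frames, 3, 64, 64, generator=gen)

def super_resolve(frames: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Stand-in for a learned super-resolution stage; here, simple upsampling."""
    return F.interpolate(frames, scale_factor=scale, mode="bilinear", align_corners=False)

low_res = base_text_to_video("a calm river at sunset")
high_res = super_resolve(low_res)          # (16, 3, 256, 256)
print(low_res.shape, "->", high_res.shape)
```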
We’ll also soon bring our text-to-image generation capabilities to AI Test Kitchen, which lets anyone learn about, experience, and give feedback on emerging AI technology. We look forward to hearing feedback on these demos during AI Test Kitchen Season 2. You’ll be able to build themed cities from word prompts with “City Dreamer” and create friendly monster characters that can move, dance, and jump with “Wobble.”
Beyond 2D images, text-to-3D is now possible with DreamFusion, which produces a three-dimensional model that can be viewed from any angle and composited into any 3D scene. Researchers are also making substantial progress with AudioLM, a model that learns to generate realistic speech and piano music by listening to audio alone. In the same way a language model predicts the words and sentences that follow a text prompt, AudioLM predicts which sounds should follow after hearing a few seconds of an audio prompt.
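That next-token framing is easy to sketch: discretize audio into tokens from a fixed codebook, then sample continuations from a causal model, exactly as a text language model would. The toy below uses an untrained, tiny model and an assumed codebook size, so it only illustrates the sampling mechanics, not AudioLM’s actual tokenizer or architecture.

```python
# Toy sketch of audio continuation as language modeling: audio is
# discretized into tokens, and a causal model predicts the next token.
# The model here is untrained and tiny; real systems like AudioLM use
# learned audio tokenizers and large Transformers.
import torch
import torch.nn as nn

VOCAB = 512        # assumed codebook size for discretized audio
CONTEXT = 128      # maximum number of past tokens the model sees

class TinyAudioLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 64)
        self.rnn = nn.GRU(64, 128, batch_first=True)   # causal by construction
        self.head = nn.Linear(128, VOCAB)

    def forward(self, tokens):                          # tokens: (B, T)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                             # logits: (B, T, VOCAB)

@torch.no_grad()
def continue_audio(model, prompt_tokens, num_new=50, temperature=1.0):
    tokens = prompt_tokens.clone()
    for _ in range(num_new):
        logits = model(tokens[:, -CONTEXT:])[:, -1] / temperature
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        tokens = torch.cat([tokens, nxt], dim=1)        # append predicted sound token
    return tokens

model = TinyAudioLM()
prompt = torch.randint(0, VOCAB, (1, 20))               # "a few seconds" of audio tokens
print(continue_audio(model, prompt).shape)              # (1, 70)
```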
As we build these tools, we are engaging with creative communities around the world. For example, we’re experimenting with AI-powered text generation alongside professional writers using Wordcraft, which is built on our state-of-the-art dialogue model LaMDA. The first collection of their stories is available now at the Wordcraft Writers Workshop.
Using AI to address climate change and health issues
AI also has great potential to help people adapt to new challenges and to mitigate the effects of climate change. Wildfires are among the worst of these challenges, affecting hundreds of thousands of people today and increasing in frequency and scale.
Today, I’m pleased to announce that we have expanded our use of satellite imagery to train AI models that identify and track wildfires in real time, helping us better predict how they will evolve and spread. We’ve launched our wildfire tracking system across the United States, Canada, Mexico, and parts of Australia, and since July we’ve covered more than 30 major wildfire events in the United States and Canada, helping inform local communities and firefighting teams with more than 7 million views in Google Search and Maps.
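At its core, this kind of wildfire mapping can be framed as per-pixel classification over multispectral satellite tiles. The sketch below shows that framing with an untrained toy network; the band count, tile size, and threshold are assumptions for illustration, not details of the production system.

```python
# Sketch of wildfire detection as per-pixel classification on satellite
# imagery: a small conv net maps multispectral bands to a fire-probability
# mask. Untrained toy model; band count and threshold are assumptions.
import torch
import torch.nn as nn

BANDS = 7                                   # e.g., visible + infrared channels
model = nn.Sequential(
    nn.Conv2d(BANDS, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),                    # per-pixel fire logit
)

tile = torch.rand(1, BANDS, 128, 128)       # one satellite image tile
fire_prob = torch.sigmoid(model(tile))      # (1, 1, 128, 128)
mask = fire_prob > 0.5                      # pixels flagged as actively burning
print(f"flagged pixels: {mask.sum().item()}")
```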
We’re also using AI to forecast floods, another disaster that climate change is making worse. We’ve already helped communities predict when floods will hit and how deep the waters will get; in 2021, we sent 115 million flood alert notifications to 23 million people over Google Search and Maps, helping save countless lives.
Today, we’re excited to announce that we’re expanding our coverage to more countries in South America (Brazil and Colombia), Sub-Saharan Africa (Burkina Faso, Cameroon, Chad, the Democratic Republic of the Congo, Ivory Coast, Ghana, Guinea, Malawi, Nigeria, Sierra Leone, Angola, South Sudan, Namibia, Liberia, and South Africa), and South Asia (Sri Lanka). To make forecasting work in areas with less data available, we used an AI technique called transfer learning, sketched below.
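In spirit, transfer learning means pretraining a model where data is plentiful and adapting it where data is scarce. The toy below shows that pattern on synthetic data: pretrain the full network, freeze the shared layers, and fine-tune only the head on a small dataset. The features, architecture, and data here are illustrative assumptions, not the production flood-forecasting models.

```python
# Minimal transfer-learning pattern: pretrain on a data-rich region,
# then fine-tune only the final layer on a data-scarce region.
# Synthetic data throughout; illustrates the technique, not the real model.
import torch
import torch.nn as nn

def make_data(n, noise):
    x = torch.randn(n, 8)                       # e.g., rainfall/river-level features
    y = x.sum(dim=1, keepdim=True) + noise * torch.randn(n, 1)
    return x, y

backbone = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 16), nn.ReLU())
head = nn.Linear(16, 1)

def train(params, x, y, steps, lr=1e-2):
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(head(backbone(x)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()

# 1) Pretrain everything on a region with abundant data.
x_rich, y_rich = make_data(5000, noise=0.1)
train(list(backbone.parameters()) + list(head.parameters()), x_rich, y_rich, steps=300)

# 2) Freeze shared features, fine-tune only the head on scarce local data.
for p in backbone.parameters():
    p.requires_grad_(False)
x_scarce, y_scarce = make_data(50, noise=0.3)
final = train(list(head.parameters()), x_scarce, y_scarce, steps=100)
print(f"fine-tuned loss on data-scarce region: {final:.4f}")
```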
We’re also announcing the global launch of Google FloodHub, a new platform that shows when and where floods may occur. In the future, we’ll bring this information to Google Search and Maps as well, to help more people reach safety in flooding situations.
Finally, AI is helping expand access to healthcare in underserved regions. For example, we’re researching how AI can help read and analyse the output of low-cost ultrasound devices, giving expectant parents the information they need to identify issues earlier in a pregnancy. We also plan to continue partnering with caregivers and public health agencies to expand access to diabetic retinopathy screening through our Automated Retinal Disease Assessment tool (ARDA).
Through real-world use and prospective studies, we’ve successfully screened more than 150,000 patients with ARDA in countries such as India, Thailand, Germany, the United States, and the United Kingdom, more than half of them in 2022 alone. We’re also exploring how AI can help your phone detect respiratory and heart rates. This work is part of Google Health’s broader ambition to make healthcare more accessible for anyone with a smartphone.
AI in the years ahead
Our advances in neural network architectures, machine learning algorithms, and new approaches to machine learning hardware have helped AI solve important, real-world problems for billions of people. Much more is on the way. What we’re sharing today is a hopeful vision for the future: AI is letting us reimagine how technology can be helpful. We hope you’ll join us as we explore these new capabilities and use this technology to improve people’s lives around the world.