Google I/O 2024 Unveils Gemini AI, Android 15, and Revolutionary Updates

At the end of I/O, Google’s annual developer conference at the Shoreline Amphitheater in Mountain View, Google CEO Sundar Pichai revealed that the company had said “AI” 121 times. That, essentially, was the crux of Google’s two-hour keynote — stuffing AI into every Google app and service used by more than two billion people around the world. Here are all the major updates from Google’s big event, along with some additional announcements that came after the keynote.

Gemini 1.5 Flash and Updates to Gemini 1.5 Pro

Google announced a brand new AI model called Gemini 1.5 Flash, which it says is optimized for speed and efficiency. Flash sits between Gemini 1.5 Pro and Gemini Nano, the company’s smallest model, which runs locally on devices. Google said that it created Flash because developers wanted a lighter and less expensive model than Gemini Pro for building AI-powered apps and services, one that keeps features like the one-million-token context window that differentiates Gemini Pro from competing models. Later this year, Google will double Gemini’s context window to two million tokens, enough to process two hours of video, 22 hours of audio, more than 60,000 lines of code, or over 1.4 million words in a single prompt.
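To put those context-window figures in perspective, here is a rough back-of-the-envelope sketch of the average token cost each one implies. It assumes the quoted capacities (two hours of video, 22 hours of audio, 60,000 lines of code, 1.4 million words) are exact rather than "more than" figures, and the function and variable names are purely illustrative. Real token counts depend on the tokenizer and the content, so treat the results as ballpark averages, not as how Gemini actually counts tokens.

```python
# Back-of-the-envelope math on the two-million-token context window.
# ASSUMPTION: the capacities Google quoted are treated as exact values.
CONTEXT_TOKENS = 2_000_000

CAPACITY = {
    "video_seconds": 2 * 3600,   # two hours of video
    "audio_seconds": 22 * 3600,  # 22 hours of audio
    "code_lines": 60_000,        # "more than" 60,000 lines of code
    "words": 1_400_000,          # "over" 1.4 million words
}


def implied_tokens_per_unit(context_tokens, capacity):
    """Average tokens per unit implied by each quoted capacity."""
    return {unit: context_tokens / amount for unit, amount in capacity.items()}


def estimate_tokens(amount, unit, rates):
    """Rough token estimate for `amount` of a given unit (hypothetical helper)."""
    return amount * rates[unit]


def fits(tokens_needed, context_tokens=CONTEXT_TOKENS):
    """True if the estimated token count fits inside the context window."""
    return tokens_needed <= context_tokens


if __name__ == "__main__":
    rates = implied_tokens_per_unit(CONTEXT_TOKENS, CAPACITY)
    for unit, rate in rates.items():
        print(f"{unit}: roughly {rate:.1f} tokens each")

    # Would a 90-minute video fit in the current and the doubled window?
    video_tokens = estimate_tokens(90 * 60, "video_seconds", rates)
    print("fits in 1M window:", fits(video_tokens, 1_000_000))
    print("fits in 2M window:", fits(video_tokens, 2_000_000))
```

Under these assumptions a word costs a little under one and a half tokens on average, while a second of video costs a couple of hundred, which is why the same window holds far more text than footage.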

Project Astra

Google showed off Project Astra, an early version of a universal AI-powered assistant that Google DeepMind CEO Demis Hassabis described as Google’s version of an AI agent “that can be helpful in everyday life.”

In a video that Google says was shot in a single take, an Astra user moves around Google’s London office holding up her phone and pointing the camera at various things (a speaker, some code on a whiteboard, and the view out a window) while having a natural conversation with the app about what it sees. In one of the video’s most impressive moments, the assistant correctly tells the user where she left her glasses, even though she had never mentioned them before.

The video ends with a twist — when the user finds and wears the missing glasses, we learn that they have an onboard camera system and are capable of using Project Astra to seamlessly carry on a conversation with the user, perhaps indicating that Google might be working on a competitor to Meta’s Ray-Ban smart glasses.

Google Photos

Google Photos was already intelligent when it came to searching for specific images or videos, but with AI, Google is taking things to the next level. If you’re a Google One subscriber in the US, you will be able to ask Google Photos a complex question like “show me the best photo from each national park I’ve visited” when the feature rolls out over the next few months. Google Photos will use GPS information as well as its own judgment of what is “best” to present you with options. You can also ask Google Photos to generate captions to post the photos to social media.

Veo and Imagen 3

Google’s new AI-powered media creation engines are called Veo and Imagen 3. Veo is Google’s answer to OpenAI’s Sora. It can produce “high-quality” 1080p videos that can last “beyond a minute,” Google said, and can understand cinematic concepts like a timelapse.

Imagen 3, meanwhile, is a text-to-image generator that Google claims handles text better than its previous version, Imagen 2. The result is the company’s highest quality text-to-image model with an “incredible level of detail” for “photorealistic, lifelike images” and fewer artifacts, essentially pitting it against OpenAI’s DALL-E 3.

Google Search

Google is making big changes to how Search fundamentally works. Most of the updates announced today, like the ability to ask really complex questions (“Find the best yoga or pilates studios in Boston and show details on their intro offers and walking time from Beacon Hill.”) and using Search to plan meals and vacations, won’t be available unless you opt in to Search Labs, the company’s platform that lets people try out experimental features.

But a big new feature that Google is calling AI Overviews, which the company has been testing for a year now, is finally rolling out to millions of people in the US. Google Search will now present AI-generated answers on top of the results by default, and the company says that it will bring the feature to more than a billion users around the world by the end of the year.

Gemini on Android

Google is integrating Gemini directly into Android. When Android 15 releases later this year, Gemini will be aware of the app, image, or video that you’re running, and you’ll be able to pull it up as an overlay and ask it context-specific questions. Where does that leave Google Assistant, which already does this? Who knows! Google didn’t bring it up at all during today’s keynote.

Wear OS 5 Battery Life Improvements

Google isn’t quite ready to roll out the latest version of its smartwatch OS, but it is promising some major battery life improvements when it arrives. The company said that Wear OS 5 will consume 20 percent less power than Wear OS 4 when a user runs a marathon. Wear OS 4 already brought battery life improvements to smartwatches that support it, but it could still be a lot better at managing a device’s power. Google also gave developers a new guide on conserving power and battery so that they can create more efficient apps.

Android 15 Anti-Theft Features

Android 15’s developer preview may have been available for months, but there are still features to come. Theft Detection Lock is a new Android 15 feature that will use AI (there it is again) to predict phone thefts and lock things up accordingly. Google says its algorithms can detect motions associated with theft, such as someone grabbing the phone and bolting, biking, or driving away. If an Android 15 handset detects one of these situations, the phone’s screen will quickly lock, making it much harder for the phone snatcher to access your data.

There were a bunch of other updates too. Google said it would add digital watermarks to AI-generated video and text, make Gemini accessible in the side panel in Gmail and Docs, power a virtual AI teammate in Workspace, listen in on phone calls and detect if you’re being scammed in real time, and a lot more.