It seems AI assistants are antiques, or that’s what Google wants you to believe, for we are in the era of AI agents, and Google I/O 2024 has quickly made that case. Say hello to Project Astra, a generative AI agent with vision, text, and speech capabilities, plus a sprinkling of memory and spatial awareness.
Think of it as eyes for your phone that can make sense of the world around you. Point it at a mathematical equation, and it will solve it. Pointing the camera at a cat? Astra will suggest an apt name for the feline meow-ster. Ask it where you left your earbuds, and if the camera sensor has seen them, it will say something like, “You left them on the sofa.”
Astra can make sense of code appearing on a screen, identify objects and explain what they do, identify buildings, and more. Think of it as Google Lens, but for the entire world: it can make sense of almost anything in front of the camera’s lens.
Unlike with Google Assistant, you don’t need to prompt it. Just point the camera at anything, speak your query, and Astra will answer in natural language. Google says Project Astra will be rolled out via the Gemini app later this year.
An all-seeing AI agent?
If Project Astra sounds familiar, that’s because OpenAI demoed a similar feature for ChatGPT, powered by the new GPT-4o model, just a day ago. OpenAI’s tool is currently in the red-teaming phase for safety testing and will be released in a phased manner, starting with ChatGPT Plus subscribers.
Google won’t say whether Astra will have a price tag. But given the near-instant responses, the visual data crunching involved, and the generative chops required to offer a meaningful answer, the compute requirements make it unlikely that Astra will be served as a free perk.
The closest precedent is the Google One AI Premium subscription, which already puts some of the new Gemini-powered experiences behind a paywall. But so far, Astra looks like the most amazing AI innovation Google has showcased since it wowed the world with Duplex’s capabilities a few years ago.