Unsurprisingly, AI is front and center at this year’s Google I/O developer conference. The company has just unveiled a more advanced version of Gemini 1.5 Pro, its powerful generative AI model. Available to developers starting today, Gemini 1.5 Pro is a multimodal language model that can work with text, voice, and various other content formats.
The latest updates to Gemini 1.5 Pro introduce an extended context window, enhanced data analysis features, integrations with additional Google apps, and increased customization options. There are also improvements across crucial use cases, such as translation, coding, reasoning, and more.
Gemini 1.5 Flash
Google also introduced Gemini 1.5 Flash, a smaller model optimized for narrower or high-frequency tasks where speed and response time matter most.
Both 1.5 Pro and 1.5 Flash now support a 1 million token context window, and Google plans to expand that to 2 million on 1.5 Pro. Both models are available as a preview to users across 200 countries, with a general rollout expected in June.
Longer context window
One of the headline capabilities of Gemini 1.5 Pro is its expanded context window of 1 million tokens, which is said to be the longest of any consumer chatbot in the world. In practice, that means the AI can take in several large documents totaling as much as 1,500 pages, or summarize approximately 100 emails. Eventually, it will also be able to process an hour of video content or codebases exceeding 30,000 lines.
Google also said it aims to reach a 2 million token context window by the end of this year, further expanding the AI’s capabilities.
Gemini Live
To make the AI feel more natural and intuitive, Gemini is gaining a new Live feature that offers a richer conversational experience. Not only can you talk to Gemini and ask it questions, but it can also react to a variety of sounds in your environment.
As an example, you can use the Live feature to prepare for an interview: Gemini can rehearse with you and suggest key skills to emphasize. Upcoming features will also let Gemini use your camera during Live sessions, so you can discuss your surroundings with it.
Deeper integration with apps
With Gemini 1.5 Pro, Google also wants the AI chatbot to function as a versatile digital assistant that is especially adept at managing daily tasks. To do so, Google is integrating Gemini with Google Calendar, Tasks, and Keep. These integrations are set to roll out soon through the extensions introduced in the Bard platform last year.
This will let users perform actions like summarizing emails in Gmail, accessing Google Docs or Drive, and even uploading images for tasks such as adding events to Google Calendar or items to a shopping list in Google Keep. Gemini’s multimodal capabilities and proposed features, like recognizing school event lists from photos or compiling recipe ingredients into shopping lists, offer a streamlined approach to organizing daily responsibilities.
Google also announced a new AI Teammates feature for Workspace users that lets you deploy virtual coworkers across your company or organization.
Personalized Gems
Gemini Advanced subscribers will soon have the option to craft Gems, tailored versions of Gemini for more personalized interaction. Whether you need a companion while working out, a cooking assistant, a coding collaborator, or a writing mentor, Gems can be customized to suit your preferences.
Simply outline the tasks and desired responses, and Gemini will refine your instructions with a single click, creating a Gem that caters to your unique requirements.
Gemini touched nearly every announcement from the keynote, including updates to Android, Search, Gmail, Google Lens, Google Photos, and more.