Google rolls out its most powerful AI models as competition from OpenAI heats up
The logo of the Google I/O developer conference can be seen at the venue in Mountain View, Calif., on May 14, 2024.
Google is using its annual developer conference to showcase what the company is calling its lightest and most efficient artificial intelligence models.
At Google I/O on Tuesday, the company announced Gemini 1.5 Flash, the newest addition to the Gemini model series.
“We heard from developers that they wanted something faster and even more cost effective,” said Demis Hassabis, CEO of Google DeepMind, in a press briefing.
The unveiling comes as tech companies increasingly refocus their product development and rollouts around generative AI. The technology is of particular importance to Google because the new tools give consumers more advanced and creative ways to access online information than traditional web search does.
OpenAI on Monday launched a new AI model and a desktop version of ChatGPT, along with a new user interface. The new model, GPT-4o, is twice as fast as GPT-4 Turbo at half the cost, the company said.
Google also announced an improved Gemini 1.5 Pro model, which can make sense of multiple large documents totaling 1,500 pages or summarize 100 emails, according to Sissie Hsiao, a Google vice president working on Gemini.
Gemini 1.5 Pro will soon be able to handle an hour of video content, or codebases with more than 30,000 lines, Hsiao said.
“You can quickly get answers and insights about dense documents, like figuring out the details of the pet policy in your rental agreement or comparing key arguments of multiple long research papers,” Hsiao said.
OpenAI’s latest upgrade, announced this week, improves the quality and speed of ChatGPT in 50 languages. The new model will also be available through OpenAI’s application programming interface (API), allowing developers to begin building applications with it immediately, executives said.
Google says Gemini 1.5 Pro supports 35 languages and has a 2 million-token context window, a measure of how much information the model can process at once. The new model has improved logical reasoning, planning and image understanding, company executives said.
“It offers the longest context window of any foundational model yet,” Alphabet CEO Sundar Pichai said in the press briefing. At the event, he gave an example of a parent asking Gemini to summarize all recent emails from their child’s school.
Gemini 1.5 Pro will initially be available for testing in Workspace Labs. Gemini 1.5 Flash will be available for testing and in Vertex AI, Google’s machine learning platform that lets developers train and deploy AI applications.