
Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Google has released two updated AI models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, featuring enhanced performance, faster outputs, and significantly reduced costs. These models improve upon the Gemini 1.5 series with a focus on text, code, and multimodal tasks, making them highly versatile and accessible for developers through Google AI Studio and the Gemini API.

Google has introduced two updates to its Gemini 1.5 model series: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. Continuing the momentum from previous releases, these models bring significant improvements in speed, efficiency, and cost, with faster outputs, lower latency, and more affordable pricing for use cases ranging from long-context text synthesis to advanced multimodal applications. Both models are available through Google AI Studio and the Gemini API, while larger organizations and Google Cloud customers can also access them on Vertex AI.

The Gemini 1.5 models are designed to perform exceptionally well across various text, code, and multimodal tasks. These capabilities allow them to handle complex tasks like processing 1,000-page PDFs, answering questions about large code repositories, and analyzing hour-long videos. The latest updates to Gemini 1.5 Pro and Flash build on these strengths, with significant improvements in performance metrics and model efficiency. For example, both models have seen a ~20% boost in math benchmarks and 7% better results in the MMLU-Pro benchmark, positioning them as top performers in their class.

To make the models more developer-friendly, Google has shortened default output length by 5-20%, keeping responses concise without sacrificing accuracy. The models also generate output faster and with drastically lower latency, letting developers work more efficiently at scale.
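To see what controlling output length looks like in practice, the sketch below builds a request for the Gemini API's REST `generateContent` endpoint with an explicit `maxOutputTokens` cap. This is a minimal illustration only: it assembles the URL and JSON body without making a network call (a real request also needs an API key), and the prompt is a placeholder. The endpoint path and field names follow the public Gemini API REST reference.

```python
import json

# Public REST base for the Gemini API (v1beta), per the API reference.
API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"

def build_generate_request(model: str, prompt: str, max_output_tokens: int = 256):
    """Build the URL and JSON body for a generateContent call.

    The payload shape (contents -> parts -> text, plus generationConfig)
    mirrors the Gemini API reference. Sending it still requires an API key.
    """
    url = f"{API_BASE}/{model}:generateContent"
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        # Cap output length explicitly, e.g. when even the 5-20% shorter
        # defaults are not the right fit for your application.
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }
    return url, json.dumps(body)

url, body = build_generate_request("gemini-1.5-flash-002", "Summarize this PDF.")
print(url)
```

The same payload works for either model; swapping `gemini-1.5-flash-002` for `gemini-1.5-pro-002` is the only change needed.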

Some of the key features highlighted in the official post are:

  • Massive Cost Reduction: Gemini 1.5 Pro now offers a 64% price reduction for input tokens and a 52% reduction for output tokens for prompts under 128K tokens, driving down the cost of development.
  • Increased Rate Limits: Paid tier rate limits are doubled for 1.5 Flash (up to 2,000 RPM) and tripled for 1.5 Pro (up to 1,000 RPM).
  • Speed and Latency Improvements: Both models are now 2x faster and deliver 3x lower latency, allowing for quicker and more efficient processing.
  • Multimodal Capabilities: Enhanced support for complex tasks like video understanding, large-scale document synthesis, and code generation.
  • Developer-Controlled Filters: Updated safety filters allow developers to configure the models to best suit their needs, with filters no longer applied by default.
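On the last point, safety filters are configured per request. The sketch below assembles the `safetySettings` portion of a request body, assuming the harm-category and threshold enum strings documented in the Gemini API reference; it only builds JSON and does not call the API.

```python
import json

def safety_settings(threshold: str = "BLOCK_ONLY_HIGH"):
    """Return one safety-setting entry per harm category.

    Category and threshold names follow the Gemini API safety-settings
    reference (e.g. BLOCK_NONE, BLOCK_ONLY_HIGH, BLOCK_MEDIUM_AND_ABOVE).
    """
    categories = [
        "HARM_CATEGORY_HARASSMENT",
        "HARM_CATEGORY_HATE_SPEECH",
        "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "HARM_CATEGORY_DANGEROUS_CONTENT",
    ]
    return [{"category": c, "threshold": threshold} for c in categories]

request_body = {
    "contents": [{"parts": [{"text": "A placeholder prompt."}]}],
    # With filters no longer applied by default, opting in is explicit:
    "safetySettings": safety_settings("BLOCK_ONLY_HIGH"),
}
print(json.dumps(request_body, indent=2))
```

Because filters are off by default in these models, applications that previously relied on the default behavior should now set thresholds like this explicitly.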

The release of the Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002 models underscores Google’s commitment to making AI development more efficient, affordable, and versatile. With substantial improvements in speed, accuracy, and cost-efficiency, these models are a powerful tool for developers working on text, code, and multimodal projects. The reduction in costs, coupled with increased rate limits and faster output, ensures that Gemini 1.5 models can cater to diverse needs in AI development, from startups to large-scale enterprises. As Google continues to refine these models, the future of AI-driven innovation looks brighter than ever. The full announcement is available on the Google for Developers blog.