Challenging the Titans: Google's New Gemini Embedding Model Tops the Leaderboard as Alibaba's Open-Source Option Advances

Published: 19 Jul 2025
Google’s new Gemini Embedding model now ranks first on the Massive Text Embedding Benchmark (MTEB), shaking up the existing leaderboard.

Google’s recently released Gemini Embedding model has climbed rapidly to the number one spot on the Massive Text Embedding Benchmark (MTEB). The model, named ‘gemini-embedding-001’, is now a core part of Google’s API offering, powering applications such as semantic search and retrieval-augmented generation (RAG). The embedding field is intensely competitive, however, and open-source alternatives are working to unseat Google’s proprietary model. Businesses now face a choice: the top-ranked proprietary model, or an open-source competitor that performs nearly as well and gives them more control.

Flexibility is the main selling point of Gemini Embedding. Because the model is trained with Matryoshka Representation Learning (MRL), developers can take the full 3072-dimension embedding and truncate it to smaller sizes while keeping the most relevant features intact. This lets a company balance accuracy, performance, and storage costs, which matters when scaling applications efficiently. Gemini Embedding is designed as a general-purpose model that works well across fields such as finance, legal, and engineering, which simplifies development for teams that need a single solution for many domains. It supports over 100 languages and costs $0.15 per million input tokens, a price aimed at broad accessibility.
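
The trade-off described above can be sketched in a few lines of Python. This is an illustrative sketch, not Google's implementation: it assumes that an MRL-style embedding is shortened by keeping its leading dimensions and re-normalizing to unit length, that vectors are stored as 4-byte floats, and that 768 is a reasonable smaller size to try. The function names and the random stand-in vector are hypothetical; only the 3072-dimension size and the $0.15 per million input tokens price come from the article.

```python
import math
import random

FULL_DIM = 3072              # full embedding size stated in the article
PRICE_PER_MILLION = 0.15     # USD per million input tokens, from the article


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` values and re-normalize (assumed MRL usage)."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and the vector length")
    return l2_normalize(vec[:dim])


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for num_vectors embeddings, assuming float32 values."""
    return num_vectors * dim * bytes_per_value


def embedding_cost_usd(input_tokens, price_per_million=PRICE_PER_MILLION):
    """Cost of embedding a given number of input tokens."""
    return input_tokens / 1_000_000 * price_per_million


# Random stand-in for a real embedding; a real one would come from the API.
rng = random.Random(0)
full = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)])
small = truncate_embedding(full, 768)  # 768 is an illustrative choice

print(len(full), len(small))
print(storage_bytes(1_000_000, FULL_DIM), storage_bytes(1_000_000, 768))
print(embedding_cost_usd(10_000_000))
```

Under these assumptions, one million vectors take roughly 12.3 GB at 3072 dimensions and about 3.1 GB at 768, while the embedding cost depends only on the number of input tokens, not on the size kept afterward. How much accuracy is lost at a given size is something each team would need to measure on its own data.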

The MTEB leaderboard shows that the lead is narrow, though. Gemini is in front, but contenders such as OpenAI’s widely used models and specialized challengers like Mistral are close behind, and the competition is heating up. The rise of these specialized models suggests that a focused tool could outperform an all-rounder. Cohere’s Embed 4 model, for example, targets enterprises specifically rather than general benchmarks. The race is therefore far from over: Google’s Gemini currently leads, but its rivals are not far behind.