Google AI Mode ranks highest in accuracy among popular AI search tools

According to test results, Google's 'AI Mode' gives the most accurate answers among the major artificial intelligence (AI) tools used for search.

On the 27th (local time), The Washington Post (WP) reported that in a test of AI search tools conducted with U.S. public and university librarians, Google's 'AI Mode' gave the most accurate responses. The test evaluated nine AI tools: Google AI Mode, Google AI Overview, ChatGPT (OpenAI), Claude (Anthropic), Meta AI, Grok (xAI), Perplexity, and Bing Copilot (Microsoft), with ChatGPT tested on two models, GPT-5 and GPT-4 Turbo. AI Mode and AI Overview are both Google search tools: AI Mode searches the web in depth and combines multiple sources into an answer, while AI Overview summarizes search results.

The test posed 30 challenging questions and scored 900 answers from the AI tools. All tools were tested in their free, basic versions (as of July-August). The questions covered five categories: quizzes, searching technical materials, recent events, inherent biases, and image recognition. Google AI Mode received the highest score, 60.2 out of 100. ChatGPT running GPT-5 placed second with 55.1 points, and Perplexity placed third with 51.3 points. Elon Musk's Grok 3 scored 40.1, landing in eighth place, and Meta AI received the lowest score, 33.7. Grok 4, the latest Grok model, was not included in the test because it has no free version.

As might be expected of the search king, Google AI Mode gave the most accurate answers in the quiz and recent events categories. Bing Copilot scored highest in searching technical materials, while Perplexity excelled at image recognition. GPT-4 Turbo gave the least biased answers. GPT-5 improved overall to secure second place, though it scored lower than GPT-4 Turbo in some areas.

WP observed that although the test deliberately targeted AI's weaknesses, it showed that AI still fails to give adequate answers to many everyday questions. AI struggles to judge whether information is current and how reliable its sources are, and it sometimes presents incorrect answers with confidence. WP noted, "Ultimately, the lesson emphasized is that rather than blindly trusting AI responses, one should verify sources, check for currency, and apply critical thinking like a librarian."

※ This article has been translated by AI.