CHICAGO--(BUSINESS WIRE)--iAsk, a Generative AI-powered answer engine designed for Gen Z, today announced that iAsk Pro, its most advanced model, has surpassed both human experts and the OpenAI o1 model for accuracy in the Graduate-Level Google-Proof Q&A (GPQA) benchmark. iAsk Pro achieved a score of 78.3%, outperforming OpenAI’s o1 score of 77.3% and positioning iAsk Pro as the most advanced AI model available today.
The GPQA benchmark is known for its complexity, evaluating AI models on their ability to accurately answer graduate-level questions in challenging subjects such as biology, physics, and chemistry. iAsk Pro’s performance not only surpassed OpenAI’s o1 model but also outperformed human PhDs, who score an average of 69.7%.
“Our mission is to make the world’s knowledge universally accessible and easy to understand,” said Dominik Mazur, iAsk co-founder and CEO. “By pushing the boundaries of our model to surpass human expertise, we’re not just unlocking new insights — we’re making those insights intuitive and accessible for everyone.”
iAsk Pro was tested on the GPQA Diamond subset, which includes the most challenging 198 questions out of the full 448-question set. Despite the test’s difficulty, iAsk Pro also demonstrated consistent performance across various scientific disciplines — particularly in physics, where it outperformed both human experts and all competing AI models. Human PhDs typically score below 70%, while non-experts (even with access to the internet and other academic resources) average only 34%.
This accomplishment follows iAsk Pro’s outstanding performance on other key AI benchmarks. Earlier this year, iAsk Pro scored 93.9% on the Massive Multitask Language Understanding (MMLU) test, surpassing both top human experts and other AI models. These results demonstrate iAsk Pro’s extensive knowledge across multiple disciplines. While GPQA is specifically designed to test advanced scientific knowledge and requires deep understanding and multi-step reasoning, MMLU is a more knowledge-driven, fact-based evaluation focused on a broader array of subjects, including history, economics, mathematics, and law.
“In an age where answer engines are transforming how we search for information, accuracy and trustworthiness are critical,” said Brad Folkens, iAsk Co-founder and Chief Technology Officer. “iAsk Pro’s performance on these benchmarks demonstrates our commitment to delivering not just information, but knowledge users can rely on—whether they’re students tackling difficult subjects or researchers exploring complex scientific questions.”
For more information about today’s announcement, please read our blog post. To try the iAsk Answer Engine yourself, please visit iAsk.ai.
About iAsk
iAsk is an advanced, free AI search engine that enables users to ask questions in natural language and receive instant, accurate answers. Built on state-of-the-art transformer neural networks, it reports top scores for accuracy on various gold-standard academic benchmarks, delivering consistently better results than ChatGPT, Google’s Gemini, Claude.ai, and all other AI models. iAsk was named Best Search Engine of 2024 by Slashdot. For more information, visit https://iask.ai/.