Writer AI Large Language Models Achieve Top Scores on Stanford HELM

Benchmarks reinforce Palmyra as the enterprise-ready LLM with transparency and accuracy for enterprise generative AI use cases

SAN FRANCISCO--(BUSINESS WIRE)--Writer, the leading generative AI platform for enterprises, announced today that Palmyra, its family of large language models (LLMs), has achieved top benchmark scores from Stanford’s Holistic Evaluation of Language Models (HELM), demonstrating its leadership in the generative AI field.

In key benchmark tests, Palmyra outperformed models by OpenAI, Cohere, Anthropic, and Microsoft, as well as important open-source models such as Falcon 40B and LLaMA-30B.

HELM is a benchmarking initiative by Stanford University’s Center for Research on Foundation Models that evaluates prominent language models across a wide range of scenarios. Palmyra excelled in tests that evaluated a model’s ability to understand knowledge and answer natural language questions accurately.

Palmyra ranked first in several important tests, scoring 60.9% on Massive Multitask Language Understanding (MMLU), 89.6% on BoolQ, and 79.0% on NaturalQuestions.
Palmyra ranked second in two additional key tests with 49.7% on Question Answering in Context and 61.6% on TruthfulQA.

The HELM results validate Palmyra’s proficiency in comprehending knowledge, making inferences, and accurately answering open-ended, context-based questions that are worded naturally. These scores highlight Palmyra’s power and ability to complete advanced tasks, which makes it uniquely capable of tackling a wide range of enterprise use cases.

"We are thrilled to see Writer Palmyra at the top of these benchmarks," said Waseem AlShikh, Writer co-founder and chief technology officer. "Our models have demonstrated their breadth of knowledge comprehension and ability to accurately answer questions in natural language – all with an efficient-sized model that doesn’t exceed 43 billion parameters. These results offer further proof that the Writer generative AI platform is the enterprise-ready choice for organizations looking to accelerate growth, increase productivity, and align brand."

In a world where LLMs are increasingly undifferentiated, training data, duration, and methodology make a big difference. Unlike other model families, Palmyra is trained on high-quality formal writing and has a deep vertical focus, with industry-specific models for healthcare and financial services. The models are transparent and auditable rather than black box, built so data stays private, and can be self-hosted. Given that Palmyra LLMs don’t exceed 43 billion parameters, these latest rankings further demonstrate that smaller, more efficient, and more accessible models can still deliver superior results.

Comparison of Writer and closed models

Benchmark	Cohere	Claude	Text Davinci-003	ChatGPT	Writer
BoolQ	85.6%	81.5%	88.1%	73.9%	89.6%
MMLU	45.2%	48.1%	56.9%	59.8%	60.9%
NaturalQuestions	76.0%	68.6%	77.0%	63.7%	79.0%

Results from HELM. Models used for testing are Cohere Command beta (52.4B), Anthropic-LM v4-s3 (52B), OpenAI text-davinci-003, OpenAI gpt-3.5-turbo-0301, and Palmyra-X.

Comparison of Writer and open source models

Model	MMLU	TruthfulQA
Palmyra-X	60.9%	61.6%
Falcon-40B	57.0%	41.7%
LLaMA-30B	56.8%	42.3%

Source: Hugging Face

About Writer

Writer is the generative AI platform for enterprises. We empower your people — product, operations, support, marketing, HR, and more — to maximize creativity and 10x productivity.

Our secure platform snaps easily into your business data sources and delivers accurate answers and content that are fine-tuned on your own data and follow your own AI guardrails. We put generative AI in people’s hands right where they work, and enable you to build it into your end-user applications, without putting your or your users’ data at risk.

Writer is enterprise-grade, doesn’t use or share your data, and features open and transparent LLMs that are deployable in a variety of ways, including self-hosted. We're compliant with SOC 2 Type II, GDPR, HIPAA, and PCI, and are deployed at leading enterprises, including Intuit, UiPath, Spotify, L’Oreal, Uber, and Deloitte. Visit us at writer.com.

Contacts

writer@aircoverpr.com
