Shelly Palmer - Is Claude 3 the ChatGPT Killer?

Achieving "near-human" capabilities on specific benchmarks does not equate to Claude 3 possessing general intelligence akin to human cognition.

Shelly Palmer Mar 5, 2024 12:00 PM

Anthropic claims that Claude 3, the company's most recent AI release, has achieved "near-human" capabilities in various cognitive tasks. It's a bold claim. Let's put it in perspective.

Anthropic's claims for Claude 3 center on its performance across a range of cognitive tasks, including reasoning, expert knowledge, mathematics, and language fluency. The company suggests that the Opus model in particular exhibits near-human levels of comprehension and fluency on complex tasks. This claim is supported by Claude 3 Opus outperforming OpenAI's GPT-4 (the underlying model that powers ChatGPT) on 10 AI benchmarks, including MMLU (undergraduate-level knowledge), GSM8K (grade-school math), HumanEval (coding), and HellaSwag (common knowledge).

Despite these achievements, it's important to note that achieving "near-human" capabilities on specific benchmarks does not equate to Claude 3 possessing general intelligence akin to human cognition. The AI research community often uses terms like "know" or "reason" to describe large language models' capabilities, but use of these words does not imply that these models have consciousness or understanding in the human sense.

This new iteration of the Claude AI model series includes three versions: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, each offering different levels of complexity and performance. The most powerful among them, Claude 3 Opus, is available through a subscription service, while Sonnet powers the Claude.ai chatbot accessible for free with an email sign-in.

Claude 3's advancements are not limited to cognitive tasks. The models demonstrate improved performance in areas like coding, understanding non-English languages, and adhering to brand voice guidelines. They also feature advanced vision capabilities, enabling them to process a wide range of visual formats, including photos, charts, graphs, and technical diagrams. This makes Claude 3 models particularly useful for applications that involve PDFs, flowcharts, or presentation slides.

Anthropic says that it trained Claude 3 on both nonpublic internal and public-facing data, utilizing hardware from Amazon Web Services (AWS) and Google Cloud. The company also claims the model is more accurate and less likely to hallucinate.

That said, you should keep Anthropic's claims about Claude 3's "near-human" capabilities in perspective. Outperforming its competitors on AI benchmarks does not equate to human-like consciousness or understanding. When artificial general intelligence (AGI) is achieved, you won't need to read my daily newsletter to get the news.

As always, your thoughts and comments are both welcome and encouraged. Just reply to this email. -s

ABOUT SHELLY PALMER

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. He covers tech and business and is a regular commentator on CNN.