Anthropic's Claude 3.5 Sonnet outperforms GPT-4o in most benchmarks

Anthropic has launched Claude 3.5 Sonnet, a mid-tier model that outperforms competing models and even beats Anthropic's current top-of-the-line model, Claude 3 Opus, across a range of evaluations.

Claude 3.5 Sonnet is currently available for free through Claude.ai and the Claude iOS app, with higher rate limits for Claude Pro and Team plan subscribers. It is also available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. The model is priced at $3 per million input tokens and $15 per million output tokens, with a context window of 200,000 tokens.
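To put the quoted prices in concrete terms, the short sketch below estimates the cost of a single request from its token counts. The rates are the ones stated above; the function name and the example token counts are hypothetical, chosen only for illustration, and the sketch ignores any discounts or provider-specific pricing differences.

```python
from decimal import Decimal

# Rates as quoted in the article (USD per million tokens).
INPUT_USD_PER_MILLION = Decimal("3")
OUTPUT_USD_PER_MILLION = Decimal("15")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / ONE_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / ONE_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: a prompt that fills the 200,000-token
    # context window, with a 4,000-token reply.
    print(estimate_cost_usd(200_000, 4_000))
```

Under these assumptions, a prompt that fills the whole context window costs about 60 cents in input tokens, so output length only dominates the bill for generation-heavy workloads.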

Anthropic claims that Claude 3.5 Sonnet “sets new industry benchmarks for graduate-level reasoning ability (GPQA), undergraduate-level knowledge (MMLU), and coding ability (HumanEval).” The model excels at creating high-quality content with a natural tone, with an improved ability to understand nuance, humor, and complex instructions.

Claude 3.5 Sonnet runs twice as fast as Claude 3 Opus and is well suited to complex tasks such as context-sensitive customer support and multi-step workflow orchestration. In an internal agentic coding evaluation, it solved 64% of problems, compared with 38% for Claude 3 Opus.

The model also shows improved vision capabilities, outperforming Claude 3 Opus on standard vision benchmarks. The advances are particularly noticeable in tasks that require visual reasoning, such as interpreting charts and graphs. Claude 3.5 Sonnet can accurately transcribe text from imperfect images, a valuable feature for industries such as retail, logistics and financial services.

Alongside the model release, Anthropic introduced Artifacts on Claude.ai, a new feature that lets users view, edit, and build on Claude-generated content in real time, creating a more collaborative working environment.

Despite its significant gains in intelligence, Claude 3.5 Sonnet maintains Anthropic's commitment to safety and privacy: “Our models undergo rigorous testing and are trained to reduce misuse,” the company said.

External experts, including the UK AI Safety Institute (UK AISI) and child safety experts at Thorn, have been involved in testing and refining the model's safety mechanisms.

Anthropic is committed to protecting user privacy, stating that “we will not train generative models with user-submitted data unless explicitly permitted by the user. To date, we have never trained a generative model with customer- or user-submitted data.”

Going forward, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later this year to complete the Claude 3.5 model family. The company is also developing new modalities and features to support more business use cases, including integration with enterprise applications and memory capabilities for a more personalized user experience.

(Image courtesy of Anthropic)

See also: OpenAI co-founder Ilya Sutskever's new startup aims for 'safe superintelligence'
