We generally don't publish this kind of work, because we do not wish to advance the rate of AI capabilities progress.
In addition, we aim to be thoughtful about demonstrations of frontier capabilities.
We've subsequently begun deploying Claude
Now that the gap between it and the public state of the art is smaller.
Opus
Our most intelligent model
Outperforms its peers
On most of the common evaluation benchmarks for AI systems
Claude 3 Opus is our most intelligent model
With best in market performance on highly complex tasks
We do not wish to advance the rate of AI capabilities progress
These new features will include interactive coding
And more advanced agentic capabilities
Our hypothesis is that being at the frontier of AI development
Is the most effective way to steer
We do not wish to advance the rate of AI
We do not wish to advance the rate of AI capabilities progress
We do not wish to advance the rate of AI
We do not wish to advance the rate of AI