Image by author
Anthropic recently launched a new series of AI models that outperform GPT-4 and Gemini in benchmark tests. With the AI industry growing and evolving rapidly, the Claude 3 models are making significant strides as the next big thing in large language models (LLMs).
In this blog post, we will explore the performance benchmarks of all three Claude 3 models. We will also learn about the new Python API that supports simple, asynchronous, and streaming response generation, along with the models' improved vision capabilities.
Claude 3 represents an important advance in the field of artificial intelligence technology. It outperforms state-of-the-art language models on several evaluation benchmarks, including MMLU, GPQA, and GSM8K, demonstrating near-human levels of comprehension and fluency on complex tasks.
The Claude 3 models come in three variants: Haiku, Sonnet, and Opus, each with its own capabilities and strengths.
- Haiku: the fastest and most cost-effective model, capable of reading and processing information-dense research papers in less than three seconds.
- Sonnet: 2x faster than Claude 2 and 2.1, excelling at tasks that demand quick responses, such as knowledge retrieval or sales automation.
- Opus: delivers similar speeds to Claude 2 and 2.1 but with much higher levels of intelligence.
According to the table below, Claude 3 Opus outperformed GPT-4 and Gemini Ultra on all LLM benchmarks, making it the new leader in the AI world.
Claude 3 benchmark comparison table
One of the significant improvements to the Claude 3 models is their strong vision capabilities. They can process various visual formats, including photographs, charts, graphs, and technical diagrams.
Claude 3 vision benchmark table
You can start using the latest models by going to https://www.anthropic.com/claude and creating a new account. It is pretty simple compared to the OpenAI Playground.
- Before installing the Python package, go to https://console.anthropic.com/dashboard and get the API key.
- Instead of providing the API key directly when creating the client object, you can set the `ANTHROPIC_API_KEY` environment variable and let the SDK read it automatically (demonstrated below).
- Install the Python `anthropic` package using pip, as shown after this list.
- Create the client object using the API key. We will use the client for text generation, vision capabilities, and streaming.
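The package can be installed with pip; in a Jupyter notebook, the `%pip` magic installs it into the active kernel:

```python
# Install the official Anthropic Python SDK
%pip install anthropic
```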
```python
import os
import anthropic
from IPython.display import Markdown, display

client = anthropic.Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)
```
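Since the `ANTHROPIC_API_KEY` environment variable is set, the SDK can also pick up the key automatically, so the client works without any arguments:

```python
# The SDK reads ANTHROPIC_API_KEY from the environment by default
client = anthropic.Anthropic()
```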
Let's try the legacy completions API to check whether it still works. We will provide it with the model name, the maximum number of tokens to sample, and the prompt.
```python
from anthropic import HUMAN_PROMPT, AI_PROMPT

completion = client.completions.create(
    model="claude-3-opus-20240229",
    max_tokens_to_sample=300,
    prompt=f"{HUMAN_PROMPT} How do I cook an original pasta dish?{AI_PROMPT}",
)
Markdown(completion.completion)
```
The error shows that we cannot use the above API for the model `claude-3-opus-20240229`. Instead, we need to use the Messages API.
Let's use the Messages API to generate the response. Instead of a plain prompt, we provide the `messages` argument with a list of dictionaries containing the role and the content.
Prompt = "Write the Julia code for the simple data analysis."
message = client.messages.create(
model="claude-3-opus-20240229",
max_tokens=1024,
messages=(
{"role": "user", "content": Prompt}
)
)
Markdown(message.content(0).text)
Using IPython Markdown will display the response in Markdown format, which means bullets, code blocks, headers, and links are rendered cleanly.
We can also provide a system prompt to customize the response. In our case, we ask Claude 3 Opus to respond in Urdu.
```python
client = anthropic.Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

prompt = "Write a blog about neural networks."
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    system="Respond only in Urdu.",
    messages=[
        {"role": "user", "content": prompt}
    ],
)
Markdown(message.content[0].text)
```
The Opus model handles this quite well; I can understand the Urdu response quite clearly.
Synchronous APIs execute requests sequentially, blocking until a response is received before the next call is made. Asynchronous APIs, on the other hand, allow multiple simultaneous requests without blocking, making them more efficient and scalable.
- Create an asynchronous Anthropic client using `AsyncAnthropic`.
- Define the main function with `async`.
- Generate the response using the `await` syntax.
- Run the main function using the `await` syntax.
```python
import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

async def main() -> None:
    prompt = "What is LLMOps and how do I start learning it?"
    message = await client.messages.create(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        model="claude-3-opus-20240229",
    )
    display(Markdown(message.content[0].text))

await main()
```
Note: If you are using async in a Jupyter Notebook, use `await main()` instead of `asyncio.run(main())`.
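Where the asynchronous client really pays off is in running several requests concurrently. Here is a minimal sketch (the three example prompts are placeholders of my own) that uses `asyncio.gather` to await multiple generations at once:

```python
async def ask(prompt: str) -> str:
    # Each call awaits its own response without blocking the others
    message = await client.messages.create(
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
        model="claude-3-opus-20240229",
    )
    return message.content[0].text

async def run_batch() -> None:
    # Placeholder prompts for illustration
    prompts = ["Define MLOps.", "Define LLMOps.", "Define DataOps."]
    answers = await asyncio.gather(*(ask(p) for p in prompts))
    for answer in answers:
        display(Markdown(answer))

await run_batch()
```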
Streaming is an approach that allows the output of a language model to be processed as soon as it is available, without waiting for the complete response. This method minimizes perceived latency by returning output token by token, rather than all at once.
Instead of `messages.create`, we will use `messages.stream` for streaming responses, with a loop that displays the chunks of text as soon as they are available.
```python
from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

prompt = "Write the Mermaid code for a typical MLOps workflow."
completion = client.messages.stream(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": prompt,
        }
    ],
    model="claude-3-opus-20240229",
)
with completion as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
As we can see, the response starts appearing almost immediately.
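If you also need the complete response object after the text has finished streaming (for example, to inspect token usage), the SDK's stream object exposes a `get_final_message()` helper; the usage-printing detail below is a sketch rather than part of the original walkthrough:

```python
with client.messages.stream(
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
    model="claude-3-opus-20240229",
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # Accumulated Message object, including usage statistics
    final_message = stream.get_final_message()

print(final_message.usage)
```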
We can also use an asynchronous function with streaming. You just need to combine the two patterns shown above.
```python
import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()

async def main() -> None:
    completion = client.messages.stream(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": prompt,  # reuses the prompt defined above
            }
        ],
        model="claude-3-opus-20240229",
    )
    async with completion as stream:
        async for text in stream.text_stream:
            print(text, end="", flush=True)

await main()
```
Claude 3's vision capabilities are a significant improvement over previous models; to get a response, you just need to provide a base64-encoded image to the Messages API.
In this example, we will use photos of tulips (Image 1) and flamingos (Image 2) from Pexels.com and generate responses by asking questions about the images.
We will use the `httpx` library to retrieve both images from Pexels.com and convert them to base64 encoding.
```python
import anthropic
import base64
import httpx

client = anthropic.Anthropic()

media_type = "image/jpeg"

img_url_1 = "https://images.pexels.com/photos/20230232/pexels-photo-20230232/free-photo-of-tulips-in-a-vase-against-a-green-background.jpeg"
image_data_1 = base64.b64encode(httpx.get(img_url_1).content).decode("utf-8")

img_url_2 = "https://images.pexels.com/photos/20255306/pexels-photo-20255306/free-photo-of-flamingos-in-the-water.jpeg"
image_data_2 = base64.b64encode(httpx.get(img_url_2).content).decode("utf-8")
```
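Here we hardcode `media_type` as `image/jpeg`. If the image format is not known in advance, one option (my own suggestion, not part of the original workflow) is to take it from the `Content-Type` header that `httpx` returns:

```python
# Derive the media type from the HTTP response instead of hardcoding it
response = httpx.get(img_url_1)
media_type = response.headers.get("content-type", "image/jpeg")
image_data_1 = base64.b64encode(response.content).decode("utf-8")
```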
We provide base64-encoded images to the Messages API in image content blocks. Follow the coding pattern below to correctly generate the response.
```python
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": image_data_1,
                    },
                },
                {
                    "type": "text",
                    "text": "Write a poem using this image."
                },
            ],
        }
    ],
)
Markdown(message.content[0].text)
```
We have a beautiful poem about tulips.
Let's try uploading multiple images to the same Claude 3 Messages API.
```python
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Image 1:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": image_data_1,
                    },
                },
                {
                    "type": "text",
                    "text": "Image 2:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": image_data_2,
                    },
                },
                {
                    "type": "text",
                    "text": "Write a short story using these images."
                },
            ],
        }
    ],
)
Markdown(message.content[0].text)
```
We have a short story about a garden of tulips and flamingos.
If you are having trouble running the code, here is a Deepnote workspace where you can review and run the code yourself.
I think Claude 3 Opus is a promising model, although it may not be as fast as GPT-4 and Gemini; paid users may get better speeds.
In this tutorial, we learned about Anthropic's new model series called Claude 3, reviewed its benchmarks, and tested its vision capabilities. We also learned to generate simple, asynchronous, and streaming responses. It's too early to say whether it is the best LLM out there, but if we look at the official benchmarks, we have a new king on the AI throne.
Abid Ali Awan (@1abidaliawan) is a certified professional data scientist who loves building machine learning models. Currently, he focuses on content creation and writing technical blogs on data science and machine learning technologies. Abid has a Master's degree in Technology Management and a Bachelor's degree in Telecommunications Engineering. His vision is to build an artificial intelligence product using a graph neural network for students struggling with mental illness.