o1 vs o1 pro: Is it worth spending $200?

OpenAI recently launched o1 and o1 pro during its 12 Days of OpenAI event, offering unlimited access through the $200-per-month ChatGPT Pro subscription. With much speculation surrounding their capabilities, I asked myself: is this premium subscription worth the investment? To answer this, I pitted the two AI models against each other in six challenging tasks. This article explores their strengths, weaknesses, and overall performance. By the end, you will have a clear idea of whether the $200 ChatGPT Pro subscription is the right choice for you.

Challenge 1: Solve a zebra problem

For the first test of o1 vs o1 pro, I took a zebra puzzle from the difficult level of this website. Let's see which model solves it!

Prompt:

Solve this zebra problem:

o1 Answer:

Submitting this answer as the solution shows that it is incorrect:

o1 pro Answer:

Submitting this answer as the solution shows that it is correct:

Observation:

o1 pro took much longer than o1 to respond. o1 failed to solve the problem, while o1 pro solved it!

Verdict:

o1: incorrect | o1 pro: correct
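For readers unfamiliar with zebra puzzles, they are constraint problems: every clue rules out some arrangements until one remains. The sketch below brute-forces a tiny three-house puzzle invented purely for illustration. The houses, colors, pets, and clues are all hypothetical and are not the puzzle given to the models.

```python
from itertools import permutations

# Hypothetical attributes for a three-house toy puzzle (not the article's puzzle).
COLORS = ("red", "green", "blue")
PETS = ("cat", "dog", "zebra")


def solve():
    """Return every (colors, pets) arrangement that satisfies all three clues."""
    solutions = []
    for colors in permutations(COLORS):   # colors[i] is the color of house i
        for pets in permutations(PETS):   # pets[i] is the pet living in house i
            # Clue 1: the dog lives in the red house.
            if pets[colors.index("red")] != "dog":
                continue
            # Clue 2: the green house is immediately to the left of the blue house.
            if colors.index("green") + 1 != colors.index("blue"):
                continue
            # Clue 3: the cat lives in the first house.
            if pets[0] != "cat":
                continue
            solutions.append((colors, pets))
    return solutions


solutions = solve()
for colors, pets in solutions:
    print(colors, pets)
```

With these three clues exactly one arrangement survives, with the zebra in the middle (blue) house. Real zebra puzzles have more houses and attributes, so the search space grows quickly and brute force stops being practical.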

Challenge 2: Find the difference

Prompt:

Find 3 differences in the two images:

o1 Answer:

Item in hand: In the picture on the left, she is holding a spoon. In the picture on the right, she is holding an ice cream cone.
Hair accessory: In the picture on the left, her hair is straight. In the image on the right, she is wearing a small pink bow in her hair.
Background object: In the picture on the left, there is a green sewing machine on the shelf behind her. In the image on the right, the sewing machine is replaced by a simple green box (or frame).

Observation:

Only the second difference (the hair accessory) is correct; the other two are incorrect.

o1 pro Answer:

Girl's Hair Accessory: In the image on the right, she has a pink bow in her hair, which is not present in the image on the left.
What she has in her hand: In the image on the left, she is holding a spoon. In the image on the right, she is holding an ice cream cone.
The color of the middle ball in the ice cream bowl: In the left image, the middle ball is pink, while in the right image, that ball has turned white.

Observation:

Only the first difference is correct; the other two are incorrect.

Neither o1 nor o1 pro answered correctly. However, o1 generated its response faster.

Verdict:

o1: 1 of 3 correct | o1 pro: 1 of 3 correct

Challenge 3: Solve a math problem

In this challenge, I give the models an indefinite integral to compute. Let's see which one can solve it!

Prompt:

Solve this math problem:

Response from o1 and o1 pro (same):

Observation:

Both models provided the correct answer, but o1 was much faster than o1 pro at finding the solution.

Verdict:

o1: correct (faster) | o1 pro: correct

Challenge 4: Analyze the article

Prompt:

Read the article: https://www.analyticsvidhya.com/blog/2024/07/building-agent-rag-systems-with-langgraph/ to understand the process of creating a vector database for Wikipedia data. Summarize key steps concisely.

o1 Answer:

o1 pro Answer:

Observation:

The “o1 pro answer” is closer to the actual implementation in the article. Here's why:

The article provides a much more detailed, step-by-step implementation, including:

Using specific libraries like LangChain and OpenAI embeddings
Loading Wikipedia data from a specific file
Using Chroma as a vector database
Implementation of advanced RAG components such as:
- Reformulation of queries
- Document Relevance Rating
- Web search integration
- A complex LangGraph workflow

o1 pro's answer captures more nuance by mentioning:

Specific embedding models (sentence transformers)
Vector Database Options
Metadata storage
Evidence retrieval
Integration with a RAG pipeline

In contrast, the initial “o1 answer” is more generic and lacks the technical depth demonstrated in the article. So o1 pro's answer is significantly closer to the actual implementation of the article.

Verdict:

o1: generic summary | o1 pro: closer to the article
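To make the idea of a vector database concrete, here is a minimal sketch of the embed, store, and retrieve steps. It is not the article's implementation: instead of OpenAI embeddings and Chroma it uses a toy word-count embedding and an in-memory list, and every name in it (`embed`, `cosine`, `ToyVectorStore`, the sample documents) is hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database: vectors plus text and metadata."""

    def __init__(self):
        self.rows = []  # each row is (vector, text, metadata)

    def add(self, text, metadata=None):
        self.rows.append((embed(text), text, metadata or {}))

    def search(self, query, k=1):
        """Return the k stored documents most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[0]), reverse=True)
        return [(text, meta) for _, text, meta in ranked[:k]]


store = ToyVectorStore()
store.add("Paris is the capital of France", {"title": "Paris"})
store.add("The zebra is an African equine with stripes", {"title": "Zebra"})

top = store.search("capital of France", k=1)
print(top)
```

A real pipeline would swap `embed` for a learned embedding model and the list for a persistent store, but the flow of embedding documents, saving them with metadata, and ranking by similarity at query time stays the same.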

Challenge 5: Image Creation

Prompt:

Create an image of a cat.

o1 Answer:

o1 pro Answer:

Observation:

Both o1 and o1 pro failed to generate an image, indicating that neither version supports image generation. However, when I gave the same prompt to GPT-4o, I received this response:

Therefore, it is safe to say that GPT-4o surpasses both o1 and o1 pro in image generation.

Verdict:

o1: no image | o1 pro: no image

Challenge 6: Create a logical flowchart

Prompt:

Create a complete flowchart illustrating the reflection pattern in agentic AI.

o1 Answer:

o1 pro Answer:

Both provided incomplete flowcharts, so I decided to update my prompt. Here is the updated version:

New prompt:

These are the steps involved in the reflection pattern:

1. Generate initial output
2. Self-assessment/criticism of the result
3. Identify errors, gaps or areas for improvement
4. Develop suggestions for improvement
5. Review/refine the result
6. Repeat steps 2 to 5 until you achieve a satisfactory result

Create a new flowchart now

o1 Answer:

o1 pro Answer:

Observation:

Although the content of both answers is the same, o1 definitely wins by providing a real flowchart, while o1 pro only provided the correct content.

Verdict:

o1: real flowchart | o1 pro: correct content only
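The six steps of the reflection pattern listed in the prompt above map naturally onto a loop. The sketch below is one possible reading of them, not an implementation from either model: `reflect` and the `toy_*` functions are hypothetical names, and the toy functions stand in for what would normally be calls to a language model.

```python
def reflect(task, generate, critique, revise, max_rounds=5):
    """Run generate -> critique -> revise until the critique finds no issues."""
    output = generate(task)                # step 1: generate initial output
    for _ in range(max_rounds):            # step 6: repeat steps 2 to 5
        issues = critique(task, output)    # steps 2-3: self-assess and list issues
        if not issues:                     # satisfactory result, so stop
            break
        output = revise(output, issues)    # steps 4-5: apply improvements
    return output


# Toy stand-ins so the loop can run without any model (assumptions for illustration).
def toy_generate(task):
    return "draft"


def toy_critique(task, output):
    issues = []
    if not output[0].isupper():
        issues.append("capitalize")
    if not output.endswith("."):
        issues.append("add period")
    return issues


def toy_revise(output, issues):
    # Fix only the first issue per round so the iteration is visible.
    if issues[0] == "capitalize":
        return output.capitalize()
    return output + "."


result = reflect("write a sentence", toy_generate, toy_critique, toy_revise)
print(result)  # prints "Draft."
```

The `max_rounds` cap is a design choice added here so the loop always terminates, even if the critique step never reports a satisfactory result.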

Results table: o1 vs o1 pro

Challenge	Verdict
Zebra problem	o1 pro solved it, but was slower; o1 failed
Find the difference	Both models performed poorly
Math problem	Both solved it correctly; o1 was faster
Analyze the article	o1 pro provided more depth
Image creation	Neither could generate images (GPT-4o could)
Create a logical flowchart	o1 won by creating a real flowchart

o1 pro seems to have a slight advantage in terms of depth and accuracy in problem solving, particularly in complex tasks like solving the zebra problem and analyzing technical articles. However, o1 tends to be faster and works well on simpler tasks. The verdict seems to be that o1 pro is marginally better, especially for more complex or technical challenges that require deeper understanding.


Final note

While o1 pro shows promise in solving complex problems, it is important to consider your specific needs and budget. For basic to intermediate tasks, GPT-4o or other more affordable alternatives may be sufficient. If solving complex problems is a priority and you're willing to invest, o1 pro could be a valuable tool.

However, as OpenAI continually refines these models, it would be prudent to wait for further updates before making a final decision. OpenAI will likely add more benefits to the $200 ChatGPT Pro plan in the future.

What do you think about this? Let me know in the comments section below.

Stay tuned to Analytics Vidhya blog for more awesome updates!

Nitika Sharma

Hi, I'm Nitika, a tech-savvy content creator and marketer. Creativity and learning new things come naturally to me. I have experience creating results-based content strategies. I am well versed in SEO management, keyword operations, web content writing, communication, content strategy, editing and writing.
