
GPT-5.4 Is Here (Worse Than Sonnet 4.6?)

3.1K views· 30 likes· 15:49· Mar 5, 2026


Want to learn how to use AI? Join our Skool group; we also have a free weekly call: https://www.skool.com/data-and-ai

GPT-5.4 just dropped and OpenAI is calling it their most capable model yet, built for "professional work." But does it actually deliver? In this video, I break down the GPT-5.4 press release, benchmarks, and pricing, then put it head-to-head against Claude Sonnet 4.6 with real-world tests: niche knowledge questions, spreadsheet data cleanup, and more. The results might surprise you. We compare API pricing across GPT-5.4, GPT-5.4 Pro, Claude Sonnet 4.6, Claude Opus, and Gemini 3.1 Pro, so you know exactly what you're paying for. If you're choosing a default model for your workflows and automations, this one's for you.

TIMESTAMPS
0:00 - GPT-5.4 Is Here
0:23 - Press Release Breakdown
1:00 - Benchmarks Overview
1:21 - Knowledge Work & Spreadsheet Claims
1:55 - Computer Use & Vision
2:34 - Availability & Pricing Breakdown
3:33 - GPT-5.4 vs Claude vs Gemini Pricing Comparison
4:32 - Test 1: Niche Knowledge — Most Valuable Baseball Cards
5:45 - Test 1 Results: Sonnet vs GPT-5.4
6:33 - Test 2: Niche Knowledge — Oldest Presidential Card Sets
7:40 - GPT-5.4 Hallucinations vs Sonnet Accuracy
9:00 - Test 3: Spreadsheet Data Cleanup
10:10 - Sonnet Spreadsheet Results
10:50 - GPT-5.4 Spreadsheet Results
11:20 - Final Verdict & Takeaways

🚀 Hire me for Data Work: https://ryanandmattdatascience.com/data-freelancing/
👨‍💻 Mentorships: https://ryanandmattdatascience.com/mentorship/
📧 Email: ryannolandata@gmail.com
🌐 Website & Blog: https://ryanandmattdatascience.com/

OTHER SOCIALS:
Ryan’s LinkedIn: https://www.linkedin.com/in/ryan-p-nolan/
Matt’s LinkedIn: https://www.linkedin.com/in/matt-payne-ceo/
Twitter/X: https://x.com/RyanMattDS

*This is an affiliate program. We receive a small portion of the final sale at no extra cost to you.

About This Video

GPT-5.4 is here, and on paper it’s OpenAI’s most direct challenge to Claude yet: “designed for professional work,” better benchmarks, and a big push around agents, computer use, and cleaner knowledge-work outputs. In this video, I walk through the press release, what actually changed versus GPT-5.2, and the pricing that matters if you’re picking a default model for automations. The headline: the context window stays the same (1M), and the big surprise is how expensive output tokens get, especially if you accidentally pick GPT-5.4 Pro.

Then I stop reading marketing and run real tests. I put GPT-5.4 head-to-head with Claude Sonnet 4.6 on niche knowledge questions (high-end baseball cards and old presidential card sets) and a spreadsheet cleanup task with messy emails, inconsistent dates, and broken fields. In my tests, Sonnet 4.6 was consistently more accurate on the niche research and did a cleaner job on the spreadsheet, plus it produced an issue log and an easy-to-download cleaned file. GPT-5.4, honestly, hallucinated hard on the presidential set question and didn’t fully deliver on the spreadsheet cleanup. If you’re choosing a model for workflows, this is why you test with your edge cases, not benchmarks.
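If you want to run the same kind of spreadsheet cleanup test against your own edge cases, here is a minimal sketch of what "messy emails and inconsistent dates" can look like. The column names, sample rows, and helper functions are illustrative assumptions, not the exact data used in the video:

```python
# Hypothetical sketch of a spreadsheet cleanup task like the one in the
# video: normalize messy emails and parse inconsistent date formats.
# Column names ("email", "signup_date") and sample rows are made up.
import re
from datetime import datetime

ROWS = [
    {"email": "  Alice@Example.COM ", "signup_date": "03/05/2026"},
    {"email": "bob(at)example.com", "signup_date": "2026-03-05"},
    {"email": "carol@example.com", "signup_date": "March 5, 2026"},
]

DATE_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%B %d, %Y"]

def clean_email(raw):
    """Lowercase, trim, and repair a common '(at)' obfuscation."""
    email = raw.strip().lower().replace("(at)", "@")
    # Reject anything that still doesn't look like an address.
    return email if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email) else None

def parse_date(raw):
    """Try each known format; return an ISO date string or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

cleaned, issues = [], []
for i, row in enumerate(ROWS):
    email, date = clean_email(row["email"]), parse_date(row["signup_date"])
    if email is None or date is None:
        issues.append((i, row))  # an issue log, like the one Sonnet produced
    else:
        cleaned.append({"email": email, "signup_date": date})

print(cleaned)
```

Whichever model you test, comparing its output against a deterministic baseline like this makes hallucinated "fixes" easy to spot.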

