TLDR
- GPT-5.4 Pro scores 150 IQ on the Mensa Norway test, up from the previous o3 model's 136.
- OpenAI’s latest model leads public IQ leaderboard above all major AI systems.
- GPT-5.4 Pro improves coding, tool use, and document handling in professional work.
- Public IQ benchmarks show AI capability rising fast enough to influence business planning.
- TrackingAI confirms GPT-5.4 Pro outperforms Claude, Gemini, Qwen, and Grok models.
OpenAI’s latest GPT-5.4 Pro has reached a new milestone, scoring 150 on the Mensa Norway IQ test. This surpasses the previous record of 136 held by the o3 model and places GPT-5.4 Pro at the top of public AI IQ leaderboards, reflecting faster reasoning and stronger problem-solving across professional tasks.
GPT-5.4 Pro Surpasses Previous AI IQ Record
OpenAI’s newest model, GPT-5.4 Pro, has reached 150 IQ on the Mensa Norway test. This score surpasses the previous record of 136 set by OpenAI’s o3 model. TrackingAI’s public leaderboard now ranks GPT-5.4 Pro at the top among AI systems. The result is being seen as a clear step forward in AI reasoning capabilities.
The Mensa Norway test measures pattern recognition, logic, and problem-solving skills. GPT-5.4 Pro achieved the score while also showing improved coding and tool use. The result places the model in the upper range of human IQ scores. Observers note that the benchmark reflects reasoning speed and abstraction abilities.
OpenAI described GPT-5.4 Pro as optimized for professional work environments. The system handles long tasks with its one-million-token context window. It also supports complex workflows in coding, research, and document handling. The company emphasized improvements in accuracy and task efficiency across applications.
AI IQ Benchmarks and Performance
Public IQ-style tests remain limited but provide a clear comparison across models. TrackingAI publishes results from both public and private IQ-style tests. GPT-5.4 Pro now leads the leaderboard, above systems like Claude, Gemini, Qwen, and Grok. These benchmarks measure reasoning, pattern recognition, and problem-solving speed.
Scores on these tests can vary due to test design, format, and training exposure. Even with these limits, the increase from 136 to 150 is a measurable jump. The number signals broader gains in tool use, coding, and complex task handling. The leaderboard offers developers and enterprises a simple benchmark for capability.
OpenAI’s previous o3 model achieved 136 IQ on the same test in 2025. GPT-4.1 earlier introduced a one-million-token window, improving long-horizon task handling. The latest result continues the trend of rising AI performance across multiple measures. TrackingAI reports scores as rolling averages across multiple test completions.
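The rolling-average reporting mentioned above can be illustrated with a minimal sketch. The window size and per-run scores below are hypothetical, chosen only to show how averaging across repeated test completions smooths run-to-run variance:

```python
from collections import deque

def rolling_average(scores, window=5):
    """Average each score with up to `window - 1` preceding scores."""
    buf = deque(maxlen=window)  # retains only the most recent `window` scores
    averages = []
    for s in scores:
        buf.append(s)
        averages.append(sum(buf) / len(buf))
    return averages

# Hypothetical IQ scores from five repeated test completions
runs = [148, 152, 149, 151, 150]
print(rolling_average(runs, window=3))
```

Each reported value reflects the mean of the last few runs rather than a single completion, which is why leaderboard scores shift gradually as new results arrive.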
Broader Context for AI Use
GPT-5.4 Pro’s capability rise aligns with increased AI adoption in enterprises. Firms use AI to manage research, document workflows, code verification, and browsing tasks. The model’s efficiency and accuracy help reduce manual work and streamline processes. It can assist in planning, searching, and producing work over extended contexts.
The jump in IQ is notable but also reflects improvements in professional task handling. OpenAI positions GPT-5.4 Pro as both a research and business productivity tool. The model’s performance growth may influence software budgets, workflows, and deployment decisions. TrackingAI notes the score is part of a broader trend of rising frontier AI abilities.
OpenAI continues to expand GPT-5.4 Pro’s capabilities with native computer and tool use. The system can execute tasks over longer sequences and handle complex instructions. Its performance benchmarks suggest continued growth in both reasoning and professional application. The model now serves as a practical signal for AI strength in public benchmarking.