ChatGPT-maker OpenAI has launched its GPT-5.2 artificial intelligence model, days after CEO Sam Altman issued an internal "code red" that paused internal projects and redirected teams to accelerate development. The launch comes in response to Google's Gemini 3, which was launched last month.
Announcing the new model, OpenAI said, “We designed GPT‑5.2 to unlock even more economic value for people; it’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long contexts, using tools, and handling complex, multi-step projects.”
According to the company, the new model is aimed at delivering more economic value to users.
“GPT‑5.2 Thinking sets a new state-of-the-art score, and is our first model that performs at or above a human expert level. Specifically, GPT‑5.2 Thinking beats or ties top industry professionals on 70.9% of comparisons on GDPval knowledge work tasks, according to expert human judges. These tasks include making presentations, spreadsheets, and other artifacts,” the company said.
Coding with GPT-5.2
OpenAI says that GPT‑5.2 Thinking sets a new state of the art of 55.6% on SWE-Bench Pro, a rigorous evaluation of real-world software engineering.
Unlike SWE-bench Verified, which only tests Python, SWE-Bench Pro tests four languages and aims to be more contamination-resistant, challenging, diverse, and industrially relevant.
On SWE-bench Verified, GPT‑5.2 Thinking scored 80%, according to the AI maker. “For everyday professional use, this translates into a model that can more reliably debug production code, implement feature requests, refactor large codebases, and ship fixes end-to-end with less manual intervention,” it said.
OpenAI further claims that GPT‑5.2 Thinking hallucinates less than GPT‑5.1 Thinking. On a set of de-identified queries from ChatGPT, responses containing errors were 30% less common in relative terms. For professionals, the company says, this means fewer mistakes when using the model for research, writing, analysis, and decision support, making it more dependable for everyday knowledge work.