OpenAI o3

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Revision as of 23:43, 26 December 2024

Large language model
o3
Developer(s): OpenAI
Predecessor: OpenAI o1
Type: Generative pre-trained transformer

OpenAI o3 is a generative pre-trained transformer (GPT) model developed by OpenAI as a successor to the OpenAI o1 model. It is designed to devote additional deliberation time when addressing questions that require step-by-step logical reasoning.

History

The OpenAI o3 model was announced on December 20, 2024. The designation "o3" was chosen to avoid a trademark conflict with the existing UK mobile carrier named O2. The model is available in two versions: o3 and o3-mini. Until January 10, 2025, OpenAI invites safety and security researchers to apply for early access to these models. OpenAI plans to release o3-mini to the public in January 2025. The o3-mini model will serve as a limited teaser of the coming o3 model.

Capabilities

Reinforcement learning was used to teach o3 to "think" before generating answers, using what OpenAI refers to as a "private chain of thought". This approach enables the model to plan ahead and reason through tasks, performing a series of intermediate reasoning steps to assist in solving the problem, at the cost of needing additional computing power and increasing the latency of responses.

o3 demonstrates improved performance compared to o1 in complex tasks, including coding, mathematics, and science. OpenAI reported that o3 achieved a score of 87.7% on the GPQA Diamond benchmark, which contains expert-level science questions not publicly available online.

On SWE-bench Verified, a software engineering benchmark assessing the ability to solve real GitHub issues, o3 scored 71.7%, compared to 48.9% for o1. On Codeforces, o3 reached an Elo score of 2727, whereas o1 obtained 1891.
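
To give a sense of scale for the figures above, the following is a minimal sketch that computes the gaps between the reported scores. It assumes the Codeforces ratings can be read with the standard Elo expected-score formula (400-point logistic scale); Codeforces' actual rating system may differ in detail, so the resulting probability is illustrative only. The function name `elo_expected_score` is hypothetical and not taken from any source.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Assumption: the conventional 400-point logistic scale applies.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Figures as reported in the text above.
o3_rating, o1_rating = 2727, 1891
o3_swe_bench, o1_swe_bench = 71.7, 48.9

rating_gap = o3_rating - o1_rating            # Elo points
swe_bench_gain = o3_swe_bench - o1_swe_bench  # percentage points
expected = elo_expected_score(o3_rating, o1_rating)

print(f"Codeforces rating gap: {rating_gap} points")
print(f"SWE-bench Verified gain: {swe_bench_gain:.1f} percentage points")
print(f"Expected score of the higher-rated entrant: {expected:.3f}")
```

Under that assumption, an 836-point gap corresponds to an expected score of roughly 0.99 for the higher-rated entrant in a head-to-head comparison.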

On the ARC-AGI benchmark, which evaluates an AI's ability to handle new, challenging mathematical and logical problems, o3 attained three times the accuracy of o1.

References

  1. Knight, Will (December 20, 2024). "OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills". Wired.
  2. "OpenAI Unveils New A.I. That Can 'Reason' Through Math and Science Problems". The New York Times. December 20, 2024.
  3. "Early access for safety testing". OpenAI. December 20, 2024.
  4. Edwards, Benj (December 20, 2024). "OpenAI announces o3 and o3-mini, its next simulated reasoning models". Ars Technica.
  5. Edwards, Benj (December 20, 2024). "OpenAI announces o3 and o3-mini, its next simulated reasoning models". Ars Technica. Retrieved December 26, 2024.
  6. Zeff, Maxwell; Wiggers, Kyle (December 20, 2024). "OpenAI announces new o3 models". TechCrunch. Retrieved December 22, 2024.
  7. Franzen, Carl; David, Emilia (December 20, 2024). "OpenAI confirms new frontier models o3 and o3-mini". VentureBeat. Retrieved December 26, 2024.
  8. Hsu, Jeremy. "OpenAI's o3 model aced a test of AI reasoning – but it's still not AGI". New Scientist. Retrieved December 22, 2024.