Last Thursday, OpenAI introduced its latest model, dubbed “OpenAI o1.” While notable for its performance on a range of difficult tasks and benchmarks (the San Francisco-based AI lab says the model significantly outperformed its flagship GPT-4o model on competitive math and coding benchmarks), o1 has many observers excited because of what it could portend for the next stage in AI development. According to o1’s accompanying system card, OpenAI used large-scale reinforcement learning to train o1 to perform “chain of thought” reasoning, enabling it to deliberate its way through complex problems and execute more sophisticated strategies.
While OpenAI hasn’t published much explicit information on how this training worked, observers think the model was trained on human-produced examples of step-by-step reasoning, helping it learn the process of complex problem-solving (this kind of process-focused training was detailed in a 2023 paper by OpenAI researchers). Importantly, this reasoning happens while the model is generating outputs (a stage known as “inference”), and its quality scales with the time and computing power dedicated to it: the more time and computing power, the better the output.
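OpenAI has not disclosed how o1 allocates its extra “thinking” time, so the sketch below is only a generic illustration of why more inference-time compute can buy better answers: sample several candidate reasoning chains and keep the highest-scoring one (a best-of-N strategy). The function names, the random scorer, and the `n_samples` parameter are hypothetical placeholders, not anything from o1.

```python
import random


def generate_candidate(prompt: str, seed: int) -> tuple[str, float]:
    """Hypothetical stand-in for one sampled chain-of-thought answer.

    Returns (answer, score). In a real system the score might come from a
    verifier or reward model; here it is random noise for illustration only.
    """
    rng = random.Random(seed)
    answer = f"candidate answer {seed} for: {prompt}"
    score = rng.random()
    return answer, score


def best_of_n(prompt: str, n_samples: int) -> str:
    """Spend more inference compute (larger n_samples) to pick a better answer.

    This is a generic best-of-N strategy, NOT OpenAI's published method; it
    only illustrates the "more inference compute -> better output" idea.
    """
    candidates = [generate_candidate(prompt, seed) for seed in range(n_samples)]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer


# Raising n_samples increases inference cost in exchange for a better
# expected best score over the sampled candidates.
print(best_of_n("What is 17 * 24?", n_samples=4))
print(best_of_n("What is 17 * 24?", n_samples=16))
```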
Much of the recent investment in AI is due to a similar scaling property: the more computing power AI developers devote to training an AI model, the more capable the model. With o1 demonstrating a comparable scaling relationship for inference-time compute, OpenAI seems to have set the stage for the next phase in AI development: massive up-front investment in AI training coupled with heavy, continuous spending on inference. Both o1 and o1-mini (a faster, cheaper-to-use version of the reasoning model) are available to paid ChatGPT users and through OpenAI’s API.
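For readers who want to experiment, below is a minimal sketch of calling the reasoning models through OpenAI’s Python SDK. The model identifiers shown (“o1-preview”, “o1-mini”) reflect the names used at launch and are an assumption here; check OpenAI’s API documentation for current names, access tiers, and pricing.

```python
# Minimal sketch of querying an o1-series model via OpenAI's Python SDK.
# Assumes the OPENAI_API_KEY environment variable is set and that the
# model name "o1-mini" is still valid for your account.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-mini",
    messages=[
        {
            "role": "user",
            "content": "A train travels 120 miles in 90 minutes. "
                       "What is its average speed in miles per hour?",
        }
    ],
)

print(response.choices[0].message.content)
```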
More: Scaling: The State of Play in AI | OpenAI co-founder Sutskever’s new safety-focused AI startup SSI raises $1 billion
This newsletter excerpt is from the September 19, 2024, edition of policy.ai — CSET’s newsletter on artificial intelligence, emerging technology, and security policy, written by Alex Friedland. Other stories from this edition include:
- EU Report Calls for Significant Economic Changes to Avoid “Slow Agony”
- TSMC’s Arizona Fab and Intel’s Foundry Business — A Chips News Roundup
- California Governor Weighs Signing Far-Reaching AI Regulation SB 1047
- DIU Looks to Generative AI for Joint Planning and Wargaming Help
Read the full newsletter and subscribe to receive every edition of policy.ai.