Agentic Reasoning (o1, o3, Reasoning Models)
Models that 'think' before answering — internal chains-of-thought, test-time compute, and self-correction.
A New Class of Models
OpenAI o1 (Sep 2024) introduced "reasoning models": LLMs that explicitly reason step by step before producing the visible answer. OpenAI followed it with o3, and comparable models from other labs include DeepSeek R1, Claude with extended thinking, and Gemini 2.5 thinking, among others.
How They Work
Instead of generating the answer immediately, the model:
- Produces a long internal "thinking" sequence (CoT-style)
- Explores multiple approaches
- Checks its intermediate results and corrects its own mistakes
- Then emits the final visible answer
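The steps above can be illustrated with a toy Python sketch. This is only an analogy, not how any real reasoning model is implemented: actual models do all of this inside ordinary token generation, whereas here the "approaches" are plain functions and the "self-correction" is an explicit check. Every name in it (`ReasoningResult`, `solve`, `halve`, `newton`, `is_square_root`) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ReasoningResult:
    thinking: List[str] = field(default_factory=list)  # hidden scratchpad
    answer: Optional[int] = None                        # visible final answer


def solve(target: int,
          strategies: List[Callable[[int], int]],
          verify: Callable[[int, int], bool]) -> ReasoningResult:
    """Try each approach in turn, check it, and keep the first that passes."""
    result = ReasoningResult()
    for strategy in strategies:
        candidate = strategy(target)
        result.thinking.append(f"{strategy.__name__} proposes {candidate}")
        if verify(target, candidate):
            result.thinking.append(f"check passed for {candidate}")
            result.answer = candidate
            break
        # Self-correction: reject the failed candidate and move on.
        result.thinking.append(f"check failed for {candidate}, trying another approach")
    return result


# Toy problem (an assumption for this example): find the exact integer
# square root of a number.
def halve(target: int) -> int:
    """A naive first guess: half of the target."""
    return target // 2


def newton(target: int) -> int:
    """Integer Newton iteration for the floor of the square root."""
    x = target
    while x * x > target:
        x = (x + target // x) // 2
    return x


def is_square_root(target: int, candidate: int) -> bool:
    return candidate * candidate == target


result = solve(144, [halve, newton], is_square_root)
print(result.answer)         # only the final answer is shown to the user
for step in result.thinking:  # the scratchpad would normally stay internal
    print("  [thinking]", step)
```

Separating `thinking` from `answer` mirrors the split between the internal reasoning sequence and the visible response. Adding more strategies or a stricter check is a rough stand-in for spending more test-time compute on the same problem.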