OpenAI's new reasoning AI models hallucinate more | TechCrunch
OpenAI's reasoning AI models are getting better, but their hallucinating isn't, according to benchmark results.
OpenAI’s recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate, or make things up — in fact, they hallucinate more than several of OpenAI’s older models.
Hallucinations have proven to be one of the biggest and most difficult problems to solve in AI, impacting even today’s best-performing systems. Historically, each new model has improved slightly in the hallucination department, hallucinating less than its predecessor. But that doesn’t seem to be the case for o3 and o4-mini.
According to OpenAI’s internal tests, o3 and o4-mini, which are so-called reasoning models, hallucinate more often than the company’s previous reasoning models — o1, o1-mini, and o3-mini — as well as OpenAI’s traditional, “non-reasoning” models, such as GPT-4o.
It's truly lovely that o3, what OpenAI calls their "most powerful reasoning model," not only can't stop making shit up, but is getting better at it. Perhaps some of the problem is that OpenAI itself has issues with hallucinations about their own work.
Perhaps more concerning, the ChatGPT maker doesn’t really know why it’s happening.

