OpenAI's o3 and o4-mini hallucinate way higher than previous models

First reported by TechCrunch, OpenAI's system card detailed the PersonQA evaluation results, designed to test for hallucinations. From the results of this evaluation, o3's hallucination rate is 33 ...

TechCrunch

OpenAI’s new reasoning AI models hallucinate more

OpenAI’s recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate, or make things up — in fact, they hallucinate more than several of ...

ZDNet

OpenAI and Anthropic evaluated each other's models - which ones came out on top

Anthropic and OpenAI ran their own tests on each other's models. The two labs published findings in separate reports. The goal was to identify gaps in order to build better and safer models. The AI ...
