AI Hallucinations: A Challenge Too Costly to Ignore
Reliance on artificial intelligence (AI) for critical tasks has been shaken by new findings suggesting that AI hallucinations are unavoidable. For all their productivity gains, large language models frequently produce incorrect outputs, posing significant risks in accuracy-critical fields such as accounting and law and challenging optimistic projections for AI-driven efficiency.
Recent research raises doubts about the reliability of artificial intelligence in high-stakes tasks. Large language models (LLMs), which power tools like ChatGPT, are widely credited with boosting productivity but remain prone to hallucinating, that is, producing false or fabricated outputs. The problem worsens as models are fed more data, further undermining their accuracy.
An experiment by JV Roig of Kamiwaza AI tested LLMs on inputs of varying length. GLM 4.5, from China's Zhipu AI, showed a 1.2% error rate at 32,000 words of input, rising to 3.2% at 128,000 words, while the average error rate across the models tested climbed from 6.8% to 10%. This suggests that real-world applications, which feed models far larger volumes of data, could see hallucination rates worsen further.
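To make the setup concrete, the sketch below shows one way such a context-length benchmark could be structured: feed a model progressively longer slices of a document, ask the same factual questions, and count wrong answers. It is illustrative only; the `query_model` stub, the placeholder document, and the QA pairs are assumptions standing in for a real model API and for Kamiwaza AI's actual test data, which the report does not detail.

```python
import random

# Hypothetical harness for measuring error rate vs. context length.
# query_model is a stand-in for a real LLM API call; the document and
# QA pairs below are placeholders, not the actual experimental data.

CONTEXT_LENGTHS = [32_000, 128_000]  # words of context, per the reported experiment


def query_model(context: str, question: str) -> str:
    """Placeholder for an LLM call; a real harness would call a model API here."""
    return random.choice(["right answer", "wrong answer"])


def error_rate(document: str, qa_pairs: list[tuple[str, str]], length: int) -> float:
    """Truncate the document to `length` words, ask each question, count misses."""
    context = " ".join(document.split()[:length])
    wrong = sum(
        query_model(context, q).strip().lower() != a.strip().lower()
        for q, a in qa_pairs
    )
    return wrong / len(qa_pairs)


if __name__ == "__main__":
    document = "placeholder corpus " * 100_000   # stand-in source text
    qa_pairs = [("What is X?", "right answer")]  # stand-in ground-truth QA set
    for n in CONTEXT_LENGTHS:
        print(f"{n:>7} words: {error_rate(document, qa_pairs, n):.1%} errors")
```

A real study would average such error rates over many documents and models, which is how a per-model figure like GLM 4.5's 1.2% and a cross-model average like 6.8% would both emerge from the same run.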
Despite rapid advances, a fix for hallucinations remains elusive. Researchers at Tsinghua University attribute them to inherent characteristics of how LLMs are trained. With LLMs struggling to deliver precision in critical areas like finance and law, companies could face existential threats as low-cost AI competition grows.
(With inputs from agencies.)

