The AI revolution has been sweeping through various industries, often taking over tasks once considered distinctly human. However, as a recent study underlines, all that glitters may not be silicon gold. Despite their capacity to outscore human physicians on medical exams, leading AI language models like GPT-4 struggle to help people reach accurate diagnoses.
Researchers at the University of Oxford recently highlighted the gap between AI’s medical proficiency on paper and its practical application. The study focused on large language models (LLMs), which have been shown to outperform human doctors on medical exams. However, when human participants were given real-life medical scenarios to diagnose with the help of LLMs, they correctly identified the conditions only 34.5% of the time. When left to diagnose on their own terms, as they would at home, participants were 76% more likely to be accurate.