The JAMA Network recently published a research paper comparing the diagnostic accuracy of physicians and a medical app. Physician error is common, and with the advent of mobile phone apps and online symptom checkers, medical information flows more quickly than ever. As a result, internet-educated patients increasingly attempt to diagnose themselves.
But the question remains: how accurate are these medical apps, really?
Earlier studies have approached this issue from a different angle. One, published in the Journal of Electrocardiology, compared the accuracy of physicians and computers in interpreting ECG tracings. The results were encouraging in part: only 13.5% of computer ECG readings (excluding pacemaker rhythms) required revision. Computer diagnosis of paced rhythms, however, remained problematic. The authors therefore concluded that physician review of computer-based electrocardiogram rhythm diagnoses remains mandatory for correct results.
The diagnostic accuracy of computers (symptom-checker algorithms) had never been directly compared with that of physicians. To fill this gap, a small-scale experiment was conducted. The first step was to identify the best-performing app: a comparison of 23 different symptom checkers concluded that Human Dx was the most accurate of them all. The experiment used clinical case vignettes that included the patients' history and symptoms but no physical examination or laboratory findings. Forty-five clinical cases were written and divided into three subsections by level of difficulty: easy, medium, and hard. Twenty-six cases described common conditions and 19 described uncommon ones. Physicians were then asked to submit free-text, ranked differential diagnoses for each vignette, and each of the 45 clinical cases was solved by 20 physicians. Finally, two external physicians hand-reviewed every submitted diagnosis and judged whether the correct diagnosis was listed first or within the top three.
Across all clinical cases, physicians listed the correct diagnosis first more often than the symptom checkers did (72.1% vs. 34.0%, P < 0.001), and they included it in their top three differentials more often as well (84.3% vs. 51.2%, P < 0.001). The research also showed how this gap varied with difficulty: the physicians' advantage was greatest on the harder clinical cases and greater still on the uncommon ones. The symptom checkers, by contrast, performed best on the easier and more common cases.
This study was the first direct comparison between doctors and apps, and it concluded that physicians greatly outperformed computer algorithms in diagnostic accuracy. It should be noted, however, that the clinical cases contained no laboratory results, radiological findings, or physical examination findings, so they did not reflect the real complexity of patients. For deeper insight, we would need to further evaluate the strengths of symptom checkers and identify where they offer a genuine advantage. If we learn how to implement them correctly in the medical field, they could potentially help physicians diagnose even more effectively. Until proven otherwise, though, there is no substitute for a medical doctor and a medical team.
Author: Kavish Khatib
Photo sources: www.dogtownmedia.com, www.tekrevue.com, www.atelier.net