Researchers found that the CNN was more reliable at identifying melanoma, an aggressive form of skin cancer, and benign moles (called nevi) than even experienced dermatologists.
“I expected only a performance on an even level with the physicians. The outperformance even of the average experienced and trained dermatologists was a major surprise,” first author Dr. Holger Haenssle, a senior physician at the department of dermatology, University of Heidelberg, Germany, told Healthline.
The study builds on a landmark 2017 paper from researchers at Stanford that first proposed that artificial intelligence could, through machine learning, visually identify skin cancer after examining thousands of pictures of the disease. Machine learning is a process through which artificial intelligence “learns,” improving its performance at a given task based on past performance and data.
The process that the CNN used in this research is based on an algorithm developed by Google that allows artificial intelligence to visually differentiate between thousands of different objects.
However, until now the accuracy of the process had not been compared with that of a large group of human dermatologists.
To conduct the study, Haenssle and his team assembled a test set of 100 dermoscopic images that included melanomas and benign nevi.
Dermoscopic images are magnified, high-resolution images created with a tool known as a dermoscope. They allow for easier diagnoses than inspection with the naked eye.
The images were analyzed by both the CNN and a group of 58 dermatologists from around the world.
The dermatologists came from 17 countries and ranged in skill level from beginner to expert. Of the 58, 17 (just under 30 percent) were “beginners,” meaning they had less than two years of experience in dermoscopy; 11 were “skilled,” with two to five years of experience; and 30 (more than half) were considered “experts,” with more than five years of experience.
In the first stage of the study, dermatologists were given only the images and asked to decide whether each lesion was a melanoma or a benign nevus. They were then asked to indicate what action should follow: surgery (excision), short- or long-term follow-up, or no action.
After the first round of analysis, researchers gave the dermatologists additional close-up images of the same areas and clinical information, such as age and sex of the patient and where the lesion was located on the body.
During the first portion of the study, the dermatologists correctly identified an average of 86.6 percent of melanomas, with the more experienced doctors scoring higher, at 89 percent. On average, the group also correctly identified 71.3 percent of the benign moles.
When the CNN was put to the same task, it correctly identified 95 percent of melanomas.
During the subsequent stage of the study, when doctors were given additional information, their accuracy improved. Average detection of melanomas increased to 88.9 percent, and benign moles to 75.7 percent.
But even with that additional information, the performance of the human dermatologists was still worse than that of the CNN.
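The two kinds of percentages reported here are usually called sensitivity (the share of melanomas correctly identified) and specificity (the share of benign moles correctly identified). As a rough illustration of the arithmetic, here is a minimal Python sketch. The function names and the counts are hypothetical, chosen only to show the calculation; they are not data from the study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of actual melanomas that were called melanoma."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of benign moles that were correctly called benign."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical tally for one reader, NOT figures from the study:
# 10 melanomas (9 caught, 1 missed) and 30 benign moles (24 cleared, 6 flagged).
sens = sensitivity(true_positives=9, false_negatives=1)
spec = specificity(true_negatives=24, false_positives=6)

print(f"sensitivity: {sens:.1%}")  # prints "sensitivity: 90.0%"
print(f"specificity: {spec:.1%}")  # prints "specificity: 80.0%"
```

The two measures pull against each other: a reader who flags every lesion as melanoma would reach 100 percent sensitivity but 0 percent specificity, which is why both figures are needed to compare the dermatologists with the CNN.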
The widespread integration of machine learning AI into dermatological clinical practices is likely to increase detection of skin cancers and improve outcomes.
In an accompanying editorial, Dr. Victoria Mar, of Monash University in Melbourne, Australia, and Professor H. Peter Soyer, of The University of Queensland in Brisbane, Australia, described how this technology might become common practice.
“In the future, I think AI will be integrated into practice as a diagnostic aid, particularly in primary care, to support the decision to excise a lesion, refer, or otherwise to reassure that it is benign,” Mar told Healthline.
Mar said that AI could work in conjunction with physicians to treat more patients more effectively.
“There is the potential for AI technology to be integrated with 2D or 3D skin imaging systems, which means that the majority of benign lesions would be already filtered by the machine, so that we can spend more time concentrating on the difficult or more concerning lesions,” she said. “To me, this means a more productive interaction with the patient where we can focus on appropriate management and provide more streamlined care.”