Dermatology data sets lack ethnicity, skin type information

Only 1.3% of images in the public data sets included patient ethnicity.
By Laura Lovett
01:34 pm

Photo by Angela Roma from Pexels

Dermatology is a hot space in digital health. Even Google is creating tools to help patients find answers to dermatological questions. However, there are still many questions about how artificial intelligence algorithms are trained, and whether the data sets used are representative of the population. 

New research out of The Lancet found that publicly available data sets used to train skin cancer diagnosis algorithms lack data on ethnicity and skin type.

“Although this represents a rich data resource for innovation, lack of transparency in metadata reporting for clinically essential characteristics (such as ethnicity and Fitzpatrick skin type) limits the clinical utility of these images alone. These issues are not limited to dermatology datasets, but have also been reported in ophthalmology and radiology,” authors of the research wrote. 

Researchers searched MEDLINE, Google and Google Dataset Search and found 21 open-access data sets containing 106,950 skin-lesion images. However, information about ethnicity and skin type was limited. Patient ethnicity was available for only 1.3% (1,415) of all images, and skin type was included for only 2.1% (2,236).

Where ethnicity was reported, there was no representation of individuals with African, Afro-Caribbean or South Asian backgrounds. Additionally, where skin type was reported, only 11 images were from individuals with darker skin shades (Fitzpatrick skin type V or VI).

Researchers found that among the data sets there was “substantial under-representation of darker skin types.” The bulk of the data sets came from European, North American and Oceanian countries. 

WHY IT MATTERS 

As more digital dermatology products come onto the market, many of them may not work for all skin types.

“These findings highlight the dangers of implementing algorithms for widespread use on broad populations without dataset transparency, especially if algorithm training was undertaken using a restricted demographic cohort.

“Algorithm underperformance and misdiagnosis have serious implications for patients with skin cancer. They not only risk missing treatable malignancies, but can also result in avoidable surgical procedures and cause unnecessary anxiety,” the authors of the report wrote.

THE LARGER TREND 

While digital dermatology made headlines in the spring with the reveal of Google’s AI-powered dermatology assistant, skin health products have been on the market for some time.

First Derm, SkinVision, DermaSensor, VisualDX and Doctor Hazel AI are all working on digital dermatology products.

 