A shortage of data scientists could be holding back advances in healthcare

The tools to unlock healthcare advances from big data already exist, but implementing those tools will require a change in approach.

Booz Allen Hamilton Chief Medical Officer Kevin Vigilante, M.D., M.P.H., and co-author Steve Escaravage, M.S., a senior vice president in Booz Allen’s digital, analytics and strategy practice, recently published an op-ed in the New England Journal of Medicine laying out their vision of the intelligence community as a model for healthcare leaders dealing with the ever-increasing volume of data the industry accrues.

Some experts have suggested looking to commercial tech companies like Netflix or Google, but Vigilante and Escaravage say those companies cover a much narrower set of data types. They also don’t require nearly the degree of precision and security in their datasets that researchers in the healthcare space do.

At the same time, the healthcare industry’s inability to tap into the enormous set of data it currently holds could be holding back innumerable potential scientific advances, according to Dr. Vigilante. “The overwhelming majority of the data we have in healthcare goes unused because we don’t have the right environments to integrate it,” he said.


Getting the data into shape only represents a part of the problem. Fruitful analysis requires a shift in the way researchers think about the path to discovery. Vigilante sees this moment as an inflection point in the history of biological sciences.

“We now have so much data that the data itself is the scientific substrate—you can do discovery by looking at massive amounts of data and seeing patterns you would never otherwise have imagined existed,” he said.

Those patterns in turn lead to hypotheses, reversing the traditional order in which hypotheses drove the production of data to describe scientific phenomena. The concept isn’t exactly new: Vigilante pointed out that Lynn Etheredge outlined a goal of a “rapid-learning health system” in an article published by Health Affairs in 2007. He believes the situation hasn’t changed yet because healthcare hasn’t adapted its methods to the new reality.

Integrating and mining data the way the intelligence agencies do may also be less of a heavy lift than it appears. The intelligence agencies have essentially been in an information war since the early 2000s, which has driven a rapid pace of innovation in the field. Taxpayers have already paid for innovative approaches such as next-generation data lakes and automated metatagging, both of which are key elements in producing the type of well-integrated, high-quality, highly secure datasets required in both the intelligence and healthcare fields. Vigilante points out that sharing those techniques and approaches across the federal government makes a lot more sense than reinventing the wheel.


“Classically trained statisticians use traditional models: regression analysis and so forth. The intel community has another toolkit of analytic methods that are by and large completely unused in healthcare,” he said.

According to Escaravage, intelligence analysts have changed their approach dramatically over the past two decades. Data scientists have become a much larger part of analysis as intelligence analysts have developed the ability to actively engage with source data and collection.

“The ability to write queries and script procedures, combine data, ask questions, that’s all a minute-by-minute activity today,” he said, adding, “Twenty years ago, that wasn’t the case.”

By contrast, Vigilante sees a gross undersupply of data scientists in the healthcare industry today.

“There’s a human capital issue that will take at least a decade to address as we migrate to more advanced methods,” he said.

In the end, however, he believes the investment will be well worth it, because it will finally enable the rapid-learning healthcare system and unleash the potential locked in the constantly growing pile of healthcare data.