Blood-Red Tape: How Redundant Data Collection Leads to Scandal

Throughout the health care field, clinical and administrative staff complain about the burden of collecting data required by government regulations–often with no idea what purpose the data serves. Many of these regulatory requirements are desperate stabs at filling the gaps left by a lack of data standards and interoperability–this, a decade into the U.S. government’s goal of making data exchange simple and universal in health care.

But now, ill-considered data collection requirements have led to a lurid headline on the front page of the Sunday New York Times on March 14: “Maggots, Rape and Yet Five Stars: How U.S. Ratings of Nursing Homes Mislead the Public.” This extensive examination of the five-star system offered by the Centers for Medicare & Medicaid Services (CMS) was a marathon exercise in big data, where reporters “combed through 373,000 reports by state inspectors and examined financial statements submitted to the government by more than 10,000 nursing homes.”

The results were predictable. When the data self-reported by nursing homes was checked against the facts–hospitalizations, inspection reports–it turned out that a huge number of nursing homes underreported incidents, overstated staffing levels, and made other adjustments to reality. These administrators violated not only the law but the ancient Deuteronomic injunction: “Do not have two differing weights in your bag–one heavy, one light.” Like merchants who buy using one set of weights and sell using another, the nursing homes were gaming the system.

A follow-up article shows how the investigation is leading to legal action.

The one big question that the reporters failed to ask in this long article–and the question we all should ask–is: If the New York Times could get accurate data on nursing home safety, why can’t CMS?

Put another way, why did CMS ignore accurate sources of data and instead force the nursing homes to create another data set–in a process that was ripe for inaccuracy, whether intentional or accidental?

This scandal is just a particularly shocking outcome of a system that has resisted modern data-gathering and analysis for decades. Let’s take a look.

More Time on Forms Than on Patients

It’s well-known that screen time is contributing to doctor burnout. Computers speed up some tasks and prevent some kinds of errors–but spending more than half of their (very long) workday on the computer is clearly not the best use of doctors’ time.

Meaningful Use, over the course of the 2010s, imposed more and more data collection requirements. It has been reported to me that much of the data was already in records, collected during clinical visits, but that the electronic record vendors didn’t bother investing the time to repurpose that data. Instead, dutifully responding to regulations, they added new fields that a clinician or administrator had to fill out.

Meanwhile, the rest of the computer field was striving to make data collection sleeker and more consistent. In finance, retail, airlines, and elsewhere, data was being curated, cleaned, and stored in data lakes. The watchword was “a single source of truth.” For fast retrieval, duplication was sometimes necessary, but it was done through rigorously defined flows–a completely different world from the haphazard data storage practices of health care.

The nursing home report in the New York Times should cause us all to re-examine not just the CMS rating system, but attitudes toward data collection and storage throughout the health care system. An example of forward-looking thinking is a recent recommendation for more centralized data collection on COVID-19 vaccinations.

Privacy Concerns

Of course, when health care reformers call on institutions to share data more freely, the old guard puts up a privacy shield. I’m not referring to the devious “privacy shield” that the European Union recently discarded in its dealings with U.S. companies, but a shield to protect hospitals, insurers, and others who want to hoard data. Business secrets are another oft-used parry against sharing data.

And it’s true that patient data is hard to protect. Professional, sophisticated anonymization/deidentification techniques work. The experts know how to create data sets in which the risk of reidentifying a patient is negligible (though never totally zero). But this expertise is in short supply.

HIPAA provided a “safe harbor” for deidentifying patient data: a checklist of 18 identifiers to remove. In 1996, when the law was passed, compute power was less available and anonymization techniques were less sophisticated, so perhaps simple steps like removing fields were the best recourse available. The safe harbor has long been understood to be too lax in some ways and too strict in others.
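Safe-harbor deidentification of this sort amounts to little more than dropping a fixed list of fields. A minimal sketch in Python, with illustrative field names (the real safe harbor enumerates 18 categories of identifiers, such as names, geographic subdivisions smaller than a state, and Social Security numbers):

```python
# A crude safe-harbor-style pass: drop any field whose name appears on a
# fixed blocklist of identifiers. Field names here are hypothetical.
SAFE_HARBOR_FIELDS = {"name", "street_address", "phone", "email", "ssn"}

def strip_identifiers(record):
    """Return a copy of the record with blocklisted fields removed."""
    return {k: v for k, v in record.items() if k not in SAFE_HARBOR_FIELDS}

patient = {"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis": "asthma"}
deidentified = strip_identifiers(patient)  # only "diagnosis" survives
```

The weakness is exactly what the article notes: the blocklist is static, so it removes too much from some data sets and leaves reidentification risk in others.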

Good data anonymization currently includes analyzing a population and determining which values are relatively rare in each field. You have to be more careful to fuzz the data for people with a rare disease than people with a common condition. The calculation is different for each data set, meaning it has to be calculated by each institution for its particular data. And it must be recalculated periodically, because data sets change and attacks on data change.
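The rarity analysis described above can be sketched in a few lines of Python: count how often each value occurs in a field, then suppress (or fuzz) values that fall below a threshold, since a rare diagnosis can single out a patient. This is a simplified, k-anonymity-style suppression, not any particular institution’s method, and the data is invented:

```python
from collections import Counter

def suppress_rare_values(records, field, min_count=5):
    """Replace values appearing fewer than min_count times in `field`
    with a generic placeholder, so rare attributes cannot single out
    an individual. A crude stand-in for real fuzzing techniques."""
    counts = Counter(r[field] for r in records)
    return [
        {**r, field: r[field] if counts[r[field]] >= min_count else "OTHER"}
        for r in records
    ]

# Toy data set: one rare diagnosis among several common ones.
records = [{"diagnosis": "hypertension"}] * 6 + [{"diagnosis": "fabry disease"}]
safe = suppress_rare_values(records, "diagnosis", min_count=5)
```

Note that the threshold is a property of the particular data set, which is why, as the article says, the calculation must be redone per institution and revisited as the data and the attacks evolve.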

But calculations throughout the computer field are getting easier over time. Many businesses that used to assign analytics to programming staff now have tools that let “citizen data scientists” accomplish the tasks. The health care field could standardize tools to help their members create robust anonymized data sets.

Data sharing requires an investment. But how many billions of dollars will we save by freeing clinicians and administrators from redundant, error-prone data entry? How many billions of dollars would we save if patients in unsafe nursing homes had been removed before they got sick?

About the author

Andy Oram

Andy is a writer and editor in the computer field. His editorial projects have ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. A correspondent for Healthcare IT Today, Andy also writes often on policy issues related to the Internet and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. Conferences where he has presented talks include O'Reilly's Open Source Convention, FISL (Brazil), FOSDEM (Brussels), DebConf, and LibrePlanet. Andy participates in the Association for Computing Machinery's policy organization, named USTPC, and is on the editorial board of the Linux Professional Institute.