We are Still Learning the Lesson Charles Babbage Taught Us in 1821

Back in 1821 when Charles Babbage introduced the world to his Difference Engine, one of the world’s first mechanical computers, he taught us that bad input = bad output. This is a lesson we are still learning today in healthcare. As we leap into the world of artificial intelligence (AI), machine learning (ML), and large language models, we would do well to remember this lesson before relying too heavily on the output of these fantastical technologies.

At the recent HIMSS23 Conference in Chicago, Charlie Harp, CEO of Clinical Architecture – a company that provides solutions for healthcare data quality, interoperability, and clinical documentation – delivered a spotlight session that highlighted the work of Babbage and his important lesson.

Charlie Harp giving us a history lesson – channeling Charles Babbage and his “difference engine” – even back in the mid-19th century bad input = bad output #interop #HIMSS23 @ClinicalArch pic.twitter.com/Wtdr9zhJvj

— Colin Hung (@Colin_Hung) April 19, 2023

Relying on Output from Bad Input is Bad

In his presentation, Harp recited this hilarious quote from Babbage from the early 1800s (you have to read it and imagine a posh British accent):

“On two occasions I have been asked [by members of Parliament], ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.”

Using dry English wit, Babbage effectively created the concept of “Garbage In. Garbage Out.” This concept is as true today as it was in Babbage’s time, and we are in danger of not learning the lesson.

In recent months the world has become enamored with new AI tools like ChatGPT, which can perform amazing feats of writing in response to user-created prompts. Over the past few years, AI tools have been helping to improve radiology workflows, direct patients to the most appropriate level of care, optimize clinician schedules, and automate thousands of administrative tasks. Yet we rarely ask the question: what data was used to train these AI tools? Is that data representative of the world these AI tools now operate in? Are we confident the data was of good quality?

Harp challenged the audience to think about this during his HIMSS23 presentation.

In order to truly leverage new algorithms, apply ML or even to get better reporting, the data on a patient needs to be as accurate as possible and as high quality as possible or we risk drawing bad conclusions – via Charlie Harp #HIMSS23 @ClinicalArch #HITsm pic.twitter.com/SXWYttx2O1

— Colin Hung (@Colin_Hung) April 19, 2023

More Focus on Data is Happening

It was very apropos that Harp presented his own data analysis as part of his presentation. He analyzed the occurrence of the term “Garbage in. Garbage out.” in PubMed and plotted the results over time.

Love the analysis that Charlie Harp did on PubMed – looking at the occurence of “Garbage-in, Garbage-out” For a long time, it was quiet but as we starting using more data the occurrence started to rise. @ClinicalArch #HIMSS23 pic.twitter.com/bkeYuFMlzV

— Colin Hung (@Colin_Hung) April 19, 2023

From 1972 to 1998 there is barely a mention of the term. From 1999 to 2022, however, Harp found a steady rise in the use of the term in publications. Interestingly, Harp also plotted major health IT milestones on his chart, such as MIPPA, MIPS, MACRA, and CURES. You could say that concern about data quality seems to be growing as regulations raise the need for quality data.
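If you want to try a similar count yourself, here is a minimal sketch in Python. It is not Harp's method (his presentation did not describe how he pulled the numbers); it simply assumes the public NCBI E-utilities `esearch` endpoint, filters by publication date one year at a time, and assumes the JSON response carries the hit count under `esearchresult.count`. The function names (`build_query_url`, `parse_count`, `yearly_counts`) are hypothetical, chosen for illustration.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities search endpoint (assumed here; not named in the talk).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query_url(term, year):
    """Build an esearch URL that counts PubMed records matching a phrase in one year."""
    params = {
        "db": "pubmed",
        "term": f'"{term}"',   # quotes request a phrase search
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",    # only the hit count is needed
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_count(response_text):
    """Extract the hit count from an esearch JSON response (assumed layout)."""
    return int(json.loads(response_text)["esearchresult"]["count"])


def yearly_counts(term, start_year, end_year, fetch=None, delay=0.5):
    """Return {year: count} for each year in the inclusive range.

    `fetch` takes a URL and returns the response body as text. It defaults to a
    plain HTTP GET and can be swapped out for offline testing. `delay` pauses
    between requests to stay polite to the public service.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url) as response:
                return response.read().decode("utf-8")

    counts = {}
    for year in range(start_year, end_year + 1):
        counts[year] = parse_count(fetch(build_query_url(term, year)))
        if delay:
            time.sleep(delay)
    return counts


if __name__ == "__main__":
    # Requires network access; prints one line per year.
    for year, count in yearly_counts("garbage in, garbage out", 1972, 2022).items():
        print(year, count)
```

Fittingly, the counts you get back are only as good as the query: the exact phrasing, punctuation, and date field you search on will change the numbers, so treat any chart built this way as a rough trend rather than a precise measurement.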

This analysis aligns with a message that Healthcare IT Today has discussed with Harp on recent occasions – that quality data is vital to healthcare.

Quality of Patient Data

According to Harp, patient data is the largest contributor to the overall quality of health data.

Charlie Harp shares that’s patient data is the largest contributor to the overall quality of your data. #HIMSS23 @ClinicalArch pic.twitter.com/k3pKSi8ODs

— Healthcare IT Today (@hcittoday) April 19, 2023

In a recent data quality survey conducted by Clinical Architecture, Harp’s team found two interesting results. The first was that healthcare organizations rated their SDOH (social determinants of health), allergy, and procedure data as the poorest quality, while demographic data ranked highest.

What is the quality of the specific domain of patient data?

Demographic data ranked as high quality, SDOH ranked the poorest quality. #HIMSS23 @ClinicalArch pic.twitter.com/HRB8TFW1eS

— Healthcare IT Today (@hcittoday) April 19, 2023

The second interesting result was that overall, organizations are not very confident in the quality of patient data they have collected. Worse, the survey found that organizations have very little trust in data that originates from outside their organization.

The survey found that we do not feel that the quality of our patient data is where it should be and we don’t trust the data that comes from others (it is the only data worse than ours).#HIMSS23 @ClinicalArch pic.twitter.com/KQdUKG8ZXe

— Healthcare IT Today (@hcittoday) April 19, 2023

And therein lies the paradox. If we ourselves are not confident in the quality of the health data we have collected, then how confident should we be in tools that are based on or trained on that same data? After all, where do the companies that are making the AI algorithms get the datasets they use for training? Makes you wonder.

“Poor patient data quality impacts our ability to be successful as an industry”#HIMSS23 @ClinicalArch pic.twitter.com/Tws6uOE9hV

— Healthcare IT Today (@hcittoday) April 19, 2023

For the full survey results check out: https://clinicalarchitecture.com/data-quality-survey/

Improving Data Quality

Harp ended his presentation on a positive note by quoting Aristotle. For accuracy, he used the actual translated quote: “As it is not one swallow or fine day that makes a spring, so it is not one day or a short time that makes a man blessed and happy.” In other words, our desired endpoint does not happen in an instant or with one event. It is achieved over time.

If we want quality health data (and we should, according to Harp), then we need to invest the time and resources to make it so. We have the technology to solve our data challenges. Now we need to be willing to commit ourselves to the journey of data quality.

If we don’t, then Babbage will have been correct all those years ago.

Learn more about Clinical Architecture at: https://clinicalarchitecture.com/

Clinical Architecture is a sponsor of Healthcare Scene.

About the author

Colin Hung
