The common belief that data is objective and therefore unbiased is a myth. This can be a hard pill to swallow for professionals in the data-driven field of laboratory medicine. Given the growing enthusiasm for machine learning and data analytics to improve quality and efficiency, we must be aware of a significant hidden danger: unfairness. Mark Zaydman, MD, PhD, describes the source of this often-overlooked problem: “Algorithms are not training in some fair universe, they train in our reality with all our warts and issues.” In other words, if we do not ensure fairness in the algorithms we use, they will codify, scale, and exacerbate existing disparities in harmful ways.
This morning’s scientific session, “Ensuring Equity and Fairness in Machine Learning and Data Analytics,” will provide a primer on algorithmic fairness to equip all laboratorians, regardless of their informatics background, with the necessary tools to safely deploy algorithms for patient care. This session comes on the heels of Monday’s discussion of algorithmic fairness and data visualization hosted by ADLM’s Health Equity and Access Division and the Data Analytics Steering Committee, which also featured the winners of the ADLM FairLabs Data Analytics Challenge.
In the first presentation, Zaydman will introduce algorithmic fairness, define objective strategies to measure it, and discuss the uses and limitations of the various approaches. He will illustrate these concepts using the landmark Obermeyer study, which scrutinized a broadly deployed resource allocation algorithm that exhibited bias against Black patients. In a classic example of algorithmic inequity, the algorithm used health care spending as a proxy for health need; because less is spent on Black patients at the same level of illness, Black patients had to be significantly sicker than their White counterparts to receive the same algorithmic risk score and the additional health care resources it triggered. Zaydman argues that we need to reimagine our quality-management programs to include algorithmic fairness and equity.
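To make such a measurement concrete, here is a minimal sketch, in Python with pandas, of the kind of audit performed in the Obermeyer study: comparing average illness burden across groups among patients who received the same risk score. The column names (risk_score, n_chronic_conditions, race) are hypothetical placeholders, not taken from the study itself.

```python
import pandas as pd

def severity_by_group_at_score(df: pd.DataFrame, n_bins: int = 10) -> pd.DataFrame:
    """Bin patients by algorithmic risk score, then compare mean illness
    burden across groups within each bin.

    A fair score implies a similar illness burden for every group at the
    same score; if one group is consistently sicker within a score bin,
    that group must be sicker to receive the same resources.
    """
    df = df.copy()
    # Split patients into equal-sized risk-score bins (deciles by default).
    df["score_bin"] = pd.qcut(df["risk_score"], q=n_bins, duplicates="drop")
    # Mean number of chronic conditions per group within each score bin.
    return (
        df.groupby(["score_bin", "race"], observed=True)["n_chronic_conditions"]
          .mean()
          .unstack("race")
    )
```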
Weishen Pan, PhD, will then explain why machine-learning algorithms generate unfair outputs. He will present his primary research using causal inference methods to mathematically identify real-world drivers of algorithmic bias. This strategy enables a “root cause” analysis of unfairness, pinpointing its plausible sources and informing downstream corrective actions.
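As a toy illustration of the general idea, the sketch below uses a mediation-style regression decomposition on simulated data to trace a score disparity back to its source; this is one elementary flavor of causal root-cause analysis, not necessarily the specific methods Pan will present, and every variable name and effect size is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

group = rng.integers(0, 2, n).astype(float)     # 0/1 group membership
need = rng.normal(size=n)                       # true health need, equal across groups
# Mediator: recorded spending understates need for group 1 (e.g., access barriers)
spending = need - 0.8 * group + rng.normal(scale=0.3, size=n)
score = spending + rng.normal(scale=0.1, size=n)  # model trained on the cost proxy

def group_gap(outcome, covariates):
    """OLS coefficient on `group`: the outcome gap at equal covariate values."""
    X = np.column_stack([np.ones(n), group] + covariates)
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta[1]

# Total unfair gap: the score difference between groups at equal true need (~ -0.8).
print(f"score gap at equal need:            {group_gap(score, [need]):+.2f}")
# Conditioning on the mediator collapses the gap to ~0, pointing to the
# spending proxy, not true need, as the root cause of the disparity.
print(f"score gap at equal need + spending: {group_gap(score, [need, spending]):+.2f}")
```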
During the final presentation of the session, Jenny Yang, MSc, BASc, will focus on methods for engineering fair algorithms at every stage of the development process: before, during, and after model training. She will provide examples from her own research using “fairness-aware” algorithms to produce accurate and unbiased COVID-19 prediction tools.
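For a flavor of the post-training stage, here is a minimal sketch of one widely used post-processing technique: choosing group-specific decision thresholds so that true positive rates match across groups (the “equal opportunity” criterion). This illustrates the general category, not necessarily the approach Yang’s research takes, and the function and parameter names are hypothetical.

```python
import numpy as np

def equal_opportunity_thresholds(scores, labels, groups, target_tpr=0.8):
    """Per-group score threshold achieving roughly the same true positive
    rate (TPR) in every group."""
    thresholds = {}
    for g in np.unique(groups):
        # Scores of the true positives in this group.
        positives = scores[(groups == g) & (labels == 1)]
        # The (1 - target_tpr) quantile of those scores is the cutoff that
        # flags ~target_tpr of this group's true positives.
        thresholds[g] = np.quantile(positives, 1 - target_tpr)
    return thresholds
```

The pre-training analogue of this idea reweights or resamples the data before fitting, and the in-training analogue adds a fairness penalty to the model’s loss function.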
There are many opportunities to use data science to improve how we deliver equitable patient care. To be good stewards of machine learning in laboratory medicine, we must expand our criteria for successful algorithms beyond accuracy and implementability to include fairness.
Attendees will come away from this session with an understanding of algorithmic bias, its roots, and tools for ensuring fairness. Given laboratory medicine’s long history of creating rich data sets and the collective enthusiasm about the potential of machine learning and data analytics, there has never been a better time to address this critical problem.