Embracing data science in lab medicine

In January 2026, the Association for Diagnostics & Laboratory Medicine (ADLM) is launching a Data Science in Laboratory Medicine Certificate Program designed by and for clinical laboratorians. Through real-world examples, the program will highlight how to use data to address daily lab challenges. Participants will learn about core data science concepts and get practical tools to optimize processes, improve testing algorithms, reduce errors, and enhance efficiency.

We spoke to Patrick Mathias, MD, PhD, lead faculty member of the new program, about what he hopes participants will gain from it, and how he first became interested in data science. Mathias is associate professor and associate medical director of the informatics division in the department of laboratory medicine and pathology at the University of Washington School of Medicine, and also serves as the department's vice chair of clinical operations.

How did data science become one of your areas of expertise?

My undergraduate degree is in electrical engineering. As part of that education, I learned a bit of computer science. At the time, data science was still an emerging discipline, so it was not on my radar at all. After receiving a master’s degree in electrical and computer engineering, I attended medical school for an MD/PhD. During that period, I conducted research that introduced me to the computer programming language R, which is used to analyze data. Although data science was not part of my long-term plan, I was intrigued.

Then I did a residency in laboratory medicine and clinical pathology because I wanted to apply my engineering background to the diagnostic space. That’s when I learned about clinical informatics and realized I wanted to pursue it as a subspecialty. I started to see the value of working with diverse data sets that include data generated in the lab, as well as data from electronic medical records (EMRs) that lives outside of the lab. I began thinking about how to use larger data systems to improve patient care.

How does your lab use data science to improve testing and patient care?

At the most basic level, we monitor our quality metrics across the organization. In addition to tracking routine measures like turnaround times for our tests, we’re getting deeper into laboratory information and EMR data to assess patient wait times and other measures of care. And we’re leveraging automation to do it without having to put in a lot of manual work. For example, we gather patient satisfaction data from interactive kiosks that patients use after phlebotomy visits.

So we do a lot of the same things that other laboratories do, but we’ve used data science to take that to the next level. We aim to measure the full impact of how patients interact with the laboratory and compare our lab’s performance with national benchmarks for patient satisfaction and other service delivery metrics.

As data science becomes more sophisticated, what do you hope labs will be able to accomplish with it in the future?

I want labs to contribute positively to patient outcomes, not just by using data from routine testing, but also by looking into the EMR. For example, how can we use this information to identify gaps in care? How can we as a laboratory think more holistically about patients? And how can we support the processes within our healthcare systems that are dependent on the laboratory? I want as many people as possible to learn about data science. Capturing just a small amount of the data that lives outside of the laboratory could make a big difference in patient health outcomes.

On the flip side of that question, what’s something people want data science to accomplish that may not be realistic?

There’s a lot of hype around artificial intelligence (AI), and the hot thing right now is large language models (LLMs) such as ChatGPT and Claude. While I think there’s a lot of value in these new tools and models, there are also inflated expectations that LLMs will solve all of our problems.

As with any nascent technology, we have a new set of tools, and we’re thinking critically about how to apply those tools to benefit our patient population. Researchers already use AI in both small pilot studies and larger, more ambitious prospective trials. But we also need to understand that these efforts may not lead to the rosy outcomes people expect, at least in the short term. That being said, I’m an optimist by nature, and I think AI technology will ultimately lead to great things.

One concrete example: At my institution, we have used AI to predict how many patients will come to one of our phlebotomy areas three to four days from now. The model works pretty well, but the challenge is figuring out how to implement it to deliver improved patient outcomes. In practice, it’s really hard to make an employee come in early for a predicted patient surge. Even when AI produces a useful operational forecast, we have a lot of hurdles to clear before that information can become useful for patients.

What other hurdles face data science in lab medicine over the next 5 to 10 years?

Building awareness among laboratorians poses the biggest hurdle. We need education and training to help lab professionals understand the strengths and weaknesses of these technologies. Like any new tool, data science has a gradual on-ramp. It will take time before people fully understand how to deploy new tools.

In general, data science as a field is much more mature than it was when I started out. At the same time, it still lacks reliable resources that relay its principles and how to incorporate them into healthcare training programs. That was one of the motivations behind introducing ADLM’s data science certificate program. We wanted to ensure that our community has solid foundational materials that they can use to develop this knowledge.

Why is it important for laboratory medicine professionals to learn about data science?

Laboratory medicine has always had a responsibility to produce large volumes of objective patient data. Downstream users of that information include clinicians and healthcare systems, many of whom don’t understand the limitations of that data. Laboratory professionals should be in the driver’s seat when it comes to ensuring appropriate data usage. We should inform our colleagues about what can and cannot be learned from the information available.

What does ADLM’s data science certificate program cover?

We spent a good amount of time deliberately designing the most useful curriculum for laboratorians. We started by asking the broader informatics community which topics they found most valuable in the realm of data science. From there, we created a foundational program meant to promote data literacy and reinforce basic concepts such as how healthcare data is generated and used; principles of analysis, statistics, and visualization; and AI applications in the laboratory.

We also tried to frame all of these topics around where the data comes from, how laboratorians should conceptualize data, and the best practices for data collection and analysis.

What sets this program apart from other data science educational programs?

I don’t know that there’s any other program that focuses quite this intensely on laboratory medicine, with a faculty that includes pathologists and specialists in laboratory medicine who can share real-life experience and examples.

Once people complete this program, what should they be able to do in practice?

We hope to provide people with the framework to think about what their local resources are within their laboratories and who to work with to get the data they need. We want them to learn good principles for using that data and drawing insights from it.

Within ADLM, we envision delivering additional data science content through the ADLM Annual Meeting and other educational materials like webinars and publications. These resources will build on the certificate program’s foundation, with deeper dives into specific areas like AI, data visualization, data management, and the newest applications of those techniques in laboratory medicine. This program is just the first step in a longer learning journey.

Jen A. Miller is a freelance journalist who lives in Audubon, New Jersey. Bluesky: @byjenamiller.bsky.social

Read the full January-February issue of CLN here.