Stopping Infection Outbreaks with AI and Big Data

Steven Cherry Hi, this is Steven Cherry for IEEE Spectrum’s podcast, Fixing the Future.

Hospitals are where we go to get cured of infections and diseases, but sadly, sometimes tragically, and ironically, they are also places we go to get them. According to the U.S. Centers for Disease Control and Prevention, “On any given day, about one in 31 hospital patients has at least one healthcare-associated infection.”

Yet, according to Dr. Lee Harrison, “The current method used by hospitals to find and stop infectious disease transmission among patients is antiquated. These practices haven’t changed significantly in over a century.”

Until perhaps now. Doctors at UPMC, the University of Pittsburgh Medical Center, have developed a new method that uses three distinct, relatively new technologies (whole-genome sequencing surveillance, machine learning, and electronic health records) to identify undetected outbreaks and their transmission routes.

Dr. Lee Harrison is a Professor at the University of Pittsburgh, where he’s the Associate Chief of Epidemiology and Education and, more to our point today, the head of its Infectious Diseases Epidemiology Research Unit. He’s a co-author and corresponding author of a new paper that describes the new outbreak-detecting methodology, and he’s my guest today.

Lee, welcome to the podcast.

Lee Harrison Thanks, Steven. Nice to be here.

Steven Cherry Maybe we could start with the practices that haven’t changed in over a century … How do hospitals currently find and stop infectious disease transmissions?

Lee Harrison So, most hospitals use an approach that focuses on somebody noticing something, something happening. So, for example, a nurse on a particular nursing unit notices that there are three patients with C.diff —

Steven Cherry And C.diff is —

Lee Harrison — is Clostridium difficile. It’s a serious infection that occurs in patients who have had antibiotics, and typically that nurse will alert the infection prevention folks in the hospital and then an investigation will be started.

Steven Cherry That 1-in-31 statistic is more than a little scary. Does that mean that if I’m in the hospital for three days and nights, I have like a one in 10 chance of picking up an infection I wouldn’t have gotten if I weren’t in the hospital?

Lee Harrison There are different ways that people get infections in the hospital. Sometimes the infections the patients get are actually from their own bodies. So, for example, if you come into the hospital and you’re already colonized with MRSA, as an example, then if you have surgery, then you can develop a wound infection from that. But in other people, the infection acquired in the hospital is actually transmitted to them in the hospital. So there’s two ways that patients get hospital-acquired infections. But yeah, they are quite common, as you cited in the CDC study.

Steven Cherry So this new system, which you and your colleagues call the Enhanced Detection System for Healthcare-Associated Transmission, or EDS-dash-HAT—which, by the way, that abbreviation works better on paper than over the air—starts by using genomic sequencing of all hospital infections as soon as they’re detected or suspected? How does that work?

Lee Harrison Yeah, so, we tried to make the acronym easy. We actually call it “edhat.” Yeah, the way it works is that we determined a long time ago that we were missing outbreaks by the traditional method. And so what we decided to do was combine several novel approaches. One is, as you pointed out, whole-genome sequencing surveillance. And the difference between that and what we currently do is reactive whole-genome sequencing. So in the scenario I painted before, when somebody in the hospital thinks there’s an outbreak, we can get the bacteria from the suspected outbreak and do whole-genome sequencing and confirm or refute the outbreak. And what we find is a lot of times we refute the outbreak. It looks like there may be an outbreak, but when we do the sequencing, the cases are unrelated. So EDS-HAT uses whole-genome sequencing surveillance, which basically says anybody who has been in the hospital for at least three days and has one of our target high-impact bacterial infections, we go down to the clinical microbiology lab. We collect those bacteria and we sequence all of them.

Steven Cherry And then that sequencing is entered into the patient’s electronic health record?

Lee Harrison No, it’s not, because the sequencing itself is not used for individual patient care. So the genetic makeup of the bacteria that they’re infected with really is not that relevant for the treatment because for treating, you just need to know what antibiotics will kill the bacteria, and that’s what the clinicians will use. We keep a separate database of EDS-HAT whole-genome sequencing.

Steven Cherry I see. So the algorithms that look through the electronic health records, they’re doing a kind of contact tracing.

Lee Harrison Yeah, so that’s a really good question. So the first step is the whole genome sequencing surveillance. And when we identify highly related bacteria, that tells us that there is an outbreak. It does not tell us what’s causing the outbreak. And traditionally, you know, people would go into the electronic health record and fish around. But traditionally mostly would be looking for clusters that are occurring at the same time on the same nursing clinic. We know that misses a lot of outbreaks. So, the reason we go into the electronic health record is basically to identify that transmission route that is causing the outbreak that was identified by the sequencing.
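An illustrative aside on what "highly related bacteria" could mean computationally. The interview does not say how EDS-HAT decides relatedness, so the following is only a minimal sketch under stated assumptions: isolates are compared by pairwise SNP (single-nucleotide polymorphism) distance, and any two isolates at or below a cutoff are linked into the same cluster. The function name, the isolate labels, the distances, and the cutoff of 15 are all hypothetical.

```python
def cluster_isolates(snp_distance, threshold):
    """Single-linkage clustering of isolates by pairwise SNP distance.

    snp_distance maps (isolate_a, isolate_b) -> number of differing SNPs.
    threshold is an assumed relatedness cutoff, not a value from the interview.
    """
    isolates = sorted({iso for pair in snp_distance for iso in pair})
    parent = {iso: iso for iso in isolates}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of isolates that is at or below the cutoff.
    for (a, b), dist in snp_distance.items():
        if dist <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for iso in isolates:
        groups.setdefault(find(iso), []).append(iso)

    # Only groups of two or more isolates suggest possible transmission.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical pairwise distances for four isolates of the same species.
distances = {
    ("A", "B"): 3, ("A", "C"): 250, ("A", "D"): 9,
    ("B", "C"): 248, ("B", "D"): 7, ("C", "D"): 251,
}
clusters = cluster_isolates(distances, threshold=15)
print(clusters)  # [['A', 'B', 'D']]
```

In this toy data, isolate C is hundreds of SNPs away from the others and is left out, which mirrors the point that sequencing can refute a suspected outbreak as well as confirm one.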

Steven Cherry So the algorithms are looking for things like the proximity of two patients in terms of their hospital beds or whether they both had a procedure that uses the same equipment, or the same doctor or nurse treated them. So, all of that information is in the electronic health record, or is it somehow inferred by the algorithm?

Lee Harrison That information is in the electronic health records. So a really important thing about EDS-HAT, and this is in contrast to when you’re investigating a community outbreak of, say, salmonella. If you’re trying to figure out the transmission route in an outbreak of salmonella, you’ve got to track down the patients, hope they will answer the telephone, hope they remember where they were exposed to it. And so figuring out how transmission is occurring in the community is very difficult.

In the hospital, all of the epidemiology is available in the electronic health record: where they were, when they were there, who they had contact with through their roommates, or who their health care providers were. And importantly, any procedure that was done to them. So, when hospitals do a procedure, they need to get paid for it. So, they put a charge code in the electronic health record pretty quickly. And so, we can exploit all of that information to say, ‘let’s take our ten patients in a supposed outbreak defined by whole-genome sequencing surveillance. And let’s compare them to all the other patients in the hospital and see which exposures are more common in the cases of the outbreak as compared to the rest of the patient population.’ And that’s how the data mining works and identifies those exposures that are likely causing these outbreaks.
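The comparison described above, outbreak cases against the rest of the patient population, can be sketched as a simple case-control calculation. This is not the published EDS-HAT algorithm; the interview does not name the statistic used. As an assumption, the sketch ranks each exposure by an odds ratio, with 0.5 added to every cell (the Haldane-Anscombe correction) so that empty cells do not cause a division by zero. Patient IDs, exposure names, and the function name are hypothetical.

```python
def rank_exposures(exposures_by_patient, case_ids):
    """Rank exposures by how much more common they are among outbreak cases.

    exposures_by_patient maps patient ID -> set of exposures drawn from the
    record (unit, procedure, device, provider). case_ids are the patients in
    the cluster defined by sequencing; everyone else serves as a control.
    """
    cases = set(case_ids)
    controls = set(exposures_by_patient) - cases
    all_exposures = set().union(*exposures_by_patient.values())

    ranked = []
    for exposure in all_exposures:
        a = sum(exposure in exposures_by_patient[p] for p in cases)     # exposed cases
        b = len(cases) - a                                             # unexposed cases
        c = sum(exposure in exposures_by_patient[p] for p in controls)  # exposed controls
        d = len(controls) - c                                          # unexposed controls
        # Haldane-Anscombe correction: add 0.5 to each cell.
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        ranked.append((exposure, odds_ratio))

    # Highest odds ratio first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical records: three outbreak cases and four other patients.
records = {
    "p1": {"gastroscope_17", "unit_4B"},
    "p2": {"gastroscope_17", "unit_6A"},
    "p3": {"gastroscope_17", "unit_4B"},
    "p4": {"unit_4B"},
    "p5": {"unit_6A"},
    "p6": {"unit_4B"},
    "p7": {"unit_6A"},
}
ranking = rank_exposures(records, case_ids=["p1", "p2", "p3"])
print(ranking[0])  # ('gastroscope_17', 63.0)
```

The output is a ranked list rather than a verdict: the shared device rises to the top, while the nursing units, which cases and non-cases share about equally, score near 1. A real analysis would also need significance testing and time windows, which this sketch omits.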

Steven Cherry We’re speaking with Dr Lee Harrison. When we come back, we’ll talk more about this data mining of electronic health records and about the infectious disease that’s currently ravaging the entire planet.

Fixing the Future is supported by COMSOL, the makers of COMSOL Multiphysics simulation software. Companies like the Manufacturing Technology Centre are revolutionizing the designs of additive manufactured parts by first building simulation apps from COMSOL models, allowing them to share their analyses with different teams and explore new manufacturing opportunities with their own customers. Learn more about simulation apps and find this and other case studies at comsol.com/blog/apps.

We’re back with my guest Dr Lee Harrison of the University of Pittsburgh’s Medical Center.

Steven Cherry Lee, I should disclose at this point that I have my own connection to the University of Pittsburgh. I teach there, though not at the Medical Center; I’m in the English Department, but my primary care physician is a UPMC doctor, and last year I was in a UPMC hospital for kidney stones, so I guess I have my own stake in all this working.

Lee Harrison Interesting to know.

Steven Cherry So does this system involve there being a lot more information in the electronic health records than most hospitals currently have? Or could most hospitals start doing this right away?

Lee Harrison Yeah. I mean, the beauty of EDS-HAT is that we are mining data elements that are present in basically all electronic health records. And so the potential for expansion outside of UPMC, we think, is tremendous.

Steven Cherry There are no particular implications for patient privacy in all of this?

Lee Harrison Well, I mean, anytime you access medical data, you have to be very conscientious about how you treat those data. We are part of the infection prevention team and so we have access to electronic health records as part of our contributions to infection prevention at our hospital. But whenever you’re dealing with electronic health record data, you have to be very aware of the regulations involving patient data.

Steven Cherry And are there any additional burdens placed on doctors and nurses? I mean, as you say, this uses information that’s basically always in the health record to begin with. I mean, my understanding is that these electronic health records are a little bit intrusive, or at least doctors and nurses sometimes resent them as intruding in their relationships with their patients, and it takes an amount of time that they could be devoting to their patients in other ways.

Lee Harrison I think that’s somewhat of a different issue. Yes. So, there has been some pushback about the amount of time needed to enter data in the electronic health record. I sort of view it differently. I remember an era when I was writing handwritten notes. My handwriting is illegible. Oftentimes, I was rounding [i.e., on rounds] with a team of infectious diseases fellows and residents and medical students, and they’d have to stand around and watch me write this illegible note. And it was a great waste of time. So now it’s much more efficient, in my view, where we can round on everybody and then—either back in my office or at home in the evening—I can put in my consult notes on whatever infectious disease patient I’m seeing.

There are some downsides, but in my view, it’s a great improvement in how we care for patients, particularly because, you know, we used to roam around the hospital trying to find a patient’s chart. Now you can access a patient’s chart really anywhere. And so again, there are disadvantages, but I think it’s been a great improvement in how we care for patients.

Steven Cherry And let’s just close the loop on how the system is actually used … when you detect an outbreak and you do this sort of machine-learning contact tracing. What’s the next step to limit the outbreak?

Lee Harrison OK, so first of all, we have just completed a several-year validation—development and validation—of EDS-HAT. It had, up until now, in our hospital, run in the background, with a very long lag period. So we would wait six to 12 months to actually sequence the isolates. And the idea was that we would be able to compare it to running in the background with what we traditionally would do in terms of infection prevention.

Now that we’ve proven that EDS-HAT could potentially really make dramatic improvements to patient safety, we are just now moving it to [be] a real-time infection prevention tool. But the way it works is you identify an outbreak, you identify—through the electronic health record, you identify the transmission route, and then we work hand in hand with our infection prevention team. So, anything that requires intervention, we immediately notify the infection prevention team so they can initiate the appropriate intervention. And that varies depending on what the transmission route is that we’ve identified.

Steven Cherry And if the transmission route is said to be people who use the same device, how would that work?

Lee Harrison Yes, that’s a great example. So, one of the outbreaks we identified in this retrospective two-year analysis is a very serious set of infections caused by a contaminated gastroscope. When the machine learning tells us that a gastroscope is the likely transmission route, we manually go into the electronic health record and say, how plausible is this? In this case, the patients had gastrostomy from the same gastroscope. And we also had other genetic evidence that in fact, it was the cause of the outbreak. So, the bottom line is you basically go and you remove that gastroscope from circulation and figure out why it’s causing the outbreak. In the case of a device like that, the intervention is basically 100 percent effective. You just stop using that device.

Steven Cherry Are there particular potential benefits to a system like this, if we should happen to be suffering a once in a century pandemic?

Lee Harrison Well, I mean, so separate from the pandemic. I mean, what we’ve shown is that this has potential to really improve patient safety by identifying. … The most striking thing about our analysis is that we identified outbreaks that were very serious and that were unidentified by traditional approaches. And to me, that’s the most striking fact.

We had an outbreak. It was occurring at interventional radiology that had been going on for at least a year and we have no idea when it started. It definitely wasn’t identified by traditional methods. And also, you know, it allowed us to intervene immediately and stop the outbreak. So that’s really the value of EDS-HAT. And the other thing that we found is that it has the potential to save hospitals money.

I was surprised by the result, but when you think about it, whole-genome sequencing is relatively cheap. And these infections that we can prevent are very, very expensive. So you don’t have to prevent very many infections to pay for all the costs of running the system. We estimated we could have saved UPMC anywhere from just under $200,000 to just under $700,000 over the two-year period.
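The break-even reasoning here can be written out as simple arithmetic. The interview gives only the estimated net savings, not the underlying unit costs, so every number in this sketch is a placeholder chosen for illustration and not a figure from UPMC or the paper. The function and parameter names are likewise hypothetical.

```python
import math


def infections_to_break_even(isolates_sequenced, cost_per_isolate,
                             fixed_costs, cost_per_infection):
    """Smallest number of prevented infections that covers the program cost."""
    total_cost = isolates_sequenced * cost_per_isolate + fixed_costs
    return math.ceil(total_cost / cost_per_infection)


# Placeholder inputs only: 3,000 isolates at $100 each, $50,000 in fixed
# costs, and $30,000 in treatment cost per hospital-acquired infection.
needed = infections_to_break_even(3000, 100, 50_000, 30_000)
print(needed)  # 12
```

Under these assumed inputs the program costs $350,000, so preventing a dozen infections would cover it. The point survives any particular numbers: when each prevented infection is worth far more than one sequencing run, the break-even count stays small.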

But I didn’t understand your question about the relationship to the pandemic.

Steven Cherry Well, I’m just wondering if there’s intra-hospital transmission of COVID.

Lee Harrison That’s a great question. So before … traditionally EDS-HAT has been for bacterial infections. And when I’m rounding on the inpatient consult service for infectious diseases at our hospital, I’ve noticed that, for example, you know, you have … A patient’s been in the hospital for three months and all of a sudden they develop a respiratory syndrome. You test them and they’re positive for influenza. Now you know for a fact they acquired that influenza in the hospital because they’ve been in the hospital so long. So, before the pandemic started, we had just started to look at expanding EDS-HAT into respiratory viruses. And then when the pandemic hit, we rapidly pivoted to sequencing not only influenza, but also SARS-CoV-2. And the answer to your question is yes, there is value. There is evidence for transmission of SARS-CoV-2, just like other respiratory viruses, in the hospital. We’ve shown that whole-genome sequencing can tell you how transmission of COVID and other respiratory illnesses is occurring in hospitals.

Steven Cherry And so, you know, if hypothetically, every hospital in an area were using this system, would it be possible to replace some of the human contact tracing and do it more efficiently?

Lee Harrison Yeah, that’s a great question. So what we’re finding with EDS-HAT is that the data mining of the electronic health record can help us to identify transmission routes that are very difficult to identify by traditional EHR review. But we’re also finding that oftentimes the data mining results basically give us a listing or a ranking of likely transmission routes, and that does require some manual review to determine the plausibility of the various transmission routes that are being given to us. So the data mining is, I think, a huge innovation and a huge advance. But I don’t think it’s ever going to [not] require that need for some human intervention to figure out the best—the most likely—transmission route. And then decide what to do about it.

Steven Cherry If I could ask you one other question, more or less unrelated to this new system, in the midst of a once-in-a-century pandemic, it’s easy to forget other diseases and other vaccines. Recently the UK had a real-world test of a vaccine for meningitis B, the first large-scale test for children. You co-wrote an editorial for an article about this vaccine in the New England Journal of Medicine. Apparently, this is an incredibly dangerous disease. One of the paper’s authors is quoted as saying it’s “one of the fastest, most vicious infections you can have. The child can be sneezing in the morning and be dead in the evening, even if they get to the hospital.” Are you worried about the way vaccines are becoming politicized and refused, even as we seem to be entering a golden age of developing new vaccines for ancient, but still terrifying, diseases?

Lee Harrison Yeah, that’s a great question. I view the politicization of vaccines as both irrational and very dangerous. Because what we’re seeing now with the COVID pandemic is that your political beliefs are somewhat correlated with whether you’re getting vaccinated or not. And what we’re seeing is that this is leading to a lot of unnecessary illness and deaths from COVID. And I’m hoping that somehow we can move past this at some point because from a public health standpoint, it’s very, very concerning and it’s also not only the risk of the individual not getting vaccinated, but you’re seeing what’s happening in the US with the Delta variant. We’re having still a devastating pandemic with almost 1200 deaths a day from COVID, and most of those are preventable by a very, very safe and effective vaccine.

Steven Cherry Well, Lee, I think anytime a new technology, or three different technologies woven together, can change the way we’ve been doing things for a century, it’s a pretty exciting development. Thank you for your role in making hospitals safer and for joining us today.

Lee Harrison Thanks for inviting me, Steve. It’s really been a pleasure.

Fixing the Future is brought to you by IEEE Spectrum magazine and sponsored by COMSOL, makers of mathematical modeling software and a longtime supporter of IEEE Spectrum as a way to connect and communicate with engineers.

IEEE Spectrum is the member magazine of the Institute of Electrical and Electronics Engineers, a professional organization dedicated to advancing technology for the benefit of humanity.

This interview was recorded 9 December 2021, on Adobe Audition via Zoom, and edited in Audacity. Our theme music is by Chad Crouch.

You can subscribe to Fixing the Future wherever you get your podcasts, or listen on the Spectrum website, where you’ll also find transcripts of all our episodes. We welcome your feedback on the web or in social media, and your ratings on your favorite podcast app.

For Fixing the Future, I’m Steven Cherry.