Let’s not outsource humanity

If you’ve attended any talks about AI in radiology in the last couple of years (including mine), then you’ve probably heard Geoffrey Hinton’s 2016 prediction that “in 5 years deep learning is going to do better than radiologists,” and his conclusion that we should stop training new radiologists. It turns out that wasn’t just wrong; it was shockingly wrong, as we find ourselves in 2024 with a huge staffing shortage and radiologists more in demand than ever.

What struck me as odd about AI’s entry into radiology wasn’t the dire predictions—which make for good headlines, after all—but the choice to target imaging diagnosis in the first place. Making diagnoses is one of the most challenging things we do as doctors; it’s something we take pride in and value the most, even when so much of the practice of medicine becomes routine. Not only that, it’s the most high-risk, high-stakes area, where small mistakes can lead to big problems for the patient, the practitioner, and anyone else involved.

There is so much in the practice of medicine that feels dehumanizing; why start with the idea of replacing humans in the work they actually enjoy, when you could start by helping them instead? Radiology includes many repetitive, boring tasks—from digging through records to find clinical information, to locating prior exams and comparing lesion sizes, to navigating complex diagnostic scoring systems. Radiologists would welcome computers doing these tasks for them, probably faster and with better accuracy. Hinton could just as easily have said that in five years, radiologists would be “reading 10 times faster, with 10 times the quality, and 10 times the job satisfaction.” In hindsight, that wouldn’t have been true, either, but at least it would have been welcomed. Instead, threatened radiologists now have a jaded attitude toward AI; they will gleefully point out glaring AI errors, with remarks like, “And this is supposed to take my job!” The narrative that computers would replace doctors has turned potential allies into harsh skeptics.

What surprises me is that we are making the same mistake with generative AI. It is being presented as capable of creating art and music, or even replacing human connections like friendship. Instead of showing how AI can help free people from mundane tasks, we keep talking about replacing inherently human activities.

In healthcare outside of radiology, the story is similar, with AI algorithms writing doctors’ notes and responding to patient questions, all with the promise of preventing physician burnout. You may have even heard that AI is “more empathetic” than doctors when answering patient questions.* But the very premise is flawed since, by definition, a computer can’t be empathetic. It may be judged as such, but only until it’s revealed to be a chatbot—at which point, the perceived empathy vanishes. Empathy is more than just nice words; it’s a connection between two people—the emotional difference between a hug and a weighted blanket.

I’m not saying there isn’t a place for these tools. Much of what a doctor writes is actually just “scribing”—documenting objectively what was said or done—and maybe that should be automated. Scribing is one of many mundane, repetitive tasks that currently burden healthcare. But hidden within these tasks, there is also a very human art: the act of putting all the pieces together in an assessment and creating a plan. Ideally, the assessment and plan are grounded in evidence-based medicine, but they must also take into account the fuzziness of reality, when information is incomplete or a patient’s symptoms deviate from the classic textbook examples. They also have to consider the unique needs and circumstances of the individual patient. Not every encounter requires a high level of empathy, knowledge, wisdom, and experience, but I feel safe in saying it’s needed more often than it’s provided. It can be hard to draw the line between the routine and the complex, and automating the note risks crossing that line, leaning too heavily on the computer to do all the work. Just as drivers using autopilot might fall asleep at the wheel, automation without guardrails risks taking the art out of medicine.

The case is similar with chatbots answering patient questions. Sure, sometimes the patient just needs a medication refill or needs to know when the lab opens—that’s a great opportunity to offer them a well-designed chatbot. But other times, among the banal, straightforward patient messages is something more important: a chance to connect. Sometimes a patient needs more than just information, and wants to feel that their concerns were heard and considered. They want to know that someone cares about them. How are we going to make sure that we don’t automate away those moments? We need to strike a balance.

What might such a balance look like? Here’s one way we could start with clinical note-writing AI:

A clinical note typically consists of four parts, in what is called a SOAP note: Subjective (what the patient says), Objective (test results, physical exam), Assessment (what the doctor thinks is going on), and Plan (what actions are going to be taken). We could limit the AI scribe to the parts that are really just listening and summarizing, the Subjective and Objective sections, allowing minimal or no scribing in the Assessment and Plan. With the time saved, physicians could spend just a few more precious moments on those final sections, actually capturing their thought process, which is so often missing in modern documentation: What’s the most likely diagnosis, and why? Why are we ordering these tests, and how will the results change the plan? Too often, clinical notes document what the doctor did, but not why—a crucial component in deciding what to do next.
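To make the idea concrete, here is a minimal sketch in Python of what a scribe constrained to the Subjective and Objective sections might look like. Everything here is hypothetical: the `SOAPNote` structure, the `draft_soap_note` function, and the `summarize` callable are illustrations, not any real scribing product’s API.

```python
from dataclasses import dataclass


@dataclass
class SOAPNote:
    subjective: str   # the patient's story, drafted by the scribe
    objective: str    # exam findings and results, drafted by the scribe
    assessment: str   # left for the physician: most likely diagnosis, and why
    plan: str         # left for the physician: what we'll do, and how results change it


def draft_soap_note(transcript: str, summarize) -> SOAPNote:
    """Draft only the S and O sections from a visit transcript.

    `summarize` stands in for whatever model the scribe uses. The Assessment
    and Plan are deliberately returned empty so the physician writes them.
    """
    return SOAPNote(
        subjective=summarize(transcript, section="subjective"),
        objective=summarize(transcript, section="objective"),
        assessment="",  # physician documents their reasoning here
        plan="",        # ...and the rationale for next steps here
    )


if __name__ == "__main__":
    # Toy usage with a stub summarizer, just to show the shape of the idea.
    def stub_summarize(text: str, section: str) -> str:
        return f"[{section} summary of a {len(text.split())}-word transcript]"

    note = draft_soap_note("Patient reports three days of cough and fever...", stub_summarize)
    print(note)
```

The design choice is the point: the boundary between what the machine drafts and what the human writes is explicit in the data structure, rather than something the physician has to enforce by editing after the fact.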

To help make sure notes maintain high quality, we could add a feedback loop using something akin to what was used in the NYU Langone paper, in which an AI algorithm assessed the quality of clinical notes according to a defined standard, and gave feedback to doctors. We could decide what a good note looks like, and then use AI to help nudge doctors in that direction.
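As a rough illustration of that feedback loop (not how the NYU Langone system actually works), a reviewer could score each note against a few locally agreed-upon criteria and return nudges when the reasoning is missing. The keyword checks below are only a stand-in for the AI grader; the criteria and cue phrases are invented for the example.

```python
# A toy feedback loop: check a note's Assessment and Plan against a defined
# standard and return nudges. A real system would use an AI model as the
# grader; simple keyword cues are used here only to show the loop's shape.

CRITERIA = {
    "state the most likely diagnosis": ["most likely", "favored diagnosis"],
    "explain the rationale for testing": ["because", "to rule out", "to evaluate for"],
    "say how results will change the plan": ["if positive", "if negative", "will adjust"],
}


def review_note(assessment_and_plan: str) -> list[str]:
    """Return feedback for each criterion the note does not appear to meet."""
    text = assessment_and_plan.lower()
    feedback = []
    for criterion, cues in CRITERIA.items():
        if not any(cue in text for cue in cues):
            feedback.append(f"Consider revising the note to {criterion}.")
    return feedback


if __name__ == "__main__":
    draft = "Start antibiotics and order a chest x-ray."
    for nudge in review_note(draft):
        print(nudge)
```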

In the rush to automate and improve efficiency, we risk taking the humanity out of healthcare. I think we can do both—reduce the task burden and allow healthcare workers to spend more time in the most human of activities. When deploying these tools, we have to ask ourselves where to draw the line between what we automate and what we save for ourselves, and how to recognize the moments when it matters.


*The JAMA paper actually claimed that the responses were rated as more empathetic by judges who didn’t know the source of the responses, but it is frequently misreported as showing that chatbots were more empathetic than doctors, which is not possible.


