The Problem with Terminators

In the movie The Terminator, Reese, the bodyguard from the future, talks about how the machines took over:

“Defense network computers. New… powerful… hooked into everything, trusted to run it all. They say it got smart, a new order of intelligence. Then it saw all people as a threat, not just the ones on the other side. Decided our fate in a microsecond: extermination.”

He says this as if the computers decided themselves that humans weren’t needed. That’s not how AI – artificial intelligence – works. Machine learning algorithms don’t determine their own goals or targets; those are set by people. People choose the goal and the machine learning chooses the best way to get there. Somewhere along the line, in the Terminator world, a human chose a goal for the algorithm that did not require the presence of humans in order to be successful, and that was the source of the problem.

In reading articles and comments online, I see a lot of misunderstanding about how AI works. Some seem to think that an AI algorithm is purely math and therefore free of all human bias and limitations. Others anthropomorphize the algorithms, with the image of AI as a person without proper emotions – like Data or The Vision. These are both under- and over-estimates of the human element within AI.

Let me describe a very straightforward machine learning example. You can train an algorithm to identify all the cars in a set of photos. You give it many labeled images with both cars and things that aren’t cars, and eventually it will learn to correctly identify the cars in the images. When you apply that algorithm to new pictures, it may even find cars that a person might miss – in the background, or half hidden behind a tree, for example.

That seems easy enough, right? In order to train an algorithm, what you need to provide is a data set and a definition of success. In my example, the data set would be the labeled collection of pictures and the definition of success would be “car.”

Let’s focus on the “definition of success” part. I’m going to divide this into two parts, based on the types of problems that we might want to solve: truth and target conditions.

Truth is the goal when you are defining something. You want to find all the pictures of cars, and the algorithm will be judged by how accurately and completely it can do that. It will fail whenever it doesn’t recognize a car or when it mistakenly calls something else a car. In order to train this algorithm, you need to know what a car looks like. Seems easy enough, but in real-world pictures, there may be situations that test your definition. Is a Ford Taurus cut in half a car? What about just a front fender? Here are images from my own photos – are these cars or not?

The algorithm can only find the truth as you have defined it. The better your initial definition of truth, the more your algorithm will work in the way that you want it to. 
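
To make "accurately and completely" concrete, here is a minimal Python sketch. The photo names, both sets of labels, and the function precision_recall are invented for illustration and not taken from any real system. It scores the same predictions against two definitions of truth: one that does not count a car cut in half as a car, and one that does.

```python
def precision_recall(truth, predicted):
    """Return (precision, recall) for two sets of photo names.

    precision: of the photos the algorithm called "car", what fraction really were?
    recall:    of the photos that really were cars, what fraction did it find?
    """
    true_positives = len(truth & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical photos the algorithm labeled "car".
predicted_cars = {"sedan.jpg", "pickup.jpg", "half_taurus.jpg", "golf_cart.jpg"}

# Two human definitions of truth for the same photos (both made up).
strict_truth = {"sedan.jpg", "pickup.jpg", "hidden_behind_tree.jpg"}
loose_truth = strict_truth | {"half_taurus.jpg"}

strict_scores = precision_recall(strict_truth, predicted_cars)
loose_scores = precision_recall(loose_truth, predicted_cars)
print("strict definition:", strict_scores)
print("loose definition: ", loose_scores)
```

The algorithm's output is identical in both runs. Only the human definition of truth changed, and the score moved with it.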

Training for a target condition is slightly different from training the algorithm to determine the truth. Here the aim is to reach an optimal end state of a system. Just as you needed to know the truth to teach the truth, you need to know what end state you are going for. If you are creating an algorithm to stack blocks, for example, you have to know what your optimized end state is: maximum height, width, or stability, numbered blocks in order, or something else. You can’t just ask it to stack blocks and hope it does it right.
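
The block-stacking point can be sketched in a few lines of Python. This is a toy under stated assumptions: the blocks, their widths, and the two scoring functions (stability and numeric_order) are all hypothetical, and a brute-force search over every ordering stands in for whatever a real system would do. The same search returns different "best" stacks depending on which goal a person hands it.

```python
from itertools import permutations

# Hypothetical blocks as (number, width) pairs; the values are made up.
blocks = [(1, 2), (2, 5), (3, 3)]


def stability(stack):
    """Count adjacent pairs where the upper block is no wider than the one below."""
    return sum(upper[1] <= lower[1] for lower, upper in zip(stack, stack[1:]))


def numeric_order(stack):
    """Count adjacent pairs where the numbers increase from bottom to top."""
    return sum(lower[0] < upper[0] for lower, upper in zip(stack, stack[1:]))


def best_stack(objective):
    """Try every ordering (bottom to top) and keep the one that scores highest."""
    return max(permutations(blocks), key=objective)


most_stable = [number for number, _ in best_stack(stability)]
in_order = [number for number, _ in best_stack(numeric_order)]
print("optimized for stability:", most_stable)
print("optimized for order:    ", in_order)
```

Nothing in best_stack knows which answer is "right." It simply maximizes whatever objective it is given.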

You pick the goal and the algorithm finds the optimal way to get to that goal. And then it can perform that task, over and over, endlessly and reliably. How might these different goals work in healthcare?

What about the goal of truth or, in the case of medicine, diagnosis? Despite what patients may want or expect, we actually have a pretty loose relationship to absolute truth in medicine. A patient may want to know “Doc, do I have X disease?” But instead, we might tell them:

  • I don’t know if you have X, but you don’t have W, Y, or Z, all of which can look like X and are more serious.
  • We’re going to act as if you have X, and if you get better, maybe that’s what you had.
  • Even if you do have X, the only problem with that is the symptoms, so we’re just going to treat those.

Often in medicine, we don’t need to know the absolute truth; we just need to know enough to take action and improve the patient’s condition.

What happens when we want to train an algorithm to make a diagnosis? If we wish to train an algorithm to find cases of pneumonia, for instance, we first need to know what pneumonia looks like and we need to find examples of it. But do we really know? Most doctors can confidently describe a classic case of pneumonia, but many cases of pneumonia do not have the classic signs or symptoms. As is said in medicine, “not all patients read the book.” Furthermore, patients are often treated presumptively with antibiotics and we never know if they truly had pneumonia or not.

If shown 100 patients with cough and fever, even a panel of experts might have a hard time reaching consensus on which patients have a definite case of pneumonia. If we train the algorithm with only classic cases, it may miss the less typical or milder cases, and if we train it with a wider group of “pneumonia” patients without a clear diagnosis, then it may overdiagnose. The algorithm can only be as good as the definition of truth it’s given in the first place.
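
The labeling problem can be sketched in a few lines of Python. Everything here is hypothetical: the patient IDs, the panel votes, and the two labeling rules are invented to illustrate the point, not drawn from real data or clinical criteria. The sketch shows how the same four patients produce two different training sets depending on how "pneumonia" is defined.

```python
# Hypothetical votes from a three-expert panel: True means "this is pneumonia".
panel_votes = {
    "patient_A": [True, True, True],     # classic case, everyone agrees
    "patient_B": [True, True, False],    # atypical presentation
    "patient_C": [True, False, False],   # mild, could be something else
    "patient_D": [False, False, False],  # everyone agrees it is not pneumonia
}


def label_cases(votes, rule):
    """Return the sorted patient IDs labeled "pneumonia" under a rule.

    rule is `all` (unanimous panel) or `any` (at least one expert says yes).
    """
    return sorted(patient for patient, ballots in votes.items() if rule(ballots))


classic_only = label_cases(panel_votes, all)
wider_group = label_cases(panel_votes, any)
print("classic cases only:", classic_only)
print("wider group:       ", wider_group)
```

An algorithm trained on the first list never sees patients B or C as pneumonia; one trained on the second treats all three as equally certain. Neither list is the truth, just a different human decision about where to draw the line.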

The issue is similar with target conditions. An example of this is triage of emergency imaging studies.

Let’s imagine a hypothetical but realistic situation in which we train an AI to pre-scan the images and find and prioritize several possible diagnoses, like pneumothorax, renal stones, and intracranial hemorrhage. We have chosen those medical conditions because we have algorithms to find them and they may require urgent treatment. The target condition for the AI is: “prioritized cases with possible pneumothorax, renal stones or intracranial hemorrhage.” What we need to consider is whether this target condition is really the desired one. What happens to all the other cases? What about urgent conditions for which we don’t yet have algorithms? For every case we move forward, we move another one back. You may move a renal stone case up in priority and as a result delay the diagnosis of a patient with bowel ischemia, which may be a much more urgent problem.
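
A toy version of this worklist, in Python, shows the trade-off. The case descriptions and the simple text matching are assumptions made for illustration; a real triage system would work from image findings, not strings. Flagged cases are moved to the front, and the sketch reports where the unflagged bowel ischemia case ends up.

```python
# Hypothetical worklist in arrival order; the cases are made up.
worklist = [
    "chest pain, no flag",
    "bowel ischemia, no algorithm available",
    "possible renal stone",
    "possible pneumothorax",
]

# Findings the imagined AI can flag (from the target condition above).
FLAGGED_FINDINGS = ("pneumothorax", "renal stone", "intracranial hemorrhage")


def is_flagged(case):
    """True if the case description mentions a finding the AI can detect."""
    return any(finding in case for finding in FLAGGED_FINDINGS)


# sorted() is stable, so cases keep arrival order within each group.
# The key is False (sorts first) for flagged cases and True for the rest.
prioritized = sorted(worklist, key=lambda case: not is_flagged(case))

ischemia = "bowel ischemia, no algorithm available"
before = worklist.index(ischemia) + 1
after = prioritized.index(ischemia) + 1
print(f"bowel ischemia case: position {before} before triage, {after} after")
```

The algorithm did exactly what it was asked to do. The ischemia case dropped from second to last only because the target condition never mentioned it.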

We already make these kinds of decisions, of course, when we create rules and guidelines for our existing processes. The difference is that algorithms can be more easily scaled up than human-based processes and may be more readily trusted, given the opacity of the underlying reasoning and our tendency to trust computers.

Cathy O’Neil, the author of Weapons of Math Destruction, calls AI algorithms “opinions embedded in math.” In some ways, this is the reverse of what we saw in the Terminator movie. The terminators were robots covered in human flesh, for camouflage (also to save money on the special effects budget). AI has a human core, wrapped up in math and processors. It’s human fallibility disguised as machine perfection.

AI starts with human ideas and then forms processes around them. The processes will be free of the bias, fatigue, and inattention that plague many human-run endeavors, but any flaws, biases, or incompleteness that exist in the core goals will be repeated and multiplied. That doesn’t mean that we can’t or shouldn’t begin to explore how AI can help us provide better healthcare. In fact, I think it’s incumbent upon us to do so. But before we go automating something, before we set a computer to a task that it will perform exactly and endlessly, we need to make sure we understand ourselves first, and what our goals are. We can’t be blinded by the perception that math is perfect and computers are infallible. And we need to put in place the feedback and iterative improvement processes to correct any initial errors.

I’m only scratching the surface of the topic of AI, but if you want to know more, I highly recommend reading anything by Cathy O’Neil, mentioned earlier. You can start by watching this video on her website, and explore from there.

I also learn a lot from Luke Oakden-Rayner, one of the most thoughtful online voices on AI in radiology. I mentioned him in a previous post. I would recommend starting at this page in his blog.
