From the lab to the field: key success factors for AI adoption in healthcare

“No AI built to date can match the understanding and insight of a human doctor. We are not there.” Yoshua Bengio, artificial intelligence expert, 2018 Turing Award winner.

The rise of AI in healthcare

While artificial intelligence is not about to replace doctors, in some areas of healthcare professionals are beginning to see AI as reliable and useful in the practice of their profession:
- Surgical robotics
- Radiology
- Clinical decision support
- Automation of administrative tasks
- Detection of depressive states

The “failures” of AI applied to medicine

In 2018, medical experts and IBM Watson Health customers identified “multiple examples of dangerous and incorrect treatment recommendations” from IBM's Watson for Oncology supercomputer. The main reason for these errors was that the “Watson for Oncology” model had been trained on synthetic data rather than real data from real cases.

Despite this bad publicity, at the 2019 ASCO Annual Meeting, IBM Watson Health presented a randomized study evaluating the role of AI in clinical decision making for 1,000 patients in India diagnosed with breast, lung, and colorectal cancer. Indian oncologists at the Cancer Center in Bangalore changed their treatment decisions in 13.6 percent of cases based on information provided by Watson.

In 2020, a Google Health study, the first to examine the impact of a deep learning tool in real-world clinical settings, revealed that a model that performed well in the lab was much less effective in real-world care in Thailand.

In the laboratory, this model was able to identify signs of diabetic retinopathy from fundus images with 90% accuracy and in less than 10 minutes. Trained on high-quality images, the model could not make predictions from scans taken under poor lighting conditions: one fifth of the images were rejected.
In the field, Internet connection problems also slowed down the analysis, because the system deployed in production was based on Google's cloud.
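
The rejection of poorly lit scans can be pictured with a small sketch. The study summary above does not say how the deployed system decides to reject an image, so the brightness check, the `min_brightness` threshold and the toy pixel values below are all hypothetical stand-ins chosen only to illustrate the idea of a quality gate placed in front of a model.

```python
def mean_brightness(image):
    """Average pixel intensity of a grayscale image given as rows of 0-255 values."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def triage(images, min_brightness=60.0):
    """Split images into those sent to the model and those rejected as too dark.

    The threshold is an illustrative assumption, not a value from the study.
    """
    accepted, rejected = [], []
    for name, image in images.items():
        if mean_brightness(image) >= min_brightness:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected

# Toy 2x2 "scans"; real fundus images are far larger.
scans = {
    "scan_a": [[120, 130], [125, 140]],
    "scan_b": [[20, 25], [30, 15]],      # underexposed
    "scan_c": [[90, 100], [95, 110]],
    "scan_d": [[150, 160], [155, 170]],
    "scan_e": [[80, 85], [70, 75]],
}
accepted, rejected = triage(scans)
rejection_rate = len(rejected) / len(scans)
print(rejected, rejection_rate)
```

Tracking the rejection rate on images captured in the target clinics, before deployment, is one way to surface this kind of gap between lab and field conditions early.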

For physicians and patients to confidently adopt AI in the field, researchers must design and train models with real-world data and, from the design stage, consider every technological and human factor that could affect the quality of the model in the field.

Regulation for more trust

Regulation is also a factor that increases AI adoption and user confidence. In the US, this role is taken on by the FDA (U.S. Food and Drug Administration). The FDA reviews AI algorithms through three regulatory pathways.

  • 510(k) clearance

A 510(k) clearance is granted to an algorithm when it has been demonstrated that it is at least as safe and effective as another algorithm already legally marketed.

  • Premarket approval (PMA)

Premarket approval is granted to algorithms for Class III medical devices, which may have a significant impact on human health. They undergo more extensive scientific and regulatory review to determine their safety and effectiveness.

  • De novo pathway

The de novo classification is used for new types of medical devices for which no legally marketed equivalent exists. The FDA conducts a risk assessment of the device in question before authorizing it to be marketed.

Of the AI-based algorithms the FDA had authorized by the end of 2020, 85.9% received 510(k) clearance, 12.5% went through the de novo pathway and 1.6% received premarket approval. These technologies were developed mainly in the fields of radiology (46.9%), cardiology (25.0%) and internal/general medicine (15.6%).

Integrating AI into the care routine

Model performance and regulation are not enough to deploy and integrate AI into a care routine.

An article published in October 2020, “Real-World Integration of a Sepsis Deep Learning Technology Into Routine Clinical Care: Implementation Study,” shows that the successful deployment of Sepsis Watch, developed at Duke University, was not only a technical endeavor but also a social and emotional one: a true change management project.

Creation of the model and application

Sepsis is a serious infection that spreads through the body via the bloodstream from an initial infection site. Most often of bacterial origin, it can also be caused by viruses, fungi or parasites. Regardless of the germ involved, sepsis is a medical emergency.

Duke University researchers have developed a model that predicts the likelihood of a patient becoming septic based on laboratory data, vital signs data, medication data and clinical data. This model links a multitask Gaussian process to a recurrent neural network classifier.
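
To make the pairing of a Gaussian process with a recurrent classifier more concrete, here is a minimal sketch of one way such a pipeline can be wired. It is not the Duke implementation. It assumes the Gaussian process is used to turn irregularly timed measurements into an evenly spaced series that a recurrent network can read. For brevity it fits one independent Gaussian process per variable, whereas a multitask Gaussian process would also model correlations between variables. The kernel, the measurements and the untrained RNN weights are all made up for illustration.

```python
import math

def rbf(a, b, length_scale=2.0):
    """Squared-exponential kernel between two time points (in hours)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gp_mean(times, values, grid, noise=0.1):
    """Posterior mean of a GP (constant prior mean = data mean) on the grid."""
    mean = sum(values) / len(values)
    K = [[rbf(s, t) + (noise if i == j else 0.0) for j, t in enumerate(times)]
         for i, s in enumerate(times)]
    alpha = solve(K, [v - mean for v in values])
    return [mean + sum(rbf(g, t) * a for t, a in zip(times, alpha)) for g in grid]

def rnn_risk(sequence, w_in, w_rec, w_out, bias):
    """One-unit tanh RNN over the sequence, followed by a sigmoid risk score."""
    h = 0.0
    for x in sequence:
        h = math.tanh(sum(w * xi for w, xi in zip(w_in, x)) + w_rec * h)
    return 1.0 / (1.0 + math.exp(-(w_out * h + bias)))

# Hypothetical irregular measurements: (hours since admission, value).
heart_rate = ([0.0, 1.5, 4.0, 5.5], [88.0, 95.0, 110.0, 118.0])
lactate = ([0.5, 5.0], [1.2, 3.1])
grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hourly grid fed to the RNN

hr_grid = gp_mean(*heart_rate, grid)
lac_grid = gp_mean(*lactate, grid)
sequence = list(zip(hr_grid, lac_grid))

# Illustrative, untrained weights; a real model learns them from patient data.
risk = rnn_risk(sequence, w_in=[0.01, 0.3], w_rec=0.5, w_out=2.0, bias=-1.0)
print(round(risk, 3))
```

The design point the sketch tries to convey is the division of labor: the Gaussian process absorbs the messiness of clinical sampling (labs drawn at odd hours, vitals at different rates), so the classifier always sees a regular sequence.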


Once the model’s performance was validated, the researchers deployed a “Sepsis Watch” app to visualize the model’s predictions and respond quickly to sepsis risks.

Figure: Sepsis Watch data visualization

Success factors for deploying Sepsis Watch

Validating the model and deploying it to production via an app is not enough to integrate Sepsis Watch into the care process. Speaking at the Machine Learning for Healthcare 2019 conference, Will Ratliff, project manager for this integration, explained how he addressed the fears surrounding the project: failure to see, failure to move, failure to finish.

Deployment in project mode:

Overcoming the failure to see, with a process map and a comprehensive solution guide.

Figures: process diagram, complete guide

Overcoming the failure to move, with training sessions, a training website and a governance committee.

Figures: training document, training website, governance committee

Overcoming the failure to finish, by tracking adoption rates with data visualizations and improving the Sepsis Watch interface based on feedback from nurses.

Figures: adoption tracking, interface improvement

AI capable of delegating to an expert

Complementarity between technology and the caregiver is one of the keys to successful AI adoption.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Hussein Mozannar and David Sontag, have developed a model that can make a prediction about a task or decide to delegate the decision to an expert.
The system has two parts: a “classifier” that predicts the presence or absence of lung pathology, and a “rejector” that decides whether a given task should be handled by the classifier or by the human expert.
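
The routing between the two parts can be sketched in a few lines. This is only an illustration of the control flow, not the MIT researchers' method: in their work the components are learned from data, whereas the `classifier`, `rejector` and `expert` functions below are hypothetical hand-written stand-ins, and the rule "defer when the score is close to 0.5" is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Decision:
    label: int   # 1 = pathology present, 0 = absent
    source: str  # "model" or "expert"

def predict_or_defer(x: float,
                     classifier: Callable[[float], int],
                     rejector: Callable[[float], bool],
                     expert: Callable[[float], int]) -> Decision:
    """Route one case: the rejector decides who gives the answer."""
    if rejector(x):
        return Decision(expert(x), "expert")
    return Decision(classifier(x), "model")

# Hypothetical stand-ins; x is a score in [0, 1] summarizing one case.
def classifier(score: float) -> int:
    return int(score >= 0.5)

def rejector(score: float) -> bool:
    return abs(score - 0.5) < 0.2   # defer when the score is ambiguous

def expert(score: float) -> int:
    return 1                        # placeholder for the clinician's reading

decisions = [predict_or_defer(s, classifier, rejector, expert)
             for s in (0.05, 0.45, 0.95)]
for d in decisions:
    print(d)
```

Recording the `source` of each decision alongside the label makes it possible to audit afterwards how often, and on which cases, the system handed over to the caregiver.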

This article was written from these resources:

MLHC 2019 Presentation on Sepsis Watch, Pythia (starting at 29 min 13 s)

Diplodocus interested in the applications of artificial intelligence to healthcare.