The rise of Reinforcement Learning
Reinforcement Learning (RL) refers to a class of machine learning problems whose goal is to learn from successive experiments.
At each time step, in an observed environment, the algorithm does some actions that will modify its state. This will bring it a local reward. The value function corresponds to the accumulation of the rewards. It is this accumulation that the algorithm must maximize.
Reinforcement learning (RL) is an area of machine learning that has received a lot of attention since 2015, following DeepMind’s (Google) publications of an AI that plays Atari, an AI that learns to walk, an AI that wins at GO (AlphaGo) and chess (AlphaZero).
Producing flexible behaviours in simulated environments
The agility and flexibility of a monkey swinging through the trees or a football player dodging opponents and scoring a…
Playing Atari with Deep Reinforcement Learning
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory…
AlphaGo: The story so far
AlphaGo is the first computer program to defeat a professional human Go player, the first to defeat a Go world…
Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess
It is non-trivial to design engaging and balanced sets of game rules. Modern chess has evolved over centuries, but…
The creation of Gym by OpenAI (a nonprofit organization started by Elon Musk founded in 2015), a library designed in Python, of environments designed to test and develop reinforcement learning algorithms, has contributed to the growth of research in this branch of Deep Learning. This tool has increased the reproducibility of algorithms and the learning of the future researchers in RL.
Barriers to be overcome in applying RL to Health
Despite the popularity of reinforcement learning over the past five years, the use of reinforcement learning remains limited and faces the following obstacles.
- Unlike many video games, researchers are not able to observe everything that happens in the body (environment).
- Health data are non-stationary. Patients’ symptoms are often recorded at irregular intervals and some patients’ vital signs are recorded more often than others (condition).
- RL algorithms are data intensive. Researchers can’t simulate a patient’s treatment like they simulate a chess game for Alphazero (not ethical). Health data remains scarce and difficult to obtain.
- The reward function is difficult to determine. Periodic improvements in blood pressure may not impact the final patient’s condition in the case of sepsis. It is necessary to consider causality in interpreting the effects of a treatment.
Applications of RL in Healthcare
RL algorithms are used to optimize the treatment of “chronic” diseases: optimizing the dosage of chemotherapy drugs (cancer), optimizing antiretroviral therapy (HIV), adapting antiepileptic drugs for seizure control (epilepsy). They are also used to improve intensive care: optimization of treatment strategies for sepsis.
The research paper “Reinforcement Learning in Healthcare: A Survey” lists the use cases for RL in cancer treatment and critical care.
The rise of Robotic Surgery
Another field of application for RL algorithms is robotic surgery. Surgical robots, as Intuitive Surgical’s Da Vinci⃝R surgical system, have enabled more efficient surgeries by improving dexterity and reducing surgeon fatigue.
In 2019, researchers at the Institute of Electrical and Electronics Engineers (IEEE) released the first reinforcement learning environments for surgical robots, called dVRL, modeled as the OpenAI’s Gym environments.
These environments facilitate the prototyping and implementation of Reinforcement Learning algorithms in the field of surgical robotics.
Learn more about the topics covered in this article
Series about reinforcement learning (RL) by Deeplizard
David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | Lex Fridman Podcast #86
Learn more about robotic surgery (YouTube playlist)