By: Nirmayee Vilekar, RIG Inc Intern Researcher

Ever wonder how the ventilators in ICUs are managed? Physicians, nurses, and respiratory therapists must continuously monitor an admitted patient’s health to manage the ventilator. The airflow, pressure, and oxygen content need to be adjusted according to the patient’s status. If the patient is having difficulty breathing, the airflow must be increased so that a patient on life support does not fall short of oxygen. The COVID-19 pandemic made clear how vital a role ventilators play in the clinical treatment of critically ill patients.

So, what does monitoring a patient on a ventilator involve? It takes into account a multitude of factors such as lab values, vitals, comorbidities, and disease progression, all of which must be weighed before deciding on a patient’s ventilation regimen. A well-trained reinforcement learning model can integrate these factors and suggest the optimal level of ventilation for a patient. While the knowledge that a medical expert brings surpasses currently available trained software, such a system would help reduce the clinical team’s cognitive load. Reinforcement learning is all about learning through experience: the agent receives a specific reward when an action is accomplished, and it aims to find an optimal policy, based on inputs from the environment, that maximizes its reward.


What is Reinforcement Learning (RL) all about?

Reinforcement Learning is a branch of Machine Learning in which an agent learns from its environment: it receives a reward when it performs an action successfully and is penalized when it deviates from its target. The major inputs to a reinforcement learning model are the state space, which comprises the possible states or conditions; the action space, which comprises the possible actions or transitions; and a target or goal that must be achieved. The agent receives a reward or penalty when it moves from its previous state to a new state by taking an action. For example, a child learning to walk might get a chocolate for standing up and a bigger reward for taking steps.
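To make the idea of states, actions, and rewards concrete, here is a minimal Python sketch of the learning-to-walk example. The state names, action names, and reward values are all invented for illustration; the point is only to show an agent moving between states and collecting rewards.

```python
# Toy illustration only: states, actions, and reward values are assumptions.
STATES = ["sitting", "standing", "walking"]
ACTIONS = ["stay", "try_next"]


def step(state, action):
    """Return (next_state, reward) for a hypothetical 'learning to walk' task."""
    idx = STATES.index(state)
    if action == "try_next" and idx < len(STATES) - 1:
        next_state = STATES[idx + 1]
        # Standing up earns a small reward; taking steps earns a bigger one.
        reward = 1.0 if next_state == "standing" else 2.0
        return next_state, reward
    # Staying put (or already walking) changes nothing and earns nothing.
    return state, 0.0


def run_episode(policy, start="sitting", max_steps=5):
    """Follow a policy (a function from state to action) and sum the rewards."""
    state, total = start, 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward = step(state, action)
        total += reward
    return state, total
```

A policy that always tries the next stage, such as `lambda state: "try_next"`, collects more total reward than one that always stays put, which is the signal a reinforcement learning agent uses to improve.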

When a ventilator is to be controlled by reinforcement learning, the state space may include pulmonary pressures, cardiac indices, and image data, along with the patient’s existing condition. The action space may include oxygen inflow and ideal-body-weight-adjusted tidal volume, which is the amount of air inhaled and exhaled with each respiratory cycle. The patient’s clinical response to both the disease and the treatment yields a reward for clinical improvement or a penalty for deterioration. A high penalty therefore acts as feedback for the system to modify the ventilation, pressure support, or oxygen flow. To design this, a reinforcement learning algorithm such as Q-learning can be used. Q-learning maintains a Q-table that stores the estimated value of each action in each state. Hence, the value of an action, such as a given oxygen inflow at the patient’s existing condition, can be stored in the Q-table. These values are then used to choose the best course of action for the agent’s current state.
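As a rough sketch of how such a table of action values can be learned, the Python example below runs tabular Q-learning on a deliberately tiny toy problem. Everything about the setup is an assumption made for illustration: the three oxygenation bands, the three oxygen-inflow actions, the deterministic transitions, the reward values, and the learning parameters. It is not a clinical model and is not the method used in the referenced study.

```python
import random

# Hypothetical, highly simplified setup (not clinically meaningful).
STATES = ["low", "normal", "high"]          # assumed oxygenation bands
ACTIONS = ["decrease", "hold", "increase"]  # assumed oxygen-inflow changes


def step(state, action):
    """Toy dynamics: the action shifts the band by -1, 0, or +1 (clipped)."""
    delta = ACTIONS.index(action) - 1
    nxt = min(max(state + delta, 0), len(STATES) - 1)
    # Reward for landing in the 'normal' band, penalty otherwise.
    reward = 1.0 if STATES[nxt] == "normal" else -1.0
    return nxt, reward


def train(episodes=500, steps=10, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table with one row per state and one column per action."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STATES]
    for _ in range(episodes):
        state = rng.randrange(len(STATES))
        for _ in range(steps):
            # Epsilon-greedy: explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best next value.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def best_action(q, state_name):
    """Look up the highest-valued action for a state in the learned table."""
    row = q[STATES.index(state_name)]
    return ACTIONS[row.index(max(row))]
```

After training, calling `best_action(train(), "low")` looks up the row for the "low" band and returns the action with the highest learned value. A realistic system would need a far richer state and action space and careful clinical validation.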




Although reinforcement learning has its benefits, no machine learning or software technique would ever be able to replace humans. The goal is not AI but rather IA, or intelligent assist: making it easier for the clinical team to care for patients by reducing their cognitive load. Reinforcement learning can help unburden resources. We have seen the pressure on the medical workforce during the pandemic; employing technology to assist humans would help in the long run.





  1. Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care, By: Arne Peine, Ahmed Hallawa, Johannes Bickenbach, Guido Dartmann, Lejla Begic Fazlic, Anke Schmeink, Gerd Ascheid, Christoph Thiemermann, Andreas Schuppert, Ryan Kindle, Leo Celi, Gernot Marx & Lukas Martin