MIT Sloan Health Systems Initiative
Update on the 2019 HSI Research Projects
Four HSI-funded projects that began in the fall of 2019 report promising and intriguing work
Prescribing the best treatment or combination of treatments for someone affected by Substance Use Disorder is difficult, and the outcomes are highly unpredictable. Originally, Professor Georgia Perakis, together with Visiting Professor Dessi Pachamanova and PhD students from the Operations Research Center, Amine Bennouna and Omar Skali-Lami, were awarded funding to work together on this specific challenge. But the team took the research much further, to a more general level. The result of their work thus far addresses the question “what is the optimal treatment path for a particular patient based on past data from other patients?” More generally, what is the best course of treatment to reach the best possible outcome? Their interim research report shows evidence of a robust method to answer this question.
Their model’s algorithm uses probability, statistical learning and machine learning. Starting with a Markov Decision Process, the researchers devised a method to model the evolution of the patient’s state or health from their historical health data and demographics such as age and gender. The model can incorporate various patient data points, both numerical values and data from free text in the medical record.
Their method uncovers patients who respond similarly to the same treatment, and uses information about their transitions to predict outcomes. At each point in the process, the algorithm recommends a treatment that takes into account everything that came before and moves the patient toward the optimal outcome, such as recovery from a disease or the best management of a chronic condition.
A treatment does not have to be only one thing. For example, a treatment for substance use disorder may be talk therapy and drug therapy combined. If after a treatment, the patient has not had the best outcome, the algorithm suggests the best options at the next decision point.
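To make the idea of a Markov Decision Process concrete, here is a minimal sketch in Python. It is not the team's model: the three health states, the three treatments, the transition probabilities, the reward and the discount factor are all invented for illustration, and the solver is textbook value iteration rather than the researchers' own algorithm. It only shows how a treatment can be recommended at each decision point from estimated transitions between states.

```python
# Toy Markov Decision Process for treatment choice.
# ASSUMPTION: every state, treatment, probability, and reward below is invented.
STATES = ["unwell", "improving", "recovered"]
TREATMENTS = ["talk_therapy", "drug_therapy", "combined"]

# P[state][treatment] = {next_state: probability}; "recovered" is terminal.
P = {
    "unwell": {
        "talk_therapy": {"unwell": 0.7, "improving": 0.3},
        "drug_therapy": {"unwell": 0.6, "improving": 0.4},
        "combined": {"unwell": 0.4, "improving": 0.6},
    },
    "improving": {
        "talk_therapy": {"unwell": 0.1, "improving": 0.4, "recovered": 0.5},
        "drug_therapy": {"unwell": 0.3, "improving": 0.4, "recovered": 0.3},
        "combined": {"unwell": 0.2, "improving": 0.4, "recovered": 0.4},
    },
}
REWARD = {"unwell": 0.0, "improving": 0.0, "recovered": 1.0}  # earned on arrival
GAMMA = 0.9  # discount factor: earlier recovery is worth more


def q_value(state, treatment, values):
    """Expected discounted return of giving `treatment` in `state`."""
    return sum(
        prob * (REWARD[nxt] + GAMMA * values[nxt])
        for nxt, prob in P[state][treatment].items()
    )


def value_iteration(tol=1e-9):
    """Standard value iteration; returns state values and a greedy policy."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in P:  # only non-terminal states have treatments
            best = max(q_value(s, a, values) for a in TREATMENTS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(TREATMENTS, key=lambda a: q_value(s, a, values)) for s in P}
    return values, policy


values, policy = value_iteration()
print(policy)
```

With these made-up numbers, the recommended treatment differs by state, which mirrors the point above: if the patient has not reached the best outcome after one treatment, the recommendation at the next decision point is recomputed from the state the patient is now in.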
One characteristic of this model is that it can be counterfactual. That is, it can suggest treatments that aren’t even on the physician’s radar or that haven’t been used before with this particular patient. It combines the knowledge of multiple sources (e.g., clinical teams) and multiple partial patient treatment histories to reconstruct complete optimal treatment paths for patients. The model can uncover things clinicians just know, the know-how that comes from experience, not only from books or schooling.
This model works as a clinical decision support tool, not an omnipotent oracle; it provides the clinician with an interpretable suggestion for further treatment. Presenting suggestions in this way may encourage adoption, since the tool acts as a guide, not a demand. The clinician can still weigh the recommendation against his or her own experience. However, as the tool is used and turns out to be helpful, the clinician may be more likely to trust its suggestions.
There are two notable problem characteristics that make constructing this algorithm very difficult. First is dimensionality. There are many differences among individual patients and therefore ways to represent possible states based on patient characteristics. Added to these characteristics or features are the many possible transitions between states based on patient reactions to treatments, further complicating the challenge. In a clinical trial, project coordinators make sure to enroll people who are essentially the same on the most important aspects in order for the trial to result in statistically significant data. In the case when the algorithm is meant to work for any patient, the researchers do not have that luxury.
Second, the setting is healthcare, and clinicians cannot simply experiment on patients. This means that the algorithm needs to learn only from data that are already collected. Clinicians cannot prescribe subpar treatments or explore in order to get more data to train and finesse the algorithm.
One way the team addressed the dimensionality challenge was by grouping patients who have similar characteristics and react similarly to the same treatment. The algorithm regroups the patients as new information arises and finds the optimal treatment for each group. It does not need an infinite number, or even a large number, of patient groups to work. Depending on the complexity of the condition, the team can determine with confidence the minimum number of patient groups needed.
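The grouping idea can be illustrated with a deliberately simple Python sketch. It is not the team's algorithm: here patients are split into two groups by a fixed, hypothetical age cutoff, whereas the research learns the groups from the data. The records, the cutoff and the treatment names are all invented. The sketch only shows how a best treatment per group can be estimated from data that have already been collected, without any experimentation on patients.

```python
from collections import defaultdict

# ASSUMPTION: invented historical records of (age, treatment, improved).
RECORDS = [
    (24, "talk", 1), (29, "talk", 1), (27, "drug", 0), (31, "drug", 0),
    (33, "talk", 1), (26, "drug", 1),
    (58, "talk", 0), (61, "talk", 0), (64, "drug", 1), (55, "drug", 1),
    (67, "talk", 1), (59, "drug", 1),
]


def group_of(age, cutoff=45):
    """Hypothetical grouping rule; the real method learns its groups."""
    return "younger" if age < cutoff else "older"


def best_treatment_per_group(records):
    """Estimate improvement rates per (group, treatment) and pick the best."""
    counts = defaultdict(lambda: [0, 0])  # (group, treatment) -> [improved, total]
    for age, treatment, improved in records:
        tally = counts[(group_of(age), treatment)]
        tally[0] += improved
        tally[1] += 1
    rates = {key: good / total for key, (good, total) in counts.items()}
    best = {}
    for (group, treatment), rate in rates.items():
        if group not in best or rate > rates[(group, best[group])]:
            best[group] = treatment
    return rates, best


rates, best = best_treatment_per_group(RECORDS)
print(best)
```

Because each group pools the histories of many similar patients, the number of quantities to estimate depends on the number of groups rather than on the number of individual patients, which is why a small number of groups helps with dimensionality.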
In summary, the research team created a model that is interpretable, learns the latent structure of patient treatment transitions by analyzing large data sets to construct appropriate groups of patients, and predicts the optimal recommendation from historical data, without experimentation. And even more, the team has proven guarantees for the model’s performance that depend on the complexity of the condition and the amount of available data for estimation. In silico experiments suggest that the method can be trusted.
This research is still a work in progress. For the next step, the team is going to test it against HIV data. They will use medical data from the abundant HIV literature. They will also derive their definitions of healthy and unhealthy states from the medical literature. While many researchers are building models of risk or treatment choices, Perakis and her team have created a stand-out model. It is interpretable and robust. It is disease-agnostic in that it can be used for any type of health condition. It is data driven, easily updated and incorporates probability and the latest in machine learning methods. And it has practical application. It is eminently usable. A front end for clinicians can be made simple while hiding all the powerful algorithms in the back. Ultimately it is action research that both clinicians and patients can trust.
One outcome of MIT Sloan HSI’s convening on Substance Use Disorder (SUD) in May 2019 was the funding of several research projects to address challenges related to the diagnosis and successful, sustainable treatment of SUD, a progressive condition with costly and severely detrimental health outcomes. Smart allocation of scarce resources is key.
Professors Jónas Oddur Jónasson, Nikos Trichakis and their team are working with the Staten Island Performing Provider System’s (SIPPS) data to investigate how well machine learning models predict an individual’s risk for adverse opioid-related outcomes, e.g., death or overdose, even if they have not been prescribed opioid medications.
This Research Is Novel in Both the Data Set and the Method: SUD has become a crowded field of study, but the team’s effort stands out in two significant ways: 1) it uses machine learning, which is relatively unusual, and 2) it uses Electronic Health Record data together with pharmacy and medical insurance data. This combination of data has not previously been used for this type of study. Not only is the data set unique in this field, but the team is also creating its own method and model rather than relying on previously vetted designs. Their collaborator, SIPPS, is interested in opioid-related harm in its general patient population, whereas the benchmark literature focuses on the subset of patients using prescription opioids.
SIPPS provided data on about 100,000 Medicaid patients living on Staten Island who had any medical encounter or filled a prescription for an opioid between July 2014 and November 2019.
The inputs for the machine learning algorithm include 90 data variables that account for known risk factors for opioid-related harm, such as demographics, prescription history for opioids and other potentially addictive medications, and the number and type of encounters with healthcare clinicians for both physical and mental health. Rather than focusing on predicting deaths (which is rather like closing the barn doors after the horses have run out), the outcome variables focus on harm, e.g., dependence, overdose, abuse.
Initial Results: For the initial analyses, the team focused specifically on the most severe harm outcome, opioid overdose. Preliminary results were encouraging. Intervening with the 10% of patients identified by the model as highest-risk would reach more than 50% of the patients who had an overdose during the prediction window.
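The kind of measure described here, the share of all overdoses that fall among the patients flagged as highest-risk, can be computed as in the Python sketch below. The function name, the risk scores and the outcomes are hypothetical and invented for illustration; they are not the team's data, model or results.

```python
def capture_rate(scores, outcomes, fraction=0.10):
    """Share of all adverse events found among the top `fraction` of patients by risk.

    scores:   model risk scores, one per patient (higher means riskier)
    outcomes: 1 if the patient had the event in the prediction window, else 0
    """
    n_flagged = max(1, round(len(scores) * fraction))
    # Rank patient indices from highest to lowest predicted risk.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:n_flagged]
    total_events = sum(outcomes)
    if total_events == 0:
        return 0.0
    return sum(outcomes[i] for i in flagged) / total_events


# ASSUMPTION: 20 invented patients, three of whom had an overdose.
scores = [0.95, 0.90, 0.60, 0.55, 0.50, 0.45, 0.40, 0.38, 0.35, 0.30,
          0.28, 0.25, 0.22, 0.20, 0.18, 0.15, 0.12, 0.10, 0.08, 0.05]
outcomes = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

rate = capture_rate(scores, outcomes)
print(f"Top 10% of patients capture {rate:.0%} of overdoses")
```

In this toy data the two highest-scored patients account for two of the three overdoses. A measure of this kind speaks directly to resource allocation, because it says how much harm could be reached by intervening with a fixed, small share of patients.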
In some of the later steps of this project, the researchers will be adding data types that are even more unusual in this sort of work. They are going to be able to cull data from law enforcement, legal system and social service providers.
The impact on treatment could be significant, resulting in focusing resources where they are most needed to try to prevent/minimize further harm in this at-risk population. If the team can demonstrate that their methodology results in a significant improvement in prediction accuracy, a tool based on that methodology may be used by clinicians when deciding with which patients to intervene and how to best serve them.
“Food is Medicine” is an inspiring and popular idea, but there is little rigorous evidence on its effectiveness. That is changing with Professor Joseph Doyle’s randomized controlled trial with a group of a health system’s clients, which kicked off in early 2019. This project is a large-scale randomized evaluation of a food-as-medicine approach. A dedicated dietitian prescribes fresh food to a cohort of food-insecure patients with diabetes: enough for two meals a day, five days per week, for the patient and the patient’s household. A number of support systems and programs are in place to complement the “prescription”, including education for participants about how to prepare the food and manage their health.
Along with his team, Doyle is measuring clinical outcomes (blood sugar control, blood pressure, weight, and cholesterol) as well as healthcare utilization and patient surveys of self-assessed health, healthy behaviors, and patient satisfaction. They are also measuring spillover effects on household members, and testing whether this program is a complement or substitute for other healthy behavior plans such as other diabetes and non-diabetes preventive care.
Previous case studies of individual participants point to large improvements in diet and health. The goal of this project is to test this idea at a larger scale with a credible research design—a randomized controlled trial.
COVID-19 Impact on Program Delivery: There was some concern about the effect of the pandemic on this program that involves substantial face-to-face interaction. The program was able to pivot to a bi-weekly food distribution service largely through curb-side service. The clinical team’s educational components converted to telemedicine and/or telephonic outreach to minimize risks associated with COVID exposure while ensuring participants felt well supported. While these changes were certainly not initially planned, the program’s continued success with these changes during these difficult times may bode well for adapting it to other types of communities in the future.
One of the diagnostic tools in the oncology arsenal is a liquid biopsy. This procedure may be preferable to a traditional biopsy since it is non-invasive and quicker, and it may detect cancer earlier and help formulate treatment plans.
However, liquid biopsy tests need to be accurate to be useful, and they are only as good as the software that underpins them. Professor Vivek Farias developed advanced machine learning methods, which, when applied to the liquid biopsy technology, resulted in a vast improvement on the previous technique in test studies.