Soy Chen, MS; Danielle Bergman, BSN, RN; Kelly Miller, DNP, MPH, APRN, FNP-BC; Allison Kavanagh, MS; John Frownfelter, MD, MSIS; and John Showalter, MD
This study demonstrates that it is possible to generate a highly accurate model to predict inpatient and emergency department utilization using data on socioeconomic determinants of care.
Objectives: To determine if it is possible to risk-stratify avoidable utilization without clinical data and with limited patient-level data.
Study Design: The aim of this study was to demonstrate the influences of socioeconomic determinants of health (SDH) with regard to avoidable patient-level healthcare utilization. The study investigated the ability of machine learning models to predict risk using only publicly available and purchasable SDH data. A total of 138,115 patients were analyzed from a deidentified database representing 3 health systems in the United States.
Methods: A hold-out methodology was used to ensure that the model’s performance could be tested on a completely independent set of subjects. A proprietary decision tree methodology was used to make the predictions. Only SDH features, together with age group, gender, and race, were used in the prediction of a patient’s risk of admission.
Results: The decision tree–based machine learning approach analyzed in this study was able to predict inpatient and emergency department utilization with a high degree of discrimination using only purchasable and publicly available data on SDH.
Conclusions: This study indicates that it is possible to stratify patients by risk of utilization without interacting with the patient or collecting information beyond the patient’s age, gender, race, and address. The implications of this application are broad and have the potential to positively affect health systems by facilitating targeted patient outreach with specific, individualized interventions to tackle detrimental SDH at not only the individual level but also the neighborhood level.
Am J Manag Care. 2020;26(1):In Press
This investigation demonstrates that it is possible to predict individual hospital and emergency department utilization using publicly available data on socioeconomic determinants of care and purchased behavioral data, without requiring clinical risk factors.
- It is possible to stratify patients by risk of utilization without interacting with the patient or collecting information beyond the patient’s age, gender, race, and address.
- The implications of this application are wide and have the potential to positively affect health systems by facilitating targeted patient outreach with specific, individualized interventions to tackle detrimental social determinants of health at not only the individual level but also the neighborhood level.
As defined by the World Health Organization and recognized by HHS, social determinants of health (SDH) are the conditions in which people live, work, play, and age. SDH affect a wide range of health-related outcomes, such as chronic conditions, preventable hospitalizations, morbidity, and mortality.1,2 Decades of research have found that sociodemographic status, racial and ethnic disparities, and individual behaviors directly correlate with an increase in the prevalence and incidence of chronic diseases.3-5 In the United States, chronic diseases have contributed significantly to the rise in healthcare costs, with approximately 90% of the $3.3 trillion in annual healthcare expenditures attributed to caring for patients with chronic and mental health conditions.6 Almost half of all Americans suffer from at least 1 chronic disease (eg, cancer, heart disease, stroke, chronic obstructive pulmonary disease, diabetes),7-9 and two-thirds of deaths are caused by 1 or more of these chronic diseases. In addition, nationwide trends show a projected overall increase in chronic conditions.10,11 Thus, it is imperative to address SDH not only at the individual level but at the population level as well.
Associations between economic inequality and health disparities exist in the United States; for example, residents of impoverished communities face a higher risk of mental health issues and chronic diseases, higher mortality, and lower life expectancy.12 Inequalities include lack of access to healthy food, with 17.4 million households considered food insecure13; decreased receipt of preventive medical care, with 1 in 4 individuals without a primary care provider14; transportation barriers, with 3.6 million people failing to obtain medical care10; and competing basic needs, with 65.9% of food assistance program clients reporting having to choose between food and medical care.15 The need for providers and communities to address SDH is apparent; however, healthcare providers have limited ability and limited access to do so within their existing workflow. Entering SDH data in electronic health records (EHRs) is predominantly a manual documentation process that providers complete for a limited range of determinants and that relies on the accuracy of patients’ self-reports.16,17 From a healthcare management perspective, there is no evidence-based screening recommendation for SDH; however, several policy statements support screening patients for disparities.18,19 According to a recent Kaiser Permanente survey of 1006 US adults 18 years or older demographically matched to represent the US population per Census data, 97% of respondents indicated that their healthcare provider should ask about social needs during medical visits and 80% expressed a desire for their healthcare provider to share information about resources to address their individual needs.20
Recent changes in healthcare policies and initiatives, including the introduction of the Accountable Health Communities established by the Patient Protection and Affordable Care Act (ACA),16,21 are aimed toward reducing health inequalities. Such changes direct attention to the health-related social needs of Medicare and Medicaid beneficiaries and how addressing those needs can drive improvements in population health.22 In addition, expansions made to CMS’ Medicare Advantage program include a greater level of coverage for SDH. Coverage plans now include services like telemonitoring, benefits for over-the-counter medications, and rides to medical appointments for patients without transportation.23,24 This expansion requires more data sources to track new kinds of information that are not readily available within the EHR.25
Current analyses, predictive models, and prevention initiatives focus on addressing SDH at the population level or the zip code level.26-30 The shortcoming of this approach is a gap in addressing the individual patient’s needs, such as defining clinical action steps that are relevant to the patient as opposed to an overall population approach. Advancements in cognitive science allow for the analysis of individual contributions of SDH at the patient level, informing appropriate interventions that can reduce the risk of negative health outcomes such as preventable readmissions or hospitalizations.31 Additionally, increasing access to SDH data based on geography (eg, through the Census Data Application Programming Interface) and the ability to purchase individual behavioral indexes may decrease the need to collect large sets of data from individual patients.
The aim of this study is to demonstrate the influences of socioeconomic determinants of health with regard to utilization at the individual level. Given what is known about the contribution of socioeconomic determinants of health, machine learning should be able to predict utilization independent of the patient’s clinical condition while defining which determinants confer the greatest risk. The study will investigate the ability to predict risk with publicly available and purchasable SDH data.
We selected 138,115 patients from a deidentified database representing 3 health systems in the United States. The patient sample was selected to develop the most generalizable model. Both adult and pediatric patients were included, health systems were chosen from 3 diverse geographical areas, and all patients with at least 1 ambulatory, emergency department (ED), or inpatient visit during the month of November 2018 were included. The health systems were in urban Ohio, urban Georgia, and rural Alabama.
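For readers who want a concrete picture of the hold-out methodology mentioned in the Methods summary, the following is a minimal sketch, not the study's implementation. The proprietary decision tree is not reproduced. The function names, the 20% hold-out fraction, the random seed, and the use of the area under the ROC curve as the measure of discrimination are all assumptions made for illustration, and the data are made up.

```python
import random


def holdout_split(patient_ids, test_fraction=0.2, seed=0):
    """Split patient IDs into disjoint training and hold-out sets.

    test_fraction and seed are illustrative assumptions, not values
    reported by the study.
    """
    ids = sorted(set(patient_ids))   # de-duplicate so no patient lands in both sets
    rng = random.Random(seed)        # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]  # (training IDs, hold-out IDs)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical patient IDs; a model would be fit on train_ids only and
# scored on holdout_ids, which it never saw during training.
train_ids, holdout_ids = holdout_split(range(1000), test_fraction=0.2, seed=42)

# Made-up labels (1 = utilization occurred) and risk scores for 5 patients.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

Splitting by patient ID rather than by visit is the design choice that matters here: it keeps every record for a given patient on one side of the split, so the hold-out set remains a completely independent set of subjects.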