Data offers extraordinary power – and no two organizations harness this power better than the Department of Veterans Affairs (VA) and the Department of Energy’s Oak Ridge National Laboratory (ORNL). These two groups rallied around a shared goal – to leverage technology to improve the lives of veterans and citizens alike.
Launched in 2011, the Million Veteran Program (MVP) sought to learn how genes, lifestyle, and military exposures affect health and illness. Today, over 825,000 veteran partners make up one of the world’s largest programs on genetics and health.
MVP helps researchers across the country better understand how genes affect health and illness – helping them to prevent illnesses and improve treatments of disease. Research using MVP data is already a part of more than 30 VA projects, including efforts focused on understanding the role of genes in PTSD, diabetes, cancer, heart disease, and suicide. This research is helping the VA to better understand the role genes play in many common illnesses, especially those among combat veterans.
Suicide prevention is the VA’s highest priority. Since 2016, more than 6,000 veterans have died by suicide, and from 2005 to 2016 the rate of veteran suicides in the U.S. increased by more than 25 percent.
The VA partnered with ORNL to revamp a project focused on predictive models and advanced informatics to identify at-risk veterans. For example, the medication possession ratio (MPR) algorithm creates individualized summaries of veterans’ medication patterns – which medications they are prescribed and how often they fill them. This model helps pinpoint veterans who have inconsistent medication usage patterns. Historically, MPR calculations have been limited in scope: the model typically included only certain types of medication and covered a narrow slice of the total veteran population in the Veterans Health Administration database.
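The article does not describe how the VA and ORNL compute MPR, so the following is only a minimal sketch of the commonly used definition: days of medication supplied during an observation period divided by the number of days in that period. The `Fill` record, the function name, and the choices to truncate supply at the end of the period and cap the ratio at 1.0 are illustrative assumptions, not the project's implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Fill:
    """One pharmacy fill (hypothetical record layout, not the VA schema)."""
    fill_date: date
    days_supply: int


def medication_possession_ratio(fills, period_start, period_end):
    """Days of supply dispensed in the period divided by days in the period."""
    period_days = (period_end - period_start).days + 1
    if period_days <= 0:
        raise ValueError("period_end must not precede period_start")
    supplied = 0
    for f in fills:
        if period_start <= f.fill_date <= period_end:
            # Assumption: count only supply that falls inside the window.
            days_left = (period_end - f.fill_date).days + 1
            supplied += min(f.days_supply, days_left)
    # Assumption: cap at 1.0 so early refills never exceed full possession.
    return min(supplied / period_days, 1.0)


# Two 30-day fills over a 90-day window (1 Jan to 30 Mar 2020).
fills = [
    Fill(date(2020, 1, 1), 30),
    Fill(date(2020, 2, 15), 30),
]
mpr = medication_possession_ratio(fills, date(2020, 1, 1), date(2020, 3, 30))
print(f"MPR: {mpr:.2f}")
```

A ratio well below 1.0, as in this example, is the kind of gap in medication coverage that a model like the one described could flag for follow-up.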
ORNL accelerated this algorithm and extended its coverage to all current medications and recent past prescriptions across all 9 million veterans in the database. The previous version of the model would have taken 75 hours to run – now it runs in only 15 minutes, a 300-fold improvement.
“Now we can observe and reach a much larger population that’s potentially at risk – and look at even more risk factors,” said Edmon Begoli, principal investigator on the project and director of the Scalable Protected Data Facilities (SPDF) at the National Center for Computational Sciences at ORNL. “The potential to provide far greater predictive services is there.”
In July, the VA and ORNL announced that they are teaming up with the Department of Health and Human Services (HHS) to form the COVID-19 Insights Partnership, which aims to coordinate and share health data, research, and expertise to aid in the fight against COVID-19.
The partnership uses ORNL’s Summit, the United States’ fastest supercomputer, to conduct research by running large-scale, complex analyses on a vast amount of health data. The research and analysis will focus on vaccine and therapeutic development and outcomes, virology, and other critical scientific topics to better understand COVID-19.
Behind the scenes, these projects all have one thing in common – they require the collection, analysis, and storage of massive amounts of data. Once the supercomputers at ORNL process and narrow down the usable data, Pure Storage technology leverages that data for artificial intelligence and machine learning.
When you have vast quantities and diversity of data – like the data powering these critical programs – there must be a data-centric architecture supporting it. Data does not exist in one accessible place, and it’s not always clean and ready to use. The storage platforms of the past were not built for today’s work.
The VA and ORNL are collaborating on groundbreaking research that is changing the lives of veterans and citizens alike. The data powering this research is massive – and the infrastructure that supports it must be up to the task. A modern, data-centric architecture leverages this data for results.