The National Institutes of Health (NIH) on June 4 released details of a five-year plan that it said aims to overhaul its data infrastructure, management, and analytical abilities in order to boost NIH’s data science capabilities as volumes of health-related data continue to swell in the coming years.
Among several steps that NIH, the Federal government’s primary biomedical and public health research agency, plans to take over the next five years are efforts to improve data infrastructure, including optimizing data storage and security functions and connecting disparate NIH data systems. The plan also includes modernizing its “data repository ecosystem,” supporting storage and sharing of individual data sets, and better integrating clinical and observational data into biomedical science data.
The agency pointed out that its efforts to improve data science capabilities come as the volume of health-related data, particularly genomics data, is expected to grow exponentially in the coming years. NIH said the total amount of genomics data by 2025 is expected to equal or exceed totals from astronomy, YouTube, and Twitter.
Costs and implementation timelines are still to be determined, the agency said, although it appears open to many possibilities.
“Machine learning, deep learning, artificial intelligence, and virtual-reality technologies are examples of data-related innovations that may yield transformative changes for biomedical research over the coming decade,” it said. “The ability to experiment with new ways to optimize technology-intensive research will inform decisions regarding future policies, approaches, and business practices, and will allow NIH to adopt more cost-effective ways to capture, access, sustain, and reuse high-value biomedical data resources in the future.”
“To this end, NIH must weave its existing data-science efforts into the larger data ecosystem and fully intends to take advantage of current and emerging data-management and technological expertise, computational platforms, and tools available from the commercial sector through a variety of innovative public-private partnerships,” the agency said.
Data security will also be top of mind as the agency makes its plans.
“Proper handling of the vast domain of clinical data that is being continually generated from a range of data producers is a challenge for NIH and the biomedical research community, including the private sector,” the agency said. “NIH must develop, promote, and practice robust and proactive information-security approaches to ensure appropriate stewardship of patient data and to enable scientific advances stemming from authentic, trusted data sources. NIH will ensure that clinical-data collection, storage, and use adheres to stringent security requirements and applicable law, to protect against data compromise or loss.”