Seventy percent to 80 percent of data analytics work is preparing the data for its specific purpose, according to Adam Wilson, CEO of Trifacta.
Many Federal agencies oversee vast amounts of big data, from medical records to socioeconomic information. Wilson, who spoke at Cloudera’s Government Forum on April 25, said that agencies could cut down on this “janitorial work” by instructing the people who are most familiar with the data to manage it.
“Expedite it by letting people who know data best do that work. We found that you have a growing number of knowledge workers, and a shrinking number of people who are qualified to provision data for those individuals,” Wilson said. “If you can shift the workers, they’re going to thank you. You’ll find they’re probably pretty good at it. Democratize production along with consumption.”
Trifacta, which produces data management software, partners with Federal agencies on certain projects. For example, the company recently worked with the Centers for Disease Control and Prevention to stem an HIV outbreak in rural Indiana. Wilson said this project required a quick pooling of various sorts of data, such as medical records, geospatial coordinates, demographics, and statistics.
“These data types had to come together quickly. Time is of the essence in these situations,” Wilson said. “Letting people who are experts get involved right from the start is saving people’s lives.”
Access to and dissemination of medical data is also important to Caryl Brzymialkiewicz, chief data officer of the Department of Health and Human Services Office of Inspector General. HHS OIG’s purpose is to identify cases of fraud and abuse in health care.
HHS OIG’s success is partly due to its teamwork with other agencies, according to Brzymialkiewicz. Last year, she and her team of operational experts, strategists, and analysts worked with the Department of Justice and the FBI on a billion-dollar fraud case involving the owner of Miami-based nursing facilities, a hospital administrator, and a physician’s assistant.
For this case, as well as one in which her office investigated an oncologist in Michigan who administered cancer treatment to patients who did not have cancer, Brzymialkiewicz said she relied on medical data.
“I’m very proud of my team. That’s a billion-dollar case that’s been indicted,” Brzymialkiewicz said. “The billion-dollar case was enabled by having access to [the Centers for Medicare and Medicaid Services] systems.”
However, Brzymialkiewicz said agencies need to work as well with their own staff as they do with their customers. She said a strong partnership with her department’s chief technology officer helps shape her projects, which include recruiting talented recent college graduates.
Collaborating with internal leadership molds an agency’s priorities, according to Brzymialkiewicz. She also stressed the importance of OIG’s work, stating that the office’s watchdog reports are meant to help.
“How many people are happy when the inspector general comes to their door?” Brzymialkiewicz said. “We want people to take action on it. We don’t just want to write a report. It’s taking the thinking and discipline and focusing it on internal data.”
Vimesh Patel, director of the National Counterterrorism Center’s Office of Data Strategy and Innovation, echoed Brzymialkiewicz’s emphasis on agency collaboration. NCTC’s purpose is to combat terrorism by sharing intelligence and analyzing information to help key partners such as the FBI, the CIA, and the departments of Justice and Homeland Security.
Patel said his chief role is making sure NCTC analysts have the data they need to help their partners.
“I want to make sure that, if someone at NCTC needs data, it’s available and that they’re not afraid to use it,” Patel said.
NCTC was one of the agencies mentioned in the Department of Justice Office of Inspector General’s joint review on the domestic sharing of counterterrorism information, which was issued on March 31. The review found that the partners in the Information Sharing Environment, namely components of the Office of the Director of National Intelligence, DHS, and DOJ, need to establish consistent rules for sharing information.
Specifically, the report said that NCTC, DHS, and DOJ need to develop guidance on future intelligence information sharing practices. Patel said that compliance regulations, such as security rules dictating who is allowed to view certain information, sometimes slow NCTC’s work.
“In my world, all data is never together. It never will be. We need to find a way to work with that,” Patel said. “Sometimes we’re really successful, sometimes we’re not.”
Patel said he and his team are using data analytics in their collaboration with other agencies as they battle the growing presence of homegrown violent extremism.
Wilson stated that, while some organizations have mastered big data management, others still have a lot to learn. He said the ability to quickly adapt databases and update information separates those who will excel in data management in the next few years from those who will struggle.
“Those who are the most adaptable end up surviving. Embrace the idea that gone are the days when it takes six months to add a table to a database,” Wilson said. “That needs to happen in clicks. If it doesn’t, you’re going to get lapped. That type of agility separates winners from losers.”