Charles Chen, director of the AI and Emerging Technology Office at the Department of State, said at the June 30 AI in Action webcast that data lakes are “certainly essential for the success of machine learning and AI.”
Whether the data lake exists in a physical environment, the cloud, or a hybrid space, Chen praised the data management system for enabling dataset collaboration and organization.
“Having that data lake model means numerous different teams out there could essentially all tap into that same data lake resource … not only can they access it they should also contribute to those datasets,” he said.
Chen explained that emerging technologies such as machine learning (ML) and AI rely heavily on access to large volumes of training data. “Accuracy of machine learning requires very large datasets as a training model,” he said, “and AI really shines when you start cross referencing multiple data streams and datasets.”
In addition, as organizations and Federal agencies develop more sophisticated AI and ML with large amounts of data, that technology can be trained to keep up with data governance and analysis. Chen said he thinks AI could transform real-time data governance by extracting the maximum value from the data at the outset. AI and ML can also be trained to recognize inefficiencies and adapt in real time, transforming decision-making processes.
Chen gave the example of using data management and emerging technologies in a cybersecurity setting. AI and ML can analyze massive datasets more efficiently than humans can, which strengthens threat prevention, he explained.
“AI can evolve and become more intelligent as they analyze greater datasets, allowing humans to make more informed decisions versus always fighting fires,” Chen explained.