COVID-19 as the Ultimate Resiliency Test

The spread of COVID-19 has drastically changed how government IT thinks of resiliency. Until recently, resiliency and continuity of operations planning were reserved for potential disaster scenarios – how agencies might respond should an attack or system shutdown occur, and how they would recover.

But the landscape has shifted over the past six to eight weeks. Resiliency and continuity of operations are no longer hypotheticals; they’re being exercised in real time as the vast majority of government workers work remotely. All parts of government – from Federal agencies to state and local governments – were forced to prepare their networks, systems, and devices for an unprecedented way of operating, in a matter of weeks.

Federal agencies shifted from having about 22 percent of their employees participating in some telework in 2018 – even just for one day – to having the vast majority now logging on from home on a daily basis. It’s an impressive feat from an IT perspective, considering government is still delivering citizen services without major disruption. So, is creating resilient networks as easy as it sounds? It depends on where you sit.

How Agencies Are Faring

The Office of Management and Budget (OMB) laid out its directive for telework operations last month, setting in motion agency efforts to expand capabilities. For example, Social Security Administration staff needed to install VoIP software on employee laptops and set up rules for phone hearings, while Veterans Affairs needed capacity on its networks to support telehealth services.

Telework is a complex challenge for any organization, even more so for massive, dispersed organizations under immense time pressure. It requires not only secure and robust network access, but also the devices workers need, productivity software, and rules about the treatment of sensitive data. Agencies that were already working toward digital transformation and cloud computing consequently experienced an easier transition. With virtualized networking in place, all an agency would have to do is request additional capacity from its FedRAMP-approved cloud provider. In contrast, agencies running most of their critical systems on premises lacked the scalability and flexibility needed to support remote operations – both on the front end and back end. They lacked the capacity and speed to adjust in real time.

We’re seeing clear evidence that cloud and adaptability go hand in hand – and agencies that can adapt are more prepared and resilient. As a result, their operations are still smooth and government can continue to provide essential citizen services without disruption. Continuity of operations planning is not about responding to a crisis perfectly from start to finish; rather, it’s about acknowledging a range of possible scenarios and ensuring the right processes and tools are in place to respond as well as possible in the moment, and to adjust and improve as needed.

Resiliency is “the ability of the network to operate and support the mission … before, during, and after an attack by a capable adversary. You have processes in place and technology in place to isolate, contain, or workaround threats and continue operating,” explains former Navy CIO and Department of Defense (DoD) Deputy CIO Robert Carey, now VP/GM of Global Public Sector Solutions at RSA Security. “You have an understanding of how to prioritize your systems – which things really, really need to be up and running, and which may be able to be less resilient and probably don’t. And you have a means to maintain that capability until the threat is resolved.” Although in this case the threat is a pandemic, rather than a natural disaster or cyberattack, the need for resiliency is constant.

DoD CIO Dana Deasy initiated a Telework Readiness Task Force to accommodate the agency’s sudden shift to telework – supporting as many as 4 million DoD military and civilian teleworkers. The task force has helped identify and successfully act on priorities, including a new Commercial Virtual Remote Environment that’s seeing rapid user ramp-up. ISP connections at the Defense Information Systems Agency (DISA) and the Joint Service Provider are up by 30 percent. Call volume capacity at the department increased by 50 percent and DISA onboarded additional endpoints to increase capability by over 300 percent, while the use of global video services, Outlook Web Access, and enterprise audio conferencing grew tenfold, Deasy said.

It’s worth acknowledging that disasters affect government organizations in many different ways. Some agencies will be more challenged than others due to the type of services they provide. For example, even if agencies had strong resiliency plans in place from an IT perspective, spikes in demand for telehealth, data sharing, and claims processing led to unprecedented network and security challenges for those responsible for health services and financial support. How these agencies adapt to this extraordinary challenge will help write the playbook for other organizations at the Federal, state, and local levels as we look to the future.

The Federal government has largely succeeded in continuing operations digitally despite the fast-moving crisis. Systems are up and running and the workforce can largely work from home. New policies will help sustain the momentum.

The Road Ahead

Rep. Gerry Connolly, D-Va., last week said he’d continue pushing for a $3 billion Technology Modernization Fund (TMF) budget increase that would finance technology-related modernization activities to prevent, prepare for, and respond to coronavirus.

“Some of agency IT needs to expand telework during COVID-19 can be addressed with direct appropriations to an agency,” he said. “But there are also larger and more complex modernization efforts that should be funded by the TMF. The application process requires agencies to really think through how they will go about modernizing an IT system and the TMF Board will provide an additional layer of oversight to ensure that the project is on track throughout the duration of the acquisition.”

Although the proposal remains in progress, the Cybersecurity and Infrastructure Security Agency’s (CISA) Trusted Internet Connections (TIC) 3.0 Interim Telework Guidance is actionable now. The guidance includes 18 universal security capabilities that agencies should consider when transitioning to telework. Supporting OMB’s call for agencies to use technology “to the greatest extent practicable to support mission continuity,” the interim guidance provides a framework for government IT to enable secure employee connections to private government networks and cloud environments.

With thousands of employees working from home on their laptops and logging in with two-factor authentication on their mobile devices, the attack surface for malicious actors has expanded and threats are steadily on the rise, making it critical to prioritize cybersecurity best practices.

With that in mind, the capabilities and guidance outlined in the TIC 3.0 Interim Telework Guidance include:

  • Central log management with analysis: Agencies should activate additional logging and increase log alerts to detect malicious activity, and ensure adequate storage for additional logs.
  • Incident response plan and incident handling: Organizations need to account for remote devices and monitor for activities atypical of telework. They should also monitor shared services for potential breaches.
  • Strong authentication: Agencies should ensure users are authenticated to all servers using multi-factor authentication in accordance with OMB M-19-17. As agencies move away from traditional network architectures for remote access, there will be a greater reliance on authentication mechanisms to validate the remote user.
  • Resilience: Agencies should proactively ensure services can scale as necessary to handle telework by users.
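As a rough illustration of the logging and authentication capabilities above, the following Python sketch flags two kinds of remote-access events that an agency might treat as atypical of telework: successful logins without multi-factor authentication, and repeated failed logins by one user. The record schema (`LoginEvent`), the function name `flag_events`, and the failure threshold are all hypothetical choices made for this example; none of them come from the CISA guidance or from any particular SIEM product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    """One remote-access log record (hypothetical schema, for illustration)."""
    user: str
    mfa_used: bool
    success: bool


def flag_events(events, max_failures=3):
    """Return a sorted list of (user, reason) alerts.

    Illustrative rules only (assumptions, not taken from the guidance):
      - any successful login that did not use MFA
      - more than `max_failures` failed logins for a single user
    """
    alerts = set()
    failures = Counter()
    for event in events:
        if event.success and not event.mfa_used:
            alerts.add((event.user, "login without MFA"))
        if not event.success:
            failures[event.user] += 1
    for user, count in failures.items():
        if count > max_failures:
            alerts.add((user, "repeated failed logins"))
    return sorted(alerts)


# Example: one clean login, one login without MFA, four failed attempts.
sample = [
    LoginEvent("alice", mfa_used=True, success=True),
    LoginEvent("bob", mfa_used=False, success=True),
] + [LoginEvent("carol", mfa_used=True, success=False)] * 4

print(flag_events(sample))
# [('bob', 'login without MFA'), ('carol', 'repeated failed logins')]
```

In practice, rules like these would run inside a central log management or SIEM platform rather than a standalone script, and the thresholds would be tuned to each agency's own telework baseline.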

CISA notes that the document is not meant to be prescriptive; “it should be leveraged by agencies and adapted.” So how can agencies turn this into tangible action?

Building Resilient Networks

Agencies should review their processes to confirm they have efficient network management in place. Tools that support resilient networks include visibility tools, SIEM, and threat hunting, as well as strong identity and access management.

“We’ve got to balance the risk management approach, what I refer to as ‘cyber admin,’ with what I call ‘cyber actual,’” Carey says. “Cyber admin is language that says what I’ve done. Cyber actual means if somebody were to try to break in, how hard would it actually be? Does my cyber defense architecture make bad guys work hard?”

Proactive approaches, such as red teaming, are experiential learning devices that can help agencies evaluate their network preparedness and resiliency.

The past six to eight weeks have tested government’s continuity of operations plans in real time. Moving beyond the immediate term, agency leaders and their workforce will benefit from compiling the lessons learned from this critical period, integrating telework best practices into normal operations, and prioritizing support in future budget plans for a network ecosystem that can operate before, during, and after a disaster.

We don’t know when normal operations may resume nationally or statewide, or what “normal” will even look like. Along with TMF support, redoubling efforts for telework through increased network capacity, scalability, and flexibility will help make sure any agency is modernized and fully prepared for the unexpected.
