
Patching is Hard

There are many news reports of a ransomware worm. Much of the National Health Service in the UK has been hit; so has FedEx. The patch for the flaw exploited by this malware has been out for a while, but many companies haven’t installed it. Naturally, this has prompted a lot of victim-blaming: they should have patched their systems. Yes, they should have, but many didn’t. Why not? Because patching is very hard and very risky, and the more complex your systems are, the harder and riskier it is.

Patching is hard? Yes—and every major tech player, no matter how sophisticated they are, has had catastrophic failures when they tried to change something. Google once bricked Chromebooks with an update. A Facebook configuration change took the site offline for 2.5 hours. Microsoft ruined network configuration and partially bricked some computers; even their newest patch isn’t trouble-free. An iOS update from Apple bricked some iPad Pros. Even Amazon knocked AWS off the air.

There are lots of reasons for any of these, but let’s focus on OS patches. Microsoft—and they’re probably the best in the business at this—devotes a lot of resources to testing patches. But they can’t test every possible user device configuration, nor can they test against every software package, especially if it’s locally written. An amazing amount of software inadvertently relies on OS bugs; sometimes, a vendor deliberately relies on non-standard APIs because there appears to be no other way to accomplish something. The inevitable result is that on occasion, these well-tested patches will break some computers. Enterprises know this, so they’re generally slow to patch. I learned the phrase “never install .0 of anything” in 1971, but while software today is much better, it’s not perfect and never will be. Enterprises often face a stark choice with security patches: take the risk of being knocked off the air by hackers, or take the risk of knocking yourself off the air. The result is that there is often an inverse correlation between the size of an organization and how rapidly it installs patches. This isn’t good, but with the very best technical people, both at the OS vendor and on site, it may be inevitable.

To be sure, there are good ways and bad ways to handle patches. Smart companies immediately start running patched software in their test labs, pounding on it with well-crafted regression tests and simulated user tests. They know that eventually, all operating systems become unsupported, so they plan (and budget) for replacement computers, and they make sure their own applications run on newer operating systems. If they won’t, they update or replace those applications, because running on an unsupported operating system is foolhardy.
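That test-lab discipline amounts to a simple promotion gate: a patch advances from the lab to a pilot group to the full fleet only if every regression and simulated-user test passes at the current stage. A minimal sketch in Python, purely illustrative (the ring names and the function are my own invention, not any company’s actual process):

```python
# Hypothetical staged-rollout gate: a patch advances to the next ring
# only when every test in the current ring has passed.

RINGS = ["test-lab", "pilot", "fleet"]

def next_ring(current: str, test_results: list[bool]) -> str:
    """Return the ring the patch may advance to; any failure halts promotion."""
    if current not in RINGS:
        raise ValueError(f"unknown ring: {current}")
    idx = RINGS.index(current)
    if all(test_results) and idx + 1 < len(RINGS):
        return RINGS[idx + 1]
    return current  # a single failed test (or the final ring) keeps the patch where it is
```

The point of the gate is exactly the CIO’s dilemma below: a failure in the pilot ring stops the rollout, which protects the fleet but leaves every unpatched machine exposed in the meantime.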

Companies that aren’t sophisticated enough don’t do any of that. Budget-constrained enterprises postpone OS upgrades, often indefinitely. Government agencies are often the worst at that, because they’re dependent on budgets that are subject to the whims of politicians. But you can’t do that and expect your infrastructure to survive. Windows XP support ended more than three years ago. System administrators who haven’t upgraded since then may be negligent; more likely, they couldn’t persuade management (or Congress or Parliament…) to fund the necessary upgrade.

(The really bad problem is with embedded systems—and hospitals have lots of those. That’s “just” the Internet of Things security problem writ large. But IoT devices are often unpatchable; there’s no sustainable economic model for most of them. That, however, is a subject for another day.)

Today’s attack is blocked by the MS17-010 patch, which was released March 14. (It fixes holes allegedly exploited by the US intelligence community, but that’s a completely different topic. I’m on record as saying that the government should report flaws.) Two months seems like plenty of time to test, and it probably is enough—but is it enough time for remediation if you find a problem? Imagine the possible conversation between FedEx’s CSO and its CIO:

“We’ve got to install MS17-010; these are serious holes.”

“We can’t just yet. We’ve been testing it for the last two weeks; it breaks the shipping label software in 25% of our stores.”

“How long will a fix take?”

“About three months—we have to get updated database software from a vendor, and to install it we have to update the API the billing software uses.”

“OK, but hurry—these flaws have gotten lots of attention. I don’t think we have much time.”

So—if you’re the CIO, what do you do? Break the company, or risk an attack? (Again, this is an imaginary conversation.)

That patching is so hard is very unfortunate. Solving it is a research question. Vendors are doing what they can to improve the reliability of patches, but it’s a really, really difficult problem.

By Steven Bellovin, Professor of Computer Science at Columbia University

Bellovin is the co-author of Firewalls and Internet Security: Repelling the Wily Hacker, and holds several patents on cryptographic and network protocols. He has served on many National Research Council study committees, including those on information systems trustworthiness, the privacy implications of authentication technologies, and cybersecurity research needs.

Unpatched systems aren't a major issue with ransomware Niel Harper  –  May 13, 2017 11:05 AM

Hardened servers make it more difficult for core infrastructure to be compromised by ransomware. And because most ransomware is propagated by an endpoint visiting a compromised website that hosts rootkits, the use of virtualized desktops with non-persistent images can reduce the time and complexity associated with patch management. One of the best defenses against malware is a robust backup and recovery solution. My preferred approach is disk-to-disk-to-tape, with encrypted tapes and offsite backup storage. Couple this with real-time replication for critical data sets, and an organization can better protect itself from malware attacks, because recent versions of critical data are available for recovery.
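The backup posture described above ultimately reduces to one measurable question: is the newest restorable copy of each critical dataset younger than its recovery-point objective (RPO)? A hypothetical monitoring check, sketched in Python (the function, dataset names, and default RPO are made up for illustration):

```python
import time

# Hypothetical check: flag any critical dataset whose newest backup
# is older than its recovery-point objective (RPO), in seconds.

def stale_backups(last_backup_ts: dict[str, float],
                  rpo_seconds: dict[str, float],
                  now: float = None) -> list[str]:
    """Return names of datasets whose latest backup violates the RPO."""
    now = time.time() if now is None else now
    return [name for name, ts in last_backup_ts.items()
            if now - ts > rpo_seconds.get(name, 86400)]  # default RPO: 24 hours
```

A dataset covered by real-time replication would simply carry a very small RPO here; tape-only datasets would carry a larger one.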

Good point, but ... Neil Schwartzman  –  May 14, 2017 3:04 PM

Please enjoy my other side of the coin, here: http://www.circleid.com/posts/20170514_the_criminals_behind_wannacry/

