Meltdown and Spectre Expose the Dark Side of Superfast Computers
Hundreds of gadget makers and software companies at this week’s annual Consumer Electronics Show (CES) in Las Vegas are staking the success of their newest products on the latest and greatest processors from Intel, AMD, ARM and others. But those bets are looking shaky, even by Sin City’s standards, after last week’s bombshell that many of those processors are plagued by serious security vulnerabilities known as Meltdown and Spectre.
Processors lend a degree of intelligence to just about any electronic device—including the thousands of automobiles, home appliances and gaming systems displayed at the exhibition. It is now clear that the insatiable need for faster processors has had a dark side, as chipmakers cut corners on security, exposing potentially billions of personal computers, mobile devices and other electronics to a new crop of digital attacks for years to come.
Every computer relies on a piece of software known as a kernel to, among other things, manage the interactions between end-user applications—spreadsheets, Web browsers, etcetera—and the underlying central processing unit and memory. The kernel starts and stops the other programs, enforces security settings and restricts access to a device’s memory and data resources. Not surprisingly, the kernel’s speed determines how fast the computer performs as a whole. Chipmakers protect the kernel by isolating it from other programs running on the computer, unless those programs are given specific permission—or “privilege”—to access the kernel.
Meltdown dissolves that isolation, potentially letting an attacker’s malicious software breach the kernel and steal whatever information it finds there—including personal data and passwords. Spectre impairs the kernel’s ability to stop a malicious program from stealing data from other software that uses the processor. Researchers working independently at Google Project Zero, security vendor Cyberus Technology and Graz University of Technology in Austria coordinated their announcement of Meltdown last week. Spectre’s existence surfaced around the same time, courtesy of investigations carried out separately by multiple educational institutions, cybersecurity research firms and noted cryptographer Paul Kocher.
Kocher, best known for his work helping revise a set of cryptographic protocols used to secure computer networks, spoke with Scientific American about how shortcuts for increasing processor speeds led to both vulnerabilities, why it took decades to find these bugs and how to protect computers from attack. [An edited transcript of the conversation follows.]
How did chipmakers compromise computer security to create faster processors?
The underlying issue is that processor clock speeds have largely maxed out. If you want to make a processor faster, you have to get more work done per clock cycle. The other thing that isn’t changing much is the speed of memory. Optimizations, then, become the key to speed increases when designing a computer processor. For example, if your processor comes to a spot where it’s waiting for information from memory, you don’t want to have the processor sit idle until the data come back from memory. Instead, the processor can speculate on the information it will receive and begin working ahead rather than waiting. When the processor guesses right, it gets to keep this extra work; the processor’s speculative execution gives a significant performance boost. Under normal circumstances the percentage of work that is lost because the processor makes the wrong guess and has to backtrack is in the single digits. This optimization has been part of the standard playbook of how to make a fast processor for many years.
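To make the "single digits" figure concrete, here is a minimal sketch of one classic guessing scheme, a two-bit saturating counter, applied to the branch at the end of a loop. It is a textbook illustration, not a description of any particular chip; the function names, the 20-iteration loop and the 50 repetitions are assumptions chosen only for the example.

```python
def simulate_two_bit_predictor(outcomes):
    """Return the fraction of branches a 2-bit saturating counter mispredicts."""
    counter = 2  # states 0-1 predict "not taken", states 2-3 predict "taken"
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            mispredictions += 1  # speculative work past this branch is thrown away
        # Nudge the counter toward the actual outcome, saturating at 0 and 3.
        if taken:
            counter = min(counter + 1, 3)
        else:
            counter = max(counter - 1, 0)
    return mispredictions / len(outcomes)


def loop_branch_pattern(iterations_per_loop, loops):
    """Outcomes of a loop's closing branch: taken until the final iteration."""
    one_loop = [True] * (iterations_per_loop - 1) + [False]
    return one_loop * loops


if __name__ == "__main__":
    # Hypothetical workload: a 20-iteration loop that runs 50 times.
    pattern = loop_branch_pattern(iterations_per_loop=20, loops=50)
    rate = simulate_two_bit_predictor(pattern)
    print(f"misprediction rate: {rate:.1%}")  # prints "misprediction rate: 5.0%"
```

In this toy workload the predictor guesses wrong only when each loop exits, so one guess in 20 is wasted. Real processors use far more elaborate predictors, and real programs have far less regular branches.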
What vulnerabilities did speculative execution create?
Meltdown and Spectre both involve this shortcut, but they work in different ways and have very different implications. Meltdown leverages an issue where ordinary unprivileged code can read memory with kernel permissions. This lets an attacker who can run software on the computer read the entire contents of the physical memory—which, for example, is a big problem in cloud servers where multiple clients share the same server.
Spectre, in contrast, doesn’t involve any privilege escalation issues. Instead it takes advantage of permissions that the code being attacked already has, but tricks the user’s system into doing something speculatively that the program would never have done legitimately, and doing so leaks memory contents.
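The idea of leaking data through work the program "would never have done legitimately" can be sketched with a toy model. The code below is a simulation only, loosely modeled on the bounds-check-bypass pattern, and is not a working exploit: `ToyCPU`, `MEMORY`, `leak_byte` and the one-bit predictor are all hypothetical names and simplifications, and a real attacker would infer which cache line was touched by timing memory accesses rather than by reading a Python set.

```python
# Toy memory layout (an assumption for illustration): 12 public bytes
# that the victim may read, followed directly by 6 secret bytes.
MEMORY = bytearray(b"public-data!") + bytearray(b"SECRET")
ARRAY1_SIZE = 12


class ToyCPU:
    """A deliberately simplified processor with a cache and a one-bit predictor."""

    def __init__(self):
        self.cached_lines = set()       # stands in for which probe lines are cached
        self.predict_in_bounds = False  # last observed outcome of the bounds check

    def victim(self, index):
        """Victim code: `if index < ARRAY1_SIZE: touch probe[MEMORY[index]]`."""
        in_bounds = index < ARRAY1_SIZE
        if in_bounds or self.predict_in_bounds:
            # The body runs for real, or speculatively on a wrong guess.
            value = MEMORY[index]
            self.cached_lines.add(value)  # the cache footprint survives the rollback
        self.predict_in_bounds = in_bounds  # the predictor learns the true outcome
        # Architecturally, an out-of-bounds request returns nothing.
        return MEMORY[index] if in_bounds else None


def leak_byte(cpu, target_index):
    """Train the predictor, flush the cache, then trigger one wrong guess."""
    cpu.victim(0)              # in-bounds call teaches the predictor "in bounds"
    cpu.cached_lines.clear()   # attacker flushes the cache
    visible = cpu.victim(target_index)  # out of bounds: the program returns None
    (leaked,) = cpu.cached_lines        # but exactly one cache line was touched
    return visible, leaked


if __name__ == "__main__":
    cpu = ToyCPU()
    recovered = bytes(leak_byte(cpu, i)[1] for i in range(12, 18))
    print(recovered)  # prints b'SECRET'
```

The victim's bounds check behaves correctly as far as the program can tell, since every out-of-bounds call returns nothing. The secret escapes only through the side effect that speculation leaves behind in the cache.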
What prompted your research that ultimately uncovered Spectre?
The timing was that I had sold my company (Cryptography Research) and had time to get my hands dirty doing research. What originally got me working on this was the question: Where have we made trade-offs between performance and security, and added complexity, in ways where security was not the top priority? I was completely unaware of Google Project Zero’s work on this until after I reported the issue and fully implemented the exploits in my [research] paper. The timing is a coincidence, although it’s actually more surprising these vulnerabilities weren’t found a long time ago.
The fact that Meltdown was discovered by several groups of researchers working independently makes more sense—partly because there was a post online by [security researcher] Anders Fogh, in which he had investigated the issue but did not find the problem. He then published a description of his work, which got other people thinking about the problem. There was also work on a patch set named KAISER (short for kernel address isolation to have side-channels efficiently removed) to address attacks against KASLR (kernel address space layout randomization—a security technique to make exploits harder by placing various objects at random, rather than fixed, addresses). That turns out to be the fix for Meltdown, so it also drew attention to the issue.
Why didn’t Intel or the other chipmakers discover the problem first?
That’s a fair question. The mess they’re dealing with right now is vastly more painful than what they would have gone through if they had found and fixed the problem immediately. They would have been in a much better position than external researchers to find these vulnerabilities, since they know in intimate detail how all of the technology works—whereas I was poking around without any inside knowledge. For Spectre in particular, it’s also worth asking why it wasn’t found by ARM or more generally computer scientists who were teaching speculative execution in microprocessor academic courses. One answer might be that people tend to look at the things they know to look at, and Spectre involves a problem that cuts across different disciplines, where people working on one aspect of a technology don’t know as much about other aspects.
Still, if a few security people had looked closely at speculative execution and considered different ways it could go wrong, I think a significant number of them would have realized that it was a dangerous idea. I suspect that a lot more unpleasant things are lurking under the covers that we’re not seeing because questions about the security implications are not being asked.
What does this mean for the future of chip design?
I think there’s a bit of a fog of war right now that makes it hard to come up with a clear answer. There are different, very high-level things that chipmakers can do. One of them is to leave it to software developers to deal with complex countermeasures, but I think this will largely fail since developers are not equipped to do this. Another is to build chips that combine cores with different security properties, placing faster execution units and safer execution units on the same die. If you’re playing a game on your mobile device, the big processor core is running. If your phone receives a data packet while it’s asleep, the smaller core handles that.
Most of the really critical security applications don’t need a lot of performance. If you’re doing a wire transfer, for example, it doesn’t take a lot of [computing] power to get the user’s confirmation, to do the cryptography and to transmit the result. If you’re playing a video game, you really don’t need that kind of security but you do need the best performance. Still, it’s too early to say what things will look like in 10 or even five years from now.
How do the patches for these vulnerabilities slow a processor’s performance?
The fix for Meltdown changes the way kernel memory is mapped. The way things worked before, the switch between normal user code and the kernel was a very lightweight transition. With the patches, more work has to be done on these transitions. The impact on performance will depend on the type of workload. If almost all of a program’s [number crunching] is done in user mode, with very few switches to kernel mode, you won’t see any significant impact. In contrast, if you have a piece of code that spends a lot of its time quickly switching back and forth between user and kernel modes—such as reading very small bits of information from files stored on a very fast disk—the patch ends up adding a lot of overhead.
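The workload difference described here can be illustrated by counting how many times a program crosses into the kernel to read the same file. This is a rough sketch under stated assumptions: the function name `read_in_chunks`, the 64-kilobyte file and the 16-byte chunk size are made up for the example, each `os.read` call is treated as one user-to-kernel transition, and the timings it prints will vary from machine to machine and say nothing definitive about any particular patch.

```python
import os
import tempfile
import time


def read_in_chunks(path, chunk_size):
    """Read a file with os.read; return (bytes_read, number_of_read_calls)."""
    # O_BINARY exists only on Windows; elsewhere this adds nothing to the flags.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    total = 0
    calls = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)  # one trip into the kernel per call
            calls += 1
            if not chunk:  # an empty result signals end of file
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total, calls


if __name__ == "__main__":
    # Hypothetical test file: 64 KiB of zero bytes.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(bytes(64 * 1024))
        path = handle.name
    try:
        for chunk_size in (16, 64 * 1024):
            start = time.perf_counter()
            total, calls = read_in_chunks(path, chunk_size)
            elapsed = time.perf_counter() - start
            print(f"chunk={chunk_size:>6}  bytes={total}  calls={calls}  "
                  f"seconds={elapsed:.6f}")
    finally:
        os.remove(path)
```

Both runs read the same 65,536 bytes, but the small-chunk run makes thousands of kernel transitions while the large-chunk run makes two. Any extra cost added to each transition is multiplied accordingly.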
How does the Spectre fix work?
Spectre can only be mitigated—not fixed—at this time, because the flaw impacts a lot of different software including operating systems, drivers, Web servers and databases. Some of this software—such as drivers—is rarely updated. A lot of the software is also very complex, so developing fixes is a Herculean task. For some CPUs there are also some microcode patches that help mitigate one of the aspects of Spectre, but there is a staggering amount of work required to get all this to work correctly. Even worse, Spectre involves things that are very hard to detect in the processor, making testing of countermeasures really, really difficult. Some chips will just have to be replaced because they can’t be updated. As a result, this is an issue that’s going to be with us for a long time.
How can computer users protect themselves?
On the Meltdown side, there are a number of fixes that can be implemented at the operating-system level—for example in MacOS, Windows, Linux and iOS operating system updates. If you look at the trajectory of Meltdown, there are certainly attackers who can and will attack unpatched computers, so it’s urgent to get that patch applied. Once the patch is installed, so far as we know, Meltdown ceases to be a security threat.
With Spectre, on the other hand, there are only partial mitigations. Some chips can’t be updated; others could be updated, but the company that made the product with the chip won’t bother to pass the updates along. Often the updates won’t address all of the problems. Still, it’s important to keep the risk in perspective. There are plenty of other security threats that we live with every day. Spectre isn’t necessarily more dangerous than other vulnerabilities already causing security pros enormous headaches. With Spectre, you’re tricking the processor into making wrong predictions at a particular spot in the software running on the victim’s computer. The attackers need to have awareness of the software code that the victim is using, and set up the right conditions for the attack to work. As a result, there isn’t a single attack program that will work across many computers.
What is the most important takeaway message from the discoveries of Meltdown and Spectre?
The bigger picture relates to how we’re approaching security, and our inability to make systems secure even in the places where security is most needed. These bugs are a symptom of that problem. When you optimize for objectives—such as speed—that interfere with security, you can reasonably expect that you’re going to end up with problems. Spectre is a very clean example of a security/performance trade-off, where speed optimizations led directly to security problems. The fact that these security vulnerabilities affect all of the major microprocessor manufacturers really indicates that there has been a failure of thought and attention, rather than a specific error that an individual or even a single company has made.