DARPA on the State of Computer Security

A little over a year ago, the Defense Advanced Research Projects Agency (DARPA) released a broad agency announcement soliciting proposals for “Clean-slate design of Resilient, Adaptive, Secure Hosts (CRASH)”. I stumbled upon the original announcement shortly after its release. Although the solicitation deadline has since passed, the CRASH announcement is interesting for its concisely stated perspective and list of objectives. DARPA is well known for setting extremely ambitious applied research objectives and very often finding a team capable of meeting them (e.g. the DEKA robotic arm), and I think that such a record of success lends some extra weight to its pronouncements.

More precisely, the announcement is interesting because it succinctly summarizes the state of the art in understanding what computer security threats are and where they come from. CRASH starts from the view that:


    There are very few fundamental sources of the vulnerabilities exploited by cyber attackers. These attacks stem from the fact that current computer systems cannot enforce the intended semantics of their computations.

This statement is a close corollary of the principle that a system’s behavior cannot be incorrect ex post facto. That is, if a system’s behavior is consistent with the semantics of its specification, then that behavior cannot be faulted for its consequences, since those consequences are part of what was effectively asked for. (This principle has a colorful history of depictions in literature; see W. W. Jacobs’ “The Monkey’s Paw”.) Vulnerabilities arise either from oversights by system architects and implementers who fail to stick to the specification, or from ambiguities that are essential to the specification itself.

The first source of vulnerability presents an engineering problem: more automation, better tools, and more scrupulous practices can very effectively eliminate discrepancies between specification and implementation. The second source, however, presents a much more difficult theoretical problem. If a specification is essentially ambiguous, then no amount of care or precision in the design process can rule out unwanted results.

The CRASH solicitation identifies four major sources of this kind of ambiguity, namely failures to consistently enforce:

  • Memory safety
  • Type safety
  • The distinction between code and data
  • Access control and information flow policies

There is arguably some overlap between these, and although this is a good list, it’s probably not exhaustive. However, it’s fairly easy to see that a failure to enforce any one of these items represents, in a very real and concrete way, an ambiguity (or perhaps an irony) in the semantics of the affected computation.
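To make this concrete, here is a minimal sketch in C of the classic stack buffer overflow, which combines a memory-safety failure with a collapse of the code/data distinction. (The function and names here are illustrative inventions of mine, not anything drawn from the CRASH announcement.)

    /* The intended semantics of greet() are "copy the user's name into
       a local buffer". Nothing in C enforces that the copy stays inside
       `name`, so a sufficiently long input silently overwrites adjacent
       stack memory, including the saved return address. At that point,
       attacker-supplied data begins to steer code execution: the
       code/data distinction has collapsed. */
    #include <string.h>

    void greet(const char *user_input) {
        char name[16];
        strcpy(name, user_input);   /* no bounds check: undefined behavior
                                       once user_input exceeds 15 chars */
    }

    int main(int argc, char **argv) {
        if (argc > 1)
            greet(argv[1]);         /* a 64-byte argument typically crashes;
                                       a crafted payload could instead
                                       redirect control flow */
        return 0;
    }

The specification “copy the name” is perfectly clear for short inputs and silently ambiguous for long ones, and nothing in the program text marks the boundary between the two.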

CRASH arguably represents the standard intuitions in formal methods and computer security today: eliminate the grey areas as thoroughly as possible, and automate and systematize the process enough to make it practical. There is a bit of a wrinkle in this approach, though, in that CRASH also demands systems that can dynamically adapt themselves, e.g. by “[l]earn[ing] from previous attacks how to guard against and cope with future attacks.” To me, there seems to be a direct tension between consistently enforcing a rigid semantics and remaining adaptive enough to detect and resist intrusions and attacks. While it seems premature to claim that the two goals are wholly inconsistent, it is not at all clear where the best path lies between systems that are scrupulously consistent with their specifications and systems that can dynamically adapt to changing conditions. There must be some reasonable trade-off, but what is it?

I think there are two kinds of interesting questions to be drawn from these considerations:

  1. Taking the adversarial perspective, what is the most effective way to discover ambiguities in a system’s semantics? Is it possible to discover these ambiguities systematically? Can attacks be constructed directly (and automatically) from a description of these ambiguities? (A crude instance of systematic discovery is sketched after this list.)
  2. In general, how much consistency with a given specification must be sacrificed to gain a given measure of adaptiveness? (There is a small amount of work in this area; see “The Price of Programmability” by Michael Conrad.) Which kinds of statically specified behavior are independent of which kinds of adaptive behavior? What kinds of languages can effectively specify both adaptive and static behaviors?
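
On the first set of questions, the best-known partial answer is random testing, or fuzzing: generate inputs blindly and treat a crash as evidence of a semantic ambiguity. Below is a minimal, POSIX-only sketch of the idea; parse_header is a hypothetical buggy target of my own invention. (Compile without optimization, e.g. cc -O0, so the deliberately dead store is not elided.)

    /* Minimal sketch of systematic ambiguity discovery by fuzzing.
       Each random input is run in a forked child process; any fatal
       signal in the child is reported as a candidate vulnerability. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical target: the intended semantics are "copy the name
       field into a local buffer", but nothing enforces that it fits. */
    static void parse_header(const unsigned char *data, size_t len) {
        char name[16];
        if (len > 1 && data[0] == 'N')
            memcpy(name, data + 1, len - 1);  /* bug: may exceed 16 bytes */
    }

    int main(void) {
        srand(42);                            /* fixed seed: reproducible */
        for (int trial = 0; trial < 10000; trial++) {
            unsigned char input[64];
            size_t len = (size_t)(rand() % 64) + 1;
            for (size_t i = 0; i < len; i++)
                input[i] = (unsigned char)(rand() & 0xFF);

            pid_t pid = fork();
            if (pid < 0) { perror("fork"); return 1; }
            if (pid == 0) {                   /* child: run the target */
                parse_header(input, len);
                _exit(0);
            }
            int status = 0;
            waitpid(pid, &status, 0);
            if (WIFSIGNALED(status))          /* crash => candidate bug */
                printf("trial %d: input of length %zu crashed the target "
                       "(signal %d)\n", trial, len, WTERMSIG(status));
        }
        return 0;
    }

Blind random testing finds this particular bug only by luck (the 'N' guard is satisfied once in 256 trials on average); whitebox techniques such as symbolic execution (e.g. KLEE, SAGE) instead derive inputs from path constraints, which comes much closer to constructing attacks directly from a description of the ambiguity.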

