High-Assurance Base64

Author: David Lazar

Galois' mission is improving the trustworthiness of critical systems. Trustworthiness is an inherent property of a system, but we need to produce evidence of its trustworthiness in order for people to make informed decisions. The evidence and its presentation are a key part of what is often called an assurance case.

The kinds of evidence that make up an assurance case can differ, depending on how critical the system is. The more critical the system, the more thorough and convincing the assurance case should be. Formal methods (mathematical proofs of software correctness) are often called for when evaluating the most critical systems. A formal-methods approach to assurance produces a strong argument about a model of the system. In contrast, testing produces a weak argument about the system itself. The argument is weak because, for any non-trivial component, testing exercises only a minuscule fraction of the possible input space, whereas a formal proof says that some property holds for all inputs to a model of the system. Phrased this way, it's clear that testing and formal proof provide complementary evidence of a system's correctness.

In this post we describe a fast, formally verified C implementation of Base64 encoding. The 10,000-foot description of the proof approach is:

  • choose the system,
  • create a formal specification of the system,
  • create a model of the implementation, and
  • prove that the implementation meets the specification.

The code and proof are available here:

The following sections describe the different parts of this repository.

The C Code

The C code in base64encode.c is a fast implementation of Base64 encoding. It is based on libb64, which uses coroutines to achieve speed. The nontrivial control flow of this code makes proving it correct more challenging. This is the artifact we want to develop assurance evidence for.
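To give a feel for why coroutine-style control flow is harder to reason about, here is a minimal Python sketch of a resumable encoder. It is not the code in base64encode.c or libb64; the names byte_encoder and encode_chunks are hypothetical, and a Python generator stands in for the C coroutine. The point is that the encoder's state (which byte of a 3-byte group it is waiting for) lives in the position of the suspended code rather than in an explicit variable.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def byte_encoder():
    """Coroutine: send() one byte (an int) at a time and receive the
    characters that byte completes. Send None to flush the padded tail."""
    out = ""
    while True:
        # Step A: waiting for the first byte of a 3-byte group.
        b = yield out
        if b is None:
            yield ""  # nothing pending, so no padding is needed
            return
        out = ALPHABET[b >> 2]
        carry = (b & 0x03) << 4
        # Step B: waiting for the second byte.
        b = yield out
        if b is None:
            yield ALPHABET[carry] + "=="
            return
        out = ALPHABET[carry | (b >> 4)]
        carry = (b & 0x0F) << 2
        # Step C: waiting for the third byte, which completes two characters.
        b = yield out
        if b is None:
            yield ALPHABET[carry] + "="
            return
        out = ALPHABET[carry | (b >> 6)] + ALPHABET[b & 0x3F]


def encode_chunks(chunks):
    """Feed an iterable of bytes objects through one encoder instance."""
    enc = byte_encoder()
    next(enc)  # prime the generator so it is waiting at step A
    pieces = []
    for chunk in chunks:
        for b in chunk:
            pieces.append(enc.send(b))
    pieces.append(enc.send(None))  # flush
    return "".join(pieces)


if __name__ == "__main__":
    # Chunk boundaries fall in the middle of 3-byte groups on purpose.
    print(encode_chunks([b"16 ch", b"aracters", b"..."]))
```

Because the chunk boundaries can fall anywhere, the encoder has to resume in the middle of a group, which is the kind of control flow a proof of the real C code must account for.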

Typing "make" at the top of our repository builds the b64enc tool, a simple frontend to the C code. Here is a quick timing comparison between our code and the base64 utility that is part of GNU Coreutils:

$ time b64enc 500MB.random > x.b64
real: 1.197s  user: 0.61s  cpu: 0.48s

$ time base64 -w0 500MB.random > y.b64
real: 2.615s  user: 1.81s  cpu: 0.62s

$ diff -w x.b64 y.b64
$

The Cryptol Specification

The Cryptol specification in base64.cry is based on RFC 4648. Ignoring padding and assuming Cryptol is in big-endian mode, the encoding routine is just two lines:

encode : {a c} (fin a, c == (8 * a + 5) / 6, 6 * c >= 8 * a) =>
        [a][8] -> [c][8];
encode xs = [| alphabet @ x || x <- groupBy(6, join xs # zero) |];

The corresponding decode function is similarly short. The b64encode and b64decode functions are wrappers around encode and decode that handle padding.
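For readers who do not know Cryptol, the same idea can be sketched in Python. This is a rough executable model, not the contents of base64.cry: the function names mirror the ones mentioned above, but the bodies are our own reading of the two-line specification (join the bytes into one bit string, zero-extend it to a multiple of 6 bits, split it into 6-bit groups, and index the alphabet).

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode(xs: bytes) -> str:
    """Unpadded Base64, modelled on `groupBy(6, join xs # zero)`."""
    # join xs: concatenate the bytes, most significant bit first
    bits = "".join(f"{b:08b}" for b in xs)
    # "# zero": append zero bits until the length is a multiple of 6
    bits += "0" * (-len(bits) % 6)
    # groupBy 6, then look each group up in the alphabet
    return "".join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))


def b64encode(xs: bytes) -> str:
    """Wrapper that pads the output with '=' to a multiple of 4 characters."""
    body = encode(xs)
    return body + "=" * (-len(body) % 4)


if __name__ == "__main__":
    print(encode(b"f"))       # unpadded
    print(b64encode(b"f"))    # padded
    print(b64encode(b"16 characters..."))
```

For an input of a bytes, encode returns (8 * a + 5) / 6 characters (integer division), which is the constraint on c in the Cryptol type signature.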

Our specification provides the following correctness theorem, polymorphic over all lengths of x:

theorem b64encodeLeftInv: {x}. b64decode (b64encode x) == x;

Cryptol can't prove polymorphic theorems on its own, so we must monomorphize the theorem to prove it:

base64> :prove (b64encodeLeftInv : [16][8] -> Bit)

base64> :prove (b64encodeLeftInv : [200][8] -> Bit)
Q.E.D.
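To illustrate what fixing the length buys us, here is a small Python sketch of the same left-inverse property, stated against Python's standard base64 module rather than the Cryptol specification (an assumption made only so the example is self-contained). The helper names are hypothetical. Unlike :prove, this is testing: it is exhaustive only for very short inputs and falls back to random sampling for the lengths used above.

```python
import base64
import itertools
import random


def left_inverse(x: bytes) -> bool:
    """The property from the theorem: decoding an encoding returns the input."""
    return base64.b64decode(base64.b64encode(x)) == x


def check_exhaustive(n: int) -> bool:
    """Try every one of the 256**n inputs of length n (feasible only for tiny n)."""
    return all(left_inverse(bytes(t)) for t in itertools.product(range(256), repeat=n))


def check_random(n: int, trials: int = 1000, seed: int = 0) -> bool:
    """Sample `trials` random inputs of length n."""
    rng = random.Random(seed)
    return all(
        left_inverse(bytes(rng.randrange(256) for _ in range(n)))
        for _ in range(trials)
    )


if __name__ == "__main__":
    print(check_exhaustive(1), check_exhaustive(2))
    print(check_random(16), check_random(200))
```

Each call fixes one length n, just as each :prove above fixes one instance of the polymorphic theorem.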

The Proof

The proof/ subdirectory contains infrastructure for proving the C code correct. The Makefile in this directory orchestrates the proof. Typing make n=16 generates and proves a theorem that says the base64_encode function from the C code (as compiled by Clang) is equivalent to the b64encode function in the Cryptol specification for the given input length n:

$ make n=16
Proving function equivalent to reference:
encode_aig : [16][8] -> [24][8]

Let's see what the Makefile is doing behind the scenes.

The C code in sym_encode.c is a wrapper around our C code that passes a symbolic value to the base64_encode function. This code is compiled by Clang to LLVM bytecode.

The LLVM Symbolic Simulator (a prototype tool currently under development by Galois) is used to extract a formal model encode_aig of the LLVM bytecode. The model can be loaded into Cryptol and used like any other function:

proof> :t encode_aig
encode_aig : [16][8] -> [24][8]

proof> encode_aig "16 characters..."
"MTYgY2hhcmFjdGVycy4uLg=="

In particular, we can write a theorem about this function:

theorem MatchesRef : {x}. encode_aig x == b64encode x;

...and then prove it:

proof> :prove MatchesRef

Success! Amazingly, this proof system scales to large values of n where exhaustive checking is not feasible:

$ time make n=1000
Proving function equivalent to reference:
encode_aig : [1000][8] -> [1336][8]
real: 17.882s  user: 16.31s  cpu: 1.50s
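A quick back-of-the-envelope check of the sizes in these transcripts, written as a small Python sketch (the function names are ours, not part of the repository): padded Base64 emits 4 characters for every 3 input bytes, rounding up, and an n-byte input can take 256**n distinct values.

```python
def padded_len(n: int) -> int:
    """Length of the padded Base64 encoding of n bytes."""
    return 4 * ((n + 2) // 3)


def unpadded_len(a: int) -> int:
    """Length before padding, matching c == (8 * a + 5) / 6 in the spec."""
    return (8 * a + 5) // 6


def input_space(n: int) -> int:
    """Number of distinct n-byte inputs an exhaustive check would have to try."""
    return 256 ** n


if __name__ == "__main__":
    for n in (16, 1000):
        print(n, padded_len(n), input_space(n).bit_length() - 1)
```

For n=16 and n=1000 this gives output lengths of 24 and 1336, matching the types printed by make, while the input space for n=1000 has 2^8000 elements.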


The success of the proof gives us high confidence that our C code is correct.

To reiterate our steps:

  • choose the system — we chose Base64 encode,
  • create a formal specification of the system — we did this with a few lines of Cryptol,
  • create a model of the implementation — we used our LLVM Symbolic Simulator to generate the model from the C code, and
  • prove that the implementation meets the specification — to do this, the Cryptol tool calls out to a SAT solver to prove the AIGs are equivalent.

The weak link in many applications of formal methods to software correctness is the model-to-code correspondence. The approach we took above addresses that weakness by automatically generating the model via symbolic simulation of a low-level representation of the program, in this case the LLVM bytecode. This approach would miss bugs or malware in the path from LLVM to executable, but remains a compelling argument for the correctness of the C code. In critical applications, it makes sense to include the compiler in the scope of an overall assurance case.


Tech Talk: New Directions in Random Testing, From Mars Rovers to JavaScript Engines

Galois is pleased to host the following tech talk. These talks are open to the interested public--please join us! (There is no need to pre-register for the talk.)

This talk is on Thursday.

title: New Directions in Random Testing: from Mars Rovers to JavaScript Engines
speaker: Alex Groce
time: Thursday, 12 September 2013, 10:30am
location: Galois Inc.
421 SW 6th Ave. Suite 300,
Portland, OR, USA
(3rd floor of the Commonwealth building)

abstract: One of the most effective ways to test complex language implementations, file systems, and other critical systems software is random test generation. This talk will cover a number of recent results that show how, despite the importance of hand-tooled random test generators for complex testing targets, there are methods that can be easily applied in almost any setting to greatly improve the effectiveness of random testing. Surprisingly, giving up on potentially finding any bug with every test makes it possible to find more bugs overall. The practical problem of finding distinct bugs in a large set of randomly generated tests, where the frequency of some bugs may be orders of magnitude higher than other bugs, is also open to non ad-hoc methods.

bio: Alex Groce received his PhD in Computer Science from Carnegie Mellon University in 2005, and B.S. degrees in Computer Science and Multidisciplinary Studies (with a focus on English literature) from North Carolina State University in 1999. Before joining the Oregon State University faculty in 2009, he was a core member of the Laboratory for Reliable Software at NASA’s Jet Propulsion Laboratory, and taught classes on Software Testing at the California Institute of Technology. His activities at JPL included a role as lead developer and designer for test automation for the Mars Science Laboratory Curiosity mission's internal flight software test team, and lead roles in testing file systems for space missions. His research interests are in software engineering, particularly testing, model checking, code analysis, debugging, and error explanation and fault localization.


Tech Talk: Type-directed compilation in the wild: Haskell and Core

Galois is pleased to host the following tech talk. These talks are open to the interested public--please join us! (There is no need to pre-register for the talk.)

Please note the unusual day for this talk; it is on Monday.

title: Type-directed compilation in the wild: Haskell and Core
speaker: Simon Peyton Jones
time: Monday, 29 July 2013, 10:30am
location: Galois Inc.
421 SW 6th Ave. Suite 300,
Portland, OR, USA
(3rd floor of the Commonwealth building)

abstract: Academic papers often describe typed calculi, but it is rare to find one in a production compiler. Indeed, I think the Glasgow Haskell Compiler (GHC) may be the only production compiler in the world that really has a remorselessly statically-typed intermediate language, informally called "Core", or (when writing academic papers) the more respectable-sounding "System FC".

As real compilers go, GHC's Core language is tiny: it is a slight extension of System F, with letrec, data types, and case expressions. Yet all of Haskell (now a bit of a monster) gets translated into it. In the last few years we have added one new feature to Core, namely typed (but erasable) coercions that witness type equalities, which turn Core into a particular kind of proof-carrying code. This single addition has opened the door to a range of source-language extensions, such as GADTs and type families.

In this talk I'll describe Core, and how it has affected GHC's development over the last two decades, concentrating particularly on recent developments, coercions, evidence, and type families.

To test your mettle I hope to end up with the problem we are currently wrestling with: proving consistency of a non-terminating rewrite system with non-left-linear rules.

bio: Simon Peyton Jones, MA, MBCS, CEng, graduated from Trinity College Cambridge in 1980. After two years in industry, he spent seven years as a lecturer at University College London, and nine years as a professor at Glasgow University, before moving to Microsoft Research (Cambridge) in 1998.

His main research interest is in functional programming languages, their implementation, and their application. He has led a succession of research projects focused around the design and implementation of production-quality functional-language systems for both uniprocessors and parallel machines. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, he is interested in language design, rich type systems, software component architectures, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct application of principled theory to practical language design and implementation -- that's one reason he loves functional programming so much. His home page is at


Tech Talk: SMACCMPilot: flying quadcopters using new techniques for embedded programming

Galois is pleased to host the following tech talk. These talks are open to the interested public--please join us! (There is no need to pre-register for the talk.)

title: SMACCMPilot: flying quadcopters using new techniques for embedded programming
speaker: Pat Hickey
time: Tuesday, 02 July 2013, 10:30am
location: Galois Inc.
421 SW 6th Ave. Suite 300,
Portland, OR, USA
(3rd floor of the Commonwealth building)

abstract: At Galois, we're building critical flight control software using new software methods for embedded systems programming. We will show how we used new domain-specific languages which permit low-level hardware manipulation while still providing guarantees of type and memory safety. The flagship application for these new languages is called SMACCMPilot, a clean slate design of quadcopter flight control software built on open-source hardware. This talk will introduce our new software methods and show how we built SMACCMPilot to be high assurance without sacrificing programmer productivity.

bio: Since 2011, Pat Hickey has been a Member of Technical Staff at Galois, Inc. He has worked as an engineer on the Autonomous Systems Hardening (ASH) project funded by the Air Force Research Lab (AFRL), and the High-Assurance Cyber Military Systems (HACMS) project funded by DARPA. Mr. Hickey has served as a core developer of the ArduPilot open source project since 2011. He has a BS in Electrical Engineering from Rutgers University.


SIMON and SPECK in Cryptol

Last week, the NSA published two families of lightweight block ciphers, SIMON and SPECK:

We've formally specified both ciphers in Cryptol:

The following sections explore some applications of our specifications.


SIMON and SPECK are cipher families: each algorithm in the family offers different security and performance based on parameters such as block size, key size, and number of rounds. In Cryptol, we can use type variables to represent these parameters. For example, consider the type signature of SIMON's encrypt function:

encrypt : {n m T j} (...) => [m][n] -> ([n], [n]) -> ([n], [n]);

Each type variable corresponds to a parameter in the cipher:

  • n is the word size (the block size is 2n)
  • m is the number of key words
  • T is the number of rounds
  • j specifies which round constant to use

The encrypt function can be instantiated with almost any values for its parameters (subject to some constraints not shown here). This gives us an elegant way to construct the variants of SIMON:

Simon32_64  = encrypt `{n=16, m=4, T=32, j=0};
Simon48_72  = encrypt `{n=24, m=3, T=36, j=0};
Simon48_96  = encrypt `{n=24, m=4, T=36, j=1};
Simon64_96  = encrypt `{n=32, m=3, T=42, j=2};
Simon64_128 = encrypt `{n=32, m=4, T=44, j=3};

We can also experiment with stronger or weaker variants of the cipher:

simon> :let weakSimon = encrypt `{n=8, m=4, T=16, j=0}
simon> :type weakSimon
weakSimon : [4][8] -> ([8],[8]) -> ([8],[8])
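The relationship between the type parameters and the variant names can be sketched in a few lines of Python. This models only the parameters, not the cipher; the class SimonParams and its properties are hypothetical names introduced for illustration. The block is two n-bit words and the key is m words of n bits, which is where the two numbers in each variant name come from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimonParams:
    n: int  # word size in bits
    m: int  # number of key words
    T: int  # number of rounds
    j: int  # which round constant sequence to use

    @property
    def block_bits(self) -> int:
        return 2 * self.n

    @property
    def key_bits(self) -> int:
        return self.m * self.n

    @property
    def name(self) -> str:
        return f"Simon{self.block_bits}_{self.key_bits}"


# The five instantiations listed above, plus the weakened experiment.
VARIANTS = [
    SimonParams(n=16, m=4, T=32, j=0),
    SimonParams(n=24, m=3, T=36, j=0),
    SimonParams(n=24, m=4, T=36, j=1),
    SimonParams(n=32, m=3, T=42, j=2),
    SimonParams(n=32, m=4, T=44, j=3),
]
WEAK = SimonParams(n=8, m=4, T=16, j=0)

if __name__ == "__main__":
    for p in VARIANTS + [WEAK]:
        print(p.name, p.block_bits, p.key_bits, p.T)
```

The weakened variant works out to a 16-bit block and a 32-bit key, consistent with the type Cryptol reports for weakSimon.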


An important property of block ciphers is that decryption is the inverse operation of encryption. We can state this property in Cryptol for the Speck64_96 encryption function (and its corresponding Speck64_96' decryption function) as follows:

theorem correctSpeck64_96: {k b}. Speck64_96' k (Speck64_96 k b) == b;

The theorem is universally quantified over all keys k and all blocks b.

We can use Cryptol's :check command to test whether the theorem holds for several randomly generated inputs:

speck> :check correctSpeck64_96
Checking case 1000 of 1000 (100.00%) 
1000 tests passed OK
[Coverage: 0.00%. (1000/1461501637330902918203684832716283019655932542976)]

However, this covered only a vanishing fraction of the input space: 1,000 of the 2^160 possible combinations of a 96-bit key and a 64-bit block. Instead, we can use Cryptol's :prove command to show the theorem holds for every possible input:

speck> :prove correctSpeck64_96

Cryptol proves the theorem by asking a SAT solver whether the negation of the theorem is satisfiable. If the negation is satisfiable, Cryptol returns the satisfying assignment as a counterexample to the theorem. In this case, the SAT solver says the negation is unsatisfiable so the theorem holds. The proof completes in less than 30 seconds!
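For comparison, the random-testing side of this story can be sketched in Python. The code below is our own sketch of a SPECK-style cipher, not the Cryptol specification: the round function, key schedule, rotation amounts (8 and 3), and the Speck64/96 parameters (32-bit words, 3 key words, 26 rounds) are assumptions based on our reading of the published design, and the sketch has not been validated against official test vectors. It only demonstrates the shape of the property: decryption undoes encryption.

```python
import random


def make_speck(n, m, T, alpha=8, beta=3):
    """Build (encrypt, decrypt) for n-bit words, m key words, T rounds."""
    mask = (1 << n) - 1

    def ror(x, r):
        return ((x >> r) | (x << (n - r))) & mask

    def rol(x, r):
        return ((x << r) | (x >> (n - r))) & mask

    def round_keys(key):
        # key = (l[m-2], ..., l[0], k[0]); expand it into T round keys
        *ls, k = key
        ls = list(reversed(ls))  # now ls[i] is l[i]
        keys = [k]
        for i in range(T - 1):
            new_l = ((k + ror(ls[i], alpha)) & mask) ^ i
            ls.append(new_l)
            k = rol(k, beta) ^ new_l
            keys.append(k)
        return keys

    def encrypt(key, block):
        x, y = block
        for rk in round_keys(key):
            x = ((ror(x, alpha) + y) & mask) ^ rk
            y = rol(y, beta) ^ x
        return x, y

    def decrypt(key, block):
        x, y = block
        for rk in reversed(round_keys(key)):
            # undo the two steps of a round in reverse order
            y = ror(y ^ x, beta)
            x = rol(((x ^ rk) - y) & mask, alpha)
        return x, y

    return encrypt, decrypt


def check_round_trip(encrypt, decrypt, n, m, trials=1000, seed=0):
    """Like :check: return a counterexample (key, block), or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        key = tuple(rng.getrandbits(n) for _ in range(m))
        block = (rng.getrandbits(n), rng.getrandbits(n))
        if decrypt(key, encrypt(key, block)) != block:
            return key, block
    return None


# Assumed Speck64/96 parameters; key plus block is 96 + 64 = 160 bits.
N, M, ROUNDS = 32, 3, 26
INPUT_SPACE = 2 ** (N * M + 2 * N)

if __name__ == "__main__":
    enc, dec = make_speck(N, M, ROUNDS)
    print(check_round_trip(enc, dec, N, M))
    print(f"sampled 1000 of {INPUT_SPACE} inputs")
```

A thousand passing samples say nothing about the remaining inputs, which is the gap the SAT-based proof closes: the solver searches for a counterexample over the whole space and reports that none exists.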
