ZUC is a stream cipher that is proposed for inclusion in the "4G" mobile standard named LTE (Long Term Evolution), the future of secure href="http://en.wikipedia.org/wiki/4G">GSM. The proposal is actually comprised several different algorithms:
- A stream cipher named ZUC,
- LTE encryption algorithm (128-EEA3),
based on ZUC,
- LTE integrity algorithm (128-EIA3),
which is a hash function using ZUC as its
core.
In this post, we will develop a Cryptol implementation of class="caps">ZUC, providing executable reference code. In addition, we will also discuss some of the security concerns raised for version 1.4 of the algorithm: We will use Cryptol to automatically identify an IV collision vulnerability. This weakness is addressed in version 1.5 of the algorithm: We will also prove that the proposed fix is indeed valid, again using Cryptol's formal verification tools.
While we do assume some familiarity with ZUC and Cryptol, this article should shed some light into how Cryptol can be used in the cryptographic design and evaluation processes even if you skip all the ZUC specific details.
Our implementation follows the latest href="http://www.gsmworld.com/documents/EEA3_EIA3_ZUC_v1_5.pdf">specification
of ZUC, which is at version 1.5 at the time of writing.
Addition in GF(231-1)
One of the core operations in ZUC is addition of two 31-bit numbers, say a and b. The specification states that this operation can be done as follows:
- Compute v = a+b
- If the carry bit is 1, set v=v+1
We transliterate this algorithm to Cryptol straightforwardly:
plus : ([31], [31]) -> [31]; plus (a, b) = if sab ! 0 then sab'+1 else sab' where { sab : [32]; sab = (a # zero) + (b # zero); sab' : [31]; sab' = take(31, sab); };
Note how we detect overflow by doing the 32-bit addition and checking the final bit. We can generalize plus to style="font-family: monospace;">add, which will add any sequence of numbers in GF(231-1) using plus repetitively:
add : {a} (fin a) => [a][31] -> [31]; add xs = sums ! 0 where sums = [0] # [| plus (s, x) || s <- sums || x <- xs |];
We simply generate the partial sums, and return the final sum by selecting the last element.
A digression on plus
As an interesting aside, the ZUC specification calls the plus operation we have defined
above addition modulo 231-1, which does not seem to be the traditional modular addition. In particular, you can prove that style="font-family: monospace;">plus will never evaluate to 0 unless its arguments are both 0,
which is not quite how modular addition typically behaves. We can prove this claim using Cryptol's satisfiability checker:
ZUC> :sat (\(a, b) -> (a != 0) & (b != 0) & (plus(a, b) == 0))No variable assignment satisfies this function
It appears that the algorithm has been designed with modular addition in mind, with tweaks to avoid having it generate style="font-family: monospace;">0 values on purpose.
The LFSR
At the core of ZUC lies an class="caps">LFSR (linear feedback shift register), which comprises of 16 cells, each of which is precisely 31 bits wide. Cryptol's type-system has been designed to accurately capture such specifications:
type LFSR = [16][31];
Note that we do not have to resort to using 32-bit machine integers or some other machine mandated bit-size, freeing us to give a fully faithful implementation of the specification.
ZUC uses the LFSR in two different modes. In the initialization mode, it takes a 31-bit input u, and transforms the LFSR by performing the following computation, where sstyle="font-family: monospace;">i refers to the i-th cell of the LFSR:
215s15 + 217s13
+ 221s10
+ 220s4 + (1+28)
s0 mod (231 - 1)
ZUC adds this value to the input style="font-family: monospace;">u, and then "tweaks" the sum to
avoid a 0 result. The Cryptol code below implements the specification quite closely:
LFSRWithInitializationMode : ([31], LFSR) -> LFSR;LFSRWithInitializationMode (u, ss) = (ss @@ [1 .. 15]) # [s16] where { v = add [| s <<< c || s <- ss @@ [15 13 10 4 0 0] || c <- [15 17 21 20 8 0] |]; vu = add [v u] s16 = if vu == 0 then 0x7fffffff else vu;};
Note how we pick elements of the LFSR and the coefficients by a simple sequence comprehension.
In the work mode, there is no initializer value u, but otherwise the operation is very similar:
LFSRWithWorkMode : class="caps">LFSR -> LFSR;class="caps">LFSRWithWorkMode ss = (ss @@ [1 .. 15]) # [s16] where { v = add [| s <<< c || s <- ss @@ [15 13 10 4 0 0] || c <- [15 17 21 20 8 0] |]; s16 = if v == 0 then 0x7fffffff else v;};
We could have defined LFSRWithWorkMode in terms of LFSRWithInitializationMode by passing 0 for style="font-family: monospace;">u, but the above definition follows the specification much more closely, a desirable thing to do for a reference implementation. (Also, this version of style="font-family: monospace;">LFSRWithWorkMode is a bit faster for obvious reasons, saving us some cycles during execution.)
Bit Reorganization
The middle layer of ZUC takes the class="caps">LFSR and shuffles its contents as follows:
BitReorganization : LFSR -> [4][32];BitReorganization ss = [| y # x || (x, y) <- [(hi s15, lo s14) (lo s11, hi s9) (lo s7, hi s5) (lo s2, hi s0)] |] where { lo, hi : [31] -> [16]; lo x = x @@ [0 .. 15]; hi x = x @@ [15 .. 30]; [s0 s2 s5 s7 s9 s11 s14 s15] = ss @@ [0 2 5 7 9 11 14 15]; };
There isn't much to say about bit-reorganization, except to note that selecting high and low bytes of a 31-bit word comes out quite clean in Cryptol, thanks to bit-level addressability and compact selection operation @@. Note how ZUC defines the higher 16 bits of a 31 bit number by picking bits 15 through 30; which is just as natural in Cryptol to express as any other slice of a given word.
The nonlinear function F
Cryptol implementation of ZUC's style="font-family: monospace;">F function follows the specification almost literally:
F : ([3][32], [2][32]) -> ([32], [2][32]);F ([X0 X1 X2], [R1 R2]) = (W, [R1' R2']) where { W = (X0 ^ R1) + R2; W1 = R1 + X1; W2 = R2 ^ X2; [W1L W1H] = split W1; [W2L W2H] = split W2; R1' = S(L1(W2H # W1L)); R2' = S(L2(W1H # W2L)); };
Note that we keep track of the parameters style="font-family: monospace;">R1 and style="font-family: monospace;">R2 explicitly. Being a purely functional language, Cryptol does not have any notion of state, and hence does not support in-place updates. However, this does not mean inefficient execution: The purely
functional approach makes sure the specifications remain easy to develop and reason about, while backend tools (compilers, synthesizers, etc.) can transform the code to run efficiently on various targets appropriately, such as on class="caps">FPGA's or in software.
We will skip the details of ZUC's S-boxes and the functions L1 and class="caps">L2, but you can see their implementation in the attached full implementation.
Loading the Key
ZUC receives a 128 bit key and a 128-bit IV (initialization vector), to construct the initial starting
configuration of the LFSR. The following definition follows the specification (Section 3.5) literally:
LoadKey : ([128], [128]) -> LFSR;LoadKey (key, iv) = [| i # d # k || k <- ks || i <- is || d <- ds |] where { ks : [16][8]; ks = split key; is : [16][8]; is = split iv; ds : [16][15]; ds = [0b100010011010111 0b010011010111100 0b110001001101011 0b001001101011110 0b101011110001001 0b011010111100010 0b111000100110101 0b000100110101111 0b100110101111000 0b010111100010011 0b110101111000100 0b001101011110001 0b101111000100110 0b011110001001101 0b111100010011010 0b100011110101100]; };
Initializing ZUC
During the initialization stage, ZUC loads the key and the IV, and then repeatedly performs bit-reorganization, a run of F, and a run of class="caps">LFSR in initialization mode. This process is repeated 32 times. For purposes
that will become clear later, we represent this operation as a Cryptol stream function that returns an infinite sequence of ZUC configurations:
type ZUC = (LFSR, [32], [32]);InitializeZUC : ([128], [128]) -> [inf]ZUC;InitializeZUC (key, iv) = outs where { initLFSR = LoadKey (key, iv); outs = [(initLFSR, 0, 0)] # [| step out || out <- outs |]; step (lfsr, R1, R2) = (LFSRWithInitializationMode(take(31, w >> 1), lfsr), R1', R2') where { [X0 X1 X2 _] = BitReorganization(lfsr); (w, [R1' R2']) = F ([X0 X1 X2], [R1 R2]); }; };
Executing ZUC
We need two more pieces of functionality. In the so called style="font-style: italic;">working stage, ZUC runs
bit-reorganization, F, and LFSR in work mode,
discarding the
result of the call to F:
WorkingStage : ZUC -> ZUC;WorkingStage (lfsr, R1, R2) = (lfsr', R1', R2') where { [X0 X1 X2 _] = BitReorganization(lfsr); (_, [R1' R2']) = F ([X0 X1 X2], [R1 R2]); lfsr' = LFSRWithWorkMode(lfsr); };
Cryptol's pattern-matching based definitions come out quite nicely
in picking (and ignoring) results of operations.
In the production stage,
ZUC transforms works just like in the working stage, except the
result of the call to F is returned as the next 32-bit key, after
XOR'ing with the last word
of what bit-reorganization returns. Again, the Cryptol code is
straightforward:
ProductionStage : ZUC -> ([32], ZUC);ProductionStage (lfsr, R1, R2) = (w ^ X3, (lfsr', R1', R2')) where { [X0 X1 X2 X3] = BitReorganization(lfsr); (w, [R1' R2']) = F ([X0 X1 X2], [R1 R2]); lfsr' = LFSRWithWorkMode(lfsr); };
The ZUC API
We can finally give the ZUC API. Given a key and an IV, ZUC initializes itself, and then keeps calling ProductionStage to generate successive sequences of 32-bit words as key expansion. The result is most naturally captured in Cryptol as an infinite sequence of 32-bit words:
ZUC : ([128], [128]) -> [inf][32];ZUC (key, iv) = tail [| w || (w, _) <- zucs |] where { initZuc = WorkingStage(InitializeZUC @ 32); zucs = [(zero, initZuc)] # [| ProductionStage zuc || (_, zuc) <- zucs |]; };
Encryption can now be done by taking the plaintext and XOR'ing with the successive words that come out of the above function: Clearly, decryption is the same as encryption, and the fact that they are inverses follows trivially from the fact that it's a mere XOR operation.
And this completes our development of ZUC in Cryptol. You can see the entire code here.
Testing
One thing about crypto-algorithm development is that it is hard to convince oneself that the algorithm is
coded correctly. ZUC is no exception. Hopefully, Cryptol makes that task easier by abstracting away from
many of the machine-specific details, providing a language that allows one to express idioms that appear
in cryptography quite concisely, thus removing a whole class of bugs that has nothing to do with the algorithm itself but rather with how it has to be implemented.
The other aspect of Cryptol is that the specification, as high level as it is, remains executable. So, we can use our implementation and test it against the published test-vectors for ZUC. Here's the first example from the test href="http://www.gsmworld.com/documents/EEA3_EIA3_Test_Data_v1_1.pdf">document
(Section 3.3):
ZUC> take(2, ZUC(0, 0))[0x27bede74 0x018082da]
(The full implementation has further test vectors from the specification, see the theorem style="font-family: monospace;">ZUC_TestVectors.)class="Apple-style-span"
style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Times; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">
Note that since our implementation of ZUC returns an infinite sequence, we have used the "take" function to just look at the first two outputs. Naturally, we can pull out as many outputs as we would like from that infinite stream.
Security of ZUC
One essential activity in crypto-algorithm design is to provide rigorous security arguments. While the current state-of-the-art in developing such arguments relies largely
on human intelligence, we can use tools to attest our findings. In this section we will see how to use Cryptol to mechanically demonstrate an IV collision vulnerability found in version 1.4 of the ZUC specification,
and how the modifications in version 1.5 addressed the problem.
An IV collision occurs if two different IV's cause the algorithm to initialize itself to precisely the same state, thus losing entropy. Cryptographers seriouslyworryabout
such
losses
of entropy as they can lead to efficient attacks by cutting down the
search space significantly. In a recent conference, Wu et al.,
href="http://www.spms.ntu.edu.sg/Asiacrypt2010/Common/rumpsession.html">demonstrated
one such vulnerability in ZUC
1.4. In that version of the specification, ZUC
had a
slightly different initialization sequence. First, instead of
addition in GF(231-1), it performed a simple class="caps">XOR when LFSR is
used
in
the
initialization
mode. Second, version 1.4 XOR'd the last
byte from
the bit-reorganization during initialization with the result of the
call to the nonlinear function F. (You can see the precise differences
between versions 1.4
and 1.5 in the attached Cryptol href="http://www.galois.com/%7Elerkok/programs/ZUC.cry">code,
search for the occurrences of the variable style="font-family: monospace;">version1_5, which
distinguishes
between the two.)
As demonstrated by Wu et al., the 1.4 version of the algorithm
suffers from IV collision: That is, two different IV's
can
result
in
the
precise
same
ZUC state, causing loss
of entropy. It is easy to express this property as a Cryptol theorem:
theorem ZUC_isResistantToCollisionAttack: {k iv1 iv2}. if(iv1 != iv2) then InitializeZUC(k, iv1) @ 1 != InitializeZUC(k, iv2) @ 1 else True;
Let's spend a moment on what the above theorem is stating. It says
that for all values of k,
iv1, and style="font-family: monospace;">iv2, the initial state of class="caps">ZUC will be different so long as style="font-family: monospace;">iv1 and
iv2 are not the same. If
this theorem
holds of our algorithm, it would mean that
there is no entropy loss due to IV collision. (We also now see why we
chose
InitializeZUC to return an
infinite stream: This way we can simply look at the result of the first
step, which will create a
simpler verification problem for the backend SAT/SMT
solver
used
by
Cryptol.)
Here is Cryptol's response when we tell it to prove the above
theorem, using version 1.4 of the specification:
ZUC> :set sbvclass="caps">ZUC> :prove ZUC_isResistantToCollisionAttackFalsifiable.class="caps">ZUC_isResistantToCollisionAttack(0x6bff61ffff8fcdffffffc996ffffff1a, 0xff08fc0085e000000a0000f5008f000a, 0xff08fc0085e000000a0000f5008f008a) = False
The first command tells Cryptol to switch to the SBV
mode, which allows for formal proofs. In the second command, we
asked Cryptol to prove that the theorem holds of our implementation.
Not only Cryptol told us
that the theorem we have stated is false, it also provided a concrete
counterexample! (Note that those two IV values are different, check the
second to last digit!)
Let's verify that the vulnerability indeed does exist with the style="font-family: monospace;">iv values Cryptol gave us:
ZUC> take(2,ZUC(0x6bff61ffff8fcdffffffc996ffffff1a,0xff08fc0085e000000a0000f5008f000a))[0xa415abbe 0x673f1eb9]ZUC> take(2,ZUC(0x6bff61ffff8fcdffffffc996ffffff1a,0xff08fc0085e000000a0000f5008f008a))[0xa415abbe 0x673f1eb9]
Voila! We have two different IV's, yet we
get precisely the same key-sequence. Cryptol can attest to what Wu et
al. showed, providing a concrete counterexample to wit.
In response to this vulnerability, the ZUC algorithm
was
slightly
tweaked
to
remove
the
possibility
of
collision.
(Again,
you
can
look
at
the
attached
Cryptol
code
to
see what those
changes were, search for the word version1_5.)
Is the vulnerability really removed? While mathematicians will have
their own tools to claim as such, we can use Cryptol to verify that the
fix indeed does work (and
is correctly implemented) in our version as well. With
version 1.5 of the spec, we have:
ZUC> :prove class="caps">ZUC_isResistantToCollisionAttackclass="caps">Q.E.D.
(The above proof takes a couple of seconds to complete on my
laptop.)
It is important to note that the above theorem does not
prove that there are no IV collisions in ZUC
1.5. This is because we've only proved the theorem after the first run
of the
InitializeZUC
routine. Recall
that the actual implementation actually runs that operation 32 times.
While we can express the full theorem in
Cryptol as well, it generates quite a large verification problem, and
the SAT solver running on my laptop is not
quite
powerful enough to tackle it. (The proof might indeed be feasible on a
more decent machine with enough RAM.
One can also construct an
argument that the initialization step is injective for all possible
LFSR configurations
that LoadKey will
produce,
thus completing the proof in two steps. We leave that
as an exercise for the interested reader!) In any case, our proof above
shows that ZUC version
1.5 is at least free of "easy to find" IV collision attacks.
Summary
Designing cryptographic algorithms requires a deep understanding of
the underlying science of cryptography, and a fair amount of the
mathematics thus involved. Implementing such algorithms
need not! We believe that the Cryptol toolset provides the right idioms
and the tools to simplify cryptographic algorithm
implementations and evaluations, abstracting away from machine specific
details and platform specific concerns. Specifications remain
pure, and hence easier to reason about and communicate. The executable
nature of Cryptol also makes it
easy to just play around with your implementations, without worrying
about myriads of implementation specific concerns.
(Compare how you would do a similar study of ZUC if
you
had
to
use
C
or
Java; how much of your time would be spent on the actual
algorithm, and how much on "everything else.") Once the algorithm is
developed, the
compilation and synthesis tools of Cryptol can help in creating
artifacts that can be
deployed in software or in an FPGA. By
separating the concerns of specification from implementation, Cryptol
provides new
means of simplifying crypto-algorithm development and evaluation.
The full ZUC implementation in Cryptol
can be downloaded href="http://www.galois.com/%7Elerkok/programs/ZUC.cry">here. Free
licenses for Cryptol are available at www.cryptol.net.