Skip to content

Commit

Permalink
Small modifications to mini-a3s writeup
Browse files Browse the repository at this point in the history
Signed-off-by: Julia ^-^ (good evening!)
  • Loading branch information
JuliaPoo committed Aug 19, 2021
1 parent 3bcc54c commit 5329a51
Show file tree
Hide file tree
Showing 2 changed files with 36 additions and 51 deletions.
80 changes: 36 additions & 44 deletions RARCTF-2021/Mini-A3S/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,31 +5,31 @@ The challenge is similar to the challenge [A3S](../A3S). We are given an impleme
1. Trits are used instead of bits. This means working in `GF(3)` instead of `GF(2)`.
2. The key expansion allows for keys larger than a round key. This means recovering the last round key isn't enough to recover the original key. This will become relevant later.

We are also given `3^9` plaintext and ciphertext pairs, and we have to recover the original key used in order to get the flag.
We are also given `3^9` plaintext and ciphertext pairs and we have to recover the original key used in order to get the flag.

## Metadata

So again I didn't solve this challenge during the CTF, but I really liked this challenge so here we are. This is by far the hardest CTF challenge I've solved so thanks for the great run!

This writeup is _long_, it took me a while to write everything down.
Also, this writeup is _long_, it took me a while to write everything down.

# Some A3S Background

As mentioned before, A3S is very similar to [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). A3S operates on a state of `27` trits, arranged as `3x3` trytes (`3` trits to a tryte, kinda like `8` bits to a byte). The plaintext is converted to this state, which then passes through some operations, and the final state is then converted to the ciphertext.

Just like AES, A3S has the operations for encryption `sub`, `shift_rows`, `mix_columns`, `add_round_key`. More details in the [challenge pdf](chal/help.pdf).
Just like AES, A3S has the operations for encryption: `substitute`, `shift_rows`, `mix_columns`, `add_round_key`. More details in the [challenge pdf](chal/help.pdf).

An important thing to note is that `shift_rows`, `mix_columns` and `add_round_key` are all linear operations on the state. In a secure implementation, the `sub` would be the crucial component ensuring the cipher isn't affine, which is Very Very Bad™. See my solution for the challenge [A3S](../A3S). The lookup table for this substitution is the all important `SBOX`.
An important thing to note is that `shift_rows`, `mix_columns` and `add_round_key` are all linear operations on the state. In a secure implementation, the `substitute` would be the crucial component ensuring the cipher isn't affine, which is Very Very Bad™. See my solution for the challenge [A3S](../A3S). The lookup table for this substitution is the all important `SBOX`.

The key expansion for A3S relies on `sub` as well to ensure the key expansion isn't affine. This will be important later.
The key expansion for A3S relies on `substitute` as well to ensure the key expansion isn't affine. This will be important later.

# The Vuln

From now on, assume we are working with `GF(3)` unless stated otherwise. This means if you were to see say `2+2`, take modulo 3 and have `2+2=1`.
From now on, assume we are working with `GF(3)` unless stated otherwise. This means if you were to see, say, `2+2`, take modulo 3 and we have `2+2=1`.

I shamefully didn't notice when solving [A3S](../A3S) but the author provided a hint in the [A3S source](chal/mini.py): A comment `Secure enough ig` right above the SBOX.
I shamefully didn't notice when solving [A3S](../A3S) but the author provided a hint in the [A3S source](chal/mini.py): A comment `Secure enough ig` right above the SBOX that's also present in the [A3S](../A3S) challenge.

But yes! Just like it's child challenge, the SBOX is vulnerable. This can be seen by plotting it's difference distribution table, which is plotting `a+b` over `SBOX[a] + SBOX[b]`.
So yes! Just like its child challenge, the SBOX is vulnerable. This can be seen by plotting its difference distribution table, which is plotting `a+b` over `SBOX[a] + SBOX[b]`.

```python
from mini import * # Import challenge code
Expand All @@ -53,7 +53,7 @@ plt.show()
<img src="rsrc/sbox_diff.png" alt="SBOX differential table">
</p>

This looks bad, as a good SBOX should have a pretty uniform differential table. This is a good indication that the SBOX is _really_ close to being affine. It's pretty straightforward to derive this affine approximation:
This looks bad, as a good SBOX should have a pretty uniform differential table. It's a good indication that the SBOX is _really_ close to being affine, and it's pretty straightforward to derive this affine approximation:

```python
def sbox_affine(i:tuple):
Expand All @@ -72,7 +72,7 @@ Plotting `SBOX_AFFINE`'s differntial table yields:
<img src="rsrc/sbox_affine_diff.png" alt="SBOX_AFFINE differential table">
</p>

Also `SBOX` and `SBOX_AFFINE` turn out to be _really_ similay:
Also `SBOX` and `SBOX_AFFINE` turn out to be _really_ similar:

```python
print(SBOX_AFFINE)
Expand All @@ -86,25 +86,25 @@ print(SBOX)

Remember how the `SBOX` is the all important component that ensures that the cipher isn't affine? Having `SBOX` behave like an affine transform _most_ of the time is pretty bad.

Now, being affine `93%` of the time is very unlikely to make a particular instance of A3S encryption affine. Since the SBOX is applied _many many times_ during encryption, there is bound to be once where the SBOX isn't affine.
Now, despite being affine `93%` of the time, it's very unlikely to make a particular instance of A3S encryption entirely affine. Since the SBOX is applied _many many times_ during encryption, there is bound to be once where the SBOX isn't affine, causing the whole encryption to not be affine.

So instead of saying that with this `SBOX`, the A3S approximates an affine transform, I believe it'll be way more helpful to know that small sections of an A3S encryption instance is gonna be affine. We don't know where and when these affine portions are, just that small parts of it is _pretty likely_ to be affine.
So instead of saying that with this `SBOX`, the A3S approximates an affine transform, it's a better description to say that small sections of an A3S encryption instance is gonna be affine. We don't know where and when these affine portions are, but we can be fairly confident that small parts of it is _likely_ to be affine.

# Attack Approach

Unlike in the previous challenge [A3S](../A3S), it's unlikely that there's a plaintext-ciphertext pair that's completely affine for us to exploit. However, we can exploit the _very probably_ affine-ness of small parts of the encryption.
Unlike in the previous challenge [A3S](../A3S), it's unlikely that there's a plaintext-ciphertext pair that's completely affine for us to exploit. However, we can exploit the _very probably_ affine-ness of small parts of the encryption.

Furthermore, we'd want to recover all the round keys, which are `3*8*9 = 216` `GF(3)` values. This means that we'd want to recover at least `216` constraints to solve for the round keys. Preferably, these constraints are linear to make solving for the round keys really easy.
Furthermore, we'd want to recover all the round keys, which are `3*8*9 = 216` `GF(3)` values. This means that we'd want to recover at least `216` constraints to solve for the round keys. Preferably, these constraints are linear in `GF(3)` to make solving for the round keys really easy.

There are 2 places to get these constraints. The obvious one is the key expansion algorithm, and the next is the plaintext-ciphertext pairs.
There are 2 places to get these constraints. The obvious one is the key expansion algorithm itself, and the next is the plaintext-ciphertext pairs.

From now on, `kr` will refer to the round keys, plaintext is `pt` and ciphertext is `ct`.

So here's the plan:

1. Create a completely symbolic affine version of the A3S (that's a close approximation)
2. Recover as much constraints on `kr` from the key expansion.
3. Correlate the output of this affine version with the known `pt` and `ct` pairs to form more constraints on `kr`
1. Create a symbolic, affine version of the A3S that's a close approximation to the original.
2. Recover as many constraints on `kr` from the key expansion.
3. Correlate the output of our affine A3S with the known `pt` and `ct` pairs to form more constraints on `kr`
4. With hopefully > 216 constraints, we can solve for `kr`.

Recovering all the round keys is equivalent to recovering the original key as the original key is simply the first few round keys.
Expand Down Expand Up @@ -168,13 +168,6 @@ def sbox_affine(i:tuple):
(1 + i[0]*0 + i[1]*2 + i[2]*2)
)

def unsbox_affine(i:tuple):
return (
(2 + i[0]*1 + i[1]*1 + i[2]*1),
(1 + i[0]*0 + i[1]*2 + i[2]*2),
(0 + i[0]*0 + i[1]*1 + i[2]*0)
)

def expand(tyt):
words = tyt_to_wrd(tyt)
size = len(words)
Expand Down Expand Up @@ -248,7 +241,7 @@ Now I can simply call, say, `mix(pt)` and get the symbolic representation of `mi
<img src="rsrc/symbolic_demo.png" alt="Symbolic model demo">
</p>

## Savaging ~168 linear constraints from the key expansion
## Scavaging ~168 linear constraints from the key expansion

Refering to the [.pdf](./chal/help.pdf) given:

Expand All @@ -273,9 +266,9 @@ for i in range(6,24):
xkey_const1.append(0)
```

The second line is a little trickier. Notice the `sub` operation? It's affine `93%` of the time, but not _all_ the time. During the key expansion of the key for this challenge, `SBOX` is used a total of 12 times. This means there's around `12 * 8% ~ 1` instance where the constraint isn't linear.
The second line is a little trickier. Notice the `Sub` operation? It's affine `93%` of the time, but not _all_ the time. During the key expansion of the key for this challenge, `SBOX` is used a total of 12 times. This means there's around `12 * 8% ~ 1` instance where the constraint isn't linear.

If we replace the `sub` operation with its affine counterpart, we'd generate around `36` constraints. However, we expect about one of the SBOX applications to differ from our affine approximation. Each time that happens, it invalidates `3` of our constraints. So we should expect about `33` valid linear constraints, we just don't know _which_ of the SBOX affine approximation is invalid. No matter, since there's only `12` SBOX applications, we can simply bruteforce which one wasn't valid if it exists.
If we replace the `sub` operation with its affine counterpart, we'd generate around `36` constraints. However, we expect about one of the SBOX applications to differ from our affine approximation. Each time that happens, it invalidates `3` of our constraints. So we should expect about `36-3 = 33` valid linear constraints, we just don't know _which_ of the SBOX affine approximation is invalid. No matter. Since there's only `12` SBOX applications, we can simply bruteforce which one wasn't valid if it exists.

Creating the constraints:

Expand Down Expand Up @@ -311,11 +304,11 @@ for k in xkey_eqns:
xkey_const2.append(s)
```

This gives a total of about `135 + 33 = 168` linear constraints on `kr`, still about `48` constraints short of the `216` goal.
This gives a total of about `135 + 33 = 168` linear constraints on `kr`. That's still about `48` constraints short of the `216` goal.

## Scavaging the known pt-ct pairs for 54 linear constraints.

This is where the bulk of my time was spent solving this challenge: Scraping for enough constraints to proceed.
This is where the bulk of my time was spent solving this challenge: Scraping for enough constraints from the pt-ct pairs.

We'll start by defining the areas of interest. These areas are where we'd correlate the output of our affine model and the actual A3S output to derive more constraints. I've found 2 areas of interests, each giving `27` constraints. There are more areas, but the areas I've chosen make implementation really easy:

Expand Down Expand Up @@ -345,7 +338,7 @@ def forward_to_checkpoint(ctt, keys, checkpoint):
return ctt
```

As you can see, I chose to stop at checkpoints `0` and `1`, where `0` is right before the last `sub` operations, and `1` is before the last `add_round_key` operation.
As you can see, I chose to stop at checkpoints `0` and `1`, where `0` is right before the last `substitute` operation, and `1` is before the last `add_round_key` operation.

I've also implemented a function that outputs matrices of interest at that checkpoint:

Expand Down Expand Up @@ -403,20 +396,20 @@ Here's the plan:

Why it works:

Using the affine model for step `1`, the difference will not be affected by `kr`. Let the 2 plaintexts be `pt1, pt2`:
Using the affine model for step `1`, the difference will not be affected by `kr`. To see this, let the 2 plaintexts be `pt1, pt2`:

```
step1(pt1, pt2) = ptmat*pt1 + ptconst + kptmat*kr - (ptmat*pt2 + ptconst + kptmat*kr)
= ptmat*(pt1 - pt2)
```

`ptmat` is a constant and hence step 1 output is independent of `kr`.
As `ptmat` is a constant, step 1 output is independent of `kr`.

Now, if for step `3`, we were to use the affine model to partially decrypt, and take the difference, we'd have the output of step `3` independent of `last_key`. But this partial decryption goes through SBOX too, and the SBOX of the original A3S isn't completely affine! This means that our choice of `last_key` does affect the output of step `3`! You can think of this as "leaking" information of the `last_key`.
Now, if for step `3`, we were to use the affine model to partially decrypt, and take the difference, we'd have the output of step `3` independent of `last_key`. However, since this partial decryption goes through SBOX, which isn't perfectly affine in the original A3S, our choice of `last_key` actually does affect the output of step `3`, albeit rarely (~8% of the time). You can think of this as "leaking" information of the `last_key` by virtue of the SBOX not being perfectly affine.

Now why should expect `step1(pt1, pt2) == step3(ct1, ct2)` to hold true at a higher probability if `last_key` is correctly guessed? That's because if `step1` is done with the original A3S instead of our affine model (with the correct `kr`), the equality should _always_ hold. For our affine model however, it should hold with a somewhat higher probability, but with the added benefit of **not being dependent on `kr`**. This means we can guess the value of `last_key` independently of the rest of `kr`. In addition, we can guess `last_key` tryte by tryte, requiring only `27*9` guesses per known `pt1-pt2, ct1-ct2` pair.
So why should we expect `step1(pt1, pt2) == step3(ct1, ct2)` to hold true at a higher probability if `last_key` is correctly guessed? If `step1` is done with the original A3S instead of our affine model (with the correct `kr`), the equality should _always_ hold. Hence, it's probable enough that with our affine approximation, this would hold as well, but with the important added benefit of **not being dependent on `kr`**. This means we can guess the value of `last_key` independently of the rest of `kr`. In addition, we can guess `last_key` tryte by tryte, requiring only `27*9` guesses per known `pt1-pt2, ct1-ct2` pair.

And since each tryte we try is independent of one another, each section of the A3S algorithm we are relying on to be affine is small, which means there's an _okay_ chance of our affine model behaving the same way as the original A3S.
Since each tryte we try is independent of one another, each section of the A3S algorithm we are relying on to be affine is small, which means there's an _okay_ chance of our affine model behaving the same way as the original A3S!

Here's the implementation:

Expand Down Expand Up @@ -497,9 +490,9 @@ plt.show()
<img src="rsrc/checkpoint0_scores.png" alt="Plot of scores of checkpoint 0 analysis">
</p>

The peaks are pretty defined, so we can be confident that the recovered `last_key` is correct. And here we have `27` constraints!
The peaks are pretty defined, so we can be confident that the recovered `last_key` is correct. Hence we have here `27` constraints!

A side note, in regular AES, recovering the last round key like we have here would be enough to recover the original key as the expansion algorithm is reversible. However, in A3S, to support arbituary length keys, this is no longer possible.
On a side note, in regular AES, recovering the last round key like we have here would be enough to recover the original key as the expansion algorithm is reversible. However, in A3S, to support arbituary length keys, this is no longer possible.

To see why, the original key in this challenge is `3*3*5 = 45` trits. However, we only know `27` trits of information! That's not nearly enough to know the full original key!

Expand All @@ -508,8 +501,8 @@ To see why, the original key in this challenge is `3*3*5 = 45` trits. However, w

We can recover `27` more constraints with checkpoint 1 with a similar, but way simpler, attack. The plan:

1. Partially encrypt all known `pt` to checkpoint 1 with our affine model with `kr` be all `0`s.
2. Assume our affine model output `A(pt)` is the same as original A3S and take `ct - A(pt)`
1. Partially encrypt all known `pt` to checkpoint 1 with our affine model with `kr` be all `0`s. Let the output of this step be `A(pt)`
2. Take `ct - A(pt)`
3. The most common value of each trit of `ct - A(pt)` will be equal to the corresponding trit of `M x kr` where `M` is a constant matrix.

And boom! Another `27` constraints!
Expand All @@ -526,12 +519,11 @@ A(pt) = ptmat*pt + ptconst + kptmat*0
At the same time, `ct` was encrypted with a none-zero `kr`! So:

```
ct = pt + last_key
= ptmat*pt + ptconst + kptmat*kr + last_key // Assuming our affine model holds
ct = ptmat*pt + ptconst + kptmat*kr + last_key // Assuming our affine model holds
= A(pt) + kptmat*kr + last_key
M x kr = kptmat*kr + last_key
= ct - A(pt)
kptmat*kr + last_key = M x kr
= ct - A(pt)
```

And there! We have `M x kr = <stuff we can compute>`. Of course this does not hold _all the time_. The affine model would hold _once in a while_ for certain trits in the equation. That's why we should take the most probable value for each trit in `ct - A(pt)`.
Expand Down
7 changes: 0 additions & 7 deletions RARCTF-2021/Mini-A3S/sol/solve.sage
Original file line number Diff line number Diff line change
Expand Up @@ -61,13 +61,6 @@ def sbox_affine(i:tuple):
(1 + i[0]*0 + i[1]*2 + i[2]*2)
)

def unsbox_affine(i:tuple):
return (
(2 + i[0]*1 + i[1]*1 + i[2]*1),
(1 + i[0]*0 + i[1]*2 + i[2]*2),
(0 + i[0]*0 + i[1]*1 + i[2]*0)
)

def expand(tyt):
words = tyt_to_wrd(tyt)
size = len(words)
Expand Down

0 comments on commit 5329a51

Please sign in to comment.