-
Notifications
You must be signed in to change notification settings - Fork 0
/
nsa_backdoor.jl
346 lines (283 loc) · 17.9 KB
/
nsa_backdoor.jl
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
### A Pluto.jl notebook ###
# v0.19.11
using Markdown
using InteractiveUtils
# ╔═╡ 0714979c-30ff-11ed-2173-330d20197137
begin
using Primes
using ProgressLogging
using DelimitedFiles
end
# ╔═╡ a28d8f4f-e0ce-43cc-978b-80a1d1d978a5
md"""
# NSA Backdoor
Last weekend I stumbled into NERD's rabbit hole more or less by accident (read [here](https://github.com/vobst/ctf-hireme) how I found my way out). This weekend I was deliberately searching for an excuse to hack all night long and ended up with the [NSA backdoor challenge](https://play.picoctf.org/practice/challenge/283?category=2&page=5) from this year's picoCTF. I thought: "If I'm not able to solve it in the end, it's at least a good opportunity to learn about some [RSA](https://en.wikipedia.org/wiki/RSA_(cryptosystem)) and Diffie-Hellman crypto related stuff.", and got to work.
## Overview of the Challenge
We are given two text files `gen.py` and `output.txt`. The former contains a Python script and the latter two big integers `N` and `c`. Skimming through the script, we can see that it:
- reads a string from a file `flag.txt` in the current working directory and interprets the bytestring of its ASCII representation as one big integer called `FLAG`, let's denote it by $x$,
- generates two random primes $p$ and $q$,
- calculates $N=pq$ and $c=3^x\mod{N}$, which it prints to the console.
Running it, after creating a dummy flag, yields an output similar, but not equal, to the one we are given. Thus, the challenge is clear: Compute the [discrete logarithm (DL)](https://en.wikipedia.org/wiki/Discrete_logarithm) of the given $c$ in the cyclic group $⟨3⟩⊂\mathbb{Z}_N^\times$ with respect to base $3$. Here, $\mathbb{Z}_N^\times$ is the [multiplicative group of integers modulo](https://en.wikipedia.org/wiki/Multiplicative_group_of_integers_modulo_n) $N$ and $⟨3⟩=\{1, 3, 3^2, \dots\}$ is the cyclic subgroup generated by $3$.
## A bit of Background
These days most of the internet traffic is using protocols that perform some variant of [Diffie-Hellman (DH) key exchanges](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange) during their initial handshake, e.g., the [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) or [SSH](https://en.wikipedia.org/wiki/Secure_Shell), to bootstrap a shared secret. Both, the [*computational* DH](https://en.wikipedia.org/wiki/Computational_Diffie%E2%80%93Hellman_assumption) and [*decisional* DH](https://en.wikipedia.org/wiki/Decisional_Diffie%E2%80%93Hellman_assumption) assumptions are *stronger* than the discrete logarithm assumption, i.e., if we can efficiently compute discrete logarithms the corresponding problems become efficiently solvable as well. Thus, super smart experts have spend much time carefully choosing groups in which the DL problem is *believed* to be hard, which are then [standardized](https://www.rfc-editor.org/rfc/rfc7919.html#page-19) and used by implementations. However, there have long been [rumors](https://safecurves.cr.yp.to/rigid.html) that three-letter government agencies might have meddled with this process to standardize groups that they know how to break. During protocol version upgrades it is therefore not uncommon that new groups are added and old ones removed due to security concerns.
## Analyzing the Challenge in Detail
The reason that so much thought goes into choosing parameters for cryptography is that you can screw up in many subtle ways, even when using well-established cryptosystems, and be left with no guarantees at all. So let's have a closer look at our challenge.
First of all, we should check for some low-hanging fruits: the way that the RNG is seeded with `os.urandom(32)` seems legit, this specific modulus $N$ has not been [reported as factored](https://factordb.com/) yet, the algorithm ensures that `p,q` are indeed large primes, it makes use of library functions for most things (instead of using own implementations), and the secret exponent is certainly large enough to disallow the use of a normal logarithm.
What is striking, however, is that the script uses a composite modulus in the first place. In cryptographic applications one commonly encounters a prime modulus of the form $p=rq+1$, where $q$ is also prime and $r>1$ an integer. Then, exponentiation is done in an order $q$ subgroup of $\mathbb{Z}_p^\times$. This approach allows to control the order of the cyclic group to meet security needs, the fact that the order is prime cripples the [Pohlig-Hellman algorithm](https://en.wikipedia.org/wiki/Pohlig%E2%80%93Hellman_algorithm), and common operations like uniform sampling, checking for membership and finding a generator can be done efficiently.
Furthermore, the script is pretty picky about what kind of primes $p$ and $q$ it will choose. They must be exactly 1024 bit long, that's okay, but $p-1$ and $q-1$ are set up to have peculiar properties:
- They have got no duplicate factors in their [prime decomposition](https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic).
- All but two of their odd prime factors are in the interval $(2^{15}, 2^{16})$ (for $p$) and $(2^{16}, 2^{17})$ (for $q$). They are chosen uniform at random.
- The other two odd prime factors are chosen in order to produce an end result of the correct size next to a prime. They are also chosen uniform at random but may lie below the range of the rest.
This means that both $(p-1)$ and $(q-1)$ each have about 65 prime factors, all of which have multiplicity one. What is more, its pretty likely that they don't share any prime factors besides $2$, i.e., $\mathrm{gcd}(p-1,q-1)=2$. A natural question to ask is how many primes there are to choose those factors from. Using the [Prime Number Theorem](https://en.wikipedia.org/wiki/Prime_number_theorem) we can estimate this number, e.g. about 5.5k in the range $(2^{16}, 2^{17})$. Okay, still more ways to choose 65 among those than there are atoms on earth so we have to keep thinking. At this stage I was also checking how big $|p-q|$ is on average, to see if [Fermat factorization](https://en.wikipedia.org/wiki/Fermat%27s_factorization_method) might be applicable, but this isn't viable as well. However, even if I would be able to factor $N$ and then use the [Chinese Remainder Theorem (CRT)](https://en.wikipedia.org/wiki/Chinese_remainder_theorem), I would still need to calculate two DLs, one in $\mathbb{Z}_p^\times$ and one in $\mathbb{Z}_q^\times$, which are still ridiculously large numbers.
Finally, I remembered that
$|\mathbb{Z}_N^\times|=\Phi(N)=\Phi(p)\Phi(q)=(p-1)(q-1),$
where $\Phi$ is [Euler's totient function](https://en.wikipedia.org/wiki/Euler%27s_totient_function) and $|\mathbb{Z}_N^\times|$ is the cardinality (or order) of the group. By [Lagrange's Theorem](https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)) we have that
$|\langle 3\rangle| \big\vert |\mathbb{Z}_N^\times|.$
This implies that the prime factors of $|\langle 3\rangle|$ are a subset of the prime factors of $(p-1)(q-1)$, whose peculiar structure we described above! In particular, there are no large prime factors and thus Pohlig-Hellman should compute the DL in no time. Time to write some code.
## Order-Finding and Factoring
Ah, damn it, Pohlig-Hellman requires a prime factorization of the group's order (of $|\langle 3\rangle|$, not $|\mathbb{Z}_N^\times|$, to make matters a little bit brighter). Unfortunately we are not done with thinking just yet.
In general groups, finding the order of a group element is considered a hard problem, and there are no efficient *classical* algorithms known to this date. However, despite it usually being presented as a factoring algorithm, the actual **quantum** part of Shor's algorithm is in fact performing order-finding. One can show that an efficient algorithm for order-finding implies a factoring algorithm by using it to find non-trivial roots of unity.
In the present case, however, we can leverage the fact that the set of prime factors that can appear in $|\langle 3\rangle|$ is pretty limited. As I noticed above, the prime factors of $(p-1)(q-1)$ are picked from a set of only about ~10k primes (remember: distinct ranges, no duplicates, maybe two medium ones).
By definition we have that for any $\gamma\in \mathbb{N}$
$1\equiv 3^{\gamma|\langle 3\rangle|}\mod{N}.$
Thus, if we consider the product $X$ of all of the roughly 11k possible prime factors $f$ that the script can pick
$X=\prod_i f_i$
it holds that $|\langle 3\rangle|\big\vert X$ ($X$ certainly 'contains' all actual prime factors) and thus
$1\equiv 3^{X} \mod{N}.\quad\quad(1)$
Furthermore, if we define $X_j=X/f_j$ we have that
$3^{X_j}\equiv 1 \mod{N} \Leftrightarrow f_j \text{ is not needed for prime factorization}.$
Based on this observation we can now write an iterative algorithm that finds the prime factors of $|\langle 3\rangle|$ in roughly 20min on my 7 year old laptop.
"""
# ╔═╡ f35b3397-974b-4ae6-9fda-52922f8a740d
function find_order!(modulus::BigInt, factors::AbstractVector, base::Number)
X = prod(factors)
@progress for i in axes(factors, 1)
X_i = X ÷ factors[i]
if powermod(base, X_i, modulus) == 1
factors[i] = 1
X = X_i
end
end
return factors
end
# ╔═╡ f232477c-6270-42ab-a134-deba2542adc2
md"""
We can test the algorithm for some small modulus and a random guess of candidate factors to gain some confidence before launching the real run.
"""
# ╔═╡ 615240ee-b70c-444d-8115-117eb19acc3c
let factors = find_order!(BigInt(5*7), [2,2,3,3,5,5,7,7,11,11,13,13], 4)
[powermod(4, i, 35) for i in 1:prod(factors)+1]
end
# ╔═╡ e591dd6b-f498-4500-bb5c-6b9f8738ac25
md"""
Now we just need to generate a not too large set of candidate factors (its easy to find a range that contains the intermediate-size factors by checking Eq. (1)). Uncomment the last lines to run the search yourself, otherwise we will continue with my precomputed list.
"""
# ╔═╡ b891d019-c92e-4cf1-b748-8083a7bd26dd
md"""
Note that we can, in fact, compute $\Phi(N)$ from $|\langle 3\rangle|$ since
$\Phi(N)=N-p-q+1\approx N\quad \text{ and } \quad\frac{N}{|\langle 3\rangle|}\approx4.$
As we are at it anyway, we can now factor $N$ just for the fun of it:
"""
# ╔═╡ fd19e57e-d598-4c21-b89b-b67498773999
md"""
While this last step is not strictly necessary to solve the challenge, it serves as a friendly reminder that you should be knowing what you are doing if you ever fancy choosing your own RSA modulus.
"""
# ╔═╡ 9a41bdcb-32bd-465b-af87-9829368f9e1f
md"""
## Computing the Discrete Logarithm
We now have all the ingredients to compute the DL of `c` using the [Pohlig-Hellman algorithm](https://en.wikipedia.org/wiki/Pohlig%E2%80%93Hellman_algorithm). Note that the subroutine for prime-power subgroups reduces to the [baby-step giant-step algorithm](https://en.wikipedia.org/wiki/Baby-step_giant-step) since all prime factors have a multiplicity of one.
"""
# ╔═╡ c9dcc520-c954-42aa-b05e-87543cb1a783
# subroutine of Pohlig-Hellman
function BabyStepGiantStep(modulus, order, g, h)
m = BigInt(ceil(sqrt(order)))
kBabySteps = Dict( powermod(g, j, modulus) => j for j in 0:m-1 )
kGiantStep = powermod(g, order-m, modulus)
for i in 0:m-1
j = get(kBabySteps, h, -1)
if j != -1
return i*m + j
end
h *= kGiantStep
h = mod(h, modulus)
end
return -1
end
# ╔═╡ aa25f8ad-9f6d-42d5-9559-0b7bd0f0a88c
md"""
Now capturing the flag is just a matter of being able to recall how to solve a system of simultaneous congruences, i.e., inverting the isomorphism in the CRT ...
"""
# ╔═╡ b487280a-d733-4012-9ba1-617030be92b8
function PohligHellman(modulus, order, g, h, factors)
x_i = []
for p_i in factors
g_i = powermod(g, order ÷ p_i, modulus)
h_i = powermod(h, order ÷ p_i, modulus)
push!(x_i, BabyStepGiantStep(modulus, p_i, g_i, h_i))
end
x = BigInt(0)
for (p_i, x_p_i) in zip(factors, x_i) # solve simultaneous congruences
tmp_prod = order ÷ p_i
tmp_inv = invmod(tmp_prod, p_i)
x += mod(x_p_i * tmp_prod * tmp_inv, order)
end
return mod(x, order)
end
# ╔═╡ b042442d-e213-4ce2-bec0-4248fd3a1782
md"""
... or, at least I thought that this would be the tough part. Turns out, it takes much longer to convince this wonderful young programming language to [reinterpret an arbitrary-precision integer as a string](https://discourse.julialang.org/t/converting-uint8-array-to-bigint-and-back/19110/18) ... :)
"""
# ╔═╡ 33d3e000-4620-435e-b934-d20a0c3e10dc
md"""
That's it, Sunday is gonna be over in just a few minutes time and I sure as hell know a bit more about crypto than I did back on Friday evening.
"""
# ╔═╡ 4459f38c-bc15-4b13-97d9-73c8cec5c4b1
md"""
### Constant Definitions
"""
# ╔═╡ 1700661c-7354-4682-8da1-ae44e510ad5e
begin
modulus = parse(BigInt, "b8f60cccb62dd03d47507d31de750c31b2b3c6b34b14dd16dbb966b1b9a8dee317b99fd503c4535665c2c1a873d91e2625d5b5a6ab1a3676f5ca75e93e41fe3670bfc2bbb67ee9da752fe5a82b7f5790b538cf8346180d4c9af45f430262385b97b668f8de33a39f7d9352d88c59825f2cc3a05e0ea947099ab738093e8125d06199584b843117652abe69c596c780d0edb6e0be5e0eee4f8930202e690a61cdfc61a2d4788c1295cd1fd82b8a46ee82f6a0ff5fa33dbed764de0d846c47955ab94478852a6cf9e575f667ef6f07faa970b099a810bce044ca43389a7c457748843534ec4e3d0cb413cceb528f1224fc5ef95f88998c129aed4afae784204d91", base=16)
h = parse(BigInt, "42191cb5a4dda884da57abdbc0c2181e3e38f21b7a29a25a32face49b84feae8902b2af3da00ffe224dd6d8866bb0d3223c67dbc5ee0564b6956920e4bb7305bcabe844780e60aa64276b6bff2591929c0405e2c1c32c91977bbc94a144f9d275e250cbafdc796e1027e5e3bfb8d4829fad6ce998bfd688ece288d952c502ae71e7a0bbf475f0a7d4f0993ed398bca43e7a54e38519b7a09a055391f3ba08f2eadc823f42394813b19ebbb0dcac34d7dcfb21b724aa0190d176e5d8e6e6f23c1ab01e05c9f845fdcc99dc398f8c43f17fcf0223ee1dde87a186fc24107e5552566850d835ae29537a1d50fde9e5eb8028f409590d327205a07f08b44a22606b5", base=16)
end
# ╔═╡ d896a4d7-1e31-48ce-94a9-d74e07a80625
begin
factors = []
order3 = BigInt(1)
for f in readdlm("./factors_|<3>|.csv", ',', Int64)
if f != 1
global order3 *= f
push!(factors, f)
end
end
# <x> is a subgroup of index 4
@assert modulus/order3 ≈ 4
# Φ(N)=|ℤ_N^×|
ΦN = 4 * order3
end
# ╔═╡ ba434453-3256-4187-9bc4-87e4febd266b
let factors
# possible prime factors of |⟨x⟩|, (p-1)(q-1)
factors = BigInt.(vcat([2, 2], Primes.primes(2^12,2^14), Primes.primes(2^15, 2^17)))
#find_order!(modulus, factors, 3)
#@show factors
end
# ╔═╡ de8d5672-f54f-448c-b297-6fd9fedcc022
begin
# algorithm for factoring N by finding a non-trivial root of unity mod N
function factor_N(modulus, ΦN)
e, d = 3, invmod(3, ΦN)
k = e * d - 1
u, r = k, 0
while u & 1 == 0
u >>= 1
r += 1
end # k = 2^r * u
more = true
root = BigInt(0)
while more
x = rand(BigInt(1):modulus) # chance of hitting a good x is > 1/2
# chance of missing ℤ_N^× is negligible
prev = BigInt(0)
for i in 0:r
cur = powermod(x, 2^i * u, modulus)
if cur == 1 && prev != 0
if prev + 1 == modulus || prev == 1
break # trivial root of untiy
end
more = false
root = prev # non-trivial root of untiy
break
end
prev = cur
end
end
p = gcd(root-1, modulus)
q = modulus ÷ p
return (p, q)
end
p, q = factor_N(modulus, ΦN)
# We factored N!
@assert modulus == p * q
@show (p,q)
end
# ╔═╡ dd4cbfd0-4f81-4fb0-a011-51bca46eba63
let x = PohligHellman(modulus, order3, 3, h, factors), flag
to_bytes(x) = reinterpret(UInt8, [x])
function ubig2ints(x::BigInt)
copy(unsafe_wrap(Array, x.d, x.size))
end
join(String.(reverse(reverse.(to_bytes.(ubig2ints(x))))))[5:end]
end
# ╔═╡ 00000000-0000-0000-0000-000000000001
PLUTO_PROJECT_TOML_CONTENTS = """
[deps]
DelimitedFiles = "8bb1440f-4735-579b-a4ab-409b98df4dab"
Primes = "27ebfcd6-29c5-5fa9-bf4b-fb8fc14df3ae"
ProgressLogging = "33c8b6b6-d38a-422a-b730-caa89a2f386c"
[compat]
Primes = "~0.5.3"
ProgressLogging = "~0.1.4"
"""
# ╔═╡ 00000000-0000-0000-0000-000000000002
PLUTO_MANIFEST_TOML_CONTENTS = """
# This file is machine-generated - editing it directly is not advised
julia_version = "1.8.0"
manifest_format = "2.0"
project_hash = "32ba576aa692262622eae3ba5318826b43dec2fd"
[[deps.DelimitedFiles]]
deps = ["Mmap"]
uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"
[[deps.IntegerMathUtils]]
git-tree-sha1 = "f366daebdfb079fd1fe4e3d560f99a0c892e15bc"
uuid = "18e54dd8-cb9d-406c-a71d-865a43cbb235"
version = "0.1.0"
[[deps.Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
[[deps.Mmap]]
uuid = "a63ad114-7e13-5084-954f-fe012c677804"
[[deps.Primes]]
deps = ["IntegerMathUtils"]
git-tree-sha1 = "311a2aa90a64076ea0fac2ad7492e914e6feeb81"
uuid = "27ebfcd6-29c5-5fa9-bf4b-fb8fc14df3ae"
version = "0.5.3"
[[deps.ProgressLogging]]
deps = ["Logging", "SHA", "UUIDs"]
git-tree-sha1 = "80d919dee55b9c50e8d9e2da5eeafff3fe58b539"
uuid = "33c8b6b6-d38a-422a-b730-caa89a2f386c"
version = "0.1.4"
[[deps.Random]]
deps = ["SHA", "Serialization"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
[[deps.SHA]]
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.7.0"
[[deps.Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
[[deps.UUIDs]]
deps = ["Random", "SHA"]
uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
"""
# ╔═╡ Cell order:
# ╟─0714979c-30ff-11ed-2173-330d20197137
# ╟─a28d8f4f-e0ce-43cc-978b-80a1d1d978a5
# ╠═f35b3397-974b-4ae6-9fda-52922f8a740d
# ╟─f232477c-6270-42ab-a134-deba2542adc2
# ╠═615240ee-b70c-444d-8115-117eb19acc3c
# ╟─e591dd6b-f498-4500-bb5c-6b9f8738ac25
# ╠═ba434453-3256-4187-9bc4-87e4febd266b
# ╠═d896a4d7-1e31-48ce-94a9-d74e07a80625
# ╟─b891d019-c92e-4cf1-b748-8083a7bd26dd
# ╠═de8d5672-f54f-448c-b297-6fd9fedcc022
# ╟─fd19e57e-d598-4c21-b89b-b67498773999
# ╟─9a41bdcb-32bd-465b-af87-9829368f9e1f
# ╠═c9dcc520-c954-42aa-b05e-87543cb1a783
# ╟─aa25f8ad-9f6d-42d5-9559-0b7bd0f0a88c
# ╠═b487280a-d733-4012-9ba1-617030be92b8
# ╟─b042442d-e213-4ce2-bec0-4248fd3a1782
# ╠═dd4cbfd0-4f81-4fb0-a011-51bca46eba63
# ╟─33d3e000-4620-435e-b934-d20a0c3e10dc
# ╟─4459f38c-bc15-4b13-97d9-73c8cec5c4b1
# ╟─1700661c-7354-4682-8da1-ae44e510ad5e
# ╟─00000000-0000-0000-0000-000000000001
# ╟─00000000-0000-0000-0000-000000000002