Hola Amigo (reverse)
Hola Amigo was a hard reverse challenge from FCSC 2023, on which I got first blood 🩸. It was the least solved reverse engineering challenge, and one of the least solved tasks overall from the CTF.
The description states that one of our friends downloaded a demo on their Amiga 500, which unfortunately happened to be a ransomware that encrypted their favorite game's floppy disk.
We are asked to analyze the ransomware's floppy disk (ransomware_floppy.adf
) and to decrypt the data (flag_floppy_enc.adf
).
Initial recon, running & debugging
We learn that adf
stands for Amiga Disk File, a format used by Amiga computers and emulators to store images of floppy disks. The command file
also reveals that it was most likely created with a tool called exe2adf
.
A bit of research brings us to ADFlib, an open-source tool to play around with the Amiga filesystem. In particular, it allows to extract the floppy disk's data with unadf
.
Running it on the flag's floppy disk gives us the encrypted flag. Immediately, we notice that its length is a multiple of 16, which hints at the use of a block cipher (maybe AES).
Then, we are also able to extract the ransomware's binary, a.exe
.
Amiga 500 runs on the Motorola 68000 architecture (shortened m68k
). IDA can disassemble this architecture, but I chose to work with Ghidra as it provides a decompiler as well (although it is far from perfect).
But before reversing anything, it would be more comfortable if we could emulate the program and setup a debugging environment.
I chose to use the FS-UAE Amiga emulator, that runs on Linux and provides debugging support. Arbitrary floppy disks should be placed inside ~/Documents/FS-UAE/Floppies
in order to be loaded. Then, we need to create a configuration file in ~/Documents/FS-UAE/Configurations/Default.fs-uae
:
Note the console_debugger
flag that enables the use of the debugger.
Now, when FS-UAE is launched, it should load the ransomware. At first, we are greeted with a window that asks us to wait for a surprise.
But around ten seconds later, a gruesome sight awaits us...
At any moment, we can bring up the debugging console with F11 + D
, which pauses the execution:
Depending on when the execution was paused, the floppy disk may or may not have been loaded already. Thanksfully, it seemed to be mapped practically all the time at the same address (or very near), which made it easier to set breakpoints in advance.
Reversing: the file encryption
Let's load the program with Ghidra and select the default Motorola 68000 architecture (32-bits, big-endian).
Getting started is not easy. The binary is a bit large (100 Ko) and Ghidra seems to miss a lot of code that we need to disassemble manually.
One string is stored in clear text inside the data section (only one...) that says:
However, no cross-reference shows up for this string, so it is unclear what we can achieve from it.
After spending a lot of time exploring around and identifying potential candidates for cryptographic functions, a new idea popped in my mind.
This blog post about reverse engineering Amiga games mentions the following:
Standard Amiga library calls: Text strings in the game may reference library files, such as graphics.library or dos.library. Early on in a program you will see references to ABSEXECBASE ($4) followed by a jump to an offset; e.g. JSR (-30,A6). Calls to ABSEXECBASE are offets of exec.library, and you will often see this to load other libraries with the OpenLibrary function, i.e. JSR (-552,A6). A fully documented list of major library offsets and hardware registers appears in Mapping the Amiga.
The linked documentation is very useful, as it gives the corresponding offsets to all the standard library calls.
I assumed that the ransomware, at some point, had to read from the flag file and then write to it. We quickly stumble upon these relevant calls:
Therefore, looking for a function that features the instructions JSR (-0x1E, A6)
, JSR (-0x2A, A6)
and JSR (-0x30, A6)
sounds like a good idea. And there is one!
I called this function encrypt_file
, and tried to identify its main components.
First, the file is opened. Then, a certain function FUN_00000f28
is called, maybe to initialize a cryptographic context. Finally, 16-bytes blocks are read from the file. Each block undergoes the FUN_00001074
function, probably an encryption function, and the output is written back to the file.
I will spare the details of reversing these two cryptographic functions. It is rather tedious work, but nonetheless very classic when you have to reverse binaries that include cryptographic libraries: there are often big context structures wandering around and many different parameters and modes are implemented, all of this impacting the code's legibility and making it harder to find exactly which cryptographic primitive is actually being used.
We eventually understand that the flag is encrypted using AES-256 in CBC mode and a null IV. However, it is hard to tell at first glance where the key comes from. Debugging can help validating this fact and retrieving the key, but it turns out the key seems to change every time we run the program again (of course, this is quite a classic behavior for a ransomware).
The goal now is therefore to find how the AES key is generated, and which key was used to encrypt the flag.
Reversing: the key generation process
First, we identify that the encrypt_file
function is called with the key
and its size key_size
(which is always 32 bytes) as parameters.
The parent function, its only cross-reference, is rather large and cryptic. I called it ransomware
. I went for a "bottom-up" approach in order to understand where the key comes from.
Here are the main blocks that constitute this function:
Yup, there's still a lot of cryptographic functions going on... AES was the easy part.
Starting with the hash function part in orange. We immediately notice the presence of some magic values:
Googling leads us to SM3. It is a hash function used in the Chinese National Standard, that outputs a 256-bit digest (fits perfectly to feed the subsequent AES part). It features different rounds of compression using 64-byte blocks.
Now, we need to understand what exactly is fed to this SM3 part. Debugging shows that its input is pretty much 128 seemingly random bytes.
At this point, as I was debugging and casually navigating through the memory by searching for known bytes with the s
command, I stumbled upon a few interesting plaintext strings. Namely, I was looking for the strings that showed up on the screen (like "we are preparing a surprise"), thinking they were perhaps initially encrypted, and then decrypted on-the-fly.
What's this?! A rather compelling magic string from out of nowhere...
According to Wikipedia, NIST SP 800-90A is a publication by the NIST, called Recommendation for Random Number Generation Using Deterministic Random Bit Generators, that specifies three different allegedly secure pseudo-random number generators: Hash DRBG, HMAC DRBG and CTR DRBG.
The latter is referenced in the magic string: we can assume the program makes use at some point of the CTR DRBG algorithm using the AES-256 block cipher primitive.
With a bit more reversing, we find that the function FUN_00006aac
is responsible for decrypting the strings in memory. It is called with the offset of the string to decrypt and a 8-bit key . It uses AES-CTR with an initial zero counter and a key consisting of all bytes . This knowledge is not essential to solve the challenge.
Back to the missing blocks: the random number generation. The idea now is to correlate what we see in the binary with an existing library, because there's so much code (and so much useless code) that it seems unlikely it was implemented for this challenge only.
After a bit of research, we find the libdrbg library, conveniently designed by the ANSSI, and most notably by the challenge author himself.
With this knowledge, we are able to grasp a more astute understanding of the random generation block.
CTR DRBG is instantiated through the ctr_drbg_instantiate
function, and takes in several parameters:
an entropy input;
a nonce;
a personalization string.
In the challenge, we assess that the nonce is an array consisting of the bytes 0x00, 0x01, ... 0x7F, and the personalization string is the magic string from earlier (NIST_SP80090A_AES256-CTR_DRBG_PERS_1337_CRYPT0R_FROM_AMIGAGA
). It is still unclear, however, how the entropy input is generated --- we will come back to it later.
Then, random bytes can be generated using the ctr_drbg_generate
method. It takes in some additional entropy input and outputs 128 bytes of random data.
In the challenge, this function is called twice, with two different additional entropy inputs which seem to be generated using the same method than for the first entropy input.
Let's now dive into how these entropy inputs are computed. This is done through the FUN_000012f4
function, which I renamed get_entropy_input
.
Unfortunately, Ghidra's decompiler has a very hard time understanding these patterns and messes up the arguments, so we have to resort to the assembly code.
Before calling the get_entropy_input
function itself, some important arguments are prepared. The (-0x100, A5)
pointer stores the output entropy buffer, and 0x100 is its size.
The third argument comes from the D0
register, which itself is derived from a library call. Grepping through the Mapping the Amiga document, here's what we find:
That's right, as in 9 ransomware CTF challenges out of 10, a timestamp is involved! We should have seen this coming.
The CurrentTime
library call writes two longs (one for the seconds and one for the microseconds) to user-provided pointers: here, the local variables (-0x3ba, A5)
and (-0x3b6, A5)
. However, only the first one is passed to the next function: the microseconds are basically discarded. It is also important to note that the timestamp is byte-swapped before it is passed over.
Finally, here is the get_entropy_input
function:
We see that there is also a fourth argument, which I called idx
. Its value is 0 for the first call, 1 for the second and 2 for the third. The k
value is the timestamp in seconds. entropy_box
is a 256-byte array of constant, random data that we can dump.
One should notice that the timestamp is shifted 16 bits to the left. Therefore, there are only 16 bits of entropy (the upper bits of the timestamp), which is easily bruteforceable if need be.
We now have pretty much everything we need in order to generate the key!
Reimplementing the algorithm
At first, I thought I could simply get the timestamp from the flag.txt
file and plug it directly in the algorithm. By patching the executable or modifying the returned timestamp through debugging, we could let it compute the AES key automatically for us.
Unfortunately, this does not work: this timestamp is not correct. This was hinted at in the challenge's description, which warned us that we shouldn't take the encrypted floppy disk's metadata for granted, as they were most likely overwritten.
Consequently, we need to brute-force the possible AES keys, and thus reimplement the whole key generation algorithm. Thanksfully, we can reuse the libdrbg library.
After spending quite some time assimilating the library and debugging my implementation, which seemed to correspond very closely to the executable's after performing various dynamic checks, I still couldn't find the correct key and I realized that something was wrong.
My mistake was to patch the executable so that it always returns the same timestamp in order to make the calculations deterministic and ease the debugging process. But I totally neglected a crucial detail : there are three distinct calls to CurrentTime()
, and these may return different timestamps.
Indeed, the random number generation function is quite computationally heavy for the Amiga 500. With the FS-UAE emulator, I measured a delay of 8 seconds between the first two calls, and 4 seconds between the last two. We need to take into account these delays in our brute-force.
Additionally, since the timestamp is byte-swapped, the k >> 0x10
quantity in get_entropy_input
represents seconds of lower significance, which means that the three timestamps are really distinct for small delays.
Even with this new realization, it took me a lot of time to find the correct pair --- it came down to basically guessing how slow the author's emulator was when they encrypted the flag. Bruteforcing the timestamp was already not instantaneous, so expanding the search space for the delays came with a certain computational cost.
Eventually, I found the correct delays (18 seconds and 6 seconds) along with the timestamp (0x06fc). Here is my brute-force solve script. Note: the libdrbg library happens to implement the SM3 hash function as well, which was a blessing.
We find the correct AES key, and the decrypted flag!
Conclusion
Hola Amigo was a quite fun and well-designed challenge that tackled an architecture and an environment I was not familiar with at all. It also taught me a few new cryptographic primitives.
I thought the brute-force part at the very end was not that guessy, but still slightly frustrating, especially when you've spent your whole week-end reversing the binary, managing to get a correct implementation up and running, and having the pressure of not letting your first blood fall into someone else's hands. 😀
Last updated