Solving Garbage with Radare2

by Kelsey Clark on November 23, 2020

Flare-On is a great CTF-style event. Its popularity means you can find many write-ups for the same challenge, so you can pick and choose new techniques and tools to learn. I have seen a couple of write-ups for this challenge, but here I will solve it using Radare2 by emulating some instructions with ESIL. This approach avoids having to completely repair the binary file and shows how easily ESIL can be used to solve otherwise difficult problems.

Challenge

You are given a message explaining that the challenge was deleted accidentally. The challenge file was recovered from the disk, but it is apparently corrupted. The goal seems to be to fix the challenge file and execute it.

First, we run the Linux file command on the challenge file:

    crackers(2_garbage)# file garbage.exe
    garbage.exe: PE32 executable (console) Intel 80386, for MS Windows, UPX compressed

I was only vaguely aware of UPX before this challenge, but a quick Google search found a tool to decompress UPX files. This tool runs on Linux and is in the Arch repos, so it was easy to install.

Running the tool shows our roadblock.

    crackers(2_garbage)# upx -d garbage.exe
                           Ultimate Packer for eXecutables
                              Copyright (C) 1996 - 2020
    UPX git-d7ba31+ Markus Oberhumer, Laszlo Molnar & John Reiser   Jan 23rd 2020

            File size         Ratio      Format      Name
       --------------------   ------   -----------   -----------
    upx: garbage.exe: OverlayException: invalid overlay size; file is possibly corrupt

    Unpacked 1 file: 0 ok, 1 error.

The file is too corrupted for UPX to unpack.

I have no familiarity with how UPX works under the hood, and I don’t know what an “overlay size” is. Since UPX is open source, I downloaded the source code and used ripgrep to quickly search it for the error message. If you are unfamiliar with ripgrep, it is essentially like find . -print0|xargs -0 grep, only better in every way. I highly recommend you check it out.

    crackers(upx)# rg 'invalid overlay size'
    src/packer.cpp
    579:        throw OverlayException("invalid overlay size; file is possibly corrupt");

The error message appears only once in the source, but it is thrown from the function Packer::checkOverlay, which is called in multiple places.

I then modified the build system to include symbols and compiled UPX. I ran the resulting binary under GDB and set a breakpoint on Packer::checkOverlay so that I could get a backtrace to the calling function. This led me to the PeFile::unpack0 function. Now, in all honesty, I have no idea what is going on in this code. So, I added a ton of printf statements to the source code.

The code ended up looking like this:

    template <typename ht, typename LEXX, typename ord_mask_t>
    void PeFile::unpack0(OutputFile *fo, const ht &ih, ht &oh,
                         ord_mask_t ord_mask, bool set_oft)
    {
        //infoHeader("[Processing %s, format %s, %d sections]", fn_basename(fi->getName()), getName(), objs);
        handleStub(fi,fo,pe_offset);
        if (ih.filealign == 0)
            throwCantUnpack("unexpected value in the PE header");

        const unsigned iobjs = ih.objects;
        // Debug output added by me: for every section, print the values
        // that feed the overlay calculation below.
        for (size_t i=1; i<=iobjs; i++) {
            printf("== %lx %x==\n", i, iobjs);
            printf("file_size: 0x%lx\n", file_size);
            printf("ih.filealign: 0x%x\n", (unsigned int) ih.filealign);
            printf("isection[i - 1].rawdataptr: 0x%x\n", (unsigned int) isection[i - 1].rawdataptr);
            printf("isection[i - 1].size: 0x%x\n", (unsigned int) isection[i - 1].size);
            printf("sum: 0x%x\n", (unsigned int) ( isection[i - 1].rawdataptr + isection[i - 1].size ));
            printf("sub: 0x%lx\n", file_size - ALIGN_UP(isection[i - 1].rawdataptr + isection[i - 1].size, ih.filealign));
        }
        // Original code: only the last section is used to compute the overlay.
        const unsigned overlay = file_size - ALIGN_UP(isection[iobjs - 1].rawdataptr + isection[iobjs - 1].size, ih.filealign);
        checkOverlay(overlay);

The checkOverlay call is where everything goes bad, so I added the for loop above it to print out all the numbers that seem relevant. I then grabbed some properly UPX packed files and ran my modded version of UPX on them to see what the output of my loop would be. I compared this output with the challenge file.

This process took some guessing and checking, but it soon became obvious that the file size was wrong. After all that work, I realized I could have just opened the file in a hex editor and seen that the XML data at the end of the file was cut off.
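The overlay arithmetic from the instrumented loop can be reproduced in a few lines of Python. This is a minimal sketch rather than UPX's own code: align_up and overlay_size are names invented here, the header values are hypothetical and not read from the challenge file, and the result is kept signed, whereas UPX stores it in an unsigned variable. A truncated file makes the result negative, which is presumably what checkOverlay objects to.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (must be a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def overlay_size(file_size: int, raw_ptr: int, raw_size: int, file_align: int) -> int:
    """Bytes left in the file after the last section, per the formula in unpack0.

    Kept signed here: a negative result means the file ends before the
    aligned end of the last section, i.e. the file has been truncated.
    """
    return file_size - align_up(raw_ptr + raw_size, file_align)


# Hypothetical header values for the last section, NOT read from garbage.exe.
RAW_PTR, RAW_SIZE, FILE_ALIGN = 0x9E00, 0x124, 0x200

intact = overlay_size(0xA000, RAW_PTR, RAW_SIZE, FILE_ALIGN)
truncated = overlay_size(0x9F24, RAW_PTR, RAW_SIZE, FILE_ALIGN)
print(intact, truncated)  # 0 -220
```

With these made-up numbers, the intact file has no overlay at all, while the truncated one comes out 220 bytes short of where the last section should end.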

To fix the file size, I simply added bytes to the end of the file. You don’t even have to be exact to get UPX to unpack it. Either way, because you are not fixing the file properly, you won’t get the file back properly. It will be good enough for the next stage, though.
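Padding the file takes only a few lines. The Python sketch below is one way to do it; pad_to and pad_file are names invented for this example, and the output filename and the amount of padding in the usage note are placeholders, since the exact amount is not critical.

```python
from pathlib import Path


def pad_to(data: bytes, target_size: int, fill: bytes = b"\x00") -> bytes:
    """Return data extended with a single-byte fill up to target_size (never shortened)."""
    missing = max(0, target_size - len(data))
    return data + fill * missing


def pad_file(src: str, dst: str, extra: int) -> None:
    """Write a copy of src to dst with `extra` zero bytes appended."""
    original = Path(src).read_bytes()
    Path(dst).write_bytes(pad_to(original, len(original) + extra))
```

Something like pad_file("garbage.exe", "garbage_padded.exe", 0x400) would then produce a copy to hand to upx -d. The 0x400 is a guess, not a measured value, and writing to a copy keeps the original file around in case the guess is too small.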

Running the file

The file won’t execute on Windows. Googling the error showed a lot of other people with the same problem and implied I needed some extra software. That advice does not apply here: I know the real reason is that the file is corrupted. How corrupted, though? Can Radare2 make sense of it?

    crackers(test)# r2 garbage.exe
    [0x00401473]> aa
    [x] Analyze all flags starting with sym. and entry0 (aa)
    [0x00401473]> afl
    0x00401473   21 396  -> 343  entry0
    0x0040106b    3 432          main
    0x00401000    3 69           fcn.00401000
    0x00401045    3 38           fcn.00401045
    0x0040121b    3 17           fcn.0040121b
    0x004014a5    3 249          loc.004014a5
    0x0040c573    1 6            sub._IsProcessorFeaturePresent
    0x0040147d    1 40           fcn.0040147d

Hey! Not even an error, which is kind of unusual in itself for Radare2. Maybe we can skip fixing the file any further and just reverse the thing. Looking at the main function, the control flow seems simple enough. I can quickly grep out all the calls to see what functions are being used.

    │ 0x0040114b e8b0feffff call fcn.00401000
    │ 0x00401166 ff150cd04000 call dword [sym.imp._CreateFileA] ; 0x40d00c ; HANDLE CreateFileA(LPCSTR lpFileName, DWORD dwDesiredAccess, DWORD dwShareMode, LPSECURITY_ATTRIBUTES lpSecurityAttributes, DWORD dwCreationDisposition, DWORD dwFlagsAndAttributes, HANDLE hTemplateFile)
    │ 0x00401174 e8ccfeffff call fcn.00401045
    │ │ 0x00401198 e863feffff call fcn.00401000
    │ │ 0x004011ae ff1504d04000 call dword [sym.imp._WriteFile] ; 0x40d004 ; BOOL WriteFile(HANDLE hFile, LPCVOID lpBuffer, DWORD nNumberOfBytesToWrite, LPDWORD lpNumberOfBytesWritten, LPOVERLAPPED lpOverlapped)
    │ │ 0x004011ba e886feffff call fcn.00401045
    │ │ 0x004011c0 ff1510d04000 call dword [sym.imp._CloseHandle] ; 0x40d010 ; "&$\x01" ; BOOL CloseHandle(HANDLE hObject)
    │ │ 0x004011da e821feffff call fcn.00401000
    │ │ 0x004011ea ff150cd14000 call dword [sym.imp._ShellExecuteA] ; 0x40d10c ; "B$\x01" ; HINSTANCE ShellExecuteA(HWND hwnd, LPCSTR lpOperation, LPCSTR lpFile, LPCSTR lpParameters, LPCSTR lpDirectory, INT nShowCmd)
    │ │ 0x004011f6 e84afeffff call fcn.00401045
    │ 0x004011fd ff1500d04000 call dword [sym.imp._GetCurrentProcess] ; 0x40d000 ; HANDLE GetCurrentProcess(void)
    │ 0x00401204 ff1508d04000 call dword [sym.imp._TerminateProcess] ; 0x40d008 ; BOOL TerminateProcess(HANDLE hProcess, UINT uExitCode)
    │ 0x00401214 e802000000 call fcn.0040121b

Don’t get anxious here, just look at the known functions. We have calls to CreateFileA, WriteFile, CloseHandle, ShellExecuteA, GetCurrentProcess and TerminateProcess. It looks like we make a file, put something in it, and then likely execute it. We want to know what goes into that file. I won’t put it here, but r2 detected a large and ugly string at the start of the main function. This string is probably the encoded/encrypted contents of the file or a key to be used to decode/decrypt the contents of the file. We also see a lot of values moved onto the stack, implying another set of data.

    [0x004010dd]> pd 15
    │ 0x004010dd 8d8dc8feffff lea ecx, [var_138h]
    │ 0x004010e3 c745c02a2d42. mov dword [var_40h], 0xf422d2a
    │ 0x004010ea c745c43e5064. mov dword [var_3ch], 0xd64503e
    │ 0x004010f1 c745c85d041b. mov dword [var_38h], 0x171b045d
    │ 0x004010f8 c745cc163603. mov dword [var_34h], 0x5033616
    │ 0x004010ff c745d0342009. mov dword [var_30h], 0x8092034
    │ 0x00401106 c745d4632124. mov dword [var_2ch], 0xe242163
    │ 0x0040110d c745d8151434. mov dword [var_28h], 0x58341415
    │ 0x00401114 c745dc1a2979. mov dword [var_24h], 0x3a79291a
    │ 0x0040111b c745e0000056. mov dword [var_20h], 0x58560000
    │ 0x00401122 c645e454 mov byte [var_1ch], 0x54 ; 'T' ; 84
    │ 0x00401126 c745e8380e02. mov dword [var_18h], 0x3b020e38
    │ 0x0040112d c745ec193b1b. mov dword [var_14h], 0x341b3b19
    │ 0x00401134 c745f01b0c23. mov dword [var_10h], 0x3e230c1b
    │ 0x0040113b c745f4330811. mov dword [var_ch], 0x42110833
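To get a feel for what the static route involves, the constants can be laid out the way they land on the stack. The Python sketch below packs the first run of values (var_40h through var_20h) as little-endian dwords and appends the single byte stored at var_1ch. Treating that run as one contiguous 37-byte buffer is an assumption based on the variable offsets, and FIRST_RUN, LAST_BYTE and stack_bytes are names invented for this example.

```python
import struct

# Immediates copied from the disassembly, lowest stack address first.
FIRST_RUN = [
    0x0F422D2A, 0x0D64503E, 0x171B045D, 0x05033616, 0x08092034,
    0x0E242163, 0x58341415, 0x3A79291A, 0x58560000,
]
LAST_BYTE = 0x54  # the single byte stored at var_1ch


def stack_bytes(dwords, tail):
    """Lay the dwords out as x86 stores them (little-endian), then append the tail byte."""
    return struct.pack(f"<{len(dwords)}I", *dwords) + bytes([tail])


blob = stack_bytes(FIRST_RUN, LAST_BYTE)
print(len(blob), blob[:8].hex())  # 37 2a2d420f3e50640d
```

The first bytes of the result line up with the tail of the instruction encodings in the listing (c745c02a2d42. and c745c43e5064.), which is a quick way to confirm the byte order is right.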

Figuring out what this is doing statically is maybe good for learning, but it is also time consuming. We can’t execute it because our binary is broken, so let’s just use ESIL.

ESIL can quickly emulate these instructions, and afterward you can just view the memory without worrying about having correct offsets. Wait a minute! There are not many Windows API calls and none of them are used to decode data. We could likely decode all this information with ESIL and avoid working through the assembly or decompilation.

Using ESIL

I am going to treat this section as a very basic introduction to ESIL. There is a lot of text here, but that is because I am explaining what I am doing and why.

In reality, it was only a few minutes after reaching this point before I solved the challenge. This whole section can be summarized by a 2 minute 18 second asciinema video. So, if you get bored by the text or you already know ESIL, I recommend you watch that and skip to the blog conclusion. Refer back to the blog if you get confused about why I did something; for example, the second aeip command (explained below).

All the ESIL commands are in the “ae” family of commands. I recommend running ae? to see all the commands and get help for them. Don’t be daunted by what you find there; we will only be using three of the commands.

First, we have to tell ESIL to get some memory ready for us. This is the initialize memory command, or aeim. Next, we have to tell ESIL where to start running code. This just means setting the instruction pointer to the main function. So, seek to the main function and run aeip.

    [0x00401473]> s main
    [0x0040106b]> aeim
    [0x0040106b]> aeip
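The setup can also be scripted. Below is a small Python sketch that takes any function able to run an r2 command and feeds it the setup sequence, including the aa analysis pass from the earlier session. setup_esil and SETUP_COMMANDS are names invented here, and the recorder in the example stands in for a real session so that the snippet runs without radare2 installed.

```python
from typing import Callable, List

# Analysis pass plus the ESIL setup commands from the session above, in order.
SETUP_COMMANDS = ["aa", "s main", "aeim", "aeip"]


def setup_esil(run: Callable[[str], object]) -> List[str]:
    """Send the analysis and ESIL setup commands through `run`; return what was sent."""
    sent = []
    for command in SETUP_COMMANDS:
        run(command)
        sent.append(command)
    return sent


# Stand-in for a real session: just record the commands.
log: List[str] = []
setup_esil(log.append)
print(log)
```

With the r2pipe package installed, passing the cmd method of an r2pipe.open("garbage.exe") session as run should drive a real r2 instance the same way.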

I like to manually run ESIL a few steps to make sure it is doing what I think it should be doing. I usually do this in visual mode with the V command. Visual mode works differently–keys will do things immediately. Hit p a few times to change the visual mode. I like the one that shows assembly, stack, and registers at once. If you want to run a command from visual mode, you can quit visual mode with q. Alternatively, you can hit : and get a command prompt. After the prompt is done, you will go back to the visual mode. The important thing is s will single step execution in ESIL and holding s will step fast. For this challenge, I single stepped slowly through the code and checked the stack to see what data was being put there.

Once you single step into the first function call, you can observe the first xor decoding loop. You can also see that the result of the xor is placed at the address pointed to by ecx, which means we can look at that memory with x @ecx. Single stepping a loop will get old eventually, so use aecu (continue until address) to run to the end of the function. Don’t copy and paste the address like a Windows user; instead, scroll to the address you want and run aecu $$. In r2 world, the $$ value is equivalent to the current address. Once you’re at the end of the function, you can see the fully decoded string with another x @ecx.

    :> x @ecx
    - offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
    0x00177fe4  7369 6e6b 5f74 6865 5f74 616e 6b65 722e  sink_the_tanker.
    0x00177ff4  7662 7300 b299 57bb 0080 1700 0000 0000  vbs...W.........
    0x00178004  0000 0000 0000 0000 0000 0000 0000 0000  ................
    ...
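The xor loop can be sanity-checked outside of r2. XOR is its own inverse, so combining encoded bytes with the matching decoded text gives back the key bytes that were used. The Python sketch below assumes the four dwords stored at var_18h through var_ch are the start of the encoded data for this buffer (an assumption; the disassembly excerpt does not show which buffer the first call consumes) and XORs them with the first 16 characters of the dump. ENCODED_NAME, DECODED and xor_bytes are names invented for this example.

```python
import struct

# Immediates stored at var_18h..var_ch in the earlier disassembly.
ENCODED_NAME = [0x3B020E38, 0x341B3B19, 0x3E230C1B, 0x42110833]
# First 16 bytes of the decoded buffer shown in the hexdump.
DECODED = b"sink_the_tanker."


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two byte strings position by position (stops at the shorter one)."""
    return bytes(a ^ b for a, b in zip(left, right))


encoded = struct.pack("<4I", *ENCODED_NAME)  # little-endian, as stored on the stack
key_prefix = xor_bytes(encoded, DECODED)
print(key_prefix)  # b'KglPFOsQDxBPXmcl'
```

The result is 16 printable ASCII letters rather than noise, which is consistent with the guess that these constants hold the encoded text and that the key is itself a text string.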

It looks like we decoded the file name. Now, carefully single step until you return from the function–but be careful! Don’t step any more. You should see you are about to execute the following instructions.

    │ 0x00401150 53 push ebx
    │ 0x00401151 6880000000 push 0x80
    │ 0x00401156 6a02 push 2
    │ 0x00401158 53 push ebx
    │ 0x00401159 6a02 push 2
    │ 0x0040115b 6800000040 push 0x40000000
    │ 0x00401160 ffb5c8feffff push dword [var_138h]
    │ 0x00401166 ff150cd04000 call dword [sym.imp._CreateFileA] ;
    │ 0x0040116c 8d8dc8feffff lea ecx, [var_138h]

This is a 32-bit x86 program, so arguments are passed on the stack. These instructions are preparing the arguments for the CreateFileA call. Because CreateFileA uses the stdcall convention, it is expected to pop its own arguments and leave the stack as it was before the pushes when it returns. ESIL does not have a built-in Windows kernel, so it can’t create a file and return a handle to it. We are just going to skip the call by putting the instruction pointer at the first instruction after the call instruction. This is not a new command; we just scroll to the desired instruction and run the aeip command.

We also have to fix the stack as if we did return from the function. We can do this by manipulating the esp register with ESIL code using the ae command. That takes math, though, and I would probably just off-by-one it or something.

So instead, let’s skip the function call and all the argument pushing. If we never put the arguments on the stack, we won’t have to fix them later. So don’t execute the push ebx instruction. Instead, scroll until you get to the first instruction after the CreateFileA call and just set this as the current instruction with aeip. Now we can continue emulating until the next Windows API call. It turns out we don’t have to go that far.
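For reference, the math being avoided here is small. If the pushes had been emulated and only the call skipped, esp would need to move back up by four bytes per pushed argument, because a stdcall callee such as CreateFileA removes its own arguments on return. A sketch in Python, where stack_fixup and DWORD_SIZE are names invented for this example:

```python
DWORD_SIZE = 4  # every push in 32-bit code moves esp by four bytes


def stack_fixup(pushed_args: int) -> int:
    """Bytes to add to esp to undo `pushed_args` pushes, as a stdcall callee would on return."""
    return pushed_args * DWORD_SIZE


# Seven pushes precede the CreateFileA call in the listing above.
fixup = stack_fixup(7)
print(hex(fixup))  # 0x1c
```

In ESIL terms that would be an expression along the lines of 0x1c,esp,+= passed to ae, though that is untested here. Skipping the pushes entirely sidesteps the question.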

I use aecu again to get to the next function call quickly. Step into the function and we can quickly see it is another xor decoder. Again, it is decoding something to the address in ecx. It looks like gibberish, though… Did we mess up our stack?

This had me second guessing myself for a bit. If you did mess up, you can remove the current VM memory with aeim- and add a new one again with aeim and start over. However, rest assured. Everything is good. This is decoding something that will be used to decode the final string. Step out of this function and into the next function.

This is the last function we need. It is another xor decoder, and we actually have seen it before. This will decode to ecx again. You can step through the loop and watch the string slowly appear. Once you get to the end of the loop, you can print that memory as a string with the GDB-like x/ commands. So x/1s @ecx will show the following:

    :> x/1s @ecx
    0x00177fa4 MsgBox("Congrats! Your key is: C0rruptGarbag3@flare-on.com")

Woo! We got it!

Conclusion

I am admittedly biased towards r2, and ESIL is the only emulator I have used this way. That said, I can’t help but feel that this is pretty cool. You could do all this with any other emulator, but you would have to give that emulator the same information we gave r2. You would have to tell the emulator where to start executing and what instructions to skip (or how to adjust the stack). Likely, an emulator won’t be able to actually write a file, and so it won’t be obvious where the flag actually ends up. You would have to tell it to keep an eye on where ecx is pointed during decoding loops. Having an emulator built into r2 makes all this trivial. You can figure it all out as you go.

The post Solving Garbage with Radare2 appeared first on Hurricane Labs.