
Hacking - Best OF Reverse Engineering - Part11

How to Identify and Bypass Anti-reversing Techniques

Learn the anti-reversing techniques used by malware authors to thwart the detection and analysis of their precious malware. Find out about the premier shareware debugging tool Ollydbg and how it can help you bypass these anti-reversing techniques.

This article aims to look at anti-reversing techniques used in the wild. These are tricks used by malware authors to stop or impede reverse engineers from analysing their files. As an entry level article we will look at:

• Setting up a safe analysis environment

• Ollydbg, an x86 debugger

• Basic techniques such as:

    • Verification of dropped location

    • Anti-debugger

    • Obfuscation of strings

    • Hiding APIs

    • Anti-Virtualisation

We will look at the code as written by the malware authors in C++. We will compare this code to the disassembly shown in Ollydbg. Ollydbg is the x86 debugger of choice for reverse engineers. We will look at the different techniques and possible improvements. We will also find out how to bypass each technique using Ollydbg. Finally, I have written a small ‘Reverse_Me.exe’ that contains all of these techniques so you can practice your newly gained malware smashing expertise.

Analysis Environment

First off we need an analysis environment. The ‘Reverse_Me.exe’ I have provided is not malicious. It is, however, good practice to only analyse files in a safe environment. Ideally, all your analysis would occur on a second computer which is not connected to any network. Typically, this analysis computer would run an operating system other than Windows. This machine hosts multiple virtual machines (Win XP, Win7, Server 2008) and samples are transferred by ‘sneakernet’ (removable media carried by hand). Typically, the samples would be stored in password-protected zip files. Having different host and guest operating systems reduces the chances of malware propagating.

A quicker way to get started is to use a virtual machine and ensure that all shares are read-only. Disable all network connections before performing any analysis. It’s not perfect, but if you are mindful it should be adequate to get you started. Start by downloading your virtualisation environment of choice: VMware, VirtualBox, Windows Hypervisor, etc. (I have used a VMware detection in the anti-virtualisation layer of the Reverse_Me sample.) It is common for antimalware engineers to use Windows XP SP2 as an analysis machine, the idea being that this version of Windows has weaker security, so malware has a better chance of running. That said, Windows 7 is perfectly adequate; I have done testing on both. After installing any required tools, take a snapshot so you can jump back to this point; this will save you having to remove the malware from your machine. Your environment is now set up, so let us look at the tools.

Tools

For tools, I will try to limit it to just one: ‘Ollydbg.’ Ollydbg is a debugger just like the debugger in your compiler, but it can run without source code. It does this by converting the machine code into assembler so that it is human readable. It also gives us the ability to view and edit the assembler code as well as the values in the registers and on the stack and heap. Ollydbg has some very powerful plugins that can help you bypass many of the techniques I will mention. These plugins are outside the scope of this article, but please feel free to investigate them yourself. Ollydbg is shareware, but the author, Oleh Yuschuk, does ask you to register with him if you use it frequently or commercially: http://www.ollydbg.de/register.txt. Version 2 of Ollydbg is available but it is still in beta, so we are going to use V1.1 for this article. Please download it from http://www.ollydbg.de/.

I am also going to use Hiew, a hex editor written by Eugene Suslikov, mainly to show parts of the PE file structure. You don’t need it to get through this article, but a demo version of Hiew is available on his website http://www.hiew.ru/. If you get serious about reversing, Hiew is a must-have tool.

Microsoft Visual Studio 2010

I used Visual Studio 2010 to compile the “reverse me” sample. If you do not have it installed on your analysis machine you will require the following DLLs to run the binary: http://www.microsoft.com/en-us/download/details.aspx?id=5555.

Getting started with Ollydbg

Download Ollydbg and unzip it into its own directory. It does not need to be installed. When you open Ollydbg for the first time you will more than likely be met by the warning in Figure 1. Using the menus at the top of the window navigate to Options->Appearance->Directories and point it to the directory that you just dropped Ollydbg into.













When you open a file in Ollydbg you will see four panes in the window.

• Top-Left – Disassembler Pane

• Top-Right – Registers and Flags Pane

• Bottom-Left – Hex Dump Pane

• Bottom-Right – Stack Pane

We are mainly going to use the disassembler pane. We will use the registers and flags pane to manipulate jumps and to see the values in the registers. We will not use the dump and stack panes at this stage.

We are going to use shortcut keys for speed; the following shortcuts are all you should need:

• F2 Toggle breakpoint

• F7 Step into

• F8 Step over

• F9 Run continually

• Ctrl-G Go to a virtual address

We are mainly going to use strings to navigate, for simplicity. If you right click on the disassembler pane and select ‘Search For’ -> ‘All referenced Text Strings’ (Figure 2), you will see the strings of each layer; just double click on the required layer to get to its location in code. In the top left-hand corner of the main window you will see something like “CPU – main thread, module <module_name>”; this tells you the module you are currently running in. When you open the ‘Reverse_Me’ in Ollydbg it may start in the ntdll module; just press F9 and it will go to the entry point of the ‘Reverse_Me’. The first instruction in the ‘Reverse_Me’ sample is a call.


















The Binary

Just a short preamble: malware usually consists of layers. Typically, the outermost layer is a packer of some sort (UPX, Aspack, etc.). I have not added a packer to this Reverse_Me.exe. Although most packers are easy to add and not hard to bypass, I think one would overly complicate the binary for such a short article. I have tried to make all the layers very easy to identify by putting in lots of strings that you can search for. I have not encrypted each layer, as would be typical of a “Reverse_Me” puzzle; this is to help you navigate through the binary. It does leave you free to jump to the final layer and skip the rest. The virtual addresses in the article may not correspond to the ones on your machine, so please use the strings. I have displayed some of the strings in Figure 3. You will have to press <Enter> before each layer initiates. This may be a pain but it will help you to be systematic in your steps.















Layer 1: Verification of dropped location

A lot of malware will drop executables onto your system. I frequently see ‘dll’ files dropped into the ‘C:\Windows\system32’ directory. Some malware will confirm its location before it will run. The anti-malware engineer is probably going to analyse the file in a directory like C:\Infected\<current_date>, so this basic trick can be effective against simple dynamic analysis. We will see later how to obfuscate strings, which would make this technique even harder to detect by hiding the word “Temp.”

Listing 1. Verification of dropped location

#include <direct.h>   // _getcwd
#include <stdio.h>    // printf
#include <stdlib.h>   // exit

void First_challenge()
{
    char buf[255];
    char buf_temp[] = {'T','e','m','p'};
    // _getcwd gets the current working directory
    _getcwd(buf, 255);
    bool Program_Running_In_Temp_Folder = true;
    // we are starting at 3 to avoid the drive prefix, e.g. "C:\"
    for (int temp = 3; temp < 7; temp++)
    {
        if (buf[temp] != buf_temp[temp-3])
            Program_Running_In_Temp_Folder = false;
    }
    if (Program_Running_In_Temp_Folder)
        printf("Well done first layer passed");
    else
    {
        printf("Sorry not this time, you are in the wrong directory");
        exit(0); // only exit on failure so the next layer can run
    }
}

Layer 1: The C++ code

Listing 1 is a short function that checks that the file is running from a directory called Temp.

The corresponding assembler code as produced by Ollydbg is in Figure 4. As this may be your first time seeing assembler, I will walk you through the code. The first point to identify is the call to _getcwd, which gets the current working directory. The next few lines compare the characters in the path to the hex bytes 0x54, 0x65, 0x6D, 0x70. If you pull up an ASCII table from the web you will find that these hex bytes correspond to the string ‘Temp.’ The final two jumps in Figure 4 can redirect you away from “Well done first layer passed.” This will happen if any of the hex bytes that represent ‘Temp’ do not match the path supplied by _getcwd.
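The byte comparison described above can be reproduced outside the debugger. The following is a minimal, portable sketch and not the code inside Reverse_Me.exe: the names starts_in_temp and demo_starts_in_temp are hypothetical, and the path is passed in as a parameter instead of coming from _getcwd so that the example compiles on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: mirrors the check in Listing 1 by comparing the four
// characters after the drive prefix ("C:\") with 0x54 0x65 0x6D 0x70 ("Temp").
bool starts_in_temp(const std::string& cwd)
{
    const unsigned char expected[4] = {0x54, 0x65, 0x6D, 0x70}; // 'T','e','m','p'
    const std::size_t offset = 3;                               // skip "C:\"
    if (cwd.size() < offset + 4)
        return false;                                           // path too short
    for (std::size_t i = 0; i < 4; ++i)
    {
        if (static_cast<unsigned char>(cwd[offset + i]) != expected[i])
            return false;                                       // a byte differs
    }
    return true;
}

// Usage sketch with made-up paths.
void demo_starts_in_temp()
{
    assert(starts_in_temp("C:\\Temp\\sample"));
    assert(!starts_in_temp("C:\\Infected\\sample"));
}
```

Note that, like Listing 1, this check is case-sensitive and only looks at four characters, so a directory such as C:\Temporary would also pass.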













Locate and set a breakpoint (F2) on the line with JNZ (jump if not zero). If you press F9 it will run to that breakpoint. Now look at the top right of your screen and you should see a set of flags like those in Figure 5, the registers and flags pane. Locate the flag Z and click it. This will toggle the jump. Click it again. You should be able to see a small arrow showing you where the jump will terminate. By toggling the jump you can ensure that it will not jump but fall through to ‘TEST AL,AL’. Repeat the flag manipulation on the next jump at JE (jump if equal) to ensure you are directed to the “Well done first layer passed”. This technique of manipulating the jump can be used throughout the binary to jump to your chosen branch.
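To see why flipping the Z flag redirects execution, it can help to model the zero flag in a few lines of C++. This is a deliberately simplified sketch: the type Flags and the functions cmp_op, test_op, jnz_taken, je_taken and toggle_zf are hypothetical names, and only the zero flag is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the x86 zero flag (ZF), for illustration only.
struct Flags { bool zf; };

// CMP a,b subtracts b from a and sets ZF when the result is zero (a == b).
Flags cmp_op(std::uint8_t a, std::uint8_t b) { return Flags{a == b}; }

// TEST a,a ANDs a with itself and sets ZF when the result is zero (a == 0).
Flags test_op(std::uint8_t a) { return Flags{(a & a) == 0}; }

// JNZ is taken when ZF is clear; JE is taken when ZF is set.
bool jnz_taken(Flags f) { return !f.zf; }
bool je_taken(Flags f)  { return f.zf; }

// Clicking the Z flag in the registers pane flips ZF, which inverts both jumps.
Flags toggle_zf(Flags f) { return Flags{!f.zf}; }

// Usage sketch: a mismatching path byte would take the JNZ until ZF is flipped.
void demo_zero_flag()
{
    Flags mismatch = cmp_op(0x58, 0x54);        // 'X' compared with 'T'
    assert(jnz_taken(mismatch));                // jump to the failure branch
    assert(!jnz_taken(toggle_zf(mismatch)));    // after the toggle: fall through
}
```

The same reasoning applies to every conditional jump in the sample that depends on ZF.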


















Layer 2: Anti-debugger

Anti-debugging techniques are used by a program to detect whether it is running under the control of a debugger. The aim is to impede the process of reverse-engineering. There are a lot of anti-debugger tricks; we will just show you the most basic. It is based around the following Windows function (Listing 2). It is simply an ‘if statement’, as you can see in Listing 3.

Listing 2. IsDebuggerPresent API

BOOL WINAPI IsDebuggerPresent(void);

Listing 3. IsDebuggerPresent ‘if statement’

void Second_challenge()
{
    if( IsDebuggerPresent() )
    {
        printf("Running in a debugger");
        exit (0);
    }
    else
    {
        printf("Not running in a debugger");
    }
}

The assembler code is available in Figure 6. It calls the IsDebuggerPresent API and, based on its response, either jumps to the “Not running in a debugger” printf or continues on to the printf which is passed “Running in a debugger”, after which the program exits. After a debug trick you will normally see a crash or exit, the idea being that the analyst will think the file is benign or corrupt. To bypass this trick we are again going to use the zero flag as shown in the previous example. If we set the zero flag to 1 we will jump to the “Not running in a debugger” branch and continue to the next layer.





























Layer 3: Obfuscation of strings and hiding APIs

I have put these two topics together as they are intrinsically linked. Windows executable files follow a structure called the PE (Portable Executable) file structure. This structure tells Windows how to load the executable into memory and what bit of code to run first, among other things. Without going into too much detail, the PE structure has many tables, one of which holds imports. This table is called the import table and contains all the APIs that are called by the executable. As a reverse engineer this is a very good place to start. It will give you a good idea of what the program is going to do. If you see loads of networking APIs in a program that claims to be a calculator, it would raise your suspicions. Figure 7 shows part of the import table displayed by the excellent tool Hiew. In the table you can see APIs that we have used already, e.g. IsDebuggerPresent. You will not see CreateFileA. Please notice two important APIs, LoadLibrary and GetProcAddress, as these two APIs give us the ability to load any API.

Layer 3: GetProcAddress

‘GetProcAddress’ is essentially a wild card. You can use ‘GetProcAddress’ to get the address needed to call any other API. There is a catch: you must pass the name of the API you require to ‘GetProcAddress’. That would mean that although the API is not visible in the import table, it will be glaringly obvious in a string dump of the file. So, a malware author will typically obfuscate the strings in the binary and then pass them to a deobfuscation routine. The deobfuscation routine will pass the cleartext API names to ‘GetProcAddress’ to get the location of the API. So, between the obfuscation of the strings and the use of ‘GetProcAddress’, they can hide the APIs they are calling.

Layer 3: String Obfuscation

If you run a strings dump on the binary you will see something like Figure 3. If you scroll down through the strings in Hiew or another tool you will not see the following strings although they are used in the next function

• ‘Kernel32’

• ‘CreateFileA’

• <A secret code to pass layer 3>
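To make it concrete why these strings go missing, here is a rough sketch of what a strings dump does: it collects runs of printable ASCII of a minimum length. The function names and the sample bytes are hypothetical and are not taken from Reverse_Me.exe. The sample contains one contiguous string followed by two instruction encodings of the form "mov byte ptr [ebp-N], imm8", where each character is separated from the next by opcode bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: collect runs of printable ASCII at least min_len long,
// which is roughly what a strings dump does.
std::vector<std::string> extract_strings(const std::vector<std::uint8_t>& data,
                                         std::size_t min_len)
{
    std::vector<std::string> found;
    std::string run;
    for (std::uint8_t byte : data)
    {
        if (byte >= 0x20 && byte <= 0x7E)
        {
            run.push_back(static_cast<char>(byte));   // extend the current run
        }
        else
        {
            if (run.size() >= min_len)
                found.push_back(run);                 // keep runs that are long enough
            run.clear();
        }
    }
    if (run.size() >= min_len)                        // flush a run that ends the data
        found.push_back(run);
    return found;
}

// Illustrative bytes: "Temp", a NUL, then C6 45 F4 4B and C6 45 F5 65,
// i.e. two byte-sized stores of 'K' and 'e' to stack slots.
std::vector<std::uint8_t> sample_bytes()
{
    return {0x54, 0x65, 0x6D, 0x70, 0x00,
            0xC6, 0x45, 0xF4, 0x4B,
            0xC6, 0x45, 0xF5, 0x65};
}

// Usage sketch: only the contiguous string survives a minimum length of 4.
void demo_extract_strings()
{
    std::vector<std::string> found = extract_strings(sample_bytes(), 4);
    assert(found.size() == 1 && found[0] == "Temp");
}
```

The characters stored one at a time are still in the file, but never as a run long enough for a strings tool to report.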

I used three types of obfuscation to hide the above strings. The first two are very similar and are really just to subvert a string search of the binary. When you see the C++ code they will look very easy to see through. When you view the assembler code it will be slightly more difficult. The first is a method where you push values into an array and then convert the array to a string; see Listing 4.

Listing 4. Character Buffer to String Obfuscation, pushed in order

LPCWSTR get_Kernel32_string()
{
    char buffer_Kernel32[9];
    buffer_Kernel32[0] = 'K';
    buffer_Kernel32[1] = 'e';
    buffer_Kernel32[2] = 'r';
    buffer_Kernel32[3] = 'n';
    buffer_Kernel32[4] = 'e';
    buffer_Kernel32[5] = 'l';
    buffer_Kernel32[6] = '3';
    buffer_Kernel32[7] = '2';
    buffer_Kernel32[8] = '\0';

    //The following is code to convert the char buffer into a LPCWSTR
    size_t newsize = strlen(buffer_Kernel32) + 1;
    wchar_t * wcstring = new wchar_t[newsize];
    size_t convertedChars = 0;
    mbstowcs_s(&convertedChars, wcstring, newsize, buffer_Kernel32, _TRUNCATE);
    return wcstring;
}

Let’s look at the same code in assembler; it’s a lot more difficult to find. Pull out your ASCII table again. If you look at the cluster of four mov instructions highlighted below, you will see the two DWORDs are moved onto the stack. If you translate these hex bytes into ASCII and change the byte order you will see ‘Kernel32.’ So, this simple method is very effective at obfuscating strings (Figure 8).
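The byte-order step can be sketched in code. The helper name dwords_to_string is hypothetical, and the two DWORD values used in the usage sketch are derived from an ASCII table for the string ‘Kernel32’ under the assumption that the compiler stores it four characters at a time; they are not copied from Figure 8.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: rebuild a string from DWORD immediates as they appear
// in MOV instructions. x86 is little-endian, so the lowest byte of each DWORD
// is the first character in memory.
std::string dwords_to_string(const std::vector<std::uint32_t>& dwords)
{
    std::string out;
    for (std::uint32_t value : dwords)
    {
        for (int shift = 0; shift < 32; shift += 8)
        {
            char c = static_cast<char>((value >> shift) & 0xFF);
            if (c == '\0')
                return out;          // stop at the terminator
            out.push_back(c);
        }
    }
    return out;
}

// Usage sketch: 0x6E72654B reads "nreK" in the disassembly but is stored as
// the bytes 4B 65 72 6E, which is "Kern" in memory.
void demo_dwords_to_string()
{
    assert(dwords_to_string({0x6E72654B, 0x32336C65}) == "Kernel32");
}
```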



















The second type of obfuscation is very similar. It uses the same technique but goes a step further. It does not add the characters to the array in order. For longer strings this can make the reverse engineer’s job very tough. Let’s have a look at the C++ code in Listing 5.

Listing 5. Character Buffer to String Obfuscation, unordered

LPCSTR get_CreateFileA_string()
{
    char * buffer_CreateFileA = new char[12];
    buffer_CreateFileA[1] = 'r'; //0x72
    buffer_CreateFileA[2] = 'e'; //0x65
    buffer_CreateFileA[3] = 'a'; //0x61
    buffer_CreateFileA[8] = 'l'; //0x6c
    buffer_CreateFileA[6] = 'F'; //0x46
    buffer_CreateFileA[7] = 'i'; //0x69
    buffer_CreateFileA[4] = 't'; //0x74
    buffer_CreateFileA[0] = 'C'; //0x43
    buffer_CreateFileA[9] = 'e'; //0x65
    buffer_CreateFileA[5] = 'e'; //0x65
    buffer_CreateFileA[10] = 'A'; //0x41
    buffer_CreateFileA[11] = '\0';
    return (LPCSTR)buffer_CreateFileA;
}

As you can see, the values are not pushed in order. If you read the characters in the order they are written you get ‘realFitCeeA’! It is not a huge leap to get ‘CreateFileA’ from this, but this method is surprisingly effective. Figure 9 shows how it looks in assembler:













The block of ‘mov’ instructions builds the string. As you can see, it is much harder to pull out CreateFileA from this code. It is a very simple and effective obfuscation technique. The API name is built at the address held in the ESI register and then passed to GetProcAddress. So, a good option is to put a breakpoint on all GetProcAddress calls. By looking at the stack you can see what is being passed into the function. This will give you a more complete picture of the APIs that are being called.
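If you prefer to recover such a string statically, you can note the offset and byte of each mov and replay the writes in offset order. The sketch below does this for the writes in Listing 5; the type alias Write and the function names are hypothetical, and the offsets come from the C++ source, not from Figure 9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one store: (offset into the buffer, byte written).
using Write = std::pair<std::size_t, std::uint8_t>;

// Replay the stores into a buffer so that their order no longer matters.
std::string rebuild_string(const std::vector<Write>& writes)
{
    std::size_t length = 0;
    for (const Write& w : writes)
        if (w.first + 1 > length)
            length = w.first + 1;              // buffer must cover the highest offset

    std::string buffer(length, '?');           // '?' marks offsets never written
    for (const Write& w : writes)
        buffer[w.first] = static_cast<char>(w.second);

    std::size_t end = buffer.find('\0');       // cut at the terminator, if any
    if (end != std::string::npos)
        buffer.resize(end);
    return buffer;
}

// The stores from Listing 5, in the order they appear in the source.
std::vector<Write> listing5_writes()
{
    return {{1, 0x72}, {2, 0x65}, {3, 0x61}, {8, 0x6C}, {6, 0x46}, {7, 0x69},
            {4, 0x74}, {0, 0x43}, {9, 0x65}, {5, 0x65}, {10, 0x41}, {11, 0x00}};
}

// Usage sketch.
void demo_rebuild_string()
{
    assert(rebuild_string(listing5_writes()) == "CreateFileA");
}
```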

The final type of obfuscation we are going to look at is called Exclusive OR (Xor for short). Xor is very popular with malware authors. It is a very basic type of ‘encryption’. I hesitate to even use the word encryption, as the technique is more like polarization: one pass encrypts the string and a second pass with the same key decrypts it. It is very lightweight and fast. It is also very easy to break.
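Both properties, the round trip and the weakness, can be shown in a short sketch. With a single-byte key there are only 256 possibilities, so an analyst can try them all and keep the keys whose output contains a string they expect to find (a ‘crib’). The function names, the key 0x5A and the crib “File” are all illustrative choices and are not taken from Reverse_Me.exe.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// XOR every byte with the key; applying it twice restores the input.
std::string xor_with_key(const std::string& input, std::uint8_t key)
{
    std::string out = input;
    for (char& c : out)
        c = static_cast<char>(static_cast<std::uint8_t>(c) ^ key);
    return out;
}

// Hypothetical brute force: try all 256 keys and keep those whose output
// contains the expected text.
std::vector<int> find_keys(const std::string& ciphertext, const std::string& crib)
{
    std::vector<int> keys;
    for (int key = 0; key < 256; ++key)
    {
        std::string candidate =
            xor_with_key(ciphertext, static_cast<std::uint8_t>(key));
        if (candidate.find(crib) != std::string::npos)
            keys.push_back(key);
    }
    return keys;
}

// Usage sketch with a made-up key.
void demo_xor()
{
    std::string hidden = xor_with_key("CreateFileA", 0x5A);
    assert(xor_with_key(hidden, 0x5A) == "CreateFileA");   // second pass decrypts
    assert(find_keys(hidden, "File").size() == 1);         // only one key fits
}
```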

Listing 6. Secret Code Buffer, (ciphertext) Xored with 0xFA to produce plaintext

    unsigned char buffer_SecretCode[24] = {0xae, 0x92, 0x93, 0x89, 0xda, 0x93,
        0x89, 0xda, 0x8e, 0x92, 0x9f, 0xda, 0xa9, 0x9f, 0x99, 0x88, 0x9f, 0x8e,
        0xda, 0xb9, 0x95, 0x9e, 0x9f};

    for ( int i = 0; i < sizeof(buffer_SecretCode); i++ )
        buffer_SecretCode[i] ^= 0xFA;

I copied the string I wanted to hide into a buffer. I ran the code once and it created the ciphertext. I then placed this ciphertext into the original buffer so that the next run would produce the plaintext. I only used a single-byte key; malware may use longer keys. The C++ code to build the buffer containing the ciphertext, followed by the decryption loop, is shown in Listing 6.














Let’s have a look at the assembler code (Figure 10). We can see the buffer being loaded with the hex bytes as before. Marked in the figure is where each byte of the ciphertext is xored with 0xFA. After the Xor you can see INC EAX and CMP EAX, 18 followed by a jump.

This is the ‘for loop’, which iterates 0x18 (24) times, the size of the secret-code buffer, before it continues. JB stands for ‘jump if below,’ so the jump is taken until every byte of the buffer has been decrypted. The result is later compared against the contents of the text file. If they match the layer is passed, or you could manipulate a jump or two.
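Since the ciphertext and the key are both known, the loop can be replayed outside the debugger. The sketch below copies the bytes from Listing 6; the function name decode_secret_code is hypothetical. One detail worth noticing: the array has 24 entries but only 23 initialisers, so the last byte starts as zero and becomes 0xFA after the Xor instead of acting as a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode the Listing 6 buffer with the single-byte key 0xFA.
std::string decode_secret_code()
{
    // 23 ciphertext bytes from Listing 6; the 24th entry is zero-initialised.
    unsigned char buffer[24] = {0xae, 0x92, 0x93, 0x89, 0xda, 0x93,
        0x89, 0xda, 0x8e, 0x92, 0x9f, 0xda, 0xa9, 0x9f, 0x99, 0x88, 0x9f, 0x8e,
        0xda, 0xb9, 0x95, 0x9e, 0x9f};
    const std::size_t ciphertext_len = 23;
    assert(ciphertext_len < sizeof(buffer));

    // Same loop as Listing 6; sizeof(buffer) is the 0x18 seen in CMP EAX, 18.
    for (std::size_t i = 0; i < sizeof(buffer); ++i)
        buffer[i] ^= 0xFA;

    // The padding byte is now 0x00 ^ 0xFA = 0xFA, so build the string from
    // the 23 real bytes only.
    return std::string(reinterpret_cast<const char*>(buffer), ciphertext_len);
}
```

Printing the result of decode_secret_code() gives the text that the layer expects to find in the file.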

Layer 3: LoadLibrary and GetProcAddress

To bypass this layer you are going to need to create a file at “c:\temp\mytestfile.txt”. This file will need to contain the ‘Secret code’ that is Xored in Figure 10. The C++ code below will open and read this file. It will then compare the contents to the secret code. We are not calling CreateFileA as we normally would. We are using GetProcAddress to locate it within the Kernel32 DLL. Next, we dynamically call the CreateFileA export with the correct parameters. We are doing all this so as to hide CreateFileA from both the import table and a string dump. Listing 7 shows the code used, with comments for clarification.

Listing 7. Calling CreateFileA dynamically using GetProcAddress and LoadLibrary

HANDLE hFile;
HANDLE hAppend;
DWORD dwBytesRead, dwBytesWritten, dwPos;
LPCSTR fname = "c:\\temp\\mytestfile.txt";
char buff[25];
//Get deobfuscated Kernel32 and CreateFileA strings
LPCWSTR DLL = get_Kernel32_string();
LPCSTR PROC = get_CreateFileA_string();

FARPROC Proc;
HINSTANCE hDLL;
//Get Kernel32 handle (the wide-character variant, because DLL is an LPCWSTR)
hDLL = LoadLibraryW(DLL);
//Get CreateFileA export address
Proc = GetProcAddress(hDLL,PROC);

//Creating Dummy function header
typedef HANDLE (__stdcall *GETADAPTORSFUNC)(LPCSTR, DWORD, DWORD, LPSECURITY_ATTRIBUTES,DWORD, DWORD, HANDLE);
GETADAPTORSFUNC fpGetProcAddress;

fpGetProcAddress = (GETADAPTORSFUNC)GetProcAddress(hDLL, PROC);
//Dynamically call CreateFileA
hFile = fpGetProcAddress(fname, GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

//fname is a narrow string, so it is printed with %s
if(hFile == INVALID_HANDLE_VALUE)
    printf("Could not open %s\n", fname);
else
    printf("Opened %s successfully.\n", fname);
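Listing 7 stops once the file is open; the comparison against the secret code is not shown in the article. As a rough idea of what that step involves, here is a portable sketch that reads from a standard stream instead of a Windows file handle. The function names are hypothetical, the string "placeholder secret" is a stand-in for the real code, and the assumption that only the first bytes of the file are compared may not match what Reverse_Me.exe actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read as many bytes as the secret is long and compare.
bool stream_holds_secret(std::istream& in, const std::string& secret)
{
    std::string contents(secret.size(), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(contents.size()));
    std::size_t got = static_cast<std::size_t>(in.gcount());
    return got == secret.size() && contents == secret;   // short reads fail
}

// Convenience wrapper so the check can be tried on in-memory text.
bool text_holds_secret(const std::string& text, const std::string& secret)
{
    std::istringstream in(text);
    return stream_holds_secret(in, secret);
}

// Usage sketch with a placeholder secret; a trailing newline is ignored
// because only the first secret.size() bytes are compared.
void demo_secret_compare()
{
    assert(text_holds_secret("placeholder secret\r\n", "placeholder secret"));
    assert(!text_holds_secret("wrong", "placeholder secret"));
}
```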

Layer 4: Anti-Virtualisation

The final layer uses anti-virtualisation. We will look at detecting VMware. Intel x86 provides two instructions to allow you to carry out I/O operations: the “IN” and “OUT” instructions. VMware uses the “IN” instruction to read from a port that does not really exist. If you access that port inside VMware you will not get an exception. If you access it on a normal machine it will cause an exception. The detection is based on this anomaly. To perform the test you load 0x0A into the ECX register and you put the magic value of 0x564D5868 (‘VMXh’) in the EAX register. Then you read a DWORD from port 0x5658 (‘VX’). If an exception is raised, you are not in VMware.
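The magic number and the port are simply ASCII text packed into integers, which is why they are easy to recognise once you know them. The short sketch below checks that correspondence; the helper name pack_ascii is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: pack up to four ASCII characters into an integer with
// the first character in the most significant position.
std::uint32_t pack_ascii(const std::string& text)
{
    assert(text.size() <= 4);          // a 32-bit register holds four bytes
    std::uint32_t value = 0;
    for (unsigned char c : text)
        value = (value << 8) | c;
    return value;
}

// Usage sketch: the values quoted in the text.
void demo_pack_ascii()
{
    assert(pack_ascii("VMXh") == 0x564D5868u);   // magic number loaded into EAX
    assert(pack_ascii("VX") == 0x5658u);         // port number loaded into DX
}
```

When searching a raw file in a hex editor, remember that a 32-bit immediate is stored little-endian, so 0x564D5868 appears as the byte sequence 68 58 4D 56.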

Listing 8. VMWare detection function

bool IsInsideVMWare()
{
    bool rc = true;
    printf("Just going to test if you are running in VMWARE:\n");
    __try
    {
      __asm
      {
        push edx
        push ecx
        push ebx
        mov eax, 'VMXh' // The Magic Number
        mov ebx, 0
        mov ecx, 10
        mov edx, 'VX' // The port
        in eax, dx // The IN Instruction
        cmp ebx, 'VMXh' // Check if ebx contains the magic number
        setz [rc] // set return value
        pop ebx
        pop ecx
        pop edx
      }
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        rc = false;
    }
    return rc;
}

A good way to look for this trick is to search for the magic number 0x564D5868. In my code you can search for the string “Just going to test if you are running in VMWARE:\n”. I have not displayed the assembler code as seen in Ollydbg as it is identical to the inline assembly in Listing 8. Just after this code there is a jump instruction you can manipulate to bypass this detection. One last bit of advice: you may see ‘Privileged instruction – use Shift +F7/F8/F9 to pass exception to program’. If you press Shift + F9 it will continue past the exception.
