Reversing with Stack-Overflow and Exploitation
The field in which information security professionals work has changed drastically as computing has spread through the digital world. So we are going to look for the root of the problem. The key to securing a business is a complete analysis of its internals.
The prevalence of security holes in programs and protocols, the increasing size and complexity of the internet, and the sensitivity of the information stored throughout it have created a target-rich environment for the next generation of adversaries. The criminal element is applying advanced techniques to evade software and tool security, so knowledge of analysis is necessary. That discipline is called “The Art of Reverse Engineering”.
What is Reverse Engineering?
Reverse engineering is the process of taking a compiled binary and attempting to recreate (or simply understand) the way the original program works. A programmer initially writes a program, usually in a high-level language such as C++ or Visual Basic (or God forbid, Delphi). Because the computer does not inherently speak these languages, the code that the programmer wrote is compiled into a more machine-specific format, one that the computer does speak. This code is called, originally enough, machine language. It is not very human friendly, and it often requires a great deal of brain power to figure out exactly what the programmer had in mind.
Why Should You Know?
• Military or commercial espionage. Learning about an enemy’s or competitor’s latest research by stealing or capturing a prototype and dismantling it. It may result in the development of a similar product.
• Improve documentation shortcomings. Reverse engineering can be done when the documentation of a system's design, production, operation or maintenance has shortcomings and the original designers are not available to improve it. RE of software can provide the most current documentation necessary for understanding the most current state of a software system.
• Software modernization. RE is generally needed in order to understand the ‘as is’ state of existing or legacy software, so that the effort required to migrate system knowledge into a ‘to be’ state can be estimated properly. Much of this may be driven by changing functional, compliance or security requirements.
• Product security analysis. To examine how a product works and what the specifications of its components are, to estimate costs, and to identify potential patent infringement.
• Bug fixing. To fix (or sometimes to enhance) legacy software which is no longer supported by its creators.
• Creation of unlicensed/unapproved duplicates.
• Academic/learning purposes. RE for learning purposes may help to understand the key issues of an unsuccessful design and subsequently improve the design.
• Competitive technical intelligence. Understand what your competitor is actually doing, versus what they say they are doing.
What Should You Know?
The Stack: The stack is a piece of the process memory, a data structure that works LIFO (last in, first out). The OS allocates a stack for each thread when the thread is created, and the stack is cleared when the thread ends. The size of the stack is defined when it gets created and doesn’t change. Because it is LIFO and does not require complex management structures or mechanisms, the stack is pretty fast, but limited in size.
LIFO means that the most recently placed data (the result of a PUSH instruction) is the first to be removed from the stack again (by a POP instruction).
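The LIFO behaviour is easy to see with a small Python sketch. It uses an ordinary list as a stand-in for the stack, with append playing the role of PUSH and pop the role of POP. This only models the ordering, not real process memory.

```python
# A Python list used as a stand-in for a LIFO stack.
stack = []

# "PUSH" three values, in this order.
stack.append("first")
stack.append("second")
stack.append("third")

# "POP" them again: the most recently pushed value comes off first.
popped = []
while stack:
    popped.append(stack.pop())

print(popped)  # ['third', 'second', 'first']
```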
Every program is made up of subroutines, or functions, that are called as the program runs.
When a function/subroutine is entered, a stack frame is created. This frame keeps the parameters of the parent procedure together and is used to pass arguments to the subroutine. The current location of the stack can be accessed via the stack pointer (ESP); the current base of the function is contained in the base pointer (EBP), also called the frame pointer.
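To make the idea of a stack frame more concrete, here is a toy model in Python. It is a simulation only, and every detail is an assumption chosen for illustration: the 16-byte buffer size, the two addresses, and the simplified layout (local buffer, then saved EBP, then return address). It shows why copying too much data into a local buffer can reach the saved return address.

```python
import struct

BUFFER_SIZE = 16  # hypothetical size of a local buffer, in bytes


def make_frame(return_address):
    """Build a toy frame: [local buffer][saved EBP][return address]."""
    frame = bytearray(BUFFER_SIZE)              # zero-filled local buffer
    frame += struct.pack("<I", 0x0012FF80)      # made-up saved EBP
    frame += struct.pack("<I", return_address)  # saved return address
    return frame


def unchecked_copy(frame, data):
    """Copy data to the start of the frame without checking BUFFER_SIZE."""
    data = data[:len(frame)]  # keep the toy model inside its own bytes
    frame[0:len(data)] = data


def saved_return_address(frame):
    """Read the 4-byte little-endian return address back out of the frame."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]


frame = make_frame(0x00401234)  # made-up return address
before = saved_return_address(frame)

# 24 bytes of "A" fill the buffer and spill over saved EBP and the return address.
unchecked_copy(frame, b"A" * 24)
after = saved_return_address(frame)

print(hex(before), hex(after))  # 0x401234 0x41414141
```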
The main CPU registers (Intel, x86) are:
• EAX: accumulator: used for performing calculations, and to store return values from function calls. Basic operations such as add, subtract and compare use this general-purpose register.
• EBX: base (does not have anything to do with the base pointer). It has no dedicated purpose and can be used to store data.
• ECX: counter: used for iterations. ECX counts downward.
• EDX: data: this is an extension of the EAX register. It allows for more complex calculations (multiply, divide) by allowing extra data to be stored to facilitate those calculations.
• ESP: stack pointer
• EBP: base pointer
• ESI: source index: holds location of input data
• EDI: destination index: points to location of where result of data operation is stored
• EIP: instruction pointer
The tools listed below are used to walk through and analyze software.
What kinds of tools are used?
There are many different kinds of tools used in reversing. Many are specific to the types of protection that must be overcome to reverse a binary. There are also several that just make the reverser’s life easier. And then some are what I consider the ‘staple’ items, the ones you use regularly. For the most part, the tools fit into a couple of categories:
Disassemblers
Disassemblers attempt to take the machine language codes in the binary and display them in a friendlier format. They also extrapolate data such as function calls, passed variables and text strings. This makes the executable look more like human-readable code as opposed to a bunch of numbers strung together. There are many disassemblers out there, some of them specializing in certain things (such as binaries written in Delphi). Mostly it comes down to the one you're most comfortable with. I invariably find myself working with IDA.
Debuggers
Debuggers are the bread and butter of reverse engineers. They first analyze the binary, much like a disassembler. Debuggers then allow the reverser to step through the code, running one instruction at a time and investigating the results. This is invaluable for discovering how a program works. Finally, some debuggers allow certain instructions in the code to be changed and then run again with these changes in place. Examples of debuggers are WinDbg, Immunity Debugger and OllyDbg. I almost always use Immunity Debugger and OllyDbg.
REAL ATTACK
Before we start: we are using the following vulnerable application, which has a stack-based overflow. We will reverse analyze it and then exploit it.
• Vulnerable application: RM to MP3 Converter
• Box: Windows XP SP2/SP3 (I’m using SP3)
• Tools: OllyDbg, Immunity Debugger
• BackTrack machine, or any machine with Metasploit installed
First of all, create a Python script that writes predefined data into a buffer and saves it as an .m3u file. Open this file in RM to MP3 Converter, and the software will crash because of a stack overflow. The script in the image writes 30,000 bytes of data into an .m3u file, which causes the crash (the buffer overflow) shown in the second image. This is the program (Figure 1).
#!/usr/bin/python
# Write 30,000 "A" characters (0x41) into an .m3u playlist file.
filename = '30000.m3u'
buffer = "\x41" * 30000
file = open(filename, 'w')
file.write(buffer)
file.close()
print("Done!")
Figure 2 shows RM to MP3 Converter crashing on this file.
The Debugger
In order to see the state of the stack (and the value of registers such as the instruction pointer, stack pointer etc.), we need to hook up a debugger to the application, so we can see what happens while the application runs (and especially when it dies).
There are many debuggers available for this purpose. The two debuggers I use most often are OllyDbg and Immunity Debugger (Figure 3 and Figure 4).
The GUI shows the crash information in a graphical way. In the upper left corner, you have the CPU view, which shows assembly instructions and their opcodes (the window is empty because EIP currently points at 41414141 and that’s not a valid address). In the upper right window, you can see the registers. In the lower left corner, you see the memory dump, of 00446000 in this case. In the lower right corner, you can see the contents of the stack (that is, the contents of memory at the location ESP points to).
Anyway, in both cases, we can see that the instruction pointer contains 41414141, which is the hexadecimal representation of AAAA. The position in our buffer where this overwrite happens is called the “offset” value.
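The link between the register value and our input can be checked in a couple of lines of Python. This sketch only converts the number seen in the debugger back into characters; the variable names are made up for illustration.

```python
import struct

eip_value = 0x41414141  # value observed in the instruction pointer

# Pack the 32-bit value as four little-endian bytes, then read them as ASCII.
raw = struct.pack("<I", eip_value)
text = raw.decode("ascii")

print(text)  # AAAA
```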
Checking The EIP Position
• From the result, we know that the ESP and EIP registers are overwritten.
• We don’t know at which position in our data the ESP and EIP registers are overwritten, so we build a structured string with pattern_create.rb to find that position.
BackTrack ships with Metasploit, which provides this tool. So we will use:
root@dimitry-TravelMate-5730:/opt/metasploit3/msf3/tools# ./pattern_create.rb 30000
This generates a pattern. We again create an .m3u file from it and open it in RM to MP3 Converter to see the result (Figure 5).
We again create an .m3u file from the generated pattern to check the EIP location, and open it in RM to MP3 Converter (Figure 6 and Figure 7). This gives us the values 5792 and 26072; see the picture below. That is the location where the EIP value is written. EIP sits between 25000 and 30000 bytes into the buffer.
That is why I used 30000 bytes of data: to see what happens to the data and the program (Figure 8).
In the screen above, two commands were used to check the EIP and ESP locations. The first returned the value 5792; the second, fortunately, returned no value, because more bytes of data were used than were needed.
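The offset lookup can be sketched in Python. This is a re-implementation for illustration, not Metasploit's own code: the function names are chosen here, and the pattern format (uppercase letter, lowercase letter, digit, as in Aa0Aa1Aa2...) is assumed to match what the tool produces. The sketch also shows that such a pattern repeats after 20,280 characters, so in a 30,000-byte buffer the same four bytes appear at both 5792 and 26072. Of those two, only 26072 falls between 25000 and 30000.

```python
import string


def pattern_create(length):
    """Build a cyclic pattern of the form Aa0Aa1Aa2... of the given length."""
    parts = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                parts.append(upper + lower + digit)
    period = "".join(parts)  # 26 * 26 * 10 * 3 = 20280 characters
    repeats = length // len(period) + 1
    return (period * repeats)[:length]


def pattern_offsets(pattern, chunk):
    """Return every position at which chunk occurs in pattern."""
    offsets = []
    start = pattern.find(chunk)
    while start != -1:
        offsets.append(start)
        start = pattern.find(chunk, start + 1)
    return offsets


pattern = pattern_create(30000)

# Take the four characters that would land in EIP if the overwrite
# happens 26072 bytes into the buffer, then look them up again.
chunk = pattern[26072:26076]
offsets = pattern_offsets(pattern, chunk)

print(offsets)  # [5792, 26072]
```

Because the pattern wraps around, a lookup on a buffer longer than 20,280 bytes can report more than one candidate, and the one that matches what is already known about the crash is the one to keep.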
Finding JMP ESP And Memory Location
Before we try to exploit, we need to know the exact memory location of a JMP ESP instruction, so that our exploit will work reliably.
OllyDbg: go to View, Executable modules, look for the shell32 module, right-click on shell32, and view the JMP ESP command and its location.
The same procedure applies in Immunity Debugger. For more information, see Figure 9 and Figure 10, which show the analysis in Immunity Debugger and OllyDbg.
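Once an address has been read from the debugger, it has to be written into the file in the byte order the CPU expects. x86 is little-endian, so the bytes appear reversed compared to how the debugger prints the address. A minimal sketch, using a placeholder address that is purely hypothetical and not taken from shell32:

```python
import struct

# Hypothetical placeholder; substitute the address shown in your debugger.
jmp_esp_address = 0x11223344

# "<I" means: little-endian, unsigned 32-bit integer.
packed = struct.pack("<I", jmp_esp_address)

print(packed.hex())  # 44332211
```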
Creating Our Own Exploit and Letting The Application Die
Metasploit's built-in payload generator and encoders are a great help when creating and building an exploit, so we will use one of them to develop ours.
We will use the encoder x86/shikata_ga_nai, which is a good encoder for generating the payload. It is available by typing: msfconsole - show payloads - use payload (in this case bind_tcp) - show encoders - generate encoder.
We will use a program, namely Calculator, on the Windows machine to show that the exploit works. For that, we run a Perl script to produce the file and open it in RM to MP3 Converter (Figure 11).
So we add the encoded payload to our final exploit, so that the buffer overflow in “RM to MP3 Converter” runs Calculator.
We also put the exact memory location, and the EIP and ESP locations found earlier, into our exploit code so that they land in the right places in the buffer.
Again create the vulnerable .m3u file and run it in “RM to MP3 Converter” to see Calculator appear. To analyze it in a debugger, open the application in either Immunity Debugger or OllyDbg and look at the locations where EIP and ESP are overwritten (Figure 12 and Figure 13).
The application crashes and Calculator opens.
You can also create an .m3u file that gives you a shell and connect to it with a tool like nmap, netcat, etc.