Setting Up Your Own Malware Analysis Lab
With new malware attacks making the news every day and compromising company networks
and critical infrastructure around the world, malware analysis is critical for anyone
who responds to such incidents. In this article you will learn to set up a safe environment
to analyze malicious software and understand its behaviour.
Malware is software that causes harm to a computer system without the owner’s consent.
Viruses, Trojans, worms, backdoors, rootkits, scareware and spyware are all forms of malware.
Malware Analysis
Malware analysis is the process of understanding the behaviour and characteristics of malware in order to detect and eliminate it.
Why Malware Analysis?
There are many reasons to analyze a piece of malware. To name just a few:
• Determine the nature and purpose of the malware, i.e. whether it is an information stealer,
HTTP bot, spam bot, rootkit, keylogger, RAT, etc.
• Understand how it interacts with the operating system, i.e. its file system, registry, network and process
activity.
• Detect identifiable patterns that can be used to cure and prevent future infections.
Types of Malware Analysis
To understand the characteristics of malware, three types of analysis can be performed:
• Static Analysis
• Dynamic Analysis
• Memory Analysis
In most cases static and dynamic analysis yield sufficient results. Memory analysis, however, helps
uncover hidden artifacts and assists with rootkit detection and unpacking, giving more detailed results.
In this article we focus on setting up a malware analysis lab for static and dynamic analysis. Before setting up the lab, let us review the concepts, tools and techniques that static and dynamic analysis require.
Static Analysis
Static analysis involves analyzing the malware without actually executing it. The main steps are described below.
Determining the File Type
This is necessary because a file’s extension cannot be trusted as the sole indicator of its type. A malware author can rename an executable (.exe) file with any other extension, for example .pdf, to make the user think it is a PDF document. Determining the file type also helps you understand the environment the malware targets; for example, if the file is a PE (Portable Executable) file, the malware targets Windows. The file utility on Linux, and its Windows port, can be used to determine the file type.
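As a rough illustration of the idea behind the file utility, the sketch below inspects a file's leading bytes instead of its extension. It is a minimal Python sketch, not the file utility's actual implementation; the function name identify_file_type and the small set of signatures it checks are assumptions chosen for illustration.

```python
import struct


def identify_file_type(data: bytes) -> str:
    """Guess a file's type from its leading bytes rather than its extension."""
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # Offset 0x3C of the DOS header stores the file offset of the PE header.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE executable (Windows)"
        return "DOS MZ executable"
    if data[:4] == b"\x7fELF":
        return "ELF executable (Linux/Unix)"
    if data[:5] == b"%PDF-":
        return "PDF document"
    return "unknown"
```

Reading the start of a sample in binary mode and passing the bytes to identify_file_type is usually enough, because every check looks near the beginning of the file. A sample renamed to .pdf would still be reported as a PE executable.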
Determining the Cryptographic Hash
Cryptographic hash values such as MD5 and SHA1 serve as a unique identifier for the file throughout the analysis. After executing, malware may copy itself to a different location or drop another piece of malware; the hash tells you whether the newly copied or dropped file is the same as the original sample or a different one. With this information we can decide whether analysis needs to be performed on a single sample or on multiple samples. The hash can also be submitted to online antivirus scanners such as VirusTotal to find out whether the sample has previously been detected by any AV vendor.
Utilities such as md5sum on Linux and md5deep on Windows can be used to compute the cryptographic hash.
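The same hashes can be computed with a few lines of Python, which is handy when scripting the comparison between an original sample and a dropped file. This is a minimal sketch using the standard hashlib module; the function names are hypothetical, and SHA-256 is included as an extra algorithm that the text above does not mention.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")


def hash_bytes(data: bytes) -> dict:
    """Return the hex digests of data for each algorithm in ALGORITHMS."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def hash_file(path: str, chunk_size: int = 65536) -> dict:
    """Hash a file in chunks so that large samples need not fit in memory."""
    hashers = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def same_sample(hashes_a: dict, hashes_b: dict) -> bool:
    """Treat two files as the same sample when their SHA-256 digests match."""
    return hashes_a["sha256"] == hashes_b["sha256"]
```

Calling same_sample(hash_file(original), hash_file(dropped)) answers whether a dropped file is a copy of the original sample or a different one.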
Strings search
Strings are sequences of plain-text ASCII and Unicode characters embedded within a file. A strings search gives clues about the functionality and commands associated with a malicious file. Although strings do not give a complete picture of what a file can do, they can yield information such as file names, URLs, domain names, IP addresses and registry keys.
The strings utility on Linux and BinText on Windows can be used to find the strings embedded in an executable.
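To show what such a search does under the hood, here is a minimal Python sketch that pulls printable ASCII runs and UTF-16LE runs (the encoding Windows programs commonly use for Unicode text) out of raw bytes. It is an approximation of the idea, not a reimplementation of strings or BinText; the function name and the default minimum length of 4 are assumptions.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII strings, then UTF-16LE strings, found in data."""
    # A run of at least min_len printable ASCII bytes.
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE: each one followed by a zero byte.
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_pattern.finditer(data)]
    return found
```

Filtering the result for patterns such as domain names or registry paths is a natural next step once the raw list is available.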
File obfuscation (packers, cryptors) detection
Malware authors often use packers and cryptors to obfuscate the contents of a file in order to evade detection by antivirus software and intrusion detection systems. Obfuscation also slows down analysts who are trying to reverse engineer the code. Packers can be tricky to identify and, more importantly, to unpack. Once the packer has been identified, finding an unpacker or resources for manual unpacking becomes easier.
PEiD or RDG Packer Detector can be used to detect packers in an executable.
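The tools named above work from signatures. A different, complementary heuristic that is easy to script is byte entropy: compressed or encrypted data tends to approach 8 bits per byte, whereas ordinary code and text sit noticeably lower. The sketch below is only that heuristic, not how PEiD or RDG Packer Detector work, and the 7.0 threshold is an assumed rule of thumb rather than a fixed standard.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic only: high entropy suggests compression or encryption."""
    return shannon_entropy(data) >= threshold
```

A high score is a hint to look for a packer, not proof of one; legitimate files with embedded compressed resources can also score high.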
Submission to online Antivirus scanning services
This helps you determine whether malicious code signatures already exist for the suspect file. The signature name provides an excellent starting point for gaining additional information about the file and its capabilities. Visiting the respective antivirus vendor’s web site or searching for the signature name in a search engine can yield additional details about the suspect file. Such information can help further investigation and reduce the time needed to analyze the specimen.
VirusTotal (http://www.virustotal.com) and Jotti (http://virusscan.jotti.org) are two popular web-based malware scanning services.
Examining File Dependencies
A Windows executable loads multiple DLLs (dynamic-link libraries) and calls API functions to perform actions such as resolving domain names, adding registry values or establishing an HTTP connection.
Determining which DLLs are loaded and which API calls are imported by an executable gives an idea of the
functionality of the malware. Dependency Walker and PEview are two tools that can be used to inspect file dependencies.
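Once the list of imported API names has been read off a tool such as Dependency Walker, a short lookup can highlight the ones worth a closer look. The Python sketch below assumes the names are already available as a plain list; the table is a small, hand-picked selection of real Windows API functions for illustration, not an authoritative catalogue.

```python
# Illustrative selection only: real Windows API names mapped to the behaviour
# they hint at. Benign programs import many of these too.
SUSPICIOUS_IMPORTS = {
    "CreateRemoteThread": "starts a thread in another process (code injection)",
    "WriteProcessMemory": "writes into another process's memory (code injection)",
    "VirtualAllocEx": "allocates memory in another process (code injection)",
    "RegSetValueExA": "writes a registry value",
    "RegSetValueExW": "writes a registry value",
    "gethostbyname": "resolves a domain name",
    "InternetOpenUrlA": "opens a URL over HTTP or FTP",
    "InternetOpenUrlW": "opens a URL over HTTP or FTP",
}


def flag_imports(imported_names) -> dict:
    """Return the imports that appear in SUSPICIOUS_IMPORTS with their hint."""
    return {
        name: SUSPICIOUS_IMPORTS[name]
        for name in imported_names
        if name in SUSPICIOUS_IMPORTS
    }
```

An import on its own proves nothing; the value lies in the combination, for example a process-injection call alongside networking and registry functions.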
Disassembling the File
Examining the suspect program in a disassembler allows the investigator to explore the instructions that the malware will execute. Disassembly can help trace code paths that are not usually exercised during dynamic analysis.
IDA Pro is a popular disassembler that supports multiple file formats.
Dynamic Analysis
Dynamic analysis involves executing the malware sample in a controlled environment. It can involve
monitoring the malware as it runs or examining the system after the malware has executed. Sometimes static analysis reveals little because of obfuscation or packing; in such cases dynamic analysis is the best way to identify the malware’s functionality. The steps involved in dynamic analysis are described below.
Monitoring Process Activity
This involves executing the malicious program and examining the properties of the resulting process and of other processes running on the infected system. This technique reveals information such as the process name, process ID, path of the executable and modules loaded by the suspect program. Process Explorer is a tool for gathering process information. CaptureBAT and ProcMon can also be used to monitor process activity while the malware is running.
Monitoring File System Activity
This involves examining file system activity in real time while the malware is running. It reveals which files were opened, created and deleted as a result of executing the malware sample.
ProcMon and CaptureBAT are powerful monitoring utilities that can be used to examine file system activity.
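Where a real-time monitor is not available, a cruder alternative is to snapshot a directory tree before and after running the sample and compare the two. The Python sketch below takes that snapshot-and-diff approach, which is not how ProcMon or CaptureBAT work: it misses files that are created and deleted between snapshots and cannot tell which process made a change. The function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import os


def snapshot(root: str) -> dict:
    """Map every file path under root to the SHA-256 of its contents."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    state[path] = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                state[path] = None  # unreadable, e.g. locked or removed mid-scan
    return state


def diff_snapshots(before: dict, after: dict) -> dict:
    """List the paths created, deleted or modified between two snapshots."""
    shared = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

Take one snapshot of the directory of interest before executing the sample and one afterwards, then pass both to diff_snapshots.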
Monitoring Registry Activity
The Windows registry stores OS and program configuration information. Malware often uses the registry for persistence or to store configuration data. Monitoring registry changes shows which processes are accessing the host system’s registry keys and what registry data is being read or written. This technique can also reveal the malware component that will run automatically when the computer boots.
Regshot, ProcMon and CaptureBAT are some of the tools that can trace the malware’s interaction with the registry.
Monitoring Network Activity
In addition to monitoring activity on the infected host, it is important to monitor the network traffic to and from the system while the malware sample is running. This helps identify the network capabilities of the specimen and lets us determine network-based indicators, which can then be used to create signatures on security devices such as intrusion detection systems. Network monitoring tools to consider include tcpdump and Wireshark: tcpdump captures network traffic in real time from the command line, whereas Wireshark is a GUI-based packet capture utility that provides powerful filtering options.
Setting Up Your Own Malware Analysis Lab
Before performing malware analysis we need to set up a safe analysis environment, making sure that the analysis systems have no access to live production systems or the internet. It is a good idea to always start with a fresh install of the OS of your choice. You have several options when creating a malware analysis environment. If you have spare hardware, you can build your lab from physical machines. Virtualized operating systems are preferable, however, for the following reasons:
• Multiple snapshots can be taken
• Restoring to a pristine state is easy
• No extra hardware is required
• Switching between operating systems is faster
Virtualized environments also have disadvantages: some malware changes its behaviour or refuses to run when it detects that it is running within a virtual environment. In such cases you may have to analyze the malware on physical machines, or reverse engineer and patch the code that checks for the virtualized environment using a debugger such as OllyDbg or Immunity Debugger.
Building the Environment
Our environment consists of a physical machine running BackTrack 5 Linux (the host machine) with Wireshark installed. The IP address of the host machine is set to 192.168.1.2. This machine also runs INetSim, a free, Linux-based software suite for simulating common internet services. INetSim lets you analyze the network behaviour of malware samples by emulating services such as DNS, HTTP, HTTPS, FTP, IRC, SMTP and others (Figure 1). INetSim is configured to emulate these services on the network interface with IP address 192.168.1.2.
The Linux machine also runs VMware Workstation in host-only mode with a Windows XP SP3 guest (the analysis machine). The Windows guest has the static analysis tools installed (as mentioned in the Static Analysis section) along with CaptureBAT to monitor file system, registry and network activity (as mentioned in the Dynamic Analysis section). The IP address of the Windows machine is set to 192.168.1.100 and its default gateway to 192.168.1.2 (Figure 2), the IP address of the Linux machine. This ensures that all traffic is routed through the Linux machine, where we monitor network traffic with Wireshark and emulate internet services with INetSim. The Windows machine is where we will execute the malware sample.
The screenshot (Figure 3) illustrates the malware analysis environment.
Analysis of a Malware Sample (edd94.exe)
Now that the malware analysis lab is set up, let’s begin our analysis to see what we can learn about the sample edd94.exe. We start with the static analysis techniques.
• Determine the File Type: Running the file utility on the malware sample shows that it is a PE32
executable file (Figure 4).
• Taking the Cryptographic Hash: The md5sum utility shows the MD5 hash of the malware sample (edd94.exe) (Figure 5). Other algorithms such as Secure Hash Algorithm 1 (SHA1) can also be used for the same purpose.
• Determine the Packer: PEiD is a tool that detects the most common packers, cryptors
and compilers for PE files. It can currently detect more than 600 different signatures in PE files.
In this case the sample is not packed (Figure 6). An alternative to PEiD is RDG Packer Detector.
• Examining the File Dependencies: Dependency Walker is a great tool for viewing file dependencies.
It shows four DLLs loaded and the list of API calls imported by the executable (edd94.exe). It also shows the specimen importing the API call “CreateRemoteThread” (Figure 7), which malware uses to inject code into another process.
• Submission to Online Web Based Malware Scanning Service: Submitting the sample to VirusTotal
shows that the malware is a ZeuS bot (zbot) (Figure 8). ZeuS is a Trojan horse that steals banking information through man-in-the-browser keystroke logging and form grabbing. ZeuS is spread mainly through drive-by downloads and phishing schemes.
Now that we have some information from static analysis, let us try to determine the characteristics of the malware using dynamic analysis. Before executing the malware, Wireshark is started on the Linux machine to capture the network traffic (Figure 9) generated by the malware. INetSim is started to emulate network services and give fake responses to the malware (Figure 1). On Windows, CaptureBAT is started to capture process, registry and file system activity.
The malware sample (edd94.exe) was run on the analysis machine for a few seconds. The following are some of the activities caught by our monitoring tools.
Figure 10 shows the process, registry and file system activity after executing the malware (edd94.exe). The process explorer.exe (an OS process) performs a lot of activity, setting registry values and creating various files, just after the malware is executed, which indicates code injection into explorer.exe.
The malware also drops a new file (raruo.exe) into the “C:\Documents and Settings\Administrator\ApplicationData\Lyolxi” directory, then executes it, creating a new process (Figure 11). This is where the cryptographic hash helps us determine whether the dropped file (raruo.exe) is the same as the original file (edd94.exe). We will come to that later.
Another interesting activity is explorer.exe setting a registry value {F561587E-37AB-9701-D0081175F61B} under the subkey “HKCU\Software\Microsoft\Windows\CurrentVersion\Run” (Figure 12). Malware often adds values under this registry key to survive a reboot (a persistence mechanism). It is suspicious for explorer.exe to create this registry value, and it could be the result of the malware injecting code into explorer.exe.
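Spotting this kind of persistence in a long activity log is easy to automate. The Python sketch below assumes the registry writes have already been reduced to (process name, registry path) pairs; that tuple format is hypothetical and is not CaptureBAT's log format. The list of autorun keys is a short, well-known subset, not a complete inventory of Windows autostart locations.

```python
# Well-known autostart keys (a subset; Windows has many more such locations).
AUTORUN_KEYS = (
    r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run",
    r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run",
    r"HKCU\Software\Microsoft\Windows\CurrentVersion\RunOnce",
    r"HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnce",
)


def is_autorun_path(registry_path: str) -> bool:
    """True if the path is an autorun key or a value directly beneath one."""
    path = registry_path.lower()  # registry paths are case-insensitive
    for key in AUTORUN_KEYS:
        key = key.lower()
        if path == key or path.startswith(key + "\\"):
            return True
    return False


def autorun_writes(events) -> list:
    """Keep only the (process, registry_path) pairs that touch an autorun key."""
    return [(process, path) for process, path in events if is_autorun_path(path)]
```

Any entry returned for a process that has no obvious reason to register a startup program, such as explorer.exe in this analysis, deserves a closer look.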
Wireshark also captured the malware performing a DNS lookup to resolve the domain “users9.nofeehost.com”. The domain resolved to the IP address 192.168.1.2, which is our Linux machine (Figure 13). This is because INetSim, running on the Linux machine, answered the DNS query with a fake response. We have tricked the malware into thinking that users9.nofeehost.com is at 192.168.1.2, our Linux host machine. This way we have prevented the malware from connecting to the internet while keeping control over our analysis.
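To make the trick concrete, the Python sketch below builds the kind of reply a fake DNS service could send: whatever name is asked for, the answer is a single A record pointing at the lab host. It illustrates the idea only and is not INetSim's implementation; the function names and the 60-second TTL are assumptions, and it handles just a simple, uncompressed single-question query.

```python
import socket
import struct

FAKE_IP = "192.168.1.2"  # the Linux host in this lab


def query_name(query: bytes) -> str:
    """Decode the domain name in the question section of a DNS query."""
    labels, pos = [], 12  # the DNS header is 12 bytes long
    while query[pos] != 0:
        length = query[pos]
        labels.append(query[pos + 1:pos + 1 + length].decode("ascii"))
        pos += length + 1
    return ".".join(labels)


def build_fake_response(query: bytes, ip: str = FAKE_IP) -> bytes:
    """Answer any query with one A record that points at ip."""
    pos = 12
    while query[pos] != 0:  # skip over the length-prefixed labels
        pos += query[pos] + 1
    question = query[12:pos + 5]  # name, zero byte, QTYPE (2), QCLASS (2)
    # Flags 0x8180: response, recursion desired and available, no error.
    header = query[:2] + struct.pack(">HHHHH", 0x8180, 1, 1, 0, 0)
    answer = (
        b"\xc0\x0c"                          # pointer to the name at offset 12
        + struct.pack(">HHIH", 1, 1, 60, 4)  # type A, class IN, TTL 60, 4 bytes
        + socket.inet_aton(ip)
    )
    return header + question + answer


def serve(bind_ip: str = FAKE_IP, port: int = 53) -> None:
    """Reply to every UDP query received (binding port 53 needs root)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_ip, port))
        while True:
            query, client = sock.recvfrom(512)
            sock.sendto(build_fake_response(query), client)
```

Logging query_name(query) for each request gives a list of the domains the sample tries to reach, which is exactly the kind of network-based indicator discussed earlier.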
The malware then tries to establish an HTTP connection to download a configuration file (all.bin) from the domain users9.nofeehost.com (Figure 14). INetSim answered with a fake response page; it can also be configured to respond with any custom page we want.
ZeuS Tracker (a project that keeps track of ZeuS command and control servers around the world) shows that this domain (users9.nofeehost.com) was previously listed as a ZeuS command and control server, and the pattern we captured matches the one listed in ZeuS Tracker (Figure 15). This confirms that we are dealing with a ZeuS bot (zbot).
Conclusion
By setting up a safe malware analysis lab we were able to perform basic static and dynamic analysis and uncover the characteristics of the malware without infecting any production systems. The patterns identified during the analysis can now be used to create signatures for security devices.