
Hacking - Best OF Reverse Engineering - Part10

Deep Inside Malicious PDF

People share documents all the time, and most attacks today are client-side attacks that target applications running on the user’s or employee’s operating system. From one single file, an attacker can compromise a large network. PDF is one of the most common formats for sharing files, and because PDFs can include active content and are passed within the enterprise and across networks, they make an attractive attack vector. In this article, we will analyze ways to catch malicious PDF files.

When we check the PDF files on our PC or laptop, we may use an antivirus scanner, but these days that might not be enough to detect a malicious PDF that contains shellcode. The attacker usually encrypts or obfuscates the content to bypass the antivirus scanner and, many times, targets a zero-day vulnerability in Adobe Reader or a version that has not been updated. Figure 1 shows how PDF vulnerabilities are rising every year.

Before we start to analyze malicious PDFs, we are going to take a brief look at the PDF structure so we can understand how the shellcode works and where it is located.

















PDF components

A PDF document contains four main parts: a one-line header, a body, a cross-reference table, and a trailer.

PDF Header

The first line of the PDF gives the version of the PDF format that the file uses, which is the most basic information about the file; for example, “%PDF-1.4” means that the file follows version 1.4 of the PDF specification.

PDF Body

The body of the PDF file consists of the objects that make up the contents of the document. These objects include fonts, images, annotations, and text streams, and the author of a document can also include invisible objects or elements. These objects can interact with PDF features such as animation and security features. Numeric values in the body are of two types: integers and real numbers.

The Cross-Reference Table (xref table)

The cross-reference table records the byte offset of every object in the file, so a PDF reader can jump straight to the object it needs (for example, the content of another page) without reading the whole file. When a user updates the PDF, the cross-reference information is updated as well.

The Trailer

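The trailer points to the cross-reference table: the startxref keyword is followed by the byte offset of the table, and the file always ends with %%EOF to mark the end of the PDF. The trailer also identifies the document’s root object, so a PDF reader starts parsing from the end of the file and uses the trailer to find everything else.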
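The four parts described above can be located with a few lines of Python. The following is a minimal sketch, not a full PDF parser: the function name describe_pdf_layout and the tiny sample file built by build_sample_pdf are assumptions made for illustration, and the check only understands classic cross-reference tables (newer files may use cross-reference streams, in which case the "xref" keyword is not found at the offset).

```python
import re


def describe_pdf_layout(data: bytes) -> dict:
    """Report the header version, the startxref offset, and the %%EOF marker."""
    # Header: the file should start with something like %PDF-1.4
    header = re.match(rb"%PDF-(\d+)\.(\d+)", data)
    version = None
    if header:
        version = f"{int(header.group(1))}.{int(header.group(2))}"

    # Trailer: the last "startxref" gives the offset of the newest xref section.
    offsets = re.findall(rb"startxref\s+(\d+)", data)
    xref_offset = int(offsets[-1]) if offsets else None

    # For a classic xref table, the keyword "xref" sits exactly at that offset.
    xref_keyword_at_offset = (
        xref_offset is not None
        and data[xref_offset:xref_offset + 4] == b"xref"
    )

    return {
        "version": version,
        "xref_offset": xref_offset,
        "xref_keyword_at_offset": xref_keyword_at_offset,
        "ends_with_eof": data.rstrip().endswith(b"%%EOF"),
    }


def build_sample_pdf() -> bytes:
    """Build a tiny, hypothetical PDF skeleton (header, body, xref, trailer)."""
    body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_offset = len(body)
    xref = (
        b"xref\n0 2\n0000000000 65535 f \n"
        + b"%010d 00000 n \n" % body.index(b"1 0 obj")
    )
    trailer = (
        b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
        % xref_offset
    )
    return body + xref + trailer


print(describe_pdf_layout(build_sample_pdf()))
```

To try it on a real file, read the file in binary mode and pass the bytes to describe_pdf_layout. A missing header, a startxref offset that points at nothing, or a missing %%EOF is not proof of malice, but it is a reason to look closer.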

Malicious PDF through Metasploit

Now that we have taken a tour of the PDF file format and what it contains, we will install an old version of Adobe Reader (9.4.6, or 10 through 10.1.1) that is vulnerable to the Adobe U3D Memory Corruption Vulnerability.

This exploit exists in the Metasploit framework, so we are going to create the malicious PDF and analyze it in the Kali Linux distribution. Start by opening the terminal and typing msfconsole (Figure 2). As shown in the figure, we are going to set some Metasploit variables to be sure that everything is working fine.














* After choosing the exploit type, we choose the payload that will execute on the remote target during exploitation and open a Meterpreter session.

* We set LHOST, which is our IP address; we can view it by typing ifconfig in a new terminal.

* Finally, we type exploit to create the PDF file with the configuration we set before.

The file has been saved in /root/.msf4/local.

We are going to move the file to the desktop to make it easier to locate when typing its name in the terminal:

root@kali:~# cd /root/.msf4/local
root@kali:~/.msf4/local# mv msf.pdf /root/Desktop

PDFid

Now we are going to use pdfid to see which elements and objects the PDF contains, including JavaScript, and whether there is something interesting to analyze (Figure 3).






















The PDF has only one page, which may be normal. However, there are several JavaScript objects inside, which is very strange for a one-page document. There is also an OpenAction object, which will execute this malicious JavaScript when the file is opened.
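The kind of keyword count discussed above can be approximated in Python. This is a simplified sketch of the idea, not pdfid's actual implementation: the function count_names, the WATCHED list, and the sample bytes are assumptions made for illustration. It scans the raw bytes for PDF name tokens and decodes #xx hex escapes first, because a name such as /J#53 is just another spelling of /JS.

```python
import re
from collections import Counter

# Names worth a second look in a triage pass (an illustrative selection).
WATCHED = ["/JS", "/JavaScript", "/OpenAction", "/AcroForm", "/AA", "/Launch"]

# A PDF name starts with "/" and runs until whitespace or a delimiter.
NAME_TOKEN = re.compile(rb"/[^\x00\t\n\x0c\r /<>\[\]()%{}]+")
HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def count_names(data: bytes, watched=WATCHED) -> dict:
    """Count how often each watched name occurs in the raw bytes of a PDF."""
    counts = Counter()
    for token in NAME_TOKEN.findall(data):
        # Undo hex obfuscation inside the name, e.g. /J#53 -> /JS.
        name = HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), token)
        counts[name.decode("latin-1")] += 1
    return {name: counts.get(name, 0) for name in watched}


# Hypothetical fragment: a catalog with an OpenAction that runs JavaScript.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R /AcroForm 3 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /J#53 (app.alert(1)) >>\nendobj\n"
)

for name, hits in count_names(SAMPLE).items():
    print(f"{name:12} {hits}")
```

Because the scan works on raw bytes, it misses names hidden inside compressed streams and may count names that only appear inside string or stream data. It is a first filter that tells us where to look, which is why the next step uses a real parser.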

To take a closer look at these objects, we are going to use peepdf.

Peepdf

Peepdf is a very powerful Python tool for PDF analysis. It provides all the components that security researchers need for PDF analysis, so they do not have to combine many separate tools. It supports encryption, object streams, shellcode emulation, and JavaScript analysis. For malicious PDFs, it shows potential vulnerabilities and suspicious elements, and it offers a powerful interactive console, PDF obfuscation (for bypassing AVs), hexadecimal and ASCII decoding, and hex search (Figure 4).














Analysis

To start the analysis, go to the directory of the PDF file and run /usr/bin/peepdf -f msf.pdf.

We use the -f option to force the tool to ignore errors and continue (Figure 5).




















This is the default output, but we already see some interesting things. The first one is the highlighted line: object 15 contains JavaScript code. Object 4 contains two elements that trigger execution (/AcroForm and /OpenAction). The last one is /U3D, which points to a known vulnerability. We will now explore these objects in an interactive console, which we get by running /usr/bin/peepdf -i msf.pdf (Figure 6).



















The tree command shows the logical structure of the file. We start by exploring object 4 (/AcroForm) (Figure 7).




















As we see in the picture above, when we type object 4, it gives us another object to explore. So far, we have not seen any important information or anything that seems suspicious, except object 2 (the XFA array), which gave us the element <fjdklsaj fodpsaj fopjdsio>; it does not seem to contain anything special.

Let’s move on to the other object (/OpenAction) (Figure 8).






















Now we can see the JavaScript code that will be executed when the PDF file is opened.

The other part of the JavaScript code is lightly obfuscated, for example by writing some variables in hex. In this code we can see heap spraying with shellcode plus some padding bytes. Attackers typically encode their shellcode as Unicode escape sequences and then use the unescape function to translate that representation into binary content. Now we are sure that this is definitely a malicious PDF (Figure 9).
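To inspect such a payload without running it, the escape sequences can be decoded offline. The sketch below is an assumption-laden illustration: unescape_to_bytes is a hypothetical helper, and the sample string is a harmless run of 0x90 padding bytes rather than anything taken from msf.pdf. It mimics what unescape produces in memory: each %uXXXX becomes one 16-bit value stored in little-endian order, so %u4142 turns into the bytes 42 41.

```python
import re
import struct

# %uXXXX is a 16-bit escape; %XX is the older 8-bit form.
ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def unescape_to_bytes(text: str) -> bytes:
    """Decode a JavaScript unescape()-style string into its in-memory bytes."""
    units = []
    pos = 0
    for match in ESCAPE.finditer(text):
        # Literal characters between escapes are 16-bit code units as well.
        units.extend(ord(ch) for ch in text[pos:match.start()])
        units.append(int(match.group(1) or match.group(2), 16))
        pos = match.end()
    units.extend(ord(ch) for ch in text[pos:])
    # Pack every code unit as little-endian; values above 0xFFFF are truncated.
    return b"".join(struct.pack("<H", unit & 0xFFFF) for unit in units)


# Harmless example input: four padding bytes, as used in front of shellcode.
decoded = unescape_to_bytes("%u9090%u9090")
print(decoded.hex())  # 90909090
```

The decoded bytes can then be written to a file and examined with a disassembler or a shellcode emulator. Long runs of a single value such as 0x90 in the output are a typical sign of the padding mentioned above.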










Defend

We defend our network against this type of malicious file with strong e-mail and web filters, an IPS, and application control: disable JavaScript in the PDF reader, disable PDF rendering in browsers, and block PDF readers from accessing the file system and network resources. Overall security awareness is just as important.

Conclusion

We’ve taken a tour of the PDF file format structure and what it contains. We’ve seen how to detect a malicious PDF, how to locate suspicious objects and display the JavaScript code they carry, and finally how to defend our network.
