Keeping pace with the creation of new malicious PDF files using an active-learning based detection framework

Nissim, Nir; Cohen, Aviad; Moskovitch, Robert; Shabtai, Asaf; Edri, Matan; BarAd, Oren; Elovici, Yuval

doi:10.1186/s13388-016-0026-3

Security Informatics

Table 1 Code obfuscation techniques in PDF files that can be used by an attacker


Obfuscation technique	Details
Separating malicious code over multiple objects	Malicious code is spread across multiple objects. The code chunks are collected, merged, and compiled into a malicious piece of code only at runtime, which makes it difficult for static analysis detectors to recognize the malicious code
Applying filters	Filters (the encoding and compression schemes that PDF applies to stream data) are used to conceal malicious code
White space randomization	Random white space is inserted into the malicious code in order to evade recognition by signature-based maliciousness detectors. White space between tokens does not affect the code's behavior, since JavaScript ignores it
Comment randomization	Random comments are inserted into the malicious code in order to evade recognition by signature-based maliciousness detectors. Comments do not affect the code's behavior, since JavaScript ignores them
Variable name randomization	Variable names are changed randomly in order to fool signature-based maliciousness detectors
Integer obfuscation	Numbers are represented in a different way, which can be used to hide a specific memory address
String obfuscation	Strings are altered in order to make it difficult for a human analyst to understand the code (e.g., by splitting a string into several substrings)
Function name obfuscation	The name of the function being used, which can provide a clue about the code's intention, is hidden. This is done by creating a pointer with a random name that points to the required function
Advanced code obfuscation	A string can hold encrypted malicious code. The decryption process takes place at runtime, just before use. Metadata fields and even the document's words can also be used to store malicious code
Block randomization	Changing the syntax of the code but not its action
Dead code	Inserting blocks of code that are not intended to be executed
Pointless code	Inserting blocks of code that do not perform anything
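
The white space and comment randomization rows above can be illustrated with a minimal Python sketch. It is not taken from the article: the function names (`strip_comments`, `normalize`, `signature`) and the two harmless JavaScript snippets are hypothetical, and the regex-based comment stripping is deliberately naive (it does not account for comment markers inside string literals). It only shows why an exact hash signature breaks under these two techniques and how normalizing the code first can restore the match.

```python
import hashlib
import re


def strip_comments(js: str) -> str:
    """Remove /* ... */ and // ... comments (naive: ignores string literals)."""
    js = re.sub(r"/\*.*?\*/", "", js, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", js)


def normalize(js: str) -> str:
    """Drop comments and all white space so layout changes do not alter the result."""
    return re.sub(r"\s+", "", strip_comments(js))


def signature(js: str) -> str:
    """Exact-match signature: SHA-256 of the raw text."""
    return hashlib.sha256(js.encode("utf-8")).hexdigest()


# Hypothetical, harmless snippets: same behavior, different layout.
original = "var a = 'payload'; show(a);"
obfuscated = "var  a =/* xk2 */ 'payload' ;\n// qq9\nshow( a );"

# The raw signatures differ, so an exact-match detector misses the variant.
raw_match = signature(original) == signature(obfuscated)

# After normalization both snippets reduce to the same text.
normalized_match = signature(normalize(original)) == signature(normalize(obfuscated))
```

Removing all white space also merges tokens such as `var a` into `vara`, which is acceptable here because the normalized text is only compared, never executed.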

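The string and integer obfuscation rows in the table can be sketched in a similar way. The following Python example is an illustration under stated assumptions, not the article's method: `fold_string_concat` and `parse_int_literal` are hypothetical helpers, the folding handles only single-quoted literals without escape sequences, and the integer parser covers only decimal and `0x` hexadecimal forms. The value `0x0c0c0c0c` is used purely as an example of one number written in two ways.

```python
import re

# Two adjacent single-quoted literals joined by '+', with no escapes inside.
_CONCAT = re.compile(r"'([^'\\]*)'\s*\+\s*'([^'\\]*)'")


def fold_string_concat(js: str) -> str:
    """Repeatedly merge 'a' + 'b' into 'ab' until nothing changes."""
    previous = None
    while previous != js:
        previous = js
        js = _CONCAT.sub(lambda m: "'" + m.group(1) + m.group(2) + "'", js)
    return js


def parse_int_literal(text: str) -> int:
    """Parse a decimal or 0x-prefixed hexadecimal literal into its value."""
    return int(text.strip(), 0)


# A split string is rejoined so that a keyword search can see it again.
folded = fold_string_concat("var f = 'un' + 'esc' + 'ape';")

# The same number written in two forms resolves to one value.
same_value = parse_int_literal("0x0c0c0c0c") == parse_int_literal("202116108")
```

Folding constants before matching means a signature has to describe only the canonical form rather than every possible split or numeric notation.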