Table 1 Code obfuscation techniques in PDF files that can be used by an attacker

From: Keeping pace with the creation of new malicious PDF files using an active-learning based detection framework

Obfuscation technique

Details

Separating malicious code over multiple objects

Malicious code is spread among multiple objects. The code chunks are collected, merged, and compiled into a malicious piece of code only at runtime, which makes it difficult for static analysis detectors to recognize the malicious code

Applying filters

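PDF stream filters (e.g., FlateDecode or ASCIIHexDecode), which compress or encode the contents of a stream, are used to conceal malicious code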

White space randomization

Random whitespace is inserted in the malicious code in order to evade recognition by signature-based maliciousness detectors. Whitespace between tokens does not affect the code's behavior, since JavaScript ignores it

Comment randomization

Random comments are inserted in the malicious code in order to evade recognition by signature-based maliciousness detectors. Comments do not affect the code's behavior, since JavaScript ignores them

Variable name randomization

Changing variable names randomly in order to fool signature-based maliciousness detectors

Integer obfuscation

Representing numbers in a different form, which can be used to hide a specific memory address

String obfuscation

Making changes to a string in order to make the code difficult for a human analyst to understand (e.g., by splitting the string into several substrings)

Function name obfuscation

Hiding the name of the function used, which could otherwise provide a clue about the code’s intention. This is done by creating a pointer with a random name that points to the required function

Advanced code obfuscation

A string can hold encrypted malicious code. Decryption takes place at runtime, just before the code is used. Metadata fields and even the document’s words can also be used to store malicious code

Block randomization

Changing the syntax of the code without changing its behavior

Dead code

Inserting blocks of code that are not intended to be executed

Pointless code

Inserting blocks of code that do nothing
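
To make a few of the rows above concrete, the following is a minimal Python sketch of why white space randomization, comment randomization, and string splitting defeat a naive signature match, and how a simple normalization pass can undo them. It is an illustration only, not the detection method of the article: the signature `app.alert('hi')` is a harmless stand-in, and the function names (`naive_match`, `normalize`, `fold_concat`, `normalized_match`) are hypothetical. The normalizer is deliberately simplistic and assumes that comment markers and significant whitespace do not occur inside string literals.

```python
import re

# Hypothetical, harmless stand-in for a signature a detector might look for.
SIGNATURE = "app.alert('hi')"


def naive_match(js: str) -> bool:
    """Byte-for-byte signature search, as a naive static detector might do."""
    return SIGNATURE in js


def normalize(js: str) -> str:
    """Strip comments and all whitespace.

    Simplifying assumption: comment markers and meaningful whitespace
    never appear inside string literals.
    """
    js = re.sub(r"/\*.*?\*/", "", js, flags=re.DOTALL)  # /* block comments */
    js = re.sub(r"//[^\n]*", "", js)                    # // line comments
    return re.sub(r"\s+", "", js)


def fold_concat(js: str) -> str:
    """Merge adjacent single-quoted literals joined by '+', e.g. 'a'+'b' -> 'ab'."""
    pattern = re.compile(r"'([^'\\]*)'\s*\+\s*'([^'\\]*)'")
    previous = None
    while previous != js:  # repeat until no more pairs can be merged
        previous = js
        js = pattern.sub(r"'\1\2'", js)
    return js


def normalized_match(js: str) -> bool:
    """Signature search after undoing whitespace, comment, and split-string noise."""
    return normalize(SIGNATURE) in normalize(fold_concat(js))


# White space and comment randomization applied to the stand-in snippet.
noisy = "app /* x9 */ . alert  (\n 'hi' // q\n )"
# String obfuscation by splitting the literal into substrings.
split = "app.alert('h' + 'i')"

print(naive_match(noisy), normalized_match(noisy))
print(naive_match(split), normalized_match(split))
```

In this sketch the naive search misses both variants while the normalized search finds them. Techniques such as variable name randomization, function name obfuscation, and runtime decryption are not addressed by this kind of textual normalization.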