A Beginners’ Guide to Obfuscation

Obfuscation is a technique used to change software code in order to make it harder for a human to understand. There are several reasons one might obfuscate code:

To make it harder for unauthorised parties to copy the code
To reduce the size of the code in order to improve performance. For example, a browser can download a smaller JavaScript file more quickly
For fun! There are code obfuscation competitions
To avoid detection by security products, such as intrusion detection systems
To make any analysis of the code more difficult, for instance when reverse engineering a malicious executable

I will focus on the last two, which are of most interest to a security researcher. I will use JavaScript as an example, but the techniques are mostly transferable to other languages.
For example, take the following exploit code, written in JavaScript:

var launcher = new ActiveXObject("WScript.Shell");
launcher.Run("C:\\malware.exe");

One could easily write a Snort signature to detect this activity at the network level, for example:

alert tcp any any -> any 80 (content: "WScript.Shell"; content: ".Run|22|malware.exe"; distance: 0; within: 100; msg: "malware.exe executed by javascript";)
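
As a rough illustration of what a two-part content rule of this kind checks, the sketch below mimics the idea in JavaScript: find a first string, then require a second string to lie within a fixed number of characters after it. The helper name matchesSignature and the sample script text are hypothetical, and real Snort matching works on packet payload bytes with many more options than this.

```javascript
// Hypothetical helper: loosely mimics a two-part content signature.
// Returns true if `first` occurs and `second` lies entirely within
// `within` characters after the end of that occurrence.
function matchesSignature(text, first, second, within) {
  let start = text.indexOf(first);
  while (start !== -1) {
    const end = start + first.length;
    const windowText = text.slice(end, end + within);
    if (windowText.includes(second)) {
      return true;
    }
    // Try the next occurrence of the first string, if any.
    start = text.indexOf(first, start + 1);
  }
  return false;
}

// Assumed sample data: the script as it would appear as text on the wire.
const plainScript =
  'var launcher = new ActiveXObject("WScript.Shell");\n' +
  'launcher.Run("C:\\\\malware.exe");';

const plainHit = matchesSignature(plainScript, "WScript.Shell", "malware.exe", 100);
console.log(plainHit); // true
```

A check like this only ever sees the literal text of the script, which is why the techniques that follow concentrate on changing that text.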

By looking at the code it is quite obvious that it is trying to run an executable called ‘malware.exe’.
I will now demonstrate and rate nine common obfuscation techniques that an attacker could use to avoid detection by security products and to make the code harder for a security analyst to understand.
String Concatenation
The author can split strings that could be matched by a signature, or that give an indication of what the code does, into substrings, which are then concatenated at runtime to produce the desired result:

var a = "cript.Sh";
var b = ".e";
var launcher = new ActiveXObject("WS" + a + "ell");
launcher.Run("C:\\mal" + "ware" + b + "xe");

Rating: 2 (out of 10) – It would be relatively easy to figure out the true intentions of the code by looking at it, but this method helps the code avoid static signature checks when used with an interpreted language. It is likely to be removed during the optimisation process if used in a compiled language.
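
The effect on a static check can be shown with a small, harmless sketch. The obfuscated source is held as plain text, the way a scanner would see it, and the same pieces are then joined the way the interpreter would join them. The variable names here are illustrative only.

```javascript
// The obfuscated source, held as text the way a scanner would see it.
const obfuscatedSource =
  'var a = "cript.Sh";\n' +
  'var launcher = new ActiveXObject("WS" + a + "ell");';

// A static check on the raw text finds nothing to match.
const staticHit = obfuscatedSource.includes("WScript.Shell");

// At run time the pieces are joined back into the original string.
const a = "cript.Sh";
const rebuilt = "WS" + a + "ell";

console.log(staticHit); // false
console.log(rebuilt);   // WScript.Shell
```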
String Replacement Methods
This method uses a particular language’s char or string replacement methods to replace certain character sequences in strings in order to get the desired result:

var launcher = new ActiveXObject("WxxSxcxrxixpxtx.xSxhxexlxlx".replace(/x/g, ""));
launcher.Run("C:\\mGlwGrP.PxP".replace(/G/g, "a").replace(/P/g, "e"));

Rating: 3 – This method is similar to the previous example but slightly more difficult to figure out by sight alone. The find-and-replace functions of a text editor would help.
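
An alternative to a text editor is to lift just the string expressions out of the sample and evaluate them on their own, without the ActiveX calls around them. A minimal sketch, with illustrative variable names:

```javascript
// Only the string expressions are evaluated; nothing is launched.
const shellName = "WxxSxcxrxixpxtx.xSxhxexlxlx".replace(/x/g, "");
const path = "C:\\mGlwGrP.PxP".replace(/G/g, "a").replace(/P/g, "e");

console.log(shellName); // WScript.Shell
console.log(path);      // C:\malware.exe
```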
String Encoding
There are various ways to encode a string so that it still evaluates to the same value:

var hexString = "\x57\x53\x63\x72\x69\x70\x74\x2e\x53\x68\x65\x6c\x6c";
var octString = "\103\72\134\134\155\141\154\167\141\162\145\56\145\170\145";
var launcher = new ActiveXObject(hexString);
launcher.Run(octString);

Rating: 4 – Similar to the previous example, but more difficult to figure out by sight alone unless you know your encodings very well. A debugger, e.g. Firebug, would be useful to reveal the ASCII representations of the variables.
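
When no debugger is to hand, a few lines of script can expand the escapes instead. The sketch below is one possible approach: decodeEscapes is a hypothetical helper that handles only \xNN hex escapes and \NNN octal escapes, and the backslashes are doubled in the sample literals so that they survive as text, the way they appear when the script is read rather than executed.

```javascript
// Hypothetical helper: expands \xNN (hex) and \N..NNN (octal) escapes
// found in a piece of source text. Other escape forms are left as-is.
function decodeEscapes(sourceText) {
  return sourceText.replace(
    /\\x([0-9a-fA-F]{2})|\\([0-7]{1,3})/g,
    (match, hex, oct) =>
      String.fromCharCode(hex !== undefined ? parseInt(hex, 16) : parseInt(oct, 8))
  );
}

// Escaped text as it appears in the obfuscated source.
const hexText = "\\x57\\x53\\x63\\x72\\x69\\x70\\x74\\x2e\\x53\\x68\\x65\\x6c\\x6c";
const octText = "\\103\\72\\134\\134\\155\\141\\154\\167\\141\\162\\145\\56\\145\\170\\145";

console.log(decodeEscapes(hexText)); // WScript.Shell
console.log(decodeEscapes(octText)); // C:\\malware.exe (two literal backslashes)
```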
Custom Encoding
The author encodes strings using a custom algorithm and provides a decoder function to get back to the originals:
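
As a minimal sketch of this idea, the scheme below shifts every character code by a fixed amount and then reverses the string. The scheme itself and the names SHIFT, encode and decode are assumptions chosen for illustration, not a specific encoder seen in the wild.

```javascript
// Assumed toy scheme: shift every character code by SHIFT, then reverse.
const SHIFT = 3;

// Run once by the author to produce the string that ships in the code.
function encode(plain) {
  return plain
    .split("")
    .map((ch) => String.fromCharCode(ch.charCodeAt(0) + SHIFT))
    .reverse()
    .join("");
}

// Shipped with the code so the original string can be recovered at run time.
function decode(encoded) {
  return encoded
    .split("")
    .reverse()
    .map((ch) => String.fromCharCode(ch.charCodeAt(0) - SHIFT))
    .join("");
}

const hidden = encode("WScript.Shell");
console.log(hidden);         // oohkV1wslufVZ
console.log(decode(hidden)); // WScript.Shell
```

Only the encoded string and the decoder need to appear in the delivered script; the encoder can stay with the author.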

Rating: 5 – This method is better than the previous example, as the analyst would need access to the decoder function to reveal the ASCII strings. Again, a debugger would be useful in this case.
Name Substitution
Replace all variable, constant and function names with non-meaningful names, which often are very similar to each other in order to confuse analysts:

var lllll = "WScript.Shell";
var lll1l = "C:\\malware.exe";
var l1lll = new ActiveXObject(lllll);
var ll1ll = function(llllll) {
    llllll.Run(lll1l);
};
ll1ll(l1lll);

Rating: 1 – This method is arguably more of a hindrance to analysts than anything and does not help avoid signature detection. It can be overcome with find and replace in your text editor if need be. It is also not applicable to compiled languages like C++.
Whitespace Reduction
Remove all unnecessary white space and compress code into as little space as possible:

var lllll="WScript.Shell";var lll1l="C:\\malware.exe";var l1lll=new ActiveXObject(lllll);var ll1ll=function(llllll){llllll.Run(lll1l);};ll1ll(l1lll);

Rating: 1 – Again, this method is arguably a hindrance to analysts and does not help avoid signature detection. It can be overcome with a code formatter. It is also not applicable to compiled code.
Dead Code Insertion
This method increases confusion by inserting code that is never called, or that runs but has no effect on the result:
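
A minimal, harmless sketch of what this can look like is given below. The function names and the junk arithmetic are hypothetical; only the final return statement contributes to the result.

```javascript
// Never called with useful effect: exists only to distract the reader.
function unusedHelper(values) {
  return values.map((v) => v * 42).filter((v) => v % 7 === 3);
}

function buildName() {
  // Meaningless loop: the result is computed and then ignored.
  let junk = 0;
  for (let i = 0; i < 100000; i++) {
    junk = (junk + i * 31) % 9973;
  }

  // Condition that can never be true, guarding code that never runs.
  if (junk < 0) {
    return unusedHelper([junk]).join("");
  }

  // The only line that matters.
  return "WScript" + "." + "Shell";
}

console.log(buildName()); // WScript.Shell
```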

Meaningless loops also have the added advantage of being able to trick emulators into halting analysis of code if it is taking too long. Please see here for an example.
Rating: 5 – In my view, this method wastes an analyst’s valuable time on finding out which code is actually executed, and if used correctly it can bypass code emulation checks. It can also be useful in compiled code: for example, used in conjunction with a packer, it can make it more difficult to find the real entry point of the code.
Pass Arguments at Runtime
The author can write the code to expect critical values to be passed into the programme at runtime. For example, a Java applet’s variables could all be encrypted inside the code and require a decryption key to be passed in. The analyst may have the applet code but may not have access to the packet capture which might reveal what the key is:

/* HTML code which the analyst may not have access to */
<html>
<object type="application/x-java-applet" width="0" height="0">
<param name="archive" value="badjar.jar"/>
<param name="key" value="123456789abcdef"/>
</object>
</html>

/* Java code which would be found in the badjar.jar applet */
// Requires javax.crypto.Cipher and javax.crypto.spec.SecretKeySpec
private String decrypt(String encryptedString) throws Exception {
    // The key is not stored in the applet; the embedding page supplies it at runtime
    String key = getParameter("key");
    SecretKeySpec skeySpec = new SecretKeySpec(key.getBytes(), "AES");
    Cipher cipher = Cipher.getInstance("AES");
    cipher.init(Cipher.DECRYPT_MODE, skeySpec);
    byte[] original = cipher.doFinal(encryptedString.getBytes());
    return new String(original);
}

Please see here for an example of this kind of technique used in actual malware.
Rating: 7 – The analyst must have access to both the malware code and the key which was used to decrypt the strings within it.
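
The same dependency on an external key can be sketched in JavaScript. To keep the example short and self-contained, a repeating-key XOR with hex encoding stands in for real encryption such as AES; that substitution, and the names xorWithKey, encryptString and decryptString, are assumptions made for illustration.

```javascript
// Stand-in for real encryption (assumption: repeating-key XOR).
function xorWithKey(bytes, key) {
  return bytes.map((b, i) => b ^ key.charCodeAt(i % key.length));
}

// Used by the author ahead of time; returns a hex string.
function encryptString(plain, key) {
  const bytes = Array.from(plain, (ch) => ch.charCodeAt(0));
  return xorWithKey(bytes, key)
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

// Shipped with the code; useless without the key.
function decryptString(hex, key) {
  const bytes = hex.match(/../g).map((pair) => parseInt(pair, 16));
  return String.fromCharCode(...xorWithKey(bytes, key));
}

// The code ships with only the encrypted form of the string.
const shipped = encryptString("WScript.Shell", "123456789abcdef");

// The key arrives separately at run time, for example as a page parameter.
console.log(decryptString(shipped, "123456789abcdef")); // WScript.Shell
console.log(decryptString(shipped, "wrong-key") === "WScript.Shell"); // false
```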
Packing
It’s not just constant strings that can be obfuscated. The JavaScript eval function lets a programmer pass in JavaScript code as a string, which is then evaluated and executed at runtime. So we can pack our whole programme into a variable, then decode and evaluate it at run time:

// A simple function to decode a hex string whose bytes have been XORed with 0x0C
function decode(encoded) {
    var decoded = "";
    for (var i = 0; i < encoded.length; i += 2) {
        // Read two hex digits, convert them to a number and undo the XOR
        var s = String.fromCharCode(parseInt(encoded.substr(i, 2), 16) ^ 0x0C);
        decoded = decoded + s;
    }
    return decoded;
}
// The packed programme is split across three string literals for readability only
eval(decode("7a6d7e2c606d79626f64697e2c312c62697b2c4d6f78657a6954436e66696f78242e5b5f2e2c272c68696f636869242e3a" +
"6a3b693a393b6f3b343e3e396a3a382e252c272c2e69602e2c272c5f787e65626b226a7e63614f646d7e4f636869243d3c3425253" +
"72c606d79626f64697e225e7962242e4f365050616d602e2c272c68696f636869242e3b6e3a683b693a353e3e3a353b383a352e252537"));

JavaScript is not the only language that can be used this way. Java’s reflection API allows classes to be defined by a string which can then be loaded and executed at runtime. Packing an executable is a technique which compresses the whole executable and provides a single unpacking function to uncompress the actual code and run it. This is probably the most common technique used to obfuscate code. Below are some examples using the various programming languages:
Executable packing malware
Packed Java exploit
Packed javascript evaluation
Rating: 9 – While it is possible to use debuggers to set breakpoints and examine the contents of the strings, this is time consuming, and matters are further complicated if the values have been packed multiple times. In the case of executable packing, an analyst must find the point in the assembly code where the unpacking routine finishes, which is not trivial. It is advisable, in this case, to use an emulator in a safe environment and examine system behaviour, such as file system changes or network activity.
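
For script-level packing, one common analyst approach is to print each decoded layer instead of evaluating it. The sketch below assumes the same hex-plus-XOR-0x0C encoding described above; encode is a hypothetical counterpart used only to build harmless, doubly packed test input, and the fixed prefix and suffix stripping assumes each layer has exactly the form eval(decode("...")).

```javascript
// Decoding as described above: hex pairs XORed with 0x0C.
function decode(encoded) {
  let decoded = "";
  for (let i = 0; i < encoded.length; i += 2) {
    decoded += String.fromCharCode(parseInt(encoded.substring(i, i + 2), 16) ^ 0x0c);
  }
  return decoded;
}

// Hypothetical packer counterpart, used here to build benign test input.
function encode(plain) {
  let encoded = "";
  for (let i = 0; i < plain.length; i++) {
    encoded += (plain.charCodeAt(i) ^ 0x0c).toString(16).padStart(2, "0");
  }
  return encoded;
}

// Pack a harmless statement twice to imitate layered packing.
const inner = encode('console.log("hello");');
const packed = encode('eval(decode("' + inner + '"));');

// Analyst's view: print each layer instead of evaluating it.
const prefix = 'eval(decode("';
const suffix = '"));';
const layer1 = decode(packed);
console.log(layer1); // an eval(decode("...")) wrapper around the inner hex string
const layer2 = decode(layer1.slice(prefix.length, -suffix.length));
console.log(layer2); // console.log("hello");
```

Nothing packed is ever executed here; each layer is only turned back into text and inspected.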
Commercial Tools
Commercial obfuscators combine all of these techniques, which makes life very difficult for analysts and makes it virtually impossible for signature-based detection to work on malicious code. Fortunately, there are also de-obfuscation tools which can help us. The table below lists some of these tools:

Language	Obfuscators	Analyst Tools
JavaScript	Dean Edwards Packer; Free Javascript Obfuscator; JS Minifier; Stunnix	JSBeautifier – copy and paste your script into their website; does code formatting and unpacking, and is able to detect certain obfuscators. JSUnpack – Python source code available, or use their website; able to detect hidden HTTP connections being created, among other things. JSDetox – offers a web application where analysts can upload JavaScript files. SpiderMonkey – Firefox’s JavaScript engine; the command-line version can evaluate scripts outside of the browser. Firebug – JavaScript debugger for use within Firefox.
Java	Allatori; CafeBabe; JBCO; ProGuard	JDO – decompiles and deobfuscates class files. Procyon – decompiles class files into Java files.
Assembly	UPX; CExe; RLPack; FSG; Themida	UPX – able to unpack UPX-packed executables. OllyDbg – debugger which enables memory dumping of a packed executable and import table reconstruction. ChimpREC – allows a process to be dumped and its import table to be fixed.