15 Aug 2020

Deobfuscating AutoIt scripts, part 2

Almost 4 years ago, I wrote a blogpost about deobfuscating a simple AutoIt obfuscator. Today I have a new target which is using a custom obfuscator. smile

Update: This obfuscator is called ObfuscatorSG and can be downloaded from Github. Thanks Bartosz Wójcik!

Author had a very specific request about the methods used to solve the crackme:

If I'm allowed to be picky, I'm primarily interested in scripted efforts to RegEx analyze strings/integers. Very little effort (as in none) went into hiding the correct string. The script was merely passed-through a self-made obfuscator.

In this article I'll show the things I tried, where and how I failed miserably and my final solution for this crackme. I really suggest that you download the crackme from tuts4you and try replicating each step along the way, that way it will be much easier to follow the article.

So, let's get started!
Read More

02 Dec 2016

Deobfuscating AutoIt scripts

Every once in a while, someone posts an interesting challenge concerning protected or obfuscated AutoIt scripts. Today I'd like to show some basic approaches to AutoIt deobfuscation. As a target I'll use a very simple protection called AutoGuardIt and the crackme from Tuts4You thread. If you don't have access to Tuts4You, here is the alternative download link: https://www.mediafire.com/?qs52emp7tkk472g

In general, there is nothing hard in decompiling AutoIt scripts. The Autoit script interpreter is designed in such a way that it's really easy to convert P-Code back to the script form. There's also a tidy.exe utility which takes ugly hand-written script and reformats it to make it really pretty. All of this makes writing deobfuscators much easier because you can start with well-formatted AutoIt script and your deobfuscator can consist of simple regexps and string replaces. It will not be very pretty code but it will work.

While I was preparing this blog post, SmilingWolf came up with a full-featured solution written in Python. It's a nice solution but it doesn't explain how or why it works. So, in this article I will explain how the protection works, show the basic techniques and sample source code to defeat each of the protection steps. Making a full-featured deobfuscator is left as an exercise for the reader.

Required tools

  • C# compiler. All my examples were tested under Visual Studio 2010 but any recent version should do
  • MyAutToExe. I'm using my personal modification of myAutToExe. You can download it from Bitbucket: https://bitbucket.org/kao/myauttoexe
  • Tool for testing regexps. I'm using http://regexr.com/
  • Some brains. You can't become a reverser if you can't think for yourself.

Decompiling the script

There are 2 public tools for extracting compiled AutoIt script: MyAutToExe and Exe2Aut.

Exe2Aut uses dynamic approach for obtaining script - it runs the file and gets decrypted and decompressed script from process memory. That's usually the easiest way but you really don't want to run the malware on your computer.

MyAutToExe uses static approach - it analyzes file and tries to locate, decrypt and decompress the script on its own. That's more safe approach but it's easier to defeat using different packers, modified script markers and so on. To extract script from this crackme, I used my own MyAutToExe (see "Required tools" section above).

Analyzing the obfuscation

Once the script is extracted and decompiled, it looks quite strange and unreadable:

Let's look at each of the obfuscation techniques and see how it works and how it can be defeated.

Integer decomposition

AutoGuardIt takes constants and converts them to series of math operations. Example:

Deobfuscator should be able to take the expression, evaluate it and replace the expression with the correct value.

The biggest problem here is the precedence of operations (multiply and divide should be processed before addition and subtraction), so you can't start from the beginning of line and do it one step at a time. This would be wrong:

After some thinking and few Google searches, I found a LoreSoft.MathExpressions library that does all the heavy lifting for me. smile

The following C# code snippet will find all math expressions, extract them, evaluate them and replace expression with the actual value:

Pseudo-random integers

This is quite strange protection that relies on a fact that AutoIt's Random function is actually pseudo-random number generator. If you seed it with the same seed, you get the same results. Example:

In general, it's a very bad idea because there's no guarantee that random number generator will not change in the next version of AutoIt. But for now it works..

Since I was already using myAutToExe, I decided to use RanRot_MT.dll from the package.

a = StringLen("xyz")

Small integers can be obfuscated by using function StringLen:

To clean them up, a simple regex can be used:

End result:

Chr(x)

Some strings in the executable are split into bytes, and each byte is then encoded as call to Chr function:

Another simple regex will kill all of those:

The result will be valid but still hardly readable string:

And one more simple search-replace will fix that:

End result:

If 1 Then

Once you remove the integer obfuscations, you'll see a lot of useless statements like this:

This condition is always true, so we can remove both If and EndIf lines and improve readability.

The problem here is that If's can be nested and you can't just remove first EndIf that you encounter. Consider this example:

Taking all that into account, I came up with this ugly but working* code:

* - see below for some scenarios where this code might fail.

If a = a Then

Variation of previous protection. In such cases, using regexp is much more efficient than simple string comparison

Do Until 1

It's pretty much the same protection as If 1 Then and it can be defeated the exact same way.

While 1 / ExitLoop / WEnd

Another protection of the same kind, just uses 3 lines of code instead of 2. Same approach, just make sure you match the correct lines and remove all 3 of them.

For $random=0 to 123.456 / ExitLoop / Next

Another protection, very similar to previous ones.

Here one must be very careful not to remove the real for cycles from program, so it's better to use regexps. Apart from that, it's pretty much the same code again.

Assign/Execute

This type of protection relies on AutoIt's Assign function. First, an alias to a function is defined:

Later, alias is used to call the function:

Deobfuscation is a simple operation: find all calls to Assign, extract the variable name and the function name, then replace all references to the variable with the function name:

BinaryToString

As you can see in the example above, some strings in the script are replaced with calls to BinaryToString. Here's a another example of the same protection where part of code is replaced with BinaryToString + call to Execute.

Merging all 3 lines into one and converting hex strings to bytes gives us the following code:

which using the methods described earlier can be deobfuscated as:

Functions returning constants

Some strings are not only encoded using BinaryToString but also moved to a separate function.

Deobfuscated code will look like this:

It's quite tricky to find the correct return value for each function and replace function calls with correct values. In addition to that, regexes aren't really suitable for such tasks. smile The code I wrote is really ugly, so I'm not going to show it. Go, figure it out yourself! smile

Switch control-flow obfuscation

This is actually the hardest one of them all. The example code looks like this:

You must find the initial value of the variable. Then you must find the correct start/end of the Switch, all the switch cases and all other assignments to the control variable. Then you'll be able to reorder all the code. It's a moderately hard problem for which I don't have a pretty solution. smile

Here's my code which seems to work:

After cleanup, the deobfuscated code looks like this:

Unused variable assignments

There are random variable assignments sprinkled all over the code.

They are only assigned once and never used. To clean them up, one can locate the assignments using regex, count how many times this variable appears in the code and remove it, if it's only assigned once. Something like this:

You can remove string and integer assignments using very similar regex.

Lather, rinse, repeat

If you run each of the mentioned deobfuscations only once, you'll end up with half-deobfuscated code. Also, there isn't one specific order in which the deobfuscations should be applied. Of course, you could just run the entire loop 100 times, but it's just ugly.

Therefore, I prefer to run the code in the loop like this:

It will run all the deobfuscations until there's nothing left to clean up. Then run tidy.exe on the output and you'll get a nicely readable script.. smile

Possible problems and gotchas

Deobfuscators based on string matching are very easy to implement. However, one must be very careful to write correct regular expressions and string comparisons. My example code works very well on this specific crackme. However, it will mess up with code like this:

You can work around such issues by using more specific regexes with anchors. During my research, I used http://regexr.com/ a lot, and I find it really helpful. Give it a try!

Conclusion

In this article I showed basic approaches that can be used to deobfuscate (not only) AutoIt scripts. They are extremely simple and work well for one-time tasks. For more complex tasks, deobfuscators based on abstract syntax trees or disassembled P-Code are much more efficient but more time-consuming to create.

Have fun!