15 Aug

Deobfuscating AutoIt scripts, part 2

Almost 4 years ago, I wrote a blog post about deobfuscating a simple AutoIt obfuscator. Today I have a new target, which uses a custom obfuscator. :)

Update: This obfuscator is called ObfuscatorSG and can be downloaded from GitHub. Thanks Bartosz Wójcik!

The crackme's author had a very specific request about the methods used to solve it:

If I'm allowed to be picky, I'm primarily interested in scripted efforts to RegEx analyze strings/integers. Very little effort (as in none) went into hiding the correct string. The script was merely passed-through a self-made obfuscator.

In this article I'll show the things I tried, where and how I failed miserably, and my final solution for this crackme. I really suggest that you download the crackme from tuts4you and try replicating each step along the way; it will make the article much easier to follow.

So, let's get started!

Required tools

  • myAutToExe. I'm using my personal modification of myAutToExe, but even the standard version should work;
  • A C# compiler. I used VS2017 but any version will do;
  • Some library that evaluates math expressions. Just like in my previous article, I used the MathExpressions library from the LoreSoft.Calculator project;
  • A tool for testing regexes. I'm using Regexr;
  • Some brains. Writing deobfuscators is like 80% thinking, 20% writing the actual code.

First steps

The first steps are easy: unpack UPX, extract the tokens and decompile. The process has been described numerous times, so just google for details.
Once decompiled, the code looks something like this:

Horrible, isn't it? :)

Cleaning up the math

So, let's get rid of the math expressions first! In my previous post, I used the following regex + math library to clean up the stuff:

I tried it here and it failed on floating point numbers like this:

I fixed that and the regex started to work. Sort of. There are evil parentheses everywhere, and my regex doesn't handle them.

So, I added a second regex to support parentheses at the beginning and the end of the expression. What could possibly go wrong? :D

As I learned a few hours later, a lot! See, for example, here:

First, the regex matched the stuff inside the parentheses, 4 * 21 - 80, and computed it. Then it matched the expression 18 - 71 and computed that.

Well, it's already f*cked up, because that's not the correct order of operations. Multiplication has a higher precedence than subtraction!
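To make the failure concrete, here is a minimal Python sketch of that kind of "fold whatever the regex finds first" loop. It is not the code from the article (the real tool is written in C#), and the sample expression is a hypothetical one assembled from the fragments mentioned above; the helper names are made up for illustration.

```python
import re

NUMBER = r"-?\d+(?:\.\d+)?"
# Leftmost "number operator number", with no notion of precedence.
BINARY = re.compile(rf"({NUMBER})\s*([-+*/])\s*({NUMBER})")
# Innermost parenthesised group (no nested parentheses inside).
PARENS = re.compile(r"\(([^()]+)\)")


def apply(a, op, b):
    a, b = float(a), float(b)
    return {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[op]


def fmt(value):
    # Print whole numbers without a trailing ".0".
    return str(int(value)) if value == int(value) else repr(value)


def reduce_flat(expr):
    # Repeatedly fold the leftmost binary operation, ignoring precedence.
    while True:
        m = BINARY.search(expr)
        if m is None:
            return expr.strip()
        expr = expr[:m.start()] + fmt(apply(*m.groups())) + expr[m.end():]


def naive_eval(expr):
    # First collapse parenthesised groups, then fold what is left.
    while True:
        m = PARENS.search(expr)
        if m is None:
            return reduce_flat(expr)
        expr = expr[:m.start()] + reduce_flat(m.group(1)) + expr[m.end():]


# Hypothetical sample expression, not taken from the crackme.
sample = "18 - 71 * (4 * 21 - 80)"
naive = naive_eval(sample)              # folds "18 - 71" before the multiplication
correct = 18 - 71 * (4 * 21 - 80)       # what the expression really evaluates to
```

With this sample the naive loop ends up at -212, while the correct value is -266. The parentheses are handled fine; it is the left-to-right folding of `18 - 71` before the multiplication that ruins the result.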

At this point managing regexes was becoming so messy that I stopped. This is not going to work, I need a new approach!

Matching parentheses

If you want to read more about crazy regexes to find matching parentheses, this StackOverflow discussion is a good place to start. But I decided to keep it simple.

There are several algorithms, but the simplest one is just counting opening/closing parentheses until you find the correct one.
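As a rough illustration, the counting approach can be sketched in a few lines of Python (the original tool is C#, and `find_matching_paren` is a name made up for this sketch). It assumes there are no parentheses hidden inside string literals, which a real deobfuscator would have to deal with.

```python
def find_matching_paren(text, open_pos):
    """Return the index of the ')' that closes the '(' at open_pos, or -1."""
    if text[open_pos] != "(":
        raise ValueError("open_pos must point at an opening parenthesis")
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "(":
            depth += 1          # one level deeper
        elif text[i] == ")":
            depth -= 1          # one level back up
            if depth == 0:
                return i        # this one closes the parenthesis we started at
    return -1                   # unbalanced input


line = "x = (1 + (2 * 3)) - 4"
start = line.index("(")
end = find_matching_paren(line, start)
expression = line[start:end + 1]
```

Here `expression` ends up holding the whole outer group, including the nested one.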

Now I can take the expression I found and pass it to LoreSoft.MathExpressions. Right?

Wrong. Parentheses are also used in function definitions or when passing parameters to another function:

So, I added another check to see if the extracted expression looks like a math expression. And it seemed to work.
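The article does not show what that check looked like, so the following Python sketch is only one plausible way to do it: accept a candidate only if it consists purely of digits, whitespace, dots, operators and parentheses, and contains at least one operator with operands on both sides. The function name and both patterns are assumptions.

```python
import re

# Only digits, whitespace, dots, arithmetic operators and parentheses.
MATH_ONLY = re.compile(r"^[\d\s.+\-*/^()]+$")
# At least one operator with an operand on both sides.
HAS_OPERATOR = re.compile(r"[\d)]\s*[-+*/^]\s*[\d(-]")


def looks_like_math(candidate):
    return bool(MATH_ONLY.match(candidate)) and bool(HAS_OPERATOR.search(candidate))
```

A parameter list such as `($a, $b)` is rejected because of the `$` and the comma, and a lone `(42)` is rejected because there is nothing to compute.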

Problematic minus signs

The next problem I encountered was LoreSoft.MathExpressions complaining about some expressions like these:

Apparently, the library can handle negative numbers when they are alone, but the combination of a negative sign and parentheses like "(-(1 + 2))" just confuses the hell out of it. Since there were only a few cases in the crackme, I manually edited them:

Another problem solved!

Fixing math library

To continue my journey of failures, some of the calculated expressions were really, really strange. For example:

That doesn't look right! The original line was

77 to the power of 1 equals 77. Divided by 11 equals 7. Minus 1 equals 6. So the result should definitely be 6. Why the hell do we have 0.48422...?

It turns out that LoreSoft.MathExpressions is buggy and the "raise to power" operator doesn't have the correct precedence. See the source:

Raise to power doesn't have any special handling, so it's handled after the division or multiplication. Which is terribly wrong but really easy to fix:

Finally, the math problems are solved! :)
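The strange 0.48422 value is exactly what you get when the division is performed before the power. A quick check in Python, where `**` is the power operator; the expression itself is reconstructed from the description above, not copied from the crackme:

```python
# Power first, then division, then subtraction: the intended order.
intended = (77 ** 1) / 11 - 1

# Division first, then power: what the buggy precedence effectively computes.
buggy = 77 ** (1 / 11) - 1

# Python's own precedence agrees with the intended order.
python_default = 77 ** 1 / 11 - 1
```

`intended` and `python_default` are both 6.0, while `buggy` comes out at roughly 0.48422.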

Function names

After solving math problems, methods are starting to look a bit better:

Now we need to get rid of those obfuscated variable names like $_L1111L1L11L and replace them with proper function names. But what exactly is $_L1111L1L11L? I ran a simple grep, and there are 11 references in the code: 1 declaration of the variable, 7 uses of the variable and 3 assignments:

That's interesting. :/

First of all, AutoIt allows you to do this weird thing where you assign a function to a variable. Then you can use this variable to call the function. The crackme that I solved in my previous post used a combination of Assign + Execute methods for the same purpose.

Second, you can have several assignments to the same variable. But which one is the correct one? First one? Last one? A random one?

There is no magic solution here, you just need to go through the script and see the execution flow. In AutoIt, anything that's not inside a function is considered to be main code and will be executed starting from the top. So, I went through the script and left only the interesting parts:

This is the order in which the functions will be called. First _LL1111LL1L1L() and then _LL1LLL1() will be executed. Then inside the Switch we'll take Case 1 because that's the value of global variable $_11L1LL1. So, that will call _111LL111LL() and _11111L1LLL1(). Finally, _LL1111L1L1() will be called.

Method _LL1111LL1L1L() does the first assignments:

Then _LL1LLL1() reassigns some (or maybe all) of the variables:

And so on...

I'm too lazy to analyze all of the assignments, so I just reimplemented all 3 methods in my code.

Of course, I did not type all the assignments manually. A simple regex "search and replace" created the C# code from the AutoIt code. :)
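The same idea can also be sketched without generating any code: walk through the assigning functions in the order they are executed and let later assignments overwrite earlier ones. The Python below is only an illustration of that principle, not the C# used for the actual tool, and every variable and function name in the sample bodies is hypothetical.

```python
import re

# One "$variable = FunctionName" assignment per line.
ASSIGNMENT = re.compile(r"^[ \t]*(\$\w+)[ \t]*=[ \t]*([A-Za-z_]\w*)[ \t]*$", re.MULTILINE)


def collect_aliases(bodies_in_execution_order):
    aliases = {}
    for body in bodies_in_execution_order:
        for variable, function in ASSIGNMENT.findall(body):
            aliases[variable] = function    # a later assignment wins
    return aliases


# Hypothetical function bodies, listed in the order they would be called.
first_body = "$_L1111L1L11L = STRINGLEN\n$_L1L1 = CHR"
second_body = "$_L1111L1L11L = BITXOR"

aliases = collect_aliases([first_body, second_body])
```

After both bodies are processed, `$_L1111L1L11L` maps to `BITXOR` because the second body reassigned it, while `$_L1L1` still maps to `CHR`.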

Now I have a dictionary of variable names and the actual function names. Let's just run a simple search and replace!

...and we'll fuck up again.

See for example here:

If you start from the first string and do a dumb search-and-replace, you'll replace the wrong substring and get a result like this:

For the exact same reason, you should avoid touching local variable names.
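The substring trap is easy to reproduce. The Python sketch below uses two hypothetical names where one is a prefix of the other, and shows one possible way around the problem: match whole `$identifier` tokens and look each one up in the dictionary. It is not necessarily how the final solution in the article worked, and it does nothing about local variables that reuse a global's name.

```python
import re

# Hypothetical alias table; "$_L1L" is a prefix of "$_L1L1".
aliases = {"$_L1L": "CHR", "$_L1L1": "STRINGLEN"}
code = "$_L1L1($_L1L(65))"

# Dumb search-and-replace, one dictionary entry after another.
naive = code
for variable, function in aliases.items():
    naive = naive.replace(variable, function)

# Token-based replacement: only complete variable names are looked up.
VARIABLE = re.compile(r"\$\w+")


def rename_whole_tokens(source, table):
    return VARIABLE.sub(lambda m: table.get(m.group(0), m.group(0)), source)


renamed = rename_whole_tokens(code, aliases)
```

The naive version produces `CHR1(CHR(65))`, while the token-based one gives `STRINGLEN(CHR(65))` and leaves any variable that is not in the table alone.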

My final search-and-replace solution looked like this:

Bit operations

All the hard stuff is done, I promise! We're just a few fuckups away from the solution! :)

Our test method now looks like this:

I decided to use regex loops from my old article:

...and it failed. Some of the calculated numbers just didn't make any sense.

This issue is a little bit tricky. To figure it out, you need to read the documentation for each method used:

Match.NextMatch:
Returns a new System.Text.RegularExpressions.Match object with the results for the next match, starting at the position at which the last match ended (at the character after the last matched character).

Regex.Replace:
In a specified input string, replaces a specified maximum number of strings that match a regular expression pattern with a specified replacement string.

Can you see a problem here? I couldn't. So, I spent ~20 minutes debugging it in Visual Studio.

Here's an image for you:

Several solutions are possible; I just got rid of NextMatch and used a big while loop instead.

TL;DR - DO NOT combine Match.NextMatch with Regex.Replace. It will bite you in the butt one day!
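The "big while loop" idea translates to any regex engine: after every replacement, search the modified string again from scratch instead of reusing a match that was computed on the old string. Here is a minimal Python sketch of that pattern for two-argument bit operations; the pattern and helper names are assumptions, and it ignores AutoIt's 32-bit wraparound and calls with more than two arguments.

```python
import re

# A bit operation whose two arguments are already plain numbers.
BIT_CALL = re.compile(r"\bBIT(XOR|AND|OR)\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)", re.IGNORECASE)

OPERATIONS = {
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}


def fold_bit_operations(code):
    while True:
        # Always search the current string, never a stale copy.
        m = BIT_CALL.search(code)
        if m is None:
            return code
        value = OPERATIONS[m.group(1).upper()](int(m.group(2)), int(m.group(3)))
        code = code[:m.start()] + str(value) + code[m.end():]


folded = fold_bit_operations("BITXOR(BITAND(12, 10), BITOR(1, 2))")
```

Because only calls with numeric arguments match, nested calls get folded from the inside out: the example collapses to `BITXOR(8, 3)` and finally to `11`.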

Chr() and string concatenation

Now we're getting somewhere! The code is looking better and better:

Cleaning up the CHR calls and string concatenation was easy. The regex and string replace from my previous article worked without any issues. :)
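For readers who don't have the previous article at hand, this is roughly what such a cleanup can look like. The Python sketch below is an approximation written for this walkthrough, not the original C# code: it turns `CHR(n)` into a one-character string literal and then keeps merging neighbouring literals joined by `&`. Non-printable character codes are not treated specially.

```python
import re

CHR_CALL = re.compile(r"\bCHR\s*\(\s*(\d+)\s*\)", re.IGNORECASE)
# Two adjacent double-quoted literals joined by "&"; "" is an escaped quote in AutoIt.
CONCAT = re.compile(r'"((?:[^"]|"")*)"\s*&\s*"((?:[^"]|"")*)"')


def chr_to_literal(m):
    character = chr(int(m.group(1)))
    # A double quote inside an AutoIt string literal is written as "".
    return '"' + character.replace('"', '""') + '"'


def fold_strings(code):
    code = CHR_CALL.sub(chr_to_literal, code)
    while True:
        merged = CONCAT.sub(lambda m: '"' + m.group(1) + m.group(2) + '"', code, count=1)
        if merged == code:
            return code             # nothing left to merge
        code = merged


cleaned = fold_strings('CHR(72) & CHR(105) & "!"')
```

The sample line first becomes `"H" & "i" & "!"` and is then merged step by step into the single literal `"Hi!"`.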

String reverse

We're left with one final problem: the STRINGREVERSE function.

We can use a simple regex loop to fix those. Just like the one we used for bit operations.
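A sketch of such a loop in Python, under the assumption that every argument is already a plain double-quoted literal without escaped quotes, and that the optional second parameter of StringReverse is not used:

```python
import re

REVERSE_CALL = re.compile(r'\bSTRINGREVERSE\s*\(\s*"([^"]*)"\s*\)', re.IGNORECASE)


def fold_string_reverse(code):
    while True:
        # Search the current string again after every replacement.
        m = REVERSE_CALL.search(code)
        if m is None:
            return code
        reversed_literal = '"' + m.group(1)[::-1] + '"'
        code = code[:m.start()] + reversed_literal + code[m.end():]


line = fold_string_reverse('If $s = STRINGREVERSE("terces") Then')
```

The hypothetical sample line turns into `If $s = "secret" Then`, and nested calls are unwound from the inside out.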

The end result

And this is what the serial check looks like after deobfuscation:

Sure, there is a lot of useless code left in the crackme. Variables are not renamed. I could spend another half hour and clean up all that mess. But I wasn't interested in that, I just wanted to solve the crackme. :)

Final thoughts

In this post I documented all my mistakes and fuckups while solving a rather simple crackme, so that others can learn from them. Reverse engineering is not an easy process and making mistakes is a huge part of it.

I have not failed. I've just found 10,000 ways that won't work. /Thomas A. Edison/

6 thoughts on “Deobfuscating AutoIt scripts, part 2”

  1. Avatar

    Thanks a lot kao for your detailed and step-by-step (try-fail-retry) walktrough ! It's very very interesting.

    Preparing code blocks and images for the post I'm sure took you lots of time, but the result is great and helps understanding the tricky parts!

    So thank you for the analysis of this 'rather simple crackme' (simple for you! LoL) ...

    Best Regards,
    Tony

    • Avatar

      Forgot to say ... this phrase 'We're just a few fuckups away from the solution! :)' made my day ;)

  2. Avatar

    This output is from recent {hidden link}

    I would like you to examine my work {hidden link} :)

    I did read your article about deobfuscation and incorporated some techniques to make it harder to deobfuscate the code, like anti-regex patterns.

    I didn't know about the function to variable assignment possibility within AutoIt and it looks like a cool thing to add.

    The floating point obfuscation looks like a good idea to make deobfuscation harder, I have tried it that once but the overall speed of floating point numbers processing put the entire process to ground (damn slow within PHP FPU calculations).

    • Avatar

      Thank you for that info, I will update my post with the obfuscator name and link. :)

      As for your tool - in 3-4 years when my next AutoIt-related blogpost comes, I might look at it. But not today.
