15 Aug 2020

Deobfuscating AutoIt scripts, part 2

Almost 4 years ago, I wrote a blogpost about deobfuscating a simple AutoIt obfuscator. Today I have a new target which is using a custom obfuscator. :)

Update: This obfuscator is called ObfuscatorSG and can be downloaded from Github. Thanks Bartosz Wójcik!

The crackme's author had a very specific request about the methods used to solve it:

If I'm allowed to be picky, I'm primarily interested in scripted efforts to RegEx analyze strings/integers. Very little effort (as in none) went into hiding the correct string. The script was merely passed-through a self-made obfuscator.

In this article I'll show the things I tried, where and how I failed miserably, and my final solution for this crackme. I really suggest that you download the crackme from tuts4you and try replicating each step along the way; that will make the article much easier to follow.

So, let's get started!

Required tools

  • myAutToExe. I'm using my personal modification of it, but even the standard version should work;
  • C# compiler. I used VS2017 but any version will do;
  • Some library that evaluates math expressions. Just like in my previous article, I used the MathExpressions library from the LoreSoft.Calculator project;
  • Tool for testing regexes. I'm using Regexr;
  • Some brains. Writing deobfuscators is like 80% thinking, 20% writing the actual code.

First steps

The first steps are easy: unpack UPX, extract tokens and decompile. The process has been described numerous times, so just google for details.
Once decompiled, the code looks something like this:

Horrible, isn't it? :)

Cleaning up the math

So, let's get rid of the math expressions first! In my previous post, I used the following regex + math library to clean up the stuff:

I tried it here and it failed on floating point numbers like this:

I fixed that and the regex started to work. Sort of. There are evil parentheses everywhere and my regex doesn't handle them.
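
As an illustration of the idea (a minimal Python sketch, not the C# regex used for this crackme), a number pattern that accepts an optional fractional part can be combined into a pattern for "flat" expressions without parentheses. The names NUMBER, FLAT_EXPR and find_flat_expressions are hypothetical.

```python
import re

# Hypothetical number pattern: optional sign, digits, optional fractional part.
NUMBER = r"-?\d+(?:\.\d+)?"

# A flat expression: one number followed by one or more "operator number" pairs.
# The lookbehind keeps the match from starting in the middle of an identifier
# or in the middle of another number.
FLAT_EXPR = re.compile(rf"(?<![\w.]){NUMBER}(?:\s*[-+*/]\s*{NUMBER})+")


def find_flat_expressions(line):
    """Return every parenthesis-free arithmetic fragment found in the line."""
    return [m.group(0) for m in FLAT_EXPR.finditer(line)]


print(find_flat_expressions("$a = 3.5 * 2 + 1"))
print(find_flat_expressions("18 - 71 * (4 * 21 - 80)"))
```

The second call returns two separate fragments, because the pattern stops at every parenthesis. That is exactly the limitation described above.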

So, I added a second regex to support parentheses at the beginning and the end of the expression. What could possibly go wrong? :D

As I learned a few hours later, a lot! See, for example, here:

First, the regex matched the stuff inside the parentheses, 4 * 21 - 80, and computed it. Then it matched the expression 18 - 71 and computed that.

Well, it's already f*cked up, because that's not the correct order of operations. Multiplication has a higher precedence than subtraction!
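
To make the damage concrete, here is a small Python check. The full expression is an assumption reconstructed from the two fragments quoted above (18 - 71 followed by a multiplication with the parenthesized part); the variable names are hypothetical.

```python
# Assumed shape of the original expression: 18 - 71 * (4 * 21 - 80)

# Step 1 of the fragment-by-fragment approach: the parenthesized part.
inner = 4 * 21 - 80

# Step 2: "18 - 71" is folded on its own, which silently turns the
# expression into (18 - 71) * inner.
fragment_by_fragment = (18 - 71) * inner

# What the expression really means: multiplication binds tighter than subtraction.
correct = 18 - 71 * (4 * 21 - 80)

print(fragment_by_fragment)  # -212
print(correct)               # -266
```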

At this point managing regexes was becoming so messy that I stopped. This is not going to work, I need a new approach!

Matching parentheses

If you want to read more about crazy regexes to find matching parentheses, this StackOverflow discussion is a good place to start. But I decided to keep it simple.

There are several algorithms, but the simplest one is just counting opening/closing parentheses until you find the correct one.
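
A minimal Python sketch of the counting approach, assuming the text contains no parentheses inside string literals (the function name find_matching_paren is hypothetical):

```python
def find_matching_paren(text, open_pos):
    """Return the index of the ')' that closes the '(' at open_pos, or -1."""
    if text[open_pos] != "(":
        raise ValueError("open_pos must point at an opening parenthesis")
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "(":
            depth += 1          # one level deeper
        elif text[i] == ")":
            depth -= 1          # one level back up
            if depth == 0:      # back at the level we started from
                return i
    return -1                   # unbalanced input


line = "x = (1 + (2 * 3)) - (4)"
start = line.index("(")
end = find_matching_paren(line, start)
print(line[start:end + 1])  # (1 + (2 * 3))
```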

Now I can take the expression I found and pass it to LoreSoft.MathExpressions. Right?

Wrong. Parentheses are also used in function definitions or when passing parameters to another function:

So, I added another check to see if the extracted expression looks like a math expression. And it seemed to work.
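
One possible form of such a check, as a Python sketch: accept the extracted text only if it consists purely of digits, whitespace, dots, arithmetic operators and parentheses. The exact rule used for the crackme is not shown in the post, so treat this as an assumption; looks_like_math is a hypothetical name.

```python
import re

# Only characters that can appear in a pure arithmetic expression.
MATH_CHARS = re.compile(r"[0-9\s.+\-*/^()]+")


def looks_like_math(expr):
    """Heuristic: True if expr contains nothing but arithmetic."""
    if not MATH_CHARS.fullmatch(expr):
        return False            # variables, commas, quotes, function names...
    has_digit = any(c.isdigit() for c in expr)
    has_operator = any(c in "+-*/^" for c in expr)
    return has_digit and has_operator


print(looks_like_math("(4 * 21 - 80)"))  # True
print(looks_like_math("($_x, 3)"))       # False
```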

Problematic minus signs

The next problem I encountered was LoreSoft.MathExpressions complaining about some expressions like these:

Apparently, the library can handle negative numbers when they are alone, but a combination of a negative sign and parentheses like "(-(1 + 2))" just confuses the hell out of it. Since there were only a few cases in the crackme, I manually edited them:

Another problem solved!
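
If there were many such cases, the manual edit could in principle be scripted. The Python sketch below is one conservative option, not what was done for this crackme: it rewrites a unary minus only when it sits at the very start of the expression or directly after an opening parenthesis, turning "-(" into "0-(" so the value stays the same. The name rewrite_unary_minus is hypothetical.

```python
import re

# A minus that is unary for sure: at the start of the expression, or right
# after "(", and followed by another "(".
UNARY_MINUS_PAREN = re.compile(r"(?:^|(?<=\())\s*-\s*\(")


def rewrite_unary_minus(expr):
    """Rewrite '-(' as '0-(' where the minus is clearly unary."""
    return UNARY_MINUS_PAREN.sub("0-(", expr)


print(rewrite_unary_minus("(-(1 + 2))"))   # (0-(1 + 2))
print(rewrite_unary_minus("2 - (1 + 2)"))  # unchanged: this minus is binary
```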

Fixing math library

To continue my journey of failures, some of the calculated expressions were really, really strange. For example:

That doesn't look right! The original line was

77 to the power of 1 equals 77. Divided by 11 equals 7. Minus 1 equals 6. So the result should definitely be 6. Why the hell do we have 0.48422...?

It turns out that LoreSoft.MathExpressions is buggy and the "raise to power" operator doesn't have the correct precedence. See the source:

Raise to power doesn't have any special handling, so it's handled after the division or multiplication. Which is terribly wrong but really easy to fix:

Finally, the math problems are solved! :)
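
The two numbers quoted above are consistent with that diagnosis. Assuming the original expression had the shape 77 ^ 1 / 11 - 1 (which is what the step-by-step arithmetic implies), a quick Python check reproduces both results; Python's ** operator stands in for AutoIt's ^ here.

```python
# Correct precedence: the power is applied first, then division, then subtraction.
correct = 77 ** 1 / 11 - 1

# Buggy precedence: the division is done first, so the exponent becomes 1/11.
buggy = 77 ** (1 / 11) - 1

print(correct)          # 6.0
print(round(buggy, 5))  # roughly 0.48422
```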

Function names

After solving math problems, methods are starting to look a bit better:

Now we need to get rid of those obfuscated variable names like $_L1111L1L11L and replace them with proper function names. But what exactly is $_L1111L1L11L? I ran a simple grep, and there are 11 references in the code: 1 declaration of the variable, 7 uses and 3 assignments:

That's interesting. :/

First of all, AutoIt allows you to do this weird thing where you assign a function to a variable. Then you can use this variable to call the function. The crackme that I solved in my previous post used a combination of the Assign + Execute methods for the same purpose.

Second, you can have several assignments to the same variable. But which one is the correct one? First one? Last one? A random one?

There is no magic solution here, you just need to go through the script and see the execution flow. In AutoIt, anything that's not inside a function is considered to be main code and will be executed starting from the top. So, I went through the script and left only the interesting parts:

This is the order in which the functions will be called. First _LL1111LL1L1L() and then _LL1LLL1() will be executed. Then inside the Switch we'll take Case 1 because that's the value of global variable $_11L1LL1. So, that will call _111LL111LL() and _11111L1LLL1(). Finally, _LL1111L1L1() will be called.

Method _LL1111LL1L1L() does the first assignments:

Then _LL1LLL1() reassigns some (or maybe all) of the variables:

And so on...

I'm too lazy to analyze all of the assignments, so I just reimplemented all 3 methods in my code.

Of course, I did not type all the assignments manually. A simple regex "search and replace" created C# code from the AutoIt code. :)
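
The same bookkeeping can be sketched in Python. This is an illustration only: it assumes each assignment sits on its own line in the form "$variable = FunctionName", and that the function bodies are passed in the order in which they execute. ASSIGN and collect_aliases are hypothetical names.

```python
import re

# Assumed assignment shape: "$variable = FunctionName" on a line of its own.
ASSIGN = re.compile(r"^[ \t]*(\$\w+)[ \t]*=[ \t]*([A-Za-z_]\w*)[ \t]*$", re.MULTILINE)


def collect_aliases(function_bodies):
    """Replay the assignments in execution order; later ones overwrite earlier ones."""
    aliases = {}
    for body in function_bodies:
        for variable, function_name in ASSIGN.findall(body):
            aliases[variable] = function_name
    return aliases


first = "$_A = STRINGLEN\n$_B = CHR"
second = "$_A = STRINGREVERSE"
print(collect_aliases([first, second]))
```

Because the dictionary is filled in execution order, the value that survives for each variable is the one that is in effect once all the initialization functions have run.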

Now I have a dictionary of variable names and the actual function names. Let's just run a simple search and replace!

...and we'll fuck up again.

See for example here:

If you start from the first string and do a dumb search-and-replace, you'll replace the wrong substring and get a result like this:

For the exact same reason, you should avoid touching local variable names.
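
One way to avoid the substring trap, shown as a Python sketch with made-up names (this is not the C# code from the post): match whole identifiers with a regex and look each one up in the dictionary, instead of replacing raw substrings.

```python
import re

# A whole AutoIt variable: "$" followed by word characters.
IDENT = re.compile(r"\$\w+")


def replace_aliases(code, aliases):
    """Replace only complete variable names that appear in the alias table."""
    return IDENT.sub(lambda m: aliases.get(m.group(0), m.group(0)), code)


# Hypothetical names where one is a prefix of the other.
aliases = {"$_L11": "CHR", "$_L11L": "STRINGLEN"}
code = "$_L11L($x) & $_L11(65)"
print(replace_aliases(code, aliases))  # STRINGLEN($x) & CHR(65)
```

A plain substring replace of "$_L11" on the same input would produce "CHRL($x) & CHR(65)". Local variables such as $x are left alone simply because they are not in the table.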

My final search-and-replace solution looked like this:

Bit operations

All the hard stuff is done, I promise! We're just a few fuckups away from the solution! :)

Our test method now looks like this:

I decided to use regex loops from my old article:

...and it failed. Some of the calculated numbers just didn't make any sense.

This issue is a little bit tricky. To figure it out, you need to read the documentation for each method used:

Match.NextMatch:
Returns a new System.Text.RegularExpressions.Match object with the results for the next match, starting at the position at which the last match ended (at the character after the last matched character).

Regex.Replace:
In a specified input string, replaces a specified maximum number of strings that match a regular expression pattern with a specified replacement string.

Can you see a problem here? I couldn't. So, I spent ~20 minutes debugging it in Visual Studio.

Here's an image for you:

Several solutions are possible; I just got rid of NextMatch and used a big while loop instead.

TL;DR - DO NOT combine Match.NextMatch with Regex.Replace. It will bite you in the butt one day!
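
The "big while loop" idea, sketched in Python rather than C#: search the current text, compute the value, splice it in, and search again from scratch. That way the match being evaluated always belongs to the text being edited. The sketch assumes two-argument calls with decimal literals and small non-negative values (AutoIt's bit functions work on 32-bit integers, Python's do not); BIT_CALL, OPS and fold_bit_ops are hypothetical names.

```python
import re

# Two-argument bit operations whose arguments are already plain integers.
BIT_CALL = re.compile(
    r"\b(BITXOR|BITAND|BITOR)\s*\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)",
    re.IGNORECASE,
)

OPS = {
    "BITXOR": lambda a, b: a ^ b,
    "BITAND": lambda a, b: a & b,
    "BITOR": lambda a, b: a | b,
}


def fold_bit_ops(code):
    """Fold constant bit operations, innermost calls first."""
    while True:
        m = BIT_CALL.search(code)      # always search the *current* text
        if m is None:
            return code
        value = OPS[m.group(1).upper()](int(m.group(2)), int(m.group(3)))
        # Splice the result in place of exactly the match we just evaluated.
        code = code[:m.start()] + str(value) + code[m.end():]


print(fold_bit_ops("BITXOR(BITAND(12, 10), 5)"))  # 13
```

Nested calls resolve naturally: the outer call only becomes matchable after the inner one has been replaced by a number, and the next search picks it up.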

Chr() and string concatenation

Now we're getting somewhere! Code is looking better and better:

Cleaning up the CHR calls and string concatenation was easy. The regex and string replace from my previous article worked without any issues. :)
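
For readers who don't have the previous article at hand, here is a rough Python sketch of the two steps (not the original C# code). It assumes printable ASCII codes, double-quoted literals without embedded quotes, and leaves CHR(34) alone so that it never has to emit a quote character. All names are hypothetical.

```python
import re

CHR_CALL = re.compile(r"\bCHR\s*\(\s*(\d+)\s*\)", re.IGNORECASE)
CONCAT = re.compile(r'"([^"]*)"\s*&\s*"([^"]*)"')


def _chr_literal(m):
    code_point = int(m.group(1))
    # Keep the call as-is for the quote character and for non-printable codes.
    if code_point == 34 or not 32 <= code_point < 127:
        return m.group(0)
    return '"' + chr(code_point) + '"'


def fold_strings(code):
    """Turn CHR(n) into one-character literals, then merge adjacent literals."""
    code = CHR_CALL.sub(_chr_literal, code)
    while True:
        merged = CONCAT.sub(lambda m: '"' + m.group(1) + m.group(2) + '"', code)
        if merged == code:      # nothing left to merge
            return code
        code = merged


print(fold_strings('CHR(72) & CHR(105) & "!"'))  # "Hi!"
```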

String reverse

We're left with one final problem: the STRINGREVERSE function.

We can use a simple regex loop to fix those. Just like the one we used for bit operations.
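
A Python sketch of such a loop, under the assumption that the argument is already a plain double-quoted literal without embedded quotes (REVERSE_CALL and fold_string_reverse are hypothetical names):

```python
import re

REVERSE_CALL = re.compile(r'\bSTRINGREVERSE\s*\(\s*"([^"]*)"\s*\)', re.IGNORECASE)


def fold_string_reverse(code):
    """Replace STRINGREVERSE("literal") with the reversed literal."""
    while True:
        m = REVERSE_CALL.search(code)   # re-search the current text each time
        if m is None:
            return code
        reversed_literal = '"' + m.group(1)[::-1] + '"'
        code = code[:m.start()] + reversed_literal + code[m.end():]


print(fold_string_reverse('STRINGREVERSE("olleh")'))  # "hello"
```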

The end result

And this is what the serial check looks like after deobfuscation:

Sure, there is a lot of useless code left in the crackme. Variables are not renamed. I could spend another half hour and clean up all that mess. But I wasn't interested in that; I just wanted to solve the crackme. :)

Final thoughts

In this post I documented all my mistakes and fuckups while solving a rather simple crackme, so that others can learn from them. Reverse engineering is not an easy process and making mistakes is a huge part of it.

I have not failed. I've just found 10,000 ways that won't work. /Thomas A. Edison/

14 thoughts on “Deobfuscating AutoIt scripts, part 2”

  1. Thanks a lot kao for your detailed and step-by-step (try-fail-retry) walkthrough! It's very very interesting.

    Preparing code blocks and images for the post I'm sure took you lots of time, but the result is great and helps understanding the tricky parts!

    So thank you for the analysis of this 'rather simple crackme' (simple for you! LoL) ...

    Best Regards,
    Tony

    • Forgot to say ... this phrase 'We're just a few fuckups away from the solution! :)' made my day ;)

  2. This output is from recent {hidden link}

    I would like you to examine my work {hidden link} :)

    I did read your article about deobfuscation and incorporated some techniques to make it harder to deobfuscate the code, like anti-regex patterns.

    I didn't know about the function to variable assignment possibility within AutoIt and it looks like a cool thing to add.

    The floating point obfuscation looks like a good idea to make deobfuscation harder. I tried that once, but the overall speed of floating point number processing ground the entire process to a halt (FPU calculations within PHP are damn slow).

    • Thank you for that info, I will update my post with the obfuscator name and link. :)

      As for your tool - in 3-4 years when my next AutoIt-related blogpost comes, I might look at it. But not today.

  3. Some code in the new version cannot be deobfuscated with the method you describe above. Could you please share some ideas for updating it? Do you have a newer tool that does that? If yes, kindly share it by email (frombinhdinh@gmail.com). Thank you so much!!!

  4. {hidden link}

    Generally, it was compiled and obfuscated in the same way, but I cannot find a method to decode it correctly. I have tried to use myAutToExe 2.15, but it could not be decoded.

    • Thank you, I'll try to look at it over the weekend.

      EDIT: oops, I can't download your file:

      The file you requested has been blocked for a violation of our Terms of Service.

      • just sent you the file by wetransfer. pls let me know if you still cant get it.
        {hidden link}

        • Thank you, I was able to download your archive. Now I just need to find some free time to look at it...

        • So, what exactly is your problem?

          Extracting and decompiling the script? There are plenty of new tools that work really well, so you do not need to use myAutToExe. Try, for example, UnAutoIt (originally written by x0r119x91). It is slow but works on all your files.

          Deobfuscating the script? For AT2D.exe and AT2D6C.exe you only need to fix 2 things. First, string concatenation:

          Should be deobfuscated to:

          It's a simple search and replace operation.

          Then, you will need to decode the data. For example:

          should be replaced with

          and then with hex-decoded version of the string:

          (my apologies in case the Vietnamese characters got broken on WordPress).

          It also downloads some important data from the web server. You can either get that data and replace the values, or not. But that's not an obfuscation problem.


          Your V4-Auto.exe, V6-Auto.exe and the rest use another obfuscator. I believe it's the old obfuscator from the AutoIt Script Editor package.

          For those files, you will need to do 4 things:
          #1 - constant initialization is done by this function on AutoIt startup:

          You'll need to process it, and figure out all values of $os array.
          #2 - once you know values of $os array, you can process the huge globals block at the start of the file:

          Here, you'll need to replace $os[0x3aa] with the correct value, and then replace a5700001f26(...) with the hex-decoded value of that string.
          #3 - once you know all values of the globals, you can replace all of them in the code.

          will become

          #4 - then clean up all these Number(...) obfuscations and get this:

          I think that covers it all. Easy, isn't it? :)

          • Thanks for your support
            All of them need to be approved by the response.text from a website to start the process. The utility also uses some text in the content response to work. Now it is hard to know what things are in the response text.

            for the file xxAuto, I think that they were obfuscated by the way that used data from a stream file. they only de-obfuscated when the stream file connected to the App.
            Also hard to know the value of all the Globals, it is too much and has some risk if the declaration doesn't match the function code lower.

            Anyway, many thanks for your efforts.

  5. oh, a giant variable, globals, and array need to be found exacted value. It takes too much time to do it, will revert to this later. check if you have good ideas for the project, and please share them with me.
