Use of syscall and sysenter in VMProtect 3.1

kao

A few days ago Xjun briefly mentioned a new feature of VMProtect 3.1: it uses direct syscalls to check if the software is running under a debugger. I decided to take a quick look myself and figure out how exactly it works.

I'll give you 2 targets for the research:

  • oppo_flash_tool.exe (https://mega.nz/#!ZgJzjQxR!cNEHMwM-jKnLVgPXf4OUupyk1DNt69FYB2rEfY-5AlA) - it was given as an example by Xjun;
  • asshurt.dll (https://mediafire.com/?3xyc0ugc2hxervn) - some sort of a cheat for Roblox. I don't care about the cheat itself, it just happened to have the syscall feature enabled. And it uses different syscalls than Oppo.

In addition to that, I'll provide a very simple demo executable which replicates part of the VMProtect protection, so that you don't waste time looking at obfuscated code.

As a debugger on a 32-bit OS, you can use anything you like. On a 64-bit OS you will really need WinDbg: as far as I know, it's the only debugger that can handle those tricks.

32-bit OS

Let's start by debugging oppo_flash_tool.exe. First, we need to get past the usual tricks like IsDebuggerPresent and CheckRemoteDebuggerPresent. If you're reading this, I'm sure you know how to do that.

A few moments later we'll arrive here:

00dcfd8e f744250000800000 test    dword ptr [ebp],8000h
00dcfd96 e9f6bdfdff      jmp     00dabb91
...
00dabb91 0f84841a0b00    je      00e5d61b

Remember this conditional jump. It's taken on a 32-bit OS and not taken on a 64-bit OS.

Let's look at the 32-bit OS version first. Now VMProtect prepares to call sysenter.

00e5d61b 56              push    esi
00e5d61c 85f8            test    eax,edi
00e5d61e 57              push    edi
00e5d61f 0facd074        shrd    eax,edx,74h
00e5d623 6633d3          xor     dx,bx
00e5d626 53              push    ebx
00e5d627 8bd9            mov     ebx,ecx
00e5d629 81fcfe567054    cmp     esp,547056FEh
00e5d62f 8bd3            mov     edx,ebx
00e5d631 35fb5cc775      xor     eax,75c75cfbh
00e5d636 c1e202          shl     edx,2
00e5d639 0fbae028        bt      eax,28h
00e5d63d 8bc5            mov     eax,ebp
00e5d63f 6681fbda38      cmp     bx,38DAh
00e5d644 8d0410          lea     eax,[eax+edx]
00e5d647 f7c615443214    test    esi,14324415h
...

Since the code is obfuscated, here comes a cleaned-up version. Please note that in other applications, different registers can be used.

    ;input:
    ;   ECX = number of parameters for the syscall
    ;   [EBP] = syscall id. See https://github.com/tinysec/windows-syscall-table
    ;   [EBP+4] .. [EBP+X] = params for the syscall
    ;   [EBP-4] .. [EBP-8] = free space to save registers
    push    esi
    push    edi
    push    ebx

    ; save register values for later
    lea     eax, [ebp+ecx*4]
    mov     dword ptr [ebp-4], eax
    mov     dword ptr [ebp-8], esp

    ; set up stack frame for syscall
setupParams:
    mov     eax, dword ptr [ebp+ecx*4]
    push    eax
    sub     ecx, 1
    jnz     setupParams

    ; put syscall number in EAX
    mov     eax, dword ptr [ebp]

    ; the actual call
    call    trampoline1

    ; restore stack and frame pointers
    mov     esp, dword ptr [ebp-8]
    mov     ebp, dword ptr [ebp-4]

    ; save result
    mov     dword ptr [ebp], eax

    ; restore registers
    pop     ebx
    pop     edi
    pop     esi
    jmp     00de21de

trampoline1:
    call    trampoline2
    retn

trampoline2:
    mov     edx, esp
    sysenter
    retn 


// -- continue VM execution as usual --
00de21de 8b06            mov     eax,dword ptr [esi]
00de21e2 8db604000000    lea     esi,[esi+4]
00de21ea 33c3            xor     eax,ebx
00de21ec 8d807cc2efb1    lea     eax,[eax-4E103D84h]
00de21f6 f7d0            not     eax
00de21fd 35ee613a76      xor     eax,763a61ee
00df9a28 48              dec     eax
00df9a29 f8              clc
00df9a2a c1c002          rol     eax,2
00df9a2e 33d8            xor     ebx,eax
00df9a32 03f8            add     edi,eax
00df9a34 e9a0eafbff      jmp     00db84d9
00db84d9 ffe7            jmp     edi
...
00ddaeba 8b542500        mov     edx,dword ptr [ebp]
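
The two nested trampolines in the cleaned-up x86 code look odd at first. My reading is that they recreate the stack layout that ntdll's own stubs produce: two return addresses on top, then the parameters, so that the first parameter sits at EDX+8 when sysenter executes. Here is a small Python model of that layout. It is only a sketch: the starting ESP and the parameter values are made up, and the return addresses are represented by placeholder strings.

```python
def build_sysenter_stack(params, esp=0x0018F000):
    """Model the stack that the cleaned-up x86 stub has built at `sysenter`.

    Returns (edx, memory); memory maps a stack address to the dword stored
    there. The starting ESP is an arbitrary, made-up value.
    """
    memory = {}

    def push(value):
        nonlocal esp
        esp -= 4
        memory[esp] = value

    # setupParams: ECX counts down from N to 1, so the last parameter is
    # pushed first and parameter 1 ends up at the lowest address.
    for index in range(len(params), 0, -1):
        push(params[index - 1])

    push("return into main stub")    # pushed by `call trampoline1`
    push("return into trampoline1")  # pushed by `call trampoline2`

    return esp, memory               # models `mov edx, esp`


# Hypothetical parameters, shaped like an NtQueryInformationProcess call.
edx, memory = build_sysenter_stack([0xFFFFFFFF, 0x1E, 0x0018F100, 4, 0])
for offset in range(0, 16, 4):
    print("[edx+%02Xh] = %r" % (offset, memory[edx + offset]))
```

With one trampoline instead of two, the first parameter would land at EDX+4 and the kernel would read everything shifted by one slot.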

64-bit OS

Here it gets interesting! 🙂 You cannot use the sysenter instruction from 32-bit code in 64-bit Windows. But, as ReWolf described a few years ago, one can mix x86 code with x64 code in the same process. And that's exactly what VMProtect 3.1 is doing.

Let's go back to that conditional jump and see what happens in 64-bit OS. The jump will not be taken:

00dabb91 0f84841a0b00    je      00e5d61b
00dabb97 9adf4ce8003300  call    0033:00E84CDF
00dabb9e 668cd1          mov     cx,ss
00dabba1 668ed1          mov     ss,cx

Far call?! The last time I saw one of those was in the 16-bit Windows era.

As explained in ReWolf's article:

Summing things up, for every process (x86 & x64) running on 64-bits Windows there are allocated two code segments:

  • cs = 0x23 -> x86 mode
  • cs = 0x33 -> x64 mode

So, as soon as you execute that call, you'll switch to the 64-bit world. WinDbg happily recognizes that; all other debuggers just go astray.
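
If you want to check the far call by hand, the 7 bytes split into the opcode (9A), a 32-bit offset and a 16-bit selector. The following Python sketch decodes the bytes from the listing and breaks the selector into its fields; the function names are mine, nothing here comes from VMProtect itself.

```python
import struct


def decode_far_call(raw):
    """Decode a 32-bit `call ptr16:32` (opcode 9A) into (selector, offset)."""
    opcode, offset, selector = struct.unpack("<BIH", raw)
    if opcode != 0x9A:
        raise ValueError("not a far call")
    return selector, offset


def describe_selector(selector):
    """Split a segment selector into descriptor index, table flag and RPL."""
    return {
        "index": selector >> 3,      # descriptor table index
        "ldt": bool(selector & 4),   # 0 = GDT, 1 = LDT
        "rpl": selector & 3,         # requested privilege level
    }


selector, offset = decode_far_call(bytes.fromhex("9adf4ce8003300"))
print(hex(selector), hex(offset))
print(describe_selector(0x23))   # x86 code segment
print(describe_selector(0x33))   # x64 code segment
```

Both selectors are ring 3 entries in the GDT; the only difference is which descriptor they pick.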

00dabb97 9adf4ce8003300  call    0033:00E84CDF

00000000`00e84cdf 56              push    rsi
00000000`00e84ce0 57              push    rdi
00000000`00e84ce1 53              push    rbx
00000000`00e84ce2 8bd9            mov     ebx,ecx
00000000`00e84ce4 8bd3            mov     edx,ebx
00000000`00e84ce6 33c9            xor     ecx,ecx
00000000`00e84ce8 81fb04000000    cmp     ebx,4
00000000`00e84cee 0f8606000000    jbe     00000000`00e84cfa
00000000`00e84cf4 8d8bfcffffff    lea     ecx,[rbx-4]
00000000`00e84cfa c1e103          shl     ecx,3
...

The x64 code does pretty much the same thing as the x86 code: it sets up a stack frame, sets up registers and then executes the syscall instruction. A cleaned-up and shortened version follows:

    ;input:
    ;   ECX = number of parameters for the syscall
    ;   [EBP] = encoded syscall id. 
    ;           High order byte = special handling info
    ;           Lowest 15 bits = syscall id
    ;   [EBP+4] .. [EBP+X] = params for the syscall
    ;   [EBP-8] .. [EBP-10h] = free space to save registers
    ;   [EBP-..] = free space to use in specific syscalls

    push    rsi
    push    rdi
    push    rbx
    mov     ebx,ecx
    mov     edx,ebx
    xor     ecx,ecx

    ; calculate new stack frame pointer
    cmp     ebx,4
    jbe     @F
    lea     ecx,[rbx-4]
@@:
    shl     ecx,3
    shl     edx,2
    mov     rax,rbp
    add     rax,rdx

    ; save registers
    mov     qword ptr [rbp-8],rax
    mov     qword ptr [rbp-10h],rsp

    ; adjust RSP
    sub     rsp,rcx
    and     rsp,0FFFFFFFFFFFFFFF0h
    add     rsp,rcx

    ; useless?
    mov     r10d,dword ptr [rbp]
    shr     r10d,9

    ; set up params for syscall
    test    ebx,ebx
    je      doneSettingParams

loopSetParams:
    mov     eax,dword ptr [rbp+rbx*4]
    cmp     ebx,1
    jne     @F
    mov     rcx,rax
    jmp     nextParam
@@:
    cmp     ebx,2
    jne     @F
    mov     rdx,rax
    jmp     nextParam
@@:
    cmp     ebx,3
    jne     @F
    mov     r8,rax
    jmp     nextParam
@@:
    cmp     ebx,4
    jne     @F
    mov     r9,rax
    jmp     nextParam
@@:
    push    rax

nextParam:
    sub     ebx,1
    jne     loopSetParams

doneSettingParams:
    ; check if syscall needs special handling
    mov     rax,qword ptr [rbp]
    mov     r10d,eax
    shr     r10d,18h

    ; 3 = NtQueryInformationProcess
    cmp     r10b,3
    jne     doSyscall

    ; fix current process pseudo-handle, if it's there
    cmp     ecx,0FFFFFFFFh
    jne     @F
    movsx   rcx,cl
@@:
    ; is this ProcessDebugObjectHandle request?
    cmp     edx,1Eh
    jne     @F

    ; if so, fix buffer and size for ProcessDebugObjectHandle request. 
    ; It should be 8-bytes long.
    lea     r10,[rbp-18h]
    mov     r8,r10
    mov     r9d,8
@@:
    jmp     doSyscall

doSyscall:
    and     eax,7FFFh
    sub     rsp,20h
    call    trampoline
    jmp     processResult

trampoline:
    mov     r10,rcx
    syscall
    ret

processResult:
    ; check for special handling again
    mov     r10d,dword ptr [rbp]
    shr     r10d,18h

    ; 3 = NtQueryInformationProcess
    cmp     r10b,3
    jne     returnToX86

    ; is this ProcessDebugObjectHandle ?
    cmp     dword ptr [rbp+8],1Eh
    jne     returnToX86

    ; were 2 buffers the same in original call?
    mov     ecx,dword ptr [rbp+0Ch]
    cmp     ecx,dword ptr [rbp+14h]
    je      @F

    ; if not, copy returned DebugObjectHandle back to original buffer
    mov     r10d,dword ptr [rbp-18h]
    mov     dword ptr [rcx],r10d
@@:
    jmp     returnToX86

returnToX86:
    nop
    mov     rsp,qword ptr [rbp-10h]
    mov     rbp,qword ptr [rbp-8]
    mov     dword ptr [rbp],eax
    pop     rbx
    pop     rdi
    pop     rsi
    retf

You'll notice that the x64 version is slightly more complex due to the way parameters are passed (registers vs. stack). It also includes special treatment for 8 edge cases: it modifies syscall parameters to adjust buffers and pointer sizes to satisfy the requirements of 64-bit code.
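
To make the register vs. stack business more concrete, here is a rough Python model of the "adjust RSP" and "set up params" parts of the x64 stub. It is a sketch based only on the cleaned-up listing: the starting RSP is made up, and memory is just a dictionary. It shows that the first four parameters go to RCX, RDX, R8 and R9, the rest are pushed, and that after the 20h bytes of spare space and the return address the fifth parameter ends up at RSP+28h on a 16-byte boundary.

```python
REGISTER_FOR_INDEX = {1: "rcx", 2: "rdx", 3: "r8", 4: "r9"}


def place_x64_args(params, rsp=0x0018EF38):
    """Return (regs, stack, rsp) as they would be at the `syscall` instruction."""
    regs, stack = {}, {}
    count = len(params)
    extra = max(count - 4, 0) * 8      # bytes needed for parameters 5..N

    # "adjust RSP": sub rsp,rcx / and rsp,-16 / add rsp,rcx
    rsp = ((rsp - extra) & ~0xF) + extra

    for index in range(count, 0, -1):  # EBX counts down from N to 1
        # `mov eax, dword ptr [...]` zero-extends the 32-bit value into RAX
        value = params[index - 1] & 0xFFFFFFFF
        if index in REGISTER_FOR_INDEX:
            regs[REGISTER_FOR_INDEX[index]] = value
        else:
            rsp -= 8                   # `push rax`
            stack[rsp] = value

    rsp -= 0x20                        # `sub rsp,20h`
    rsp -= 8                           # return address of `call trampoline`
    if "rcx" in regs:
        regs["r10"] = regs["rcx"]      # `mov r10,rcx` right before `syscall`
    return regs, stack, rsp


regs, stack, rsp = place_x64_args([1, 2, 3, 4, 5, 6])
print(regs)
print(hex(rsp), stack[rsp + 0x28], stack[rsp + 0x30])
```

Note the zero extension: a 32-bit pseudo-handle of -1 arrives in RCX as 00000000FFFFFFFF, which is why the stub has to repair it with movsx before calling NtQueryInformationProcess.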

NOTE - to keep the code simple, I only showed the part which deals with NtQueryInformationProcess, but the other cases are similar.

As you can see, returning from x64 to the x86 world takes a simple retf instruction. The x86 code continues right where it left off:
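
The encoded syscall id and the NtQueryInformationProcess fix-up can be summed up in a few lines of Python. This is a sketch of my reading of the cleaned-up listing, not VMProtect's code: the function names, the example word 0x03008019 and the scratch address are invented for illustration. I'm also assuming that the word tested by `test dword ptr [ebp],8000h` is this same encoded id, which would make bit 15 the "use the x64 path" flag.

```python
PROCESS_DEBUG_OBJECT_HANDLE = 0x1E          # info class tested by `cmp edx,1Eh`
SPECIAL_NT_QUERY_INFORMATION_PROCESS = 3    # value tested by `cmp r10b,3`


def decode_syscall_word(word):
    """Split the dword at [EBP] into the fields the stub looks at."""
    return {
        "syscall_id": word & 0x7FFF,        # `and eax,7FFFh`
        "bit15": bool(word & 0x8000),       # `test dword ptr [ebp],8000h`
        "special": (word >> 24) & 0xFF,     # `shr r10d,18h`
    }


def fix_query_information_process(rcx, rdx, r8, r9, scratch_addr):
    """Apply the x64 fix-ups shown for NtQueryInformationProcess."""
    if rcx & 0xFFFFFFFF == 0xFFFFFFFF:      # `cmp ecx,0FFFFFFFFh`
        rcx = 0xFFFFFFFFFFFFFFFF            # `movsx rcx,cl`
    if rdx == PROCESS_DEBUG_OBJECT_HANDLE:
        r8 = scratch_addr                   # 8-byte temporary buffer at [rbp-18h]
        r9 = 8                              # a handle is 8 bytes in x64
    return rcx, rdx, r8, r9


fields = decode_syscall_word(0x03008019)    # hypothetical encoded value
print(fields)
if fields["special"] == SPECIAL_NT_QUERY_INFORMATION_PROCESS:
    print(fix_query_information_process(0xFFFFFFFF, 0x1E, 0x1000, 4, 0x2000))
```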

00000000`00e7edf1 cb              retf
...
00dabb9e 668cd1          mov     cx,ss
00dabba1 668ed1          mov     ss,cx
00dabba4 8b16            mov     edx,dword ptr [esi]
00dabba6 3bf8            cmp     edi,eax
00dabba8 80fbf2          cmp     bl,0F2h
00dabbab 81c604000000    add     esi,4
00dabbb1 6685c8          test    ax,cx
00dabbb4 33d3            xor     edx,ebx
00dabbb6 f5              cmc
00dabbb7 8d927cc2efb1    lea     edx,[edx-4E103D84h]
00dabbbd 6681ff8e5b      cmp     di,5B8Eh
00dabbc2 f7d2            not     edx
00dabbc4 80fbf5          cmp     bl,0F5h
00dabbc7 81f2ee613a76    xor     edx,763a61ee
00dabbcd e9479f0500      jmp     00e05b19
00e05b19 4a              dec     edx
00e05b1a f8              clc
00e05b1b c1c202          rol     edx,2
00e05b1e 33da            xor     ebx,edx
00e05b20 6681fff273      cmp     di,73F2h
00e05b25 03fa            add     edi,edx
00e05b27 ffe7            jmp     edi
...
00ddaeba 8b542500        mov     edx,dword ptr [ebp]

The instruction at address 0x00ddaeba is the same for both x86 and x64 OSes, and the VM continues as usual.

Different protection modes and syscalls

I provided you with 2 real-world test executables. Oppo seems to be simpler and uses just 3 syscalls:

  • NtQueryInformationProcess with ProcessDebugObjectHandle class
  • NtSetInformationThread with ThreadHideFromDebugger class
  • NtProtectVirtualMemory to set protection attributes for each section in original executable

Asshurt doesn't have the antidebug trick with NtQueryInformationProcess, but it uses additional syscalls for other purposes:

  • NtOpenFile
  • NtCreateSection
  • NtMapViewOfSection
  • NtQueryVirtualMemory
  • NtUnmapViewOfSection
  • NtClose

Suggested workaround

Since VMProtect is using undocumented Windows features, it somehow needs to ensure that the protection will work on each and every Windows version. That's VMProtect's biggest strength and also the biggest weakness.

Windows syscall numbers change with each version and also between major builds. Use the wrong syscall number and you're guaranteed to receive unexpected results. So, the VMProtect developers had to hardcode a table with Windows build numbers and the corresponding syscall ids in the executable.

You can see the syscall numbers on j00ru's page (slightly out of date) or in tinysec's Windows kernel syscall table.

To obtain Windows build number, VMProtect uses information from PEB (Process Environment Block). The method is already described in The MASM Forum, so I'll just reproduce the (ugly) code from their page:

    print   "Read From Process Environment Block:",13,10
    ASSUME  FS:Nothing
    mov     edx,fs:[30h]        ;edx -> PEB (first field is InheritedAddressSpace)
    ASSUME  FS:ERROR
    mov     eax,[edx+0A4h]      ;eax = Major Version
    push    eax
    push    edx
    print   ustr$(eax),'.'
    pop     edx
    push    edx
    mov     eax,[edx+0A8h]      ;eax = Minor Version
    print   ustr$(eax),'.'
    pop     edx
    mov     eax,[edx+0ACh]      ;eax = build
    and     eax,0FFFFh          ;keep the low 16 bits only, because win 7 collapses
    print   ustr$(eax),13,10,13,10
    pop     eax
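
If MASM macros are not your thing, the same three reads look like this in Python. It is only a sketch: it parses a fake, hand-built PEB image rather than a real process, and it uses the 32-bit PEB offsets from the listing above. Reading the build as a 16-bit value does the same job as the `and eax,0FFFFh`; as far as I know, the upper half of that dword belongs to a different field (the service pack version), which would explain why Windows 7 needs the mask.

```python
import struct

# Offsets inside the 32-bit PEB, taken from the MASM listing.
OFF_MAJOR, OFF_MINOR, OFF_BUILD = 0xA4, 0xA8, 0xAC


def read_os_version(peb_bytes):
    """Return (major, minor, build) from a raw copy of a 32-bit PEB."""
    major, minor = struct.unpack_from("<II", peb_bytes, OFF_MAJOR)
    (build,) = struct.unpack_from("<H", peb_bytes, OFF_BUILD)
    return major, minor, build


# Fake PEB for demonstration: version 10.0, build 15063, and a non-zero
# upper half next to the build number to show why the mask matters.
fake_peb = bytearray(0x200)
struct.pack_into("<IIHH", fake_peb, OFF_MAJOR, 10, 0, 15063, 0x0100)

print(read_os_version(fake_peb))
```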

VMProtect checks only the build number and picks the corresponding syscall number. However, if the build number is not in its internal database, it will not use direct syscalls and will fall back to the standard protection. Bingo, problem solved: as long as VMProtect sees a build number it doesn't know, there are no direct syscalls to worry about, and no need for ugly hacks like Xjun's SharpOD plugin!
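
In pseudo-Python, the decision looks roughly like this. Everything in the sketch is hypothetical: I haven't dumped VMProtect's real table, so the build numbers are just examples and the syscall ids are placeholders. It only illustrates the lookup-or-fallback logic described above.

```python
# Hypothetical table: build number -> syscall id. The ids are placeholders
# for illustration, NOT values extracted from VMProtect.
SYSCALL_TABLE = {
    7601: 0x1111,
    9600: 0x2222,
    14393: 0x3333,
}


def pick_syscall_id(build_number):
    """Return the syscall id for a known build, or None for an unknown one."""
    return SYSCALL_TABLE.get(build_number & 0xFFFF)


def choose_check(build_number):
    """Describe which kind of check would run for the given build."""
    syscall_id = pick_syscall_id(build_number)
    if syscall_id is None:
        return "standard API checks"
    return "direct syscall 0x%X" % syscall_id


for build in (7601, 15063):
    print(build, "->", choose_check(build))
```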

Hint: VMProtect 3.1 doesn't support Windows 10 Creators Update (build number 15063).

Demo time

As promised, here is a download link for the test application: https://mediafire.com/?niqqbs0fqcq8n23
Note: it should support most common builds of Windows XP/7/8.1/10. Windows 2003/Vista and other rare systems are not supported!

If it shows "OK" message, you've hidden your debugger well. If it shows "Debugger detected", you have a problem. 🙂

Have fun!
kao.

EDIT: Updated download link for Oppo. Mediafire's antivirus tends to have plenty of false positives.

VMProtect and dbghelp.dll bug in export processing

kao

If your Olly is crashing when loading an executable protected by VMProtect, you most likely have an outdated dbghelp.dll somewhere on your path. Grab the latest version from Microsoft and put it in the Olly folder.

Well, that might be enough to work around the issue that I had - but I still wanted to know what's causing the crash.

Cause of the problem

If you try to debug Olly with another Olly, you'll see an access violation happening somewhere in dbghelp.dll:

Log data, item 0
 Address=6D529B91
 Message=Access violation when reading [C4983C3E]
6D529B8E   8B55 F4          MOV EDX,DWORD PTR SS:[EBP-C]
6D529B91   66:833C42 00     CMP WORD PTR DS:[EDX+EAX*2],0
6D529B96   75 07            JNZ SHORT DBGHELP.6D529B9F

Check register values in Olly:

EAX 00000000   <----------------
ECX 00000001
EDX C4983C3E   <----------------
EBX 0458A390
ESP 0018A450
EBP 0018AC94
ESI 045EE7E8
EDI 045EF738
EIP 6D529B91 DBGHELP.6D529B91

For some reason, the value in EDX is garbage, and therefore the access violation happens.

The call stack doesn't tell us much:

Call stack of main thread
Procedure / arguments                 Called from              Name from PDB
DBGHELP.6D52997D                      DBGHELP.6D52ACFD         LoadExportSymbols(struct _MODULE_ENTRY *, struct _IMGHLP_DEBUG_DATA *)
DBGHELP.6D52A755                      DBGHELP.6D52B035         load(char *, DWORD)
DBGHELP.6D52ADB8                      DBGHELP.6D5264B2         InternalLoadModule(char *, char *FullPath, char *Str1, unsigned __int64, unsigned __int32, void *, struct _DBGHELP_MODLOAD_DATA *, unsigned __int32)
DBGHELP.SymLoadModuleEx               DBGHELP.6D526502              
DBGHELP.SymLoadModule64               DBGHELP.6D526522            
DBGHELP.SymLoadModule                 OLLYDBG.00491502

And the same piece of code in IDA doesn't help much either:

.text:6D529B8E loop_check_something:                   ; CODE XREF: LoadExportSymbols(_MODULE_ENTRY *,_IMGHLP_DEBUG_DATA *)+225j
.text:6D529B8E                 mov     edx, [ebp+ptrAllocatedMemory]
.text:6D529B91                 cmp     word ptr [edx+eax*2], 0 ; <----------------
.text:6D529B96                 jnz     short loc_6D529B9F
.text:6D529B98                 add     [ebp+var_10], 10h
.text:6D529B9C                 inc     [ebp+arg_4]
.text:6D529B9F
.text:6D529B9F loc_6D529B9F:                           ; CODE XREF: LoadExportSymbols(_MODULE_ENTRY *,_IMGHLP_DEBUG_DATA *)+219j
.text:6D529B9F                 inc     eax
.text:6D529BA0                 cmp     eax, ecx
.text:6D529BA2                 jb      short loop_check_something

So, it's debugging time! Set a breakpoint at the start of LoadExportSymbols, then set a hardware breakpoint on write to the address of [ebp+ptrAllocatedMemory].

The first hit is the initialization of the variable with 0:

.text:6D529986                 xor     ecx, ecx
.text:6D529988                 test    byte ptr dword_6D57F438+1, 4
.text:6D52998F                 mov     [ebp+ptrAllocatedMemory], ecx

The second hit stores the address of the allocated memory:

.text:6D529AA7                 call    _pMemAlloc@4    ; pMemAlloc(x)
.text:6D529AAC                 xor     ecx, ecx
.text:6D529AAE                 cmp     eax, ecx
.text:6D529AB0                 mov     [ebp+ptrAllocatedMemory], eax
.text:6D529AB3                 jz      loc_6D529D56

And third time is a charm:

.text:6D529AF5                 lea     edx, [ebp+exportFunctionName] ; 0018A45C
.text:6D529AFB                 sub     edx, eax        ; EDX = FBAD009A
.text:6D529AFD
.text:6D529AFD loop_strcpy_overflows:                  ; CODE XREF: LoadExportSymbols(_MODULE_ENTRY *,_IMGHLP_DEBUG_DATA *)+188j
.text:6D529AFD                 mov     cl, [eax]
.text:6D529AFF                 mov     [edx+eax], cl   ; <----------------
.text:6D529B02                 inc     eax
.text:6D529B03                 test    cl, cl
.text:6D529B05                 jnz     short loop_strcpy_overflows

The good folks at Microsoft have left us with a nice buffer overflow. exportFunctionName is defined as a byte array of 2048 bytes, and the copy loop never checks the length. Any exported function name longer than that overflows the stack buffer, overwrites the local variables located after it (including ptrAllocatedMemory, hence the garbage in EDX) and (possibly) causes a subsequent crash.

010 Editor with PETemplate confirms that the export name is indeed very long (3100 chars).

From what I can tell, it's a similar (but not the same) bug to what was described by j00ru at http://j00ru.vexillium.org/?p=405 (see "PE Image Fuzzing (environment + process)").
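
The numbers from the listings are enough to work out how long a name has to be before it hits ptrAllocatedMemory. The Python sketch below does that arithmetic and then imitates the unchecked copy on a fake stack frame. It assumes that EBP holds the same value during the copy as in the register dump taken at the crash (both are inside LoadExportSymbols, so I think that's fair); the helper name and the test strings are made up.

```python
EBP = 0x0018AC94            # from the register dump
DEST = 0x0018A45C           # address of exportFunctionName (IDA comment)
EDX_DELTA = 0xFBAD009A      # EDX after `sub edx, eax`
PTR_SLOT = EBP - 0x0C       # [EBP-C], i.e. ptrAllocatedMemory
BUFFER_SIZE = 2048

# `sub edx, eax` makes [edx+eax] address the destination, so EAX (the
# address the name is read from) can be recovered from the two values.
source = (DEST - EDX_DELTA) & 0xFFFFFFFF
bytes_to_reach_ptr = PTR_SLOT - DEST


def ptr_after_copy(name, frame_size=4096):
    """Copy name + NUL into a fake frame without any length check and
    return the dword that ends up in the ptrAllocatedMemory slot."""
    frame = bytearray(frame_size)
    data = name + b"\0"
    frame[:len(data)] = data
    slot = frame[bytes_to_reach_ptr:bytes_to_reach_ptr + 4]
    return int.from_bytes(slot, "little")


print(hex(source), bytes_to_reach_ptr, bytes_to_reach_ptr - BUFFER_SIZE)
print(hex(ptr_after_copy(b"A" * 2000)))   # short name: pointer slot untouched
print(hex(ptr_after_copy(b"A" * 3100)))   # long name: pointer slot overwritten
```

So the pointer sits a few dozen bytes past the end of the 2048-byte buffer, and a 3100-character name runs well beyond it.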

Stay safe!

P.S. Here's an example file, if you want to test your Olly: https://forum.tuts4you.com/topic/38963-vmprotect-professional-v-309-custom-protection/
P.P.S. CFF Explorer, HIEW and IDA do not show us any exports in this example file, but that's another story.