How HyperLock 386 Works and How to Crack It? - (C)
by Dirk Gently
Articles like this can't be read in the usual papers, that's for sure :) I was
ill at ease, as I couldn't decide whether I should allow a paper magazine
to publish my article: I was afraid of the BSA and its comrades,
of course. At last, in 1994, I published it as part of a big article
of mine on Turbo Debugger and other debugging tools. Fortunately, nobody wanted to arrest me :)
By reading this article, we'll understand how these very effective anti-cracker protections
work. I'll show you how to foil hackers armed with Turbo Debugger
(and other debuggers as well) by writing such top-class protections.
You'll need the following programs to follow and try out the next
part. Turbo Debugger is required, in the first place.
TDUMP, an additional utility shipped with TD, should be used to check the
initial CS:IP address of the program we are going to debug. And, finally, one
needs a program that contains such a protection. I've chosen
Mega-EM, the well-known MT-32 and GMidi emulator for the GUS. The ZIP accompanying this article
contains both Mega-EM 2.00023b (1603.exe) and Mega-EM 2.02 (1622.exe). Before
starting their cracker programs, 1603dec.exe and 1622dec.exe respectively,
rename the Mega-EM executable to megaem.exe. Be warned: do NOT confuse the two versions,
as they have different initial CS:IPs, and the right one must be supplied when assembling
the source of the cracker. All
other 1.X-2.X versions are protected with
HyperLOCK 386 as well, but the newest versions (the 3.X series) have no protection.
The protected MegaEM versions are all some 90k in size (some
old MegaEMs shipped with commercial games aren't protected; they are
some 26 kBytes long).
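If TDUMP is not at hand, the initial CS:IP can also be read straight from the EXE header. The following is only a minimal Python sketch, assuming the standard MZ header layout (header size in paragraphs at offset 08h, IP at 14h, CS at 16h). The function name and the synthetic header in the usage lines are made up for illustration and are not taken from Mega-EM.

```python
import struct

def read_mz_entry(header: bytes):
    """Return (cs, ip, header_paragraphs) from the start of an MZ EXE file."""
    if header[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    (header_paragraphs,) = struct.unpack_from("<H", header, 0x08)
    ip, cs = struct.unpack_from("<HH", header, 0x14)
    return cs, ip, header_paragraphs

# Synthetic header, for illustration only: CS=1623h, IP=0123h, 2 paragraphs.
demo = bytearray(0x20)
demo[0:2] = b"MZ"
struct.pack_into("<H", demo, 0x08, 2)
struct.pack_into("<HH", demo, 0x14, 0x0123, 0x1623)

cs, ip, hdr = read_mz_entry(bytes(demo))
print(f"initial CS:IP = {cs:04X}:{ip:04X}, header = {hdr * 16:#x} bytes")
```

On a real file one would pass the first 20h bytes read from disk instead of the synthetic `demo` buffer.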
Of course, I have to tell you why I've chosen MegaEM of all programs. Well, it was in
1993 that I bought my first GUS. I was very happy, of course, when I heard
of MegaEM, which was a brand new product in those days. Well, I wanted to
know how it works: I was the most important cracker around here at the time. I was only too pleased to hear that MegaEM's author refused to release
the sources of MegaEM. And, in addition, I didn't want to pay US$50 to register
MegaEM. So I didn't hesitate for long and started cracking it. And, to tell
the truth, it took me quite a long time - about a day! - to crack it :)
Well, let's talk about the protection itself. It uses so-called
layer encryption. What does that mean? We'll see that the protection
wrapped around the original MegaEM consists of layers, which cause a lot of
problems when one tries to debug them: only the first layer contains
executable code, and each inner layer is decoded only when the layer outside
it runs. You can guess, of course, that the lack of code segment protection
helps a lot when writing such protections: one doesn't have to reserve memory
to decode encrypted code areas, the code can simply be decoded in place.
Let's load MEGAEM.EXE under TD. The PC (Instruction Pointer /
Program Counter) will be set to 0123. There is a JMP 01CA instruction
at this position, which looks ordinary. The same goes for the area from
01CA on: the code looks logical and executable. Oops... What's that from
01EA on? The cmp ax,3092 instruction looks OK, but things like
enter 4321,5D can't be considered real instructions of the kind used in usual, MS-LOSS-based programs.
And we could go on listing how strange the 'instructions' in this code are. So, one must
assume that the code between 01CA and 01E9 decodes this second layer, and
that when the first layer (the code between those addresses) ends, execution
continues at 01EA, which contains REAL instructions by that time.
I used bold to mark the code that changes between links of the
protection. Just compare, e.g., this area of the protection of MegaEM 2.02 to that
of some other version and you'll see what I mean. This means, of course,
that a program meant to crack ANY link of a particular version of
HyperLOCK 386 must check which instructions the given link
of HLOCK 386 contains, import them all and CHANGE its own instructions to match. This
requires, of course, that the cracker routine we're going to write be
written, at least partly, in assembly.
cs:01CA cli
cs:01CB mov di,0123
cs:01CE mov al,11
cs:01D0 mov cx,0648
cs:01D3 mov ah,cs:[di]
cs:01D6 cmp di,01EA ; 01ea= the starting address of the 2nd layer
cs:01DA jb 01DF
cs:01DC xor cs:[di],al
cs:01DF inc di
cs:01E0 sub al,ah
cs:01E2 shr al,1
cs:01E4 xor al,ah
cs:01E6 rol al,1
cs:01E8 loop 01D3
cs:01ea cmp ax,3092 [...] ; this is the first, non-decoded command of
; the 2nd layer
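The loop above is short enough to be replayed outside the debugger. The sketch below mirrors it in Python on a synthetic buffer. The `encode=True` branch is a hypothetical inverse added only so that the round trip can be checked, and the buffer contents are made up rather than taken from Mega-EM.

```python
def next_key(al, ah):
    """Key update performed by the four instructions at 01E0-01E7."""
    al = (al - ah) & 0xFF                    # sub al,ah
    al >>= 1                                 # shr al,1
    al ^= ah                                 # xor al,ah
    return ((al << 1) | (al >> 7)) & 0xFF    # rol al,1

def layer1(mem, encode=False, start=0x0123, count=0x0648,
           first_coded=0x01EA, key=0x11):
    """Run the 01D3-01E8 loop over `mem` (a bytearray indexed by CS offset).

    encode=False mirrors the listing; encode=True is a hypothetical inverse.
    """
    al = key
    for di in range(start, start + count):
        if encode and di >= first_coded:
            mem[di] ^= al                    # store the coded byte first
        ah = mem[di]                         # mov ah,cs:[di] (the byte as stored)
        if not encode and di >= first_coded:
            mem[di] ^= al                    # xor cs:[di],al
        al = next_key(al, ah)
    return mem

# Synthetic 2 KB "code segment", for illustration only.
original = bytearray((i * 7 + 3) & 0xFF for i in range(0x0800))
image = layer1(bytearray(original), encode=True)
decoded = layer1(bytearray(image))
print("round trip ok:", decoded == original)
```

Note that the key is always updated with the byte as it is stored in memory, which is what makes every byte from 0123 on, including the decoder loop itself, part of the key.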
Try to execute this 1st layer. Try both Run to... (F4) and step execution
(F7). Of course, the latter requires a great deal of patience: because of
mov cx,0648, the loop runs 648h times, executing
10 instructions per pass once DI has reached 01EA and 9 per pass before that.
Anyway, it's very instructive to notice that pressing
F8 on the LOOP at 01E8 causes the debugger to freeze.
No matter which way you execute this first layer, the
code of the second layer won't work. Why?
One might assume that we didn't do anything illogical. The code of the
first layer seems OK: it looks like a simple routine that computes an
XOR sum... but let's stop for a moment and check more carefully WHAT
code this routine computes its checksum FROM! The point is that the running
sum is computed not only from locations outside the loop (01D3-01E9)
but from the bytes of the loop itself as well. And think of it:
if you set a breakpoint in this loop, Turbo Debugger
puts a CCh opcode (INT 3) there. When the loop READS (at cs:01D3)
the executable code byte by byte, it reads this CC and NOT the
original byte. This is what the protection of this first layer is based on.
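The effect of a single planted CCh can be shown on synthetic data. In the sketch below the "code" area is all zeros except for one marker byte at 01D3, and the coded area is an arbitrary made-up pattern; only the loop logic follows the listing.

```python
def next_key(al, ah):
    al = ((al - ah) & 0xFF) >> 1             # sub al,ah / shr al,1
    al ^= ah                                 # xor al,ah
    return ((al << 1) | (al >> 7)) & 0xFF    # rol al,1

def layer1_decode(mem, start=0x0123, count=0x0648, first_coded=0x01EA, key=0x11):
    out = bytearray(mem)
    al = key
    for di in range(start, start + count):
        ah = out[di]                         # the byte as it sits in memory
        if di >= first_coded:
            out[di] ^= al
        al = next_key(al, ah)
    return out

# Synthetic image: zeros, one marker byte inside the loop, arbitrary coded data.
image = bytearray(0x0800)
image[0x01D3] = 0x2E
for i in range(0x01EA, 0x076B):
    image[i] = (i * 37 + 11) & 0xFF

clean = layer1_decode(image)

patched = bytearray(image)
patched[0x01D3] = 0xCC                       # what a breakpoint leaves in memory
broken = layer1_decode(patched)

differing = sum(a != b for a, b in zip(clean[0x01EA:0x076B], broken[0x01EA:0x076B]))
print(differing, "of", 0x076B - 0x01EA, "layer-2 bytes decode differently")
```

One wrong byte read before 01EA is enough: the key never recovers, so the whole second layer comes out as garbage.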
Of course, swapping this CC opcode with the original opcode is transparent
when debugging routines that do NOT read their own code area to find out
whether they're being debugged; but ANY routine can use this
trick to prevent correct decoding of the inner layers if it computes an
initial checksum over its own code, preferably at least 16 bits long. The routine above uses
only an 8-bit one, which is its weak point: even if you can't find
out how to execute the routine without getting the XOR sum wrong,
you can RUN 256 TESTS with XOR values from 0 to FF to see
which initial XOR value decodes the 2nd layer. You'll find it very fast.
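The 256-test idea can be sketched as follows. The sketch assumes that we know what the first few decoded bytes should look like (here the opcodes 1E 0E 1F 07, i.e. push ds / push cs / pop ds / pop es); the "secret" key and the rest of the plaintext are made up for the demonstration.

```python
def next_key(al, ah):
    al = ((al - ah) & 0xFF) >> 1             # sub al,ah / shr al,1
    al ^= ah                                 # xor al,ah
    return ((al << 1) | (al >> 7)) & 0xFF    # rol al,1

def decode_from(cipher, key):
    """Decode a coded area, given the key value AL holds when DI reaches it."""
    out, al = bytearray(), key
    for c in cipher:
        out.append(c ^ al)
        al = next_key(al, c)                 # key follows the CODED byte
    return bytes(out)

def encode_from(plain, key):
    """Hypothetical inverse, used here only to build test data."""
    out, al = bytearray(), key
    for p in plain:
        c = p ^ al
        out.append(c)
        al = next_key(al, c)
    return bytes(out)

expected_start = bytes([0x1E, 0x0E, 0x1F, 0x07])
secret = 0xBE                                # made-up key, unknown to the attacker
cipher = encode_from(expected_start + b"rest of the second layer", secret)

hits = [k for k in range(256) if decode_from(cipher, k).startswith(expected_start)]
print([f"{k:02X}" for k in hits])
```

With a 16-bit or longer initial sum the same search would still be feasible, but it would no longer be a matter of 256 tries.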
So far, we've learnt how a protection can recognise the presence of a debugger.
You can see that unless you work out the RIGHT way of debugging the
above routine, you'll NEVER find out how to restore the second layer.
Simply pressing F9, of course, doesn't put CCs anywhere, but the machine
executes everything so fast that you won't have time to press
Ctrl-Break to see what happened to layer 2 etc. Besides, the inner
layers clean up after themselves: in HLOCK 386, it's the third layer that erases the entire
memory area of layers 1 and 2 before handing execution over
to the real program being protected (Mega-EM as it was before
HLOCK 386 was linked to it). If you still don't see how such linking works, I recommend
that you check the IntroMaker Toolkits on my homepage. It's very instructive to
study all of them because, as I've noticed, most PC coders/crackers
don't really know how to handle the EXE header, how to 'infect' EXE files
etc.
Still, how should we debug the above routine? One has to notice at once
that the routine reads only one byte at a time, while the LOOP that reads
and processes this byte is 23 bytes long (01D3-01E9). So, let's move the cursor
to 01E8 and keep pressing F4 while the value of DI rises from its initial
0123 to 01D4. (The maximum we may allow is 01E8: if we let DI
reach this value, the next pass of the loop would read the opcode at the very
point we're standing at, with the false CC in it, of course.)
Having reached this DI value, let's move the cursor to e.g. 01D3, which is
the first instruction of the loop, and go on pressing F4. Before long, DI will
exceed 01E9, and we can see that the decoding of layer 2 works
without problems. NOW we can move the cursor to 01EA and press
F4 there to execute the remaining passes in one step.
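The cursor-moving trick can be checked with a small model. In the sketch below a "breakpoint schedule" says where the debugger's CCh sits while the loop reads the byte at DI; the image is synthetic (zeros, the LOOP opcode E2 at 01E8, an arbitrary coded area), and only breakpoints below 01EA are modelled.

```python
def next_key(al, ah):
    al = ((al - ah) & 0xFF) >> 1             # sub al,ah / shr al,1
    al ^= ah                                 # xor al,ah
    return ((al << 1) | (al >> 7)) & 0xFF    # rol al,1

def run_layer1(mem, breakpoint_for=lambda di: None,
               start=0x0123, count=0x0648, first_coded=0x01EA, key=0x11):
    out = bytearray(mem)
    al = key
    for di in range(start, start + count):
        # While this pass runs, a CCh sits at the debugger's cursor address.
        ah = 0xCC if breakpoint_for(di) == di else out[di]
        if di >= first_coded:
            out[di] ^= al
        al = next_key(al, ah)
    return out

image = bytearray(0x0800)
image[0x01E8] = 0xE2                         # opcode of LOOP
for i in range(0x01EA, 0x076B):
    image[i] = (i * 37 + 11) & 0xFF

clean = run_layer1(image)
naive = run_layer1(image, lambda di: 0x01E8)                             # cursor never moved
careful = run_layer1(image, lambda di: 0x01E8 if di < 0x01D4 else 0x01D3)

print("naive matches clean:  ", naive == clean)
print("careful matches clean:", careful == clean)
```

The careful schedule works because the CCh is never at the address the loop is about to read: 01E8 is only read after the cursor has been moved to 01D3, and 01D3 has already been read by then.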
This was the first layer, and it was hard enough, wasn't it? And,
unfortunately, I have to announce that the second layer is much more
complicated.
Layer 2 can't be run under TD at all, because it not only
detects the opcodes of INT 3s but also rewrites fundamental
interrupt vectors, which DOES freeze Turbo Debugger. This part is even more useful
for those who want to write USABLE protections.
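To follow which vectors the second layer touches, it helps to remember that in real mode the vector of INT n lives at 0000:4n (offset word) and 0000:4n+2 (segment word). A small Python helper, with made-up function names, translates between the two views:

```python
def vector_slot(int_no):
    """Addresses (within segment 0000) of the offset and segment words of INT int_no."""
    base = int_no * 4
    return base, base + 2

def int_from_address(addr):
    """Which interrupt, and which half of its vector, a table address belongs to."""
    return addr // 4, "offset" if addr % 4 < 2 else "segment"

for addr in (0x0000, 0x0002, 0x0004, 0x0008, 0x000C, 0x000E, 0x0018, 0x001A):
    number, part = int_from_address(addr)
    print(f"0000:{addr:04X} -> INT {number:X} {part}")
```

So a write to es:[000C] with ES=0 changes the offset of INT 3, and es:[0018] is the offset of INT 6. The same arithmetic gives 03C0h for INT F0h and 03CCh for INT F3h.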
The following code is the already de-XOR-ed layer 2.
cs:01EA push ds
cs:01EB push cs
cs:01EC pop ds
cs:01ED pop es
cs:01EE mov [0110],es ;we save ES in order to restore it (because in the
; next step we clear ES)
cs:01F2 mov es,[010A] ;it's very instructive to check the way the
; protection zeroes the register! No immediate
; values, nothing. Loading 0 into a register
; directly would make most crackers think
; that the program is going to rewrite the
; interrupt table, which requires a segment register
; to hold zero (of course, no one would
; be so naive as to think that the author of a
; protection would use the 'usual' and 'official'
; ways of updating the interrupt vector table :-)
cs:01F6 mov ax,[0110]
cs:01F9 add ax,0010
cs:01FC add [010A],ax
cs:0200 add [010C],ax
cs:0204 add [010E],ax ;we add the actual load segment (PSP+10h; it depends on
; where DOS has loaded the EXE) to the words at
; 010A, 010C and 010E.
cs:0208 push es:word ptr [0000]
cs:020D push es:word ptr [0002] ; if we haven't noticed yet that we were
; going to read/write intr. table, well,
; this is the point where one SHOULD
; notice it :) Saving Int0's original
; offset and segment.
cs:0212 push cs
cs:0213 pop es:word ptr [0002] ; and we point INT 0 to the ACTUAL CS.
; of course, we don't load CS immediately
; to avoid crackers' noticing the fact
; that we're going to address an OWN
; routine.
cs:0218 mov ax,051F
cs:021B mov es:[0000],ax ; yes, Int0's new address is CS:051f
cs:021F push es:word ptr [0004]
cs:0224 push es:word ptr [0006]
cs:0229 push cs
cs:022A pop es:word ptr [0006]
cs:022F mov ax,051F
cs:0232 mov es:[0004],ax ; and we do the same to INT 1. It'll point
; to CS:051f
cs:0236 push es:word ptr [0008]
cs:023B push es:word ptr [000A]
cs:0240 push es:word ptr [000C]
cs:0245 push es:word ptr [000E] ;saving the original values of int 2 and
; int 3
cs:024A push cs
cs:024B pop es:word ptr [000E]
cs:0250 mov ax,02C1
cs:0253 mov es:[000C],ax ;int 3 = cs:02c1
cs:0257 push es:word ptr [0018] ;int 6 = cs:0274
cs:025C push es:word ptr [001A]
cs:0261 push cs
cs:0262 pop es:word ptr [001A]
cs:0267 mov ax,0274
cs:026A mov es:[0018],ax
cs:026E jmp 0279 ; we skip some data and INT 6's entry point
cs:0270 stc
cs:0271 lock dec sp ; don't be afraid of such commands,
; this is data area
cs:0273 pop es
cs:0274 xor ax,ax ;int 6 entry point
cs:0276 jmp 049E
cs:0279 xor ax,ax
cs:027B mov es,ax
cs:027D mov ax,[0436]
cs:0280 mov es:[0008],ax
cs:0284 mov ax,[0471]
cs:0287 mov es:[0004],ax
cs:028B mov bh,00
cs:028D mov si,0123
cs:0290 cld
ciklus: cs:0291 lodsb ;yes, we're going to compute a new
; CRC-sum, starting from CS:0123 again.
; Direction flag=0, so we'll address
; ascending addresses.
cs:0292 xor bh,al
cs:0294 push si
cs:0295 mov ax,0911
cs:0298 mov si,4647
cs:029B mov di,0000
cs:029E mov dx,0497
cs:02A1 int 03 ;we call INT 3, which is, of course,
; our OWN interrupt routine. The data we
; pass to it is in the registers.
;INT 3 modifies the XOR sum we're computing,
; and, in addition, it also modifies both
; INT 1 and INT 2: it uses their vectors as
; temporary registers. Turbo Debugger freezes if
; anything modifies INT 1/INT 2, so it's
; impossible to debug this routine. And
; the usual debugging trick (using int F1
; instead of 01) would require a lot of
; effort: one would have to check the actual
; code position being read and
; correct the byte just read wherever
; e.g. an int F3 instruction stands in for int 03
; etc. Of course, I prefer relocating
; these 'dangerous' interrupts to such
; a high, unused area, and this HLOCK 386
; was one of the very few protections that
; made it almost impossible to use simple
; interrupt relocation.
cs:02A2 add bh,dh
cs:02A4 push sp
cs:02A5 pop ax
cs:02A6 cmp ax,sp
cs:02A8 jne 02BB
cs:02AA pushf
cs:02AB pop ax
cs:02AC mov si,ax
cs:02AE xor ax,7000
cs:02B1 push ax
cs:02B2 popf
cs:02B3 pushf
cs:02B4 pop ax
cs:02B5 xor ax,si
cs:02B7 je 02BB
cs:02B9 jmp 0326 ;it jumps at any rate
cs:02BB mov ax,0002
cs:02BE jmp 049E ;the code from 02a4 doesn't affect TD at all.
; (so we'll never jump to 049e from 02be)
int3: cs:02C1 push es ;here begins int 3
cs:02C2 push ax
cs:02C3 push bx
cs:02C4 push cx
cs:02C5 cmp ax,0911
cs:02C8 jne 0320
cs:02CA xor ax,ax
cs:02CC mov es,ax
cs:02CE mov ax,es:[0008]
cs:02D2 mov bx,es:[0004]
cs:02D7 mov cx,ax
cs:02D9 mul word ptr [03C5]
cs:02DD shl cl,1
cs:02DF shl cl,1
cs:02E1 shl cl,1
cs:02E3 add ch,cl
cs:02E5 add dx,cx
cs:02E7 add dx,bx
cs:02E9 shl bx,1
cs:02EB shl bx,1
cs:02ED add si,di
cs:02EF add dx,bx
cs:02F1 add dh,bl
cs:02F3 mov cl,05
cs:02F5 shr di,03
cs:02F8 shl bx,cl
cs:02FA add dh,bl
cs:02FC shl si,04
cs:02FF add ax,0001
cs:0302 adc dx,0000
cs:0305 mov es:[0008],ax ; TD freezes here (rewriting int1/2)
cs:0309 mov es:[0004],dx
cs:030E pop cx
cs:030F pop bx
cs:0310 pop ax
cs:0311 push bx
cs:0312 add bh,dh
cs:0314 mov bl,bh
cs:0316 xor es:[0008],bx
cs:031B pop bx
cs:031C pop es
cs:031D ret 0004 ;=iret, end of int3
cs:0320 mov ax,0003 ;'unstable system'.
cs:0323 jmp 049E ;error, error code in ax, exiting (after deleting
; layer 2, of course)
cs:0326 pop si ;we jump here from 02b9
cs:0327 xor eax,eax
cs:032A push ax
cs:032B push bx
cs:032C push dx
cs:032D mov ax,FFFF
cs:0330 mov dx,FFFF
cs:0333 mov bx,0001
cs:0336 div bx ;we trigger INT 0 with a divide overflow: FFFF:FFFF / 1
;does not fit in AX. Of course, TD doesn't
;execute INT 0s, either. dh= output.
cs:0338 xor bh,dh ;bh is the xor value
cs:033A cmp si,0551 ;have we reached the starting address of layer3?
; if not, we read the next code byte from
;layer 1/2; if yes, jmp to 052e to decode layer 3
cs:033E jne 0291
cs:0342 jmp 052E
; error handling routine: exiting, after deleting layer 1 and 2
cs:049E push cs
cs:049F push cs
cs:04A0 pop ds
cs:04A1 pop es
cs:04A2 mov dx,0345
cs:04A5 cmp ax,0001
cs:04A8 jne 04AD
cs:04AA mov dx,03C7
cs:04AD cmp ax,0002
cs:04B0 jne 04B5
cs:04B2 mov dx,0438
cs:04B5 cmp ax,0003
cs:04B8 jne 04BD
cs:04BA mov dx,0473
cs:04BD mov di,051C
cs:04C0 mov si,sp
cs:04C2 sub si,0040
cs:04C5 mov dword ptr [di],00000000
cs:04CC add di,0004
cs:04CF cmp di,si
cs:04D1 jb 04C5
cs:04D3 std
cs:04D4 mov di,0345
cs:04D7 dec di
cs:04D8 mov cx,di
cs:04DA dec cx
cs:04DB rep stosb
cs:04DD mov ah,09
cs:04DF int 21
cs:04E1 xor ax,ax
cs:04E3 mov es,ax
cs:04E5 pop es:word ptr [001A] ; restoring int. vectors
cs:04EA pop es:word ptr [0018]
cs:04EF pop es:word ptr [000E]
cs:04F4 pop es:word ptr [000C]
cs:04F9 pop es:word ptr [000A]
cs:04FE pop es:word ptr [0008]
cs:0503 pop es:word ptr [0006]
cs:0508 pop es:word ptr [0004]
cs:050D pop es:word ptr [0002]
cs:0512 pop es:word ptr [0000]
cs:0517 mov ax,4C02
cs:051A int 21 ;exit to DOS
cs:051C jmp 0338
cs:051F add sp,0004 ;single step (int 1) is addressing here
cs:0522 popf
cs:0523 pop dx
cs:0524 pop bx
cs:0525 pop ax
cs:0526 cmp si,0551
cs:052A jne 0291 ;have we read all the code bytes from l1/2?
cs:052E mov di,sp ;we jump here from 0342, too
cs:0530 sub di,0020
cs:0533 mov bl,[si]
cs:0535 xor [si],bh
cs:0537 mov bh,bl
cs:0539 push si
cs:053A push di
cs:053B mov ax,0911
cs:053E mov si,4647
cs:0541 mov di,0000
cs:0544 mov dx,0497
cs:0547 int 03
cs:0548 pop di
cs:0549 pop si
cs:054A add bh,dh
cs:054C inc si
cs:054D cmp si,di
cs:054F jb 0533 ; back to the loop at 0533: we keep decoding layer 3 until stack pointer-20h.
; SP is given in the EXE header. Of course, its actual
; value is somewhat smaller than given, but it's far
; away from the end of our executable code.
;3rd layer- this decodes the ORIGINAL program.
; Of course, decoding the original code is quite complicated,
; and is varied in each link.
cs:0551 xor di,di
cs:0553 mov es,di
cs:0555 mov ax,[0752]
cs:0558 mov es:[0008],ax
cs:055C mov ax,[0754]
cs:055F mov es:[0004],ax
cs:0563 mov es,[010A]
cs:0567 cld
cs:0568 mov ah,[0751]
cs:056C cmp dword ptr [074D],00008000 ;if the code we have to decode is
; bigger than 32k, we cut it into
; 32k-blocks. The last block is
; decoded by the routine from 066f.
cs:0575 jbe 066F
cs:0579 mov cx,8000
cs:057C mov al,es:[di]
cs:057F xor es:[di],ah
cs:0582 mov ah,al
cs:0584 dec byte ptr [075A]
cs:0588 je 05A5
cs:058A add ah,0D
cs:058D inc di
cs:058E loop 057C
cs:0590 sub dword ptr [074D],00008000
cs:0599 mov di,es
cs:059B add di,0800
cs:059F mov es,di
cs:05A1 xor di,di
cs:05A3 jmp 056C
cs:05A5 push ax
cs:05A6 push di
cs:05A7 mov bx,ax
cs:05A9 mov ax,0911
cs:05AC mov si,4647
cs:05AF mov di,0000
cs:05B2 mov dx,05C1
cs:05B5 int 03
cs:05B6 pop di
cs:05B7 pop ax
cs:05B8 add ah,dh
cs:05BA mov byte ptr [075A],20
cs:05BF jmp 058D
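cs:05C1 dec ax ; 05C1-05C6 is a data area (passed in DX at 05B2), not code;
cs:05C2 inc dx ; TD merely disassembles the bytes 48 42 4F 4F 54 0D
cs:05C3 dec di
cs:05C4 dec di
cs:05C5 push sp
cs:05C6 or ax,0E0E ; the bytes 0E 0E are really push cs / push cs at 05C7:
; another error-handling routine, for layer 3,
; starts there (06A9 jumps to 05C7; compare 049E)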
cs:05C9 pop ds
cs:05CA pop es
cs:05CB mov dx,0345
cs:05CE cmp ax,0001
cs:05D1 jne 05D6
cs:05D3 mov dx,03C7
cs:05D6 cmp ax,0002
cs:05D9 jne 05DE
cs:05DB mov dx,0438
cs:05DE cmp ax,0003
cs:05E1 jne 05E6
cs:05E3 mov dx,0473
cs:05E6 mov di,0653
cs:05E9 mov si,sp
cs:05EB sub si,0040
cs:05EE mov dword ptr [di],00000000
cs:05F5 add di,0004
cs:05F8 cmp di,si
cs:05FA jb 05EE
cs:05FC std
cs:05FD mov di,0608
cs:0600 mov cx,di
cs:0602 sub cx,051F
cs:0606 xor ax,ax
cs:0608 dec cx
cs:0609 rep stosb
cs:060B mov di,0345
cs:060E dec di
cs:060F mov cx,di
cs:0611 dec cx
cs:0612 rep stosb
cs:0614 mov ah,09
cs:0616 int 21
cs:0618 xor ax,ax
cs:061A mov es,ax
cs:061C pop es:word ptr [001A]
cs:0621 pop es:word ptr [0018]
cs:0626 pop es:word ptr [000E]
cs:062B pop es:word ptr [000C]
cs:0630 pop es:word ptr [000A]
cs:0635 pop es:word ptr [0008]
cs:063A pop es:word ptr [0006]
cs:063F pop es:word ptr [0004]
cs:0644 pop es:word ptr [0002]
cs:0649 pop es:word ptr [0000]
cs:064E mov ax,4C02
cs:0651 int 21
cs:0653 push ax
cs:0654 push di
cs:0655 mov bx,ax
cs:0657 mov ax,0911
cs:065A mov si,4647
cs:065D mov di,0000
cs:0660 mov dx,05C1
cs:0663 int 03
cs:0664 pop di
cs:0665 pop ax
cs:0666 add ah,dh
cs:0668 mov byte ptr [075A],20
cs:066D jmp 068C
cs:066F cmp dword ptr [074D],0000 ;decoding smaller blocks than 32k
cs:0675 je 068F
cs:0677 mov cx,[074D]
cs:067B mov al,es:[di]
cs:067E xor es:[di],ah
cs:0681 mov ah,al
cs:0683 dec byte ptr [075A]
cs:0687 je 0653
cs:0689 add ah,0B
cs:068C inc di
cs:068D loop 067B
cs:068F xor ax,ax
cs:0691 mov es,ax
cs:0693 mov ax,es:[0008]
cs:0697 cmp ax,[0756]
cs:069B jne 06A7
cs:069D mov ax,es:[0004]
cs:06A1 cmp ax,[0758]
cs:06A5 je 06AC
cs:06A7 xor ax,ax
cs:06A9 jmp 05C7
cs:06AC cmp word ptr [074B],0000
cs:06B1 je 06D7
cs:06B3 mov si,075B
cs:06B6 mov cx,[074B]
cs:06BA mov dx,[010A]
cs:06BE cld
cs:06BF mov di,[si]
cs:06C1 mov ax,[si+02]
cs:06C4 add ax,dx
cs:06C6 mov es,ax
cs:06C8 mov dword ptr [si],00000000
cs:06CF add si,0004
cs:06D2 add es:[di],dx
cs:06D5 loop 06BF
cs:06D7 xor al,al ; erasing the entire code area of layer 1, 2 and 3
; between 0123 and 06e4
cs:06D9 mov di,0123
cs:06DC mov cx,06E4
cs:06DF sub cx,di
cs:06E1 push ds
cs:06E2 pop es
cs:06E3 cld ;increasing addresses
cs:06E4 rep stosb
cs:06E6 xor ax,ax
cs:06E8 mov es,ax
cs:06EA pop es:word ptr [001A] ;restoring intr. vectors
cs:06EF pop es:word ptr [0018]
cs:06F4 pop es:word ptr [000E]
cs:06F9 pop es:word ptr [000C]
cs:06FE pop es:word ptr [000A]
cs:0703 pop es:word ptr [0008]
cs:0708 pop es:word ptr [0006]
cs:070D pop es:word ptr [0004]
cs:0712 pop es:word ptr [0002]
cs:0717 pop es:word ptr [0000]
cs:071C mov ss,[010E] ;restoring SS
cs:0720 mov sp,[0749] ;restoring SP
cs:0724 xor bx,bx
cs:0726 pushf
cs:0727 xor cx,cx
cs:0729 mov bp,sp
cs:072B or word ptr [bp],0200
cs:0730 xor bp,bp
cs:0732 push word ptr [010C] ;we save the CS of the original program
;(NOT the protection!)
cs:0736 mov di,ax
cs:0738 push word ptr [0747] ;and saving the IP of 'infected', protected
; program
cs:073C mov si,bx
cs:073E mov es,[0110]
cs:0742 push es
cs:0743 pop ds
cs:0744 mov dx,cx
cs:0746 iret ;a FAR (32-bit) RETURN: starting the protected
; program
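The block decoder of the third layer (057C-05BF) can be modelled in a few lines. This is only a sketch: `perturb` is a made-up stand-in for the DH value the protection's INT 3 handler returns, not a reimplementation of that handler, and the encoder direction exists only to check the round trip. The step defaults to 0Dh as in the 32k-block loop (the tail loop at 067B uses 0Bh), and the counter starts at 1 as the cracker source assumes.

```python
def layer3_transform(data, key, decode, perturb,
                     step=0x0D, counter=1, period=0x20):
    """Rolling XOR of 057C-05BF; `perturb` is a hypothetical INT 3 stand-in."""
    out = bytearray()
    ah = key
    for byte in data:
        result = byte ^ ah                   # xor es:[di],ah
        out.append(result)
        ah = byte if decode else result      # mov ah,al: always the CODED byte
        counter = (counter - 1) & 0xFF       # dec byte ptr [075A]
        if counter == 0:
            ah = (ah + perturb()) & 0xFF     # int 03 / add ah,dh
            counter = period                 # mov byte ptr [075A],20
        else:
            ah = (ah + step) & 0xFF          # add ah,0D
    return bytes(out)

def make_perturb(seed=0x47):
    """Deterministic dummy sequence standing in for the INT 3 output."""
    state = [seed]
    def perturb():
        state[0] = (state[0] * 5 + 1) & 0xFF
        return state[0]
    return perturb

plain = bytes(range(256)) * 2
coded = layer3_transform(plain, 0x5A, decode=False, perturb=make_perturb())
restored = layer3_transform(coded, 0x5A, decode=True, perturb=make_perturb())
print("round trip ok:", restored == plain)
```

The design point is that the key depends on the previous coded byte plus a value that only the protection's own interrupt handler can supply, so the loop cannot be replayed without replaying the handler as well.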
Debugging such a protection, as we've already seen, is impossible. The
easiest way to decode a program that has been protected like this
is to SIMULATE the protection: to write a (preferably assembly-based)
program that contains
almost the same code, minus the anti-debug tricks, but reads from a FILE
and writes the decoded code to a FILE, too. By comparing the code below to
the code of the original protection you can see how this works.
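Working on the file instead of memory needs one piece of arithmetic: turning a cs:offset address from the debugger into a file position. A minimal sketch, assuming a 20h-byte EXE header (which is what the cracker source takes for granted); the helper names and the fake file are made up:

```python
import io

HEADER_BYTES = 0x20                          # assumed: two header paragraphs

def file_offset(cs, offset, header_bytes=HEADER_BYTES):
    """File position of cs:offset, where cs is the initial CS from the EXE header."""
    return header_bytes + cs * 16 + offset

def read_at(stream, cs, offset, size=1):
    stream.seek(file_offset(cs, offset))
    return stream.read(size)

# Fake file for illustration: CS=0002h, with the byte 11h planted at cs:01CF.
cs = 0x0002
fake = bytearray(HEADER_BYTES + cs * 16 + 0x0200)
fake[file_offset(cs, 0x01CF)] = 0x11

print(read_at(io.BytesIO(bytes(fake)), cs, 0x01CF).hex())
```

The cracker's seek expressions of the form 20h+CS*16+100h+0CFh compute the same position for cs:01CF.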
Of course, when writing a protection, do NOT forget that the original
program you're protecting should be encoded in an extremely complicated way.
Simple XORs should be avoided etc. Not even increment-XOR should be used.
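Why even an incrementing XOR is too weak can be shown in a few lines. The sketch assumes that "increment-XOR" means a key byte that grows by one for every data byte, and that the attacker can guess a single plaintext byte (for example the first byte of a known signature); the sample text and key are made up.

```python
def inc_xor(data, key):
    """XOR every byte with a key that is incremented after each byte."""
    return bytes(b ^ ((key + i) & 0xFF) for i, b in enumerate(data))

plain = b"MZ original program image"
coded = inc_xor(plain, 0x5A)

# One known plaintext byte gives away the whole key schedule.
recovered_key = coded[0] ^ plain[0]
print(f"recovered key: {recovered_key:02X}")
print(inc_xor(coded, recovered_key))
```

Because the key schedule does not depend on the data or on anything only the protection knows, guessing one byte, or simply trying all 256 start values, is enough.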
;CopyRight (C) by DirkGent@iRC
;A *WORKING* c0de to crack HyperLOCK 386.
;1, it pays attention to handle the varying codes
;2, the beginning values of decoding
;
;Input filename: megaem.exe
;and the output: decoded.exe
;
;there is only one must for you: you have to TDUMP the file you want to
;free and write the INITIAL CS VALUE into the following EQU:
; CSWithAntidebugRtn equ 1603h (at the 23rd row!)
; the present value is for MegaEm 2.02
; Fortunately, it IS possible to track this routine! The Interrupts
;used by THIS routine:
; int0 -> int 0f0h 3c0-3c3
; (int1 -> int 0f1h 3c4-3c7)
; (int2 -> int 0f2h 3c8-3cb)
; int3 -> int 0f3h 3cc-3cf
; (int8 -> int 0f4h ...)
.model small
.386
.code
jmp start
;!!!! one EQU, to be supplied BEFORE assembling this cracker routine !!!!!!
CSWithAntidebugRtn equ 1623h
FNameIn db 'MEGAEM.exe',0
FHandleIn dw 0 ;filehandle of the input file
FNameOut db 'decoded.exe',0
FHandleOut dw 0
Buff db 0
TempDD dd 0 ;for conversion between 32- and 16-bit registers (SEEK)
;the three parameters which are readable without decryption
OrigCS dw 0 ;the original CS in the EXE file
OrigSS dw 0 ;the original SS in the EXE file
_01cf db 0 ;the beginning XOR value at 01cf (the first decryptor rtn)
;the second set of parameters, grabbable after/under decoding the code from 01eah
_0298SIValue dw 0 ;used for int3 call in the XOR maker rtn
_029bDIValue dw 0
_03c5 dw 0 ;value to multiply in the int 3 rtn
_0436 dw 0 ;begin
_0471 dw 0
_053eSIValue dw 0 ;used for int3 call in the decoder routine of the code from 0551
_0541DIValue dw 0
;the third set of parameters, grabbable after/under decoding the code from 0551h
OrigIP dw 0 ;the original IP in the EXE file
OrigSP dw 0 ;the original SP in the EXE file
_074d dd 0 ;the size of the program to be decompressed
FileSize dd 0 ;as above
_0751 db 0 ;XOR byte to begin with (ah)
_0752 dw 0 ;INT2 beginner offset
_0754 dw 0 ;INT1 beginner offset
_075a db 0 ;counter: when do we have to include an INT3 call while decompressing the main program?
SeekAndWordRead macro AddyOffset,Variable
mov bx,FHandleIn
mov ax,4200h
mov edx,20h+CSWithAntidebugRtn*16+100h+AddyOffset
mov TempDD,edx
mov cx, word ptr [TempDD+2]
int 21h
call _LodsbToBuffer
mov al,Buff ;reading the LOW byte
mov Variable,al
endm
SeekTo macro Addy
mov ax,4200h
mov edx,Addy
mov TempDD,edx
mov cx, word ptr [TempDD+2]
int 21h
endm
SeekAndWordReadFromTargetFile macro AddyOffset,Variable
mov bx,FHandleOut
mov ax,4200h
mov edx,20h+CSWithAntidebugRtn*16+100h+AddyOffset
mov TempDD,edx
mov cx, word ptr [TempDD+2]
int 21h
call _LodsbToBufferFromNewFile
mov al,Buff ;reading the LOW byte
call _LodsbToBufferFromNewFile
mov ah,Buff
mov Variable,ax
endm
start:
push cs
pop ds
mov dx, offset FNameIn
mov ax, 3d00h ;open
int 21h
mov byte ptr [FHandleIn], al
mov dx, offset FNameOut
mov ax, 3c00h ;create
mov cx,0
int 21h
mov byte ptr [FHandleOut], al
;getting the first XOR value at 01cf before actually starting the decrunching
SeekAndWordRead 00cfh,_01cf
call GrabCode1
SeekTo 0 ;point back to the beginning
;now, step to cs:01ea, saving the word at cs:010c as well (orig. CS)
;and building up the first coded area from 01ea to 0551.
mov ecx,0
mov al,_01cf
ReadTheFirstSectionUntil0551:
call _LodsbToBuffer
inc ecx
mov ah,Buff
cmp ecx,20h+CSWithAntidebugRtn*16+100h+0eah ;above 01ea, we must XOR the actual code as well
jbe TheAddyIsBelow0eah
xor Buff,al
TheAddyIsBelow0eah:
cmp ecx,20h+CSWithAntidebugRtn*16+100h+023h
jbe TheAddyIsBelow023h
ChangeableCode: sub al,ah ;these 4 instructions are likely to change in every link (GrabCode1 overwrites them)
shr al,1
xor al,ah
rol al,1
cmp ecx,20h+CSWithAntidebugRtn*16+100h+0548h+240+0023h ;!!!!!!!! decoding everything
je _0550HasBeenWrittenOut
TheAddyIsBelow023h:
call _StosbFromBuffer
jmp ReadTheFirstSectionUntil0551
_0550HasBeenWrittenOut:
call _StosbFromBuffer
;now, get the following bytes from the undecoded file: section 1 and 2
SeekAndWordReadFromTargetFile 0ch,OrigCS
SeekAndWordReadFromTargetFile 0eh,OrigSS
SeekAndWordReadFromTargetFile 02c5h,_03c5
SeekAndWordReadFromTargetFile 0336h,_0436
SeekAndWordReadFromTargetFile 0371h,_0471
SeekAndWordReadFromTargetFile 043fh,_053eSIValue
SeekAndWordReadFromTargetFile 0442h,_0541DIValue
SeekAndWordReadFromTargetFile 0199h,_0298SIValue
SeekAndWordReadFromTargetFile 019Ch,_029bDIValue
;back to 0123h
SeekTo 20h+CSWithAntidebugRtn*16+100h+0023h
;so, we are going to write the third part out- let's compute the beginning XOR
; value!
mov ax,0
mov es,ax
push cs
pop es:word ptr [03c2h] ;int 0 segment addy
mov ax,offset int0
mov es:[03c0h],ax
push cs
pop es:word ptr [03ceh] ;int 3 segment addy
mov ax,offset int3
mov es:[03cch],ax
xor ax,ax
mov es,ax
mov ax,_0436 ;!!!!!!!!!VAR!!!!!!!!!!!
mov es:[03c8h],ax
mov ax,_0471 ;!!!!!!!!!VAR!!!!!!!!!!!
mov es:[03c4h],ax
mov bh,00 ;lameness... It's FIXED to 0 :)
mov si,0123h ;we read the OutFile to emulate the passed first XOR cycle
WeHaventReached551Yet:
call _LodsbToBufferFromNewFile ;to count the CRC, we have seeked back to
; 0123 in the outfile and we are reading from it
inc si ;additional ofcuz, SI shows us if we have already reached the point from
; where we must actually CHANGE the code itself as well
mov al,Buff
xor bh,al
push si
mov si,_0298SIValue
mov di,_029bDIValue
MOV DX,0497H ;permanent
int 0f3h
add bh,dh
;0326!
pop si
int 0f0h
xor bh,dh
cmp si,0551h
jne WeHaventReached551Yet
jmp WeHaveReached551
int3: push es
;push ax
push bx
push cx
xor ax,ax
mov es,ax
mov ax,es:[03c8h]
mov bx,es:[03c4h]
mov cx,ax
mul _03c5
shl cl,3
add ch,cl
add dx,cx
add dx,bx
shl bx,2
add si,di
add dx,bx
add dh,bl
mov cl,05
shr di,03
shl bx,cl
add dh,bl
shl si,04
add ax,0001
adc dx,0000
mov es:[03c8h],ax
mov es:[03c4h],dx
pop cx
pop bx
;pop ax
push bx
add bh,dh
mov bl,bh
xor es:[03c8h],bx
pop bx
pop es
ret 0004
int0:
add sp,0004
popf
cmp si,0551h
jne WeHaventReached551Yet
WeHaveReached551:
;now we have the correct XOR value in bh, let's do again some DEXORing
call _LodsbToBufferFromNewFile ;we are still reading from the TARGET file
;and step back!
push bx
mov bx,FHandleOut
mov ax,4201h ;relative seek (-1 byte)
mov dx,0ffffh
mov cx,0ffffh
int 21h
pop bx
mov bl,Buff
push ax
mov ah,bl
xor ah,bh
mov Buff,ah
pop ax
call _StosbFromBuffer
mov bh,bl
push si
push di
mov si,_053eSIValue
mov di,_0541DIValue
MOV DX,0497H
int 0f3h
pop di
pop si
add bh,dh
inc si
cmp si,0a00h ;it's quite high, but at least we don't have to pay attention to SP-conversions
jb WeHaveReached551
; now the third phase- to decompress the main file itself
; First, we have to grab the begin XOR values:
SeekAndWordReadFromTargetFile 0647h,OrigIP
SeekAndWordReadFromTargetFile 0649h,OrigSP
SeekAndWordReadFromTargetFile 0654h,_0754
; now read the original EXEsize to be decrypted
SeekAndWordReadFromTargetFile 064dh,_0752
mov ax,_0752
mov word ptr _074d,ax ;the lower word is ready
mov word ptr FileSize,ax ;the lower word is ready
SeekAndWordReadFromTargetFile 064fh,_0752
mov ax,_0752
mov word ptr [_074d+2],ax ;the upper word is ready
mov word ptr [FileSize+2],ax ;the upper word is ready
call _LodsbToBufferFromNewFile
mov al, Buff
mov _0751,al ;begin XOR value
SeekAndWordReadFromTargetFile 0652h,_0752
mov _075a,1 ;counter to jump to an INT3 (permanent 1 as well, even if it has an own data byte)
mov bx, FHandleOut
mov ax, 3e00h ;close outfile
int 21h
;(no exit to DOS here: the third phase below still has to run)
mov dx, offset FNameOut ;we can record the final version of the decoded
; EXE! So we simply create a new file with the
; same name with the prev. tempfile
mov ax, 3c00h ;create
mov cx,0
int 21h
mov byte ptr [FHandleOut], al
mov bx, FHandleIn
SeekTo 0
;COPYING the EXE header (2 paragraphs)
mov cx,20h
ReadTheHeader:
call _LodsbToBuffer
call _StosbFromBuffer
dec cx
jne ReadTheHeader
xor di,di
;mov es,di
mov ax,_0752
mov es:[03c8h],ax
mov ax,_0754
mov es:[03c4h],ax
cld
mov ah,_0751
LetsRestoreTheNextNext32k: cmp dword ptr _074D,00008000h
jbe LessThan32kLetsUnpackItInOneTurn
mov cx,8000h
ReadTheNextByteFromTheINFECTEDFile: call _LodsbToBuffer
mov al,Buff
xor Buff,ah
mov ah,al
dec byte ptr _075A ;decrementing the when-we-have-to-put-an-int03-routine-in counter
je OkInsertAnInt3Call
add ah,0Dh
ReturnFromCallingInt3:
call _StosbFromBuffer
loop ReadTheNextByteFromTheINFECTEDFile
sub dword ptr _074D,00008000h
jmp LetsRestoreTheNextNext32k
OkInsertAnInt3Call:
push ax
mov bx,ax
mov si,4647h
mov di,0000
mov dx,05C1h
int 0f3h
pop ax
add ah,dh
mov byte ptr _075A,20h ;we call int 3 quite rarely, due to speed problems
jmp ReturnFromCallingInt3
LessThan32kLetsUnpackItInOneTurn:
mov cx,word ptr _074D
ReadTheNextByteFromTheINFECTEDFile_2: call _LodsbToBuffer
mov al,Buff
xor Buff,ah
mov ah,al
dec byte ptr _075A ;decrementing the when-we-have-to-put-an-int03-routine-in counter
je OkInsertAnInt3Call_2
add ah,0bh
ReturnFromCallingInt3_2:
call _StosbFromBuffer
loop ReadTheNextByteFromTheINFECTEDFile_2
;storing the original SS:SP in the final, uncoded file
mov bx, FHandleOut
SeekTo 14d
mov ax,OrigSS
mov Buff,al
call _StosbFromBuffer
mov Buff,ah
call _StosbFromBuffer ;send out SS
mov ax,OrigSP
mov Buff,al
call _StosbFromBuffer
mov Buff,ah
call _StosbFromBuffer ;send out SP
;write out the real CS:IP
SeekTo 20d
mov ax,OrigIP
mov Buff,al
call _StosbFromBuffer
mov Buff,ah
call _StosbFromBuffer
mov ax,OrigCS
mov Buff,al
call _StosbFromBuffer
mov Buff,ah
call _StosbFromBuffer
;computing and storing the new DOS size in the header
SeekTo 2 ;pointing to MOD and DIV-word in the header
mov ax,word ptr [FileSize]
add ax,20h ;we'll calculate in the header size as well
mov dx,word ptr [FileSize+2] ;upper word
mov bx,512d
div bx
inc ax ;we have (Exesize div 512)+1 in AX, while MOD is in DX
mov Buff,dl
call _StosbFromBuffer
mov Buff,dh
call _StosbFromBuffer ;MOD has been sent out, DIV follows
mov Buff,al
call _StosbFromBuffer
mov Buff,ah
call _StosbFromBuffer
mov bx, FHandleOut
mov ax, 3e00h ;close outfile
int 21h
mov ah,4ch
int 21h
OkInsertAnInt3Call_2:
push ax
mov bx,ax
mov si,4647h
mov di,0000
mov dx,05C1h
int 0f3h
pop ax
add ah,dh
mov byte ptr _075A,20h ;we call int 3 quite rarely, due to speed problems
jmp ReturnFromCallingInt3_2
;getting 8 bytes from 01e0 to 01e7 and putting them in our OWN decryptor routine!
GrabCode1: mov bp,0
mov ax,4200h
mov edx,20h+CSWithAntidebugRtn*16+100h+00e0h
mov TempDD,edx
mov cx, word ptr [TempDD+2]
int 21h
GettingTheNextCodeByte: call _LodsbToBuffer
mov al,Buff ;reading the LOW byte
mov byte ptr [ChangeableCode+bp],al
inc bp
cmp bp,8
jne GettingTheNextCodeByte
ret
_LodsbToBuffer: ;read the next byte from the SOURCE file
pusha
mov bx,FHandleIn
mov ax,3f00h
mov cx,0001
mov dx,offset Buff
int 21h
popa
ret
_LodsbToBufferFromNewFile: ;read the next byte from the TARGET file
pusha
mov bx,FHandleOut
mov ax,3f00h
mov cx,0001
mov dx,offset Buff
int 21h
popa
ret
_StosbFromBuffer: ;write Buff to the TARGET file
pusha
mov bx,FHandleOut
mov ax,4000h
mov cx,0001
mov dx,offset Buff
int 21h
popa
ret
end
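The header arithmetic near the end of the listing (the MOD and DIV words written at offset 2 of the EXE header) can be cross-checked with a short Python sketch. It assumes the usual MZ convention: the word at offset 2 is the number of bytes used in the last 512-byte page (0 meaning a full page) and the word at offset 4 is the page count. The function names are made up.

```python
def mz_size_fields(total_bytes):
    """(bytes in last page, page count) under the usual MZ header convention."""
    pages, last = divmod(total_bytes, 512)
    if last:
        pages += 1                           # a partial last page still counts
    return last, pages

def listing_size_fields(total_bytes):
    """What 'div bx / inc ax' in the listing yields: always one extra page."""
    pages, last = divmod(total_bytes, 512)
    return last, pages + 1

for size in (1000, 1024, 90 * 1024 + 0x20):
    print(size, mz_size_fields(size), listing_size_fields(size))
```

The two agree unless the total size is an exact multiple of 512, in which case the unconditional inc in the listing counts one page too many.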