How To Write Your Own
Fast Decruncher? - (C) by Dirk Gently
Coders often face the lack of a decruncher that is efficient but still
fast, let's say on-the-fly. (Of course, one should make the distinction
between Stacker, Doubledisk etc., which are REAL on-the-fly decoders, but I
think e.g. PKLITE is also fast enough, not to mention the vast difference in
compression rate. So, from now on, I'll use the expression 'on-the-fly' when
referring to PKLITE & Co.) Think of it: using a cruncher with the efficiency
of PKZIP 1.1 (or even the ancient PKARC package, which had a definitely worse
compression ratio than PKZIP) can reduce the size of your demo/diskmag/etc.
considerably, and can also add some PROTECTION: your files will be not only
compressed, but also PROTECTED (of course, experienced coders won't have
problems finding out how to decompress a file that has been compressed by an
unknown compressor).
Of course, one could grab some books on designing compressors/decompressors,
but I think it's quite useless. Take my case, for example. Once, back in
1992, I had to produce an efficient decruncher for our diskmag. The cruncher
I wrote in C was damn slow and its efficiency wasn't outstanding, either. So
I found it better to STEAL a compressor and its decompressor routine. Where
did I steal my routines from? Well, of course, from EXE crunchers. Why?
- They're very fast and quite effective
- Their decompression routines are small and they work incredibly fast
Of course, it's vital to compare the
efficiency of the available EXE crunchers. There are the following EXE
crunchers on the scene: LZEXE, PKLITE and DIET. DIET can compress not only
EXE, but also data files, but we'll see how to use other EXE crunchers
to compress pure data without EXE headers.
At first, I preferred DIET because I didn't want to fuss around with
converting data files to EXE format. Later, I found that LZEXE can be more
useful because its decruncher code is considerably smaller, and it also
handles large data files (so does DIET, of course; a coder that compresses
only 64k is shitty and should be forgotten at once).
So, back to the subject: how to use LZEXE to compress our data files? First,
as LZEXE can't compress pure data files, we have to convert them into a
semi-EXE format. This can be achieved by INSERTING an EXE header before the
data and setting the length information of this header to match the length
of the resulting file (header plus data). It's very simple. Compile the
following two programs (the compiled version of the PASCAL program is in the
ZIP I've provided, though). Let's assume their names are header.asm and
crctsize.pas, respectively.
header.asm is the header we're going to append our data file to.
.model small
.code
end
crctsize.pas sets the EXE size right to avoid LZEXE's refusing
to compress EXE files that 'contain overlays'.
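uses dos;
var
  WordFile: file of word;
  ByteFile: file of byte;
  FileSizeInBytes: longint;
  DivWord: word;
  ModWord: word;
begin
  { get the size of the whole file (header + data) in bytes }
  assign(ByteFile, paramstr(1));
  reset(ByteFile);
  FileSizeInBytes := filesize(ByteFile);
  close(ByteFile);
  { EXE header: number of 512-byte pages and bytes used in the last page }
  DivWord := FileSizeInBytes div 512;
  ModWord := FileSizeInBytes mod 512;
  if ModWord <> 0 then inc(DivWord); { a partial last page counts as a page }
  { patch the two words at byte offsets 2 and 4 of the header }
  assign(WordFile, paramstr(1));
  reset(WordFile);
  seek(WordFile, 1);
  write(WordFile, ModWord);
  write(WordFile, DivWord);
  close(WordFile);
end.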
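If you'd rather do the size fix-up from C, here is a minimal sketch of the same arithmetic. It assumes the standard DOS EXE header layout (the word at byte offset 2 holds the bytes used in the last 512-byte page, with 0 meaning a full page, and the word at offset 4 holds the page count). The name exe_patch_size is hypothetical and is not part of the ZIP.

```c
#include <stdint.h>

/* Hypothetical helper: patch the two size words of a DOS EXE header.
 * header    points at the first bytes of the file image (at least 6 bytes)
 * file_size is the size of the WHOLE file, header included */
void exe_patch_size(uint8_t *header, uint32_t file_size)
{
    /* bytes used in the last 512-byte page; 0 means the page is full */
    uint16_t last = (uint16_t)(file_size % 512u);
    /* number of pages, counting a partial last page as one page */
    uint16_t pages = (uint16_t)(file_size / 512u + (last != 0 ? 1u : 0u));

    /* both words are stored little-endian */
    header[2] = (uint8_t)(last & 0xFFu);
    header[3] = (uint8_t)(last >> 8);
    header[4] = (uint8_t)(pages & 0xFFu);
    header[5] = (uint8_t)(pages >> 8);
}
```

A 1000-byte file gives 488 bytes in the last page and 2 pages; a 1024-byte file gives 0 and 2.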
Now you can compress this semi-EXE file with LZEXE. When the compression is
done, you must cut off the first 32 and the last 340-350 bytes of the EXE it
produces. The first 32 bytes are the new header, and the last section is the
runtime decompression module of the EXE file. I've included a fast and very
useful utility called cut.exe to do this job.
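The cutting step boils down to simple arithmetic. The sketch below is not the source of cut.exe, just an illustration in C. The name payload_span is hypothetical, and the tail length is left as a parameter because the exact size of the decompression module is not fixed here.

```c
#include <stddef.h>

/* Hypothetical helper: find the packed data inside the EXE that LZEXE wrote.
 * total = size of the packed EXE, head = bytes to drop at the front (32),
 * tail  = bytes to drop at the end (the runtime decompression module).
 * Returns 1 and fills offset/length on success, 0 if the cuts don't fit. */
int payload_span(size_t total, size_t head, size_t tail,
                 size_t *offset, size_t *length)
{
    if (head > total || tail > total - head)
        return 0;                 /* file too small for these cuts */
    *offset = head;               /* payload starts right after the header */
    *length = total - head - tail;
    return 1;
}
```

For a 1000-byte packed EXE with head 32 and tail 340, the payload is the 628 bytes starting at offset 32.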
Of course, it's strongly recommended that you write a batch file which does
all this. coder.bat contains the following:
@echo off
copy /b header.exe + %1 _tmp.exe
crctsize.exe _tmp.exe
lzexe _tmp
cut.exe _tmp.exe _tmp2.exe 32 -340
del %1
ren _tmp2.exe %1
Just supply the name of the file you want to compress and it'll compress and cut it.
Finally, the most important part: how do you decompress the file you've just produced? Here is the code to do that (decr.inc).
mov dx,0010h
lodsw
mov bp, ax
@0069: shr bp,1
dec dx
jne @0073
lodsw
mov bp, ax
mov dl,10h
@0073: jnb @0078
movsb
jmp @0069
@0078: xor cx,cx
shr bp,1
dec dx
jne @0084
lodsw
mov bp, ax
mov dl,10h
@0084: jb @00a8 ; second flag bit set: long match (offset word follows)
shr bp,1
dec dx
jne @0090
lodsw
mov bp, ax
mov dl,10h
@0090: rcl cx,1
shr bp,1
dec dx
jne @009C
lodsw
mov bp, ax
mov dl,10h
@009c: rcl cx,1
inc cx
inc cx
lodsb
mov bh,0FFh
mov bl,al
jmp @00BB
@00a8: lodsw
mov bx,ax
mov cl,03
shr bh,cl
or bh,0E0h
and ah,07
je @00c3
mov cl,ah
inc cx
inc cx
@00bb: mov al,es:[bx+di]
stosb
loop @00BB
jmp @0069
@00c3: lodsb
or al,al
je @Ready
cmp al,01
je @00D1
mov cl,al
inc cx
jmp @00BB
@00D1: mov bx,di
and di,000Fh
add di,2000h
mov cl,04
shr bx,cl
mov ax,es
add ax,bx
sub ax,0200h
mov es,ax
mov bx,si
and si,000Fh
shr bx,cl
mov ax,ds
add ax,bx
mov ds,ax
jmp @0069
@Ready: ret
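Input parameters:
DS:SI: the address of the compressed data IN MEMORY.
ES:DI: the address where you want to decompress it to.
The direction flag must be clear (CLD), because the routine reads and writes
with LODSB/LODSW/MOVSB/STOSB. It changes AX, BX, CX, DX, BP, SI and DI (and
DS/ES on large files).
Of course, this routine should be called with a CALL.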
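To make the stream format easier to follow, here is a flat-buffer C sketch of the routine's logic as I read it; treat it as an illustration, not a replacement for decr.inc. Assumptions: the names lz91_decode and LZ91_ERROR are hypothetical, a set second flag bit selects the long match, and code 1 (the segment fix-up at @00D1) becomes a no-op because a flat buffer needs no normalization.

```c
#include <stddef.h>
#include <stdint.h>

#define LZ91_ERROR ((size_t)-1)

typedef struct {
    const uint8_t *src;
    size_t len, pos;
    unsigned bits;   /* the flag word, BP in the asm */
    int count;       /* flag bits left, DX in the asm */
    int failed;      /* set when we run past the input */
} lz91_reader;

static unsigned rd_byte(lz91_reader *r)
{
    if (r->pos >= r->len) { r->failed = 1; return 0; }
    return r->src[r->pos++];
}

static unsigned rd_word(lz91_reader *r)
{
    unsigned lo = rd_byte(r);
    return lo | (rd_byte(r) << 8);
}

/* Take the lowest flag bit; reload the flag word right after the 16th bit,
 * exactly where the asm does its LODSW. */
static unsigned rd_bit(lz91_reader *r)
{
    unsigned bit = r->bits & 1u;
    r->bits >>= 1;
    if (--r->count == 0) { r->bits = rd_word(r); r->count = 16; }
    return bit;
}

/* Returns the number of bytes written, or LZ91_ERROR on bad input. */
size_t lz91_decode(const uint8_t *src, size_t src_len,
                   uint8_t *dst, size_t cap)
{
    lz91_reader r = { src, src_len, 0, 0, 16, 0 };
    size_t out = 0;

    r.bits = rd_word(&r);
    while (!r.failed) {
        size_t count, dist;
        if (rd_bit(&r)) {                      /* 1: literal byte (MOVSB) */
            if (out >= cap) return LZ91_ERROR;
            dst[out++] = (uint8_t)rd_byte(&r);
            continue;
        }
        if (!rd_bit(&r)) {                     /* 00: short match */
            count = rd_bit(&r) << 1;           /* 2 length bits, high first */
            count |= rd_bit(&r);
            count += 2;
            dist = 0x100u - rd_byte(&r);       /* offset byte with BH = FFh */
        } else {                               /* 01: long match */
            unsigned lo = rd_byte(&r), hi = rd_byte(&r);
            dist = 0x2000u - (((hi >> 3) << 8) | lo);  /* 13-bit offset */
            count = hi & 7u;
            if (count != 0) {
                count += 2;
            } else {                           /* extended code byte */
                unsigned ext = rd_byte(&r);
                if (r.failed) break;
                if (ext == 0) return out;      /* end of data (@Ready) */
                if (ext == 1) continue;        /* segment fix-up: no-op */
                count = ext + 1;
            }
        }
        if (r.failed || dist > out || count > cap - out)
            return LZ91_ERROR;
        for (; count > 0; --count, ++out)      /* bytewise, may overlap */
            dst[out] = dst[out - dist];
    }
    return LZ91_ERROR;
}
```

The copy loop runs byte by byte on purpose: a match may overlap the bytes it is producing, just like the MOV AL,ES:[BX+DI] / STOSB loop.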
I hope you'll enjoy this article. Don't forget to send me greets,
though! :)
And here is the ZIP that contains all the above files (incl. LZEXE).