How To Write Your Own Fast Decruncher? - (C) by Dirk Gently



Coders often face the lack of a decruncher that is efficient but still fast, let's say on-the-fly. (Of course, one should make the distinction between Stacker, Doubledisk etc., which are REAL on-the-fly decoders, and the EXE crunchers, but I think e.g. PKLITE is also fast enough, not to mention the vast difference in compression rate. So, from now on, I'll use the expression 'on-the-fly' when referring to PKLITE & Co.) Think of it: using a cruncher with the efficiency of PKZIP 1.1 (or even the ancient PKARC package, which had a definitely worse compression ratio than PKZIP) can reduce the size of your demo/diskmag/etc. considerably, and can also add some PROTECTION: your files will be not only compressed, but also PROTECTED (of course, experienced coders have no trouble finding out how to decompress a file that has been compressed by an unknown compressor).
Of course, one could grab some books on designing compressors/decompressors, but I think it's quite useless. Take my case, for example. Once, back in 1992, I had to produce an efficient decruncher for our diskmag. The cruncher I wrote in C was damn slow and its efficiency wasn't outstanding, either. So I found it better to STEAL a compressor and its decompressor routine. Where did I steal my routines from? Well, of course, from EXE crunchers. Why?
  1. They're very fast and quite effective
  2. Their decompression routines are small and they work incredibly fast

Of course, it's vital to compare the efficiency of the available EXE crunchers. There are the following EXE crunchers on the scene: LZEXE, PKLITE and DIET. DIET can compress not only EXE, but also data files, but we'll see how to use other EXE crunchers to compress pure data without EXE headers.
At first, I preferred DIET because I didn't want to fuss around with converting data files to EXE format. Later, I found that LZEXE can be more useful because its decruncher code is considerably smaller and it also handles large data files (so does DIET, of course. A cruncher that compresses only 64k is shitty and should be forgotten at once).
So, back to the subject: how to use LZEXE to compress our data files? First, as LZEXE can't compress pure data files, we have to convert them into a semi-EXE format. This can be achieved by INSERTING an EXE header before the data and setting the size fields of this header to match the length of the resulting file. It's very simple. Compile the following two programs (you can find the compiled version of the PASCAL program in the ZIP I've provided, though). Let's assume their names are header.asm and crctsize.pas, respectively.

header.asm produces the EXE header that our data file gets appended to.

.model small
.code
end

crctsize.pas corrects the size fields in the EXE header; otherwise LZEXE refuses to compress the file, complaining that it 'contains overlays'.

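Var WordFile: file of word;
    ByteFile: file of byte;
    FileSizeInBytes: longint;
    DivWord: word;
    ModWord: word;

begin

{ total size of the semi-EXE (header + data) in bytes }
assign(ByteFile,paramstr(1));
reset(ByteFile);
FileSizeInBytes:= filesize(ByteFile);
close(ByteFile);

{ the header stores the size as 512-byte pages plus the bytes used in the last page }
DivWord:= FileSizeInBytes div 512;
ModWord:= FileSizeInBytes mod 512;
if ModWord <> 0 then inc(DivWord);   { a partly filled last page counts, too }

{ word 1 (offset 2) = bytes in last page, word 2 (offset 4) = page count }
assign(WordFile,paramstr(1));
reset(WordFile);
seek(WordFile,1);
write(WordFile,ModWord);
write(WordFile,DivWord);
close(WordFile);
end.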
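
If you want to double-check the two header words without firing up Turbo Pascal, here is a minimal sketch of the same arithmetic in Python. The function names exe_size_fields and patch_size are my own inventions, and the sketch assumes the usual EXE header layout (bytes in the last page at offset 2, page count at offset 4, both little-endian words), with the page count rounded up so that a file of exactly N*512 bytes gets no extra page.

```python
import struct

PAGE = 512  # the EXE header counts the file in 512-byte pages


def exe_size_fields(total_size):
    """Return (bytes_in_last_page, page_count) for a file of total_size bytes."""
    last = total_size % PAGE
    pages = total_size // PAGE + (1 if last else 0)  # round up to whole pages
    return last, pages


def patch_size(image):
    """Return a copy of the semi-EXE image with the size words rewritten."""
    last, pages = exe_size_fields(len(image))
    # offset 0-1 is the 'MZ' signature, offsets 2-5 hold the two size words
    return image[:2] + struct.pack("<HH", last, pages) + image[6:]
```

For example, a 1000-byte file gives 488 bytes in the last page and 2 pages, while a 1024-byte file gives 0 and 2.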

Now you can compress this semi-EXE file with LZEXE. When the compression is done, you must cut the first 32 and the last 340-350 bytes off the EXE it produces. The first 32 bytes form the new EXE header, and the last section is the runtime decompression module that LZEXE appends to the EXE file. I've included a fast and very useful utility called cut.exe to do this job.
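
In case you don't have cut.exe at hand, the cutting itself is trivial. This is only a rough Python sketch of the idea, not the source of cut.exe; the function name cut and its two parameters are hypothetical. You have to pass the real size of the decompression module of your LZEXE output as the tail value.

```python
def cut(data, head, tail):
    """Drop `head` bytes from the front and `tail` bytes from the end of data."""
    if head < 0 or tail < 0 or head + tail > len(data):
        raise ValueError("cannot cut more bytes than the file contains")
    return data[head:len(data) - tail]


def cut_file(src_name, dst_name, head, tail):
    """Read src_name, strip both ends and write the rest to dst_name."""
    with open(src_name, "rb") as f:
        data = f.read()
    with open(dst_name, "wb") as f:
        f.write(cut(data, head, tail))
```

Calling cut_file("_tmp.exe", "_tmp2.exe", 32, 340) would then leave only the crunched data stream in _tmp2.exe.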

Of course, it's strongly recommended that you write a batch file which does all this. coder.bat contains the following:

@echo off
copy /b header.exe + %1 _tmp.exe
crctsize.exe _tmp.exe
lzexe _tmp
cut.exe _tmp.exe _tmp2.exe 32 -340
del %1
ren _tmp2.exe %1

Just supply the name of the file you want to compress and it'll compress and cut it.

Finally, the most important part: how do you decompress the file you've just produced? Here is the code to do that (decr.inc).

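mov      dx,0010h        ; DX = control bits left in BP
lodsw
mov      bp, ax          ; BP = first 16 control bits
@0069:   shr bp,1        ; next control bit -> carry
dec      dx              ; (DEC, LODSW and MOV leave the carry alone)
jne      @0073
lodsw                    ; word used up: fetch the next 16 bits
mov      bp, ax
mov      dl,10h
@0073:   jnb @0078       ; 0 = copy from earlier output
movsb                    ; 1 = literal byte
jmp      @0069
@0078:   xor cx,cx       ; CX = copy length
shr      bp,1            ; second control bit -> carry
dec      dx
jne      @0084
lodsw
mov      bp, ax
mov      dl,10h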
@0084:   jb @00a8        ; 1 = long match (two-byte code), 0 = short match
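shr      bp,1            ; short match: two more bits give the length
dec      dx
jne      @0090
lodsw
mov      bp, ax
mov      dl,10h
@0090:   rcl cx,1
shr      bp,1
dec      dx
jne      @009c
lodsw
mov      bp, ax
mov      dl,10h
@009c:   rcl cx,1
inc      cx
inc      cx              ; length = 2..5
lodsb
mov      bh,0FFh
mov      bl,al           ; BX = 0FF00h+byte: offset -256..-1
jmp      @00bb
@00a8:   lodsw           ; long match: AL = offset low, AH = offset high+length
mov      bx,ax
mov      cl,03
shr      bh,cl
or       bh,0E0h         ; BX = 13-bit negative offset, -8192..-1
and      ah,07
je       @00c3           ; length field 0: an extra code byte follows
mov      cl,ah
inc      cx
inc      cx              ; length = 3..9
@00bb:   mov al,es:[bx+di]   ; copy CX bytes from earlier output
stosb
loop     @00bb
jmp      @0069
@00c3:   lodsb
or       al,al
je       @Ready          ; 0 = end of data
cmp      al,01
je       @00D1           ; 1 = renormalize the segment registers
mov      cl,al
inc      cx              ; anything else: length = byte+1
jmp      @00bb
@00D1:   mov bx,di       ; fold DI into ES, leaving DI at 2000h..200Fh
and      di,000Fh        ; so the last 8K of output stays reachable
add      di,2000h
mov      cl,04
shr      bx,cl
mov      ax,es
add      ax,bx
sub      ax,0200h
mov      es,ax
mov      bx,si           ; fold SI into DS, leaving SI at 0..15
and      si,000Fh
shr      bx,cl
mov      ax,ds
add      ax,bx
mov      ds,ax
jmp      @0069
@Ready:  ret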

Input parameters:
DS:SI: the address of the compressed data IN MEMORY.
ES:DI: the address where you want to decompress it to.

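Of course, this routine should be called with a CALL. Clear the direction flag (CLD) first, as the string instructions have to count upwards, and note that AX, BX, CX, DX, BP, SI and DI are destroyed; DS and ES change as well whenever the stream asks for a segment fix-up.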
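
For checking crunched files on another machine, or just for understanding the format, a slow reference decoder is handy. The following is only a sketch in Python of the stream as I understand it from the LZEXE 0.91 decompression module, not a translation blessed by anyone: control bits are taken from 16-bit words, lowest bit first; 1 means a literal byte, 00 a short match (two length bits and one offset byte), 01 a long match (two bytes holding a 13-bit offset and a 3-bit length). The name lzexe_decrunch is hypothetical. As the output lives in one flat buffer, the segment fix-up code (1) is simply skipped.

```python
def lzexe_decrunch(src):
    """Decode an LZEXE-style stream from bytes `src` into a bytes object."""
    pos = 2
    bits = src[0] | (src[1] << 8)      # first 16 control bits
    left = 16
    out = bytearray()

    def getbit():
        nonlocal pos, bits, left
        bit = bits & 1
        bits >>= 1
        left -= 1
        if left == 0:                  # refill right away, before any data byte
            bits = src[pos] | (src[pos + 1] << 8)
            pos += 2
            left = 16
        return bit

    while True:
        if getbit():                   # 1: literal byte
            out.append(src[pos])
            pos += 1
            continue
        if getbit():                   # 01: long match
            lo, hi = src[pos], src[pos + 1]
            pos += 2
            dist = 0x2000 - (lo | ((hi >> 3) << 8))   # 1..8192 bytes back
            length = hi & 7
            if length == 0:            # extended code byte follows
                code = src[pos]
                pos += 1
                if code == 0:          # end of data
                    break
                if code == 1:          # segment fix-up: nothing to do here
                    continue
                length = code + 1
            else:
                length += 2
        else:                          # 00: short match
            length = getbit() << 1
            length |= getbit()
            length += 2
            dist = 0x100 - src[pos]    # 1..256 bytes back
            pos += 1
        for _ in range(length):        # byte by byte, so overlaps work
            out.append(out[-dist])
    return bytes(out)
```

A tiny hand-made stream to try it on: the bytes 93 00 41 42 FE 00 00 00 (hex) hold two literals, one short match and the end code, and decode to ABABAB.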

I hope you'll enjoy this article. Don't forget to send me greets, though! :)

And here is the ZIP that contains all the above files (incl. LZEXE).