News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

RLE - encoding/decoding program and source

Started by drarem, March 11, 2005, 07:09:58 PM

Previous topic - Next topic

gabor

Hi again!

A bit delayed, but finally here is my RLE algo. rle.zip
It is wrapped very simply in a Win32 process, since the goal was to test the encoder/decoder. Please test it!
I have some ideas how to optimize it, but I thing RLE is a bit old method, and it's simply not worth working on it for that much time.

The proggy works like this:
1. throws an OpenFileName Dialog to get the file to be compressed.
2. throws a SaveFileName Dialog to get the output file.
3. throws an OpenFileName Dialog to get the file to be decompressed.
4. throws a SaveFileName Dialog to get the output file.

My results show what I've expected: the compression ratio is not very good, for text files it is a disaster. Once again RLE can be used for images with having large areas of the same color, or for sound samples of low voices...

Note:
- it uses memory buffers to read from and write to. The buffers are 64k long, so longer files are truncated.
- the rle.asm is full of debuginfo generating code, but they are commented out. If no debug is needed all this stuff with all the includes can be deleted.

I hope it works for every case.

Furthermore I would like to discuss the other packing algos, Huffmann and LZW. Does anyone agree??

By for now!

Gábor

drarem

Nice work, i was able to encode/decode a small .zip file and a ms paint file.

My packed files were 5 to 200 bytes larger than yours, I see how the key system works now.

Thanks for the post.

The Dude of Dudes

Fixed my previous code, bundled it into (yet another!) example. Just drag and drop the file to pack/unpack onto Packer.exe's icon. It also retains the file extension when unpacking. Will take a look at the other examples when I have some time!  :bg

[attachment deleted by admin]

gabor

Good Job Dude :U!

I didn't check your packer completly, however I found one bug (we could live with it :bg). When running the program directly, not dropping a file on it, it gives a memory access exception.

The results: as expected your packer gives the same compression as mine, this means we both implemented the algo correctly :toothy

Interested in LZW?

Greets, Gábor

The Dude of Dudes

Yes, it does throw an exception. If you check the source, you'll see  the commandline handling is quite simplistic which is why it causes an exception!  :red It could actually be re-worked quite a bit, I threw it together in quite a hurry.

I'd be very interested in other compression methods! 

drarem

wwooooopps...  hate to beat dead horses but apparently with characters greater than 128 your decompressor gets confused, try the bitmap below, the picture shifts a few bytes to the right.  How can the decompressor know which is an actual MSB and which is what is supposed to be there unless you know for certain an entire binary file normally uncompressed won't contain an MSB?

Like I said, I wonder what it would do with larger files in binary.

[attachment deleted by admin]

gabor

Hello!

I thought I'd better not bore all of you with packing algos, so I've put some explaination on http://members.chello.hu/nemeth.gabor1. At the moment it mainly consists of the the stuff already discussed here. But, I've also put there a not finished discription about LZW. (But I will finish it real soon!)

Greets, Gábor

The Dude of Dudes

Hmm... Im not sure what you mean, Drarem. I have no trouble with the bmp. I verified my decompression by examining Hex dumps of original and decompressed files. I check for matching length and randomly select offsets and compare their values. If the Image was shifted the offsets would not contain the same value, they would be shifted.

The decompressor processes the file in order like this:

The first byte is read and it determines what the following byte(s) mean. If the msb is set, the following byte is repeated (value of bits 0-6 add 3 times). If the msb is NOT set, its a literal value of how many single bytes follow. In either case, it can be determined where the offset of the next 'control charater' is. In the case of the repeating sequence, the next control character is 2 bytes past the current control character. In the case of a string of single bytes, the next control character is (control char value) + 1 bytes past the current control char. So as you can see, whether the msb is set in the bytes of the source file has no effect on the decompression. The key is getting the offset of the next control character, and the next, and so on. Hope this makes sense!

gabor

Dude, I'd say a clean 'n tidy explanation!

And I'd say could we jmp to the LZW, I would like anyone else (but me) to come up with an encoder/decoder code. As I've posted before, I put on a webpage the info about the LZW algo.


Greets, Gábor

roticv

Maybe I will code something when I have time..

roticv

I have started to code the LZW compressor/uncompressor. Just give me some time and you guys would get to see it.

The Dude of Dudes

#26
I've been working on a version as well. Still buggy. The de-compression is by far more work! I think my understanding of the Algo may be wrong. Specifically during de-compression.  This is what I am using:

read a character k
output k
w = k
loop
   read a character
   entry = dictionary entry for k
   output entry (fully translate all codes to final characters)
   add w + first char of entry to the dictionary
   w = entry
endloop

I think my problem is when k is a yet undefined Dictionary entry.  This is how I am handling it:

If k = next uncreated Dictionary entry, last k outputted is appended to w and creates the next Dictionary Entry

In a small test file i used a repeating series of numbers (1-5). At the 12th series it decodes to 1,2,3,4,5,5,2,3,4,5,1,2,3,4,5,1,1,3,4,5

Any ideas?

The Dude of Dudes

  :dance: Solution: :dance:

If k = next uncreated Dictionary entry, FIRST CHAR of last code output  is appended to w and creates the next Dictionary Entry

(big sigh!)


Same as my RLE Packer. Drag and drop files onto Packer icon to compress/decompress.

Works GREAT with most text files (up to 50% compaction), not as good with exe's (%30 maybe)



[attachment deleted by admin]

gabor

Hi Dude!

Nice work. Are your entries in the dictionary only 2 character long? I would say you could try a bigger value, and you could use other code length not just 12 bit (the number of entries, so the size of the dictionary would change as well).

Greets, Gábor

The Dude of Dudes

Here's a 16-bit version. Little better compression. Here's results from Windows.inc from the MASM package.

Windows.inc - 995kb
12bit lzw(2 byte entries) - 503kb
16bit lzw (2 byte entries) - 377kb  :cheekygreen:

I am still using 2 bytes entries. I believe increasing the dictionary size or dictionary entry size achieves the same result, does it not (please tell me if I am wrong)?

[attachment deleted by admin]