KGB Archiver on Sourceforge

Started by anunitu, August 16, 2007, 06:22:45 PM

anunitu

I was checking this out on SourceForge, as well as on the KGB homepage. The compression ratios are truly inspiring, but the program has VERY high system resource demands. It is written in C++, and I wondered if the author had done any assembler to reduce the system load and make it less of a system hog. I emailed the author and told him about this site, so perhaps he might pay us a visit.

http://kgbarchiver.net/?page=contact  (home page)

Very nice this is being done open source as well.


This is what caught my attention about this Archiver

http://digg.com/tech_news/Compress_MS_Office_430MB_ISO_to_1.4MB_with_KGB_Archiver


Anunitu

evlncrn8

The Digg URL is more or less stating it's a hoax... and 430 MB -> floppy-sized file is seriously doubtful,
unless they applied some optimisation to the ISO itself to begin with...

Interesting, sure, if it's true, but I'm really doubting it.

anunitu

Perhaps it is a little more than might seem possible. I do remember downloading a demo from the demo site; it was called "the product", and it was 64k. It produced about 16 minutes of video and sound, using an algorithm that builds the graphics and sound on the fly. Done in assembler, it was fairly fast.

This is a link to the file, "The product"

http://www.theproduct.de/

You can download the file, and they explain how they accomplished the graphics and sound on the fly.


Anunitu

BTW, this demo was done back in 2000.


Mark Jones

Hmmm. (Real) data can only be compressed so far - there comes a point of diminishing returns when additional computation simply cannot provide further compression. Likely this "ISO" or whatever is wholly incomplete, fake, and/or bloated with much repetitive data. Depending on the type of data you try to compress, you may get excellent compression results or poor results. MP3s don't compress well because the data is already compressed and has very little redundancy left. Bitmaps DO compress well because they contain lots of repetitive data. But there is no such thing as a "Magic File Compressor" which can take 100MB of MP3s and make it 1MB - this just isn't possible.

Compression limitations can be seen very effectively in random-number-generated files. These files are a stream of highly unique numbers, and even modern compressors cannot compress them further. This is because the data is already at its most complex: there is no redundancy left for the compressor to remove. Attached is one such random file, generated using a very good algorithm. Try compressing it with your favorite compressor. Enjoy. :lol
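
For anyone without the attachment, a minimal sketch of the same experiment follows. It assumes Python's standard `zlib` and `lzma` modules as stand-ins for "your favorite compressor", uses `os.urandom` as the random source, and picks a 65,536-byte test size; none of these are the tools or the algorithm used for the attached file.

```python
import lzma
import os
import zlib

SIZE = 65536  # assumed test size in bytes


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size of `data` for two stdlib compressors."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


# Random bytes from the OS: no patterns for a compressor to exploit.
random_data = os.urandom(SIZE)

# Highly repetitive data of (almost) the same size.
repetitive_data = b"MASM32 " * (SIZE // 7)

print("random    :", len(random_data), "->", compressed_sizes(random_data))
print("repetitive:", len(repetitive_data), "->", compressed_sizes(repetitive_data))
```

The random buffer should come out about the same size or slightly larger (container overhead), while the repetitive buffer shrinks to a small fraction of its size.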


[attachment deleted by admin]
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

zooba

Compressed from 65,536 bytes to 65,602 bytes (100.1%) :bg

I had a read of the KGB Archiver site and found nothing mentioning ISO files anywhere. ISOs tend to compress well since they don't have any compression (unless the files stored within them are already compressed). However, looking at their tests, the ratios do seem reasonable, and especially since they state up front that pretty decent processing power is required, I am inclined to think that they are genuine, even if the Digg page was not.

Cheers,

Zooba :U

Jupiter

KGB Archiver uses the PAQ compression algorithm developed by Matt Mahoney.
PAQ8 (and older) has an ASM implementation which is faster than the C version.

PAQ at Wikipedia
EnJoy!

ecube

ISO isn't a compression format; however, if implemented correctly, it can indeed reduce the overall size considerably (by removing duplicate files), and I've seen sizes drop 50% or more. As for this KGB Archiver, everyone here has already commented on how extremely unlikely it is that it compressed that much, unless the ISO was deliberately set up in a certain way. To be fair, I've seen videos on how to compress a DVD (4 GB) down small enough to fit on a cell phone, but a lot was stripped out to do that.
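
The duplicate-file idea can be sketched roughly as below. This is only an illustration of finding identical files by content hash and counting how many bytes would remain if each unique file were stored once; the function names and the temporary test tree are hypothetical, and it is not how any particular ISO mastering tool works.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_content(root: Path) -> dict[str, list[Path]]:
    """Group every file under `root` by the SHA-256 of its contents."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return dict(groups)


def unique_vs_total_bytes(root: Path) -> tuple[int, int]:
    """Return (bytes if each distinct file is stored once, bytes as-is)."""
    groups = group_by_content(root)
    total = sum(p.stat().st_size for paths in groups.values() for p in paths)
    unique = sum(paths[0].stat().st_size for paths in groups.values())
    return unique, total


# Hypothetical tree: the same 1000-byte file in two folders, plus one small file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a").mkdir()
    (root / "b").mkdir()
    (root / "a" / "setup.dat").write_bytes(b"x" * 1000)
    (root / "b" / "setup.dat").write_bytes(b"x" * 1000)
    (root / "readme.txt").write_bytes(b"hello")
    unique, total = unique_vs_total_bytes(root)
    print(f"{total} bytes as-is, {unique} bytes with duplicates stored once")
```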

hutch--

I just ran the earlier version (I don't have .NET installed) as a test, and it comes in slightly smaller than a 7-Zip archive as an SFX, but it is very slow in comparison, both in compression and decompression. I would be interested to see if the later version that needs .NET for its install has a higher compression ratio. At the moment the SFX archive is far too slow for an installation.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

hutch--

Beware of the earlier version of this piece of sewerage. I am currently searching the settings of my machine to stop it from holding the CPL extension. I uninstalled it and cleaned out what I could find in the registry, and it still does it.

u

People reported that it connects to external servers and downloads from there while doing the "decompression" of that MS Office package.
We can easily do even better with this type of "decompression": 8 kB of code downloading a 500 GB HDD image backup from the net :P . That's "compression" of 65536000 : 1 :)
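
For the record, the ratio works out as follows, assuming binary units (1 kB = 1024 bytes, 1 GB = 1024^3 bytes); the variable names are just for illustration.

```python
KIB = 1024          # bytes per kB (binary units assumed)
GIB = 1024 ** 3     # bytes per GB (binary units assumed)

downloader_size = 8 * KIB   # the tiny "decompressor" that just downloads
image_size = 500 * GIB      # the HDD image it fetches from the net

ratio = image_size // downloader_size
print(f"{ratio} : 1")
```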

P.S. anunitu, there's a huge difference between procedurally generated rough data (you can make that infinitely) and a heap of human-generated data. With the first, you get a few building blocks (functions) and feed them different parameters; with the other... well, the other is closer to noise.
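
A toy sketch of the "few building blocks plus parameters" idea, not taken from any actual demo: one hypothetical `sine_wave` function and a short parameter list stand in for the whole data stream, so the "recipe" is a few dozen characters while the generated output is tens of thousands of bytes.

```python
import math


def sine_wave(freq_hz: float, seconds: float, sample_rate: int = 8000) -> bytes:
    """Generate unsigned 8-bit samples of a sine tone (one building block)."""
    n = int(seconds * sample_rate)
    return bytes(
        int(127.5 + 127.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    )


# The "recipe": (frequency in Hz, duration in seconds) pairs.
recipe = [(440, 2), (660, 2), (880, 2)]

# The generated data: every tone rendered and joined together.
audio = b"".join(sine_wave(freq, secs) for freq, secs in recipe)

print("recipe size:", len(repr(recipe)), "characters")
print("output size:", len(audio), "bytes")
```

This only works because the output is fully described by the function and its parameters; there is no such short recipe for an arbitrary pile of documents.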
Please use a smaller graphic in your signature.

TMM

At present, PAQ has the best results at the cost of CPU and memory. A HEAVY cost.

The all-round best version is PAQ8O9 which has special subroutines for text, BMP, JPG etc. compression.

The fastest version of this is ... well ... mine ;) I rewrote some of Matt's code (C++ or ASM, which is the fastest?), although it's taking a long time to refine it further. As I'm using this version for my archives, I've released 2 versions: one that runs faster on P4 CPUs, and one that doesn't.

http://rapidshare.com/files/147526173/TAQ-pack.RAR.html for the files.

Feel free to scream about "oh, there's a trojan in there" if you want - there isn't :P

Now, feel free to test KGB against this (which is, in essence, PAQ8O9 - I've only added speed to it) and see which is the best.

Note: If (entropy == 100%) then compression=NONE <- remember that formula ;)
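
One rough way to put a number on that formula is the order-0 Shannon entropy of the byte values, where 8 bits per byte would be "100%". This is a sketch under that assumption; `entropy_bits_per_byte` is a hypothetical helper, and order-0 entropy ignores patterns across bytes, so it is only a first approximation of compressibility.

```python
import math
import os
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Order-0 Shannon entropy of the byte histogram, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


samples = {
    "single repeated byte": b"a" * 4096,
    "english-like text": b"the quick brown fox jumps over the lazy dog " * 100,
    "os.urandom": os.urandom(65536),
}

for name, data in samples.items():
    bits = entropy_bits_per_byte(data)
    print(f"{name:22s} {bits:5.2f} bits/byte ({bits / 8:6.1%})")
```

Note the caveat: `bytes(range(256)) * 100` also scores 8 bits per byte here, yet it is perfectly regular and compresses well, so a high order-0 figure alone does not prove a file is incompressible.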

BlackVortex

Wow, this is unbelievable compression and unbelievable slowness. (Even the decompression is very slow; tested with a 1.34 MB file, by the way.)

From now on, instead of "extremely slow", I'll just say paq-slow.

But did I mention the compression is extreme ?     :toothy :green

P.S.: I didn't specify a compression level, which is the default ?

Eddy

Quote from: Mark Jones on August 18, 2007, 12:45:54 AM
Compression limitations can be seen very effectively in random-number-generated files.
Random and (good!) pseudo-random data cannot be compressed by default. Compression algorithms rely on finding certain patterns in data. (Good pseudo-)random data, by definition, does not have patterns, and so it does not compress.

Kind regards
Eddy
Eddy
www.devotechs.com -- HIME : Huge Integer Math and Encryption library--