Making the smallest 32-bit "Hello World" application

thomasantony · April 04, 2005, 05:31:49 AM

Hi,
This is a problem I found with most tutorials that talk about import tables: they forget to mention that the IAT Data Directory should point to the array of IMAGE_THUNK_DATA RVAs. I found out that the API calls point to a member of this array.
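
The relationship described above can be sketched in Python. This is only an illustration under stated assumptions: the image base and all RVAs are hypothetical values, a single imported function is assumed, and `iat_slot_va` is a helper name invented for this example. The field order follows the documented IMAGE_IMPORT_DESCRIPTOR layout (five DWORDs).

```python
import struct

# Hypothetical addresses, chosen only for illustration.
IMAGE_BASE = 0x00400000
IAT_RVA = 0x2000        # start of the thunk array (the import address table)
HINT_NAME_RVA = 0x2010  # IMAGE_IMPORT_BY_NAME of the one imported function
DLL_NAME_RVA = 0x2020   # ASCII name of the DLL

# The thunk array: one 32-bit IMAGE_THUNK_DATA per import, then a zero terminator.
thunks = struct.pack("<2I", HINT_NAME_RVA, 0)

# IMAGE_IMPORT_DESCRIPTOR: OriginalFirstThunk, TimeDateStamp, ForwarderChain,
# Name, FirstThunk. FirstThunk holds the RVA of the thunk array above.
# OriginalFirstThunk is left at 0 here purely to keep the sketch short;
# a real file may point it at a separate lookup table.
descriptor = struct.pack("<5I", 0, 0, 0, DLL_NAME_RVA, IAT_RVA)

# The IAT data directory entry is an (RVA, size) pair covering the thunk array.
iat_directory = (IAT_RVA, len(thunks))


def iat_slot_va(index, image_base=IMAGE_BASE, iat_rva=IAT_RVA):
    """Virtual address of thunk number `index`, i.e. the memory operand of an
    indirect call such as `call dword ptr [slot]` to that import."""
    return image_base + iat_rva + 4 * index
```

With these made-up values, a call to the first import would go through the slot returned by `iat_slot_va(0)`, which lies inside the range named by `iat_directory`.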

Thomas

Randall Hyde · April 04, 2005, 10:42:00 PM

One's goal should never be to write the "world's smallest 32-bit 'Hello World' program." Instead, see how much functionality you can pack into 4K. After all, under 32-bit OSes 4,096 bytes of code space is the *minimum* amount of memory your process will consume (it will also consume 4K of stack space, and possibly some data space, too). Yes, you might make the EXE file a little smaller, but when you consider that most recent versions of Windows generally allocate 4K chunks on the disk, too, that magic number appearing in the directory listing, if it's not a multiple of 4K, is a total lie.

Learning how to create a tiny "Hello World" program is a fascinating side trip in terms of code compaction, but the truth is that the application is too small to allow you to learn really useful techniques (you need to work on larger programs to get a good feel for how to write compact code). And given the memory page requirements of the x86/Windows combination, and cluster sizes on disk, you're wasting your time trying to compact something that's already below the limits imposed externally to your program.
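
The rounding argument above can be written out as a small Python sketch. The 4,096-byte unit is the figure used in this post; the real cluster size depends on how the volume was formatted, and `allocated_size` is a hypothetical helper name used only here.

```python
def allocated_size(file_size, unit=4096):
    """Round file_size up to a whole number of allocation units.

    `unit` defaults to the 4K figure quoted in the post; treat it as an
    assumption, not a property of every disk.
    """
    if file_size <= 0:
        return 0
    # Ceiling division without floating point.
    return -(-file_size // unit) * unit


if __name__ == "__main__":
    for size in (1, 1024, 4096, 4097):
        print(size, "->", allocated_size(size))
```

Under that assumption, a 1,024-byte EXE and a 4,096-byte EXE occupy the same amount of disk space, which is the point being made.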

As I said, a better goal is to see how much functionality you can cram into 4K.
Cheers,
Randy Hyde

hitchhikr · April 04, 2005, 11:16:33 PM

If one wants to cram as much as possible into 4K, one has to first perfectly understand the PE file format and make this part of the file as small or as highly compressible as possible.

liquidsilver · April 05, 2005, 07:26:46 AM

I agree. A good challenge would be to find a program and attempt to compress it as much as possible. Preferably a program that interests you.

Vortex · April 05, 2005, 07:34:54 AM

Randall is right; it is better to work on improving the functionality.

Randall Hyde · April 05, 2005, 03:24:27 PM

Quote from: hitchhikr on April 04, 2005, 11:16:33 PM
If one wants to cram as much as possible into 4K, one has to first perfectly understand the PE file format and make this part of the file as small or as highly compressible as possible.

As long as the whole file sits in a single disk block (generally 4K), compression is irrelevant. It's like squeezing a bunch of extra cycles out of a code sequence that executes once during initialization -- it doesn't buy you much.

Knowing the PE file format isn't a bad thing, but knowing the PE format doesn't really help you write better assembly code. When someone says something like "I want to write the world's smallest 'hello world' program" the assumption is that they want to do this by using the fewest number of instructions (or the shortest sequence of bytes to comprise those instructions), not by resorting to compression or other techniques that are independent of the original purpose.

When people suggest "compression" as a solution to the "shortest hello world" program I have to laugh. Such people have completely missed the point of the exercise -- to learn how to write *code* as compactly as possible. When it's all said and done, the compression approach is actually *larger* because you have to supply the code to decompress the actual executable. While compressed EXEs do serve a purpose (for *large* applications), their application to the "hello world" example defeats the purpose.

As usual, I still argue that the smallest memory block you can allocate is 4K. Trying to shrink an application below this point is, well, pointless. Further, as any win32 app is going to have to link with Kernel32.dll (and, possibly, others), the size of the file doesn't really have much meaning as all the real work is being done by a DLL elsewhere. You may as well bury the string in a user-written DLL and write a "hello world" program consisting of a single call to the DLL and make claims about the size of your code.

"The world's shortest 'Hello World' program" actually made sense in the days of DOS and COM files where the program would clear the (text) video display and copy the string "Hello World" directly into video memory. COM files could be as small as disk blocks (typically 512 bytes, in those days) and the application only used as much memory as its code, stack, and data segments required (well, you could argue that DOS apps always consumed the whole 640K because there was no multitasking, but that's a different issue).

Again, my argument is that a trivial application like "Hello World" provides almost no opportunity for real code compaction techniques. A better solution is to pick a memory size and see how much functionality you can squeeze into that, rather than taking a trivial program and seeing how small you can make it. There have been several "256-byte" contests floating around. I still argue that 4K is the right size, as this is the page size of the x86.

To this date, one of the best examples of packing functionality into a limited amount of space I've seen is the original Apple II monitor ROM. Steve Wozniak (Woz) packed an *incredible* amount of functionality into 2K of memory. Amazing stuff.
Cheers,
Randy Hyde

hitchhikr · April 06, 2005, 12:43:20 AM

I think you missed the point by a few hundred miles, Hyde.

Quote
Knowing the PE file format isn't a bad thing, but knowing the PE format doesn't really help you write better assembly code. When someone says something like "I want to write the world's smallest 'hello world' program" the assumption is that they want to do this by using the fewest number of instructions (or the shortest sequence of bytes to comprise those instructions), not by resorting to compression or other techniques that are independent of the original purpose.

Well...

Quote
If one wants to cram as much as possible into 4K, one has to first perfectly understand the PE file format and make this part of the file as small or as highly compressible as possible.

Where did I even slightly mention that it might help someone write better code?

Quote
When people suggest "compression" as a solution to the "shortest hello world" program I have to laugh.

And what offending imbecile would suggest such heresy?

Quote
When it's all said and done, the compression approach is actually *larger* because you have to supply the code to decompress the actual executable. While compressed EXEs do serve a purpose (for *large* applications), their application to the "hello world" example defeats the purpose.

Of course, plain PE file compression, without using DOS/CAB droppers, becomes effective for programs a few kilobytes in size (around 4).

Notice that you're the only one still talking about that hello world stuff, bub.

white scorpion · April 16, 2005, 12:03:28 PM

Quote
One's goal should never be to write the "world's smallest 32-bit 'Hello World' program." Instead, see how much functionality you can pack into 4K. After all, under 32-bit OSes 4,096 bytes of code space is the *minimum* amount of memory your process will consume (it will also consume 4K of stack space, and possibly some data space, too). Yes, you might make the EXE file a little smaller, but when you consider that most recent versions of Windows generally allocate 4K chunks on the disk, too, that magic number appearing in the directory listing, if it's not a multiple of 4K, is a total lie.

Learning how to create a tiny "Hello World" program is a fascinating side trip in terms of code compaction, but the truth is that the application is too small to allow you to learn really useful techniques (you need to work on larger programs to get a good feel for how to write compact code). And given the memory page requirements of the x86/Windows combination, and cluster sizes on disk, you're wasting your time trying to compact something that's already below the limits imposed externally to your program.

As I said, a better goal is to see how much functionality you can cram into 4K.
Cheers,
Randy Hyde

That would indeed be interesting, but I have to start somewhere, and this was the first thing that came to mind. I'm already reading up on compression of PE files, but most of the time that needs a decompressor as well, which is something I am trying to avoid.
And yes, of course I always want to improve my writing skills, but IMO this can only be done by experimenting and learning.

Unfortunately I didn't have the time to respond sooner, since I've been very busy learning to write device drivers :)

AeroASM · April 22, 2005, 08:52:49 PM

Here is one way to save bytes that I see all the time in Win32 DLLs:

Instead of cmp eax,0 use:

xor edi,edi
cmp eax,edi

Or even instead of cmp eax,1 use

xor edi,edi
inc edi
cmp eax,edi

roticv · April 23, 2005, 08:30:30 AM

Quote from: AeroASM on April 22, 2005, 08:52:49 PM
Here is one way to save bytes that I see all the time in Win32 DLLs:

Instead of cmp eax,0 use:

xor edi,edi
cmp eax,edi

Or even instead of cmp eax,1 use

xor edi,edi
inc edi
cmp eax,edi

test eax, eax
jz/jnz

dec eax
jz/jnz

AeroASM · April 23, 2005, 03:22:42 PM

Yeah, yeah, make me look stupid why don't you?

roticv · April 24, 2005, 03:28:14 AM

No, that's not my point. I am not attacking you or anything - I have no reason to do so. I don't know if you are correct - maybe C compilers are not that great at optimising. The examples you stated actually take up more space.

cmp eax, 0 = 83h F8h 00h

xor edi, edi = 33h FFh
cmp eax, edi = 3Bh C7h

test eax, eax = 85h C0h
------------------------------------
cmp eax, 1 = 83h F8h 01h

xor edi, edi = 33h FFh
inc edi = 47h
cmp eax, edi = 3Bh C7h

dec eax = 48h
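
The byte counts implied by the listing above can be tallied with a short Python sketch. The encodings are copied from the post; `sequence_size` is a hypothetical helper name used only for this illustration.

```python
# Encodings exactly as listed in the post above (one list of bytes per instruction).
ENCODINGS = {
    "cmp eax, 0":    [0x83, 0xF8, 0x00],
    "xor edi, edi":  [0x33, 0xFF],
    "cmp eax, edi":  [0x3B, 0xC7],
    "test eax, eax": [0x85, 0xC0],
    "cmp eax, 1":    [0x83, 0xF8, 0x01],
    "inc edi":       [0x47],
    "dec eax":       [0x48],
}


def sequence_size(*instructions):
    """Total encoded size, in bytes, of the named instructions."""
    return sum(len(ENCODINGS[name]) for name in instructions)


if __name__ == "__main__":
    print("cmp eax, 0            :", sequence_size("cmp eax, 0"))
    print("xor + cmp             :", sequence_size("xor edi, edi", "cmp eax, edi"))
    print("test eax, eax         :", sequence_size("test eax, eax"))
    print("cmp eax, 1            :", sequence_size("cmp eax, 1"))
    print("xor + inc + cmp       :", sequence_size("xor edi, edi", "inc edi", "cmp eax, edi"))
    print("dec eax               :", sequence_size("dec eax"))
```

One caveat when reading the totals: `dec eax` modifies EAX, whereas `cmp eax, 1` leaves it intact, so the one-byte form only fits when the original value is no longer needed.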

AeroASM · April 24, 2005, 06:43:03 AM

Sorry - I was not being serious. I found it funny (as in "ha ha") that after I try to make a really intelligent comment to add to the discussion, someone else completely smashes it down.

Anyway, if that is the case, either Microsoft programmers are stupid or they use idiotic compilers.

white scorpion · April 24, 2005, 08:05:43 PM

Hi guys,

It is not worth making the code smaller yet, since it is difficult enough to get below 1 KB, as discussed previously in the thread. I'm sure that if it is possible to remove the unused space in the sections, you can save a lot more than the few bytes gained by optimizing opcodes. Don't get me wrong, it is interesting indeed, but in this case it is something that should be done last; first I need to strip off as many unused bytes as possible.

jorgon · June 25, 2005, 05:23:32 AM

Quote from: AeroASM on April 22, 2005, 08:52:49 PM
Instead of cmp eax,0 use:

xor edi,edi
cmp eax,edi

There is also:

OR EAX,EAX
which does the same as CMP EAX,0
