Understanding optimization

Started by vozzie, August 11, 2009, 11:18:16 PM

vozzie

Hi all,

In the opcode help file of the MASM32 SDK there are tables for each mnemonic. I guess that "clocks" is the number of steps a processor needs to process the instruction.

Now I wrote one procedure working more at byte level (AL, AH, DL, DH, ...) and then one with "full" registers (EAX, EDX, ...). The second one, using full registers, was faster (and shorter too).

They both had the same result, but...

The first one was moving 4 bytes with mov eax,[esi] and then doing more shifting (SHR, SHL) and some rolling :)

The second one was using lodsb and more binary operations (OR, AND).
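
The attached procedures are not reproduced in this thread, so the following is only a hypothetical sketch of the byte-level versus full-register idea, written in C rather than MASM so it can be compiled anywhere. It XORs a buffer together, once with one 8-bit load per byte and once with one 32-bit load per four bytes. The function names xor_bytes and xor_dwords are made up for illustration and are not taken from the attachment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-level version: one 8-bit load and one XOR per input byte. */
uint8_t xor_bytes(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    uint8_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc ^= buf[i];
    return acc;
}

/* Full-register version: one 32-bit load and one XOR per four input bytes.
   The four byte lanes are folded together once, after the loop. */
uint8_t xor_dwords(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    uint32_t acc = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        uint32_t word;
        memcpy(&word, buf + i, sizeof word);  /* alignment-safe 32-bit load */
        acc ^= word;
    }
    acc ^= acc >> 16;                 /* fold upper two lanes into lower two */
    acc ^= acc >> 8;                  /* fold the remaining two lanes */
    uint8_t result = (uint8_t)acc;
    for (; i < len; i++)              /* leftover 0..3 tail bytes */
        result ^= buf[i];
    return result;
}
```

Both functions return the same value for any input. The second one simply touches memory a quarter as often, which is usually where the gain from full registers comes from.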

What are the things I should try to learn and look at for more optimization? Is it about saving "clocks"?

Maybe somebody can point me to some tips or information, or look at what I wrote and tell me where I stand and what I should look into.

(Update: searching the forum I found some good links and info, and I added the procedures as an attachment.)
With regards

redskull

In this day and age, optimization is way more than just cycle counts; it depends a great deal on what instructions came before it, the general state of affairs when the instruction is executed, and the specific CPU you are using. Check out Agner Fog's optimization manuals for a good introduction to things. Generally though, the less memory you access, the fewer conditional jumps you use, and the fewer API functions you call, the faster everything will go.
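
As a rough illustration of the "fewer conditional jumps" point, here is a small sketch in C (not MASM). The function names are invented for this example. The first loop decides with a branch whether to increment. The second adds the 0-or-1 result of the comparison directly, which in hand-written assembly corresponds to something like cmp / setg / add instead of cmp / jle. Whether a given compiler actually emits jump-free code for either version depends on the compiler and its options, so treat this as a sketch of the idea, not a guaranteed speedup.

```c
#include <stddef.h>

/* Branchy version: a conditional jump is taken or not for every element. */
size_t count_above_branchy(const int *v, size_t n, int limit)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > limit)
            count++;
    }
    return count;
}

/* Branch-free formulation: the comparison yields 0 or 1, which is added
   unconditionally. The running count stays in a local variable (ideally a
   register) instead of being written back to memory on each iteration. */
size_t count_above_branchless(const int *v, size_t n, int limit)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(v[i] > limit);
    return count;
}
```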

-r
Strange women, lying in ponds, distributing swords, is no basis for a system of government

MichaelW

Quote: What are the things I should try to learn and look at for more optimization? Is it about saving "clocks"?

Assuming that you mean optimization for speed, it's about executing faster, and completing a given task in less time. This time can be measured in seconds, or in clock cycles. But before you expend a lot of effort optimizing code, you first need to consider if there is a faster algorithm for the task.
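
A minimal sketch of both points, in C so it can be compiled anywhere: two ways of finding a value in a sorted array, plus a helper that times repeated calls with the standard clock() function. The names linear_search, binary_search and time_search are assumptions made for this example. No amount of cycle shaving on the linear version will catch up with the binary version on a large array, which is why the algorithm question comes first.

```c
#include <stddef.h>
#include <time.h>

/* O(n): examine every element until the key is found. */
long linear_search(const int *sorted, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (sorted[i] == key)
            return (long)i;
    return -1;
}

/* O(log n): halve the candidate range each step.
   The input must be sorted in ascending order. */
long binary_search(const int *sorted, size_t n, int key)
{
    size_t lo = 0, hi = n;                /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && sorted[lo] == key) ? (long)lo : -1;
}

typedef long (*search_fn)(const int *, size_t, int);

/* Returns the CPU time, in seconds, used by `repeats` calls of fn. */
double time_search(search_fn fn, const int *sorted, size_t n,
                   int key, int repeats)
{
    volatile long sink = 0;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        sink += fn(sorted, n, key);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_search with each function on the same large array and the same repeat count gives two numbers to compare. clock() is coarse, so the repeat count needs to be high enough for the total to be well above its resolution.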
eschew obfuscation

dedndave

Mark has a few nice tips to get you going
http://www.website.masmforum.com/mark/index.htm
intel and amd also have references, but are a bit more involved
32-bit registers are faster because that is the "native language" of pentiums
many opcodes are shorter when using the 32-bit form
some instructions are further optimized for use with eax, ax, al
xchg eax,reg32 is a single byte, for example
mov eax,[immed], mov [immed],eax, etc are shorter, too (the eax form has its own opcode and is one byte shorter than the same move with another register)
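
To make the size claims above concrete, here is a small table of 32-bit-mode machine code, written as C data so the lengths can be checked. The byte values follow the standard IA-32 encodings. The address 12345678h for var, the table layout and the helper name encoding_length are made up for illustration, and an assembler may pick a different but equally long encoding for the register-to-register xchg.

```c
#include <stddef.h>
#include <string.h>

struct encoding {
    const char   *instruction;  /* assembly text, used as the lookup key */
    unsigned char bytes[6];     /* machine code in 32-bit mode */
    size_t        length;       /* number of bytes actually used */
};

/* var is assumed to live at the hypothetical address 12345678h. */
static const struct encoding table[] = {
    { "xchg eax, ecx",  { 0x91 },                               1 },
    { "xchg edx, ecx",  { 0x87, 0xCA },                         2 },
    { "mov eax, [var]", { 0xA1, 0x78, 0x56, 0x34, 0x12 },       5 },
    { "mov ecx, [var]", { 0x8B, 0x0D, 0x78, 0x56, 0x34, 0x12 }, 6 },
    { "inc eax",        { 0x40 },                               1 },
    { "inc ax",         { 0x66, 0x40 },  /* 66h = operand-size prefix */ 2 },
};

/* Returns the encoded length of a listed instruction, or 0 if not listed. */
size_t encoding_length(const char *instruction)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].instruction, instruction) == 0)
            return table[i].length;
    return 0;
}
```

The pattern is the same in each pair: the eax form or the 32-bit form saves one byte. Shorter code is not automatically faster, but it does take less room in the instruction cache.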

redskull

Also, the first half of Michael Abrash's "Graphics Programming Black Book" is perhaps the best-ever reference for "getting your mind right" for optimization. It's becoming more and more outdated as the years go on, but several of the chapters are still the authoritative text on the subject.

-r