LEA

Started by bomz, July 04, 2011, 10:00:07 PM

dedndave

i had a case the other day.....

EDX was already 0 as a result of code at the bottom of the loop
i only had to zero it in loop init code
at the top of the loop, i needed the byte in CH extended to a dword, so...
        mov     dl,ch

it was faster to do this....
        movzx   edx,ch
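
A small sketch of what the two forms do to EDX, assuming made-up starting values. The two only give the same result when the upper 24 bits of EDX are already zero, which is the situation described above. MOVZX also does not read the old EDX, which may be one reason it timed faster, though nothing here proves that.

```asm
; Fragment for a 32-bit MASM .code section; the register values are illustrative.
        mov     ecx, 00004100h      ; CH = 41h, CL = 00h
        mov     edx, 12345678h      ; stale upper bits, as if EDX had not been cleared

        mov     dl, ch              ; writes only the low byte: EDX = 12345641h
        movzx   edx, ch             ; writes all 32 bits:       EDX = 00000041h
```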

jj2007

My Celeron doesn't care...

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
48      cycles for REP100 mov
46      cycles for REP100 zx
172     cycles for 100*mov (with dec ecx & jne)
171     cycles for 100*zx

bomz

My results were wrong. I used dedndave's tocks, and movzx is faster here too:

Quote:
Itanium 2 (2002+), MMX, SSE2
1234 1237 1234 1231 1232 - empty cycle
1839 1835 1846 1845 1833 - mov
1533 1534 1536 1538 1530 - movzx
Press any key to continue ...

Quote:
Itanium 2 (2002+), MMX, SSE2
1238 1231 1234 1239 1235 - empty cycle
1535 1535 1537 1531 1540 - sub cl, 48
1560 1558 1564 1553 1563 - sub ecx, 48
Press any key to continue ...

bomz

Quote:
Itanium 2 (2002+), MMX, SSE2
1233 1228 1223 1231 1230 - empty
4021 4021 4020 4021 4020 - shl edx, 3
2114 2109 2114 2112 2113 - lea edx,[eax*8]
Press any key to continue ...

Itanium 2 (2002+), MMX, SSE2
1228 1232 1225 1227 1226 - empty
4021 4021 4020 4020 4020 - shl edx, 3
5027 5026 5027 5027 5027 - lea eax,[eax*8]
Press any key to continue ...

Itanium 2 (2002+), MMX, SSE2
1231 1234 1236 1222 1234 - empty
4021 4021 4020 4020 4020 - shl edx, 2
2112 2106 2116 2113 2115 - lea edx,[eax*4]
Press any key to continue ...
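
One possible reading of the three runs above (an interpretation, not something the timings prove): the fast LEA cases write EDX but only read EAX, so the repeated copies do not depend on each other, while shl edx and lea eax,[eax*8] each read the register they have just written. A sketch, with an arbitrary repeat count:

```asm
; Fragment for a 32-bit MASM .code section; the repeat count of 4 is arbitrary.
; Dependent chains: each instruction needs the result of the one before it.
    REPEAT 4
        shl     edx, 3              ; reads and writes EDX
    ENDM
    REPEAT 4
        lea     eax, [eax*8]        ; reads and writes EAX
    ENDM

; Independent: every copy reads EAX, which nothing in this block modifies.
    REPEAT 4
        lea     edx, [eax*8]        ; EDX is only written here
    ENDM
```

If that reading is right, the shl edx / lea edx pairs compare a dependent chain against an independent one rather than SHL against LEA as such.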

Quote:
Itanium 2 (2002+), MMX, SSE2
1232 1231 1226 1229 1229 - empty
2031 2031 2028 2031 2027 - lea edi, [2*EAX-1]
6031 6030 6030 6030 6031 - shl edi, 1; sub edi, 1
Press any key to continue ...
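
For reference, a sketch of the arithmetic being compared, assuming the input is in EAX. The MOV is added here only so that both forms start from the same value; the timed shl/sub pair works on EDI in place.

```asm
; Fragment for a 32-bit MASM .code section; EAX is assumed to hold the input.
        lea     edi, [2*eax-1]      ; EDI = 2*EAX - 1 in one instruction, EAX and flags unchanged

        mov     edi, eax            ; same result, written out step by step
        shl     edi, 1              ; EDI = 2*EAX
        sub     edi, 1              ; EDI = 2*EAX - 1, flags now modified
```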

qWord

Quote from: bomz on July 08, 2011, 11:11:41 PM
Itanium 2 (2002+), MMX, SSE2
wow! - Itanium 2 server or workstation as home PC?
FPU in a trice: SmplMath
It's that simple!

jj2007

Strange...
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
190     cycles for shl eax, 2
196     cycles for 100*lea eax, [4*eax]
205     cycles for 100*lea edx, [4*eax]
190     cycles for shl eax, 3
195     cycles for 100*lea eax, [8*eax]
205     cycles for 100*lea edx, [8*eax]

bomz

Quote from: qWord on July 08, 2011, 11:21:15 PM
Quote from: bomz on July 08, 2011, 11:11:41 PM
Itanium 2 (2002+), MMX, SSE2
wow! - Itanium 2 server or workstation as home PC?

It is a Pentium 4, 9 years old. I don't know why the program recognizes it as an Itanium.

bomz

Nothing strange.
LEA MAY cause a processor stop (pause) if it uses a register that was changed in the previous ticks (not a good translation).

qWord

Quote from: bomz on July 08, 2011, 11:30:01 PM
Nothing strange.
LEA MAY cause a processor stop (pause) if it uses a register that was changed in the previous ticks (not a good translation).
the processor never stops :wink
Also: the test has no validity. The advantage of LEA (e.g. no flag stalls) may only be visible in the context of a (sensible) instruction sequence.
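
A minimal sketch of the kind of sequence where the flag behaviour matters. LEA does not modify EFLAGS, so it can sit between a compare and the branch that uses it. The label name is hypothetical.

```asm
; Fragment for a 32-bit MASM .code section; below_ten is a made-up label.
        cmp     ecx, 10             ; sets the flags
        lea     eax, [eax*4]        ; EAX = EAX*4, flags left as CMP set them
        jb      below_ten           ; still tests the result of CMP

        ; With SHL in the same position the flags from CMP would be overwritten:
        ;   cmp ecx, 10
        ;   shl eax, 2
        ;   jb  below_ten           ; would test CF from the shift instead
below_ten:
```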

bomz

I don't know how to translate it, and I can't find anything about LEA in English. The Russian text uses a word that I can translate roughly as "stop" or "pause".


bomz

stop + pause + wait /3


   1. stop
   2. halt
   3. halting
   4. arrester
   5. dog
   6. lock

Maybe LOCK is the best, or HALT.

And "in previous ticks" is used in the sense of "in the previous wave" (oscillation).

I am not a programmer, my English is very poor, I don't know specialized computer English, and this is a back translation (eng-rus-eng).

redskull

"Stall". Though it keeps on executing other stuff
Strange women, lying in ponds, distributing swords, is no basis for a system of government

bomz

Google says "stall" means to lose speed. redskull, you give a more professional comment; you must know what using the same register is called without my back translation. Strange that I can't find anything in English about LEA. In Russian there are only a few words, and they are the same on different sites.

At first I understood "in the previous oscillation" as "in the previous instruction", so I did:
Quote:
   lea eax, [4*EAX+EAX]
   add edx, 1
   lea eax, [2*EAX+ECX]

But now I see that lea eax, [2*EAX+ECX] also counts as "in previous".

hutch--

bomz,

LEA was only slow on a PIV; in some instances it was faster to do 2 adds than 1 LEA, and this is from the Intel manual. Earlier and later Intel hardware was faster with LEA in most instances.
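
The post does not say which sequences the manual had in mind, so the following is only an illustration of what "2 adds instead of 1 LEA" can look like for a multiply by 4. The choice of EDX and ECX as destinations is arbitrary.

```asm
; Fragment for a 32-bit MASM .code section; EAX is assumed to hold the input.
        lea     edx, [eax*4]        ; EDX = EAX*4 with one scaled-index LEA

        mov     ecx, eax            ; copy the input
        add     ecx, ecx            ; ECX = EAX*2
        add     ecx, ecx            ; ECX = EAX*4, and unlike LEA the ADDs change the flags
```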
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php