
slower? (with how much?)

Started by korte, June 25, 2008, 09:25:10 AM


korte

Is the second version slower? (And by how much?)



_cycle:
     mov eax,[esi]
     mov [edi],eax
     add  esi,4
     add edi,4
     loop _cycle


vs.


_cycle:
     mov eax,fs:[esi]
     mov fs:[edi],eax
     add  esi,4
     add edi,4
     loop _cycle

hutch--

korte,

Get rid of the LOOP instruction, it's ancient and slow. Work out your loop exit condition and try for a reduction to ZERO, which you can test, jumping back to the start if it's not yet zero.


  again:
    ; your code
    test eax, eax
    jnz again


This basic loop structure is a lot faster than the old LOOP instruction.
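
Applied to the copy loop from the first post, the count-down-to-zero idea might look like the sketch below. It is only an illustration, not a benchmark: the src and dst buffers, the copy_next label and the count of 4 are made up for the example, and SUB/JNZ on ECX is one way to get the zero test (SUB sets the zero flag itself, so no separate TEST is needed here).

```asm
    include \masm32\include\masm32rt.inc

    .data
      src dd 1, 2, 3, 4         ; hypothetical source buffer
      dst dd 4 dup (0)          ; hypothetical destination buffer

    .code
start:
    mov esi, OFFSET src
    mov edi, OFFSET dst
    mov ecx, 4                  ; number of DWORDs to copy, must not be zero

  copy_next:
    mov eax, [esi]
    mov [edi], eax
    add esi, 4
    add edi, 4
    sub ecx, 1                  ; count down, SUB sets the zero flag at zero
    jnz copy_next               ; go round again while the count is not zero

    mov eax, dst+12             ; last DWORD copied
    print ustr$(eax),13,10

    exit
end start
```

The count has to be at least 1 on entry, otherwise the loop wraps around and runs 2^32 times.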

korte

ok. bad question.

mov eax,[esi]

vs

mov eax,FS:[esi]




evlncrn8

fs:[esi] is a totally different segment.. you do know the difference i hope...

fs:[0] = seh pointer..


so for your example..


mov eax,[esi]  vs. mov eax,FS:[esi]

will read completely different data....

proof?

try making esi zero and see why the first code example will crash (memory exception) but the second will pass...


mov eax, [esi] vs. mov eax, ds:[esi] is essentially the same, since DS is already the default segment for [esi]. The assembler might add a prefix for the ds: part, or might not; it really depends on the assembler and on the assumptions (ASSUME) made in the code...
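
To see the "different data" point without crashing anything, here is a minimal sketch, assuming 32-bit Windows: there FS addresses the per-thread information block, fs:[0] holds the head of the SEH chain, and fs:[18h] holds the flat address of that same block. Reading through that flat address with the default segment should therefore give the same value as fs:[0].

```asm
    include \masm32\include\masm32rt.inc

    .code
start:
    ASSUME fs:NOTHING           ; allow fs: overrides in the code below

    mov eax, fs:[0]             ; SEH chain head, read through FS
    print ustr$(eax)," read as fs:[0]",13,10

    mov eax, fs:[18h]           ; flat address of the thread block itself
    mov eax, [eax]              ; same dword, read through the default DS
    print ustr$(eax)," read through the flat address",13,10

    exit
end start
```

The two lines printed should match, which shows that fs:[0] and ds:[0] are simply different linear addresses.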



korte

ok.

I keep asking this badly.
Sorry, my English is weak.
I meant to compare just the two instructions.


How much does the fs: prefix slow things down?


MichaelW

On a P3, I cannot detect any slowdown, and I seem to recall reading that there is no penalty for a segment override prefix.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      buffer dd 16 dup (0)
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    invoke Sleep, 3000

    mov esi, OFFSET buffer
    counter_begin 1000, HIGH_PRIORITY_CLASS
      mov eax, [esi]
      mov eax, [esi+4]
      mov eax, [esi+8]
      mov eax, [esi+12]
      mov eax, [esi+16]
      mov eax, [esi+20]
      mov eax, [esi+24]
      mov eax, [esi+28]
      mov eax, [esi+32]
      mov eax, [esi+36]
      mov eax, [esi+40]
      mov eax, [esi+44]
      mov eax, [esi+48]
      mov eax, [esi+52]
      mov eax, [esi+56]
      mov eax, [esi+60]
    counter_end
    print ustr$(eax)," cycles",13,10

    ASSUME fs:NOTHING

    mov esi, 0
    counter_begin 1000, HIGH_PRIORITY_CLASS
      mov eax, fs:[esi]
      mov eax, fs:[esi+4]
      mov eax, fs:[esi+8]
      mov eax, fs:[esi+12]
      mov eax, fs:[esi+16]
      mov eax, fs:[esi+20]
      mov eax, fs:[esi+24]
      mov eax, fs:[esi+28]
      mov eax, fs:[esi+32]
      mov eax, fs:[esi+36]
      mov eax, fs:[esi+40]
      mov eax, fs:[esi+44]
      mov eax, fs:[esi+48]
      mov eax, fs:[esi+52]
      mov eax, fs:[esi+56]
      mov eax, fs:[esi+60]
    counter_end
    print ustr$(eax)," cycles",13,10

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


11 cycles
11 cycles
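
The one difference that can be shown without timing is code size: the override is a single prefix byte (64h for FS) in front of the same opcode. A minimal sketch that lets the assembler report the sizes is below; the labels plain_begin, plain_end, fs_end and show_sizes are made up for the example, and the two sample instructions are only measured, never executed.

```asm
    include \masm32\include\masm32rt.inc

    .code
start:
    ASSUME fs:NOTHING
    jmp show_sizes              ; skip the samples, they are only measured

  plain_begin:
    mov eax, [esi]              ; encodes as 8B 06
  plain_end:
    mov eax, fs:[esi]           ; encodes as 64 8B 06, 64h = FS override
  fs_end:

  show_sizes:
    mov eax, plain_end - plain_begin
    print ustr$(eax)," bytes without prefix",13,10
    mov eax, fs_end - plain_end
    print ustr$(eax)," bytes with fs: prefix",13,10

    exit
end start
```

So the fs: version is one byte longer per instruction, which only matters if code size or instruction fetch is the bottleneck.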

korte