Unusual discovery.

Started by hutch--, June 07, 2005, 02:21:02 PM

hutch--

I wanted to test out an idea so I wrote a simple strlen procedure, got it up and running and clocked it.

The test was not of the algo itself but of the idea that the data placed BEFORE the algo, aligned or not, affects the speed of the algo. Playing with the repeat count and the data being repeated can change the timing of the algo even though that data sits BEFORE the ALIGN 16 directive. So far this is only tested on my PIV, but it's an unusual effect.


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

REPEAT 32
  db 0,1,2,3,4,5,6,7,8,9
ENDM

align 16

strlen2 proc C src:DWORD

    mov eax, [esp+4]
    sub eax, 1

  align 16
  @@:
    add eax, 1
    cmp BYTE PTR [eax], 0
    jnz @B

    sub eax, [esp+4]
    ret

strlen2 endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Phil

Hi Steve,

  I'll plug your code into MichaelW's timer on my machine and let you know if I see similar effects. I've been seeing unusual things too with the program I'm working on and timing based on elapsed time. I'm not sure what's happening with it. I'll let you know if I figure it out.

  I couldn't find anything that made a difference in timing for your routine. I saved your procedure definition as hutch.asm and included it in:
    .586                        ; create 32 bit code
    .model flat, stdcall        ; 32 bit memory model
    option casemap :none        ; case sensitive

    include \masm32\include\windows.inc
    include \masm32\include\masm32.inc
    include \masm32\include\kernel32.inc

    includelib \masm32\lib\masm32.lib
    includelib \masm32\lib\kernel32.lib
    includelib \masm32\lib\msvcrt.lib

    include \masm32\macros\macros.asm

    include timers.asm

    .data

len2    dd 0
str2    db "Now is the time for the apples to ripen!",0

    .code

    include hutch.asm

start:
    LOOP_COUNT EQU 10000000

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
        push    offset str2
        call    strlen2
        add     esp,4
        mov     len2,eax
    counter_end

    print ustr$(eax)
    print chr$(" call strlen2(str2) = ")
    print ustr$(len2)
    print chr$(13,10)

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
        invoke  strlen2, ADDR str2
        mov len2,eax
    counter_end

    print ustr$(eax)
    print chr$(" invoke strlen2,ADDR str2 = ")
    print ustr$(len2)
    print chr$(13,10)

    mov   eax, input(13,10,"Press enter to exit...")
    exit

end start

And here were the results, no matter how I changed the data definitions in hutch.asm:

C:\ASM\test>hutchtime
99 call strlen2(str2) = 40
99 invoke strlen2,ADDR str2 = 40

Press enter to exit...


hutch--

Phil,

The idea was to change the block of data before the algo, both its repeat count and the data in the DB line. The repeat block can equally contain assembler instructions.

On and off I have seen unpredictable variations in an algorithm's timing depending on its location within an EXE file, and this is even after aligning the entry label to various values, 4, 8 and 16.

What I have done here is set up an experiment using a memory-bound algo, one limited by memory access speed, to test the characteristic I am after.

Phil

Yes, I think I understand what you had intended. I ran the timing routine several times after manually changing the REPEAT count and various db values in your data definition. That forces the 16-byte-aligned entry point of your procedure to move up and down in memory as the data area grows and shrinks. Indeed, that shouldn't affect the timing of your routine, but in some cases it does. I was attempting to create a test case that we could use to identify a failing instance, but I was unsuccessful. Every time I ran it I got the same timing no matter what changes I had made to the data definition. I think the maximum data REPEAT count I tried was 311 and the number of db values was 37 or so. It didn't seem to make a difference.
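
For reference, the arithmetic behind "move up and down in memory" can be sketched as follows. This is an illustration only and not part of either poster's test code: Python is used purely for the arithmetic, the function names are hypothetical, and it assumes the data block starts at offset 0 of the code section, which may not hold in a real build.

```python
def padding_to_align(offset: int, alignment: int = 16) -> int:
    """Bytes of padding an ALIGN directive must insert at this offset."""
    return (-offset) % alignment


def aligned_entry(start_offset: int, repeat_count: int, bytes_per_line: int,
                  alignment: int = 16) -> int:
    """Offset of the first aligned address at or after the end of the data block.

    start_offset is an assumption (hypothetical): where the REPEAT block
    begins inside the code section.
    """
    end = start_offset + repeat_count * bytes_per_line
    return end + padding_to_align(end, alignment)


if __name__ == "__main__":
    # The block from the first post is REPEAT 32 of a 10-byte db line.
    for count in (31, 32, 33):
        size = count * 10
        print(count, size, padding_to_align(size), aligned_entry(0, count, 10))
```

With the original REPEAT 32 of a 10-byte line the block is 320 bytes, already a multiple of 16, so ALIGN 16 adds no padding. A count of 33 gives 330 bytes plus 6 bytes of padding, which moves the entry to offset 336. The entry point always stays on a 16-byte boundary but its absolute position shifts in 16-byte steps as the block changes size.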