STD instruction

Started by dedndave, September 24, 2009, 12:36:21 PM

dedndave

am i doing something wrong, or is STD one slow-ass instruction ? - lol - CLD seems to be fast enough
i have it at about 215 cycles on a p4 prescott
the following code is only about 25 cycles faster

        pushfd                  ; copy EFLAGS to the stack
        pop     eax
        or      eax,400h        ; DF is bit 10 of EFLAGS
        push    eax
        popfd                   ; reload EFLAGS with DF set

EDIT - i am about to write a loop to do a "manual" reverse scan - lol

MichaelW

Is it possible that your code is triggering an exception? On my P3 I get 13 cycles total for a STD followed by a CLD.
eschew obfuscation

dedndave

it functions ok
i dunno what kind of exception it would generate ???  :eek

dedndave

i am using it to scan a bignum from the top down - to skip over unused bytes (FF's for negative - 0's for positive and unsigned)
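
A minimal sketch of that kind of top-down scan, assuming a little-endian integer and the MASM32 SDK installed under \masm32. The names TopScan, pNum, nBytes and bignum are made up for illustration and are not from the posted code.

```asm
; Sketch only: TopScan, pNum, nBytes and bignum are hypothetical names.
; Returns in EAX the number of bytes left after stripping the fill bytes
; (00h or 0FFh) from the top of a little-endian integer; 0 if every byte
; is fill.
    include \masm32\include\masm32rt.inc

    .data
bignum  db 34h,12h,0,0,0,0,0,0      ; 1234h padded to 8 bytes

    .code
TopScan proc uses edi pNum:DWORD, nBytes:DWORD
        mov     ecx,nBytes          ; must be at least 1
        mov     edi,pNum
        lea     edi,[edi+ecx-1]     ; EDI -> most significant byte
        mov     al,[edi]
        sar     al,7                ; AL = 0FFh if negative, else 00h
        std                         ; scan toward lower addresses
        repe    scasb               ; skip bytes equal to the fill byte
        cld                         ; clear DF again before returning
        mov     eax,ecx             ; MOV and CLD leave ZF untouched
        je      @F                  ; ZF=1: every byte was fill, EAX = 0
        inc     eax                 ; ZF=0: count the byte that stopped the scan
@@:
        ret
TopScan endp

start:
        invoke  TopScan, addr bignum, sizeof bignum
        print   ustr$(eax),13,10
        inkey   "Press any key to exit..."
        exit
end start
```

For the sample value this should report 2. A caller that treats the result as a signed length may still need to keep one fill byte when the top remaining byte has the opposite sign bit; the sketch does not handle that.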

MichaelW

I can't actually recall ever seeing an exception caused by leaving the direction flag set, but I have seen my application die because of it. For example, this code:

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    invoke Sleep, 3000
   
    counter_begin 1000, HIGH_PRIORITY_CLASS
        std
        cld
    counter_end
    print ustr$(eax),13,10

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


Runs OK as is, but if I comment out the CLD, then it dies just as it starts displaying the results.

dedndave

on mine - lol - it continues to run
- displays the data
- but no cr/lf - lol - strange

MichaelW

I'm running Windows 2000. Perhaps on your system it continues to run because Windows is detecting the problem and correcting it, and that accounts for the lost cycles. Or perhaps the direction flag has been virtualized.

Edit: "virtualized" is the wrong term. What I mean is that Windows may be actively managing the direction flag to prevent problems with it being left set during a call to a CRT or API function that expects it to be clear.

dedndave

this is odd also...

std
cld
220 cycles

cld
5 cycles

cld
cld
100 cycles

i will play with the instruction placement

dedndave

this really sux, but....
the best solution seems to be

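        mov     eax,400h        ; DF is bit 10 of EFLAGS
        pushfd                  ; copy EFLAGS to the stack
        or      [esp],eax       ; set DF in the stacked copy
        popfd                   ; reload EFLAGS with DF set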

all of that is faster than

        std

drizz

i think the best solution is not using std  :lol


direction_N=4

_std macro
  direction_N=-4
endm

_cld macro
  direction_N=4
endm

_lodsd macro
  mov eax,[esi]
  add esi,direction_N
endm

_stosd macro
  mov [edi],eax
  add edi,direction_N
endm

_lodsb macro
  mov al,[esi]
  add esi,direction_N/4
endm

_stosb macro
  mov [edi],al
  add edi,direction_N/4
endm

_lodsw macro
  mov ax,[esi]
  add esi,direction_N/2
endm

_stosw macro
  mov [edi],ax
  add edi,direction_N/2
endm
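
A short usage sketch for these macros, assuming the MASM32 SDK under \masm32. Three of the macros are repeated so the example assembles on its own; the data name vals is made up. direction_N is an assembly-time variable, so _std and _cld emit no code and the direction follows source order, not execution order.

```asm
; Sketch only: vals is a hypothetical name. Sums four dwords from the
; top down without ever touching the direction flag.
    include \masm32\include\masm32rt.inc

direction_N=4

_std macro
  direction_N=-4
endm

_cld macro
  direction_N=4
endm

_lodsd macro
  mov eax,[esi]
  add esi,direction_N
endm

    .data
vals    dd 1,2,3,4

    .code
start:
        mov     esi,offset vals+12  ; ESI -> last dword
        xor     edx,edx             ; running sum
        mov     ecx,4
        _std                        ; assembly-time switch, emits no code
@@:
        _lodsd                      ; expands to mov eax,[esi] / add esi,-4
        add     edx,eax
        dec     ecx
        jnz     @B
        _cld                        ; later expansions step forward again
        print   ustr$(edx),13,10
        inkey   "Press any key to exit..."
        exit
end start
```

One difference from the real string instructions: the ADD inside each macro changes the arithmetic flags, which LODS and STOS do not.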

The truth cannot be learned ... it can only be recognized.

dedndave

well - i tried that Drizz
i am glad to see someone confirm my grief, though - lol
you see how i am using it
i have found a very good solution for this application
1) measure the integer length where using scasb to reduce length yields an advantage
2) skip it for shorter integers - just go ahead and evaluate the unused bytes   :U
by branching around the std/repz scasb/cld - we speed it up for short values
using the loop method works well if there aren't many unused bytes
if there are a lot - scasb kicks butt
what i may do is sample the length where there is an advantage and repz scasb in the up direction - tricky, huh   :U
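
A rough sketch of steps 1) and 2), assuming the MASM32 SDK under \masm32. TRIM_MIN, TrimLen, valLong and valShort are made-up names, and the cut-over value of 64 bytes is a placeholder that would have to be measured on the target CPU.

```asm
; Sketch only: TRIM_MIN, TrimLen, valLong and valShort are hypothetical.
; Short values keep their full length; long ones pay for STD/CLD once
; and let REPE SCASB strip the unused bytes.
    include \masm32\include\masm32rt.inc

TRIM_MIN equ 64                     ; placeholder cut-over length in bytes

    .data
valLong  db 1, 99 dup(0)            ; 100 bytes, 99 of them unused
valShort db 1, 7 dup(0)             ; 8 bytes, below the cut-over

    .code
TrimLen proc uses edi pNum:DWORD, nBytes:DWORD
        mov     eax,nBytes
        cmp     eax,TRIM_MIN
        jb      done                ; short value: branch around the scan
        mov     ecx,eax
        mov     edi,pNum
        lea     edi,[edi+ecx-1]     ; EDI -> most significant byte
        mov     al,[edi]
        sar     al,7                ; AL = 0FFh if negative, else 00h
        std
        repe    scasb               ; skip bytes equal to the fill byte
        cld                         ; DF clear again before returning
        mov     eax,ecx             ; MOV and CLD leave ZF untouched
        je      done                ; every byte was fill: EAX = 0
        inc     eax                 ; count the byte that stopped the scan
done:
        ret
TrimLen endp

start:
        invoke  TrimLen, addr valLong, 100
        print   ustr$(eax),13,10
        invoke  TrimLen, addr valShort, 8
        print   ustr$(eax),13,10
        inkey   "Press any key to exit..."
        exit
end start
```

With the placeholder cut-over, the long value should come back as 1 and the short one unchanged as 8.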

herge


Hi *.*:

If I remember correctly, Windows likes the direction flag up (i.e.
forward); it reacts badly to a direction flag that is down.
Translation: it will crash, or maybe hang, or send a message to
Redmond, Washington, USA, to our friends at the Big M.

Always put direction flag up for Windows!

Regards: herge
// Herge born  Brussels, Belgium May 22, 1907
// Died March 3, 1983
// Cartoonist of Tintin and Snowy

dedndave

yah - we got that Herge
we want to set it down temporarily
it is just very slow
i am guessing that the OS traps that instruction for some reason

herge

Hi dedndave:

This is a Windows driver site based, I believe, in New Hampshire, USA.
I find it useful for info on WinDbg from Microsoft.

http://www.osronline.com

Regards: herge

sinsi

About the only thing the Intel docs say could be a problem is a partial flag register stall.
Light travels faster than sound, that's why some people seem bright until you hear them.