The fastest way to clear a buffer

Started by frktons, August 24, 2010, 08:47:34 PM

GregL

E^cube,

No, you have that backwards E^cube, it was you sending me nasty PMs.  And you got a chance to take a jab at me and you sure took advantage of it didn't you?



GregL

Dave,

No problem Dave, I guess I'm just a little on edge tonight, I'm sorry.

frktons

Quote from: dedndave on August 26, 2010, 01:21:33 AM
give yourself some credit Frank - lol
STOSQ stores a qword (64 bit value)
REP STOSQ is probably fast as hell on 64-bit machines
Jochen showed REP STOSD in his 64-bit example code - that was probably just a small oversight
i am sure he meant REP STOSQ
RtlZeroMemory probably preserves ESI (or RSI), but other than that, it is straightforward
you can assume that the direction flag is clear, as it should be - i am sure RtlZeroMemory also makes that assumption

so....
load the value you want repeated into EAX/RAX
load the repeat count into ECX/RCX
load the address into EDI/RDI
then do the REP STOSD or REP STOSQ

it will be quite fast as long as the address is 4-byte-aligned for 32-bit code or 8-byte-aligned for 64-bit code

Well, Dave, I got the 32 bit version of rep stosd; it is not that difficult.
I'm trying to see how it works in 64 bit, and I don't have a clue how to do it
because I have no 64 bit experience or tools at all.  ::)

So I'm asking somebody to translate and compile in 64 bit mode to test the
performance it gets.
Mind is like a parachute. You know what to do in order to use it :-)
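
For readers without a 64-bit toolchain, the steps Dave lists can be modelled in a few lines of Python. This is only a sketch of the semantics, not a benchmark and not anyone's actual code from the thread: `rep_stos` and `is_aligned` are hypothetical helper names, the buffer is a plain `bytearray`, and the counts (2000 dwords, 1000 qwords) are illustrative values. It assumes the direction flag is clear, and it uses EDI/RDI as the destination pointer because that is the register STOS writes through.

```python
def rep_stos(memory: bytearray, edi: int, eax: int, ecx: int, width: int) -> int:
    """Model REP STOSD (width=4) or REP STOSQ (width=8), direction flag clear.

    Stores the low `width` bytes of the accumulator at `edi`, `ecx` times,
    advancing `edi` by `width` after each store. Returns the final pointer.
    """
    mask = (1 << (8 * width)) - 1
    pattern = (eax & mask).to_bytes(width, "little")
    while ecx > 0:
        memory[edi:edi + width] = pattern
        edi += width
        ecx -= 1
    return edi


def is_aligned(address: int, width: int) -> bool:
    """True if `address` is a multiple of the store width (4 or 8 bytes)."""
    return address % width == 0


# 32-bit style: 2000 dword stores of four spaces each.
buf32 = bytearray(8000)
end32 = rep_stos(buf32, 0, 0x20202020, 2000, 4)

# 64-bit style: 1000 qword stores of eight spaces each.
buf64 = bytearray(8000)
end64 = rep_stos(buf64, 0, 0x2020202020202020, 1000, 8)

print(buf32 == buf64, end32, end64)
```

Both calls fill the same 8000 bytes; the 64-bit form simply needs half as many stores, which is where any speed advantage would come from.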

GregL

frktons,

I don't think anyone has written timing routines in x64 yet.


frktons

Quote from: Greg Lyon on August 26, 2010, 02:25:52 AM
frktons,

I don't think anyone has written timing routines in x64 yet.

Time to start? Well, I can be satisfied just seeing how it translates into 64 bit.
It shouldn't be too complex:

push edi
mov ecx, 2000
mov edi, offset Dest
mov eax, 20202020h
rep stosd
pop edi

::)

Jochen suggested starting from:

    mov rax, 20202020202020202020202020202020h
    mov rdi, offset Dest
    mov rcx, 1000
    rep stosq


And it is quite clear, but what about all the rest of the program?
And moreover, is this legal 64 bit syntax that a 64 bit assembler can assemble?
Which one? JWasm, GoAsm, ML64?

dedndave

        push    rdi
        mov     rcx,1000
        mov     rdi,offset Dest
        mov     rax,2020202020202020h
        rep     stosq
        pop     rdi

i have no way to test it   :P
JwAsm will assemble it for you
you could clear a larger area using 32-bit and 64-bit, then use a stopwatch   :lol

frktons

Quote from: dedndave on August 26, 2010, 02:35:11 AM
        push    rdi
        mov     rcx,1000
        mov     rdi,offset Dest
        mov     rax,2020202020202020h
        rep     stosq
        pop     rdi

i have no way to test it   :P
JwAsm will assemble it for you

Oh, I was writing while you replied. Have a look at my previous post, Master.

dedndave

i think Jochen has a few too many spaces in there, Frank   :bg
16 spaces = 128 bits in a 64-bit reg
overflow !!! - lol
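
Dave's arithmetic is easy to check mechanically. The sketch below is a hedged illustration, not an assembler's real parser: `hex_literal_bits`, `count_space_bytes`, and `fits_in_register` are hypothetical helpers that only handle MASM-style literals with an `h` suffix and an even number of hex digits. Each hex digit is 4 bits and each `20` pair is one ASCII space.

```python
def hex_literal_bits(literal: str) -> int:
    """Number of bits spelled out by a MASM-style hex literal such as '2020h'."""
    digits = literal.rstrip("hH")
    return len(digits) * 4


def count_space_bytes(literal: str) -> int:
    """Number of 20h (ASCII space) bytes in the literal."""
    digits = literal.rstrip("hH")
    return bytes.fromhex(digits).count(0x20)


def fits_in_register(literal: str, register_bits: int = 64) -> bool:
    """True if the literal's value fits in a register of the given width."""
    value = int(literal.rstrip("hH"), 16)
    return value < (1 << register_bits)


sixteen_spaces = "20" * 16 + "h"   # the constant with 16 space bytes
eight_spaces = "20" * 8 + "h"      # 2020202020202020h

for lit in (sixteen_spaces, eight_spaces):
    print(lit, hex_literal_bits(lit), count_space_bytes(lit), fits_in_register(lit))
```

The 16-space constant spells out 128 bits, so it cannot be an immediate for a 64-bit register such as RAX; the 8-space constant is exactly 64 bits.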

frktons

#53
Quote from: dedndave on August 26, 2010, 02:38:13 AM
i think Jochen has a few too many spaces in there, Frank   :bg
16 spaces = 128 bits in a 64-bit reg
overflow !!! - lol

You are right, those were 128 bit xmm registers he used with SSE mnemonics, and he probably
forgot we were moving to the native 64 bit rxx registers.
Me too  :P

jj2007

2020202020202020h

Folks, if I remember well, Hutch knows a revolutionary mathematical technology to count the spaces in this expression :bg

hutch--

 :bg

Huh ?

In my own case "revolutionary" and "mathematical" are not compatible in the same sentence. I freely admit to "Eenie meanie minie moe" technology (fingers) and have to cheat and use computers to add up numbers.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

Quote from: hutch-- on August 26, 2010, 07:06:51 AM
:bg

Huh ?

In my own case "revolutionary" and "mathematical" are not compatible in the same sentence. I freely admit to "Eenie meanie minie moe" technology (fingers) and have to cheat and use computers to add up numbers.

I meant the "Eenie meanie minie moe" technology. It is quite sufficient to see that there are 8 spaces in 2020202020202020h, not 16 as Dave suspected :wink

hutch--

 :bg

Dave is probably cheating and using both hands.

dedndave

Quote from: jj2007 on August 25, 2010, 08:30:53 PM
What about you? I'll give you a starting point:
    mov rax, 20202020202020202020202020202020h
    mov rdi, offset buffer
    mov rcx, 1000
    rep stosd
i'd say that either Jochen has had one too many cappuccinos or his space bar is stuck   :lol
note: it should also be STOSQ - not STOSD

http://www.masm32.com/board/index.php?topic=14685.msg119244#msg119244

i cheated and used both hands to create this post
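
Why STOSD versus STOSQ matters once the count is 1000 can be shown with a small model. This is a sketch under stated assumptions: the 8000-byte buffer size is assumed (the thread does not give the size of `buffer`), `fill` is a hypothetical helper, and the accumulator holds the corrected 8-space value rather than the oversized constant.

```python
def fill(buffer_size: int, value: int, count: int, width: int) -> bytearray:
    """Model REP STOS with the given store width (4 = STOSD, 8 = STOSQ).

    Returns a zero-initialised buffer after `count` stores of the low
    `width` bytes of `value`, starting at offset 0.
    """
    memory = bytearray(buffer_size)
    pattern = (value & ((1 << (8 * width)) - 1)).to_bytes(width, "little")
    for i in range(count):
        memory[i * width:(i + 1) * width] = pattern
    return memory


RAX = 0x2020202020202020      # eight spaces
BUFFER_SIZE = 8000            # assumed size, chosen as 1000 qwords

with_stosd = fill(BUFFER_SIZE, RAX, 1000, 4)   # only EAX is stored each time
with_stosq = fill(BUFFER_SIZE, RAX, 1000, 8)   # all of RAX is stored each time

print(with_stosd.count(0x20), with_stosq.count(0x20))
```

With a count of 1000, the STOSD form covers only the first 4000 bytes and leaves the rest untouched, while STOSQ covers all 8000.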

jj2007

Quote from: dedndave on August 26, 2010, 12:13:25 PM
Quote from: jj2007 on August 25, 2010, 08:30:53 PM
What about you? I'll give you a starting point:
    mov rax, 20202020202020202020202020202020h
    mov rdi, offset buffer
    mov rcx, 1000
    rep stosd
i'd say that either Jochen has had one too many cappuccinos or his space bar is stuck   :lol
note: it should also be STOSQ - not STOSD

Oops, you are right! I had looked at replies 51 & 52 and saw exactly 8 spaces in the code... but that was your code, not mine :red
Apologies :thumbu