The fastest way to clear a buffer

Started by frktons, August 24, 2010, 08:47:34 PM

frktons

Well, I asked in 64 bit subforum because I'm not ready for 64 bit assembling.
My OS is 64 bit and my machine too, but I know too little to do it myself, and
I don't even have a clue on how to use GoAsm or JWasm or ML64.  ::)

By the way, instead of RtlZeroMemory it's probably better to use this MACRO from Microsoft
to accomplish the task of filling a block of memory:

void FillMemory(
  [out]  PVOID Destination,
  [in]   SIZE_T Length,
  [in]   BYTE Fill
);


What do you think?



Mind is like a parachute. You know what to do in order to use it :-)

GregL

The FillMemory macro calls the RtlFillMemory function.

You will find the following in windows.inc:


FillMemory EQU RtlFillMemory


I would call RtlFillMemory directly instead of FillMemory, just to keep things straightforward.

RtlZeroMemory fills memory with zeros; RtlFillMemory is for filling memory with other characters.
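
For readers coming from C, the relationship between the two calls can be sketched in portable code. This is only an illustration: the `my_` names are hypothetical stand-ins (so the snippet builds without windows.h), and it assumes both routines amount to a memset, with the fill byte either supplied by the caller or fixed at zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for RtlFillMemory(Destination, Length, Fill).
   Note the argument order: the length comes before the fill byte,
   which is the reverse of memset. */
void my_fill_memory(void *destination, size_t length, unsigned char fill)
{
    memset(destination, fill, length);
}

/* Hypothetical stand-in for RtlZeroMemory(Destination, Length):
   the same operation with the fill byte fixed at 0. */
void my_zero_memory(void *destination, size_t length)
{
    my_fill_memory(destination, length, 0);
}
```

Filling an 8000-byte buffer with spaces would then be `my_fill_memory(buf, 8000, ' ')`, and the real RtlFillMemory takes its arguments in the same order.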


dedndave

i think i would use REP STOSD/Q
or have we forgotten how to write ASM ?

frktons

Quote from: Greg Lyon on August 25, 2010, 11:29:46 PM
The FillMemory macro calls the RtlFillMemory function.

You will find the following in windows.inc:


FillMemory EQU RtlFillMemory


I would call RtlFillMemory directly instead of FillMemory, just to keep things straightforward.

RtlZeroMemory fills memory with zeros; RtlFillMemory is for filling memory with other characters.

Thanks Greg, this is what I meant. There is this MACRO from Microsoft that is quite efficient and
appears to be an implementation of rep stosd (the timings below are nearly identical).

Quote from: dedndave on August 25, 2010, 11:38:26 PM
i think i would use REP STOSD/Q
or have we forgotten how to write ASM ?

You are right, Master. Only one thing to meditate upon: REP STOSQ is only implemented
inside VISUAL C++,  and
Quote
This routine is only available as an intrinsic

And in Assembly as INTEL says:
Quote
In 64-bit mode, the default address size is 64 bits, 32-bit address size is supported using the prefix 67H. Using a REX prefix in the form of REX.W promotes operation on doubleword operand to 64 bits. The promoted no-operand mnemonic is STOSQ. STOSQ (and its explicit operands variant) store a quadword from the RAX register into the destination addressed by RDI or EDI.

That is quite an obscure explanation for a "premium n00b" of my level  :P

My results with RtlFillMemory added to the testbed:
Quote
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (SSE4)
1057    cycles for RtlZeroMemory
2017    cycles for FrkTons
1050    cycles for rep stosd
568     cycles for movdqa
556     cycles for movaps
1098    cycles for RtlFillMemory

1078    cycles for RtlZeroMemory
2047    cycles for FrkTons
1052    cycles for rep stosd
555     cycles for movdqa
548     cycles for movaps
1111    cycles for RtlFillMemory


--- ok ---

In the "sloppy category" my routine beats anybody else's.  :P

dedndave

give yourself some credit Frank - lol
STOSQ stores a qword (64 bit value)
REP STOSQ is probably fast as hell on 64-bit machines
Jochen showed REP STOSD in his 64-bit example code - that was probably just a small oversight
i am sure he meant REP STOSQ
RtlZeroMemory probably preserves ESI (or RSI), but other than that, it is straightforward
you can assume that the direction flag is clear, as it should be - i am sure RtlZeroMemory also makes that assumption

so....
load the value you want repeated into EAX/RAX
load the repeat count into ECX/RCX
load the address into EDI/RDI
then do the REP STOSD or REP STOSQ

it will be quite fast as long as the address is 4-byte-aligned for 32-bit code or 8-byte-aligned for 64-bit code
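
The recipe above can be modelled in C to check the semantics before committing to assembly. This is a sketch of what REP STOSD does with the direction flag clear, not a claim about how the CPU executes it; the function names and the alignment helper are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of REP STOSD with DF = 0:
   eax = value to store, ecx = repeat count, edi = destination.
   Each iteration stores one dword and advances the pointer by 4 bytes.
   Returns the final pointer, mirroring where EDI ends up. */
uint32_t *rep_stosd_model(uint32_t *edi, uint32_t eax, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = eax;
        ecx--;
    }
    return edi;
}

/* Returns 1 if the address is a multiple of `alignment`
   (4 for STOSD, 8 for STOSQ), the case described above as fast. */
int is_aligned(const void *address, size_t alignment)
{
    return ((uintptr_t)address % alignment) == 0;
}
```

The REP STOSQ case is the same loop with a 64-bit value and an 8-byte step, so a count given in bytes has to be divided by 8 rather than 4.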

GregL

Quote from: dedndave
i think i would use REP STOSD/Q
or have we forgotten how to write ASM ?

Dave,

I knew someone was going to say something like that.  I was only commenting on the use of FillMemory, not on what was fastest to do the job.




dedndave

you're right, of course, Greg
many of these functions were written for C-programmers   :lol

GregL

Dave,

So are you saying I don't know how to write ASM code?


dedndave

no - no - not at all, Greg - lol
you're tops   :U

GregL

Dave,

No, I'm definitely not tops.   Smartass.


ecube

Quote from: Greg Lyon on August 26, 2010, 01:48:52 AM
Dave,

No, I'm definitely not tops.   Smartass.



heh you must not be aware of Dave's humor, shame.

Also to the OP, if you're just trying to "clear" a buffer to use with an ascii string you can


lea edi,buffer
mov byte ptr [edi],0


and lstrcpy etc. will consider it empty.

GregL

Quote from: E^cube
heh you must not be aware of Dave's humor, shame.

Oh, I'm fully aware of it.


ecube

Ahh I remember you, you were insulting me in PM  :tdown I see you're playing nice with others too... :snooty:

dedndave

i meant that, Greg - i wish i knew all the stuff you guys know
i might learn some of it, if i had more time, too
i had planned on spending the entire summer learning more about win32 code
as it turned out, i didn't get to spend hardly any time on code
this weekend, it looks like i am off to Michigan to remodel a house, too - lol
by the time the day is done, i will be too tired to concentrate on learning anything new
maybe by the time christmas rolls around......

frktons

Quote from: E^cube on August 26, 2010, 01:55:32 AM

To the OP, if you're just trying to "clear" a buffer to use with an ascii string you can


lea edi,buffer
mov byte ptr [edi],0


and lstrcpy etc. will consider it empty.

I'm trying to fill a buffer with spaces [ASCII 32]. If somebody can translate into 64 bit Assembly,
and compile, the equivalent of:

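push edi                 ; preserve EDI (non-volatile under the Win32 calling convention)
mov ecx, 2000            ; repeat count: 2000 dwords = 8000 bytes
mov edi, offset Dest     ; destination buffer
mov eax, 20202020h       ; four ASCII spaces (20h) per dword
rep stosd                ; store EAX at [EDI] ECX times, advancing EDI by 4 each time
pop edi                  ; restore EDI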

         
I'll have a more detailed idea of how 64 bit native registers compare to
SSE instructions. My machine and OS are 64 bit, I'm not yet.  :'(
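
Pending a real 64-bit build, the arithmetic of the translation can at least be checked in C. The sketch below assumes the 64-bit version would use REP STOSQ with the pattern widened to 2020202020202020h and the count halved to 1000, so that the same 8000 bytes are covered. `fill_dwords`, `fill_qwords`, and `fills_match` are hypothetical names, and the loops only model the stores; they say nothing about speed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_BYTES 8000u  /* 2000 dwords, as in the 32-bit code above */

/* Model of the posted 32-bit code: ECX = count, EAX = 20202020h, REP STOSD. */
void fill_dwords(uint32_t *dest, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dest[i] = 0x20202020u;
}

/* Assumed 64-bit counterpart: RCX = count, RAX = 2020202020202020h, REP STOSQ. */
void fill_qwords(uint64_t *dest, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dest[i] = 0x2020202020202020ull;
}

/* Fills one buffer each way and returns 1 if the resulting bytes match. */
int fills_match(void)
{
    static uint32_t as_dwords[BUFFER_BYTES / 4];
    static uint64_t as_qwords[BUFFER_BYTES / 8];

    fill_dwords(as_dwords, BUFFER_BYTES / 4);  /* 2000 stores */
    fill_qwords(as_qwords, BUFFER_BYTES / 8);  /* 1000 stores */
    return memcmp(as_dwords, as_qwords, BUFFER_BYTES) == 0;
}
```

Half as many stores cover the same 8000 bytes; whether that makes REP STOSQ competitive with the movdqa/movaps loops can only be settled by timing it on the actual machine.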
