memcpy

Started by N1ghtm4r3, May 09, 2011, 12:34:48 PM

N1ghtm4r3

C++ memcpy function...

Example:
include \masm32\include\masm32rt.inc

include _memcpy.asm

Main PROTO

.data
szSrc db 20h  dup(1)
szDst db 20h  dup(0)

.code

Start:
  Invoke Main
  Invoke ExitProcess,0

Main proc

push 10
push offset szSrc
push offset szDst
call _memcpy

ret
Main endp

end Start

dedndave

 :bg
        mov     esi,offset szSrc
        mov     edi,offset szDst
        mov     ecx,sizeof szSrc
        rep     movsb

N1ghtm4r3

sure!  :bg
But the code I posted might be useful in some special situations, as it was for me today  :wink

brethren

You can use C++ functions... or you can just roll your own:

.code
MemCopy PROC USES esi edi, Source:DWORD, Dest:DWORD, ln:DWORD

    cld                 ; copy forward (esi/edi increment)
    mov esi, Source
    mov edi, Dest
    mov ecx, ln
    shr ecx, 2          ; number of whole dwords
    rep movsd

    mov ecx, ln
    and ecx, 3          ; leftover bytes (ln mod 4)
    rep movsb

    ret
MemCopy ENDP
END

N1ghtm4r3

You are right about using C++ functions, but as I said, it's not just a matter of copying memory! I had to use exactly this code for a special case, and I shared it here in case someone finds it useful someday. That's all ;)

qWord

If speed is needed, the dynamically linked CRT is probably the best choice. By using it, your application will also pick up any speed gains from future instruction sets.
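
A minimal sketch of calling the dynamically linked CRT copy from MASM32. It assumes the MASM32 SDK's msvcrt.inc (pulled in by masm32rt.inc), which exposes the msvcrt.dll exports under a crt_ prefix with C-style prototypes. The buffer names are borrowed from the first post.

```asm
include \masm32\include\masm32rt.inc   ; also includes msvcrt.inc and msvcrt.lib

.data
szSrc db 20h dup(1)
szDst db 20h dup(0)

.code
start:
    ; Assumption: crt_memcpy is the MASM32 name for memcpy in msvcrt.dll.
    ; It is prototyped as a C (cdecl) function, so invoke cleans up the stack.
    invoke crt_memcpy, offset szDst, offset szSrc, sizeof szSrc
    invoke ExitProcess, 0
end start
```

Because the copy routine lives in the DLL, a newer msvcrt can change how it copies without the program being rebuilt.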

hutch--

I have a reasonably simple view on memory copy: if it's a terabyte it matters, if it's a megabyte it doesn't. If you have the registers to spare, a simple REP MOVSB does the job in most instances.


mov esi, src
mov edi, dst
mov ecx, cnt
rep movsb

dedndave

that's right
the real problem with trying to speed it up by using REP MOVSD is not dealing with the leftover count (mod 4) at the end
it is the fact that strings are not usually both dword aligned - that negates the speed advantage
you have a 1 in 16 chance of going really fast   :bg
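
One common way to soften the alignment problem is to copy a few leading bytes until the destination sits on a dword boundary, then switch to REP MOVSD. The routine below is only a sketch under that assumption: MemCopyAligned is a hypothetical name, not part of the MASM32 library, and it aligns the destination only, so the source may still be misaligned. No timings are claimed for it.

```asm
; Hypothetical helper, same argument order as the MemCopy posted earlier.
MemCopyAligned PROC USES esi edi, Source:DWORD, Dest:DWORD, ln:DWORD

    cld
    mov esi, Source
    mov edi, Dest
    mov edx, ln         ; edx = bytes still to copy

    mov ecx, edi
    neg ecx
    and ecx, 3          ; 0..3 bytes needed to reach a dword boundary
    cmp ecx, edx
    jbe @F
    mov ecx, edx        ; buffer shorter than the lead-in: copy it all bytewise
@@:
    sub edx, ecx
    rep movsb           ; lead-in bytes; edi is now dword aligned (or we are done)

    mov ecx, edx
    shr ecx, 2          ; whole dwords
    rep movsd

    mov ecx, edx
    and ecx, 3          ; leftover bytes (0..3)
    rep movsb

    ret
MemCopyAligned ENDP
```

If the source and destination addresses differ by a multiple of 4, aligning one aligns the other too; otherwise only the writes are aligned.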

jj2007

Quote from: dedndave on May 10, 2011, 03:36:09 AM
that's right
the real problem with trying to speed it up by using REP MOVSD is not dealing with the leftover count (mod 4) at the end
it is the fact that strings are not usually both dword aligned - that negates the speed advantage
you have a 1 in 16 chance of going really fast   :bg

Dave,
In real life, big buffers are 8- or 16-byte aligned, and then rep movsd is almost unbeatable. Remember the "Code location sensitivity of timings" thread? Look for MemCo1.

drizz

It looks exactly as if you disassembled the MS CRT library's memcpy function with IDA,
and then invoked it as stdcall, while the function is clearly cdecl.
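
For a cdecl function, the caller has to remove the arguments from the stack after the call. A sketch of how Main from the first post could do that, assuming _memcpy keeps the CRT's cdecl convention and its (dest, src, count) argument order:

```asm
Main proc
    ; cdecl: push arguments right to left, caller cleans up afterwards
    push 10             ; count
    push offset szSrc   ; source
    push offset szDst   ; destination
    call _memcpy
    add esp, 12         ; 3 dword arguments = 12 bytes
    ret
Main endp
```

Without the add esp, 12, the three pushed dwords are still on the stack when ret executes, so ret takes the wrong value as its return address.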

N1ghtm4r3

Yes, that's what I did :p