memcpy

N1ghtm4r3 · May 09, 2011, 12:34:48 PM

C++ memcpy function...

Example:

include \masm32\include\masm32rt.inc

include _memcpy.asm

Main PROTO

.data
szSrc	db	20h  dup(1)
szDst	db	20h  dup(0)

.code

Start:
  Invoke Main
  Invoke ExitProcess,0

Main proc 
		
	push 10
	push offset szSrc
	push offset szDst
	call _memcpy

	ret
Main endp

end Start

dedndave · May 09, 2011, 01:01:52 PM

:bg

        mov     esi,offset szSrc
        mov     edi,offset szDst
        mov     ecx,sizeof szSrc
        rep     movsb

N1ghtm4r3 · May 09, 2011, 01:09:50 PM

sure! :bg
but the code I posted might be useful in some special situations, as it was for me today :wink

brethren · May 09, 2011, 01:29:12 PM

you can use C++ functions... or you can just roll your own

.code
MemCopy PROC USES esi edi, Source:DWORD, Dest:DWORD, ln:DWORD

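	cld				; clear direction flag: copy forward
	mov esi, Source
	mov edi, Dest
	mov ecx, ln
	shr ecx, 2		; ecx = number of whole dwords
	rep movsd		; copy 4 bytes at a time

	mov ecx, ln
	and ecx, 3		; ecx = remaining 0 to 3 bytes
	rep movsb		; copy the tail byte by byte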

	ret
MemCopy ENDP
END
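
For readers who think in C, the dword-then-tail idea above can be sketched roughly as follows. This is only an illustration, not the CRT's implementation: the name `mem_copy_sketch` is made up here, it keeps MemCopy's source-first argument order, and like `rep movsd` with the direction flag cleared it assumes the buffers do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C counterpart of MemCopy: copy len/4 dwords, then the
   remaining len%4 bytes. The buffers must not overlap. */
void mem_copy_sketch(const void *source, void *dest, size_t len)
{
    const unsigned char *s = source;
    unsigned char *d = dest;
    size_t dwords = len >> 2;   /* shr ecx, 2 */
    size_t tail = len & 3;      /* and ecx, 3 */

    /* Sanity check: the two parts add up to the full length. */
    assert(len == dwords * 4 + tail);

    while (dwords--) {          /* rep movsd */
        uint32_t tmp;
        memcpy(&tmp, s, sizeof tmp);  /* 4-byte load, safe for any alignment */
        memcpy(d, &tmp, sizeof tmp);  /* 4-byte store, safe for any alignment */
        s += 4;
        d += 4;
    }
    while (tail--) {            /* rep movsb */
        *d++ = *s++;
    }
}
```

Calling `mem_copy_sketch(src, dst, 11)` would copy two dwords and then three single bytes.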

N1ghtm4r3 · May 09, 2011, 05:51:53 PM

About using C++ functions you are right, but as I said, it's not just a matter of copying memory! I had to use exactly this code for a special case, and I just shared it here in case someone finds it useful someday. That's all ;)

qWord · May 09, 2011, 06:00:33 PM

If speed is needed, the dynamically linked CRT is probably the best choice. Using it, your application will also benefit from speed gains from future instruction sets.
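
As a rough sketch of what relying on the CRT looks like from C, the snippet below repeats the example from the first post (two 20h-byte buffers, 10 bytes copied) with the standard `memcpy`. The function name `copy_first_ten` is invented for this illustration. Note the argument order: destination first, then source, then count, which is why the assembly version pushes the count first.

```c
#include <assert.h>
#include <string.h>

unsigned char szSrc[0x20];
unsigned char szDst[0x20];

/* Hypothetical helper mirroring the first post's Main procedure. */
void copy_first_ten(void)
{
    memset(szSrc, 1, sizeof szSrc);   /* szSrc db 20h dup(1) */
    memset(szDst, 0, sizeof szDst);   /* szDst db 20h dup(0) */

    memcpy(szDst, szSrc, 10);         /* destination, source, byte count */

    /* Only the first 10 bytes of the destination should have changed. */
    assert(szDst[9] == 1 && szDst[10] == 0);
}
```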

hutch-- · May 10, 2011, 01:00:34 AM

I have a reasonably simple view on memory copy: if it's a terabyte it matters, if it's a megabyte it doesn't. If you have the registers to spare, a simple REP MOVSB does the job in most instances.


mov esi, src
mov edi, dst
mov ecx, cnt
rep movsb

dedndave · May 10, 2011, 03:36:09 AM

that's right
the real problem with trying to speed it up by using REP MOVSD is not dealing with the leftover count (mod 4) at the end
it is the fact that strings are not usually both dword aligned - that negates the speed advantage
you have a 1 in 16 chance of going really fast :bg
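
The "1 in 16" figure can be checked with a small C sketch, assuming the source and destination offsets within a dword are independent and equally likely. The helper names are invented for this illustration, and the pointer-to-integer test relies on the usual flat address mapping of mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when both addresses are multiples of 4 (dword aligned). */
int both_dword_aligned(const void *src, const void *dst)
{
    return (((uintptr_t)src | (uintptr_t)dst) & 3u) == 0;
}

/* Try all 16 combinations of (src mod 4, dst mod 4) and count the
   ones where both pointers are dword aligned. */
int aligned_combinations(void)
{
    static _Alignas(4) unsigned char a[8];
    static _Alignas(4) unsigned char b[8];
    int hits = 0;

    /* The base addresses are aligned by construction. */
    assert(both_dword_aligned(a, b));

    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            hits += both_dword_aligned(a + i, b + j);
    return hits;
}
```

Only the combination with both offsets at zero qualifies, so the function counts one hit out of 16.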

jj2007 · May 10, 2011, 02:24:42 PM

Quote from: dedndave on May 10, 2011, 03:36:09 AM
that's right
the real problem with trying to speed it up by using REP MOVSD is not dealing with the leftover count (mod 4) at the end
it is the fact that strings are not usually both dword aligned - that negates the speed advantage
you have a 1 in 16 chance of going really fast :bg

Dave,
In real life big buffers are 8- or 16-byte aligned - and then rep movsd is almost unbeatable. Remember the "Code location sensitivity of timings" thread...? Look for MemCo1.
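
One way to see this on a given system is to measure how far a freshly allocated heap block sits from an alignment boundary. The sketch below is an assumption-laden illustration: `heap_misalignment` is a made-up helper, and the result depends on the allocator in use, although typical `malloc` implementations return blocks aligned to at least 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `bytes` bytes and return the block's
   offset from the previous multiple of `alignment` (0 means aligned). */
size_t heap_misalignment(size_t bytes, size_t alignment)
{
    void *p = malloc(bytes);
    assert(p != NULL);

    size_t off = (size_t)((uintptr_t)p % alignment);
    free(p);
    return off;
}
```

For example, `heap_misalignment(1 << 20, 8)` reports the offset of a 1 MB buffer from an 8-byte boundary.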

drizz · May 10, 2011, 03:33:51 PM

It looks exactly as if you had disassembled the MS CRT library's memcpy function with IDA,
and then invoked it as stdcall while the function is clearly cdecl.

N1ghtm4r3 · May 10, 2011, 09:17:38 PM

Yes, that's what I did :p
