Question[s] about lstrcat routine (string concatenation)

Started by disintx, June 20, 2009, 08:55:57 AM

disintx

*edit* this was a really retarded question to ask, but it was early morning..heh.

My code is like so, just for your reference (I never use invoke :P)
push offset TheText
push offset UserGreeting
call lstrcat
push offset UserName
push offset UserGreeting
call lstrcat


TheText = "Hi there, " (0 terminated)
UserName is the result of the GetUserName API, so it is also 0 terminated.

I don't like having to call it twice, but since I'm not sure what a better solution is, I'm doing it anyway for now.
I tried writing my own string concatenation routine and failed... kind of miserably. So I decided to use lstrcat and step through it in OllyDbg to see what it does and replicate it.
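
For reference, one way to avoid the two calls is a single routine that appends both strings in one pass. The sketch below is in C rather than assembly, and it is only an illustration: the name concat2, its parameters, and its return convention are hypothetical and are not part of the Win32 API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Win32 function): writes a followed by b
   into dst in one call. Returns 0 on success, or -1 if the result,
   including its terminator, does not fit in dst_size bytes. */
int concat2(char *dst, size_t dst_size, const char *a, const char *b)
{
    assert(dst != NULL && a != NULL && b != NULL);

    size_t la = strlen(a);
    size_t lb = strlen(b);

    /* Reject anything that would overflow the destination buffer. */
    if (la >= dst_size || lb >= dst_size - la)
        return -1;

    memcpy(dst, a, la);            /* first string, without its terminator */
    memcpy(dst + la, b, lb + 1);   /* second string, terminator included */
    return 0;
}
```

Called as concat2(UserGreeting, size_of_buffer, TheText, UserName), this would build "Hi there, " plus the user name in one step, and unlike lstrcat it checks the buffer size first.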

Keep in mind that my analysis comes from live debugging rather than a static disassembly (IDA etc.), which may have caused me some confusion.
PUSH 8
PUSH kernel32.75159D28
CALL kernel32.751412D0

AND DWORD PTR SS:[EBP-4],0
MOV EAX,DWORD PTR SS:[EBP+C] ;
MOV EDX,EAX ; puts the address of ascii "sea" into edx

@@:
MOV CL,BYTE PTR DS:[EAX] ; moves a byte of the string into cl
INC EAX ; increases eax by one (next byte in the string)
TEST CL,CL ; is cl 0 (NULL, end of string)
JNZ SHORT @b ; if no, go back to @@, repeating ^
SUB EAX,EDX ; subtract the start addr of "sea" (edx) from the advanced pointer (eax).
; if my analysis is correct, this is a string-length
; operation, since eax == edx before the loop.
; eax should be 4 (3 chars + null terminator)

MOV ESI,EDX ; move the addr of "sea" into ESI
MOV EDX,EAX ; copy string length of ascii "sea" into edx (4)
MOV EDI,DWORD PTR SS:[EBP+8] ; puts address of ascii "Hi there, " from stack into edi
;question 2
DEC EDI ; decrements edi by 1...so next instruction can point to H
LEA EAX,DWORD PTR DS:[EDI+1] ; loads address of [edi+1] into eax

@@:
MOV CL,BYTE PTR DS:[EDI+1] ; moves the first byte of "Hi there, " (H) into cl
INC EDI ; increments edi
TEST CL,CL ; is cl 0 (NULL, end of string)
JNZ SHORT @b ; if no, go back to @@

;question 3
MOV ECX,EDX ; puts strlength of ascii "sea" into ecx
SHR ECX,2 ; equivalent to ecx/4
; so ecx == 1
; i'm stumped on this next instruction.
; i understand that movs copies a byte AT esi INTO the address AT edi.
; what i don't understand is how within one 1 rep they move
; all 4 bytes ;_;
REP MOVS DWORD PTR ES:[EDI],DWORD PTR DS:[ESI]

MOV ECX,EDX ; ecx = 4
AND ECX,3 ; 4 AND 3 = 0
REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI] ; so nothing happens here
MOV DWORD PTR SS:[EBP-4],-2 ; why ?
CALL kernel32.75141315 ; restores registers
RETN 8 ; returns


Now, onto my actual questions (sorry for making you read so much!)...

  • What is a better way to put these two strings together?
  • If you read my analysis of lstrcat (search for ;question 2), why does lstrcat decrement EDI and then add one to it for the next instruction? Is it just bad compiler logic or am I missing something simple?
  • If you read my analysis of lstrcat (search for ;question 3), you will have seen that EDX had a value of 4. That value is moved into ECX and then SHR'd by 2, which is a division by 4, leaving ECX with a value of 1. Why did REP MOVS copy all four bytes ("sea" + null)?

Thanks for reading and hopefully answering some of my questions :) It's a bit early in the morning here so I'm a bit tired and might not be thinking clearly, so I apologize in advance for any frustrations I cause.
-d

ToutEnMasm


Dividing by 4 is an optimization: the routine copies dwords instead of bytes, which is much faster.
The dec edi is just there so that the copied string lands exactly at the end of the first string.
I assume you saw that the routine starts by copying dwords, then bytes. That way exactly the number of bytes needed for the concatenation is copied.
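
Read as C, the traced routine appears to do roughly the following. This is a sketch under assumptions, not the actual kernel32 source: the name my_lstrcat is hypothetical, the exception-handling setup and teardown calls are left out, and the end-of-string scan is written as a plain loop because C does not allow pointing one byte before the buffer the way DEC EDI does. Each dword step copies 4 bytes, which is why a count of 1 is enough for "sea" plus its terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C rendering of the traced routine. The caller must make
   sure dst has room for the combined string. */
char *my_lstrcat(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* First loop: count the bytes of src, terminator included. */
    const char *p = src;
    while (*p++ != '\0')
        ;
    size_t count = (size_t)(p - src);      /* "sea" gives 4 */

    /* Second loop: stop AT dst's terminator, not one byte past it.
       The assembly gets the same result with DEC EDI, then reading
       [EDI+1] before each INC EDI. */
    char *end = dst;
    while (*end != '\0')
        end++;

    size_t dwords = count >> 2;            /* SHR ECX,2 */
    size_t tail   = count & 3;             /* AND ECX,3 */

    /* REP MOVS DWORD: every iteration moves 4 bytes. */
    for (size_t i = 0; i < dwords; i++) {
        memcpy(end, src, 4);
        end += 4;
        src += 4;
    }

    /* REP MOVS BYTE: the 0 to 3 leftover bytes. */
    for (size_t i = 0; i < tail; i++)
        *end++ = *src++;

    return dst;
}
```

Because the copy starts on top of the old terminator and the count includes the new one, the result is a single 0 terminated string.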


ToutEnMasm

    The number of bytes to copy is divided by 4. The result is a number of dwords plus a few leftover bytes (3 at most).
    movsb and stosb are slower than movsd and stosd.
If the string is 42 bytes:
  with bytes you do 42 movsb
  with dwords you do 10 movsd + 2 movsb
that is a saving of 30 operations.
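
The split described above can be checked with a few lines of C. This is only an illustration of the arithmetic; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many dword copies and how many leftover
   byte copies are needed to move n bytes. */
struct copy_plan {
    size_t dwords;   /* n / 4, computed as a shift like SHR ECX,2 */
    size_t bytes;    /* n % 4, computed as a mask like AND ECX,3 */
};

struct copy_plan plan_copy(size_t n)
{
    struct copy_plan plan;
    plan.dwords = n >> 2;
    plan.bytes  = n & 3;
    /* The two parts together always cover exactly n bytes. */
    assert(plan.dwords * 4 + plan.bytes == n);
    return plan;
}
```

For n = 42 this gives 10 dwords and 2 bytes, so 12 copy steps instead of 42. For n = 4 (the "sea" case) it gives 1 dword and 0 bytes, so the byte copy does nothing.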

ramguru

I thought you were just saying like:
div 4 is better instruction than shr ecx, 2 ( :green )
My mistake .. need to improve my perception :}

disintx

Quote from: ToutEnMasm on June 20, 2009, 11:59:27 AM
    The number of bytes to copy is divided by 4. The result is a number of dwords plus a few leftover bytes (3 at most).
    movsb and stosb are slower than movsd and stosd.
If the string is 42 bytes:
  with bytes you do 42 movsb
  with dwords you do 10 movsd + 2 movsb
that is a saving of 30 operations.


Thank you for your reply, I get it now. Much appreciated!