Title: Strings?
Post by: 2-Bit Chip on November 17, 2009, 03:11:36 AM

Why are there mnemonics for strings like: LODS, LODSB, REP, STOS, STOSB?

Can't a few simple movs do the same job?

mov al, byte ptr [esi + ecx]
... ; edit the character, do work..
mov byte ptr [edi + ecx], al
inc ecx

Title: Re: Strings?
Post by: dedndave on November 17, 2009, 03:20:47 AM

the string operations can be very fast
especially when you want to copy a large section of data or clear out a large area of memory
the ESI register points to the source and EDI points to the destination - they are incremented or decremented automatically for you
the ECX register holds the count if a REP prefix is used (REP repeats until ECX reaches 0; REPZ/REPE also stops once the zero flag is clear, REPNZ/REPNE once it is set)
the direction flag controls whether the pointers move up (CLD) or down (STD)
you can move/scan/compare/load/store bytes, words, or dwords
there are also string I/O instructions (INS/OUTS) - somewhat useless
for some of the instructions, the AL/AX/EAX register is used for data
from Randy Hyde's Art of Assembly:
http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-4.html#HEADING4-162
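
For readers who think more easily in C, here is a rough model of what the post above describes: ESI/EDI stepping automatically, ECX counting down under a REP prefix, and the direction flag choosing the step. It is a sketch, not an emulator. The `Regs` struct, the flat `mem` array and both function names are made up for illustration, and every flag except ZF is ignored.

```c
#include <stdint.h>

/* Hypothetical register file: only what the string instructions touch. */
typedef struct {
    uint32_t esi;  /* source offset into mem */
    uint32_t edi;  /* destination offset into mem */
    uint32_t ecx;  /* repeat count */
    int      df;   /* direction flag: 0 after CLD (up), 1 after STD (down) */
} Regs;

/* Model of REP MOVSB: copy ECX bytes from [ESI] to [EDI]. */
void rep_movsb(uint8_t *mem, Regs *r)
{
    uint32_t step = r->df ? (uint32_t)-1 : 1u;  /* -1 wraps, as on the CPU */
    while (r->ecx != 0) {
        mem[r->edi] = mem[r->esi];
        r->esi += step;
        r->edi += step;
        r->ecx--;
    }
}

/* Model of REPNE SCASB: scan from [EDI] until a byte equals AL or ECX
   runs out. Returns the final zero flag (1 = a match was found). */
int repne_scasb(const uint8_t *mem, Regs *r, uint8_t al)
{
    uint32_t step = r->df ? (uint32_t)-1 : 1u;
    int zf = 0;  /* real hardware leaves ZF untouched if ECX starts at 0 */
    while (r->ecx != 0) {
        zf = (mem[r->edi] == al);
        r->edi += step;
        r->ecx--;
        if (zf)
            break;
    }
    return zf;
}
```

With `df = 0`, `esi = 0`, `edi = 8` and `ecx = 4`, `rep_movsb` copies four bytes and leaves `esi = 4`, `edi = 12`, `ecx = 0`, the same end state the real instruction produces.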

Title: Re: Strings?
Post by: 2-Bit Chip on November 17, 2009, 03:33:46 AM

Oh! That is neat! One instruction can do the work of three different ones. (REPNZ) :dance:

Title: Re: Strings?
Post by: dedndave on November 17, 2009, 03:41:42 AM

you should play with them a little bit - lol
here is an example - i want to make a copy of a string...

cld
mov esi,offset source_string
mov edi,offset destination_string
mov ecx,number_of_bytes
rep movsb

it's a little faster for copying words or dwords, i think - it was faster on an 8088 to copy words, at least

here is another example - i want to clear out 32 KB of memory...

cld
mov edi,offset memory_to_clear
xor eax,eax
mov ecx,8192 ;8192 dwords = 32 KB
rep stosd
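
The count in the second example can be checked with a small C model. This is only a sketch of REP STOSD with the direction flag clear. `rep_stosd` and `dwords_to_bytes` are hypothetical names, not anything from MASM or the masm32 library.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of REP STOSD after CLD: store EAX into ECX consecutive dwords. */
void rep_stosd(uint32_t *edi, uint32_t eax, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = eax;   /* store, then advance EDI by 4 bytes */
        ecx--;
    }
}

/* Bytes covered by a dword count: 8192 dwords * 4 = 32768 bytes (32 KB). */
size_t dwords_to_bytes(size_t dwords)
{
    return dwords * sizeof(uint32_t);
}
```

Calling `rep_stosd(buf, 0, 8192)` on a 32768-byte buffer zeroes all of it, which matches the ECX value used in the assembly.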

Title: Re: Strings?
Post by: 2-Bit Chip on November 17, 2009, 04:09:35 AM

With just simple mnemonics, I can create this:

UpThree proc uses esi edi edx ecx lpszSrc:DWORD, lpszDest:DWORD, dwCount:DWORD

    mov esi, lpszSrc
    mov edi, lpszDest
    mov edx, dwCount
    xor ecx, ecx
    mov al, 3
@@:
    mov ah, byte ptr [esi + ecx]
    cmp ah, 0
    jz @F
    add ah, al
    mov byte ptr [edi + ecx], ah
    inc ecx
    cmp ecx, edx
    je @F
    jmp @B
@@:
    ret

UpThree endp

I just can't understand how to optimize it with those higher mnemonics (rep, lodsb)

Title: Re: Strings?
Post by: dedndave on November 17, 2009, 04:55:42 AM

i am not sure the string instructions can be applied here - at least, not in a way that makes things go faster
you could use lodsb and stosb for single bytes, but without the REP prefix, they are kinda slow
one thing i see is the way you maintain the loop count and branch at the end of the loop
the ECX register is traditionally used as a count register, so....

mov ecx,dwCount
.
.
loop_start:
.
.
dec ecx
jnz loop_start

that eliminates the need to compare ECX with EDX

the processor is happy when moving data in and out of AL, as opposed to AH
also - the base+index addressing slows you down a little....

mov esi, lpszSrc
mov edi, lpszDest
mov ecx, dwCount
mov ah, 3
@@:
mov al,[esi]
or al,al
jz @F
add al,ah
inc esi
mov [edi],al
inc edi
dec ecx
jnz @B
@@:
ret

you could make the thing run faster by accessing all data in 4-aligned dwords
it would take a lot more code, though - you have to sort out the first few bytes until you are 4-aligned
then, load dwords and, in register, sort out if any of the bytes are 0
then, add 3 to 4 bytes at a time and store them as (again, 4-aligned) dwords
you can see where the code gets messy - but it could make the routine run quite a bit faster
it would take 3 loops
one to handle a few bytes at the beginning
one to handle the bulk of the string in dwords
and another to handle any leftover (possibly misaligned) bytes at the end
the fact that you look for a null terminator OR a terminal count really throws a wrench in the works - lol
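
The three-loop idea can be sketched in C to show where the mess comes from. This is one possible reading of the description above, not dedndave's code. All names are hypothetical, the dword loads and stores go through `memcpy` so the destination's alignment does not matter in C, and it assumes the source buffer is readable for the full `count` bytes. Like the original UpThree, it does not write a terminator. The return value (bytes processed) is an addition for testing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes in x is zero (well-known bit trick). */
static int has_zero_byte(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Add 3 to each byte of x without carries leaking into the next byte. */
static uint32_t add3_per_byte(uint32_t x)
{
    uint32_t low7 = (x & 0x7F7F7F7Fu) + 0x03030303u; /* low 7 bits + 3 */
    return low7 ^ (x & 0x80808080u);                 /* fold bit 7 back in */
}

size_t up_three(const unsigned char *src, unsigned char *dst, size_t count)
{
    size_t i = 0;

    /* Loop 1: single bytes until the source pointer is 4-aligned. */
    while (i < count && ((uintptr_t)(src + i) & 3u) != 0) {
        if (src[i] == 0)
            return i;
        dst[i] = (unsigned char)(src[i] + 3);
        i++;
    }

    /* Loop 2: whole dwords, stopping before any dword holding a zero. */
    while (count - i >= 4) {
        uint32_t w;
        memcpy(&w, src + i, 4);
        if (has_zero_byte(w))
            break;
        w = add3_per_byte(w);
        memcpy(dst + i, &w, 4);
        i += 4;
    }

    /* Loop 3: remaining bytes, up to the terminator or the count. */
    while (i < count && src[i] != 0) {
        dst[i] = (unsigned char)(src[i] + 3);
        i++;
    }
    return i;
}
```

The per-byte add is the part that has no direct equivalent in the byte loop: a plain 32-bit `add` of 03030303h would let a carry out of one byte corrupt its neighbour, so the top bit of each byte is handled separately.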

Title: Re: Strings?
Post by: 2-Bit Chip on November 17, 2009, 05:39:09 AM

Oh! I see that "dec" can set a zero-flag.

Quote: the base+index addressing slows you down a little.

Yeah, it does. :red

Are you packing data (4-aligned) to speed things up? Because that is what it looks like.

Title: Re: Strings?
Post by: dedndave on November 17, 2009, 05:40:57 AM

no - i just simplified your code - i am not packing anything - lol
the loop i posted is about as good as it will get while accessing the data as bytes
you have two different strings - one source - one destination
if one is aligned and the other is not, your underwear will get all bunchy - lol

Title: Re: Strings?
Post by: hutch-- on November 17, 2009, 06:19:10 AM

Beware of the old string instructions: unless used in a very limited way they can be very slow. There is special-case circuitry for REP MOVSB/MOVSW/MOVSD and a few others, but used individually they are way off the pace.

In most instances incremented pointer code is faster and even the special case circuitry of REP MOVSD can be beaten by MMX/XMM instructions.

Title: Re: Strings?
Post by: ecube on November 17, 2009, 06:21:52 AM

dec ecx
jnz @B

nice trick, does it have to be ecx or can it be any register?

Title: Re: Strings?
Post by: hutch-- on November 17, 2009, 07:28:57 AM

At a byte level here is the masm32 library procedure to do it.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

OPTION PROLOGUE:NONE 
OPTION EPILOGUE:NONE 

comment * -----------------------------------------------
        copied length minus terminator is returned in EAX
        ----------------------------------------------- *
align 4

szCopy proc src:DWORD,dst:DWORD

    push ebp
    push esi

    mov edx, [esp+12]
    mov ebp, [esp+16]
    mov eax, -1
    mov esi, 1

  @@:
    add eax, esi
    movzx ecx, BYTE PTR [edx+eax]
    mov [ebp+eax], cl
    test ecx, ecx
    jnz @B

    pop esi
    pop ebp

    ret 8

szCopy endp

OPTION PROLOGUE:PrologueDef 
OPTION EPILOGUE:EpilogueDef 

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

Title: Re: Strings?
Post by: jj2007 on November 17, 2009, 08:07:44 AM

Quote from: E^cube on November 17, 2009, 06:21:52 AM
dec ecx
jnz @B

nice trick,does it have to be ecx or can be any register?

It works with any register.
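
As a rough C analogy for the dec/jnz idiom: DEC sets the zero flag from its result whichever register it is applied to, so the pair behaves like a do-while that tests the counter after decrementing it. The sketch below is an illustration only, and `sum_countdown` is a made-up name. It also makes one consequence visible: the body runs before the test, so a starting count of 0 would wrap around instead of skipping the loop.

```c
#include <stdint.h>

/* Sum the first n elements with a loop shaped like
   "dec reg / jnz loop_start". The caller must pass n >= 1:
   with n == 0 the counter wraps and the loop does not stop in time. */
uint32_t sum_countdown(const uint32_t *p, uint32_t n)
{
    uint32_t total = 0;
    do {
        total += *p++;
    } while (--n != 0);   /* dec n ; jnz loop_start */
    return total;
}
```

Any variable can play the role of `n` here, just as any general-purpose register can in the assembly version.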

Title: Re: Strings?
Post by: RuiLoureiro on November 18, 2009, 04:18:54 PM

hutch,
Why use PUSH ESI, MOV EAX, -1, MOV ESI, 1 and POP ESI?
It could be:

; note:  copied length minus terminator is returned in EAX
OPTION PROLOGUE:NONE 
OPTION EPILOGUE:NONE
szCopy proc src:DWORD,dst:DWORD

    push ebp

    mov edx, [esp+8]                 ; src
    mov ebp, [esp+12]                ; dst
    xor   eax, eax
   @@:
    movzx ecx, BYTE PTR [edx+eax]
    mov [ebp+eax], cl
    add   eax, 1
    test ecx, ecx
    jnz @B
    sub    eax, 1
    pop ebp

    ret 8

szCopy endp

OPTION PROLOGUE:PrologueDef 
OPTION EPILOGUE:EpilogueDef 
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

Rui

Title: Re: Strings?
Post by: MichaelW on November 18, 2009, 08:27:46 PM

Rui,

You have an error in the instructions that access the parameters:

mov edx, [esp+12]
mov ebp, [esp+16]

Because you have only one push at the top of the procedure, they should be:

mov edx, [esp+8]
mov ebp, [esp+12]

Title: Re: Strings?
Post by: RuiLoureiro on November 18, 2009, 09:51:51 PM

Hi MichaelW,
Yes i know, that should be

mov edx, [esp+8] ; src
mov ebp, [esp+12] ; dst

i used copy-paste and forgot to change the argument offsets
Rui

Title: Re: Strings?
Post by: hutch-- on November 18, 2009, 10:09:49 PM

Rui,

    mov eax, -1
    mov esi, 1

The "mov eax, -1" could be slightly shorter with "or eax, -1" but it hardly matters.

Presetting EAX with -1 and putting the ADD EAX before the byte copy means you don't have to correct the result on exit from the loop.

Using ESI to store 1 allows you to do the ADD on a register to register which is faster on some hardware than using an immediate. "add eax, esi"
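
The index bookkeeping in the two szCopy variants may be easier to compare in C. The two functions below are loose models of the loops posted in this thread, with hypothetical names. They say nothing about speed, only about why one version needs a final correction and the other does not.

```c
#include <stddef.h>

/* Index starts at -1 and is bumped BEFORE the copy, so on exit it
   already equals the length without the terminator. */
size_t sz_copy_preinc(const char *src, char *dst)
{
    size_t i = (size_t)-1;      /* mov eax, -1 */
    char c;
    do {
        i += 1;                 /* add eax, esi (esi holds 1) */
        c = src[i];
        dst[i] = c;
    } while (c != 0);
    return i;
}

/* Index starts at 0 and is bumped AFTER the copy, so it ends one past
   the terminator and has to be corrected before returning. */
size_t sz_copy_postinc(const char *src, char *dst)
{
    size_t i = 0;               /* xor eax, eax */
    char c;
    do {
        c = src[i];
        dst[i] = c;
        i += 1;                 /* add eax, 1 */
    } while (c != 0);
    return i - 1;               /* sub eax, 1 */
}
```

Both return 5 for the input "hello" and both copy the terminator. The only difference is the one extra subtraction after the loop.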

The MASM Forum Archive 2004 to 2012
