News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

MASM for FUN - #0.2 [string reversing]

Started by frktons, May 20, 2010, 08:24:50 PM

Previous topic - Next topic

frktons

Thinking about string reversion, I came up with this chunk of code:


  mov edx, p_str         ; Pointer to first 4 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 4 bytes string - the one that holds the result
  mov eax, [edx]
  bswap eax
  mov [ebx], eax


This is a special case, a 4 bytes string, but I was  thinking about how many ways
I could do this string reversion using stack, single bytes move, and so on.

Does anybody knows any other way to do it? Or a better/faster way?

Thanks

Frank


Mind is like a parachute. You know what to do in order to use it :-)

clive

Back with the 8086 I would have suggested XCHG against memory, but now that is hideously slow.

  mov edx, p_str         ; Pointer to first 8 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 8 bytes string - the one that holds the result
  mov eax, [edx]
  mov ecx, [edx+4]
  bswap eax
  bswap ecx
  mov [ebx], ecx
  mov [ebx+4], eax


A lot of optimizations would be workable if the string length was divisible by 4, watching for odd/even multiples. Others for 5 or 6 character strings.

Also do you want it to use separate in/out buffers, or swap in place?
It could be a random act of randomness. Those happen a lot as well.

frktons

Quote from: clive on May 20, 2010, 09:06:31 PM
Back with the 8086 I would have suggested XCHG against memory, but now that is hideously slow.

  mov edx, p_str         ; Pointer to first 8 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 8 bytes string - the one that holds the result
  mov eax, [edx]
  mov ecx, [edx+4]
  bswap eax
  bswap ecx
  mov [ebx], ecx
  mov [ebx+4], eax


A lot of optimizations would be workable if the string length was divisible by 4, watching for odd/even multiples. Others for 5 or 6 character strings.

Also do you want it to use separate in/out buffers, or swap in place?

That's correct indeed!  :P
If the string lenght is divisible by 4 the speed of reversing should be quite
high. Well it doesn't matter if the string is the same or a buffer-string to do the conversion,
it can be easily modified.
I used the following code for testing 1000 conversion, this is why I used 2 variables,
just to retain the original value [what a silly reason  :lol ]

! mov ecx, 1000
! mov edx, p_str
! mov ebx, p_str1

count_rev:

! mov eax, [edx]
! bswap eax
! mov [ebx], eax
! dec ecx
! jnz count_rev   


By the way, what kind of approach would you use for 5-7 bytes, or not multiple of 4?
Using the stack would be slow? How would you use it in case?

Thanks Clive

Frank

Mind is like a parachute. You know what to do in order to use it :-)

clive

Without much thought, I have this (ie not tested, could be broken)

  mov edx, p_str         ; Pointer to first 5 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 5 bytes string - the one that holds the result
  mov eax, [edx]
  mov cl, [edx+4]
  bswap eax
  mov [ebx], cl
  mov [ebx+1], eax

  mov edx, p_str         ; Pointer to first 6 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 6 bytes string - the one that holds the result
  mov eax, [edx]
  mov cx, [edx+4]
  xchg cl, ch
  bswap eax
  mov [ebx], cx
  mov [ebx+2], eax

  mov edx, p_str         ; Pointer to first 7 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 7 bytes string - the one that holds the result
  mov eax, [edx]
  mov ecx, [edx+4]
  bswap eax
  bswap ecx
  ror ecx,8
  mov [ebx], ecx
  mov [ebx+3], eax


For larger strings, I would assume I know the length, and then step backward 8 bytes at a time, and then handle the remainder.
Swapping in place, I would again handle as may 8 byte groups as possible, perhaps 4 bytes from each end until I met in the middle, and then code for the 1,2 or 3 byte remainder.
It could be a random act of randomness. Those happen a lot as well.

frktons

As far as I can understand these routines looks fine.  :U

Well, time to sleep.  :dazzled:

If somebody is awake and has any idea on how to exploit the full potential
of the stack for string reversing, he is welcome.

Frank
Mind is like a parachute. You know what to do in order to use it :-)

clive

Not sure the stack helps, you'd have to copy the data back off.

Here I process 8 bytes at a time

        .486
        .MODEL FLAT, C
        .CODE

        align 16

SwapString      PROC USES esi edi Src:PTR BYTE, Dst:PTR BYTE, Len:DWORD

        mov     esi,Src
        mov     edi,Dst
        mov     ecx,Len

        mov     byte ptr [edi+ecx],0 ; guarantee a NUL at the end of the string

        sub     ecx,8
        jb      swap_final

        align   16

swap_loop:

        mov     eax,[esi]
        mov     edx,[esi+4]
        bswap   eax
        bswap   edx
        add     esi,8
        mov     [edi+ecx],edx
        mov     [edi+ecx+4],eax
        sub     ecx,8
        jnb     swap_loop

swap_final:

        add     ecx,8
        jmp     swap_tbl[ecx * 4]

        align   4

swap_tbl:
        dd      offset swap_done
        dd      offset swap_1
        dd      offset swap_2
        dd      offset swap_3
        dd      offset swap_4
        dd      offset swap_5
        dd      offset swap_6
        dd      offset swap_7

        align   16

swap_1:
        mov     al,[esi]
        mov     [edi+ecx-1],al
swap_done:
        ret

        align   16

swap_2:
        mov     ax,[esi]
        xchg    al,ah
        mov     [edi+ecx-2],ax
        ret

        align   16

swap_3:
        mov     eax,[esi]
        bswap   eax
        shr     eax,8
        mov     [edi+ecx-3],al
        shr     eax,8
        mov     [edi+ecx-2],ax
        ret

        align   16

swap_4:
        mov     eax,[esi]
        bswap   eax
        mov     [edi+ecx-4],eax
        ret

        align   16

swap_5:
        mov     eax,[esi]
        mov     dl,[esi+4]
        bswap   eax
        mov     [edi+ecx-5],dl
        mov     [edi+ecx-4],eax
        ret

        align   16

swap_6:
        mov     eax,[esi]
        mov     dx,[esi+4]
        bswap   eax
        xchg    dl,dh
        mov     [edi+ecx-6],dx
        mov     [edi+ecx-4],eax
        ret

        align   16

swap_7:
        mov     eax,[esi]
        mov     edx,[esi+4]
        bswap   eax
        bswap   edx
        shr     edx,8
        mov     [edi+ecx-7],edx
        mov     [edi+ecx-4],eax
        ret

SwapString      ENDP

        END
It could be a random act of randomness. Those happen a lot as well.

frktons

Before going to dissect your last proposed routine, I'd like to
draw your attention to this section of your previous code:
Quote
  mov edx, p_str         ; Pointer to first 7 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 7 bytes string - the one that holds the result
  mov eax, [edx]
  mov ecx, [edx+4]
  bswap eax
  bswap ecx
  ror ecx,8
  mov [ebx], ecx
  mov [ebx+3], eax
When you
Quote
  mov ecx, [edx+4]
you get the last four bytes, 3 pertaining to the string and an extra one,
and with
Quote
  bswap ecx
the extra byte becomes the leftmost byte inside the ecx register,
after you
Quote
  ror ecx,8
and the extra byte moves 8 bit on the right,
but I was sure you would have done a
Quote
  rol ecx,8
in order to get rid of the extra byte when you copy on the rightmost
byte the content of eax
Quote
  mov [ebx+3], eax

unless I am misunderstanding something about how bytes move
inside and out memory/registers.

Can you please explain me what I've missed?

Thanks

Frank
Mind is like a parachute. You know what to do in order to use it :-)

clive

pstr db 'A','B','C','D','E','F','G',0

mov edx, p_str         ; Pointer to first 7 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 7 bytes string - the one that holds the result
  mov eax, [edx]
; eax = D/C/B/A
  mov ecx, [edx+4]
; ecx = 0/G/F/E
  bswap eax
; eax = A/B/C/D
  bswap ecx
; ecx = E/F/G/0
  ror ecx,8
; ecx = 0/E/F/G
  mov [ebx], ecx
; str1+0 = 'G'
; str1+1 = 'F'
; str1+2 = 'E'
; str1+3 = 0 ; overwriten next
  mov [ebx+3], eax
; str1+3 = 'D'
; str1+4 = 'C'
; str1+5 = 'B'
; str1+6 = 'A'
It could be a random act of randomness. Those happen a lot as well.

frktons

So the bytes are written in this order inside the registers:

mov edx, p_str         ; Pointer to first 7 bytes string - the one to reverse
  mov ebx, p_str1       ; Pointer to second 7 bytes string - the one that holds the result
  mov eax, [edx]
; eax = D/C/B/A
  mov ecx, [edx+4]
; ecx = 0/G/F/E


I had the silly idea that it was the reverse  for strings and this was valid only for
numbers :lol

When I'll be back from the week-end, I'll give it a try.  :P

Thanks Clive for your patience  :U

Frank
Mind is like a parachute. You know what to do in order to use it :-)

jj2007

Something short and crispy?

include \masm32\include\masm32rt.inc

.data
TheString db "zyxwvutsrqponmlkjihgfedcba", 0

.code
start: push offset TheString
call RevString
print "abcdefghijklmnopqrstuvwxyz", 13, 10 ; for comparison
print offset TheString
getkey
exit

RevString proc
push esi
push edi
mov esi, [esp+4+8]
mov edi, esi
add edi, len(edi) ; you must know the end of the string
shr eax, 3 ; len div 8 - we use dwords and meet in the middle
je @F
.Repeat
sub edi, 4
mov ecx, [esi]
mov edx, [edi]
bswap ecx
bswap edx
mov [edi], ecx
mov [esi], edx
add esi, 4
dec eax
.Until Zero?
@@: dec edi
cmp edi, esi
jbe @F
movzx eax, byte ptr [esi]
movzx edx, byte ptr [edi]
mov [esi], dl
mov [edi], al
inc esi
jmp @B
@@:
pop edi
pop esi
ret 4
RevString endp
end start

hutch--

Frank,

Have a look at the masm32 library source file szRev.asm. It swaps pairs from opposite ends until it reaches the middle. It seems to run OK.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

Celeron M:
217     cycles for RevString
.sogla gnitset sa hcus ,sesoprup fo yteirav a rof sevres taht ,gnol sretcarahc 001 ,gnirts a si sihT
377     cycles for szRev
This is a string, 100 characters long, that serves for a variety of purposes, such as testing algos.
:thumbu

dedndave

a little modification on Jochen's code   :P
i don't know if it'll make a big difference - lol
what is slowing us down is mis-aligned dword accesses - there's probably no good way around that
;-----------------------------------------------------------------------------------------------

RevStr  proc

        push    edi
        push    esi
        mov     edi,[esp+4+8]
        mov     esi,edi
        add     edi,len(edi)                ; you must know the end of the string
        shr     eax,3                       ; len div 8 - we use dwords and meet in the middle
        jz      RevSt2

RevSt0: sub     edi,4
        mov     ecx,[esi]
        mov     edx,[edi]
        bswap   ecx
        bswap   edx
        mov     [edi],ecx
        mov     [esi],edx
        dec     eax
        lea     esi,[esi+4]
        jnz     RevSt0

        jmp short RevSt2

RevSt1: movzx   edx,byte ptr [edi]
        movzx   eax,byte ptr [esi]
        mov     [esi],dl
        mov     [edi],al
        inc     esi

RevSt2: dec     edi
        cmp     edi,esi
        ja      RevSt1

        pop     esi
        pop     edi
        ret     4

RevStr  endp

;-----------------------------------------------------------------------------------------------


jj2007

Quote from: dedndave on May 21, 2010, 06:09:45 PM
oops

? ::)

223     cycles for RevString
..sogla ym gnitset sa hcus ,s
207     cycles for RevString2
This is a string, 100 charact
224     cycles for RevStr
..sogla ym gnitset sa hcus ,s
380     cycles for szRev
This is a string, 100 charact

Sizes:
64      RevString
64      RevString2
65      RevStr
72      szRev


I guess RevString2 will make it into the MasmBasic library :bg

By the way: The Masm32 lib szRev triggers an exception for string lens below 2, so you better check for the len before using it.