
Missing AsciiHexToDW in masm32 and

Started by ToutEnMasm, July 11, 2005, 01:26:46 PM


ToutEnMasm

Hello,

masm32 has no routine that does this:
   In:  address of a NULL-terminated hexadecimal ASCII string (with or without a trailing h or H)
   Out: eax holds the exact value of the string; 0C0000001h puts C0000001 in eax
        On failure the carry flag is set to 1 (eax == 0)
   Only hexadecimal characters are accepted (plus the h or H at the end); case does not matter (A = a, ...)
so I wrote one.
I also made a small change to dw2ah: C0000001 was translated to C0000001H instead of 0C0000001H, which causes an error in ML.
                                       ToutEnMasm

; #########################################################################

      .386                      ; force 32 bit code
      .model flat, stdcall      ; memory model & calling convention
      option casemap :none      ; case sensitive

       include \masm32\include\windows.inc
      include \masm32\include\kernel32.inc

      ; AsciiHexToDW PROTO :DWORD
comment µ
In:  address of a NULL-terminated hexadecimal ASCII string (with or without a trailing h or H)
Out: eax holds the exact value of the string; 0C0000001h puts C0000001 in eax
     On failure the carry flag is set to 1 (eax == 0)
Only hexadecimal characters are accepted (plus the h or H at the end); case does not matter (A = a, ...)
             µ

      ;samples of  extern definitions
      ; EXTERNDEF Copier :PROTO  :DWORD,:DWORD
      ; EXTERNDEF EtatMemoirePile :MEMORY_BASIC_INFORMATION
      ; EXTERNDEF PoriginePile:DWORD

    .code

; #########################################################################

AsciiHexToDW PROC uses esi lpBuffer:DWORD
    LOCAL retour:DWORD              ; value returned in eax
    LOCAL long:DWORD                ; number of characters left to convert
    LOCAL resultat:DWORD            ; running result
    LOCAL decale:DWORD              ; current shift count (0, 4, 8, ...)

    mov retour,0
    mov resultat,0
    mov decale,0
    invoke lstrlen,lpBuffer
    .if eax == 0
        stc                         ; empty string: error, carry flag set
        jmp FindeAsciiHexToDW
    .endif
    mov esi,lpBuffer
    mov long,eax
    add esi,eax                     ; start at the right and go to the left
    dec esi
    mov al,[esi]
    .if al == "h" || al == "H"      ; skip a trailing h or H
        dec esi
        dec long
        .if long == 0
            stc                     ; nothing but the suffix: error, carry flag set
            jmp FindeAsciiHexToDW
        .endif
    .endif
    .if long > 8
        mov long,8                  ; ignore leading digits that only format the text
    .endif
    ; from here on the string must contain only hex digits
newchar:
    xor eax,eax
    mov al,[esi]
    .if al > 96 && al < 103         ; a to f
        sub al,87
    .elseif al > 64 && al < 71      ; A to F
        sub al,55
    .elseif al > 47 && al < 58      ; 0 to 9
        sub al,48
    .else
        stc                         ; not a hex digit: error, carry flag set
        jmp FindeAsciiHexToDW
    .endif
    ; the character is now a number from 0 to 15
    mov ecx,decale
    shl eax,cl                      ; shift the digit into position
    add resultat,eax                ; add it to the result
    add decale,4
    dec long
    .if long == 0
        push resultat               ; all usable digits have been translated
        pop retour
        clc                         ; success: carry flag clear
        jmp FindeAsciiHexToDW
    .endif
    dec esi
    jmp newchar

    ; DWORD: 8 digits max

FindeAsciiHexToDW:
    mov eax,retour
    ret
AsciiHexToDW endp

; #########################################################################

; #########################################################################

dw2ah proc dwValue:DWORD, lpBuffer:DWORD

    ; -------------------------------------------------------------
    ; convert DWORD to hexadecimal ascii string
    ; dwValue is value to be converted
    ; lpBuffer is the address of the receiving buffer
    ; EXAMPLE:
    ; invoke dw2ah,edx,ADDR buffer
    ;
    ; lpBuffer must be at least 11 bytes long
    ; (leading "0", 8 hex digits, "H" and the terminating zero)
    ;
    ; Uses: eax, ecx.
    ;
    ;
    ; -------------------------------------------------------------

    mov ecx, lpBuffer
    add ecx, 9 ;8
    mov WORD PTR [ecx], 0048H   ; "H", 0  (Hex identifier and trailing zero)
    dec ecx
Convert:
    mov eax, dwValue
    and eax, 0FH            ; get digit
    .IF al < 10
        add al, "0"         ; convert digits 0-9 to ascii
    .ELSE
        add al, ("A"-10)    ; convert digits A-F to ascii
    .ENDIF
    mov BYTE PTR [ecx], al           
    dec ecx
    ror dwValue,4           ; shift in next hex digit
    .if ecx > lpBuffer       ; see if we have more to do
    jmp Convert
    .endif
    mov al,"0" ;starting with a letter make error with masm
    mov BYTE PTR [ecx], al               
    ret

dw2ah endp
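
For anyone who wants to compare the AsciiHexToDW contract above with a higher-level language, here is a minimal C sketch of the same right-to-left conversion. It is not part of the original post: the names `hex_digit_value` and `ascii_hex_to_dw` are made up for illustration, and a `bool` return value stands in for the carry flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';        /* same as c - 48 */
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;   /* same as c - 87 */
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;   /* same as c - 55 */
    return -1;
}

/* Returns true and stores the value in *out on success.
   Returns false (the analogue of carry set) and stores 0 on failure. */
bool ascii_hex_to_dw(const char *s, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    *out = 0;

    size_t len = strlen(s);
    if (len == 0)
        return false;                         /* empty string */
    if (s[len - 1] == 'h' || s[len - 1] == 'H') {
        if (--len == 0)
            return false;                     /* nothing but the suffix */
    }

    size_t digits = len > 8 ? 8 : len;        /* only the 8 rightmost digits count */
    uint32_t result = 0;
    unsigned shift = 0;
    for (size_t i = 0; i < digits; i++) {     /* walk from right to left */
        int v = hex_digit_value((unsigned char)s[len - 1 - i]);
        if (v < 0)
            return false;                     /* not a hex digit */
        result += (uint32_t)v << shift;
        shift += 4;
    }
    *out = result;
    return true;
}
```

With this sketch, `ascii_hex_to_dw("0C0000001h", &v)` succeeds and leaves C0000001h in `v`, because the leading zero falls outside the 8 rightmost digits.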


I just added the code formatting tags as they make your code easier to read.

Jimg

An old method of conversion is to just keep a running total in eax, which would eliminate the need for locals,  e.g.-
AsciiHexToDW2 proc lpBuffer:dword
xor eax,eax
mov edx,lpBuffer
mainloop:
movzx ecx,byte ptr [edx]
.if cl > 96 && cl < 103 ;a to f
sub cl,87
.elseif cl > 64 && cl < 71 ;A to F
sub cl,55
.elseif cl >47 && cl < 58 ;0 to 9
sub cl,48
.else
jmp done ; illegal character, or done because found an 'H' or zero byte terminator.
.endif
shl eax,4         ; times 16
add eax,ecx    ; plus new digit
add edx,1
jmp mainloop
done:
ret
AsciiHexToDW2 EndP


Just makes  it a little simpler and easier to maintain.  Also, as a general purpose routine, you should probably return an error condition if something is wrong, like getting passed too many hex characters.  To set an error condition, an old method is to just set the carry flag and the calling routine can just do a  'jc LocalErrorRoutine' after the call.
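
A rough C sketch of the running-total idea with the error reporting suggested above. The function name is hypothetical, `false` plays the role of the carry flag, and the policy of counting every digit (leading zeros included) towards the limit of 8 is an assumption of this sketch, not something the routines in this thread do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Returns false where the assembly version would set carry. */
bool hex_running_total(const char *s, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    uint32_t total = 0;
    int count = 0;                          /* hex digits consumed so far */

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        uint32_t digit;

        if (c >= '0' && c <= '9')      digit = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f') digit = (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = (uint32_t)(c - 'A' + 10);
        else if ((c == 'h' || c == 'H') && s[1] == '\0' && count > 0)
            break;                          /* optional suffix, only at the very end */
        else
            return false;                   /* illegal character */

        if (++count > 8)
            return false;                   /* too many digits for a DWORD */
        total = (total << 4) + digit;       /* times 16, plus new digit */
    }
    if (count == 0)
        return false;                       /* empty string */
    *out = total;
    return true;
}
```

The caller checks the return value much as the assembly caller would do a `jc LocalErrorRoutine` after the call.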

ToutEnMasm

Hello,
I have added the carry flag, so to track errors use
jnc or
.if CARRY?

Locals are a good way to make the code more readable. They also make for faster coding and fewer bugs.
   
                                             ToutEnMasm



Jimg

I think there is a small error in this part-

.if al > 96 && al < 103  ;a to f
sub al,87
.elseif al > 64 && al < 71 ;A to F
sub al,54     ;;-----  this should be sub al,55
.elseif al >47 && al < 58 ;0 to 9
sub al,48
.else

same error in the code I posted as I copied yours  :red

hutch--

Guys,

I am moving this topic to the Laboratory as it is better suited for that sub forum.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

ToutEnMasm

Hello again,
Thanks, I have corrected this.
                      ToutEnMasm


Jimg

Ok, this is the smallest, fastest routine I have come up with so far.  Anyone else like to try?

align 16
AsciiHexToDW7 proc lpBuffer:dword
xor eax,eax
mov edx,lpBuffer
jmp main2
mainloop:
doit:
shl eax,4 ; *16
add eax,ecx
next:
add edx,1
main2:
movzx ecx,byte ptr [edx]
cmp ecx,48
jb done ; below "0"
and ecx,0dfh ; clear bit 5: lower case becomes upper case
;cmp cl,"X" ; to allow 0x0000 format ;--- add in these two lines to allow 0x0000 format
;je next
sub ecx,16 ; assume 0-9  (0 thru 9 with bit masked off = 16 thru 25)
cmp ecx,9
jbe doit ; good, from "0" to "9"
sub ecx,39 ; assume A-F, convert to binary 10 thru 15
cmp cl,10
jb done ; invalid between "9" and "A"
cmp ecx,15
jbe doit ; else, no good, after "F"
done:
ret
AsciiHexToDW7 EndP



In this routine, an invalid character terminates the input string rather than returning an error, and it only returns the value of the last 8 valid hex characters if the input length was greater than 8.
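
The bit-masking steps can be followed more easily in a small C sketch. The function names are invented here, and the sketch mirrors the behaviour described above: it stops at the first invalid character and keeps only the last 8 digits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 0..15 for a hex digit, -1 for anything else. */
int hex_digit_masked(unsigned char ch)
{
    if (ch < '0')
        return -1;                /* below "0", this also catches the terminator */
    unsigned c = ch & 0xDFu;      /* clear bit 5: a-f become A-F, "0"-"9" become 16..25 */
    c -= 16;                      /* digits are now 0..9 */
    if (c <= 9)
        return (int)c;
    c -= 39;                      /* A-F are now 10..15; smaller values wrap around */
    if (c >= 10 && c <= 15)
        return (int)c;
    return -1;                    /* between "9" and "A", or after "F" */
}

/* Accumulates hex digits until the first character that is not one. */
uint32_t hex_prefix_value(const char *s)
{
    assert(s != NULL);
    uint32_t value = 0;
    int d;
    while ((d = hex_digit_masked((unsigned char)*s++)) >= 0)
        value = (value << 4) + (uint32_t)d;   /* older digits fall off the top */
    return value;
}
```

Because the accumulator is 32 bits wide, a ninth digit simply pushes the first one out, which is where the "last 8 valid hex characters" behaviour comes from.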

Edited:  slightly smaller & faster.
Edit2:  oops, typo.

[attachment deleted by admin]

Jimg

And, of course, the ever popular, much faster though much larger, table lookup:

   .data
    align 16
    ;    0-15 = good value.  254 = bad character, terminate.  255 = ok character, ignore.
    tbl \
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
        ; 0    1   2   3   4   5   6   7   8   9
      db  00, 01, 02, 03, 04, 05, 06, 07, 08, 09,254,254,254,254,254,254
        ;      A   B   C   D   E   F       H
      db 254, 10, 11, 12, 13, 14, 15,254,254,254,254,254,254,254,254,254
        ;                                  x
      db 254,254,254,254,254,254,254,254,255,254,254,254,254,254,254,254
        ;      A   B   C   D   E   F       H
      db 254, 10, 11, 12, 13, 14, 15,254,254,254,254,254,254,254,254,254
        ;                                  x
      db 254,254,254,254,254,254,254,254,255,254,254,254,254,254,254,254

      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
      db 254,254,254,254,254,254,254,254,254,254,254,254,254,254,254,254
.code
align 16
AsciiHexToDW8 proc lpBuffer:dword
    xor eax,eax
    mov edx,lpBuffer
    jmp main2
doit:
    shl eax,4   ; *16
    add eax,ecx
next:
    inc edx
main2:
    movzx ecx,byte ptr [edx]
    mov cl,tbl[ecx]
    cmp cl,254
    jb doit
;   cmp cl,255  ; to allow 'x' ( or just ja next)
;   je next
    ret
AsciiHexToDW8 EndP



Testing ffffffff
AsciiHexToDW5 4294967295  size=54  cycles=73
AsciiHexToDW6 4294967295  size=54  cycles=70
AsciiHexToDW7 4294967295  size=57  cycles=69
AsciiHexToDW8 4294967295  size=290  cycles=50

Testing 01234567
AsciiHexToDW5 19088743  size=54  cycles=56
AsciiHexToDW6 19088743  size=54  cycles=52
AsciiHexToDW7 19088743  size=57  cycles=52
AsciiHexToDW8 19088743  size=290  cycles=51

Testing 89ABCDEF
AsciiHexToDW5 2309737967  size=54  cycles=83
AsciiHexToDW6 2309737967  size=54  cycles=83
AsciiHexToDW7 2309737967  size=57  cycles=82
AsciiHexToDW8 2309737967  size=290  cycles=50

Testing abcdef
AsciiHexToDW5 11259375  size=54  cycles=57
AsciiHexToDW6 11259375  size=54  cycles=53
AsciiHexToDW7 11259375  size=57  cycles=53
AsciiHexToDW8 11259375  size=290  cycles=42

Testing fFaA9@
AsciiHexToDW5 1047209  size=54  cycles=52
AsciiHexToDW6 1047209  size=54  cycles=56
AsciiHexToDW7 1047209  size=57  cycles=50
AsciiHexToDW8 1047209  size=290  cycles=35

Testing 0f3h
AsciiHexToDW5 243  size=54  cycles=33
AsciiHexToDW6 243  size=54  cycles=33
AsciiHexToDW7 243  size=57  cycles=32
AsciiHexToDW8 243  size=290  cycles=25

AsciiHexToDW5 total time(cycles) = 354
AsciiHexToDW6 total time(cycles) = 347
AsciiHexToDW7 total time(cycles) = 338
AsciiHexToDW8 total time(cycles) = 253


Press enter to exit...



What's nice about this is it can be easily expanded to much larger number bases like base 36. :wink
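
To illustrate the base 36 remark, here is a hedged C sketch of the same table idea with the table filled in at run time. The names `digit_table`, `init_digit_table` and `parse_base` are invented for this example, 255 is used as the only "stop" marker (the table above uses 254 and 255), and the loops assume an ASCII character set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static unsigned char digit_table[256];   /* digit value, or 255 = stop */

/* Must be called once before parse_base. */
void init_digit_table(void)
{
    memset(digit_table, 255, sizeof digit_table);
    for (int i = 0; i < 10; i++)
        digit_table['0' + i] = (unsigned char)i;
    for (int i = 0; i < 26; i++) {        /* assumes A-Z and a-z are contiguous (ASCII) */
        digit_table['A' + i] = (unsigned char)(10 + i);
        digit_table['a' + i] = (unsigned char)(10 + i);
    }
}

/* Converts leading digits of s in the given base (2..36).
   Stops at the first character that is not a digit in that base. */
uint32_t parse_base(const char *s, unsigned base)
{
    assert(s != NULL && base >= 2 && base <= 36);
    uint32_t value = 0;
    for (;;) {
        unsigned d = digit_table[(unsigned char)*s++];
        if (d >= base)
            break;                        /* terminator or digit too large for base */
        value = value * base + d;
    }
    return value;
}
```

The inner loop is identical for every base; only the comparison against `base` decides which table entries count as digits.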

[attachment deleted by admin]

Ian_B

I just wrote this one for myself. Recently checking Agner's optimisation manual again, I discovered the use of CDQ and realised that I could remove a lot of jump-testing in my code with that using simple ANDs to choose whether to add a value or not, and it's a lot faster op than the SBB I was using before, even if it ties you down to using specific registers. I generally write loops that can work on two values at once using as many registers as possible intertwining nicely, to avoid stalls, but the CDQ version I tried to make of this was messy (needed XCHG as well) and I ended up with this which is much simpler.

The trick of this is subbing from and adding to the whole EAX/ECX register rather than just AL/CL to get the "carry bit" effect in AH/CH that you can AND with to create an addition or not directly and quickly. The only catch may be if there are stalls between saving values into EDX/EBX and accessing BL/DL so closely. I doubt whether it can compete with JimG's brute-force table lookup, but it's kinda neat and then again there are only 5 memory accesses total, and the penalties of getting a bad jump prediction doing all that jump-testing (is it A-F? is it a-f? is it 0-9?) are completely eliminated, so maybe. If anyone wants to comparison test it, I'd be interested to see the results. It's gotta be better than the HTODW in the MASM library anyway, with 8 byte-only memory accesses, 3 shifts and an ADC for every byte, all those jumps...  :eek

These exit parameters/string considerations are useful to me, your needs may vary. In particular, there is no check for any "h/H" at the end of the string, as I only need to convert strings without those. As the next character ends up pointed to directly by [EDX] this shouldn't be too hard to add if necessary.

HextoDW    proc

comment    * ---------------------------------------------------------------
            Convert Hex ASCII string into DWORD value by IanB.
            Reads up to 8 chars and stops at the first non-Hex digit.

            On entry:
                string pointer in EAX
            On exit:
                EAX = converted value
                ECX = number of ASCII chars read (zero if unsuccessful)
                EDX = pointer to char in string immediately after last digit
                      read, which might be null-termination or another digit
                      if there were more than 8 in the string
            EBX/EBP/EDI/ESI preserved
            --------------------------------------------------------------- *

        push    ebx
        push    ebp
        push    edi
        push    esi

        push    eax                     ; push pointer
        mov     edi, eax
        xor     eax, eax
        mov     ax, WORD PTR [edi]      ; work with 2 chars at a time
        xor     ecx, ecx
        mov     ebp, 4                  ; 4 * 2 digits allowed to convert only
        xor     esi, esi                ; clear accumulator

HextoDWLoop:
        mov     cl, ah
        and     eax, 0ffH               ; top 3 bytes all zeroed
        mov     ebx, 32
        mov     edx, 32
        sub     eax, ('a')              ; fixes lowercase a/f to 0/5
        sub     ecx, ('a')
        and     bl, ah                  ; AH/CH holds DWORD extended sign of SUB
        and     dl, ch                  ; either add 32 or 0, quicker than SBB
        add     eax, ebx                ; fixes A/F or a/f to 0/5
        mov     ebx, 7
        add     ecx, edx
        mov     edx, 7
        and     bl, ah                  ; either add 7 or 0, quicker than SBB
        and     dl, ch
        add     eax, ebx                ; fixes 0/9 to -10/-1
        mov     ebx, 0fffffff0H         ; nibble mask
        add     ecx, edx
        add     eax, 10                 ; get final nibble value
        add     ecx, 10
        test    eax, ebx                ; was the first digit in range?
        jnz     ExitConvertHex          ; first non-hex digit found

        test    ecx, ebx                ; was the second digit in range?
        jnz     LastHexDigit            ; single hex digit left

        add     eax, eax
        shl     esi, 8                  ; ESI * 256
        lea     ecx, [ecx+eax*8]        ; combine nibbles - same P4 latency as SHL
        mov     ax, WORD PTR [edi+2]    ; load the next 2 chars (EDI still points at the pair just converted)
        add     edi, 2
        add     esi, ecx                ; add new digit
        xor     ecx, ecx                ; zero ECX
        sub     ebp, 1
        jnz     HextoDWLoop

ExitConvertHex:
        pop     ecx                     ; retrieve pointer
        mov     eax, esi                ; get result
        jmp     @F

LastHexDigit:
        shl     esi, 4                  ; ESI * 16
        add     edi, 1
        pop     ecx                     ; retrieve pointer
        add     eax, esi                ; digit combined for output value
@@:
        pop     esi
        mov     edx, edi                ; EDX holds pointer to char after last used
        sub     ecx, edi
        pop     edi
        pop     ebp
        pop     ebx
        neg     ecx                     ; ECX holds number of chars read
        ret

HextoDW    endp
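
A hedged C sketch of the sign-mask idea, for one character at a time. The function names are hypothetical, and `-(v < 0)` stands in for the all-ones byte that ends up in AH/CH after a subtraction goes negative. Tracing the arithmetic by hand suggests that the six characters ":" to "?" also land in 10..15 after these steps, so the sketch screens them out separately; treat that as an observation to verify rather than a statement about the routine above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 0..15 for a hex digit, -1 otherwise. */
int hex_nibble_signmask(unsigned char ch)
{
    int32_t v = (int32_t)ch - 'a';    /* a-f -> 0..5, anything below "a" goes negative */
    v += 32 & -(int32_t)(v < 0);      /* add 32 or 0: A-F -> 0..5, digits stay negative */
    v += 7 & -(int32_t)(v < 0);       /* add 7 or 0: "0"-"9" -> -10..-1 */
    v += 10;                          /* digits -> 0..9, letters -> 10..15 */

    int ok = ((uint32_t)v & 0xFFFFFFF0u) == 0;   /* same nibble-mask test as above */
    ok &= (unsigned)(ch - ':') >= 6u;            /* reject ":" through "?" explicitly */
    return ok ? (int)v : -1;
}

/* Combines two characters into one byte, or returns -1 if either is invalid. */
int hex_pair_signmask(const char *p)
{
    assert(p != NULL);
    int hi = hex_nibble_signmask((unsigned char)p[0]);
    if (hi < 0)
        return -1;                    /* do not look past a terminator */
    int lo = hex_nibble_signmask((unsigned char)p[1]);
    if (lo < 0)
        return -1;
    return hi * 16 + lo;
}
```

Whether a compiler turns the masked additions into branch-free machine code depends on the compiler and its settings, so any speed comparison would need measuring.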


IanB

hutch--

I would be interested in a fast version of this procedure for the masm32 library, even if it was table driven as the existing one works fine for occasional conversions but an extended version would also be worth having for applications that stream this type of data.

PS: Ian, welcome back, I thought you must have found a blonde.  :bg

Ian_B

Quote from: hutch-- on October 25, 2005, 12:23:27 AM
I would be interested in a fast version of this procedure for the masm32 library, even if it was table driven as the existing one works fine for occasional conversions but an extended version would also be worth having for applications that stream this type of data.
Yes, my own application might need to parse full 32-char MD5s in ASCII, and this should convert quite well to allow that if needed.

Quote: PS: Ian, welcome back, I thought you must have found a blonde.  :bg
Thanks, hutch! No such luck,  :'(  just getting my head back into programming again after a long time too busy. Picking up the threads of my project again is a challenge, but it's been fun re-reading all the code and discovering all the stupid bugs and sloppy ordering with a fresh eye...  :bg

IanB

hutch--

Here is a "roughie" based on a hex streaming algo I have in the library. It has no error testing and requires an 8 character hex string, but it's probably reasonably fast.


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    h2d PROTO :DWORD

    ; 48 - 57   0123456789
    ; 65 - 70   ABCDEF
    ; 97 - 102  abcdef

    .data
      tbl1 \
        db 00h,10h,20h,30h,40h,50h,60h,70h,80h,90h,0,0,0,0,0,0      ; 63
        db 00h,0A0h,0B0h,0C0h,0D0h,0E0h,0F0h,0,0,0,0,0,0,0,0,0      ; 79
        db 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0                          ; 95
        db 00h,0A0h,0B0h,0C0h,0D0h,0E0h,0F0h

      tbl2 \
        db 00h,01h,02h,03h,04h,05h,06h,07h,08h,09h,0,0,0,0,0,0      ; 63
        db 00h,0Ah,0Bh,0Ch,0Dh,0Eh,0Fh,0,0,0,0,0,0,0,0,0            ; 79
        db 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0                          ; 95
        db 00h,0Ah,0Bh,0Ch,0Dh,0Eh,0Fh

        ; sub 48 from each offset table

    .code

start:
   
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    call main
    inkey
    exit

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

main proc

    fn h2d,"FD19ED27"

    print hex$(eax),13,10

    ret

main endp

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

h2d proc phex:DWORD

    LOCAL var   :DWORD

    push esi
    push edi

    mov esi, phex
    lea edi, var

    movzx eax, BYTE PTR [esi]
    movzx edx, BYTE PTR [esi+1]
    mov cl, [tbl1+eax-48]
    add cl, [tbl2+edx-48]
    mov [edi+3], cl

    movzx eax, BYTE PTR [esi+2]
    movzx edx, BYTE PTR [esi+3]
    mov cl, [tbl1+eax-48]
    add cl, [tbl2+edx-48]
    mov [edi+2], cl

    movzx eax, BYTE PTR [esi+4]
    movzx edx, BYTE PTR [esi+5]
    mov cl, [tbl1+eax-48]
    add cl, [tbl2+edx-48]
    mov [edi+1], cl

    movzx eax, BYTE PTR [esi+6]
    movzx edx, BYTE PTR [esi+7]
    mov cl, [tbl1+eax-48]
    add cl, [tbl2+edx-48]
    mov [edi+0], cl

    mov eax, [edi]

    pop edi
    pop esi

    ret

h2d endp

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

end start
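
The two-table scheme can be sketched in C as follows. This is an illustration under assumptions, not the library code: the names are invented, the tables are full 256-entry arrays filled at run time instead of the compact tables offset by 48 used above, and a loop replaces the unrolled fall-through. Like the routine above it does no error testing and needs exactly 8 hex characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static unsigned char hi_tbl[256];   /* digit value already shifted into the high nibble */
static unsigned char lo_tbl[256];   /* digit value in the low nibble */

/* Must be called once before h2d_fixed8. */
void init_pair_tables(void)
{
    static const char upper[] = "0123456789ABCDEF";
    static const char lower[] = "0123456789abcdef";
    for (int v = 0; v < 16; v++) {
        unsigned char u = (unsigned char)upper[v];
        unsigned char l = (unsigned char)lower[v];
        lo_tbl[u] = lo_tbl[l] = (unsigned char)v;
        hi_tbl[u] = hi_tbl[l] = (unsigned char)(v << 4);
    }
}

/* No error testing: s must point at 8 valid hex characters. */
uint32_t h2d_fixed8(const char *s)
{
    assert(s != NULL);
    uint32_t value = 0;
    for (int i = 0; i < 8; i += 2) {
        /* one lookup per character, one add per byte, no shifting of nibbles */
        unsigned byte = (unsigned)hi_tbl[(unsigned char)s[i]]
                      + (unsigned)lo_tbl[(unsigned char)s[i + 1]];
        value = (value << 8) | byte;
    }
    return value;
}
```

The point of the second table is that the high nibble arrives pre-shifted, so each output byte costs two loads and an add.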

lingo

Hutch,

The same but faster  :lol

H2dw proc  lHexValue:dword
     mov   eax, [esp+4] ; eax->buffer with Hex
     mov   [esp+4], ebx
     mov   ecx, [eax]
     mov   ebx, [eax+4]

     movzx eax, cl
     movzx edx, word ptr [tbl1+eax-48-1]
     mov   al, ch
     bswap ecx
     add   dh, [tbl2+eax-48]

     mov   al, cl
     mov   dl, [tbl2+eax-48]
     mov   al, ch
     add   dl, [tbl1+eax-48]

     movzx ecx, bl
     movzx eax, word ptr [tbl1+ecx-48-1]
     mov   cl, bh
     bswap ebx
     add   ah, [tbl2+ecx-48]

     mov   cl, bl
     mov   al, [tbl2+ecx-48]
     mov   cl, bh
     add   al, [tbl1+ecx-48]

     shl   edx, 16
     mov   ebx, [esp+4]
     add   eax, edx
     ret   4
H2dw endp


and times:


Hex2Dword  Tests:

Hex2Dw - hutch : 62  clocks; Result: -48632537
Hex2Dw - lingo : 23  clocks; Result: -48632537

Press ENTER to exit...


Regards,
Lingo


[attachment deleted by admin]

Ian_B

Excellent stuff! Now add in all the necessary error-checking and capability to work on non-8char-perfectly-formatted CRCs that you need for a real-world app, and let's see how it compares with the one I posted...  :wink

As I've said before in the old forum, for me these things are indivisible from the routine. After all, if you're having to do this sort of conversion in the first place, it's because you're parsing data from an external source and you therefore can't rely on the format, length or whether it's even a CRC in the first place. If you've already got the information in your own program you'd keep it as a DWORD that would never need converting.

So that fragment is an interesting technical exercise for a very limited application, when to make the routine worthwhile for everyday use it has to be able to handle every case you can throw at it while always providing a meaningful result and also tell you whether it actually found something useful to process so you know whether the result you got was meaningful.

Just to make this clearer, here's the reason I am saying this. The application I am writing (and that I coded this routine for) is parsing yEnc newsgroup files. In a yEnc file, the CRC for the encode is SPECIFIED to be 8 characters long officially, with leading zeroes. It would be nice to write a program that could assume this would always be the case, even presuming that you haven't got a broken file that stopped halfway through this value. However, I know for a fact that some posting programs have blithely ignored this specification, and put less-than-8char CRC values in this field if there aren't that many significant digits. This is a real-world application working with real-world data, it can't just fail because someone else can't work to specification.

That's why I think that should always be the case with these conversion routines, they must have some form of checking otherwise they are merely a toy for timewasting, they are pointless for anyone to use when they actually NEED to.
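
As a sketch of what such a checked interface could look like in C: the routine below reads up to 8 hex digits, stops at the first character that is not one, and reports both how many digits it used and where it stopped. The struct and function names are hypothetical; the three results correspond loosely to the EAX/ECX/EDX exit values documented for HextoDW earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type. */
typedef struct {
    uint32_t    value;    /* converted value */
    int         digits;   /* number of hex characters read, 0 if none */
    const char *end;      /* first character after the last digit read */
} hex_result;

hex_result parse_hex_upto8(const char *s)
{
    assert(s != NULL);
    hex_result r = { 0, 0, s };

    while (r.digits < 8) {
        unsigned char c = (unsigned char)*r.end;
        unsigned d;
        if (c >= '0' && c <= '9')
            d = (unsigned)(c - '0');
        else if ((c | 0x20) >= 'a' && (c | 0x20) <= 'f')
            d = (unsigned)((c | 0x20) - 'a' + 10);   /* setting bit 5 folds A-F onto a-f */
        else
            break;                                   /* terminator or non-hex character */
        r.value = (r.value << 4) | d;
        r.digits++;
        r.end++;
    }
    return r;
}
```

A caller parsing a CRC field would treat `digits == 0` as "nothing found", accept anything from 1 to 8 digits, and then look at `*end` to check that the field ended where expected.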

IanB

hutch--

Ian,

The technique I use in the hex streaming algo was to have a 256 member table of acceptable characters which provided error checking for any character pair. I just pelted this one together yesterday as I was satisfied that the idea would at least work. I was looking for a non-looped fall-through while keeping the tables down to a reasonable size. When I get a bit more time I will have a play with the idea to make it safe to use.
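
One way the "table of acceptable characters" idea might look in C, purely as an assumption-laden sketch: the streaming algo itself is not shown in this thread, so the table layout, the names and the per-character (rather than per-pair) check are all guesses made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static unsigned char is_hex[256];   /* 1 = acceptable character, 0 = not */
static unsigned char nibble[256];   /* digit value for acceptable characters */

/* Must be called once before h2d_checked8. */
void init_hex_tables(void)
{
    static const char upper[] = "0123456789ABCDEF";
    static const char lower[] = "0123456789abcdef";
    for (int v = 0; v < 16; v++) {
        unsigned char u = (unsigned char)upper[v];
        unsigned char l = (unsigned char)lower[v];
        is_hex[u] = is_hex[l] = 1;
        nibble[u] = nibble[l] = (unsigned char)v;
    }
}

/* Converts exactly 8 hex characters; returns false on any other input. */
bool h2d_checked8(const char *s, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    uint32_t value = 0;
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!is_hex[c])
            return false;           /* also stops at an early terminator */
        value = (value << 4) | nibble[c];
    }
    *out = value;
    return true;
}
```

Checking each character before reading the next one keeps the routine from running past the terminator of a short string, at the cost of the loop and branches that the fall-through version avoids.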

Lingo,

Looks good.  :U