The MASM Forum Archive 2004 to 2012

Project Support Forums => MASM32 => Topic started by: Larry Hammick on December 17, 2007, 04:29:39 PM

Title: islower.asm, isupper.asm
Post by: Larry Hammick on December 17, 2007, 04:29:39 PM
They can be simplified.

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16
islower proc char:BYTE
    xor eax,eax
    cmp BYTE PTR [esp+4],"a"
    jb @F
    cmp BYTE PTR [esp+4],"z"+1
    adc al,0
@@:
    ret 4

islower endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16
isupper proc char:BYTE
    xor eax,eax
    cmp BYTE PTR [esp+4],"A"
    jb @F
    cmp BYTE PTR [esp+4],"Z"+1
    adc al,0
@@:
    ret 4

isupper endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

Title: Re: islower.asm, isupper.asm
Post by: hutch-- on December 18, 2007, 12:10:10 AM
Yes, but at the price of an unpredicted jump. That's why the originals jump forward to the "fail:" label.
Title: Re: islower.asm, isupper.asm
Post by: Larry Hammick on December 18, 2007, 09:02:36 AM
I see. Hmm. It can be done with no jumps at all, e.g.
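IsLower:
    mov al,[esp+4]
    mov ah,al    ;keep a copy of the character for the second test
    cmp al,"z"+1
    setnc al     ;make AL nonzero if either test fails
    cmp ah,"a"
    adc al,0     ;add 1 if the character is below "a"; ZF is set only if both tests passed
    setz al      ;AL=1 if lowercase, otherwise 0
    and eax,1    ;clear the rest of EAX; ZF now reflects EAX
    ret 4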

but that variation is probably slower, even though it fetches from the stack only once. It does have the merit of returning with ZF set or clear (Z or NZ) according to eax.
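For comparison, a shorter jump-free form uses the common subtract-and-compare range test. This is only a sketch, not code from the MASM32 library: the label IsLowerSub is a made-up name, and a stdcall caller that pushes the character as one DWORD is assumed, as in the routines above.

```asm
IsLowerSub:                      ; hypothetical name, for illustration only
    movzx eax,BYTE PTR [esp+4]   ; single fetch, zero extended
    sub eax,"a"                  ; 0..25 for "a".."z", 26 or above (unsigned) for anything else
    cmp eax,"z"-"a"+1            ; CF=1 only when the character is in range
    sbb eax,eax                  ; eax = -1 if in range, 0 otherwise
    neg eax                      ; eax = 1 or 0, and ZF matches eax
    ret 4
```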
In practice, a tiny routine like IsLower would be one of a number of related text routines, and I would prefer an approach in which they share code (and share one or more lookup tables, if necessary):
is_al_lcase:
    xor al,20h
is_al_ucase:
    mov dx,"ZA"
is_al_in_range:
    cmp al,dl
    ;...
    cmp al,dh
    ;...
    ret
is_al_digit:
    mov dx,"90"
    jmp is_al_in_range
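
The elided steps in is_al_in_range are left open above. One possible way to fill them, as a sketch only and not necessarily what was intended, is with SETcc so that the shared tail stays jump free. It assumes the bounds arrive as DL = low and DH = high, as the sketch above implies, that CL may be clobbered, and that the result convention is EAX = 1 or 0 with ZF matching EAX. The label is a hypothetical name.

```asm
is_al_in_range_sketch:      ; hypothetical name; in: AL = char, DL = low, DH = high
    cmp al,dl
    setb cl                 ; CL = 1 if AL is below the low bound
    cmp al,dh
    seta al                 ; AL = 1 if AL is above the high bound
    or al,cl                ; AL = 1 if either test failed
    xor al,1                ; AL = 1 if in range, 0 otherwise; sets ZF from AL
    movzx eax,al            ; widen to EAX; MOVZX leaves the flags alone
    ret
```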

Title: Re: islower.asm, isupper.asm
Post by: hutch-- on December 18, 2007, 10:24:12 AM
Yes,

The table approach is the fastest by a long way. There is a macro called "chtype$" that does exactly this. It comes at the price of a 256 byte lookup table but it's much faster in testing than separate procedures for each category of character type.
Title: Re: islower.asm, isupper.asm
Post by: Adamanteus on December 18, 2007, 08:45:38 PM
The macro is of course the fundamental solution, but it needs to know all the national character tables, and as far as I know the operating system does not supply them. So I would write more flexible code that calls through a pointer for generality, so that some of the functionality can be handed over to an API if one is available:



; initial character routines for qualifiers
isupchar    TEXTEQU    <isupperAlEng>
islowchar    TEXTEQU    <islowerAlEng>

.data

isupperAL    DD    isupchar
islowerAL    DD    islowchar

.code

OPTION PROLOGUE : NONE
OPTION EPILOGUE : NONE

isupperAlEng    PROC
    MOV    DL, AL
    XOR    EAX, EAX
    CMP    DL, "A"
    JB @F
    CMP    DL, "Z" + 1
    ADC    AL, 0
@@:    RET
isupperAlEng    ENDP

islowerAlEng    PROC
    MOV    DL, AL
    XOR    EAX, EAX
    CMP    DL, "a"
    JB @F
    CMP    DL, "z" + 1
    ADC    AL, 0
@@:    RET
islowerAlEng    ENDP

isupper    PROC, ; USES preserve,
    alpha : BYTE
    MOV    AL, BYTE PTR [ESP + 4]
    CALL isupperAL
    RET 4    ; stdcall: remove the 4-byte argument (no epilogue is generated here)
isupper    ENDP

islower    PROC, ; USES preserve,
    alpha : BYTE
    MOV    AL, BYTE PTR [ESP + 4]
    CALL islowerAL
    RET 4    ; stdcall: remove the 4-byte argument (no epilogue is generated here)
islower    ENDP

OPTION PROLOGUE : PrologueDef
OPTION EPILOGUE : EpilogueDef
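
The point of calling through isupperAL is that the pointer can be redirected at run time. As a sketch only: the routine name isupperAlApi is made up, and it assumes a MASM32 project that includes user32.inc and user32.lib so that the Win32 function IsCharUpperA is available. Note that the API returns some nonzero value for TRUE, not necessarily 1, and may change ECX and EDX.

```asm
isupperAlApi    PROC            ; hypothetical replacement; in: AL = character
    MOVZX    EAX, AL
    PUSH    EAX                 ; IsCharUpperA takes one DWORD-sized argument
    CALL    IsCharUpperA        ; stdcall: the API removes its own argument
    RET                         ; EAX = nonzero if the character is upper case
isupperAlApi    ENDP

    ; during start-up, once it is decided to use the API version:
    MOV    isupperAL, OFFSET isupperAlApi
```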

Title: Re: islower.asm, isupper.asm
Post by: hutch-- on December 19, 2007, 04:57:45 AM
Larry,

This is what a PROC has to compete against. The "IS"whatever procedures are for people who are used to the older C-style character evaluation method. In every instance the direct table lookup that the macro uses will eat a procedure alive in speed terms. The cost is a 256 byte lookup table.


  ; ----------------------------------------
  ; chtype$() will accept either a BYTE sized
  ; register or the address of a BYTE as a
  ; memory operand.
  ; The result is returned in a memory operand
  ; as a BYTE PTR to the character class in the
  ; table.
  ; You would normally use this macro with
  ;
  ;     movzx ecx, chtype$([ebp+4])
  ;     cmp chtype$([esp+4]), 2
  ;     cmp chtype$(ah), dl
  ;
  ; ----------------------------------------
    chtype$ MACRO character
      IFNDEF chtyptbl
        EXTERNDEF chtyptbl:DWORD         ;; declare the external table if not already declared
      ENDIF
      movzx eax, BYTE PTR character      ;; zero extend character to 32 bit reg
      EXITM <BYTE PTR [eax+chtyptbl]>    ;; return the table entry as a BYTE memory operand
    ENDM
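
The layout of chtyptbl is not shown in this thread. To illustrate the idea only, here is a sketch of a 256-byte class table and a lookup against it. The table name my_chtbl, the routine name and the class codes 0 to 3 are all made up and need not match the MASM32 table.

```asm
.data
my_chtbl db 48 dup (0)      ; 00h..2Fh  other
         db 10 dup (3)      ; "0".."9"  digit
         db  7 dup (0)      ; 3Ah..40h  other
         db 26 dup (1)      ; "A".."Z"  upper case
         db  6 dup (0)      ; 5Bh..60h  other
         db 26 dup (2)      ; "a".."z"  lower case
         db 133 dup (0)     ; 7Bh..FFh  other

.code
is_al_lower_tbl:            ; in: AL = character, out: EAX = 1 if lower case, else 0
    movzx eax,al            ; character becomes the table index
    cmp BYTE PTR [eax+my_chtbl],2
    sete al                 ; upper 24 bits of EAX are already zero
    ret
```

The character itself is never compared or branched on, and supporting another class means changing the table rather than writing a new routine.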