The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: RuiLoureiro on May 27, 2005, 09:59:08 PM

Title: M32Lib-ATODW
Post by: RuiLoureiro on May 27, 2005, 09:59:08 PM
Hi,

   I was looking at ASCII-to-integer converters and found atodw in m32lib. It computes ecx = eax + 10*ecx with two LEA instructions. But:

   1. It has 3 instructions that do nothing: push edi, pop edi and xor eax, eax;
   2. It compares with 2D instead of 2Dh (the minus sign), and does not check for 2Bh (+);
   3. We have no control over the result (no overflow or error indication);
   4. We have no control over the buffer contents (the characters are not validated);

Here is what I have in the m32lib folder:
................................................................................
atodw proc String:DWORD

    push esi
    push edi

    xor eax, eax
    mov esi, [String]
    xor ecx, ecx
    xor edx, edx
    mov al, [esi]
    inc esi
    cmp al, 2D
    jne proceed
    mov al, byte ptr [esi]
    not edx
    inc esi
    jmp proceed

  @@:
    sub al, 30h
    lea ecx, dword ptr [ecx+4*ecx]
    lea ecx, dword ptr [eax+2*ecx]
    mov al, byte ptr [esi]
    inc esi

  proceed:
    or al, al
    jne @B
    lea eax, dword ptr [edx+ecx]
    xor eax, edx

    pop edi
    pop esi
    ret
atodw endp
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
When I saw «lea   ecx, dword ptr [ecx+4*ecx]» I thought «good, this could be a "cheese"». But I quickly found out that it is ... a "trap": we cannot check the result, because LEA does not affect any flags, so an overflow goes unnoticed.
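
To illustrate the point about LEA, here is a minimal C sketch of what the two instructions compute, assuming that `uint32_t` arithmetic (which wraps modulo 2^32) is a fair model of ECX and EAX. The function name `lea_mul10_add` is hypothetical and not part of m32lib.

```c
#include <assert.h>
#include <stdint.h>

/* Models:  lea ecx, [ecx+4*ecx]   ; ecx = 5 * ecx
 *          lea ecx, [eax+2*ecx]   ; ecx = eax + 10 * (old ecx)
 * uint32_t wraps modulo 2^32, as the registers do, and nothing
 * reports the wrap, just as LEA leaves the flags untouched. */
static uint32_t lea_mul10_add(uint32_t ecx, uint32_t eax)
{
    assert(eax <= 9);          /* eax is expected to hold one decimal digit */
    ecx = ecx + 4u * ecx;
    ecx = eax + 2u * ecx;
    return ecx;
}
```

With these assumptions, `lea_mul10_add(12, 3)` gives 123, while `lea_mul10_add(429496729, 6)` would be 4294967296 and therefore wraps to 0 with no indication.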

Here is my code

; To call:    invoke AtoDW, ADDR String     ;[String  db "??? ...",0 ]
; Out:  clc=> OK; stc=> error
AtoDW               proc  pString:DWORD   
                    push  esi

                    mov   esi, pString              ; String pointer
                    xor   ecx, ecx                  ; the result
                    xor   edx, edx                  ; the sign to the result
   
                    mov   al, byte ptr [esi]        ; get first byte
                    cmp   al, 2Bh                   ; plus ?
                    je    _nAtoDW

                    cmp   al, 2Dh                   ; minus ?
                    jne   _iAtoDW
                   
                    not   edx
                    je    _nAtoDW                    ; get next

  @@:               cmp    al, " "
                    je    _nAtoDW                    ; get next
                    ; -----------------------------------------
                    ; we must Control chars 30-39. If not error
                    ; -----------------------------------------
                    ; jc   @F if chars not between 30-39
                    sub   al, 30h
                    lea   ecx, dword ptr [ecx+4*ecx]
                    lea   ecx, dword ptr [eax+2*ecx]
                   
_nAtoDW:            inc   esi
                    mov   al, byte ptr [esi]        ; get next byte

_iAtoDW:            or    al, al
                    jne   @B
   
                    lea   eax, dword ptr [edx+ecx]
                    xor   eax, edx
                    clc
                   
@@:                 pop   esi
                    ret
AtoDW               endp

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I also looked at this.
Starting from Tutorial-fputute chapter 13 (by Raymond), I made these modifications:
(Raymond says on his page that we may. How are you?)
;*******************************************************************************
;                            atofl
;*******************************************************************************
; lodsb   can be substituted by:
;                mov    al, byte ptr [esi]
;                inc    esi
atofl:
      push  ebx         ;preserve EBX and ESI
      push  esi

      lea   esi,buffer1 ;use ESI as pointer to text buffer
      xor   eax,eax
      xor   ebx,ebx     ;will be used as an accumulator
      xor   ecx,ecx     ;will be used as a counter
;************************************************
; Skip leading spaces without generating an error
;************************************************

   @@:
      lodsb             ;get next character
      cmp   al," "      ;check if a space character
      jz    @B          ;repeat until a non-space character is found
;*********************************************
; Check 1st non-space character for a +/- sign
;*********************************************
      cmp   al,"-"      ;is it a "-" sign
   je    atoflerr
;      jnz   @F

;atoflerr:
;      xor   eax,eax     ;set EAX to error code
;      pop   esi         ;restore the EBX and ESI registers
;      pop   ebx
;      ret               ;return with error code

;   @@:
      cmp   al,"+"      ;is it a "+" sign
   jnz   short @F
;      jnz   nextchar

nextchar:
      lodsb             ;disregard a "+" sign and get next character
;***********************************************************
; From this point, space and sign characters will be invalid
;***********************************************************
;nextchar:
@@:
      cmp   al,0        ;check for end-of-string character
      jz    endinput    ;exit the string parsing section

      cmp   al,"."      ;is it the "." decimal delimiter
                        ;other delimiters such as the "," used in some
                        ;countries could also be allowed but would need
                        ;additional coding to make it more generalized
      jnz   @F
;******************************************************************
; Only one decimal delimiter can be acceptable. The sign bit of ECX
; is used to keep a record of the first delimiter identified.
;******************************************************************

      or    ecx,ecx     ;check if a delimiter has already been identified
      js    atoflerr    ;exit with error code if more than 1 delimiter
     
      stc               ;set the carry flag
      rcr   ecx,1       ;set bit31 of ECX (the sign bit) when
                        ;the 1st delimiter is identified
;      lodsb             ;get next character
      jmp   nextchar    ;continue parsing
;***********************************************************************
; All ASCII characters other than the numerical ones will now be invalid
;***********************************************************************
   @@:
      cmp   al,"0"
      jb    atoflerr
      cmp   al,"9"
      ja    atoflerr

      sub   al,"0"      ;convert valid ASCII numerical character to binary
      xchg  eax,ebx     ;get the accumulated integer value in EAX
                        ;holding the new digit in EBX
      mul   factor10    ;multiply the accumulated value by 10
      add   eax,ebx     ; and add the new digit
      xchg  eax,ebx     ;store this new accumulated value back in EBX

      or    ecx,ecx     ;check if a decimal delimiter detected yet
      js    @F          ;jump if decimal digits are being processed
;*************************************
; Integer digits still being processed
;*************************************
      cmp   ebx,100     ;verify current value of integer portion
      jbe   nextchar    ;continue processing string characters
;      ja    atoflerr    ;abort if input for annual rate is > 100%

atoflerr:
      xor   eax,eax     ;set EAX to error code
      pop   esi           ;restore the EBX and ESI registers
      pop   ebx
      ret                  ;return with error code

;      lodsb             ;get next string character
;      jmp   nextchar    ;continue processing string characters
;*******************************************************
; The CL register is used as a counter of decimal digits
; after the decimal delimiter has been identified
;*******************************************************
   @@:
      inc   cl          ;increment count of decimal digits
;      lodsb             ;get next string character
      jmp   nextchar    ;continue processing string characters
;***********************************
; Parsing of the string is completed
;***********************************
endinput:
      or    ebx,ebx     ;check if total input was equal to 0
      jz    atoflerr    ;abort if annual rate input is 0%

      finit             ;initialize FPU
      push  ebx         ;store value of EBX on stack
      fild  dword ptr[esp]    ;-> st(0)=EBX
      add   cl,2        ;increment the number of decimal digits
                        ;to convert from % rate to a decimal rate
      shl   ecx,1       ;get rid of the potential sign "flag"
      shr   ecx,1       ;restore the count of decimal digits
      fild  factor10    ;-> st(0)=10, st(1)=EBX
   @@:
      fdiv  st(1),st    ;-> st(0)=10, st(1)=EBX/10
      dec   ecx         ;decrement counter of decimal digits
      jnz   @B          ;continue dividing by 10 until count exhausted
      fstp  st          ;get rid of the dividing 10 in st(0)
                        ;-> st(0)=annual rate (as a decimal rate)
      pop   ebx         ;clean CPU stack

      pop   esi         ;restore the EBX and ESI registers
      pop   ebx
      or    al,1        ;ensure EAX != 0 (i.e. no error detected)
      ret
;*******************************************************************************
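
The final scaling step of the listing above (after `endinput`) can be sketched in C as follows. This is only an approximation under stated assumptions: `double` stands in for the FPU's extended precision, and the name `percent_to_rate` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* accum:    all digits read as one integer   ("5.25" -> 525)
 * decimals: digits after the "." delimiter   ("5.25" -> 2)
 * Returns the percentage as a decimal rate   (5.25% -> 0.0525). */
static double percent_to_rate(uint32_t accum, unsigned decimals)
{
    assert(accum != 0);                /* the routine rejects a 0% input */
    double st1 = (double)accum;        /* fild dword ptr [esp] */
    unsigned count = decimals + 2;     /* add cl,2 : percent -> fraction */
    while (count-- != 0)
        st1 /= 10.0;                   /* fdiv st(1),st / dec ecx / jnz @B */
    return st1;
}
```

For the input "5.25" the parser would hold 525 with 2 decimal digits, so the loop divides by 10 four times.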
stay well
Title: Re: M32Lib-ATODW
Post by: hutch-- on May 28, 2005, 01:53:50 AM
The "xor eax, eax" is to prevent a register stall in the following use of AL. The PUSH/POP of EDI appears to be a left over from the last time Alex did some work on it and it is not needed but I doubt it slows anything up much.
Title: Re: M32Lib-ATODW
Post by: MichaelW on May 28, 2005, 07:26:24 AM
Hi Rui,

Just to illustrate the effects that Hutch is referring to:

; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .586                       ; create 32 bit code
    .model flat, stdcall       ; 32 bit memory model
    option casemap :none       ; case sensitive

    include \masm32\include\windows.inc
    include \masm32\include\masm32.inc
    include \masm32\include\kernel32.inc

    includelib \masm32\lib\masm32.lib
    includelib \masm32\lib\kernel32.lib

    include \masm32\macros\macros.asm

    include timers.asm

    atodw_no_xor_eaxeax   PROTO :DWORD
    atodw_no_pushpop_edi  PROTO :DWORD

; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      teststr db "123456789",0
    .code
; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    LOOP_COUNT EQU 10000000

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      invoke atodw, ADDR teststr
    counter_end
    mov   ebx,eax
    print chr$("atodw                : ")
    print ustr$(ebx)
    print chr$(" cycles", 13, 10)

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      invoke atodw_no_pushpop_edi, ADDR teststr
    counter_end
    mov   ebx,eax
    print chr$("atodw_no_pushpop_edi : ")
    print ustr$(ebx)
    print chr$(" cycles", 13, 10)   

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      invoke atodw_no_xor_eaxeax, ADDR teststr
    counter_end
    mov   ebx,eax
    print chr$("atodw_no_xor_eaxeax  : ")
    print ustr$(ebx)
    print chr$(" cycles", 13, 10)
   
    mov   eax, input(13, 10, "Press enter to exit...")
    exit   

; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; Copies to play with.   
; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
atodw_no_pushpop_edi proc String:DWORD

  ; ----------------------------------------
  ; Convert decimal string into dword value
  ; return value in eax
  ; ----------------------------------------

    push esi
    ;push edi

    xor eax, eax
    mov esi, [String]
    xor ecx, ecx
    xor edx, edx
    mov al, [esi]
    inc esi
    cmp al, 2D
    jne proceed
    mov al, byte ptr [esi]
    not edx
    inc esi
    jmp proceed

  @@:
    sub al, 30h
    lea ecx, dword ptr [ecx+4*ecx]
    lea ecx, dword ptr [eax+2*ecx]
    mov al, byte ptr [esi]
    inc esi

  proceed:
    or al, al
    jne @B
    lea eax, dword ptr [edx+ecx]
    xor eax, edx

    ;pop edi
    pop esi

    ret

atodw_no_pushpop_edi endp

atodw_no_xor_eaxeax proc String:DWORD

  ; ----------------------------------------
  ; Convert decimal string into dword value
  ; return value in eax
  ; ----------------------------------------

    push esi
    push edi

    ;xor eax, eax
    mov esi, [String]
    xor ecx, ecx
    xor edx, edx
    mov al, [esi]
    inc esi
    cmp al, 2D
    jne proceed
    mov al, byte ptr [esi]
    not edx
    inc esi
    jmp proceed

  @@:
    sub al, 30h
    lea ecx, dword ptr [ecx+4*ecx]
    lea ecx, dword ptr [eax+2*ecx]
    mov al, byte ptr [esi]
    inc esi

  proceed:
    or al, al
    jne @B
    lea eax, dword ptr [edx+ecx]
    xor eax, edx

    pop edi
    pop esi

    ret

atodw_no_xor_eaxeax endp

; ««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


Results on my P3:

atodw                : 57 cycles
atodw_no_pushpop_edi : 52 cycles
atodw_no_xor_eaxeax  : 131 cycles




Title: Re: M32Lib-ATODW
Post by: Vortex on May 28, 2005, 10:11:05 AM
Hi MichaelW,

Thanks for the demo.

Here are the results on my P4 2.66 GHz:
Quote
atodw                : 54 cycles
atodw_no_pushpop_edi : 64 cycles
atodw_no_xor_eaxeax  : 60 cycles

Michael, will you modify your timer macros for the P4, or are they OK to use on a P4 as they are?

Thanks,

Erol
Title: Re: M32Lib-ATODW
Post by: MazeGen on May 28, 2005, 11:14:51 AM
AFAIK there are no problems on P4 with those macros.

Your results are probably so different because the P4 handles partial register access differently from the P3.
Title: Re: M32Lib-ATODW
Post by: hutch-- on May 28, 2005, 11:16:27 AM
Erol,

The need to clear the register with XOR reg, reg or alternatively SUB reg, reg is not so noticeable on a PIV but if you want code that runs on everything properly, it must be there.
Title: Re: M32Lib-ATODW
Post by: dsouza123 on May 28, 2005, 01:04:37 PM
The 2D instead of 2Dh is very likely a bug.
The 2D becomes 2 when assembled (MASM reads the trailing D as a decimal suffix), which won't match a - (minus sign).
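
To see what that means for a string such as "-123", here is a hypothetical C model of the atodw listing quoted earlier in this thread. It is a sketch, not the library code: `atodw_model` and its `sign_char` parameter are made-up names, and `uint32_t`/`uint8_t` are assumed to stand in for the registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sign_char is the value the first byte is compared with:
 * 2 for "cmp al, 2D", 0x2D for "cmp al, 2Dh". */
static uint32_t atodw_model(const char *s, uint8_t sign_char)
{
    assert(s != NULL);
    uint32_t ecx = 0;                 /* accumulator */
    uint32_t edx = 0;                 /* 0, or 0xFFFFFFFF when negative */
    uint8_t  al  = (uint8_t)*s++;     /* mov al, [esi] / inc esi */

    if (al == sign_char) {            /* cmp al, ... / jne proceed */
        al  = (uint8_t)*s++;
        edx = ~edx;                   /* not edx */
    }
    while (al != 0) {                 /* or al, al / jne @B */
        al  = (uint8_t)(al - 0x30);   /* sub al, 30h (wraps within 8 bits) */
        ecx = al + 10u * ecx;         /* the two LEA instructions */
        al  = (uint8_t)*s++;
    }
    return (edx + ecx) ^ edx;         /* lea eax, [edx+ecx] / xor eax, edx */
}
```

In this model, comparing with 2 lets the "-" fall through to the digit loop, where it becomes 0FDh (253), so "-123" yields 253123. Comparing with 2Dh yields the two's complement of 123.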
Title: Re: M32Lib-ATODW
Post by: Vortex on May 28, 2005, 03:15:41 PM
MazeGen, Hutch

Thanks for your replies.
Title: Re: M32Lib-ATODW
Post by: MichaelW on May 29, 2005, 12:26:42 AM
Erol,

The macros contain no processor-specific code, so AFAIK the results are equally valid for all of the processor families. Agner Fog states in his Pentium optimization manual that the P4 was designed to store the whole register together, instead of splitting it into separate temporary registers as for the PPro, P2, and P3, to avoid the "serious delay whenever there was a need to join different parts of a register into a single full register." This seems to me to indicate that a large timing difference should be expected.
Title: Re: M32Lib-ATODW
Post by: Vortex on May 29, 2005, 07:53:48 AM
Hi Michael,

Thanks for the technical info :U
Title: Re: M32Lib-ATODW
Post by: hutch-- on May 29, 2005, 07:59:04 AM
Just a note on atodw, it was designed to handle DWORD rather than LONG values so it was never pointed at negative numbers. For the signed version, there is an algo written by Ray Filiatreault called "atol" that handles signed conversions.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on May 30, 2005, 09:11:20 PM
Hi all

Here are the results on my P3:

atodw                 :  58 cycles    [+1 ]
atodw_no_pushpop_edi  :  53 cycles    [+1 ]
atodw_no_xor_eaxeax   : 131 cycles


Hi Hutch,
   How are you ? I hope you are fine.
      Yes, "xor   eax, eax" is needed. So, it must be there [no HomeWork rule]
      2D, as noted by dSouza123, is a bug. But if it has 2Dh (-), why not 2Bh (+) ?
I am guessing that when you come to our topics, many people want to see what you say. When we don't have your help, it is sometimes more difficult.
Thank you.

Hi Erol,
       Are you fine ? I hope. Thanks for the contribution.
The case [P4 2.66 G ]atodw_no_xor_eaxeax:  60  cycles against
Michael case [P3    ]atodw_no_xor_eaxeax: 131 cycles is mysterious !

Hi Michael,
   How are you getting along ? Thanks for your example (I will use it in other cases). In this case it is important because we can have hundreds of strings to convert in one single loop (or task). If the difference is 5 cycles (with push-pop and without), over 100 strings we save 500 cycles, and over 200 strings the difference is 1000 cycles (best case).
   I noticed one strange case: atodw_no_xor_eaxeax gives 131 cycles !!! Why is this ? What is the explanation ? What I know is that without "xor eax, eax" the procedure is wrong.

Here is the corrected code [I call it BufToInt instead of AtoDW]

; In:   pString => string pointer
;
; Out:  clc => OK    the result is in EAX (but can be wrong: overflow problems)
;
;       stc => char is not valid
;
; Info:
;       1. The string must be terminated by 0;
;       2. The string can contain spaces between digit codes;
;       3. The first char. can be «-» or «+»;
;       4. We have no overflow control in the EAX result;
;       5. Destroys the contents of ECX, EDX.

;
; To call:      invoke   BufToInt, ADDR String     ;[String  db "??? ...",0 ]
;
BufToInt            proc  pString:DWORD
                    push  esi

                    mov   esi, pString              ; String pointer
                    xor   eax, eax                  ; clear upper bytes of EAX
                    xor   ecx, ecx                  ; the result
                    xor   edx, edx                  ; to sign the result

                    mov   al, byte ptr [esi]        ; get first byte

                    cmp   al, 2Bh                   ; plus ?
                    je    _nBufToInt

                    cmp   al, 2Dh                   ; minus ?
                    jne   _iBufToInt

                    not   edx                       ; NOT doesn't affect flags,
                    je    _nBufToInt                ; so ZF from the CMP still holds: get next

  @@:               cmp   al, " "
                    je    _nBufToInt                ; get next

                    cmp   al, "9"
                    jbe   _tBufToInt

_rBufToInt:         stc                             ; error: char is not valid
                    pop   esi
                    ret

_tBufToInt:         sub   al, 30h                   ; upper bytes of EAX are 0
                    jc    short _rBufToInt          ; char below "0"

                    lea   ecx, dword ptr [ecx+4*ecx]
                    lea   ecx, dword ptr [eax+2*ecx]

_nBufToInt:         inc   esi
                    mov   al, byte ptr [esi]        ; get next byte

_iBufToInt:         or    al, al
                    jne   @B

                    lea   eax, dword ptr [edx+ecx]
                    xor   eax, edx
                    clc

                    pop   esi
                    ret
BufToInt            endp
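
Point 4 of the header comment (no overflow control) could be addressed by testing the accumulator before each multiply-and-add. The following is a minimal C sketch of that idea for the unsigned magnitude only. It is not part of m32lib or of the routine above, and `append_digit_checked` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Appends one decimal digit to *acc.
 * Returns 0 on success and 1 on overflow, playing the role that
 * clc/stc play in the assembly version. *acc is unchanged on overflow. */
static int append_digit_checked(uint32_t *acc, uint32_t digit)
{
    assert(acc != NULL && digit <= 9);
    /* acc*10 + digit fits in 32 bits iff acc <= (UINT32_MAX - digit) / 10 */
    if (*acc > (UINT32_MAX - digit) / 10u)
        return 1;
    *acc = *acc * 10u + digit;
    return 0;
}
```

The check costs one compare and branch per digit, which is the trade-off against the flag-free LEA pair.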
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
About «atol», i have this code


atol            proc  lpSrc:DWORD
                xor   eax, eax
                xor   ecx, ecx
                mov   edx, lpSrc

                sub   edx, 1
  @@:
                add   edx, 1
                cmp   BYTE PTR [edx], 32
                je    @B
                cmp   BYTE PTR [edx], 9
                je    @B                    ; [ strip spaces and tabs ]

                mov   al, [edx]             ; [ begin ]
                add   edx, 1

                .if al == "-"
                    add   ecx, 1
                    mov   al, [edx]
                    add   edx, 1
                .elseif al == "+"
                    mov   al, [edx]
                    add   edx, 1
                .endif
                push  ecx                   ; keep sign on stack
                xor   ecx, ecx

  @@:
                sub   al, "0"
                jc    @F

                lea   ecx, [ecx+ecx*4]
                lea   ecx, [eax+ecx*2]
                mov   al, [edx]
                add   edx, 1
                jmp   @B
  @@:
                mov   eax, ecx
                pop   ecx                   ; retrieve sign
                shr   ecx, 1
                jnc   @F
                neg   eax
  @@:
                ret
atol            endp


Where

.if al == "-"
    add  ecx, 1
    mov  al, [edx]
    add  edx, 1
.elseif al == "+"
    mov  al, [edx]
    add  edx, 1
.endif


should be [ HomeWork rule ! ]


.if al == "-"                 ; if not, the sign is ecx=0
       add ecx, 1
.endif

mov  al,[edx]
add  edx, 1
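
A note of caution on this shortened form: it appears to behave differently when the string carries no sign. The .if/.elseif version fetches a new character only after consuming "-" or "+", whereas the shortened form always fetches one and so seems to drop the first digit. A C sketch of both variants, for comparison only (all function names are hypothetical, and the space/tab skipping is omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Digit loop shared by both variants: sub al,"0" / jc exit / accumulate. */
static uint32_t atol_digits(uint8_t al, const char *p, int negative)
{
    uint32_t ecx = 0;
    while (al >= '0') {                       /* jc is taken when al < "0" */
        ecx = (uint32_t)(al - '0') + 10u * ecx;
        al  = (uint8_t)*p++;                  /* mov al,[edx] / add edx,1 */
    }
    return negative ? 0u - ecx : ecx;         /* neg eax when sign was "-" */
}

/* .if al == "-" ... .elseif al == "+" ... .endif */
static uint32_t atol_original(const char *p)
{
    assert(p != NULL);
    int negative = 0;
    uint8_t al = (uint8_t)*p++;
    if (al == '-')      { negative = 1; al = (uint8_t)*p++; }
    else if (al == '+') { al = (uint8_t)*p++; }
    return atol_digits(al, p, negative);
}

/* .if al == "-" / add ecx,1 / .endif, then an unconditional fetch */
static uint32_t atol_shortened(const char *p)
{
    assert(p != NULL);
    int negative = 0;
    uint8_t al = (uint8_t)*p++;
    if (al == '-') negative = 1;
    al = (uint8_t)*p++;                       /* discards the first char */
    return atol_digits(al, p, negative);
}
```

Under this model both variants agree on "-45" and "+45", but for "123" the original gives 123 and the shortened form gives 23.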

........................................

All best things to all of you
Stay well
Title: Re: M32Lib-ATODW
Post by: QvasiModo on May 30, 2005, 10:15:01 PM
RuiLoureiro, this could be the reason why removing the XOR EAX, EAX causes a slowdown:

Quote from: hutch-- on May 28, 2005, 01:53:50 AM
The "xor eax, eax" is to prevent a register stall in the following use of AL.

Cheers, :U
QvasiModo
Title: Re: M32Lib-ATODW
Post by: hutch-- on May 31, 2005, 12:51:53 AM
hmmmm,

Quote
Hi Hutch,
   How are you ? I hope you are fine.
      Yes, "xor   eax, eax" is needed. So, it must be there [no HomeWork rule]
      2D, as noted by dSouza123, is a bug. But if it has 2Dh (-), why not 2Bh (+) ?
I am guessing that when you come to our topics, many people want to see what you say. When we don't have your help, it is sometimes more difficult.
Thank you.

Thanks but I already knew about the Intel optimisation since they published it for the PIII many years ago. The only BUG in the algo is a user bug of trying to use an UNSIGNED algo for signed values, as posted before, use ATOL for signed values.

I am not sure of the point you are trying to make with comments about the forum rules but they are in place for a reason which is to protect our members from nonsense and this will not be changed. Keep it up and the posting WILL be changed.
Title: Re: M32Lib-ATODW
Post by: roticv on May 31, 2005, 03:17:25 AM
Normal people don't add + to a number if it is a positive number. It is rare that we add a plus sign in front of a number, and the only case I can think of is the oxidation number of an element.

On the other hand, negative sign tells us that the number is negative.

Therefore I think the cmp with plus sign is useless and slows down the code. Personally I don't like to see too many branches as it would slow down the code.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on May 31, 2005, 02:56:59 PM
Hi

Hutch,
               Thank you.

1. I would never comment on forum rules this way. It's a question of interpretation.
It isn't a comment about that rule. That rule in your topic is a rule (and I agree with it).
If I wanted to comment on it, I would go there to comment.
As you know, I said something there and I am not against the rule. I think this is the rule.

2. Sorry, if you have another interpretation.

3. What you are saying is that these 2 instructions

                        "cmp al, 2D"
                        "jne  proceed"

   are related to optimisation. It doesn't look like it, but maybe. I can
   say I don't know anything about optimisation questions.

4. Sorry, but I am not sure what this means: «Keep it up and the posting WILL be changed.»

QvasiModo,
                It seems.

Roticv,
          I don't agree. When we use it on a string typed at the keyboard, the user can type "+245...". Why not ?
I say the same as you: «Personally I don't like to see too many branches», but when we need to compare we should use the instructions unless we have another algo.

regards
Title: Re: M32Lib-ATODW
Post by: roticv on May 31, 2005, 03:49:38 PM
I think humans are by nature lazy. We would prefer to keep things short and simple (K.I.S.S). I really doubt anyone would put a plus sign in front of their numbers. That's probably what The Svin thought too when creating the routine. Anyway, if I remember correctly, the algorithm cannot handle certain numbers - I need to check that out.

Maybe there are other approaches to the routine. I will think about it (hopefully) and get back to it.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on June 01, 2005, 03:38:40 PM
Hmmmmmmm ?

Quote from: roticv on May 31, 2005, 03:49:38 PM
We would prefer to keep things short and simple (K.I.S.S).
Quote
           When somebody uses names like «K.I.S.S.» for "keep things short and simple", hmmmmmm ?
I think we should not use words that can have other meanings out of context. I know that it depends on interpretation, but ....
If anyone doesn't like a word I have used, I will edit my post to delete or correct it.

            About the algos, we can assume strings like db "   123" or
db "  +234"; db "  -456 567"; db "3456999999999999"; etc. Some are good for some and not for others.
I would like to see other approaches.
Title: Re: M32Lib-ATODW
Post by: roticv on June 01, 2005, 04:52:18 PM
Personally I feel that it should be 2 different routines: one routine to strip the spaces and stuff (since you really insist on it) and the other for the real part. This gives the programmer a choice, instead of forcing them to use more bloated code with features that he/she does not really need.
Title: Re: M32Lib-ATODW
Post by: hutch-- on June 01, 2005, 11:54:27 PM
Finally a conversion is a conversion and it is the responsibility of the programmer to point the right data at the algo. Stripping "+ or -" is actually a string function that has nothing to do with the conversion and it becomes a point of diminishing returns to keep adding junk to an algo because someone may point garbage at it. There will never be an idiot proof algo as there will always be a better idiot so rather than cater for the idiot level, design whatever you need to ensure the conversion algos get the correct format data.

Now the library has 2 versions of atodw, the original and the extended version which is table based, and if you need to convert a signed value, there is the atol algo, so there is enough capacity to do any of these conversions.


atodw     original algo
atodw_ex  extended version
atol      signed version


For any who have any accuracy doubts, the extended version is table driven so it is guaranteed accurate within the DWORD range.

The other factor is that the extended version is much faster than the others.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on June 02, 2005, 11:29:05 AM
Quote from: hutch-- on June 01, 2005, 11:54:27 PM
Now the library has 2 versions of atodw, the original and the extended version

atodw_ex  extended version


But I don't have this extended version atodw_ex in m32lib, Hutch.
Title: Re: M32Lib-ATODW
Post by: hutch-- on June 02, 2005, 11:36:31 AM
Get a current version, it can be done by getting the latest service pack from the forum web site. The algo was written about a year ago.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on June 02, 2005, 04:36:00 PM
Hutch,
          I think we need Unicode versions too, no ?
Title: Re: M32Lib-ATODW
Post by: hutch-- on June 03, 2005, 12:37:23 AM
Feel free to write it.
Title: Re: M32Lib-ATODW
Post by: RuiLoureiro on June 04, 2005, 12:05:01 PM
MASM question about INCLUDE

    Can we include a file FILE.INC, which has 3 sections .data, .data? and .code ( all or some ), in any place inside a main file MAIN.ASM ?

MAIN.ASM:

.data
     ...
.data?
     ...
.code
start:
     ...
;..................
       INCLUDE /.../FILE.INC
end         start



FILE.INC:

.data
     ...
.data?
     ...
.code
     ...

Title: Re: M32Lib-ATODW
Post by: Mark Jones on June 04, 2005, 01:39:11 PM
I believe so Rui, have you tried it?
Title: Re: M32Lib-ATODW
Post by: hutch-- on June 05, 2005, 01:17:28 AM
These types of questions are determined by writing a simple test piece. MASM supports changing the "SECTION" within code but it is the programmer's responsibility to ensure that they write the correct data or code in the correct sections.

Now for example if you had an include file that was written like,


.data
  item dd 0


and included it directly in the .CODE section, then you would get errors if there is code after the insertion point as you would be trying to write code in a data section. This would simply be a programming error.

As usual, try it out and if it goes BANG, you know the answer.
Title: Re: M32Lib-ATODW
Post by: Momoass on June 05, 2005, 09:21:11 AM
atodw                : 68 cycles
atodw_no_pushpop_edi : 63 cycles
atodw_no_xor_eaxeax  : 69 cycles


result on my SP4600+<1.83G>.