hi all,
i don't know if this is the right place for this post (if not, please move it to the right location)
i have found an unexpected behavior (at least for me) in the function a2dw in masm32.lib from Masm V10
local msg[128]:BYTE
invoke GetDlgItemText,hWnd,IDC_INPUT,offset Buff,128 ; input here = "12345"
invoke a2dw,offset Buff
invoke wsprintf,addr msg,s("Output = %d"),eax ; eax (returned from a2dw) should be 3039h (12345 decimal) but is 7C80EEAFh (2088824495 decimal)
invoke SetDlgItemText,hWnd,IDC_OUTPUT,addr msg ; output should show 12345 but shows 2088824495
the solution is simple
a2dw.asm
a2dw proc uses ecx edi edx esi String:DWORD
;----------------------------------------
; Convert decimal string into dword value
; return value in eax
;----------------------------------------
; xor ecx, ecx ; original position: too early, the lstrlen call below can overwrite ecx
mov edi, String
invoke lstrlen, String
xor ecx, ecx ; ecx must be zeroed here, after the call
.while eax != 0
xor edx, edx
mov dl, byte ptr [edi]
sub dl, "0" ; subtrack each digit with "0" to convert it to hex value
mov esi, eax
dec esi
push eax
mov eax, edx
push ebx
mov ebx, 10
.while esi > 0
mul ebx
dec esi
.endw
pop ebx
add ecx, eax
pop eax
inc edi
dec eax
.endw
mov eax, ecx
ret
a2dw endp
i don't know if anyone has spotted this already, but i have rebuilt masm32.lib with this change:
https://www.box.com/s/48df5247de73bc82dade
bye all
5k3l3t0r
Interesting. Although we use atodw, not a2dw normally. atodw works as expected.
I am sure Hutch has more to say on this one.
Strangely enough, I think it's one of Iczelion's antiques.
this may be a clue - lol
sub dl, "0" ; subtrack each digit with "0" to convert it to hex value
the problem could also be fixed by using szLen instead of lstrlen :P
maybe it did, originally - assuming that it worked at some point
hi,
the problem is here:
add ecx, eax ; the first pass assumes ecx = 0. the code sets ecx to 0 and then calls lstrlen, and after that call ecx isn't 0 anymore
(i expected lstrlen to restore all registers except eax, but it doesn't), so xor ecx,ecx after the call to lstrlen solves the problem.
bye all
5k3l3t0r
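For readers who want to see the effect outside of MASM, here is a minimal C sketch of the same loop. It is only a model: a2dw_model and its stale parameter are made-up names, and stale stands in for whatever lstrlen happens to leave in ecx (Win32 API functions are allowed to overwrite eax, ecx and edx). The value 7C80BE76h is an assumption, chosen because it is 7C80EEAFh minus 3039h and so reproduces the result reported in the first post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the a2dw loop (not the library source).
   `stale` plays the role of ecx right after the lstrlen call:
   pass 0 to model the fixed routine, anything else to model the bug. */
static uint32_t a2dw_model(const char *s, uint32_t stale)
{
    assert(s != NULL);
    uint32_t len = (uint32_t)strlen(s);          /* invoke lstrlen, String */
    uint32_t acc = stale;                        /* ecx at loop entry      */

    while (len != 0) {                           /* .while eax != 0        */
        uint32_t term = (uint8_t)(*s - '0');     /* sub dl, "0"            */
        for (uint32_t i = len - 1; i > 0; --i)   /* .while esi > 0         */
            term *= 10;                          /* mul ebx                */
        acc += term;                             /* add ecx, eax           */
        ++s;                                     /* inc edi                */
        --len;                                   /* dec eax                */
    }
    return acc;                                  /* mov eax, ecx           */
}
```

Called as a2dw_model("12345", 0) the model returns 12345; called as a2dw_model("12345", 0x7C80BE76u) it returns 2088824495, the same wrong value shown in the dialog.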
i get that :P
szLen does not use the ECX register
it is specially designed for use in macros where only EAX may be modified
not as fast as StrLen - but probably faster than lstrlen (showing some faith in Hutch, there :bg )
szLen "heals" it, indeed. However, it remains comparatively slow.
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
307 cycles for a2dw, res= 2101157844
271 cycles for a2dw2, res= 12345678
37 cycles for atodw, res= 12345678
112 cycles for MB Val, res= 12345678
315 cycles for a2dw, res= 2101157844
290 cycles for a2dw2, res= 12345678
37 cycles for atodw, res= 12345678
105 cycles for MB Val, res= 12345678
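Part of the gap in these timings probably comes from the structure of the a2dw loop: it rebuilds the power of ten for every digit, so an n-digit string costs n(n-1)/2 multiplications (28 for "12345678") on top of the string-length call. The C sketch below shows the usual single-pass alternative (Horner's rule), which needs one multiplication per digit and no length call. It is an illustration only; dec_to_u32 is a made-up name and this is not the atodw source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-pass decimal converter (Horner's rule).
   Stops at the first non-digit; no overflow checking. */
static uint32_t dec_to_u32(const char *s)
{
    assert(s != 0);
    uint32_t value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (uint32_t)(*s - '0');  /* shift left one decimal place, add digit */
        ++s;
    }
    return value;
}
```

Because the accumulator is a local that is initialised right before the loop, there is also no register that a helper call could overwrite along the way.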
Val() is a bit slower because it swallows all kinds of valid inputs, including hex, float and binary strings, e.g.
MovVal f:xmm0, Chr$("12345678.9012345678")
Print Str$("Xmm0=%Gf", f:xmm0) ; Xmm0=12345678.90123457
:bg
Have a look at "atodw_ex", I AM guilty of that one. :P
Quote from: hutch-- on May 19, 2012, 08:27:31 PM
:bg
Have a look at "atodw_ex", I AM guilty of that one. :P
That must be about the physical limit, Hutch :bg
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
37 cycles for atodw, res= 12345678
35 cycles 4 atodw_ex, res= 12345678
273 cycles for a2dw2, res= 12345678
112 cycles for MB Val, res= 12345678
37 cycles for atodw, res= 12345678
35 cycles 4 atodw_ex, res= 12345678
260 cycles for a2dw2, res= 12345678
105 cycles for MB Val, res= 12345678
:bg
Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz (SSE4)
33 cycles for atodw, res= 12345678
34 cycles 4 atodw_ex, res= 12345678
249 cycles for a2dw2, res= 12345678
131 cycles for MB Val, res= 12345678
33 cycles for atodw, res= 12345678
34 cycles 4 atodw_ex, res= 12345678
244 cycles for a2dw2, res= 12345678
131 cycles for MB Val, res= 12345678
33 cycles for atodw, res= 12345678
34 cycles 4 atodw_ex, res= 12345678
229 cycles for a2dw2, res= 12345678
131 cycles for MB Val, res= 12345678
--- ok ---
not bad for an old Aussie guy :bdg
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
59 cycles for atodw, res= 12345678
52 cycles 4 atodw_ex, res= 12345678
327 cycles for a2dw2, res= 12345678
277 cycles for MB Val, res= 12345678
59 cycles for atodw, res= 12345678
52 cycles 4 atodw_ex, res= 12345678
326 cycles for a2dw2, res= 12345678
282 cycles for MB Val, res= 12345678
59 cycles for atodw, res= 12345678
52 cycles 4 atodw_ex, res= 12345678
325 cycles for a2dw2, res= 12345678
277 cycles for MB Val, res= 12345678
AMD doesn't like MB, sorry jj
AMD Phenom(tm) II X6 1100T Processor (SSE3)
30 cycles for atodw, res= 12345678
30 cycles 4 atodw_ex, res= 12345678
177 cycles for a2dw2, res= 12345678
168 cycles for MB Val, res= 12345678
31 cycles for atodw, res= 12345678
30 cycles 4 atodw_ex, res= 12345678
176 cycles for a2dw2, res= 12345678
167 cycles for MB Val, res= 12345678
31 cycles for atodw, res= 12345678
30 cycles 4 atodw_ex, res= 12345678
172 cycles for a2dw2, res= 12345678
169 cycles for MB Val, res= 12345678
Quote from: sinsi on May 20, 2012, 07:07:32 AM
AMD doesn't like MB, sorry jj
It's the other way round, sinsi
.if IsAmd
invoke Sleep, 1
.endif
:green2
It should not be hard to tell that the algo was developed on a PIV (a Prescott core, as a matter of fact). It seems to perform reasonably well for a non-XMM algo and seems to work OK on most hardware.