News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

UNSIGNED Integer square root problem

Started by cobold, November 26, 2008, 01:23:28 AM

Previous topic - Next topic

cobold

found on german site a routine to calculate sqare root UNSIGNED 32-bit:
implement it, but doesnot work,-- gives wrong result wiht carry????
help appreciated:

[
Sqrt_Cobold proc number:DWORD
    LOCAL i0,i1,i2,i3,i4,i5,k0,k1:DWORD

    mov k1,0
    mov i1,0
    mov i5,32
   
    mov i0,1                ;For i0 := 1 to 4 do Begin
ALIGN 4   
    .while i0 <= 4
        sub i5,8
        mov eax,number
        mov ecx,i5
        shr eax,cl
        and eax,0000000FFh
        mov i4,eax          ;i4 := (number shr i5) and $000000FF
        mov eax,i1
        shl eax,8
        or eax,i4
        mov i1,eax          ;i1 := (i1 shl 8) or i4
        mov eax,k1
        shl eax,5
        inc eax
        mov i2,eax          ;i2 := (k1 shl 5) + 1
        inc i0
    .endw
    mov k0,0
ALIGN 4
    .repeat
        mov eax,i1          ;i3 = i1 - i2
        sub eax,i2
        jc @F
        mov i3,eax
        .break .if i3 < 0
        mov eax,i3
        mov i1,eax      ;i1 = i3
        add i2,2        ;i2 = i2 + 2
        adc i2,0
        inc k0
    .until i3 < 0
@@:
    mov eax,k1              ;k1 := (k1 shl 4) + k0;
    shl eax,4
    adc eax,k0
    ret
Sqrt_Cobold endp   
/code]

raymond

I haven't analyzed the algo you posted to determine how it may work. However, if you are looking for an algo to extract the square root of unsigned 32-bit integers (and 64-bit unsigned integers) without using the FPU, you may want to look at the one(s) included in my MixLib package which you can find at:

http://www.ray.masmcode.com/fixmath.html

The source code(s) are included in the package. You can adapt the source code to return a truncated result instead of a rounded result by commenting out the very last "inc eax". Although the code may look long, part of it is repeated 16 times (for 32-bit integers) on the assumption that the timing would be improved by unrolling a loop.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

cobold

Raymond,

thanks for the reply and for mixlib!

I believe FSQRT and the IntSqrt of masm treat all numbers as signed, so you cannot calculate the square root of numbers > 2147483648 - is my assumption correct?

Anyway, I forget about my routine and take a look at mixlib now!

regards

Cobold

ecube

Your library looks nice Raymond, you seem to have put a lot of time into it. Is it for the most part safe? I tried to understand all you documentation but it seemed to go back and forth. And if so, have you considered talking to hutch about adding it to masm32? i'd imagine countless people could benefit from this.

raymond

I issued that library many years ago (I still had a Pentium III at the time :() before I even issued the Fpulib and SimplyFPU tutorial. Although the basic principles of "fixed point" math can be usefull in some applications, it has become more of a curiosity with the advent of the FPU/MMX/SSE technologies.

However, I am available to answer any question regarding this subject if someone doesn't understand some of the explanations given at the above link or in the Mixlib package.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

vanjast

thought I'd play with that sample...
what I can work out in 30 seconds - I might be wrong  :boohoo:

At the end of the While loop, you just end up with
I1 = number
I2 = 1
It actually does nuzzing except make I2 ??

Anyway off to the repeat loop, whcih at the end just
K0 = repeat loop Counter
K1 is intitilaised to zero and always remains there, yet it's used in the final calc ??
Result would be the repeat loop count (+1 if there is a 'stray carry' floating around )

It looks like the variables might be mixed up, or the author was smoking something
:bg


Mark_Larson

Quote from: cobold on November 26, 2008, 01:23:28 AM
found on german site a routine to calculate sqare root UNSIGNED 32-bit:
implement it, but doesnot work,-- gives wrong result wiht carry????
help appreciated:[\quote]

the floating point square root on a P4 and  up is extremely fast.  I recommend converting to floating point and then doing the square root and converting back.  Assuming of course you are looking for speed.

I think the p4 does the fsqrt in like 12 cycles.  So it's really fast now.  The P3 was like 45 cycles.  Most people don't realize that and try to do software implementations like yours.

BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
htttp://www.website.masmforum.com/mark/index.htm

MichaelW

Agner Fog shows 69 cycles for the P2/3 at the default 64-bit precision, with notes that the instruction is not pipelined, and that it will be faster at a lower precision. For the P4 he shows 43 cycles at the default precision, 30 at 53-bit precision, and 23 at 24-bit precision. I can't see any good way to test fsqrt in isolation, so I settled for testing:

fild xx
fsqrt
fistp yy


Running the code in the attachment on my P3, I get consistent results:

Control word PC field = 00000003h (default, 64 bits)
65 cycles
Control word PC field = 00000002h (53 bits)
54 cycles
Control word PC field = 00000000h (24 bits)
25 cycles


Running on a P4 the run to run variation will probably be relatively large.



[attachment deleted by admin]
eschew obfuscation