I am trying to use the Bisection Method for computing the square root of a 32-bit integer
This is not for homework, I am just trying to make a simple calculator, and I wanted to try adding the sqrt function to it.
As it is right now, I have gotten lost in my code of many pops and pushes.
If you guys could help me with writing this or just come up with something your self that would be awesome thanks :D
the easy way is to use the FPU
load integer, sqrt, store integer
but, if you want to write a "discrete" routine, this might be helpful...
http://www.masm32.com/board/index.php?topic=12116.msg92595#msg92595
This converts to assembler quite easily
// Jack Crenshaw's Integer Square Root
unsigned long integer_sqrt(unsigned long a)
{
unsigned long rem = 0;
unsigned long root = 0;
unsigned long divisor = 0;
int i;
for(i=0; i<16; i++) // 32-bit, 2 at a time
{
root <<= 1;
rem = ((rem << 2) + (a >> 30));
a <<= 2;
divisor = (root << 1) + 1;
if (divisor <= rem)
{
rem -= divisor;
root++;
}
}
return(root);
}
thanks guys
but clive a few things in your c++ code
what does
root <<= 1
rem = ((rem << 2) + (a >> 30)
a <<= 2
divisor = (root << 1) + 1
what do the use of the '<<' and '>>' mean in these i know they are used for input and output but i have never seen them used in this way...:) (learning c++ later this year)
shift left by 1, shift right by 1
Should look roughly like this and assembles fine but it always returns zero ::)
What does a <<= 2 mean??
include \masm32\include\masm32rt.inc
.code
start:
a equ eax
rem equ edx
root equ esi
divisor equ edi
i equ ecx
mov a, 12345678
call integer_sqrt
inkey str$(root), " returned"
exit
; unsigned long integer_sqrt(unsigned long a)
;{
; unsigned long rem = 0;
; unsigned long root = 0;
; unsigned long divisor = 0;
; int i;
integer_sqrt proc
xor rem, rem
xor root, root
xor divisor, divisor
xor i, i
.Repeat
shl root, 1 ; root <<= 1
shl rem, 2 ; rem = ((rem << 2) + (a >> 30))
shr a, 30
add rem, a
shl a, 2 ; a <<= 2 ???
shl root, 1
mov divisor, root
inc divisor
.if divisor<rem
sub rem, divisor ; rem -= divisor
inc root ; root++
.endif
inc i
.Until i>=16
ret
integer_sqrt endp
end start
QuoteWhat does a <<= 2 mean
equal to: a= a << 2;
a = a SHL 2
Quote from: Number1awa
but clive a few things in your c++ code
Actually it's pure C, ANSI rather than K&R. The << and >> are shift left and shift right respectively. With an equal sign they become assignments. root <<= 1, can be written in the longer form root = root << 1;
I can't say converting it to x86 has much point given the easy availability of float/XMM instructions to achieve better performance, and better fractional precision. When using it on an ARM or MIPS, one might also consider suitable scaling of the inputs and outputs to achieve the desired range.
Quote from: clive on February 05, 2011, 07:01:34 PM
I can't say converting it to x86 has much point given the easy availability of float/XMM instructions to achieve better performance, and better fractional precision.
It's hard to say. It could be someone trying to learn assembler and grasping at straws looking for good projects that are sufficiently rigorous to help foster understanding. I concur with you and dave, it's just better to use the sqrt functionality of the FPU instead of cooking up your own (unless you want FPU emulation, and there's been more than enough effort already done on that).
Still, just for the fun of it, any idea why my code above doesn't work? I have never used C, so it's probably something very trivial.
The error could be here...:
.if divisor<rem
if 1
mov rem, divisor
neg rem
else
sub rem, divisor ; rem -= divisor
endif
inc root ; root++
.endif
::)
I got this version working. Google, by the way, suggests this interesting link (http://www.masm32.com/board/index.php?topic=10070.0) which seems to be based on Listing 4 (http://www.embedded-systems.com/98/9802fe2.htm) :bg
include \masm32\include\masm32rt.inc
.code
start: push 152399025
call integer_sqrt
inkey str$(eax), " returned"
exit
integer_sqrt proc
root equ eax
tmp equ ebx
i equ ecx
rem equ edx
number equ dword ptr [esp+4]
xor rem, rem
xor root, root
xor i, i
.Repeat
shl root, 1 ; root <<= 1
mov tmp, number
shl rem, 2 ; rem = ((rem << 2) + (a >> 30))
shr tmp, 30
add rem, tmp
shl number, 2 ; a <<= 2 ???
.if root<rem
inc root
sub rem, root ; rem -= root
inc root
.endif
inc i
.Until i>=16
shr root, 1
ret 4
integer_sqrt endp
end start
unsigned long asm_sqrt(unsigned long a)
{
unsigned long root;
__asm
{
xor edx,edx ; rem
mov eax,a
xor ebx,ebx ; root
mov ecx,16 ; i, iterations
lp:
shl ebx,1 ; root <<= 1
add eax,eax ; rem = ((rem << 2) + (a >> 30)); a <<= 2;
adc edx,edx
add eax,eax
adc edx,edx
lea edi,[ebx*2+1] ; divisor = (root << 1) + 1;
cmp edx,edi ; if (divisor <= rem)
jc ovr
sub edx,edi ; rem -= divisor;
inc ebx ; root++
ovr:
dec ecx ; i--
jnz lp
mov root,ebx
}
return(root);
}
Thanks, Clive, with LEA codesize is down to 40 bytes. But it's still not competitive with the FPU...
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
223 cycles for Jack C, result 12345
64 cycles for the FPU, result 12345
62 cycles for MasmBasic, result 12345
A more sensible implimentation might build a table rather than just calculate once fast.... If used for a calculator no-one is going to notice the difference between 64 and 223 cycles on a button click and you want the best legible code :lol....
For an implementation that has to calculate the sqrt a lot you would want to generate a 256Kb table? in as few cycles as possible and then search table in far fewer than 60 cycles for result?....
Just my 2 cents worth :lol I may be far out, Maths/French is not my Fortay, but this is an interesting calculation I've not done before :bg
Thanks guys i like jj2007 version the best :)
it helped alot :)
Quote from: jj2007 on February 05, 2011, 10:54:49 PM
Thanks, Clive, with LEA codesize is down to 40 bytes. But it's still not competitive with the FPU...
Indeed, something I'd use on non-FPU implementations (68K, ARM, MIPS). Although the way it's described a hardware implementation would be reasonably viable if you could spare the gates.
As for optimization, the best one is to avoid the SQRT completely, especially if you are comparing/sorting based on the result.
So I found another way to do this which I wanted to try: implementing the
http://en.wikipedia.org/wiki/Bisection_method (http://en.wikipedia.org/wiki/Bisection_method) I followed the pseudo-code found at the bottom of the page except for the break condition i used k*k <= N && (k+1)(k+1) > N
When i attempt to run it. It just runs forever without ever breaking from the loop
This is the code that I have
INCLUDE irvine32.inc
.data
left EQU ebx ; left interval
right EQU eax ; right interval
midpoint EQU ecx ; midpoint
firstIfP EQU esi ; first product in checking
secondIfP EQU edi ; second product in checking
loopP1 EQU edx ; UNTIL condition holders
loopP2 EQU esp ; ........................
N DWORD ? ; holds root squared
.code
main PROC
; RESETING REGISTERS
sub eax, eax
sub ebx, ebx
sub ecx, ecx
sub edx, edx
sub esp, esp
sub esi, esi
sub edi, edi
mov N, 400
mov right, 400 ; right interval ( to take root of
mov left, -10 ; left interval
.REPEAT
push right ; saves right interval
add right, left ; (b + a)
shr right, 1 ; / 2
mov midpoint, right ; the midpoint
pop right
push left ; saves left interval
imul left, midpoint ; left * midpoint
mov firstIfP, left
pop left
.IF firstIfP < 0 ; left * midpoint < 0
mov right, midpoint ; new right interval
.ENDIF
push right ; saves right interval
imul right, midpoint ; right * midpoint
mov secondIfP, right
pop right
.IF secondIfP < 0; ; right * midpoint < 0
mov left, midpoint ; new left interval
.ENDIF
mov loopP1, midpoint ; computes k*k < N
imul loopP1, loopP1
mov loopP2, midpoint ; computes (k+1)(k+1) > N
inc loopP2
imul loopP2, loopP2
.UNTIL (loopP1 <= N) && (loopP2 > N)
mov eax, midpoint ; moves root to eax
call WriteInt
exit
main ENDP
END main
Thanks :)
.IF firstIfP < 0
See The .if eax<0 trap (http://www.webalice.it/jj2006/Masm32_Tips_Tricks_and_Traps.htm)
I tried using
SDWORD PTR firstIfP
and when i try to run the program it still just stops working and i have to stop the program....
Your use of registers is, well, kind of unorthodox. Check what you did to esp :wink
The code below
should work but is still stuck in an endless loop. You will need Olly to find out why (I don't have the time, sorry).
Hint:
Quoteimul loopP1, loopP1
.Break .if Carry? ; lets you exit the endless loop when edx becomes negative
include \masm32\include\masm32rt.inc
.data
left EQU ebx ; left interval
right EQU eax ; right interval
midpoint EQU ecx ; midpoint
firstIfP EQU esi ; first product in checking
secondIfP EQU edi ; second product in checking
loopP1 EQU edx ; UNTIL condition holders
loopP2 EQU ebp ; ........................
N DWORD ? ; holds root squared
.code
start:
; RESETING REGISTERS
sub eax, eax
sub ebx, ebx
sub ecx, ecx
sub edx, edx
;sub esp, esp
sub esi, esi
sub edi, edi
mov N, 400
mov right, 400 ; right interval ( to take root of
mov left, -10 ; left interval
; int 3 ; Olly is your friend: http://ollydbg.de/
push ebp
.REPEAT
push right ; saves right interval
add right, left ; (b + a)
shr right, 1 ; / 2
mov midpoint, right ; the midpoint
pop right
push left ; saves left interval
imul left, midpoint ; left * midpoint
mov firstIfP, left
pop left
.IF sdword ptr firstIfP < 0 ; left * midpoint < 0
mov right, midpoint ; new right interval
.ENDIF
push right ; saves right interval
imul right, midpoint ; right * midpoint
mov secondIfP, right
pop right
.IF sdword ptr secondIfP < 0; ; right * midpoint < 0
mov left, midpoint ; new left interval
.ENDIF
mov loopP1, midpoint ; computes k*k < N
imul loopP1, loopP1
mov loopP2, midpoint ; computes (k+1)(k+1) > N
inc loopP2
imul loopP2, loopP2
.UNTIL (loopP1 <= N) && (loopP2 > N)
pop ebp
print str$(midpoint), " returned"
exit
end start
Ok, so I figured out why it is being an infinite loop mostly....
at the bottom i did this:
push eax
mov eax, midpoint
call WriteInt
pop eax
With the number to take the root of set at 400:
just to see what the midpoint was every time, and it prints out forever "200" the if you output the right interval it prints forever as "400" same for left interval printing "0"
right and left are not taking on the values of whatever they should be taking on and continuing with that value to the next iteration of the loop....
So...at least that is an error that i can hopefully work with now :D
Where in the above code is the functionality of the f(left), f(right), and f(midpoint) calls, as specified in the Wikipedia pseudo-code, implemented? Assuming that you are trying to find the square root of 400, this function should return (arg * arg - 400).
Oh....really....lol perhaps i should add that to the code....hehe
I surely do hope that that is the only thing that is missing....
ok so i just added this now.....which in all ways that i can see computes f(left) f(right) and f(midpoint) and it still does and infinite loop..... :(
push left ; saves left interval
push midpoint
imul left, left ; left*left - 400
sub left, 400
imul midpoint, midpoint ; midpoint*midpoint - 400
sub midpoint, 400
imul left, midpoint ; f(left) * f(midpoint)
mov firstIfP, left
pop midpoint
pop left
.IF SDWORD PTR firstIfP < 0 ; left * midpoint < 0
mov right, midpoint ; new right interval
.ENDIF
push right ; saves right interval
push midpoint
imul right, right ; right*right - 400
sub right, 400
imul midpoint, midpoint ; midpoint*midpoint - 400
sub midpoint, 400
imul right, midpoint ; f(right) * f(midpoint)
mov secondIfP, right
pop midpoint
pop right
.IF SDWORD PTR secondIfP < 0 ; right * midpoint < 0
mov left, midpoint ; new left interval
.ENDIF
Hold on guys I just figured it out!!!!
It works now!!!!!
Yay....that took so long to figure out lol
Thank you all could not have done it without you guys :)
The attachment is a small test app (done in C) that implements a workable integer version of the wiki pseudo-code.
In case the logic behind the f() function is not clear, here is a probably not very good explanation:
The bisection method is used to solve equations of the form f(x) = 0. To use your example value, if x is the square root of 400, we know that:
x2 = 400
So we algebraically manipulate this to the form:
x2 – 400 = 0
And the task is then to find the root of f(x) = x2 – 400