News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Divide by zero error deep in routine

Started by Proto, July 27, 2007, 10:46:13 AM

Previous topic - Next topic

Proto

I'm kinda newish to assembly and I was hoping that someone could help me with this. I made a program to calculate the prime numbers up to 2^32 by division (I know that there are faster methods but the idea was to show a friend how slow this method was - which soon became me trying to optimize it as best as possible). This is the generation procedure, the rest of the code is two api calls to requisition memory via globalalloc and globallock and a procedure for saving the data.

For some reason that I haven't been able to figure out, a divide by zero error takes place within the .REPEAT loop. It happens somewhere between testing 7FFFFFFFh and 8FFFFFFFh for primality. Based on an output file it should take place roughly .5 million primes after testing 7FFFFFFFh. (Output file is 410mb when ending at 7FFFFFFFh, divide by zero takes place at around 413mb of memory usage and primes are always stored in a full DWORD).

Any help, even unrelated to the error - such as how to do something faster / more efficiently or just properly would be appreciated. And if theres a proper methodology on how to debug these sorts of problems (I've tried ollydbg but it slowed it down massively, and by the time it did get to divisor = 0 I couldn't figure out exactly what had caused it).

Quote
exp proc


mov ebx, pMemory            ;start initial data entry, pMemory = pointer to memory location
mov [ebx], dword ptr 1h
add ebx, 4h
mov [ebx], dword ptr 2h
add ebx, 4h
mov [ebx], dword ptr 3h
add ebx, 4h
mov [ebx], dword ptr 5h     ;end initial data entry


add ebx, 4h                 
mov writeloc, ebx           ;move next possible writeloc into writeloc
push writeloc
                           

FLDCW FPSTATE       ;load FP control word with round to 1 for sqrt function
mov esi, 5h         ;move last prime into esi (might as well be an immediate in this case)

add pMemory, 4h     ;move pointer to dword 2h so that next add moves it to 3h


restart:      



mov ebx, pMemory    ;return divisor pointer to point at the first divisor, which = 3

cmp esi, 0FFFFFFFFh ;check if prime candidate is last acceptable 2^32 number
je endsave          ;if so, end and save output

add esi, 2h         ;increment prime candidate


push esi
fild  DWORD PTR[esp]       
fsqrt                   ;FP square root,
fistp DWORD PTR[esp]
pop   edi


.REPEAT     ;check prime loop
         

      add ebx, 4h         ;move forward to next divisor
      mov ecx, DWORD PTR [ebx]      ;move divisor into ecx
     
      mov eax, DWORD PTR esi        ;esi = current prime candidate
      xor edx, edx        ;clear remainder/modulo register
      div dword ptr ecx   ;divide
     
      cmp edx, 0h         ;check if remainder = 0
      je restart          ;next prime candidate
     


.UNTIL ecx > edi        ;until ecx > sqrt(prime candidate)


isaprime:


pop edx                    ;writeloc pop'd into ebx
mov [edx], dword ptr ESI       ;save checked prime
add edx, 4h
push edx                   ;push writeloc onto stack


jmp restart




endsave:                ;prepares for save

pop edx                 ;writeloc pop'd into ebx
sub pMemory, 4h         ;remove bias
sub edx, 4h             
sub edx, pMemory        ;writeloc - initial memory pointer = number of bytes written
mov SizeReadWrite, edx


ret
exp endp

Tedd

My first guess (without checking your code thoroughly) would be that it's caused due to the representation of negative numbers by 2's-complement. Which basically means that 7FFFFFFFh+1 and 'above' are negative.
Anyway, using the fpu is slow, and unnecessary for this. (But at least you're only checking up to the square-root :wink)

A completely integer based method (I wrote this quite a while ago :lol):
prime proc eks:DWORD    ;return true if prime, false if not
    push esi
    mov esi,pList
    add esi,4
  @@:
    mov eax,[esi]
    mov ecx,eks
    mul eax
    cmp eax,ecx
    ja @fin     ;when all divisors (upto sqrt) have been checked it must be prime

    mov eax,ecx
    xor edx,edx
    div DWORD PTR [esi]
    test edx,edx
    jz @fail    ;if no remainder then is whole divisor - so eks not prime!
    add esi,4
    jmp @B
  @fin:
    pop esi
    mov eax,TRUE
    ret
  @fail:
    pop esi
    mov eax,FALSE
    ret
prime endp

pList is a pointer to the list of currently found primes (first element is 2) -- which we use as the list of numbers to divide with (e.g. there's no point checking with any even numbers as dividing by 2 already covers them; you only need to check against any primes up to the sqrt of your candidate.)
No snowflake in an avalanche feels responsible.

Proto

I used the FPU square root function because its supposed to be the fastest method. I searched forum topics and found someone else's testing of different square root algorithms and the fastest was using the FP sqrt at about 26 cycles (including moving the data back into the x86 registers). Squaring the divisor seemed like a slower way since MUL is 3 cycles and the amount of divisors will quickly increase past 26/3.


zooba

The FPU square root function is fine in this situation and your logic is sound.

The divide by zero I believe will be what Tedd said. However, the DIV instruction is not at fault here (it normally is, since there are different instructions for signed and unsigned). I think the issue is that loading 80000000h into the FPU will translate into a negative number. Taking the square root of a negative number is not possible in the real number set and either the floating point exception thrown here is misinterpreted at a divide by zero or the number zero is returned.

The only workaround AFAIK is to zero-extend the original integer into a QWORD. This will ensure that the number is not loaded as a negative number. (It also sets you up to go further than 2^32 numbers (if you have the time :wink )).

Cheers,

Zooba :U

raymond

When loading an integer to the FPU, it will ALWAYS  be considered as a SIGNED integer. The FPU will then consider extracting the square root of a negative number as an invalid oeration and replace the value in the FPU register with a NAN (Not-A-Number). When trying to store such a NAN as an integer, this is also considered as an invalid operation and a value of 80000000h will be stored at a dword destination address.

You were then transfering that value into EDI, using it as the limit for increasing ECX. By increasing the pointer into the primes' memory area, you were eventually loading a 0 into ECX and use it as the divisor. :red

Actually, you don't need the square root nor the square to terminate the loop. Simply terminate it when the divisor exceeds the quotient.
;assuming N=the number being tested and ESI=pointer to the prime list
@@:
   mov  eax,N
   xor  edx,edx
   div  dword ptr[esi]
   test edx,edx
   jz   NOT_prime
   cmp  eax,[esi]
   jb   IS_prime
   add  esi,4
   jmp  @B

Raymond

N.B. Using the FPU without proper knowledge can create problems. As a starter, you may want to have a peak at the following if you are not yet aware of its existence.

http://www.ray.masmcode.com/tutorial/index.html
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

Proto

Thanks for the replies. Looks like I have some reading to do (thanks for the link too).

I've already tried terminating when divisor>n/3, since no even number other than two is a prime and the first divisor is 3 but it was very slow - taking about 4 hours to hit 10mb of memory usage (by comparison its about 8 minutes till 70mb of usage for terminating when divisor>sqrt(n)).