News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Dave's nickname hijacked??

Started by raymond, August 12, 2011, 03:04:56 AM

Previous topic - Next topic

FORTRANS

Quote from: dedndave on August 16, 2011, 01:53:05 PM
that's because AAA works by using the AC and CF flags, i think
XOR clears them   :P

Hi,

   That would explain it. I wonder why they did that?  (Rhetorical.)
Reading a text book explains it.  Must have saved some transistors.
I should have known that.

Regards,

Steve

dedndave

        ADD     AL,[DI] ; If there is any existing data.
        MOV     AH,0
        AAA             ; Adjust after addition.

:bg

actually, the fact that AAA, DAA, AAS, DAS use the flags makes them usable in "trick" schemes

clive

#62
An improved attempt, Phenom II X4 2.8 GHz, in C
     10000 6237204500        8 ms
    100000 3031782500       95 ms
   1000000 4077562500     1142 ms
  10000000 4535362500    13411 ms
 100000000 9113362500   154577 ms


2nd run
          10000 6237204500        7 ms
         100000 3031782500       94 ms
        1000000 4077562500     1150 ms
       10000000 4535362500    13472 ms
      100000000 9113362500   154893 ms
     1000000000 4893362500  1751345 ms
    10000000000 2693362500 19500530 ms
It could be a random act of randomness. Those happen a lot as well.

FORTRANS

Hi,

   Woke up too early Monday morning, and thought up a way
to speed things up (maybe).  After coding up a few disasters,
I convinced myself it wouldn't work.  Then woke up too early
again today and decided it should work after all.  After undisastering
the thing and adding some timing code I tried it out.  Yipee!
Another insomnia event put to rest.

   The after and before results follow.  The old code was timed
using the TIME command in a BATch file.  Code sizes in *.EXE
thrown in for good luck.  Even though they are not strictly
comparable.

Regards,

Steve N.


Calculating sum of series S N**N N = 1, 1000.
Glorious second version.

18 Aug 2011  8:47:24

0000000000 0000000001 0000000000 0000000000 0000000000 9????????0

  8:47:24

Calculating sum of series S N**N N = 1, 10,000.
Glorious second version.

18 Aug 2011  8:47:29

0000000000 0000000001 0000000000 0000000000 0000000000 6237204500

  8:47:30

Current time is:  8:53:02.38

Calculating sum of series S N**N N = 1, 1000.
0000000000 0000001000 0000000000 9????????0

Current time is:  8:53:04.17

Calculating sum of series S N**N N = 1, 10,000.
0000000000 0000010000 0000000000 6237204500

Current time is:  8:56:22.80

RTC      EXE      1225   1-07-10  3:10p
SUMS_EX  EXE      1376   8-15-11  8:32a
SUMS_EX2 EXE      1856   8-18-11  8:47a

dioxin

Improved attempt:
N= 10,000,000,000              time=3,638,023ms Answer = 2693362500
It's a slightly improved algorithm to the one above and then threaded to use all 4 cores of the Phenom II x4 3GHz cpu.

There is actually a huge short cut that can be made for the big numbers. Once you get to 1,000,000 things repeat so you can just calculate for n=1,000,000 and then a simple multiply and add will cover any  number of whole millions beyond that.
e.g. 1,000,000 gives  4,077,562,500
The bit that repeats = 3,384,200,000 so:
10,000,000 =  4,077,562,500 + 9*3,384,200,000 = 34,535,362,500
100,000,000 =  4,077,562,500 + 99*3,384,200,000 = 339,113,362,500
1,000,000,000 =  4,077,562,500 + 999*3,384,200,000 = 3,384,893,362,500
10,000,000,000 =  4,077,562,500 + 9,999*3,384,200,000 = 33,842,693,362,500
100,000,000,000 =  4,077,562,500 + 99,999*3,384,200,000 = 338,420,693,362,500
At this point the 10 digits stay fixed if we increase the range by 10x more.

Paul.

dedndave

i noticed that phenomenon earlier

http://www.masm32.com/board/index.php?topic=17223.msg144280#msg144280

with a little thought, i believe it could be the basis of a greatly simplified and faster algorithm   :U

start small - extract a digit - multiply up
once you have the minimum amount of required information, you are done   :bg

dedndave

we could be on the verge of discovering a new algebraic law   :U

raymond

Computing the modulo with a modulus exceeding 32 bits using 32-bit registers is not immediately evident. It would actually be impossible if such a modulus was itself a prime or had a prime factor exceeding 32 bits. Nevertheless, my algo now seems to be working properly. My timing for the 1000000 limit is 1050 ms (about 4x slower than using a modulus smaller than 32 bits). That timing should be the same for finding the last 10 up to the last 14 digits which would not involve products exceeding 96 bits.

QuoteThere is actually a huge short cut that can be made for the big numbers.

That may be true as long as the limit is a power of 10. What if the given limit is a random number?
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

glad to see you back, Ray   :U

FORTRANS

#69
Hi,

   Oh goodie.  I modified my program and tried 100,000 and
1,000,000.  100,000 seems okay, 1,000,000 is not.

Regards,

Steve N.

Edit:  Got it working.  It was some extra code to prevent
an error that wasn't modified correctly with the last upgrade.
If it had been left out, there would not have been tne
incorrect result.  Got to love it.

SRN

dioxin

Raymond,
QuoteThat may be true as long as the limit is a power of 10. What if the given limit is a random number?
Do the first million, do the next million and multiply by the number of whole millions needed then do the small number of digits left over and add the three values.

Paul.