pi program to calculate millions of digits fast

Started by Mark_Larson, October 11, 2006, 03:42:47 PM

Mark_Larson


  I mentioned working on a PI program in the "Assembler is Irrelevent" topic posted by Mant.  It can calculate PI to 1 million digits in 2.82 seconds on my 2.2 GHz Athlon 64.  I posted the program on the pi-hacks newsgroup about a month ago.  I wrote the bulk of the code in C, with the parts that needed to be optimized written in assembler.

  If you want to play with it, you can download it from the following link.  The name of the file is chud.zip.

http://groups.yahoo.com/group/pi-hacks/files/

  have fun
BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
http://www.website.masmforum.com/mark/index.htm

Mark_Larson

  Here's the attached program.

EDIT: modified the .zip due to a bug in downloading it from pi-hacks yahoo newsgroup. 

[attachment deleted by admin]

Wistrik

Thanks. I ran Chudp4 and this was its output:

Processor Name:               Intel(R) Pentium(R) 4 CPU 2.40GHz
Processor Speed: 2400 MHz
Number of processors: 1
#terms=7, depth=4
.......
total   time =  0.000
pi =
0.3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117068e1


If I specify a command line option it crashes with a Windows error report request. If I don't, it only prints PI out to what I've shown above and then stops. At least it displays some digits; the K7 version displays nothing after total time.

Mark_Larson


Some explanations.  I should have made the code nicer: currently it outputs PI to the console window instead of a file, and it doesn't print any help information when you run it.

chudk7 is for AMD K7 processors and up.  If you run it on an Intel processor it won't work, since it uses AMD-specific instructions.

And vice versa for the chudp4 version.  So run the appropriate version for the processor in your system.

You need to pass in one parameter to tell the program how many digits to compute:

chudp4 1048576 > pi.txt - will compute pi to 1 million digits and output the result to pi.txt
chudk7 1048576 > pi.txt - will compute pi to 1 million digits and output the result to pi.txt

If you don't specify anything on the command line, it defaults to computing 100 digits.
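
The `#terms=` and `depth=` values in the outputs quoted in this thread are consistent with a simple rule: one series term per roughly 14.18 digits, and a recursion depth of one more than the base-2 logarithm of the term count, rounded up. The Python sketch below reproduces those numbers under that assumption. It is a guess at the arithmetic, not the program's actual code, and the name `estimate_terms_and_depth` is hypothetical.

```python
import math

# Digits gained per Chudnovsky term, about 14.18 (a known property of the series).
DIGITS_PER_TERM = math.log10(640320**3 / (24 * 6 * 2 * 6))

def estimate_terms_and_depth(digits=100):
    """Guess the '#terms' and 'depth' values for a requested digit count.

    Assumption: terms = floor(digits / 14.18...), and depth counts the
    levels of a halving recursion with the root call as level 1.
    """
    terms = max(1, int(digits / DIGITS_PER_TERM))
    depth = math.ceil(math.log2(terms)) + 1
    return terms, depth

for d in (100, 1000000, 1048576):
    print(d, estimate_terms_and_depth(d))
```

With the default of 100 digits this gives 7 terms and depth 4, and for 1048576 digits it gives 73938 terms and depth 18.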

enjoy.

Wistrik

I figured out the difference in the files. My problem was that my work computer is a P4 but my home computer is an AMD64, so I was getting them confused.

Here's the header for the 1,000,000 digit output, just for performance reference:
Processor Name:               Intel(R) Pentium(R) 4 CPU 2.40GHz
Processor Speed: 2400 MHz
Number of processors: 1
#terms=73938, depth=18
..................................................
total   time =  4.578


Thanks for the help.

Ghirai

MASM32 Project/RadASM mirror - http://ghirai.com/hutch/mmi.html

Wistrik

I found a possible bug...

This was the output from CHUDK7 on my AMD at home:

Processor Name: AMD Athlon(tm) 64 Processor 3200+
Number of processors: 1
#terms=73938, depth=18
..................................................
total   time =  2.953


And this is the output (clipped) from CHUDP4 on my AMD at home:

Processor Name: AMD Athlon(tm) 64 Processor 3200+
Processor Speed: 2000 MHz
Number of processors: 1
#terms=73938, depth=18
..................................................
total   time =  4.063
pi =
0.3141592653589793238<snip>


EduardoS

I tried:

C:\pi>chudk7 1000000  > pi7.txt

C:\pi>chudp4 1000000  > pi4.txt

and got:
k7:

Processor Name: AMD Athlon(tm) 64 Processor 3200+
Number of processors: 1
#terms=70513, depth=18
..................................................
total   time =  2.812

p4:

Processor Name: AMD Athlon(tm) 64 Processor 3200+
Processor Speed: 2000 MHz
Number of processors: 1
#terms=70513, depth=18
..................................................
total   time =  3.812
pi =
0.3141592653589793<a lot>


The k7 version seems bugged..
Also, which algorithm did you use? --EDIT: The exe name suggests Chudnovsky...

Anyway, good job, SuperPi needs 45 seconds to do it here.

Mark_Larson

  I have an Athlon 64 at home and both the K7 and P4 versions print out the processor speed correctly.  I get the speed from the registry and round up.  It doesn't need to know the processor speed to run correctly.  I had planned on doing code that was CPU speed dependent, but I never added it.

K7


Processor Name: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+
Processor Speed: 2200 MHz
Number of processors: 2


P4


Processor Name: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+
Processor Speed: 2200 MHz
Number of processors: 2


  I use Chudnovsky to compute the value of PI.  I use binary splitting to calculate the Chud formula quickly.  Binary splitting is currently the fastest known method for computing Chud.  It was primarily created to speed up factorials (there are 3 factorials in the Chud formula).  Binary splitting uses a lot more memory than standard methods, but it makes up for it in speed.  Chud calculates about 14 digits of PI per iteration.  The "depth=" field printed out is how deep the binary splitting went (it's recursive).  If you want more detail let me know.
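
For anyone who wants to see the idea in a few lines, here is a minimal Python sketch of binary splitting applied to the Chudnovsky series. It is not the chud.zip code: that program is C plus assembler, while this relies on Python's built-in big integers. The names `bs` and `pi_scaled` and the `guard` parameter are made up for illustration.

```python
from math import isqrt

C3_OVER_24 = 640320**3 // 24  # exact: 640320**3 is divisible by 24

def bs(a, b):
    """Binary splitting over series terms a..b-1; returns integers (P, Q, T)."""
    if b - a == 1:
        if a == 0:
            p = q = 1
        else:
            # Ratio of consecutive terms, with the factorials cancelled out.
            p = (6 * a - 5) * (2 * a - 1) * (6 * a - 1)
            q = a**3 * C3_OVER_24
        t = p * (13591409 + 545140134 * a)
        return p, q, (-t if a & 1 else t)  # terms alternate in sign
    m = (a + b) // 2
    p1, q1, t1 = bs(a, m)   # left half
    p2, q2, t2 = bs(m, b)   # right half
    return p1 * p2, q1 * q2, q2 * t1 + p1 * t2

def pi_scaled(digits, guard=10):
    """Return pi * 10**digits truncated to an int; guard digits absorb rounding."""
    work = digits + guard
    terms = work // 14 + 1              # each term adds a bit more than 14 digits
    _, q, t = bs(0, terms)
    sqrt_10005 = isqrt(10005 * 10**(2 * work))
    return (q * 426880 * sqrt_10005) // t // 10**guard

print(pi_scaled(30))
```

The recursion only multiplies large integers of similar size, which is where fast multiplication routines pay off. The single division and square root happen once at the end.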

EduardoS

In the K7 version it also doesn't print the result.

And... what is the Chudnovsky formula? (Sorry for this stupid question... I couldn't find it on Google.)

Mark_Larson

Quote from: EduardoS on October 12, 2006, 03:16:32 PM
In the K7 version it also doesn't print the result.

  I think I figured out the problem on my way to work this morning.  The code has 2 parts: a C part and an asm part.  The asm part doesn't use any instructions newer than a K7.  However, I remember using a switch for the C compiler telling it the target was a K8 (I have a K8).  So I am guessing it is generating K8-specific code for the main part.  I don't have the code in front of me, so I won't be able to check until I get home.

Quote from: EduardoS on October 12, 2006, 03:16:32 PM
And... what is the Chudnovsky formula? (Sorry for this stupid question... I couldn't find it on Google.)

  A good place to check is MathWorld.  The webpage below has quite a large number of algorithms for computing PI.

http://mathworld.wolfram.com/PiFormulas.html

Search for "Chud".

You can also view the formula directly by going to this link:

    http://mathworld.wolfram.com/images/equations/PiFormulas/inline216.gif
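
In case the image link stops working, this is the Chudnovsky series as it is usually stated. It is written out here from the standard form, not transcribed from the linked image:

```latex
\frac{1}{\pi} = 12 \sum_{k=0}^{\infty}
  \frac{(-1)^k \, (6k)! \, (13591409 + 545140134\,k)}
       {(3k)! \, (k!)^3 \, 640320^{\,3k + 3/2}}
```

The three factorials are $(6k)!$, $(3k)!$ and $k!$, and each additional term contributes a little over 14 decimal digits.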

EduardoS

Quote from: Mark_Larson on October 12, 2006, 05:28:27 PM
I think I figured out the problem on my way to work this morning.  The code has 2 parts: a C part and an asm part.  The asm part doesn't use any instructions newer than a K7.  However, I remember using a switch for the C compiler telling it the target was a K8 (I have a K8).  So I am guessing it is generating K8-specific code for the main part.  I don't have the code in front of me, so I won't be able to check until I get home.

I'm using a K8 here, with SSE3, on x64 Windows...

Mark_Larson

Quote from: EduardoS on October 12, 2006, 07:12:30 PM
I'm using a K8 here, with SSE3, on x64 Windows...

  Ah, I don't have 64-bit Windows.  I wonder if that is the problem.  I didn't want to install it, because when I first got my Athlon it wasn't that stable and didn't have support for a number of things.  I had planned on adding a 64-bit Linux and doing dual boot.

Have you had any issues running 64-bit Windows?

Any issues with any 32-bit programs you compile and then run?

I used GCC for the compiler since I feel it does a better job optimizing for the C part of the code.

Wistrik, do you also have 64-bit Windows?

EduardoS

I tried XP x64 and now Vista x64. The problems:
- 16-bit programs simply don't work;
- BIG compatibility problems with drivers and programs which depend on those drivers (mostly on Vista);
- Debuggers don't work well.

Everything else goes well...