News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Factorial Question

Started by z941998, April 03, 2010, 04:15:43 AM

Previous topic - Next topic

z941998

I developed a small routine for factorials (not using the fpu).  It ran fine when I briefly tested it with a simple 5! test, it return 120.

I then went on to create procedures for Permutations and Combinations that called the factorial procedure and found that they were failing.  After analyzing this, I found that I was exceeding the X-86 registers 32 bit boundary, which means that I can only do factorial up to 12, the 13th fails.

I have needs to handle factorials up to a value of 150.  Is there a process or set of instructions that can be looked at to help me aliong?

I am aware of shortcuts for permutations and combinations, but even this has its limits.

Also I have/did see to assembly created calculators that handle what I need, but there is no source code and the exe's are averaging 1 meg.  So I know it can be done.

Any sugguestions?
Thanks
Steve

clive

Use multiple registers, and spill it to 64 bits.

Use a multiply macro or subroutine, or inline to do this, a quick google might pull one up.

Basically you want a mul64 function, either c64 = mul6432(a64, b32) or c64 = mult6464(a64, b64) type implementation. The former being computationally cheaper with a64/c64 being the accumulator and b32 the iterator.

-Clive

It could be a random act of randomness. Those happen a lot as well.

oex

Not a mathematical wizz here but if I understand what a factorial is correctly you wont be able to handle a factorial 150 with 64 bit math you would surely need big num lib?

1*2*3*4*5....*150=X.XXX 200ish
We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv

FORTRANS

Hi,

   Well 150! will need around 1,000 bits of storage (roughly
5.7e262) if exact results are wanted.  If approximate results
are okay, use the FPU.  Else write a 32-bit by 1,024 bit multiply.

Steve

dedndave

i am thinking ling long kai fang, here   :bg
thanks for the idea, guys   :U

z941998

Thanks everyone, now I know whats the parameters area, I will proceed onward.  Steve

raymond

Quote from: dedndave on April 03, 2010, 03:33:36 PM
i am thinking ling long kai fang, here :bg
thanks for the idea, guys :U

I once wrote an algo to compute 12499C2499 (i.e. 12499!/(10000!*2499!) and then find the sum of the digits. The algo ran in 9 ms on my 1.89GHz CoreDuo. See if you can beat that timing, Dave. :wink I'm sure you can.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

clive

Ok, for 256-bit wide I have the following,  can someone double check the big ones?
-Clive

1! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
2! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000002
3! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000006
4! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000018
5! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000078
6! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 000002D0
7! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 000013B0
8! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00009D80
9! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00058980
10! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00375F00
11! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 02611500
12! - 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1C8CFC00
13! - 00000000 00000000 00000000 00000000 00000000 00000000 00000001 7328CC00
14! - 00000000 00000000 00000000 00000000 00000000 00000000 00000014 4C3B2800
15! - 00000000 00000000 00000000 00000000 00000000 00000000 00000130 77775800
16! - 00000000 00000000 00000000 00000000 00000000 00000000 00001307 77758000
17! - 00000000 00000000 00000000 00000000 00000000 00000000 0001437E EECD8000
18! - 00000000 00000000 00000000 00000000 00000000 00000000 0016BEEC CA730000
19! - 00000000 00000000 00000000 00000000 00000000 00000000 01B02B93 06890000
20! - 00000000 00000000 00000000 00000000 00000000 00000000 21C3677C 82B40000
21! - 00000000 00000000 00000000 00000000 00000000 00000002 C5077D36 B8C40000
22! - 00000000 00000000 00000000 00000000 00000000 0000003C EEA4C2B3 E0D80000
23! - 00000000 00000000 00000000 00000000 00000000 00000579 70CD7E29 33680000
24! - 00000000 00000000 00000000 00000000 00000000 00008362 9343D3DC D1C00000
25! - 00000000 00000000 00000000 00000000 00000000 000CD4A0 619FB090 7BC00000
26! - 00000000 00000000 00000000 00000000 00000000 014D9849 EA37EEAC 91800000
27! - 00000000 00000000 00000000 00000000 00000000 232F0FCB B3E62C33 58800000
28! - 00000000 00000000 00000000 00000000 00000003 D925BA47 AD2CD59D AE000000
29! - 00000000 00000000 00000000 00000000 0000006F 99461A1E 9E1432DC B6000000
30! - 00000000 00000000 00000000 00000000 00000D13 F6370F96 865DF5DD 54000000
31! - 00000000 00000000 00000000 00000000 0001956A D0AAE33A 4560C5CD 2C000000
32! - 00000000 00000000 00000000 00000000 0032AD5A 155C6748 AC18B9A5 80000000
33! - 00000000 00000000 00000000 00000000 0688589C C0E9505E 2F2FEE55 80000000
34! - 00000000 00000000 00000000 00000000 DE1BC4D1 9EFCAC82 445DA75B 00000000
35! - 00000000 00000000 00000000 0000001E 5DCBE8A8 BC8B95CF 58CDE171 00000000
36! - 00000000 00000000 00000000 00000445 30ACB7BA 83A11128 7CF3B3E4 00000000
37! - 00000000 00000000 00000000 00009E00 08F68DF5 06477ADA 0F38FFF4 00000000
38! - 00000000 00000000 00000000 00177401 5499125E EE9C3C5E 4275FE38 00000000
39! - 00000000 00000000 00000000 0392AC33 E351CC76 59CD325C 1FF9BA88 00000000
40! - 00000000 00000000 00000000 8EEAE81B 84C7F27E 080FDE64 FF052540 00000000
41! - 00000000 00000000 00000016 E39F2C68 4405D62F 4A8A9E2C D7D2F740 00000000
42! - 00000000 00000000 000003C1 581D491B 28F523C2 3ABDF35B 689C9080 00000000
43! - 00000000 00000000 0000A179 CCEB478F E12D019F DDE7E05A 924C4580 00000000
44! - 00000000 00000000 001BC0EF 38704CBA B3BC477A 23DA8F91 251BF200 00000000
45! - 00000000 00000000 04E0EA0C EBBD7CD1 98189078 4D6B3C83 85E98A00 00000000
46! - 00000000 00000000 E06A0E52 5C0C6DA9 5469F59D E944DFA2 0FF6CC00 00000000
47! - 00000000 00000029 3378A11E E6482216 7F7417FD D3A50EC0 EE4F7400 00000000
48! - 00000000 000007B9 A69E35CB 2D866437 E5C47F97 AEF2C42C AEE5C000 00000000
49! - 00000000 00017A88 E4484BE3 B6B92EB2 FA9C6C08 7C778C8D 79F9C000 00000000
50! - 00000000 0049EEBC 961ED279 B02B1EF4 F28D19A8 4F5973A1 D2C78000 00000000
51! - 00000000 0EBA8F91 E823EE3E 18972ACC 521C1C87 CED2093C FDBE8000 00000000
52! - 00000002 FDE529A3 274C649C FEB4B180 ADB5CB96 02A9E063 8AB20000 00000000
53! - 0000009E 90719EC7 22D0D480 BB68BFA3 F6A3260E 8D2B749B B6DA0000 00000000
54! - 00002172 77F77E01 580CD327 88186C96 066A0711 C72A98D8 91FC0000 00000000
55! - 00072F97 C62C1249 EAC15D7E 3D3F543B 60C784D1 CA26D687 5D240000 00000000
56! - 01926933 59A4002B 5A4C739D 65DA6CFD 2BA50DE4 387EED9C 5FE00000 00000000
57! - 59996C6E F58409A7 1B05BE0B ADA2445E B7C017D0 9442E7D1 58E00000 00000000
58! - Overflow


Then I got kind of stupid and did this..

966! - A68EE6E9 058FFEEB 10D064C1 8D0D491F 05C1B2EA 4E4B7344 29B91999 3F91CD80 176361BD 3CB111BE B0B787BF 64FE4BDF 2571070D 8F3AB60A ABA3240D CF2B36DC D82B6142 94337D2D BEC97E56 A43A5705 B926EE24 A5C23B6D 0046EE39 35AC8AB9 FE810B98 95904C37 61587992 2F793D5E 2791F07C D309E5CB 05043FF9 5C22DAA4 543ADD56 D35DD8DC 8F27A71E 615987FC 43298FE0 23A9BA5B 3858CA92 5A3EAA91 FE673687 1107B226 A3A59380 C7355723 4C3DD100 C5236FD4 75791A52 C6619E5D 0339F3B5 4E1AB055 0130869F 690CC1D5 D948C448 BB47BB76 8518FB06 8398057A F06E5710 EE63EE68 63F30930 3E70423D CCEAEF8F 918F4CC3 82E9A4A0 0FAD2BA9 F8C34A07 E99DF061 9FC55959 A7ED7C85 CE2D7476 DDCAC680 4FA5ED52 1376E2B5 58DC5602 08D84D43 FA09D036 DC58FEC1 2E7C4C8A ABC80D77 E0AC7C7A 33315E5B 10CAAFE5 84E583D8 74253191 62B4DF03 7279407F EE5788FE FF695813 55016C42 427705F5 66CE53FD BD625E35 E04C3111 B34ED475 7A64A61C 36B6711E BB3D7A23 8F081D27 B8E87589 EABE6641 89FE5A74 58DFAFD3 CFAD8AA2 69B4841D 7B9BC017 554788EA 15862F7D 5B503B44 CD1BA5DE 000C961C 654F45BC C4FE7CA0 BC87B115 25056F50 361B57DA D207177C 5798FE9E 467F7DD6 BF363266 D2AFA62A 4F14A5B3 E3EA2B25 DDB48CD6 BFAB25EF EFC7DE1A 8066ABBF A8104956 E22D913F 1218F40B 0F94D5CA C77017A6 3EBB780F 7D9899CA B98BC74C 94D40078 27151E42 660E2ED3 AD0A2D01 2AE30DFF 8F5B2BC0 9D25DA62 2DA2F0BD F9084763 D5617774 E82595BB 0E986161 03C70EA9 6F0BC25F CB98CD2E 26D1B266 3FD57D91 140D7C2A 867B0F10 01603674 0B506A32 DE899D79 2E3D7AD5 7FC79FD6 83C3B13D B7C6562F AC7A34A8 2BF09318 EBE273B1 FBD78855 316F0BCC 0856827D 59745876 92ADA598 87212CCF 497D4283 E4DCEFF7 A098A855 AA88B2E2 1531CF6E 13DFA900 77656408 D8218BD7 DF982E2B 32362FB5 BFD8E532 5288F9B3 31F6F8A3 BB77234B FB971C00 D2C5626F 92133C10 524F786C E33559B3 275D86FF 8B249FC2 B5A1CA8E 8011210F F950142C 704C8E10 A4B730D6 15A07BE5 53D30A19 01E6166E ACE6FC0D 3F168668 E4887614 3A336563 A95E121E B5987420 4E45B135 F0188414 B4D62062 A35CAABF 27291CBC 1F06C17D 09420E5F DDD17229 DE968609 F009F6F0 A82247A2 4EB6585C BDA97A1C 445ABF30 7A685116 BD74FBDA 01185AE6 EB397CBF F11F5B32 774B51C6 599F43BD DCF8495F C3C44E43 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
967! - Overflow
It could be a random act of randomness. Those happen a lot as well.

z941998

I have been doing some review.

I found that you can use continuous memory locations for large numbers versus keeping everything in registers, I had forgotten about this fact.

I saw within the Masm Forum search area there is a proceedure (uArbitrary2Str) that displays large numbers and dedndave was involved here again.

I aslo found another routine that multiplies large numbers (16 dd's in length) called Mult003.asm.

Putting these two together meets my needs, but as Raymond has pointed out we need a library that can handle anything, which I am still in the process of building.

It might take me a while, but I will make a stab at Raymond's mark.  Steve

Footnote:  Fact(145) has 252 decimal digits and I figured anything up to 255 would be ok for my needs, but now I have learned what large really is, forget the limits of memory, use your virtual space and/or harddrive space, and wa-la you meet Morphius.

dedndave

lol Raymond
i know i can never beat that time   :bg
but, my routines calculate and convert the values at full precision

Steve
here is a link to a set of routines that will display signed and/or unsigned integers of any practical size
these routines supercede uArbitrary2Str and are considerably faster

http://www.masm32.com/board/index.php?action=dlattach;topic=12363.0;id=6926

there are 3 versions; one for signed, one for unsigned, one that will do either - you probably want the unsigned version

this is a link to a little program that calculates integer exponents to full precision

http://www.masm32.com/board/index.php?action=dlattach;topic=12363.0;id=7070

this is somewhat similar to what a factorial routine might look like
after giving some thought, though, ling long kai fang is not required   :(
a simple multiple precision multiplication loop would work

clive

Quote from: dedndave
this is somewhat similar to what a factorial routine might look like
after giving some thought, though, ling long kai fang is not required   :(
a simple multiple precision multiplication loop would work

Pretty much, I ended up building some routines to shift left by one, and add, optimized to track the magnitude of the numbers. The multiply would only need to support N bits x 8 bits for 255!, although I implemented it using 32 bits with an early exit as it was no more complex/expensive than doing 8 bits.

I used an iterated loop not recursion, and in the cold light of morning I should have cached (N-1)! so N! would have taken a single multiply.

I was using an array of DWORD, yes this could probably be extended to all usable/virtual memory, you'd probably run out of time before you ran out of memory.

-Clive
It could be a random act of randomness. Those happen a lot as well.

dedndave

hi Clive
i think i'd use the dword MUL - they aren't going to be able to ask for anything as large as 4294967295!
but, as you say - recursion would just be overhead that looks cool - lol

z941998

@dedndave: I did eventually find those links during my review, great work.  For my needs , I agree to not using the LLKF process.  I also like the 4:1 and 1:4 between decimal and binary which setups 4 Gig and and 1 Meg nicely from the links you have provided.  I this will become a standard for me.

@Clive: I also tought about internal tables that do the math automatically upon looking up the corresponding digits in the tables that would be multipled and also provide a carry flag.  I was thinking that every step a calculator uses was done entirely in the ASCII mode and just progress thru the mathematical process.  I was thinking along the lines of setting up variable slots persay (mimicing the FPU) that would handle all conversions and math issues.  Like Raymond's FPU formated functions, we could add a few new functions for Factorial, Permuational, and Combinational.... etc.  But I also recognize the need to process four bytes at once to increase my through-put and loop control evolution timings.

I take Raymonds mark of 9ms to be 9000 clocks, I don't want to break the sillyscope at this time.

Thanks guys you all have be great.  Give me a week or so to put something together and we will see what happens.

raymond

9 ms means 9 millisecond or 0.009 second. At a speed of 2GHz, it should be around 18,000,000 clocks.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

Steve, I found this document for combinatorial math
it could make a big difference in the performance of your routine to factor first

http://homepage.smc.edu/kennedy_john/NFACT.PDF

i was thinking of factoring first to do the factorial - for larger values, it could speed things up a bit
but, after seeing this document, i think it might be better to write the combinations and/or permutations calculations as complete routines,
as opposed to calculating the individual factorials then dividing bignums

besides, who can resist a doc written by a dead president   :P
i always thought the "F." stood for "Fitzgerald" - guess it stands for "Factor" or "Factorial" instead