In this topic I'd like to prepare a PB program that generates sequences of random
numbers, creating an array of 5000 rows of 64 long integer numbers, and writing the
array to a file when the generation is complete.
The rules are very simple:
1) numbers are in the range 1-50
2) they are generated in groups of 4 at a time
3) in the same group we can't have the same number more than 1 time
4) 16 groups of 4 numbers are a row in the array
5) when we have 5000 rows we write the whole array into a binary file
The elapsed CPU cycles are displayed on a screen format, along with the
progressive number of generated rows.
We'll have:
- RandGen.bas - the source program
- RandGen.exe - the compiled program
- RandGen.scn - the screen format displaying the generation progress
- RandGen.dat - the output binary file we create
After I've created the PB version, I'll try to optimize any single SUB
with INLINE ASM.
Working on it, The Idea [Design] is ready, The Algorithm and CODE [implementation] not yet. :P
I'll be back
soooooooo
that's 80,000 groups of 4 dwords :P
75% of the file is 0's
Quote from: dedndave on April 24, 2010, 05:02:43 PM
soooooooo
that's 80,000 groups of 4 dwords :P
75% of the file is 0's
That's right, but I think it is faster doing so. Isn't it?
By the way here we have the first partial test attached.
I guess it is optimizable a lot. Let's see what we get at the end :8)
And, of course, please suggest any optimization you can imagine :P
@ Hutch: I suspect the MS tic you previously suggested, aren't milliseconds
or at least they looks like something else.
for the memory, i would use GlobalAlloc - don't bother clearing it out
allocate enough for the entire file ~1.22 Mb
for the innermost loop that generates 4 numbers from 1 to 50
use mutiply-to-divide for the scaling - that'll be much faster than using DIV
for the first number, scale the random gen result from 1 to 50
put it in the file buffer (call that #1)
for the second number, scale the random gen result from 1 to 49
if it is equal to or greater than #1, add one to it's value to adjust
put it in the file buffer (call that #2)
for the third number, scale the random gen result from 1 to 48
if it is equal to or greater than #1, add one to it's value to adjust
if it is equal to or greater than #2, add one to it's value to adjust
put it in the file buffer (call that #3)
for the forth number, scale the random gen result from 1 to 47
if it is equal to or greater than #1, add one to it's value to adjust
if it is equal to or greater than #2, add one to it's value to adjust
if it is equal to or greater than #3, add one to it's value to adjust
put it in the file buffer (call that #4)
that way, you generate 4 unique values from 1 to 50 without having to punch the random generator more than 4 times
do that 80,000 times, then write the file :P
after you are done, free the allocated memory block
EDIT - there is a little flaw in this method
for example, if #4 does not get incremented when tested against #1
but, then, it does get incremented when tested against #2
it may cause it to miss an increment against the value of #1
let me think about it for a bit - there has to be a simple fix :8)
the idea is, when you get around to picking the forth value, there are only 47 possible valid values for it
the other 3 values have already been used
one other note...
a lot will depend on how random you need the values to be
a "good" random number generator will likely be slower than a "run of the mill" generator
if it isn't that critical, a simpler generator can be pretty fast
another approach that sounds promising would be to generate one value
from 0 to 49*48*47*46
use multiply-to-divide to decode the 4 values
then you have the same problem, plus you need to scramble the 4 values - lol
but - it sounds like it could have merit
the scramble permutation code could be part of the value
so, it would be 0 to 49*48*47*46*23, or 116,955,552 (the number of possible 4-value groups)
the advantage here is there would only be one pull from the generator for each group
this is the way to fly, if you can figure out how to translate the groups
@ Dave: if you'd like to do so, please try to download the previous attached zip file, and have a look at what I
already did, so we can see, step by step, how to put some INLINE ASM,
with a better algorithm, into the pgm. ::)
The main part of the pgm related with random number generation:
'--------------------------------------------------------------------------------
' Generate 5000 rows of 64 integer numbers - PB version
'--------------------------------------------------------------------------------
SUB TestPB
FOR x = 1 TO 5000
CALL GenRowPB
LOCATE 10,21
PRINT FORMAT$(x,"###,")
NEXT x
END SUB
'--------------------------------------------------------------------------------
' Generate single row of 64 integer numbers - PB version
'--------------------------------------------------------------------------------
SUB GenRowPB
RANDOMIZE TIMER
FOR y = 1 TO 16
again:
g&(1) = RND(Min_n,Max_n)
g&(2) = RND(Min_n,Max_n)
g&(3) = RND(Min_n,Max_n)
g&(4) = RND(Min_n,Max_n)
ok = si
FOR x2 = 1 TO 3
FOR x3 = x2 + 1 TO 4
IF g&(x2) = g&(x3) THEN
ok = no
EXIT
END IF
NEXT x3
IF ok = no THEN
EXIT
END IF
NEXT x2
IF ok = no THEN
GOTO again
END IF
NEXT y
END SUB
Enjoy
┌───────────────────────────────── RandGen ────────────────────────────────────┐
├──────────────────────────────────────────────────────────────────────────────┤
│ │
│ Testing the performance of PB and INLINE ASSEMBLY for the generation of │
│ │
│ random integer numbers [5000 rows of 64 numbers total]. │
│ │
│ Version: PB Version: ASM │
│┌────────────────────────────────────────────────────────────────────────────┐│
││ Rows generation: 5,000 Rows generation: x,xxx ││
│└────────────────────────────────────────────────────────────────────────────┘│
│┌────────────────────────────────────────────────────────────────────────────┐│
││ Elapsed CPU cycles: 5,614,049,693 Elapsed CPU cycles: x,xxx,xxx,xxx ││
│└────────────────────────────────────────────────────────────────────────────┘│
│┌────────────────────────────────────────────────────────────────────────────┐│
││ Elapsed time in MS: 1,859 Elapsed time in MS: x,xxx ││
│└────────────────────────────────────────────────────────────────────────────┘│
│┌────────────────────────────────────────────────────────────────────────────┐│
││ M A S M - F O R U M 2 0 1 0 ││
│└────────────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────┐
│ Messages from the program: Random numbers generation complete - PB version │
│ │
Messages from Dave: that's over 70,000 cycles per group - i know we can beat that :bg
but - you can speed that code up just by eliminating some loop overhead
try a single loop of 80,000 groups
i may play with it somemore - after my nap
Well, I eliminated the unnecessary print of the counter, and used a single loop
of 80000 cycles of 4 numbers, and some other tricks, and the CPU counter
shows a 50 fold improvement.
This is just the algorithm, we didn't put our hands on the bytes with ASM yet :8)
'--------------------------------------------------------------------------------
' Generate 5000 rows of 64 integer numbers - PB version
'--------------------------------------------------------------------------------
SUB TestPB
DIM gen(1 TO 50) AS LONG
FOR x = 1 TO 50
gen(x) = 0
NEXT x
RANDOMIZE TIMER
FOR x = 1 TO 80000
g&(1) = RND(Min_n,Max_n)
gen(g&(1)) = 1
gen2_again:
g&(2) = RND(Min_n,Max_n)
IF gen(g&(2)) = 0 THEN
gen(g&(2)) = 1
ELSE
GOTO gen2_again
END IF
gen3_again:
g&(3) = RND(Min_n,Max_n)
IF gen(g&(3)) = 0 THEN
gen(g&(3)) = 1
ELSE
GOTO gen3_again
END IF
gen4_again:
g&(4) = RND(Min_n,Max_n)
IF gen(g&(4)) = 0 THEN
gen(g&(1)) = 0
ELSE
GOTO gen4_again
END IF
gen(g&(2)) = 0
gen(g&(3)) = 0
NEXT x
LOCATE 10,21
PRINT "5,000"
END SUB
The main improvements are:
- the RANDOMIZE instruction is used only once instead of 80000 times
- the printing of the counter has been eliminated during the process
- instead of 6 IF we do 3 IF
- instead of regenerating the whole group if a number is generated more than once,
we regenerate only the single number
- instead of 5000 CALLs to a SUB we have only one CALL
- we test if the number is already generated in the group with an array of flags
Maybe we can find some more points to optimize before ASM translation :P
that's a little better - i get about a 30:1 improvement on my prescott CPU
the heart of the code is probably the random number generator, itself
you could make a big improvement by using an ASM random number generator inside the PB loop
as far as improving the loop, ASM isn't likely to make that much difference, as you saw in the loop test
but, deriving all 4 group values from a single random number would speed things up, too
that would hold true whether the code was written in ASM or PB
i am trying to come up with an algo for that in my head
Night time, going to sleep. :dazzled:
I'll be back.
Cheers
Quote from: dedndave on April 24, 2010, 10:28:23 PM
deriving all 4 group values from a single random number would speed things up, too
that would hold true whether the code was written in ASM or PB
i am trying to come up with an algo for that in my head
Well I', curious about that single random number generating 4 numbers.
Let me know when the algo comes to you :bg
well - we can use a simple "Mixed Linear Congruential (LGC)" random number generator
i miscalculated earlier - the number of possible 4-value groups is 132,652,800
we may be able to use that value as the divisor in the random number generator
that way, it spits out integers from 0 to 132,652,799
then, we can use a little algo to convert that value into the 4 group values
i am still working on that in my head :P
i know it can be done - i did something very similar several years ago (~25 years - lol)
the question is - can we make it faster than 4 pulls on a random number generator
Hi,
One read of a random number can certainly be faster than
four. However, the additional manipulation will slow things
down, unless it is like a table look up. Big table. Or just chop
up the returned random number into pieces. Not too good
if the randomness is supposed to be good, but for "human"
use, it could be okay.
However using a linear congruential generator is probably
fast enough for four "in line" usages.
Cheers,
Steve N.
you may be right, Steve
if you can just make a fast RNG and pull it 4 times, or even 5 (to re-order)
it can be done the other way, i am sure
but it would take a lot of code development and perhaps an LUT (i am thinking 24 dwords for branch vectors)
i may write a 4 pull routine, just to see what the numbers are like
if it seems like too many clock cycles, then look at the single pull
the method i am thinking of extracting 4 values from a single one involves 4 multiply-to-divide's
if the RNG is fast enough, they will be about the same times
Quote from: dedndave on April 25, 2010, 09:57:15 AM
- the number of possible 4-value groups is 132,652,800
Sorry Dave, how did you get that 132,652,800 value?
We have numbers in the range 1-50, and we can't have the same number in the same 4-value group,
so it looks a little bit
TOO BIG.
I don't think I'm getting your point, please can you detail your idea?
i dunno, Frank
i just brought up the calculator and started mashing all the buttons and that's what i came up with - lol
i guess i should use the formula....
P(n,r) = n! / (n-r)!
sooooo, that would be 5,527,200
you are right - my other number was too big :P
not that it helps us any
the value would only matter if i were going to write the code that way - lol
Quote from: dedndave on April 25, 2010, 02:18:22 PM
i dunno, Frank
i just brought up the calculator and started mashing all the buttons and that's what i came up with - lol
i guess i should use the formula....
P(n,r) = n! / (n-r)!
sooooo, that would be 5,527,200
you are right - my other number was too big :P
not that it helps us any
the value would only matter if i were going to write the code that way - lol
Still
too big.
According to my calculation it should be 230,300. :P
Just remember that the following groups are only one group:
1 2 3 4
1 3 2 4
1 4 2 3
1 3 4 2
1 4 3 2
and so on
For the task I'm considering all these different combinations are just one.
Cheers
nah - lol
50 x 49 x 48 x 47 = 5,527,200
the first number may be 1 of 50 possible values
the second number may be 1 of 49 possible values
the third number may be 1 of 48 possible values
the forth number may be 1 of 47 possible values
Quote from: frktons on April 25, 2010, 02:48:59 PM
Quote from: dedndave on April 25, 2010, 02:18:22 PM
i dunno, Frank
i just brought up the calculator and started mashing all the buttons and that's what i came up with - lol
i guess i should use the formula....
P(n,r) = n! / (n-r)!
sooooo, that would be 5,527,200
you are right - my other number was too big :P
not that it helps us any
the value would only matter if i were going to write the code that way - lol
Still too big.
According to my calculation it should be 230,300. :P
Just remember that the following groups are only one group:
1 2 3 4
1 3 2 4
1 4 2 3
1 3 4 2
1 4 3 2
and so on
For the task I'm considering all these different combinations are just one.
Cheers
quoting myself :lol
that would only be true if you put each group in order
then, the formula would be
C(n,r) = n! / ( r! (n-r)! )
for generating the groups, we have to consider these as 2 different sets:
5 25 35 45
25 5 35 45
in fact, if we were to generate sorted groups, it would simplify the code :bg
As said before:
For the task I'm considering all these different combinations are just one.
So please consider those groups as just one.
A simple pgm to show the possible combinations is attached
Cheers
Frank
well - i am not going to argue with you about it - lol
if they are sorted groups, there are fewer groups - easier to generate
if they are unsorted groups, there are 4! times as many possible groups
if we generate sorted groups, it can be even faster
http://www.mathsisfun.com/combinatorics/combinations-permutations.html
Quote from: dedndave on April 25, 2010, 03:16:04 PM
well - i am not going to argue with you about it - lol
if they are sorted groups, there are fewer groups - easier to generate
if they are unsorted groups, there are 4! times as many possible groups
if we generate sorted groups, it can be even faster
http://www.mathsisfun.com/combinatorics/combinations-permutations.html
Nice educational web-page, thanks Dave.
By the way, we consider
1234 and
4321 as the same combination, but we can generate them in whatever order,
so if it is easier to have them sorted, that's ok as well.
For the task I've in mind, the order doesn't really matter, just that they
are all different from each other :bg
Let's think about it this way:
numbers 1 to 50 are customers code. It doesn't matter in which order they place an order
into a shop, but "if" they do, no matter how many times per day or how many items.
So a group of 4 customers a day all different from each other can place an order any time in a day.
And we don't know how long it takes between an order and the next of the same customer.
The pgm I proposed
RandGen3 creates this kind of combination, with this rules.
How to optimize the generating routine should consider these rules. :P
Let's see what we get
that makes a difference
not only is the code faster, but simpler to write :P
it brings the "generate 4 at a time" idea back into play
because the code to seperate a single value into 4 ordered numbers is much simpler than unordered
Let's think about it this way:
numbers 1 to 50 are customers codes. It doesn't matter in which order they place an order
into a shop, but "if" they do, no matter how many times per day or how many items.
So a group of 4 customers a day all different from each other can place an order any time in a day.
And we don't know how long it takes between an order and the next of the same customer.
The pgm I proposed RandGen3 creates this kind of combination, with this rules.
How to optimize the generating routine should consider these rules.
Let's see what we get