Dear fellow programmers,
I have written an algorithm to simulate an austrian lotto-machine (draws 6 numbers from a funnel containing 45 numbers, plus a so called "additional number").
To my surprise the program works :toothy but the results are slightly different as expected from the theory of probabilities. :(
That's why I would appreciate any feedback from you experts, especially feedback concerning
a) the design of the algorithm itself (as a newbie to MASM I'm curious about how experts judge my code, and how I could possibly enhance it)
b) my ideas about this simulation, and if and where they are possibly wrong.
Should you think that this is off-topic here, please tell me and forgive me :U - I won't bother you then any more
In medias res:
When one takes a look at the numbers that the lotto-machine has produced since a.d. 1986 - when lotto was introduced in good old austria - one finds out that these numbers are distributed fairly even. By the way, :bg I'm mentioning austrian lotto not because I want to advertise for them, but because to point out that I'm referring to "real-world-data".
--> First requirement of algorithm: Produce random numbers between 1 ... 45 that are evenly distributed.
E.g. If I let generate the algo 0ffffffffh drawings, I expect every number to occur _more or less_ 572 662 306 times (2^21-1)*6/45
--> Second requirement â€" so I thought: decrement the range by one for every number drawn, because with each number drawn, one number less is in the funnel.
AND HERE’S THE „FUNNY“ THING:
If I call create_draw WITHOUT decrementing range the numbers are evenly distributed, but the “wins� are too less according to theory.
If I call create_draw WITH decrementing range, numbers 1..40 occur more than expected, degrading more and more until 45, but I have too much “wins� according to theory.
If someone could enlighten me on that, I’d be more than happy, and should this someone ever be in Vienna just call me and I spend him/her as many beer or whatever drink he/she prefers
[attachment deleted by admin]
Hi,
Quote
--> First requirement of algorithm: Produce random numbers between 1 ... 45 that are evenly distributed.
This will skew your results as the notion of "random" and evenly distributed across a numeric range are contradictory requirements. The closer you come to a random method the more likely the result that the distribution will not be uniform over the numeric range.
Hello Hutch
1st of all, thanks for your reply. There's just one thing I don't understand:
f.ex. a dice: If you throw a dice f.ex. 2^32 times shouldn't the numbers be _more or less_ equally distributed and all the same "randomly"?
Perhaps I'm too silly to understand, but please explain me why "randomly" and equally "distributed" is a contradiction?
It's because I'm stuck.
I mean, isn't it funny that we can fly to the moon, produce atomic bombs and at the same time we are not able to predict which 6 little balls out of 45 will come out of a machine?
kindest regards
Klaus- the cobold, because he was a COBOL-Programmer once
Klaus,
T he notion of "random" is literally "without order" where equal distribution is to some extent "ordered". Vaguely I remember a lotto analysis here in OZ where I live that produced a "Lotto" distribution curve based on every lotto draw since it started and it was by no means ordered or evenly distributed.
Basically the more "unordered" your random distribution is the more powerful it is.
hutch,
thks again for replying,
a question, what is "OZ" New Zealand, Australia?
about "evenly distributed", if u take a look at www.win2day.at/gaming/index.html you'll see the numbers came between 259 times and 197 times.
At the beginning they made a "draw" every sunday, since a few years- I don't know exactly since when- they make draws every sunday and wednesday.
I'll find that out, but taken 1600 rounds the probalibility for each nbr ist:
(1600 rounds * 6 nbrs per round) /45 numbers = 213 per nbr
But okay, I have to do my homework and find out what is the exact nbr of rounds. Anyway, my idea behind all of this ist: simulate the machine as good as possible and thereby enhance your chance of winning.
Comments? Wellcome!
rgds
Klaus
If you're intending to use this to determine the most likely numbers to be drawn, biasing the likelihood of each number being drawn seems to defeat the purpose.
I fully believe that a lotto machine is uniform (that is, each ball has an equal probability of being selected) but the number of possible combinations far exceeds the number of draws that have ever been made. Any distribution that is based on less than a few billion draws is going to be flawed.
Searching Google Scholar or something similar (to find academic papers) will likely find some very thorough analysis of the subject.
Cheers,
Zooba :U
I also believe that the numbers produced by most lotto machines come close to a uniform distribution, it’s just not apparent because the available samples are too small. AFAIK most PRNGs are designed to produce numbers with a uniform distribution, and the fixed version of nrandom does a reasonable job of this. This code compares the distribution produced by nrandom to the distribution produced by the Windows CSP RNG.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  include \masm32\include\masm32rt.inc
  include \masm32\include\advapi32.inc
  includelib \masm32\lib\advapi32.lib
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  .data
   cnts dd 10 dup(0)
  .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; This proc collects random bytes generated by the cryptographic
; service provider (CSP), and returns a 32-bit integer in the
; interval 0 to base-1.
;
; Everything is in a single procedure to make automatic release
; of the CSP handle simple. The large buffer is a quick and dirty
; correction for the slowness of the CSP generator.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
 hProv    dd 0
 rand_buffer dd 10000 dup(0)
 pNext    dd 0
.code
PROV_RSA_FULL equ 1
; ------------------------------------------------------------
; This value for the dwFlags parameter of CryptAcquireContext
; is necessary for Windows 2000. For Windows 98 the parameter
; can be zero.
; ------------------------------------------------------------
CRYPT_VERIFYCONTEXT equ 0F0000000h
align 4
csp_rand proc uses ebx base:DWORD
  mov ebx,pNext
  .IF ebx == 0
    invoke CryptAcquireContext, ADDR hProv,
                  NULL,
                  NULL,
                  PROV_RSA_FULL,
                  CRYPT_VERIFYCONTEXT
    .IF eax == 0
      print "CryptAcquireContext failed",13,10
      ret
    .ENDIF
    invoke CryptGenRandom, hProv, 10000*4, ADDR rand_buffer
    .IF eax == 0
      print "CryptGenRandom failed",13,10
    .ENDIF
    invoke CryptReleaseContext, hProv, 0
  .ENDIF
  mov eax, [rand_buffer+ebx]
  add ebx, 4
  .IF ebx > 9999*4
   xor ebx, ebx
  .ENDIF
  mov pNext, ebx
  xor edx, edx
  div base
  mov eax, edx
  ret
csp_rand endp
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  SAMPLES equ 10000000
  xor ebx, ebx
  .WHILE ebx < SAMPLES
   invoke nrandom, 10
   inc [cnts+eax*4]
   inc ebx
  .ENDW
  xor ebx, ebx
  .WHILE ebx < 10
   print ustr$(ebx),9
   print ustr$([cnts+ebx*4]),9
   sub [cnts+ebx*4], SAMPLES / 10
   js @F
   print chr$("+")
  @@:
   add edi, [cnts+ebx*4]Â
   print str$([cnts+ebx*4]),13,10
   mov [cnts+ebx*4], 0
   inc ebx
  .ENDW
  print chr$(13,10)
  xor ebx, ebx
  .WHILE ebx < SAMPLES
   invoke csp_rand, 10
   inc [cnts+eax*4]
   inc ebx
  .ENDW
  xor ebx, ebx
  .WHILE ebx < 10
   print ustr$(ebx),9
   print ustr$([cnts+ebx*4]),9
   sub [cnts+ebx*4], SAMPLES / 10
   js @F
   print chr$("+")
  @@:Â
   print str$([cnts+ebx*4]),13,10
   mov [cnts+ebx*4], 0
   inc ebx
  .ENDW
  inkey "Press any key to exit..."
  exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
0Â Â Â Â 1001901 +1901
1Â Â Â Â 999198Â -802
2Â Â Â Â 1001659 +1659
3Â Â Â Â 1000060 +60
4Â Â Â Â 999904Â -96
5Â Â Â Â 1000079 +79
6Â Â Â Â 999780Â -220
7Â Â Â Â 998903Â -1097
8Â Â Â Â 1000925 +925
9Â Â Â Â 997591Â -2409
0Â Â Â Â 1000055 +55
1Â Â Â Â 999199Â -801
2Â Â Â Â 1001228 +1228
3Â Â Â Â 999010Â -990
4Â Â Â Â 999554Â -446
5Â Â Â Â 1001137 +1137
6Â Â Â Â 1000259 +259
7Â Â Â Â 999340Â -660
8Â Â Â Â 999389Â -611
9Â Â Â Â 1000829 +829
The better your nrandom() function is, the worse distribution you'd get with that current approach:
inc [arr+eax*4] ; Merken, wie oft die Zahl gekommen ist
.if [arr+eax*4] > 1 ; Wenn Zahl bereits gekommen
dec [arr+eax*4]
jmp @B ; !!!!!! this. You're forcing another nrandom() call, instead of using the data
.endif
If nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw! And 44 - only on the first two draws. That's why balls "1" to "40" are getting good even results currently.
Btw,
(line 120) .until ebx == range
This is a future bug waiting to happen. Use "Max" instead of "range" :). Same for line 139, but you have to restore the initial value of Max somehow.
Here's an outline of the above idea:
create_draw proc Max,Num
local Taken[range]:BYTE
inc Num ; so that we fit the "bonus number"
;-----[ clear Taken ]-------[
mov ebx,Max
@@:
dec ebx
mov Taken[ebx],0
jnz @B
;---------------------------/
xor ebx,ebx
.repeat
invoke nrandom,Max
;---[ find eax-th empty slot ]------[
mov ecx,-1
inc eax
@@:
inc ecx
cmp Taken[ecx],0
jne @B
dec eax
jnz @B
mov Taken[ecx],1
; ecx is the index 0...44 now
;-----------------------------------/
inc ecx
mov [draw+ebx*4], ecx
inc ebx
dec Max
.until ebx==Num
ret
create_draw endp
Quote from: Ultrano on November 11, 2007, 07:19:07 AMIf nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw!
Indeed. What you might do instead is to offer the full 45 in each loop, check if the drawn number has been used already, and discard if it has been used.
This code displays the distribution for 1,000,000,000 groups of 6 unique numbers from the range 0 to 44.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  .data
   cnts dd 45 dup(0)
   flags dd 45 dup(0)
  .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  SAMPLES equ 1000000000
  xor ebx, ebx
  .WHILE ebx < SAMPLES
   invoke RtlZeroMemory, ADDR flags, 45*4
   xor esi, esi
   .WHILE esi < 6
    invoke nrandom, 45
    .IF DWORD PTR[flags+eax*4] == 0
     inc DWORD PTR[flags+eax*4]
     inc [cnts+eax*4]
     inc esi
    .ENDIF
   .ENDW
   inc ebx
  .ENDW
  xor ebx, ebx
  .WHILE ebx < 45
   print ustr$([cnts+ebx*4]),13,10
   inc ebx
  .ENDW
  print chr$(13,10)
  inkey "Press any key to exit..."
  exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
133323453
133327735
133330106
133329841
133331917
133328262
133336255
133337350
133334383
133336577
133337976
133340867
133334076
133333848
133343289
133321969
133328977
133330938
133332123
133331912
133333104
133335274
133333310
133343180
133332216
133337966
133336933
133339215
133338040
133334872
133329817
133331469
133329380
133332563
133327579
133330870
133333383
133329160
133338755
133331647
133332257
133334142
133330230
133336198
133336586
Quote from: jj2007 on November 11, 2007, 12:39:09 PM
Quote from: Ultrano on November 11, 2007, 07:19:07 AMIf nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw!
Indeed. What you might do instead is to offer the full 45 in each loop, check if the drawn number has been used already, and discard if it has been used.
Other option, maybe more precise: Instead of discarding and repeating, provide an initial table with 45 entries
balls db 1,2,3,4,5,6, ... 44,45
.. and apply the random generator with equal chances for all 45 entries.
Now assume that 5 has been drawn in the first round:
- copy 6...45 into the initial 5..44 slots
- which results in a modified table 1,2,3,4,6,...45
- apply the random generator with equal chances for all remaining 44 entries
This procedure mimics exactly what happens in reality.
Thanks to everone for your feedback. For now I'm trying to adept all the ideas.
A 64kb serial result of Agner Fog's prng routine (analyzed with ent.exe) yields:
Entropy = 7.997172 bits per byte.
Optimum compression would reduce the size
of this 65536 byte file by 0 percent.
Chi square distribution for 65536 samples is 256.51, and randomly
would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.9213 (127.5 = random).
Monte Carlo value for Pi is 3.130562168 (error 0.35 percent).
Serial correlation coefficient is 0.003217 (totally uncorrelated = 0.0).
That's the best PRNG I've seen. For larger sample sets (1MB, 10MB) the numbers approach the ideals.
Quote from: jj2007 on November 11, 2007, 04:14:39 PM
Other option, maybe more precise: Instead of discarding and repeating, provide an initial table with 45 entries
balls db 1,2,3,4,5,6, ... 44,45
.. and apply the random generator with equal chances for all 45 entries.
Now assume that 5 has been drawn in the first round:
- copy 6...45 into the initial 5..44 slots
- which results in a modified table 1,2,3,4,6,...45
- apply the random generator with equal chances for all remaining 44 entries
This procedure mimics exactly what happens in reality.
To avoid the copying (which is inherently slow), you can count through the list until you find the Nth undrawn number.
Or, if you have complete faith in randomness (you shouldn't :lol) then you can swap the Nth number with the last element of the list -- a random random number is still random.
Or instead of copying....
assuming nrandom(45) returned "3" (zero-based), do:
BallsRemaining[3] = BallsRemaining[44];
Here's C code, (for clarity) :
void Generate(int* Results){
int BallsRemaining[45];
for(int i=0;i<45;i++){
BallsRemaining[i] = i+1; // make results 1-based
}
int Max = 45;
for(int curDraw=0;curDraw<6;curDraw++){
int r = nrandom(Max);
Results[curDraw] = BallsRemaining[r];
Max--;
BallsRemaining[r] = BallsRemaining[Max];
}
}
Generate proc uses ebx esi edi pResults
local BallsRemaining[45]:DWORD
mov ecx,45
@@:
mov BallsRemaining[ecx*4-4],ecx
dec ecx
jnz @B
mov edi,Results
mov ebx,6 ; curDraw=6
mov esi,45 ; Max=45
@@:
invoke nrandom,esi ; nrandom(Max)
dec esi ; Max--
dec ebx ; curDraw--
mov edx,BallsRemaining[eax*4]
mov [edi+ebx*4],edx
mov edx,BallsRemaining[esi*4]
mov ballsRemaining[eax*4],edx
jnz @B
ret
Generate endp
Despite the reordering, each ball keeps its equal chance to be taken, but it depends on how nrandom() will disperse its results, thus the resulting dispersion of this lotto function can suffer a bit.
Quote from: Tedd on November 11, 2007, 09:17:56 PM
Or, if you have complete faith in randomness (you shouldn't :lol) then you can swap the Nth number with the last element of the list -- a random random number is still random.
Swapping is a good solution.
Quote from: Ultrano on November 11, 2007, 11:07:37 PM
mov esi,45 ; Max=45
In real life, after each drawing one ball is gone forever, so Max must be decremented.
With so many parameters involved simulating a lotto machine even approximately would be a virtual impossibility, even if you had all of the specs. With such a great difference in the generators, I fail to see how progressively limiting the range of the numbers produced by nrandom, or other similar schemes, could be any better than just rejecting the invalid numbers.
Quote from: MichaelW on November 12, 2007, 09:42:44 AM
With so many parameters involved simulating a lotto machine even approximately would be a virtual impossibility, even if you had all of the specs. With such a great difference in the generators, I fail to see how progressively limiting the range of the numbers produced by nrandom, or other similar schemes, could be any better than just rejecting the invalid numbers.
You are probably right, but 1. cobold wants to simulate real behaviour, and 2. rejecting needs more code than just decrementing the Max :wink
:eek
ok, hands up all of you that studied extensively Probability and Statistics at university. My hand is raised now. I almost bombed Statistics, but at least I think I know how to create formulas for the probability. Heck, this "lotto machine" is one of the first tutorials in Probability classes ::)
You want to create a perfect lotto simulation? Be my guest and make a 3D physical model simulation with Euler's integrations for the dynamics; I bet you'll get the same dispersion. Which currently is perfect.
jj2007, "dec esi" ... and nrandom(45) returns a number in the range [0;44].
This lotto proc has already been completed here imho. The only problem remaining is the construction of nrandom to wait for the user to press a button, and measure that time in cycles - to add more randomization. But that way we can't get the dispersion results easily :)
Interesting, “perfect simulation� apparently means something very different to me than it does to others here. It is very unlikely that a physical machine that is designed primarily to impress the ignorant masses, mostly through eye appeal, would produce statistically random sequences, or be designed in such a way that it could be reasonably modeled.
Here's what i've done to generate random numbers for the german lotto about a year ago. It's a simple 6 out of 49 numbers generator. Not quite well programmed... wanted to learn assembly step by step, and to see how the nseed and nrandom procedures functioning... In my code you will see code step by step, no code improvements and failures due to lack of time, ... i've discarded the "rng" project.
"Ultrano wrote: The only problem remaining is the construction of nrandom to wait for the user to press a button, and measure that time in cycles - to add more randomization."
My conclusion to add more randomization was to nearly randomize the nseed procedure with GetTickCount, GetLocalTime, add, xor, sub, etc. instead of pressing the button six times !
Have Fun !
[attachment deleted by admin]
Another source for a random seed could be the microphone recording input. Even with no signal present, there is noise... sample that, generate a random value, sample again, add it to the previous value, generate another random value, sample again... etc.
If you have some electronics experience, try applying around 20v/1mA in reverse through a general-purpose NPN transistor's B-E junction. This exceeds the breakdown voltage and current flows erratically on an atomic scale, which can be sampled and converted to integer(s). As simple as this seems, I wonder why the CPU doesn't include one extra transistor (out of the other billion-or-so) just for this purpose. An "FRND" instruction would be nice...
Re: FRND
Probably because it would be too difficult to provide any guarantees as to its distribution. It would be far better to implement such an instruction using a deterministic methodology, and as such, there doesnt really need to be an instruction.
I am not sure what to make of some of the comments in this thread but I think several people here are having trouble understanding that a truely random sequence of values from a normal distribution has statistical properties that can be emulated using psuedo-random techniques. That infact for all intents and purposes the output of a PNRG like the mersenes twister (MT) can be treated exactly like an ideal random number generator. The sequence length of MT19937 is so long that never in your lifetime will you ever run over the same location in the sequence if you so choose. The length is approximately 4.315425E+6001.
While it is true that many PNRG's have very poor statistical properties (especialy LCG's), that is not true for all of them.