News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

_Perfect_ simulation of a lotto-machine

Started by cobold, November 10, 2007, 11:08:35 PM

Previous topic - Next topic

cobold

Dear fellow programmers,

I have written an algorithm to simulate an austrian lotto-machine (draws 6 numbers from a funnel containing 45 numbers, plus a so called "additional number").
To my surprise the program works  :toothy but the results are slightly different as expected from the theory of probabilities.  :(

That's why I would appreciate any feedback from you experts, especially feedback concerning
a) the design of the algorithm itself (as a newbie to MASM I'm curious about how experts judge my code, and how I could possibly enhance it)
b) my ideas about this simulation, and if and where they are possibly wrong.

Should you think that this is off-topic here, please tell me and forgive me  :U - I won't bother you then any more

In medias res:
When one takes a look at the numbers that the lotto-machine has produced since a.d. 1986 - when lotto was introduced in good old austria - one finds out that these numbers are distributed fairly even. By the way,  :bg I'm mentioning austrian lotto not because I want to advertise for them, but because to point out that I'm referring to "real-world-data".
--> First requirement of algorithm: Produce random numbers between 1 ... 45 that are evenly distributed.
E.g. If I let generate the algo 0ffffffffh drawings, I expect every number to occur _more or less_ 572 662 306 times (2^21-1)*6/45
--> Second requirement â€" so I thought: decrement the range by one for every number drawn, because with each number drawn, one number less is in the funnel.
AND HERE’S THE „FUNNY“ THING:
If I call create_draw WITHOUT decrementing range the numbers are evenly distributed, but the “wins� are too less according to theory.
If I call create_draw WITH decrementing range, numbers 1..40 occur more than expected, degrading more and more until 45, but I have too much “wins� according to theory.
If someone could enlighten me on that, I’d be more than happy, and should this someone ever be in Vienna just call me and I spend him/her as many beer or whatever drink he/she prefers



[attachment deleted by admin]

hutch--

Hi,

Quote
--> First requirement of algorithm: Produce random numbers between 1 ... 45 that are evenly distributed.

This will skew your results as the notion of "random" and evenly distributed across a numeric range are contradictory requirements. The closer you come to a random method the more likely the result that the distribution will not be uniform over the numeric range.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

cobold

Hello Hutch

1st of all, thanks for your reply. There's just one thing I don't understand:

f.ex. a dice: If you throw a dice f.ex. 2^32 times shouldn't the numbers be _more or less_  equally distributed and all the same "randomly"?
Perhaps I'm too silly to understand, but please explain me why "randomly" and equally "distributed" is a contradiction?
It's because I'm stuck.
I mean, isn't it funny that we can fly to the moon, produce atomic bombs and at the same time we are not able to predict which 6 little balls out of 45 will come out of a machine?

kindest regards
Klaus- the cobold, because he was a COBOL-Programmer once

hutch--

Klaus,

T he notion of "random" is literally "without order" where equal distribution is to some extent "ordered". Vaguely I remember a lotto  analysis here in OZ where I live that produced a "Lotto" distribution curve based on every lotto draw since it started and it was by no means ordered or evenly distributed.

Basically the more "unordered" your random distribution is the more powerful it is.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

cobold

hutch,

thks again for replying,

a question, what is "OZ" New Zealand, Australia?

about "evenly distributed", if u take a look at www.win2day.at/gaming/index.html  you'll see the numbers came between 259 times and 197 times.
At the beginning they made a "draw" every sunday, since a few years- I don't know exactly since when- they make draws every sunday and wednesday.
I'll find that out, but taken 1600 rounds the probalibility for each nbr ist:
(1600 rounds * 6 nbrs per round) /45 numbers = 213 per nbr
But okay, I have to do my homework and find out what is the exact nbr of rounds. Anyway, my idea behind all of this ist: simulate the machine as good as possible and thereby enhance your chance of winning.

Comments? Wellcome!


rgds
Klaus

zooba

If you're intending to use this to determine the most likely numbers to be drawn, biasing the likelihood of each number being drawn seems to defeat the purpose.

I fully believe that a lotto machine is uniform (that is, each ball has an equal probability of being selected) but the number of possible combinations far exceeds the number of draws that have ever been made. Any distribution that is based on less than a few billion draws is going to be flawed.

Searching Google Scholar or something similar (to find academic papers) will likely find some very thorough analysis of the subject.

Cheers,

Zooba :U

MichaelW

I also believe that the numbers produced by most lotto machines come close to a uniform distribution, it’s just not apparent because the available samples are too small. AFAIK most PRNGs are designed to produce numbers with a uniform distribution, and the fixed version of nrandom does a reasonable job of this. This code compares the distribution produced by nrandom to the distribution produced by the Windows CSP RNG.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
    include \masm32\include\advapi32.inc
    includelib \masm32\lib\advapi32.lib
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      cnts dd 10 dup(0)
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; This proc collects random bytes generated by the cryptographic
; service provider (CSP), and returns a 32-bit integer in the
; interval 0 to base-1.
;
; Everything is in a single procedure to make automatic release
; of the CSP handle simple. The large buffer is a quick and dirty
; correction for the slowness of the CSP generator.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
  hProv       dd 0
  rand_buffer dd 10000 dup(0)
  pNext       dd 0
.code

PROV_RSA_FULL equ 1

; ------------------------------------------------------------
; This value for the dwFlags parameter of CryptAcquireContext
; is necessary for Windows 2000. For Windows 98 the parameter
; can be zero.
; ------------------------------------------------------------

CRYPT_VERIFYCONTEXT equ 0F0000000h

align 4
csp_rand proc uses ebx base:DWORD
    mov ebx,pNext
    .IF ebx == 0
        invoke CryptAcquireContext, ADDR hProv,
                                    NULL,
                                    NULL,
                                    PROV_RSA_FULL,
                                    CRYPT_VERIFYCONTEXT
        .IF eax == 0
            print "CryptAcquireContext failed",13,10
            ret
        .ENDIF
        invoke CryptGenRandom, hProv, 10000*4, ADDR rand_buffer
        .IF eax == 0
            print "CryptGenRandom failed",13,10
        .ENDIF
        invoke CryptReleaseContext, hProv, 0
    .ENDIF
    mov eax, [rand_buffer+ebx]
    add ebx, 4
    .IF ebx > 9999*4
      xor ebx, ebx
    .ENDIF
    mov pNext, ebx
    xor edx, edx
    div base
    mov eax, edx
    ret
csp_rand endp

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    SAMPLES equ 10000000

    xor ebx, ebx
    .WHILE ebx < SAMPLES
      invoke nrandom, 10
      inc [cnts+eax*4]
      inc ebx
    .ENDW
    xor ebx, ebx
    .WHILE ebx < 10
      print ustr$(ebx),9
      print ustr$([cnts+ebx*4]),9
      sub [cnts+ebx*4], SAMPLES / 10
      js @F
      print chr$("+")
    @@:
      add edi, [cnts+ebx*4] 
      print str$([cnts+ebx*4]),13,10
      mov [cnts+ebx*4], 0
      inc ebx
    .ENDW

    print chr$(13,10)

    xor ebx, ebx
    .WHILE ebx < SAMPLES
      invoke csp_rand, 10
      inc [cnts+eax*4]
      inc ebx
    .ENDW
    xor ebx, ebx
    .WHILE ebx < 10
      print ustr$(ebx),9
      print ustr$([cnts+ebx*4]),9
      sub [cnts+ebx*4], SAMPLES / 10
      js @F
      print chr$("+")
    @@: 
      print str$([cnts+ebx*4]),13,10
      mov [cnts+ebx*4], 0
      inc ebx
    .ENDW

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


0       1001901 +1901
1       999198  -802
2       1001659 +1659
3       1000060 +60
4       999904  -96
5       1000079 +79
6       999780  -220
7       998903  -1097
8       1000925 +925
9       997591  -2409

0       1000055 +55
1       999199  -801
2       1001228 +1228
3       999010  -990
4       999554  -446
5       1001137 +1137
6       1000259 +259
7       999340  -660
8       999389  -611
9       1000829 +829

eschew obfuscation

u

The better your nrandom() function is, the worse distribution you'd get with that current approach:

        inc [arr+eax*4]             ; Merken, wie oft die Zahl gekommen ist
        .if [arr+eax*4] > 1         ; Wenn Zahl bereits gekommen
            dec [arr+eax*4]
            jmp @B ; !!!!!! this. You're forcing another nrandom() call, instead of using the data
        .endif


If nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw! And 44 - only on the first two draws. That's why balls "1" to "40" are getting good even results currently.

Btw,

(line 120)    .until ebx == range

This is a future bug waiting to happen. Use "Max" instead of "range" :). Same for line 139, but you have to restore the initial value of Max somehow. 
Please use a smaller graphic in your signature.

u

Here's an outline of the above idea:


create_draw proc Max,Num
local Taken[range]:BYTE

inc Num ; so that we fit the "bonus number"

;-----[ clear Taken ]-------[
mov ebx,Max
@@:
dec ebx
mov Taken[ebx],0
jnz @B
;---------------------------/


xor ebx,ebx
.repeat
invoke nrandom,Max
;---[ find eax-th empty slot ]------[
mov ecx,-1
inc eax

@@:
inc ecx

cmp Taken[ecx],0
jne @B

dec eax
jnz @B
mov Taken[ecx],1
; ecx is the index 0...44 now
;-----------------------------------/

inc ecx
mov [draw+ebx*4], ecx

inc ebx
dec Max
.until ebx==Num


ret
create_draw endp
Please use a smaller graphic in your signature.

jj2007

Quote from: Ultrano on November 11, 2007, 07:19:07 AMIf nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw!
Indeed. What you might do instead is to offer the full 45 in each loop, check if the drawn number has been used already, and discard if it has been used.

MichaelW

This code displays the distribution for 1,000,000,000 groups of 6 unique numbers from the range 0 to 44.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      cnts  dd 45 dup(0)
      flags dd 45 dup(0)
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    SAMPLES equ 1000000000

    xor ebx, ebx
    .WHILE ebx < SAMPLES
      invoke RtlZeroMemory, ADDR flags, 45*4
      xor esi, esi
      .WHILE esi < 6
        invoke nrandom, 45
        .IF DWORD PTR[flags+eax*4] == 0
          inc DWORD PTR[flags+eax*4]
          inc [cnts+eax*4]
          inc esi
        .ENDIF
      .ENDW
      inc ebx
    .ENDW

    xor ebx, ebx
    .WHILE ebx < 45
      print ustr$([cnts+ebx*4]),13,10
      inc ebx
    .ENDW

    print chr$(13,10)

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


133323453
133327735
133330106
133329841
133331917
133328262
133336255
133337350
133334383
133336577
133337976
133340867
133334076
133333848
133343289
133321969
133328977
133330938
133332123
133331912
133333104
133335274
133333310
133343180
133332216
133337966
133336933
133339215
133338040
133334872
133329817
133331469
133329380
133332563
133327579
133330870
133333383
133329160
133338755
133331647
133332257
133334142
133330230
133336198
133336586
eschew obfuscation

jj2007

Quote from: jj2007 on November 11, 2007, 12:39:09 PM
Quote from: Ultrano on November 11, 2007, 07:19:07 AMIf nrandom() returns "0", take the first empty slot. If nrandom() returns "33", take the 34th empty slot.
It's not striking that ball 45 is rarely taken - you give a chance for it to be taken _only_ on the first draw!
Indeed. What you might do instead is to offer the full 45 in each loop, check if the drawn number has been used already, and discard if it has been used.
Other option, maybe more precise: Instead of discarding and repeating, provide an initial table with 45 entries
balls  db 1,2,3,4,5,6, ... 44,45
.. and apply the random generator with equal chances for all 45 entries.
Now assume that 5 has been drawn in the first round:
- copy 6...45 into the initial 5..44 slots
- which results in a modified table 1,2,3,4,6,...45
- apply the random generator with equal chances for all remaining 44 entries

This procedure mimics exactly what happens in reality.

cobold

Thanks to everone for your feedback. For now I'm trying to adept all the ideas.

Mark Jones

A 64kb serial result of Agner Fog's prng routine (analyzed with ent.exe) yields:

Entropy = 7.997172 bits per byte.

Optimum compression would reduce the size
of this 65536 byte file by 0 percent.

Chi square distribution for 65536 samples is 256.51, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.9213 (127.5 = random).
Monte Carlo value for Pi is 3.130562168 (error 0.35 percent).
Serial correlation coefficient is 0.003217 (totally uncorrelated = 0.0).


That's the best PRNG I've seen. For larger sample sets (1MB, 10MB) the numbers approach the ideals.
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

Tedd

Quote from: jj2007 on November 11, 2007, 04:14:39 PM
Other option, maybe more precise: Instead of discarding and repeating, provide an initial table with 45 entries
balls  db 1,2,3,4,5,6, ... 44,45
.. and apply the random generator with equal chances for all 45 entries.
Now assume that 5 has been drawn in the first round:
- copy 6...45 into the initial 5..44 slots
- which results in a modified table 1,2,3,4,6,...45
- apply the random generator with equal chances for all remaining 44 entries

This procedure mimics exactly what happens in reality.
To avoid the copying (which is inherently slow), you can count through the list until you find the Nth undrawn number.
Or, if you have complete faith in randomness (you shouldn't :lol) then you can swap the Nth number with the last element of the list -- a random random number is still random.
No snowflake in an avalanche feels responsible.