Probably in this part of the FUN ASM CODING, I can learn some trick and
some mnemonics as well.
The idea I have in my mind is about searching inside the combinations we generated
with many kind of statistical patterns.
Bacause all the job can be done using memory, registers and stack, and simple arithmetic operations,
and we are not I/O bound, or anything else, all the full speed and potential of ASM CODING
can be used in this SUB. At least this is what I hope.
Let's start the easy way:
- we have a file of 16 groups of 4-value integer numbers, for 5,000 data records.
- we can read all of them inside the memory, into an array structure or the like
- we start searching, among all the combinations of 4 numbers, which one is the "lucky one",
so to say, adding 1 to a counter each time one of the four digit we are searching is found
inside one of the 16 groups of 4-value numbers
- at the end "The winner is..." n1,n2,n3,n4 totaling xxx,xxx count inside the 80,000 groups
of 4-digit combinations.
I'm going to realize a PB version, in this way I can learn some PB as well.
Afterwhile I think we can find many points to optimize with ASM.
We only have integer values, integer registers, integer operations. It should be a good
occasion to speed up things with ASM.
Let's see what we get.
well - "search" probably isn't the right word
let me see if i understand you...
you want to see which of the values from 1 to 50 occurs most often ?
.data
Totals dd 50 dup(0)
.code
Total0: push esi
mov ecx,80000
push edi
mov esi,offset FileArray
;tally the totals
Total1: mov edi,[esi]
or edi,edi ;safety test - may be omitted
jz Total2 ;safety branch - may be omitted
cmp edi,50 ;safety test - may be omitted
ja Total2 ;safety branch - may be omitted
shl edi,2
inc dword ptr Totals[edi-4]
Total2: add esi,4
dec ecx
jnz Total1
;find the largest value in Totals array
;in this case, we return the lower index if more than one are max total
mov esi,offset Totals
mov ecx,50
Total3: mov eax,[esi]
mov edx,esi
jmp short Total5
Total4: cmp eax,[esi]
jb Total3
Total5: add esi,4
dec ecx
jnz Total4
sub edx,offset Totals-4
pop edi
shr edx,2
pop esi
;EAX = frequency
;EDX = number
EDIT - corrected a mistake :P
oops - another one
Great!
This could be one of the millions of possible statistic. I'll save and study it
for exercising some ASM concept. :P
What I had in mind, for the first "counting", is a little more
complicated:
- We have to generate all the possible combinations of 4 digit ranging 1-50
like I did in the small routine I posted earlier: perm4.zip, that means:
230,300 combinations.
- For each of them we have to check if one/more of the four digit is inside the
80,000 groups of 4 digit created previosly with ramdom generating routine.
- This means that we can find more than one digit at a time and have to count
them all.
- When we finish with the first group [1 2 3 4], we save the result and the
combination somewhere, like numeric variables. :lol
- We start with the next combination [1 2 3 5] and do the same.
- We confront the results of the first and the second, and save the bigger counter
and the relative 4 digit group, deleting the other
- we go on until the last combination [47 48 49 50]
- when the last counting and compare is done we remain with the biggest
counter found and the relative four digit combination
- we display the result:
"The combination: " n1 n2 n3 n4 " has totalised " count_hits " the biggest found so far"
This will take a little bit more to compute and we have to find ways of optimizing it
to have the job done in less than 1 minute.
It is somewhat more challenging. :dazzled:
I think in optimized PB I can have it done in about 4 minutes, but I've to check
before giving the correct amount of elapsed time and CPU cycles.
It's past midnight over here, and I don't know if my mind is ready. I try anyway
right now. If I fall asleep, I'll continue in few hours. :8)
well - that is a LOT of comparing
i seriously doubt you can do it exactly that way in less than a minute on any computer
one way to speed things up would be......
only test combinations that are present
for example - it is unlikely that 1,2,3,4 will exist in the randomly generated groups
no need to test for it, then
better to pick up a group that you do have, and totalize matching groups
mark them - you have plenty of extra bits to use as markers
then, on the next pass, the ones that are marked may be skipped
you will want a little sorter so, when comparing groups, they are in sorted order
when you pick up the first "master" group, sort it
then, look for a match against the lowest value only - not all 4
if a group does not have that value in it - no need wasting time looking at the other 3 values
Quote from: dedndave on April 25, 2010, 10:27:28 PM
well - that is a LOT of comparing
i seriously doubt you can do it exactly that way in less than a minute on any computer
one way to speed things up would be......
only test combinations that are present
for example - it is unlikely that 1,2,3,4 will exist in the randomly generated groups
no need to test for it, then
better to pick up a group that you do have, and totalize matching groups
mark them - you have plenty of extra bits to use as markers
then, on the next pass, the ones that are marked may be skipped
you will want a little sorter so, when comparing groups, they are in sorted order
Well, if the group [1 2 3 4] is not among the randomly generated groups, it doesn't
matter, because we have to test for the single digits, so if we test against [43 12 2 38]
we find the digit "2" and we count +1, if we test against [23 4 1 18] we find "4" and "1"
and we count +2, and so on.
Sorry for not being very clear, I'm trying to express the idea in the best way I can.
That is a lot of comparing, you're right, but the algo I have in mind doesn't really need
a lot of them. I'm thinking about some tricks to get the job done in a faster way.
Give me some time to CODE and test, and I'll show you what I designed.
One of the trick could be:
Instead of comparing each single digit of [1 2 3 4] with all the four single digits of [43 12 2 38]
that is 16 compare, we do:
1) an array of 50 integer initialized to 0 value outside of the cycle with index ranging 1 to 50
2) we use the 4 digits of the random group [43 12 2 38] like index to switch to 1 the relative
array elements, "ADD 1" I mean, or better "MOV 1" to the array element.
3) we use the 4 digits of the testing group [1 2 3 4] like index and ADD the four array elements
this way, if "1" is not in the [43 12 2 38] group we ADD 0 to the counter, because the array element
has 0 value, and if "2" is switched to "1" we ADD 1 to the counter.
4) in total we do 4 MOV, 4 ADD, 4 MOV to reset to "0" the four array index [43 12 2 38]
5) I think this should be faster than 16 COMPARE and some ADDs to the counter.
With tricks like this in ASM we can do a very fast routine, I think. ::)
i am not sure what you mean - lol
when you write the code, i will see if i can read it and "gleen" an answer like Carnac the Magnificent
(http://blogs.dallasobserver.com/sportatorium/Carnac.jpg)
Trying to explain in PB-like CODE:
file randomly generated: I group [12- 5- 34- 1]
First group to test: [1 - 2 - 3 - 4]
the randomly generated numbers switch the relative array elements to 1:
num(12) = 1
num(5) = 1
num(34) = 1
num(1) = 1
we count the hits found:
count = count + num(1) + num(2) + num(3) + num(4) ' in this case num(1) has got "1" the others have "0"
we reset the switched array elements for the next group to count:
num(12) = 0
num(5) = 0
num(34) = 0
num(1) = 0
so we did no compare at all, only moved values between integers and added integers [very fast operations]
Is it a little bit clearer? :8)
Yes, basically a score board. And cheaper to clear the four known locations than to clear the whole array. Personally I'd thought of using a bit vector to the same effect.
Any reason for 50 (lottery), verses 64? Generating a power of two would be cheaper than MOD 50.
-Clive
no :P
i am not sure what you expect "count" to be at the end
i think it will be 80000
maybe - instead of explaining the algorithm, explain the real-world use for it
i am an engineer, but i understand business, too :bg
or - just write it in PB and i'll see if i get it, then :P
Quote from: clive on April 25, 2010, 11:23:59 PM
Yes, basically a score board. And cheaper to clear the four known locations than to clear the whole array. Personally I'd thought of using a bit vector to the same effect.
Any reason for 50 (lottery), verses 64? Generating a power of two would be cheaper than MOD 50.
-Clive
Do you think a bit vector would be faster than integers? I don't know very much about ASM but
anything I read so far tells me integer operations are faster. Of course it could be the reverse.
50 or 64 or 100 doesn't really matter, I just used a manageable amount of combinations.
100 numbers in groups of 6-10 would be almost impossible to check in a actual machine.
In order to learn and to test the SUBs, the amounts have to be doable in a reasonable time,
and you cannot always choose the fastest way in real life. ::)
Quote from: dedndave on April 25, 2010, 11:25:20 PM
no :P
i am not sure what you expect "count" to be at the end
i think it will be 80000
maybe - instead of explaining the algorithm, explain the real-world use for it
i am an engineer, but i understand business, too :bg
or - just write it in PB and i'll see if i get it, then :P
Probably it will be near 80,000 more or less, if the RND function in PB is a good one.
The point is: it doesn't really matter, otherwise it would be "ASM for JOB", or the like.
Mine if "ASM for FUN", so if I can't use it in real life I'll enjoy more.
It is very satisfying to reach a goal, and mine is to learn ASM playing with it.
It was somewhat boring to CODE for BUSINESS in PLI, I could not choose
very much.
Data structure was already there, the flow of procedure was already established,
so I had just a small portion of the whole to make working.
When I really enjoyed was when I could "invent" some algo or find some trick
to have a better performance.
Now I think I found one of those trick, not that I am the first, maybe it is a well
established concept, but for me it is a new one.
As done before for all the SUBs, I'll CODE it in PB [PB for FUN :lol ] and afterwhile
with the help of the more experienced ones I'll try to optimize it.
Optimizing is really a good exercise for the mind, isn't it? :lol
A bit vector would be faster than integers especially if you could keep it in a couple of registers, your [1, 2, 3, 4] example could be confirmed with a cmp eax,0Fh. Not that I think that [1,2,3,4] or [4,3,2,1], etc is an unreasonable sequence.
Using 64 combinations, I'd be tempted to use a Maximal LFSR to generate random sequences. I'd have to do some tests on the number distributions, but they'd be highly efficient.
-Clive
Quote from: clive on April 26, 2010, 12:47:06 AM
A bit vector would be faster than integers especially if you could keep it in a couple of registers, your [1, 2, 3, 4] example could be confirmed with a cmp eax,0Fh. Not that I think that [1,2,3,4] or [4,3,2,1], etc is an unreasonable sequence.
-Clive
Let me see if I've got your point. If I can use EAX and EBX I have 64 bits available, how do I
test the sequence [1, 2, 3, 4] against for example [23, 11, 7, 3] ?
How do I count the results?
Do I start with all the bits "0" and after I switch the bits [23, 11, 7, 3] to "1" before the cmp?
Or what? Please I'm a beginner, I don't actually know very much about ASM.
Bear some patience with me, try not to skip any step. It could be obvious for you but I easily get lost
at the moment. ::)
Here are some thoughts on how to do this in assembler. Hopefully the fragments will spark some ideas/questions.
I'm going to pack your numbers into one 32-bit word, as 1..50 easily fits in 8-bits. You could read it out of DWORDs in memory if your wanted instead of rotating the packed version.
xor eax,eax ; clear bit vector
xor ebx,ebx
mov ecx,004030201h ; [1, 2, 3, 4] packed in a DWORD
; rotate through packed format converting to bit vector ebx:eax
call setbit ; bit convert entry in CL
rol ecx,8
call setbit
rol ecx,8
call setbit
rol ecx,8
call setbit
rol ecx,8
setbit PROC
push ecx ; save the original
movzx ecx,cl ; convert low order byte into dword
dec ecx ; convert 1..50 to 0..49 range
test ecx,20h ; test bit 5 (ie >= 32)
jnz highdword ; bit 5 is high
; 1-32 (0..31)
bts eax,ecx ; set bit ecx (0..31), in eax
jmp done
highdword: ; 33-64 (32..63)
and ecx,1Fh ; mask lower 5 bits, eqv -32
bts ebx,ecx ; set bit ecx (0..31), in ebx
done:
pop ecx
ret
bitset ENDP
For [1, 2, 3, 4]
eax=00000000 ebx=00000000 Cleared
cl=01
eax=00000001 ebx=00000000
cl=02
eax=00000003 ebx=00000000
cl=03
eax=00000007 ebx=00000000
cl=04
eax=0000000F ebx=00000000
For [23, 11, 7, 3]
eax=00000000 ebx=00000000 Cleared
cl=17
eax=00400000 ebx=00000000
cl=0B
eax=00400400 ebx=00000000
cl=07
eax=00400440 ebx=00000000
cl=03
eax=00400444 ebx=00000000
Ok, so if you think [1, 2, 3, 4] is equivalent to [4, 3, 2, 1], [3, 2, 1, 4], etc
You convert all your number sets to bit vectors, then sort the bit vectors, and finally you could enumerate through the list of sorted bit vectors counting the ones which are the same.
The point of sorting would be to minimize the amount of searching required to find matching bit vector patterns. Sorting would naturally group them together making the counting phase simpler.
Quote from: clive on April 26, 2010, 04:01:43 AM
Ok, so if you think [1, 2, 3, 4] is equivalent to [4, 3, 2, 1], [3, 2, 1, 4], etc
You convert all your number sets to bit vectors, then sort the bit vectors, and finally you could enumerate through the list of sorted bit vectors counting the ones which are the same.
The point of sorting would be to minimize the amount of searching required to find matching bit vector patterns. Sorting would naturally group them together making the counting phase simpler.
I can imagine this can be done, but what about speed? It looks like it'll take longer doing all these bit operations than
doing the simple MOVs and ADDs I had in mind.
All depends of course on how fast these bit operations are.
As soon as I'll have the PB code and test it, we can see the difference.
Thanks anyway for your help
Here we have the first counting excercise :lol
In the attached zip file:
- CombCheck4.bas
- CombCheck4.exe
- CombCheck4.scn
to run the program you have to use the program RandGen3.exe/scn attached in SUB #5
to generate the file with random numbers and after the CombCheck4 to count/display
the repetitions of the single numbers.
The RND function of PB has done a pretty good job as you can see.
This is a simplified counting, and the performance misures all the steps togheter
not the single counting routine.
All the job is done with a simple loop:
SUB CountNumbers
FOR x = 1 TO 320000
INCR numbers&(n&(x))
NEXT x
END SUB
It is just a short sample of what we can count on the file.
The next SUBs are going to do more complex counting, like the one I described earlier.
Enjoy
Ok, so creating 80000, 4 groups of 4, followed by identifying [1, 2, 3, 4] in any order, as well as any other 4 group pairings.
3 GHz P4 Prescott, takes 220 ms in C, without much effort to optimize.
Here we identified FOUR occurrences of [1,2,3,4],[1,2,4,3],[4,3,2,1],etc from the 320000 groups generated
Buffer 5120000 bytes
Generating..
16 Ticks, 76402283 Cycles
Saving.. 'rndtest.out'
Evaluating Distribution..
25600 Expected Average per bin
1 : 25616 16
2 : 25528 -72
3 : 25565 -35
4 : 25677 77
5 : 25915 315
6 : 25689 89
7 : 25880 280
8 : 25484 -116
9 : 25690 90
10 : 25340 -260
11 : 25548 -52
12 : 25686 86
13 : 25382 -218
14 : 25813 213
15 : 25240 -360
16 : 25729 129
17 : 25560 -40
18 : 25385 -215
19 : 25507 -93
20 : 25676 76
21 : 25339 -261
22 : 25727 127
23 : 25488 -112
24 : 25821 221
25 : 25463 -137
26 : 25583 -17
27 : 25844 244
28 : 25361 -239
29 : 25504 -96
30 : 25660 60
31 : 25542 -58
32 : 25476 -124
33 : 25628 28
34 : 25629 29
35 : 25512 -88
36 : 25493 -107
37 : 25708 108
38 : 25533 -67
39 : 25874 274
40 : 25475 -125
41 : 25723 123
42 : 25612 12
43 : 25477 -123
44 : 25528 -72
45 : 25925 325
46 : 25596 -4
47 : 25648 48
48 : 25706 106
49 : 25543 -57
50 : 25672 72
Converting to bit vectors..
16 Ticks, 71265015 Cycles
Sorting bit vectors..
187 Ticks, 571231005 Cycles
4 000000000000000F [1,2,3,4]
2 0000000000000017 [1,2,3,5]
3 000000000000001D [1,3,4,5]
4 000000000000002B [1,2,4,6]
2 000000000000002D [1,3,4,6]
2 000000000000002E [2,3,4,6]
... ad tedium
2 0003804000000000 [39,48,49,50]
3 0003840000000000 [43,48,49,50]
2 0003880000000000 [44,48,49,50]
2 0003900000000000 [45,48,49,50]
The most frequent in this run
9 0000001000020820 [6,12,18,37]
9 0000001100480000 [20,23,33,37]
9 0000044000088000 [16,20,39,43]
9 0000100420000020 [6,30,35,45]
9 0000200001001100 [9,13,25,46]
For the generation of random numbers did you use a C ANSI function?
the performance you reported:
Generating..
16 Ticks, 76,402,283 Cycles
is much the same of the PB function RND, that in my core duo 2.6 Ghz takes:
16 ticks, 60,298,705 Cycles
For identifing the sequences of 4 numbers I should have a look at the C CODE to
understand what you did, hoping I can grasp it's meaning. :lol
As I get time enough to code the 4 digit search inside the random numbers file
I'll post it and test the performance of the routine to have a performance match.
Regarding this part of the report:
Sorting bit vectors..
187 Ticks, 571231005 Cycles
That's what I suspected, a lot of time spent doing something that the
algo I've in mind skips altogheter, no need to sort anything.
If I don't fall asleep, in few hours I'll post the new version for the 4 digit searching
and counting.
I used a 32-bit MLS (maximal length sequence) pseudo random binary sequence pulling out 6-bits at a time. Rolling the dice again if the number is outside 0..49, or if I hit a previously selected number. It's derived from some efficient ARM assembler (5 cycles), and probably doesn't translate well in C, or x86. The distribution looked better than rand().
The vectorization takes your groups of 4, and scoreboards the values into a 64-bit representation. Once I've filled this table, I then QuickSort it.
ELEMENTS = (80000 * 4)
printf("Converting to bit vectors..\n\n");
TickStart = GetTickCount();
TSCStart = ReadTSC();
p = Buffer; // DWORD *, where group of 4 elements [ p[0],p[1],p[2],p[3] ]
q = BitVector; // __int64 *
for(i=0; i<ELEMENTS; i++)
{
*q = ((__int64)1 << (p[0]-1)) | ((__int64)1 << (p[1]-1)) | ((__int64)1 << (p[2]-1)) | ((__int64)1 << (p[3]-1));
p += 4; // Advance by 4 entries
q++; // Advance by single 64-bit entry
}
TSCEnd = ReadTSC();
TickEnd = GetTickCount();
printf("%10d Ticks, %10d Cycles\n",TickEnd - TickStart, TSCEnd - TSCStart);
Quote from: clive on April 26, 2010, 05:11:35 PM
Ok, so creating 80000, 4 groups of 4, followed by identifying [1, 2, 3, 4] in any order, as well as any other 4 group pairings.
4 000000000000000F [1,2,3,4]
2 0000000000000017 [1,2,3,5]
3 000000000000001D [1,3,4,5]
4 000000000000002B [1,2,4,6]
2 000000000000002D [1,3,4,6]
2 000000000000002E [2,3,4,6]
... ad tedium
2 0003804000000000 [39,48,49,50]
3 0003840000000000 [43,48,49,50]
2 0003880000000000 [44,48,49,50]
2 0003900000000000 [45,48,49,50]
The most frequent in this run
9 0000001000020820 [6,12,18,37]
9 0000001100480000 [20,23,33,37]
9 0000044000088000 [16,20,39,43]
9 0000100420000020 [6,30,35,45]
9 0000200001001100 [9,13,25,46]
I can't really understand these results.
Let's see what they mean:
The most frequent in this run
9 0000001000020820 [6,12,18,37]
9 = how many times you found it?
0000001000020820 = the vectorization of a 4 digit group?
[6,12,18,37] = the corresponding 4 digit group?
Please help me to understand the output of your routine. ::)
Yes, the bit patterns at the end were each identified 9 times in the 320000 entries. I ignored entries found only once.
The bit patterns are 64-bit HEX, where the bits represent 2**(N-1). So bit 0 is 1, bit 49 is 50
N=1, 0001
N=2, 0002
N=3, 0004
N=4, 0008
[1, 2, 3, 4] = 1 or 2 or 4 or 8 = 15 or 0x0F
Quote from: clive on April 26, 2010, 09:38:32 PM
Yes, the bit patterns at the end were each identified 9 times in the 320000 entries. I ignored entries found only once.
The bit patterns are 64-bit HEX, where the bits represent 2**(N-1). So bit 0 is 1, bit 49 is 50
N=1, 0001
N=2, 0002
N=3, 0004
N=4, 0008
[1, 2, 3, 4] = 1 or 2 or 4 or 8 = 15 or 0x0F
This puzzles me even more.
We should have a total of 230,300 possible combinations of 4 numbers,
with no repetition and ignoring the order inside the single combinations,
[1,3,4,1] = not allowed because of repetition
[1,2,3,4] = any combination of the 4 digit = we consider 1 combination total
if we have a total of 230,300, and we find some of them 9 times, it means we should
find others 0 times to balance the total.
Did you find many combinations with 0 score?
We are matching 230,300 combinations [all the possible ones] against
80.000 [the random generated ones], or at least this is what I have imagined.
Are you using this pattern?
if there are 230,300 combinations and 80,000 groups, most of them should have a 0 count no matter what :red
it is somewhat unlikely that a group will occur twice and more unlikely it will occur three times if you have a decent RNG
There are 80000 x 4 entries (320000) each with a group of 4 numbers
With potentially 230300 combinations, it looks to cover about 172000 of them. I'd have to say the routine I was using, and rand() both has "sequence" patterns.
I haven't counted the number of 0 counts, because it doesn't generated any.
i thought there were only supposed to be 80,000 four-value groups
no wonder you have the bigger numbers, Clive :red
it should be easy to figure out how many 0-groups there are
take the number of possible groups minus the number of different groups found
With 80000, I'm getting around ~67650 unique combinations.
Quote from: clive on April 27, 2010, 01:57:25 AM
With 80000, I'm getting around ~67650 unique combinations.
That's quite probable.
What about the performance?
My PB algo is taking something around 10-11 minutes, based on early tests,
to match 230,300 combinations against the 80,000 random ones.
What performance did you get? How long does it take to run the whole loop?
groups occurring 0 times: 162,887
groups occurring 1 times: 56,191
groups occurring 2 times: 9,970
groups occurring 3 times: 1,143
groups occurring 4 times: 105
groups occurring 5 times: 4
groups occurring 6 times: 0
groups occurring 7 times: 0
groups occurring 8 times: 0
groups occurring 9 times: 0
groups occurring 10 times: 0
Total groups scanned: 230,300
Elapsed MS: 646,936
Elapsed CPU Cycles: 1,548,762,482,493
The core routine for matching the combinations:
'---------------------------------------------------------------------------------------
' Counts the occurrences of the 4 digit groups that match all the digits.
' Test all the groups of 4 digits starting from [1,2,3,4] to [47,48,49,50]
' against the 80,000 randomly generated ones.
'---------------------------------------------------------------------------------------
SUB CountNumbers
groups = 0
n_idx = 1
n_target = 4
FOR x1 = 1 TO 47
FOR x2 = x1 + 1 TO 48
FOR x3 = x2 + 1 TO 49
FOR x4 = x3 + 1 TO 50
GOSUB Check
NEXT x4
NEXT x3
NEXT x2
NEXT x1
GOTO exit_count
Check:
FOR x = 1 TO 80000
numbers&(n&(n_idx)) = 1
INCR n_idx
numbers&(n&(n_idx)) = 1
INCR n_idx
numbers&(n&(n_idx)) = 1
INCR n_idx
numbers&(n&(n_idx)) = 1
total_group = numbers&(x1) + numbers&(x2) + numbers&(x3) + numbers&(x4)
IF total_group = n_target THEN
INCR match
END IF
total_group = 0
numbers&(n&(n_idx)) = 0
numbers&(n&(n_idx -1)) = 0
numbers&(n&(n_idx -2)) = 0
numbers&(n&(n_idx -3)) = 0
INCR n_idx
NEXT x
INCR occurr&(match)
match = 0
n_idx = 1
RETURN
exit_count:
END SUB
I didn't do anything to optimize, yet. This is the raw version. :P
With the correct use of a couple of registers it runs in less than 8 minutes now :8)
#REGISTER NONE
REGISTER n_idx AS LONG
REGISTER total_group AS LONG
:8) <--- not so fast, Frank
8 minutes is too long, i should think
that is because you are testing all 230,300 combinations
sorting the groups was mentioned earlier (sorting the 4 values in each group - not the groups)
i think i would sort them before they are stored - as they are generated, in fact
that will speed up anything you want to do with them afterward
if they are ordered groups, you could also convert them to ordinals, as well
it depends a lot on what you want to do with the values when you are done
but for each group, you could store the index (dword), the ordinal (dword), and the 4 values (bytes)
now you can even sort the groups - you know what order they were chose in by the index value
tallying the totals like you are would be very fast
Quote from: dedndave on April 27, 2010, 11:20:38 AM
:8) <--- not so fast, Frank
8 minutes is too long, i should think
that is because you are testing all 230,300 combinations
You are right, of course. I was enjoying the fact that without
changing the algo, only defining 2 variables in a different way, the
algo got a 30% optimization. :P <== probably this is more appropriate :lol
Quote from: dedndave on April 27, 2010, 11:20:38 AM
sorting the groups was mentioned earlier (sorting the 4 values in each group - not the groups)
i think i would sort them before they are stored - as they are generated, in fact
that will speed up anything you want to do with them afterward
if they are ordered groups, you could also convert them to ordinals, as well
it depends a lot on what you want to do with the values when you are done
but for each group, you could store the index (dword), the ordinal (dword), and the 4 values (bytes)
now you can even sort the groups - you know what order they were chose in by the index value
tallying the totals like you are would be very fast
You are right again, in this particular situation, given a file with numbers
that are not changing, an index would speed up the algo a lot.
I think because it could help skipping the combinations [most of them]
that are out of the index range we are searching.
Anyway I've to think about it. I haven't decided yet what kind of statistics
estrapolate from the randomly generated numbers, but it should be
something like: any. :dazzled:
For example in this case we have searched for 4 digit match, it could be
for 1 digit match or 2 or 3 as well, or it could be: when a specific number appears, what are
the numbers that are more likely to appear in the next 10-20 groups, if any,
or it could be how long [or many groups] between a number appears and the next
occurrence, any combination of the 1 digit - 2 digit - 3 digit inside the groups and
the above, and so on...
Like in a lottery they say: this number doesn't show since 12 months, or in
a web shop: this client doesn't place an order since feb 2009, or the like
in any flavour.
In which season of the year this particular customer places a lot of orders,
or in numbers: when this particular number shows more frequently...
Probably for some task a sorted group [1,2,3,4] is better than [2,1,4,3] and
for some others it is not, or it doesn't affect the performance. I'm not sure. :red
From your comments and suggestions I have to think that it is always better
to have some kind of index.
I don't have an engineer background, so I'm just making suppositions.
Recalling my own past experience, mostly of the times it is better to have
an index than not.
I have to thank you and all the guys in the forum for the good suggestions.
I need some time to digest, I am quite slow at the moment.
Because it is just an exercise, with no particular objective to reach, anything
that shows up is just an occasion to learn something new.
So bear some patience, and let's have some random fun :P
P.S. When I get time enough I'll try using an index with this particular task
and see how does it work. No algo in my mind at the moment. :lol
I added some code to provide stats and bin the groups. Even with printing and saving we are at about 80ms.
-Clive
Buffer 1280000 bytes
Generating..
16 Ticks, 18381900 Cycles
Saving.. 'rndtest.out'
Converting to bit vectors..
0 Ticks, 17054647 Cycles
Sorting bit vectors..
47 Ticks, 136977937 Cycles
Scanning..
0 Ticks, 2399887 Cycles
67515 Unique
80000 Elements
Groups occuring 1 times : 56370
Groups occuring 2 times : 9949
Groups occuring 3 times : 1060
Groups occuring 4 times : 128
Groups occuring 5 times : 8
67515
Entire Application Totals
79 Ticks, 229950982 Cycles
Quote from: clive on April 27, 2010, 05:37:38 PM
I added some code to provide stats and bin the groups. Even with printing and saving we are at about 80ms.
-Clive
Buffer 1280000 bytes
Generating..
16 Ticks, 18381900 Cycles
Saving.. 'rndtest.out'
Converting to bit vectors..
0 Ticks, 17054647 Cycles
Sorting bit vectors..
47 Ticks, 136977937 Cycles
Scanning..
0 Ticks, 2399887 Cycles
67515 Unique
80000 Elements
Groups occuring 1 times : 56370
Groups occuring 2 times : 9949
Groups occuring 3 times : 1060
Groups occuring 4 times : 128
Groups occuring 5 times : 8
67515
Entire Application Totals
79 Ticks, 229950982 Cycles
It looks like this is a far better algorithm :U
Did you write it in C or ASM?
I imagine that all the job is:
1) generating 80,000 random numbers and storing them into memory
2) writing them to a file
3) vectorizing the whole
4) sorting it in groups of four integer
5) counting the occurrences of matching pair of groups
without generating all the possibile 230,300 combinations
and without having to match 230,300 against 80,000
6) displaying the results
Well, skipping more than half unnecessary work is in itself
a good optimization :lol
And you have also used some methods I am not aware of. :red
Would you mind posting the complete source code?
With some study and effort I could understand pivotal methods of
doing things that I have never used before.
Thanks for your time
Let me clean it up a little, it is some what devoid of comments, it was a quick prototype and was done in C. There are probably bits which could be done more efficiently in ASM. The comparison used by qsort() is one that's called a lot. And the whole QuickSort could probably be customized/optimized for 64-bit integers.
The real trick is the sort, because it gets everything in order and you don't need to search for matches. You just check the immediate neighbour(s) in a single linear pass over the 80000 groups.
The bit vectoring flattens your 24 (4!) combinations of [1,2,3,4] in to a single comparison.
The whole taking a few minutes thing is what initially bugged me. But I think that is because you were dealing with 5.5M iterations.
-Clive
Quote from: clive on April 27, 2010, 09:09:18 PM
Let me clean it up a little, it is some what devoid of comments, it was a quick prototype and was done in C. There are probably bits which could be done more efficiently in ASM. The comparison used by qsort() is one that's called a lot. And the whole QuickSort could probably be customized/optimized for 64-bit integers.
The real trick is the sort, because it gets everything in order and you don't need to search for matches. You just check the immediate neighbour(s) in a single linear pass over the 80000 groups.
The bit vectoring flattens your 24 (4!) combinations of [1,2,3,4] in to a single comparison.
The whole taking a few minutes thing is what initially bugged me. But I think that is because you were dealing with 5.5M iterations.
-Clive
The original plan was about searching a single digit match inside the 4 digit group,
dealing with all the possible combinations of 4 digits for statistical reasons,
but at this point I am more interested in understanding the algo you have used, so
I'd like to deal first with vectorization that is quite new for me, and return to the
previous plan later on. :U
App with source.
-Clive
Quote from: clive on April 29, 2010, 04:06:02 PM
App with source.
-Clive
Thanks Clive, very kind of you.
Being back from week-end, in few days I'll start studying it
and probably bothering you with some questions as well. :P