ASM for FUN - #6 SUB [Searching inside random numbers]

frktons · April 25, 2010, 07:10:21 PM

Probably in this part of the FUN ASM CODING, I can learn some trick and
some mnemonics as well.
The idea I have in my mind is about searching inside the combinations we generated
with many kind of statistical patterns.

Bacause all the job can be done using memory, registers and stack, and simple arithmetic operations,
and we are not I/O bound, or anything else, all the full speed and potential of ASM CODING
can be used in this SUB. At least this is what I hope.

Let's start the easy way:

- we have a file of 16 groups of 4-value integer numbers, for 5,000 data records.
- we can read all of them inside the memory, into an array structure or the like
- we start searching, among all the combinations of 4 numbers, which one is the "lucky one",
so to say, adding 1 to a counter each time one of the four digit we are searching is found
inside one of the 16 groups of 4-value numbers
- at the end "The winner is..." n1,n2,n3,n4 totaling xxx,xxx count inside the 80,000 groups
of 4-digit combinations.

I'm going to realize a PB version, in this way I can learn some PB as well.
Afterwhile I think we can find many points to optimize with ASM.
We only have integer values, integer registers, integer operations. It should be a good
occasion to speed up things with ASM.

Let's see what we get.

dedndave · April 25, 2010, 08:02:59 PM

well - "search" probably isn't the right word

let me see if i understand you...
you want to see which of the values from 1 to 50 occurs most often ?

Code Select

        .data

Totals  dd 50 dup(0)

        .code

Total0: push    esi
        mov     ecx,80000
        push    edi
        mov     esi,offset FileArray

;tally the totals

Total1: mov     edi,[esi]
        or      edi,edi              ;safety test   - may be omitted
        jz      Total2               ;safety branch - may be omitted

        cmp     edi,50               ;safety test   - may be omitted
        ja      Total2               ;safety branch - may be omitted

        shl     edi,2
        inc dword ptr Totals[edi-4]

Total2: add     esi,4
        dec     ecx
        jnz     Total1

;find the largest value in Totals array
;in this case, we return the lower index if more than one are max total

        mov     esi,offset Totals
        mov     ecx,50

Total3: mov     eax,[esi]
        mov     edx,esi
        jmp short Total5

Total4: cmp     eax,[esi]
        jb      Total3

Total5: add     esi,4
        dec     ecx
        jnz     Total4

        sub     edx,offset Totals-4
        pop     edi
        shr     edx,2
        pop     esi

;EAX = frequency
;EDX = number

EDIT - corrected a mistake :P
oops - another one

frktons · April 25, 2010, 10:16:09 PM

Great!

This could be one of the millions of possible statistic. I'll save and study it
for exercising some ASM concept. :P

What I had in mind, for the first "counting", is a little more
complicated:

- We have to generate all the possible combinations of 4 digit ranging 1-50
like I did in the small routine I posted earlier: perm4.zip, that means:
230,300 combinations.
- For each of them we have to check if one/more of the four digit is inside the
80,000 groups of 4 digit created previosly with ramdom generating routine.
- This means that we can find more than one digit at a time and have to count
them all.
- When we finish with the first group [1 2 3 4], we save the result and the
combination somewhere, like numeric variables. :lol
- We start with the next combination [1 2 3 5] and do the same.
- We confront the results of the first and the second, and save the bigger counter
and the relative 4 digit group, deleting the other
- we go on until the last combination [47 48 49 50]
- when the last counting and compare is done we remain with the biggest
counter found and the relative four digit combination
- we display the result:
"The combination: " n1 n2 n3 n4 " has totalised " count_hits " the biggest found so far"

This will take a little bit more to compute and we have to find ways of optimizing it
to have the job done in less than 1 minute.

It is somewhat more challenging. :dazzled:
I think in optimized PB I can have it done in about 4 minutes, but I've to check
before giving the correct amount of elapsed time and CPU cycles.

It's past midnight over here, and I don't know if my mind is ready. I try anyway
right now. If I fall asleep, I'll continue in few hours. :8)

dedndave · April 25, 2010, 10:27:28 PM

well - that is a LOT of comparing
i seriously doubt you can do it exactly that way in less than a minute on any computer
one way to speed things up would be......
only test combinations that are present
for example - it is unlikely that 1,2,3,4 will exist in the randomly generated groups
no need to test for it, then
better to pick up a group that you do have, and totalize matching groups
mark them - you have plenty of extra bits to use as markers
then, on the next pass, the ones that are marked may be skipped

you will want a little sorter so, when comparing groups, they are in sorted order

when you pick up the first "master" group, sort it
then, look for a match against the lowest value only - not all 4
if a group does not have that value in it - no need wasting time looking at the other 3 values

frktons · April 25, 2010, 10:54:57 PM

Quote from: dedndave on April 25, 2010, 10:27:28 PM
well - that is a LOT of comparing
i seriously doubt you can do it exactly that way in less than a minute on any computer
one way to speed things up would be......
only test combinations that are present
for example - it is unlikely that 1,2,3,4 will exist in the randomly generated groups
no need to test for it, then
better to pick up a group that you do have, and totalize matching groups
mark them - you have plenty of extra bits to use as markers
then, on the next pass, the ones that are marked may be skipped

you will want a little sorter so, when comparing groups, they are in sorted order

Well, if the group [1 2 3 4] is not among the randomly generated groups, it doesn't
matter, because we have to test for the single digits, so if we test against [43 12 2 38]
we find the digit "2" and we count +1, if we test against [23 4 1 18] we find "4" and "1"
and we count +2, and so on.
Sorry for not being very clear, I'm trying to express the idea in the best way I can.

That is a lot of comparing, you're right, but the algo I have in mind doesn't really need
a lot of them. I'm thinking about some tricks to get the job done in a faster way.

Give me some time to CODE and test, and I'll show you what I designed.

One of the trick could be:

Instead of comparing each single digit of [1 2 3 4] with all the four single digits of [43 12 2 38]
that is 16 compare, we do:

1) an array of 50 integer initialized to 0 value outside of the cycle with index ranging 1 to 50
2) we use the 4 digits of the random group [43 12 2 38] like index to switch to 1 the relative
array elements, "ADD 1" I mean, or better "MOV 1" to the array element.
3) we use the 4 digits of the testing group [1 2 3 4] like index and ADD the four array elements
this way, if "1" is not in the [43 12 2 38] group we ADD 0 to the counter, because the array element
has 0 value, and if "2" is switched to "1" we ADD 1 to the counter.
4) in total we do 4 MOV, 4 ADD, 4 MOV to reset to "0" the four array index [43 12 2 38]
5) I think this should be faster than 16 COMPARE and some ADDs to the counter.

With tricks like this in ASM we can do a very fast routine, I think. ::)

dedndave · April 25, 2010, 11:01:25 PM

i am not sure what you mean - lol
when you write the code, i will see if i can read it and "gleen" an answer like Carnac the Magnificent

frktons · April 25, 2010, 11:14:48 PM

Trying to explain in PB-like CODE:

file randomly generated: I group [12- 5- 34- 1]

First group to test: [1 - 2 - 3 - 4]

the randomly generated numbers switch the relative array elements to 1:

num(12) = 1
num(5) = 1
num(34) = 1
num(1) = 1

we count the hits found:

count = count + num(1) + num(2) + num(3) + num(4) ' in this case num(1) has got "1" the others have "0"

we reset the switched array elements for the next group to count:

num(12) = 0
num(5) = 0
num(34) = 0
num(1) = 0

so we did no compare at all, only moved values between integers and added integers [very fast operations]

Is it a little bit clearer? :8)

clive · April 25, 2010, 11:23:59 PM

Yes, basically a score board. And cheaper to clear the four known locations than to clear the whole array. Personally I'd thought of using a bit vector to the same effect.

Any reason for 50 (lottery), verses 64? Generating a power of two would be cheaper than MOD 50.

-Clive

dedndave · April 25, 2010, 11:25:20 PM

no :P

i am not sure what you expect "count" to be at the end
i think it will be 80000

maybe - instead of explaining the algorithm, explain the real-world use for it
i am an engineer, but i understand business, too :bg

or - just write it in PB and i'll see if i get it, then :P

frktons · April 25, 2010, 11:43:32 PM

Quote from: clive on April 25, 2010, 11:23:59 PM
Yes, basically a score board. And cheaper to clear the four known locations than to clear the whole array. Personally I'd thought of using a bit vector to the same effect.

Any reason for 50 (lottery), verses 64? Generating a power of two would be cheaper than MOD 50.

-Clive

Do you think a bit vector would be faster than integers? I don't know very much about ASM but
anything I read so far tells me integer operations are faster. Of course it could be the reverse.
50 or 64 or 100 doesn't really matter, I just used a manageable amount of combinations.
100 numbers in groups of 6-10 would be almost impossible to check in a actual machine.
In order to learn and to test the SUBs, the amounts have to be doable in a reasonable time,
and you cannot always choose the fastest way in real life. ::)

Quote from: dedndave on April 25, 2010, 11:25:20 PM
no :P

i am not sure what you expect "count" to be at the end
i think it will be 80000

maybe - instead of explaining the algorithm, explain the real-world use for it
i am an engineer, but i understand business, too :bg

or - just write it in PB and i'll see if i get it, then :P

Probably it will be near 80,000 more or less, if the RND function in PB is a good one.
The point is: it doesn't really matter, otherwise it would be "ASM for JOB", or the like.
Mine if "ASM for FUN", so if I can't use it in real life I'll enjoy more.

It is very satisfying to reach a goal, and mine is to learn ASM playing with it.
It was somewhat boring to CODE for BUSINESS in PLI, I could not choose
very much.

Data structure was already there, the flow of procedure was already established,
so I had just a small portion of the whole to make working.
When I really enjoyed was when I could "invent" some algo or find some trick
to have a better performance.

Now I think I found one of those trick, not that I am the first, maybe it is a well
established concept, but for me it is a new one.

As done before for all the SUBs, I'll CODE it in PB [PB for FUN :lol ] and afterwhile
with the help of the more experienced ones I'll try to optimize it.

Optimizing is really a good exercise for the mind, isn't it? :lol

clive · April 26, 2010, 12:47:06 AM

A bit vector would be faster than integers especially if you could keep it in a couple of registers, your [1, 2, 3, 4] example could be confirmed with a cmp eax,0Fh. Not that I think that [1,2,3,4] or [4,3,2,1], etc is an unreasonable sequence.

Using 64 combinations, I'd be tempted to use a Maximal LFSR to generate random sequences. I'd have to do some tests on the number distributions, but they'd be highly efficient.

-Clive

frktons · April 26, 2010, 01:21:44 AM

Quote from: clive on April 26, 2010, 12:47:06 AM
A bit vector would be faster than integers especially if you could keep it in a couple of registers, your [1, 2, 3, 4] example could be confirmed with a cmp eax,0Fh. Not that I think that [1,2,3,4] or [4,3,2,1], etc is an unreasonable sequence.

-Clive

Let me see if I've got your point. If I can use EAX and EBX I have 64 bits available, how do I
test the sequence [1, 2, 3, 4] against for example [23, 11, 7, 3] ?
How do I count the results?
Do I start with all the bits "0" and after I switch the bits [23, 11, 7, 3] to "1" before the cmp?
Or what? Please I'm a beginner, I don't actually know very much about ASM.
Bear some patience with me, try not to skip any step. It could be obvious for you but I easily get lost
at the moment. ::)

clive · April 26, 2010, 03:51:45 AM

Here are some thoughts on how to do this in assembler. Hopefully the fragments will spark some ideas/questions.

I'm going to pack your numbers into one 32-bit word, as 1..50 easily fits in 8-bits. You could read it out of DWORDs in memory if your wanted instead of rotating the packed version.

Code Select


    xor  eax,eax ; clear bit vector
    xor  ebx,ebx

    mov  ecx,004030201h ; [1, 2, 3, 4] packed in a DWORD

    ; rotate through packed format converting to bit vector ebx:eax

    call setbit ; bit convert entry in CL
    rol  ecx,8
    call setbit
    rol  ecx,8
    call setbit
    rol  ecx,8
    call setbit
    rol  ecx,8

Code Select


setbit PROC

    push  ecx ; save the original
    movzx ecx,cl ; convert low order byte into dword

    dec ecx     ; convert 1..50 to 0..49 range
    test ecx,20h ; test bit 5 (ie >= 32)
    jnz highdword ; bit 5 is high

        ; 1-32 (0..31)

    bts eax,ecx ; set bit ecx (0..31), in eax
    jmp done

highdword:  ; 33-64 (32..63)

    and ecx,1Fh ; mask lower 5 bits, eqv -32
    bts ebx,ecx ; set bit ecx (0..31), in ebx

done:

    pop  ecx
    ret

bitset ENDP

For [1, 2, 3, 4]

Code Select

eax=00000000 ebx=00000000 Cleared

cl=01
eax=00000001 ebx=00000000

cl=02
eax=00000003 ebx=00000000

cl=03
eax=00000007 ebx=00000000

cl=04
eax=0000000F ebx=00000000

For [23, 11, 7, 3]

Code Select

eax=00000000 ebx=00000000 Cleared

cl=17
eax=00400000 ebx=00000000

cl=0B
eax=00400400 ebx=00000000

cl=07
eax=00400440 ebx=00000000

cl=03
eax=00400444 ebx=00000000

clive · April 26, 2010, 04:01:43 AM

Ok, so if you think [1, 2, 3, 4] is equivalent to [4, 3, 2, 1], [3, 2, 1, 4], etc

You convert all your number sets to bit vectors, then sort the bit vectors, and finally you could enumerate through the list of sorted bit vectors counting the ones which are the same.

The point of sorting would be to minimize the amount of searching required to find matching bit vector patterns. Sorting would naturally group them together making the counting phase simpler.

frktons · April 26, 2010, 07:58:00 AM

Quote from: clive on April 26, 2010, 04:01:43 AM
Ok, so if you think [1, 2, 3, 4] is equivalent to [4, 3, 2, 1], [3, 2, 1, 4], etc

You convert all your number sets to bit vectors, then sort the bit vectors, and finally you could enumerate through the list of sorted bit vectors counting the ones which are the same.

The point of sorting would be to minimize the amount of searching required to find matching bit vector patterns. Sorting would naturally group them together making the counting phase simpler.

I can imagine this can be done, but what about speed? It looks like it'll take longer doing all these bit operations than
doing the simple MOVs and ADDs I had in mind.
All depends of course on how fast these bit operations are.
As soon as I'll have the PB code and test it, we can see the difference.

Thanks anyway for your help

News:

ASM for FUN - #6 SUB [Searching inside random numbers]