News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Find zero/empty slot quickly ...

Started by James Ladd, April 05, 2009, 06:34:51 AM

Previous topic - Next topic

redskull


Intel(R) Core(TM)2 Duo CPU     E4500  @ 2.20GHz (SSE4)

Code sizes:
findslotSSE2   66
findslotMW     48
findslotScasd  26

Timings:
findslotSSE2   19536   kilocycles, result: slot 10000000
findslotMW     22199   kilocycles, result: slot 10000000
findslotScasd  44976   kilocycles, result: slot 10000000

findslotSSE2   1968    kilocycles, result: slot 1000000
findslotMW     2220    kilocycles, result: slot 1000000
findslotScasd  4247    kilocycles, result: slot 1000000

findslotSSE2   59      kilocycles, result: slot 100000
findslotMW     116     kilocycles, result: slot 100000
findslotScasd  391     kilocycles, result: slot 100000
Strange women, lying in ponds, distributing swords, is no basis for a system of government

James Ladd

I think it is awesome that someone can post a simple question and get such great responses !
Thanks all.

jj2007

#17
Quote from: redskull on April 09, 2009, 10:53:12 PM
REP SCASD, Baby!  It's probably not the fastest, but you don't get smaller than a single instruction (some of us still irresponsibly optimize for size)

Hard to believe, but you can get faster and smaller:

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

Code sizes:
findslotSSE2   65
findslotMW     48
findslotJJ     18
findslotScasd  26

Timings:

findslotSSE2   86      cycles/100 slots, slot found at byte 500000
findslotMW     157     cycles/100 slots, slot found at byte 500000
findslotJJ     181     cycles/100 slots, slot found at byte 500000
findslotScasd  398     cycles/100 slots, slot found at byte 500000


James, perhaps it's time to reveal how big your array is... the timings show a strong dependency on whether the array fits into the cache ;-)

EDIT: A bugfix in my non-SSE algo, and 3 cycles less for the SSE2 version.

[attachment deleted by admin]

PBrennick

The easiest way to find an empty slot in an array is via a mathematical progression. You continually add the current slot to the next slot and then compare the result to the first slot. If they are equal, then the next slot is empty. If not equal you update the array pointer and add again. Do this until the entire array is tested. You can also build an array of pointers to empty slots and when you use one of them you remove it from the second array. This will work for any type of array, ASCII or otherwise.

Paul
The GeneSys Project is available from:
The Repository or My crappy website