Advanced Array Handling

Draakie · December 05, 2007, 05:49:20 AM

From searching the Forum - I got this from Donkey about standard array lookup's:

The format to use this type of addressing is :
[BASE+INDEX*SCALE]
Where:
BASE is the address of the start of the array
INDEX is the entry number you are interested in
SCALE is the size of each entry -> 1,2,4 or 8 for BYTEs, WORDs, DWORDs or QWORDs respectively

Understandable and Yes, This works quite well for single dimentional arrays.
As soon as you start going multi-dimentional (3 or more), things get a bit hairy......
Also the "Struct"'s employed in array's are typically larger than 16-bytes... do you really have to "shift", "add"
"sub" etc. to get hold of the correct INDEX values. ex. 24-byte -->eax=ebx / shl eax, 4 / shl ebx, 3 / add ebx, eax / use ebx as index
every time you want to index the correct element. In my current project I need to sometimes get hold of specific elements per
tier of the array at the same time. This can be quite a bother - besides running out of registers - it's becomes complicated and a real
brain teaser.

Any "neat" tricks OR advise for navigating these ?!

donkey · December 05, 2007, 06:06:47 AM

Hi,

The problem is that the SCALE values are fixed by the processor, they are not modifiable - for example a SCALE of 3 is not allowed. However, The INDEX can be separately calculated as follows...

BASE+(INDEX_X*INDEX_Y)*SCALE

The index would require a second instruction mul or some such. So assuming 32 bit and you are using EAX to hold your index...

Code Select

mov eax, [index_x]
mul D[index_y]
mov eax, [arraybase+eax*8]

Donkey

gabor · December 05, 2007, 08:12:46 AM

Hi Draakie!

Nice to hear from you! I believe the array handling problem leads to a more general point: data storage. And in this domain there are 2 big rivals, hash and trees. One can be optimized for complex operations (trees) and the other for speed (hash). I'm sure you too know a lot about these 2 topics.
An n dimensional array could be stored as n piece 1 dim array or all n arrays interlaced. Interlacing has the advantage that data that belongs to the same index is at near addresses. The seperate arrays can be managed easier.
Finally, there is the common problem with arrays: they are static structures with predefined size. This can be avoided basicly in 2 ways: The array allocates n elements at a time and when it gets full a new array is created that is n elements bigger. Here, copying the elements from old array can be time consuming. That is why I choosed the 2nd way, the dynamlically linked subarrays. Here fix sized subarrays are linked in a linked list. No need to copy at all and with properly choosen subarray size the hops along the links can be kept on a minimum.
If you are interested have a look on it here:
http://www.masm32.com/board/index.php?topic=6248.0
I guess there are better solutions (memory handling is not optimal in my implementation).

Greets, GÃ¡bor

Draakie · December 05, 2007, 08:28:58 AM

Hi Gabor, nice to see ya tooÂ :P

Yes, I'am also using dynamic linked lists - my problem stems from the fact of transversal to
elements above 3rd tier level (multi dimentional) seems overly intensive. I'am having a look
at u're code and can already see that I'am about to steal some of u're code ideas :toothy.
The problem for me at present is'nt knowing how to access elements - but really how to
efficiently do so (least amount of cycles or overhead) seeing as the STRUCTure per array tier
is variable and not within the pre-determined SCALE sizes. I'am looking for a good strategy specific
to the assemler genre...

Thanx
Draakie

Tedd · December 05, 2007, 01:18:04 PM

Let's say you have an array of 100x12, dword elements..

To access a particular element you need to find the start of its row, and then you can treat that row as a simple single-dimension array.
So, to access array[3][5] (row 5, col 3) you do:

start_of_row_5 = addr(array) + (size_of_full_row) * 5 = addr(array) + (100 elements * element_size) * 5 = addr(array) + 400 * 5
And then to get the actual element you're interested in = start_of_row_5 + (3 * element_size) = start_of_row_5 + 3 * 4

So, fully: array[px][py] = addr(array) + (py*WIDTH*sizeof(dword)) + (px*sizeof(dword))

Now all you have to do is code that! Which is easy once you know the array you're using because most things are constant.
(addr(array) + px*4) + (py*400) :: {(base + index*scale) + (index2*scale)}
..3 instructions (lea, mul, add) :bg (..and then a mov to get/put)

Now, for a 3 dimensional array.. :dazzled:
Actually, it's the same idea extended (surprisingly :P) Find the start of the 2D array you want [z], then the 1D row within that [y], and finally the element

I'll leave the fun part for you :bdg

Work out the equation you need on paper first, then try to simplify that, arrange the parts to make it easier to code, then finally code it. Scaling is only good for element size, not struct size - so you're stuck with a mul (or shift & add) by the struct size.

Draakie · December 06, 2007, 05:16:10 AM

Thanx Ted,

That is effectively what I was looking for - You never dissapointÂ :clap:
I'll post an implementation soon-ish. The use of which might prove usefull to many
ppl.......think 3D graphics, recursive function programming etc.

Draakie

gabor · December 06, 2007, 08:34:57 AM

Hi!

Ted's solution does it all, again :bg
Though, there is 1 problem that maybe I mentioned in my post and that was the reason for me to move to dynamic methods: to know the width, length of a row is not always easy, and defining an satisfÃingly large length can be space wasting too.
If speed is essential then after precalculating the size this is no problem any more.
If space usage is essential (think of a graph with millions of nodes) the linked list of subarrays seems to be optimal for elements with same sizes, and linked list is better for different sized elements.

I'm keen on seeing your solution Draakie!

Greets, GÃ¡bor

ecube · December 06, 2007, 08:49:48 AM

all thats not necessary really, create a big chunk of memory, create a struct, copy the struct with the data you want into the memory then just loop through is with add ecx,SIZEOF thestruct. And to read the memory assume a register is pointing to the struct(e.g [ecx].filehandle) and use is like you would a struct directly.. so can have dwords, bytes whatever you want with ease.

Rockoon · December 07, 2007, 06:01:34 AM

Random access of 2D+ arrays simply isnt a trivial problem that you can wish away, and the core evils are that of instruction latency and cache misses.

Regardless of the methodology employed, the calculation of the pointer to the 2D+ item you want is a dependency chain so even if the array fits in the L1 cache, avoiding the cache miss issue, the lackluster performance will leave you desparately wanting for a 1D 8-byte-or-less solution. Even under the confines of the "best" 2 instruction 2D solutions the depedency chain calculating the final pointer will have 6+ cycle latency. On CPU's that commonly retire 2+ instructions per clock cycle even with unoptimized code (AMD64, Core2), that represents a very significant delay.

On the "brighter" side, if you expect a lot of cache misses because the array is vastly larger than the L1 cache, then the dependency chain is the least of your worries. The horrible performance of code that causes lots of cache misses will leave you desparately wanting an algorithm that doesnt require random access.

If the access isnt random, then none of this means anything.. its just 1D memory after all and working to and fro is just constant displacements and typically good cache performance.

RuiLoureiro · December 07, 2007, 06:36:03 PM

Hi

I dont know nothing about arrays. I never used it (in assembly). But

Here is a very simple example to get the address of an array element

array [X,Y,Z]Â Â DimX=4, DimY=2, DimZ=3.Â Length of each element (Len) = 1

Follow the picture. The elements are divided in 3 groups (Z=1, Z=2 and Z=3).
In each group they are divided in Y=1 and Y=2.

;........................................................................... Group Z=1
point A |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Y= 1
Â Â Â Â Â Â 0Â Â Â Â Â Â 1Â Â Â Â 2Â Â Â Â Â Â 3Â Â Â Â Â Â Â Â Â Â Â Â Â <---- addresses
Â Â Â Â .............................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Â Â Y= 2
Â Â Â Â 4Â Â Â Â Â 5Â Â Â Â Â 6Â Â Â Â Â 7Â Â Â Â Â Â Â Â Â Â Â Â Â <---- addresses

;........................................................................... Group Z=2
point B |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Y= 1
Â Â Â Â Â Â 8Â Â Â Â Â Â 9Â Â Â Â Â 10Â Â Â Â Â 11Â Â Â Â Â Â Â Â Â Â Â Â Â <---- addresses
Â Â Â Â ................................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |**x=3**|Â x=4Â |Â Â Y= 2
Â Â Â Â 12Â Â Â Â 13Â Â Â Â Â 14******15Â Â Â Â Â Â Â Â Â Â Â Â Â <---- addresses

;........................................................................... Group Z=3
point C |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Y= 1
Â Â Â Â Â 16Â Â Â Â 17Â Â Â Â 18Â Â Â Â Â 19Â
Â Â Â Â ..............................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Â Â Y= 2
Â Â Â Â 20Â Â Â Â 21Â Â Â Â Â 22Â Â Â Â 23Â

We want the adress of the element [3,2,2] ( X= 3, Y= 2, Z= 2 ) which is 14 (see picture)

Â Â Â Â Â Â Â Â Â 2Â Â Â Â Â 4Â Â Â Â Â 1Â Â Â 1Â Â
Point B:Â Â DimY . DimX . Len . (Z-1)Â =Â 8
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 4Â Â Â Â Â 1Â Â Â Â 1
PlusÂ Â :Â Â Â Â Â Â Â Â DimX . Len . (Y-1)Â Â =Â 4
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 1Â Â Â Â 2
PlusÂ Â :Â Â Â Â Â Â Â Â Â Â Â Â Len . (X-1) Â =Â 2
Â Â Â Â Â ----------------------------------------------
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = 14

;******************** another example ****************
Length of each element (Len) = 4

;........................................................................... Group Z=1
point A |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Y= 1
Â Â Â Â Â Â 0Â Â Â Â Â Â 4Â Â Â Â Â Â 8Â Â Â Â Â 12Â Â Â Â Â Â Â Â Â Â Â Â Â Â <---- addresses
Â Â Â Â .............................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Â Â Y= 2
Â Â Â Â 16Â Â Â Â 20Â Â Â Â 24Â Â Â Â 28Â

;........................................................................... Group Z=2
point B |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Y= 1
Â Â Â Â Â 32Â Â Â Â 36Â Â Â Â 40Â Â Â Â Â 44Â
Â Â Â Â .............................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Â Â Y= 2
Â Â Â Â 48Â Â Â Â 52Â Â Â Â 56Â Â Â Â 60Â

;........................................................................... Group Z=3
point C |Â x=1Â Â |Â x=2Â |Â x=3Â |**x=4**|Â Â Y= 1
Â Â Â Â Â 64Â Â Â Â Â Â 68Â Â Â Â 72Â Â Â Â 76*******Â Â Â Â Â Â Â Â Â Â Â <---- addresses
Â Â Â Â .............................................
Â Â Â Â |Â x=1Â Â |Â x=2Â |Â x=3Â |Â x=4Â |Â Â Â Â Â Â Y= 2
Â Â Â Â 80Â Â Â Â 84Â Â Â Â 88Â Â Â Â 92Â

We want the adress of the element [4,1,3] ( X= 4, Y= 1, Z= 3 ) which is 76 (see picture)

Â Â Â Â Â Â Â Â Â Â 2Â Â Â Â 4Â Â Â Â Â 4Â Â Â 2Â Â
Point C:Â Â DimY . DimX . Len . (Z-1)Â = 64
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 4Â Â Â Â 4Â Â Â 0
PlusÂ Â :Â Â Â Â Â Â Â Â DimX . Len . (Y-1)Â Â =Â 0
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 4Â Â Â Â 3
PlusÂ Â :Â Â Â Â Â Â Â Â Â Â Â Â Len . (X-1)Â Â = 12
Â Â Â Â Â ----------------------------------------------
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = 76

If the point A has the address Â«offset VarÂ» then the adress is Â«offset Var + 76Â».

Generally we have:
Â Â Â Â Â Â Â Â Â Â Â Â Â Â address = Base + [DimY . DimX . (Z-1) + DimX . (Y-1) + (X-1)] . Len
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = Base + { [DimY . (Z-1) + (Y-1)]. DimX + (X-1) }Â Â Â Â Â . Len

( X from 1 to DimX, Y: 1 to DimY, Z: 1 to DimZ )

For a 2 dimensional arrays:

Â Â Â Â Â Â Â Â Â Â Â Â Â address = Base + [ DimX . (Y-1) + (X-1)] . Len

Â Â ( X from 1 to DimX, Y: 1 to DimY )
;..............................................................
For indexes X, Y from 0 to DimX-1, DimY-1:

Â Â Â Â Â address = Base + ( DimX . Y + X ) . LenÂ Â Â Â Â Â Â Â Â Â ; [ put X instead of X-1 ... Z instead Z-1]

Â Â Â Â Â ( X from 0 to DimX-1, Y: 0 to DimY-1 )

Have a good night/day
Rui

Mark Jones · December 10, 2007, 04:36:06 PM

Hey Rui, to help with your illustration, try putting that text in [ tt ] and [ / tt ] code tags. That makes the text a mono-spaced font like Courier, and the spacing will show up properly. This text is placed between those tags, and it is mono-spaced. :U

RuiLoureiro · December 10, 2007, 05:39:28 PM

Hi Mark,
OK ! Thanks

Rui

News:

Advanced Array Handling

ecube