When processing arrays in a loop, I often write the access as "array[counter*SIZEOF array]", which usually works well enough (with counter as a register that gets incremented by 1 from 0 to LENGTHOF array). However, when I try the same method with an array of structures, everything falls apart pretty quickly and I get 'invalid scaling' errors. From what I've gathered by searching, the scaling factor can only be 1, 2, 4, or 8, so trying to use SIZEOF with a structure over 8 bytes is no good; can someone confirm or deny this? So if I have an array of elements of some large structure, what is the most elegant way to access it? The way I've got it now uses a loop counter going from 0 to SIZEOF 'array', ADDing SIZEOF 'structure' each time instead of INCing it. Is there a better way I'm not seeing?
Thanks as always
regards, alan
The scaling factors allow you to do the multiplication implicitly, but yes only by 1, 2, 4, or 8. If you need to multiply by another number, then you'll need to do it explicitly.
Multiplying on every pass through the loop is a bit of an overhead, though (multiply is fairly slow, although you can often replace it with a series of shifts and adds), so simply adding on the size of each element is probably the best way to go.
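A minimal sketch of that add-the-element-size approach, stepping a pointer through the array instead of scaling an index. The ITEM structure, its fields, the items array and the SumX procedure are all hypothetical names made up for the illustration:

```asm
.386
.model flat, stdcall
option casemap:none

; Hypothetical 12-byte structure, used only for this illustration
ITEM STRUCT
    id  DWORD ?
    x   DWORD ?
    y   DWORD ?
ITEM ENDS

.data
items ITEM 10 DUP (<>)              ; array of 10 structures (120 bytes)

.code
; Sums the x field of every element; result is returned in eax.
SumX PROC USES esi
    xor eax, eax                    ; running total
    mov esi, OFFSET items           ; esi -> first element
  next_item:
    add eax, (ITEM PTR [esi]).x     ; read a field of the current element
    add esi, SIZEOF ITEM            ; step to the next element, no scaling needed
    cmp esi, OFFSET items + SIZEOF items
    jb  next_item                   ; loop until esi passes the end of the array
    ret
SumX ENDP
END
```

Because the pointer itself advances by SIZEOF ITEM, the element size can be anything and the 1/2/4/8 scale limit never comes into it.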
You can use this macro (from my ASM Runtime package - link in signature) for it. I haven't benchmarked most of the combinations but they should all be faster than a straight out MUL.
Set 'canusereg' to a scratch register or leave it blank to use the stack. This is only used for multiplying by 7 so far.
Cheers,
Zooba :U
mult MACRO dest:REQ, src:REQ, canusereg
LOCAL tmp
IF ((OPATTR dest) AND 10000b) EQ 0
GOTO fullmult
ENDIF
IF ((OPATTR src) AND 100b)
IF src EQ 0
xor dest, dest
ELSEIF src EQ 1
ELSEIF src EQ 2
add dest, dest
ELSEIF src EQ 3
lea dest, [dest*2+dest]
ELSEIF src EQ 4
add dest, dest
add dest, dest
ELSEIF src EQ 5
lea dest, [dest*4+dest]
ELSEIF src EQ 6
lea dest, [dest*2+dest]
add dest, dest
ELSEIF src EQ 7
IFNB <canusereg>
mov canusereg, dest
lea dest, [dest*8]
sub dest, canusereg
ELSE
mov [esp-4], dest
lea dest, [dest*8]
sub dest, [esp-4]
ENDIF
ELSEIF src EQ 8
lea dest, [dest*8]
ELSEIF src EQ 9
lea dest, [dest*8+dest]
ELSEIF src EQ 10
lea dest, [dest*4+dest]
add dest, dest
ELSEIF src EQ 12
lea dest, [dest*2+dest]
shl dest, 2
ELSEIF src EQ 16
lea dest, [dest*8]
add dest, dest
ELSEIF src EQ 20
lea dest, [dest*4+dest]
add dest, dest
add dest, dest
ELSEIF src EQ 24
lea dest, [dest*2+dest]
lea dest, [dest*8]
ELSEIF src EQ 28
add dest, dest
add dest, dest
mult dest, 7, canusereg ; x*28 = (x*4)*7, an address cannot subtract a register
ELSEIF src EQ 32
add dest, dest
add dest, dest
lea dest, [dest*8]
ELSEIF src EQ 36
lea dest, [dest*8+dest]
lea dest, [dest*4]
ELSEIF src EQ 40
add dest, dest
lea dest, [dest*4+dest]
add dest, dest
add dest, dest
ELSE
GOTO fullmult
ENDIF
ELSE
GOTO fullmult
ENDIF
EXITM
:fullmult
IFDIFI <dest>, <ecx>
push ecx
ENDIF
IFDIFI <dest>, <edx>
push edx
ENDIF
IFDIFI <dest>, <eax>
push eax
mov eax, dest
ENDIF
mov ecx, src
mul ecx ; edx:eax = eax*ecx, so edx does not need clearing first
IFDIFI <dest>, <eax>
mov dest, eax
pop eax
ENDIF
IFDIFI <dest>, <edx>
pop edx
ENDIF
IFDIFI <dest>, <ecx>
pop ecx
ENDIF
ENDM
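A short usage sketch, assuming the mult macro above is already defined in the source file and that items is a hypothetical array of 12-byte elements declared elsewhere:

```asm
; 'items' is a hypothetical array of 12-byte elements; 'mult' is the macro above.
    mov  ecx, 5                         ; element index
    mult ecx, 12                        ; expands to: lea ecx,[ecx*2+ecx] / shl ecx,2
    mov  eax, DWORD PTR [items+ecx]     ; first dword of element 5 (byte offset 60)
```

Since the multiplier is a constant, the macro picks the LEA/shift sequence at assembly time and no MUL is emitted.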
Quote from: Tedd on June 04, 2006, 06:21:48 AM
The scaling factors allow you to do the multiplication implicitly, but yes only by 1, 2, 4, or 8. If you need to multiply by another number, then you'll need to do it explicitly.
Multiplying on every pass through the loop is a bit of an overhead, though (multiply is fairly slow, although you can often replace it with a series of shifts and adds), so simply adding on the size of each element is probably the best way to go.
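But LEA is fast. First use an LEA to scale the index, then let the addressing mode of the following MOV scale it again by 2, 4, or 8. For example:
lea eax,[ebx*8] ; eax = ebx*8
mov eax,[eax*8] ; address = ebx*64 (real code would also add the array's base address)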
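The same two-step trick also covers element sizes that are not powers of two. A sketch, assuming a hypothetical array recs of 24-byte records (REC_SIZE, recs and GetFirstDword are invented names for the illustration):

```asm
.386
.model flat, stdcall
option casemap:none

REC_SIZE EQU 24                         ; hypothetical element size in bytes

.data
recs BYTE (8 * REC_SIZE) DUP (0)        ; hypothetical array of eight 24-byte records

.code
; In:  ecx = element index (0..7)
; Out: eax = first dword of that element
GetFirstDword PROC
    lea eax, [ecx*2+ecx]                ; eax = index*3
    mov eax, DWORD PTR [recs+eax*8]     ; address = recs + index*24
    ret
GetFirstDword ENDP
END
```

This only works when the element size factors into something LEA can build (3, 5 or 9 times a power of two, for example); other sizes still need an explicit multiply or a running pointer.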
Red,
If you need to do it fast, create 2 arrays, one array of data and the other an array of pointers to the data. Have a look at this module in the MASM32 library.
create_array proc acnt:DWORD,asize:DWORD
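The two-array idea can be sketched roughly as below. This is not the MASM32 library's create_array code, just an illustration; ITEM_SIZE, ITEM_COUNT, items, ptrs, InitPtrs and GetItem are all hypothetical names:

```asm
.386
.model flat, stdcall
option casemap:none

ITEM_SIZE  EQU 40                       ; hypothetical element size in bytes
ITEM_COUNT EQU 4                        ; hypothetical number of elements

.data
items BYTE (ITEM_COUNT * ITEM_SIZE) DUP (0)   ; the data array
ptrs  DWORD ITEM_COUNT DUP (0)                ; one pointer per element

.code
; Fills ptrs once, so that later lookups need no multiply.
InitPtrs PROC USES edi
    mov eax, OFFSET items               ; address of element 0
    mov edi, OFFSET ptrs                ; next slot in the pointer table
    mov ecx, ITEM_COUNT
  fill_next:
    mov [edi], eax                      ; store the element's address
    add eax, ITEM_SIZE                  ; address of the next element
    add edi, 4                          ; next pointer slot
    sub ecx, 1
    jnz fill_next
    ret
InitPtrs ENDP

; In:  ecx = element index
; Out: eax = address of that element
GetItem PROC
    mov eax, [ptrs+ecx*4]               ; pointers are 4 bytes, so scale 4 is legal
    ret
GetItem ENDP
END
```

The pointer table costs 4 extra bytes per element and one extra memory read per lookup, in exchange for getting back the plain index*4 addressing that works for simple arrays.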
Thanks to everybody, those are all great ideas that I probably would have never thought of. You guys truly are mavens :U
I just add (SIZEOF array) to ecx. I don't see anything wrong with it.
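It's better to use ADD instead of INC anyway: INC leaves the carry flag alone, and that partial flags update can cost extra cycles on some processors.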
Quote from: savage on June 04, 2006, 06:43:24 PM
It's better to use ADD instead of INC anyway.
On Intel architectures anyways. :P