Optimization subject.

Started by Mincho Georgiev, July 10, 2010, 11:32:07 AM

Mincho Georgiev

Hello, Friends!
Most of you probably don't remember me, since it's been a while, but here I am again, sending best regards to all my old friends here!
First, my apologies that this is not really an assembler question, but I am sure that plenty of people here have experience with this.
It's about optimizing C code, and more specifically about x86 addressing modes.
Some background to start:

We have a 2D array, let's say like this:

unsigned char array[128][128];


and an auxiliary array of pointers to the rows of 'array', like this:

unsigned char *parray[128]; // initialized so that parray[i] = &array[i][0] for i = 0..127


Now... In C, these two local pointer expressions are supposed to be exactly equivalent:

unsigned char *a = &array[x][0];
and
unsigned char*a = parray[x];
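
To make the setup concrete, here is a minimal sketch in C. The helper names init_parray, row_direct and row_table are made up for illustration only; the point is just that, once the table is filled, both expressions yield the same row pointer:

```c
#include <stddef.h>

#define ROWS 128
#define COLS 128

unsigned char array[ROWS][COLS];
unsigned char *parray[ROWS];

/* Fill the row-pointer table so that parray[x] == &array[x][0]. */
void init_parray(void)
{
    for (size_t x = 0; x < ROWS; ++x)
        parray[x] = &array[x][0];
}

/* Row pointer computed from the 2D array itself. */
unsigned char *row_direct(size_t x)
{
    return &array[x][0];
}

/* Row pointer loaded from the auxiliary table. */
unsigned char *row_table(size_t x)
{
    return parray[x];
}
```

After init_parray() has run, row_direct(x) and row_table(x) compare equal for every valid x, so a[y] refers to the same byte either way.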


Now for the reference to 'a[]'. Most compilers with the /O2 optimization flag on will translate

"a[y]" - a reference to column 'y' of row 'x', where y is a dynamic (code flow) variable
to
BYTE PTR [edx+eax]
(the registers in use are not important; it could be edi instead, and so on), where edx contains the address of row 'x' and eax holds the value of 'y', i.e. the dynamic index value.
What puzzles me is the output of the Intel x86 compiler, which translates the reference differently depending on how the 'a' pointer is initialized:

when unsigned char*a = parray[x];
then the reference a[y] is translated to:
BYTE PTR [edx+eax]

when unsigned char *a = &array[x][0];
then the reference a[y] is translated to:
[_array + X + eax], where _array is the start address of the array and X is the displacement.


In other words, in the first case the compiler uses Base + Index (indexed mode), and in the second case Base + Index + Displacement (indexed + displacement).
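
The address arithmetic behind the two forms can be sketched in C roughly like this. It is only an illustration: the function names addr_base_index and addr_disp_index are hypothetical, the register names in the comments are borrowed from the listings above, and I am assuming X stands for the row offset x * 128. The actual code generated will of course depend on the compiler and its settings.

```c
#include <stddef.h>
#include <stdint.h>

#define ROWS 128
#define COLS 128

unsigned char array[ROWS][COLS];
unsigned char *parray[ROWS];

/* Fill the row-pointer table so that parray[x] == &array[x][0]. */
void init_rows(void)
{
    for (size_t x = 0; x < ROWS; ++x)
        parray[x] = &array[x][0];
}

/* Case 1: the row address is loaded from the table, so it occupies a
   register, and the element address is base + index. */
uintptr_t addr_base_index(size_t x, size_t y)
{
    uintptr_t base = (uintptr_t)parray[x];     /* e.g. edx */
    return base + y;                           /* [edx+eax] */
}

/* Case 2: the start of the array is a fixed address that can be encoded
   in the instruction; the row offset and y are added to it. */
uintptr_t addr_disp_index(size_t x, size_t y)
{
    uintptr_t start = (uintptr_t)&array[0][0]; /* _array */
    return start + x * COLS + y;               /* [_array + X + eax] */
}
```

Both functions produce the same address for the same (x, y) once init_rows() has been called; the only difference is where the row address comes from.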
This happens only with the Intel x86 compiler and with none of the others I've used. So my question is whether I should prefer the first mode, which keeps one register busy holding the address (edx in this sample), or the second one, which adds extra bytes to the encoding of the reference.
Any suggestions are welcome!
Thanks in advance!

p.s. I'm glad to be here again.  :bg