Arrays and structs

Started by RedXVII, December 06, 2005, 09:22:15 PM

RedXVII

I've already searched the forum for this and didn't find anything to answer my 2 questions:

1) Can arrays be used and manipulated with square brackets (matrix[40]), or is there a different syntax/different way to do it with MASM?

2) Can you have arrays of structs? E.g. in C:

   CHAR_INFO buffer[400] = {0};

Thanks a lot  :U

AeroASM

If your array element size is 1, 2, 4 or 8 bytes, then you can use this method:


;C code: __int32 my_array[100];
my_array dword 100 dup(?)

;C code: my_array[40]=0x12345;
mov dword ptr [my_array + 40 * 4], 12345h
;multiply 40 by four because the array element size is four bytes.
;with the index in a register, the CPU can apply the scale itself:
;  mov dword ptr [my_array + ecx * 4], 12345h


Otherwise, you must manually multiply the index by the element size to get the offset.

zooba

Arrays in assembly only exist in byte terms. You define a contiguous block of memory and then refer to offsets within that block.

dwArray DWORD 1, 2, 3, 4, 5, 6, 7, 8
...
mov eax, dwArray[0]             ; moves 1 into eax
mov eax, dwArray[4]             ; moves 2 into eax


Arrays in C automatically determine the size of each item, but in assembly you have to specify the exact byte. So if you tried the following:
mov eax, dwArray[1]
you would find eax contains garbage (well, in this case it'd contain 02000000h, built from the upper three bytes of the first DWORD and the low byte of the second).

The solution is to either know that you're talking about bytes, or alternatively:
mov eax, dwArray[1*4]
OR
mov eax, dwArray[1*(TYPE dwArray)]      ; (TYPE dwArray) = 4, the size of one element
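
For readers coming from C, the byte-offset rule above can be checked with a small sketch. It assumes the x86 little-endian layout and spells the array out as raw bytes; the names `dw_array_bytes` and `read_dword_at` are made up for this illustration and are not part of MASM or any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The same eight DWORDs as dwArray, written out as the bytes an x86
   (little-endian) machine stores: low byte first. */
const uint8_t dw_array_bytes[32] = {
    1, 0, 0, 0,  2, 0, 0, 0,  3, 0, 0, 0,  4, 0, 0, 0,
    5, 0, 0, 0,  6, 0, 0, 0,  7, 0, 0, 0,  8, 0, 0, 0
};

/* Combine the four bytes starting at byte_offset the way
   "mov eax, dwArray[byte_offset]" would on x86. */
uint32_t read_dword_at(const uint8_t *base, size_t len, size_t byte_offset)
{
    assert(byte_offset + 4 <= len);   /* stay inside the buffer */
    return (uint32_t)base[byte_offset]
         | (uint32_t)base[byte_offset + 1] << 8
         | (uint32_t)base[byte_offset + 2] << 16
         | (uint32_t)base[byte_offset + 3] << 24;
}
```

With this model, offset 0 yields 1, offset 4 yields 2, and offset 1 yields 02000000h, because that read straddles the first two elements.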


As AeroASM said, these evaluate directly to an offset from the label (dwArray). If the index is in a register and your element size is not 1, 2, 4 or 8 bytes (the only scale factors the CPU can apply for you), then you will need a 'mul' (or 'imul') instruction to find the offset:

   ;  eax contains the 0-based index of the array of 14-byte items
xor  edx, edx
mov  ecx, 14
mul  ecx
mov  edx, myArray[eax]    ; will return the first DWORD
mov  edx, myArray[eax+4] ; will return the second DWORD

Note that this assumes the mul didn't overflow or carry into edx.
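
The multiply step is the same arithmetic a C compiler does for you when it indexes an array of odd-sized elements. A minimal sketch, assuming a hypothetical 14-byte element held in a raw byte buffer (`ITEM_SIZE`, `item_offset` and `item_ptr` are illustrative names only):

```c
#include <stddef.h>
#include <stdint.h>

enum { ITEM_SIZE = 14 };  /* hypothetical 14-byte element, as in the post */

/* Byte offset of element `index`: the job done by "mul ecx". */
size_t item_offset(size_t index)
{
    return index * ITEM_SIZE;
}

/* Pointer to the first byte of element `index` inside a raw buffer. */
const uint8_t *item_ptr(const uint8_t *base, size_t index)
{
    return base + item_offset(index);
}
```

Element 3 therefore starts at byte 42, and its second DWORD would be read from byte 46, matching the `myArray[eax+4]` form.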

RedXVII

Thanks a lot - I should be able to do this now, I think...  :U

It's gonna be a bit of a pain doing ->  CHAR_INFO buffer[400] = {0};


I need to pass it into the MS function "WriteConsoleOutput",
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/writeconsoleoutput.asp

or I was... I want to print stuff on the console and then update only certain parts of it; I've done this in C. I wanted to do this in assembly as part of the whole learning experience  :toothy

Is there an even better/quicker way I'm overlooking?

Hypervista

zooba, excellent post and informative response. One small correction though:

Quote
dwArray DWORD 1, 2, 3, 4, 5, 6, 7, 8
...
mov eax, dwArray[0]             ; moves 1 into eax
mov eax, dwArray[4]             ; moves 2 into eax

in the example you gave, wouldn't the following line be correct?

mov eax, dwArray[4]             ; moves 5 into eax


dsouza123

mov eax, dwArray[4]  does move 2 into eax. Remember it is an array of dwords, 4 bytes each.

It can be decoded as, copy the dword starting at byte 4 of dwArray to eax.

0 1 2 3  4 5 6 7  8 9 A B  <= the bytes
1 0 0 0, 2 0 0 0, 3 0 0 0  <= the dwords, each 4 bytes, low to high byte

RedXVII

So I take it MASM recognises the form:

Quote
mov eax, dwArray[number]

where MASM translates it as the pointer to dwArray + number of bytes.

In that case, for an array of CHAR_INFO, I will need to know the size of the struct. I get this struct to be 5 bytes long, is that right?

typedef struct _CHAR_INFO {
union {                                                 //what is "union"?
WCHAR UnicodeChar;
CHAR AsciiChar;
} Char;
WORD Attributes;
} CHAR_INFO, *PCHAR_INFO;

zooba

Correct.

In this case, you're probably better to use the SIZEOF operator:

mov eax, charArray[(SIZEOF CHAR_INFO) * 2]     ; retrieve the first DWORD of the third CHAR_INFO (index 2, counting from 0)

However, keep in mind that if (SIZEOF CHAR_INFO) is not 1, 2, 4 or 8, you'll need to do this when the index is a loop counter (if you're using an immediate instead of a loop counter, the above works fine):

mov eax, dwLoopCounter
xor edx, edx
mov ecx, SIZEOF CHAR_INFO
mul ecx
; now EAX contains the byte offset to item 'dwLoopCounter' of the array


I believe your assessment of 5 bytes is correct, though personally I can't stand C's habit of making a new type for every possible variable. If you use this same structure in MASM (I think Windows.inc contains the required TYPEDEFs) I'd confirm it or just use SIZEOF.

BTW, 'UNION' lets you refer to different parts of the same variable. The largest member of 'Char' is a WCHAR, so that is the size of 'Char'. If you refer to 'Char' as a CHAR, it will work and (I think) only retrieve the least-significant byte of 'Char'.

tenkey

The CHAR_INFO structure, as posted, is only 4 bytes.

A union defines "overlays". Each member of the union starts at the same offset.

The union of a WCHAR (2 bytes) and a CHAR (1 byte) is 2 bytes. The union itself is named "Char". It is followed immediately by a WORD (2 bytes) which is named "Attributes".
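
The 4-byte figure can be checked from C. This is only a sketch: it uses a stand-in struct built from fixed-width types (an assumption made here so it compiles without `<windows.h>`), relying on the fact that WCHAR and WORD are 16-bit and CHAR is 8-bit on Windows. `CharInfo` and the three helper functions are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Windows CHAR_INFO structure. */
typedef struct {
    union {
        uint16_t UnicodeChar;  /* 2 bytes */
        char     AsciiChar;    /* 1 byte, starts at the same offset */
    } Char;                    /* union size = largest member = 2 */
    uint16_t Attributes;       /* 2 bytes, immediately after the union */
} CharInfo;

size_t char_info_size(void)    { return sizeof(CharInfo); }
size_t union_size(void)        { return sizeof(((CharInfo *)0)->Char); }
size_t attributes_offset(void) { return offsetof(CharInfo, Attributes); }
```

On mainstream compilers these report 4, 2 and 2 respectively, which is the layout described above.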

RedXVII

Woohoo! Understood. Long post - thank you sooo much if you take the time to read it,  :U

However - your previous post has conflicting statements about "union"; I also disagree with my previous statement after looking at some C code in Olly  :toothy
I think it's 4 bytes long -> DWORD, which makes sense.

How I understand it, looking at the running program:

CHAR_INFO

|  1st byte  |  2nd byte  |  3rd byte  |  4th byte  |
|-AsciiChar--|
|------UnicodeChar--------|
                          |-------Attributes--------|



With that done, would this be correct?

.data
charArray CHAR_INFO 48000000h, 65000000h, 6C000000h, 6C000000h, 6F000000h       ; "Hello" string


And would I be able to manipulate it something like:
mov ax, charArray.Attributes   ; move attributes of first letter (H) to ax
mov ebx, charArray[4]    ; move DWORD of letter (e) into ebx
mov cx, charArray[8].Attributes    ; move attributes of 3rd letter (l) to cx


Would these work? Or is all that garbage?

I know I probably could do,
mov cx, charArray[10]
for moving the attributes of the 3rd letter to cx, but I was just curious if MASM will accept this?

Thanks  :U

AeroASM

The code looks correct. MASM will accept mov cx, charArray[10] because there is no type checking.
Your data definition is wrong. You should have 48h etc., not 48000000h, because x86 stores the low byte first (the data is stored back to front).
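
To see why, it helps to split a 32-bit initializer the way x86 lays it out in memory. The sketch below models that with plain arithmetic, so it does not depend on the host's byte order; `char_field` and `attributes_field` are hypothetical helper names.

```c
#include <stdint.h>

/* On x86 the low 16 bits of a DWORD occupy the first two bytes in
   memory (the Char union) and the high 16 bits occupy the last two
   bytes (Attributes). */
uint16_t char_field(uint32_t initializer)
{
    return (uint16_t)(initializer & 0xFFFFu);
}

uint16_t attributes_field(uint32_t initializer)
{
    return (uint16_t)(initializer >> 16);
}
```

With 48h, the character field is 48h ('H') and the attributes are 0. With 48000000h, the character field is 0 and the 48h ends up in the high byte of Attributes (4800h).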

Keep in mind how MASM translates the high level syntax into assembler.

mov ax,charArray[8].Attributes
becomes
mov ax,[401000h+8+2]
where the address of charArray is 401000h and the offset of Attributes within CHAR_INFO is 2.
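
The same address arithmetic, written out in C as a sketch. Note that the 8 in the MASM expression is a byte offset, which corresponds to element index 2 of a 4-byte structure. The stand-in `CharInfo` type, the function names and the 401000h base address are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CHAR_INFO using fixed-width types (assumed layout). */
typedef struct {
    union { uint16_t UnicodeChar; char AsciiChar; } Char;
    uint16_t Attributes;
} CharInfo;

/* Byte offset of element_index's Attributes from the array start:
   element_index * SIZEOF CHAR_INFO + offset of Attributes. */
size_t attributes_byte_offset(size_t element_index)
{
    return element_index * sizeof(CharInfo) + offsetof(CharInfo, Attributes);
}

/* Absolute address, given an assumed load address for the array. */
uintptr_t attributes_address(uintptr_t array_base, size_t element_index)
{
    return array_base + attributes_byte_offset(element_index);
}
```

For element 2 the offset is 2 * 4 + 2 = 10, so with a base of 401000h the access lands at 40100Ah, the same location as the hand-written `[10]` form.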

Ratch

AeroASM,

Quote
You should have 48h etc not 48000000h......


Hex numbers should ALWAYS be prefixed by a zero. Otherwise an error results if the leading character is not 0 through 9, because MASM then reads the token as a symbol name. Ratch


00000000                .data
00000000  00000C48      DWORD 0C48H
00000004  00000480      DWORD 0480H
00000008  00000480      DWORD  480H
                        DWORD  C48H
TEST.ASM(23) : error A2006: undefined symbol : C48H

RedXVII

Excellent! Thanks for the replies  :8)

One last thing, I want to make 400 zeroed CHAR_INFOs.

So, how would I say this without going:

.data
charArray CHAR_INFO 0,0,0,0,0,0........  ; 400 zeros - eeek


AeroASM

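charArray CHAR_INFO 400 dup(<>)      ; MASM expects structure initializers in angle brackets; <> uses the field defaults from the STRUCT definition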

pro3carp3

I would put the array declaration in the uninitialized data section (.data?) and initialize it to zeros at program start-up.
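
In C terms this is roughly the difference between a file-scope array with no initializer and an explicit clear at run time. A minimal sketch, again using a stand-in `CharInfo` type (an assumption so it builds without `<windows.h>`); `screen_buffer`, `clear_buffer` and `all_zero` are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for CHAR_INFO using fixed-width types (assumed layout). */
typedef struct {
    union { uint16_t UnicodeChar; char AsciiChar; } Char;
    uint16_t Attributes;
} CharInfo;

enum { CELL_COUNT = 400 };

/* File-scope objects without an initializer are zero-filled before
   the program starts, much like data placed in .data? */
CharInfo screen_buffer[CELL_COUNT];

/* Explicit clear, for when the buffer has to be reset at run time. */
void clear_buffer(CharInfo *cells, size_t count)
{
    memset(cells, 0, count * sizeof *cells);
}

/* Returns 1 if every byte of the buffer is zero, otherwise 0. */
int all_zero(const CharInfo *cells, size_t count)
{
    const uint8_t *bytes = (const uint8_t *)cells;
    for (size_t i = 0; i < count * sizeof *cells; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

The explicit clear is only needed when the buffer is reused; a freshly loaded program already sees the uninitialized section as zeros on Windows.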