Arrays

Started by boogara, June 20, 2007, 12:07:06 AM

boogara

Probably one of the last questions I'll be posting for a few days :D

But, this has really been bothering me...I've been reading for a few hours now (and trial-and-error'ing for that matter), trying to understand arrays...

My last venture was going to this thread: http://www.masm32.com/board/index.php?topic=5099.0 and, well, the code has already been put into the MASM32 library :)

For the most part, the code makes sense...(taking code from the example in the link above.)


    sas hCL, "  one two three 'four four four' five 'six six six' seven eight nine ten  "

    mov acnt, rv(txt2arr,hCL,ADDR hArray)


This is just assigning the "one two three [etc...]" text to hCL.  Then, you just move all the data that the rv macro creates (how many keys to cycle through) into the "acnt" variable.


    mov esi, hArray         ; <-- bolded in the original post
    mov edi, acnt
  @@:
    print [esi],13,10       ; <-- bolded in the original post
    add esi, 4
    sub edi, 1
    jnz @B


Here's where I get confused I guess.  The data in hArray gets stored in the ESI register now, and the string length of hCL is stored in EDI now, right?  Then, while ESI and EDI are not zero, it repeats a loop of the array (each array section is stored in ESI...??).

It's the bolded code above that confuses me.  It seems like it kind of divides ESI into bits, and stores every array-key into a different chunk of ESI.  I'm just not sure if I'm understanding this correctly...

Side Questions
The "add esi, 4" and "sub edi, 1" are basically just counters, correct?  By this I mean: one moves the starting offset in ESI 4 more spots along, and the other decreases the number of loops left to perform by one, right?

Sorry for being a bother here ^_^;  Just...trying to better understand this.  Everything else about assembly seems to make at least some sense...but the arrays.

(Wouldn't mind there being more documentation on this topic :D...well, if I can better understand arrays, I'll write one for ya'll!)

Edit

Instead of creating a new topic...and since this vaguely is related to the above issue...

Has anyone tried to use "wordreplace"?

For me, it won't work...


; ---------------  omitted

period db ".", 0
space db " ", 0

; -------------- omitted

LOCAL ne_ip:DWORD

invoke wordreplace,addr IP,addr new_ip,addr period, addr space


I modified the table inside of "wordrep.asm" to make it so it will find both the period and spaces, and I re-make'd the library.

I think it's the "period" and "space" variables, but...I'm not sure.

hutch--



The example you have picked is not exactly learner material, being an in-place tokeniser, but see if this makes sense. It allocates memory for an array based on the length of the source string. It then scans the source string to find the start of each word, stores the address of each word in the array, then terminates each word with a zero. You use the array once the algo has returned, and the return value from the algo is the item count of the array.

The algo modifies the string passed to it so if you need to keep the original data you copy it to another block of memory first.
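To make the steps above concrete, here is a minimal hand-written sketch of the same idea. It is not the library's code: the labels, the sample text and the fixed 8-slot array `words` are assumptions made for illustration (the library algo allocates its array instead), and only a plain space is treated as a separator.

```asm
    include \masm32\include\masm32rt.inc

    .data
      txt   db "one two three",0    ; sample text, modified in place
      words dd 8 dup (0)            ; hypothetical fixed array, room for 8 word addresses

    .code

start:
    mov esi, OFFSET txt             ; ESI = current character in the text
    mov edi, OFFSET words           ; EDI = next free slot in the array
    xor ebx, ebx                    ; EBX = number of words found

  skip_spaces:
    mov al, [esi]
    test al, al
    jz show_words                   ; end of text reached
    cmp al, " "
    jne word_start
    add esi, 1
    jmp skip_spaces

  word_start:
    mov [edi], esi                  ; store the ADDRESS of this word in the array
    add edi, 4                      ; each slot is a DWORD, so step 4 bytes
    add ebx, 1

  in_word:
    add esi, 1
    mov al, [esi]
    test al, al
    jz show_words                   ; text ended inside the last word
    cmp al, " "
    jne in_word
    mov BYTE PTR [esi], 0           ; terminate the word in place
    add esi, 1
    jmp skip_spaces

  show_words:
    mov esi, OFFSET words           ; ESI = address of the first slot
    test ebx, ebx
    jz finish                       ; nothing to print
  @@:
    print [esi],13,10               ; print the string whose address is stored at [ESI]
    add esi, 4                      ; move to the next slot
    sub ebx, 1                      ; one less word to print
    jnz @B

  finish:
    exit

end start
```

ESI never holds the words themselves. It holds the address of one array slot, and each slot holds the address of one zero-terminated word inside the original text, which is why the source string ends up modified.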

If you have the last service pack for masm32 it is documented in the masmlib help file and there are two versions of a tokeniser in the library.
Download site for MASM32: https://masm32.com
New MASM Forum: https://masm32.com/board/index.php

boogara

Quote from: hutch-- on June 20, 2007, 01:15:23 AM


The example you have picked is not exactly learner material, being an in-place tokeniser, but see if this makes sense. It allocates memory for an array based on the length of the source string. It then scans the source string to find the start of each word, stores the address of each word in the array, then terminates each word with a zero. You use the array once the algo has returned, and the return value from the algo is the item count of the array.

The algo modifies the string passed to it so if you need to keep the original data you copy it to another block of memory first.

If you have the last service pack for masm32 it is documented in the masmlib help file and there are two versions of a tokeniser in the library.

Makes sense to an extent...I'm sure it'll make more once I dive more into it and such.

You mentioned a couple of versions of a tokenizer in the library...but it doesn't tokenize anything.

The documentation on how to use it isn't very exact, so I'm just going to speculate that you have to create your own tokenizer-table (which says what to look for to do the job), then pass it to the function?

hutch--

I referred you to the latest service pack for a reason: it contains 2 tokeniser algos, one for words and the other for lines of text. Both are useful in their context. With the service pack comes the help file that properly documents both algos, and you need do little else than read the documentation to use them.

As per the specs, pass the address of a variable for the memory allocation to use as a handle and the source string as well. What you get back is an array that uses the passed variable and the return value is the item count of the array.

hutch--

Here is a small demo on how to use the standard library tokeniser. You must make sure you have the latest service pack installed to have both the help file and the 2 tokenisers. The main punch line with an algo of this type is speed and by tokenising in place it is far faster than old C junk like argv argc for command lines and similar.


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    .code

start:
   
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    call main

    exit

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

main proc

    LOCAL hMem  :DWORD
    LOCAL flen  :DWORD
    LOCAL wcnt  :DWORD
    LOCAL parr  :DWORD

    mov hMem, InputFile("\masm32\include\windows.inc")  ; load source file
    mov flen, ecx

    mov wcnt, rv(wtok,hMem,ADDR parr)       ; tokenise words in source

    print "File tokenised",13,10
    print "Press CTRL + C if you get bored watching tokenised text",13,10

    inkey                   ; wait for user key press

    push esi                ; preserve ESI and EDI
    push edi

    mov esi, parr           ; load array address into ESI
    mov edi, wcnt           ; put word count in EDI

  @@:
    print [esi],13,10       ; display data at ADDRESS in ESI
    add esi, 4              ; step up to next address in ESI
    sub edi, 1              ; decrement counter
    jnz @B

    pop edi                 ; restore EDI and ESI
    pop esi

    ret

main endp

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

end start


boogara

Alrighty, well...now I understand how it works and everything :)  (Thanks very much by the way!)

But, I'm still confused as to how to make it tokenize with a different character (aka: instead of every space, do every "." [period]).

I do have the latest SP though (set all that up right after I installed MASM32)..and, I have been reading through the documentation and everything.
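One possible workaround, offered as a sketch rather than anything from the library documentation, is to leave the tokeniser alone and turn every period into a space in a writable copy of the text before tokenising it. This assumes `wtok` is called the way hutch's demo calls it (source address first, then the address of the array variable) and that it splits on spaces. The sample string `ip` and the `fix_*` labels are made up for illustration.

```asm
    include \masm32\include\masm32rt.inc

    .data
      ip db "192.168.0.1",0         ; hypothetical sample input, must be writable

    .code

start:
    call main
    exit

main proc

    LOCAL parr :DWORD               ; receives the address of the array
    LOCAL wcnt :DWORD               ; receives the item count

    push esi                        ; preserve ESI and EDI
    push edi

    mov esi, OFFSET ip
  fix_scan:
    mov al, [esi]
    test al, al
    jz fix_done                     ; stop at the terminating zero
    cmp al, "."
    jne fix_next
    mov BYTE PTR [esi], " "         ; replace the period with a space
  fix_next:
    add esi, 1
    jmp fix_scan
  fix_done:

    mov wcnt, rv(wtok,ADDR ip,ADDR parr)    ; tokenise the modified text

    mov esi, parr                   ; array address
    mov edi, wcnt                   ; item count
    test edi, edi
    jz finish                       ; no items, skip the loop
  @@:
    print [esi],13,10               ; display the string at the ADDRESS in [ESI]
    add esi, 4
    sub edi, 1
    jnz @B

  finish:
    pop edi                         ; restore EDI and ESI
    pop esi
    ret

main endp

end start
```

The catch is that any real spaces in the input would also split, so this only suits text such as an IP address where the period is the only separator.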