Early beta of a new script engine

Started by hutch--, January 25, 2006, 01:37:23 AM

hutch--

I have been working on this one for a couple of weeks on and off and it is starting to get into useful shape. I have spent the bulk of the development time on it trying to get its instruction fetch speed fast enough and this version is much faster than the script engine built into QE. Its basic design is a raw script processing engine that has both advantages and disadvantages. It makes more complicated text processing a lot easier but imposes additional burdens on getting it fast enough to be useful.

Current tests on the PIV I use show about 1.1 million instruction fetches a second, which is probably fast enough for a script engine aimed at text output. I have done the most work so far on the back end that actually runs the script; the loader is reasonably fast but not complete. Currently it does not test for duplicate labels, and if a label is duplicated it will jump to the first label of that name in the script. This will be completed when the loader is further developed.

It currently uses variables that are predefined within a range, but if I can find a fast enough method in the loader, it will support named variables much like a normal language.

It has a reasonable number of commands and functions so far but it has little useful runtime capacity yet. To start testing it I have added the minimum file IO capacity, and this is demonstrated in the template.se file that the script engine will run. The capacity to build scripts for code generation is one of the target uses for this script engine, but over time I intend to expand it a lot further.

[attachment deleted by admin]
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

zooba

Works fine for me. Not sure how I can test the speed though...

I don't feel there's anything wrong with the numbers instead of variable names. However, perhaps each number could only be used once, preceded by # if it is treated as a number and $ if it is treated as a string. Then all you need is textual substitution of variable names for numbers (i.e. FILE equ 1 ... #FILE = fcreate...).
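To make the substitution idea concrete, here is a minimal Python sketch of such a pre-pass. It is only an illustration, not part of the engine: the function name expand_equates is hypothetical, the "NAME equ N" line format is assumed from the example above, and for simplicity it also substitutes inside quoted strings.

```python
import re

def expand_equates(lines):
    """Replace #NAME / $NAME with #N / $N using 'NAME equ N' definitions."""
    names = {}
    out = []
    for line in lines:
        parts = line.split()
        # Assumed definition syntax: NAME equ NUMBER (the line is then dropped)
        if len(parts) == 3 and parts[1].lower() == "equ":
            names[parts[0]] = parts[2]
            continue
        # Names without a definition are left untouched.
        out.append(re.sub(
            r"([#$])([A-Za-z_]\w*)",
            lambda m: m.group(1) + names.get(m.group(2), m.group(2)),
            line))
    return out
```

With this, the engine itself would still only ever see numbered variables.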

Is the '#0 = ...' going to be necessary on each command? Just for typing reduction. And is it possible to remove the line break after an IF statement?

Looks good :U

Zooba

hutch--

This is the script I used to test the instruction fetch speed. I am using a sequential word parser for each word on a line of code, and this makes it viable to do the number to string conversions on the fly for the two fprint functions and the cat$ function. The two fprint functions actually return the written byte count so the value is not useless; it's just that it does not need to be used in every instance, so pointing the return value at #0 does the job.


  ; ********************************
  ; This is to test the instruction
  ; fetch speed with minimum runtime
  ; functionality for the loop.
  ; ********************************

    msgbox "Start" "Loop speed test" MB_OK

    #1 = 100000         ; assign the loop reference count
    #2 = 0              ; set counter to zero

    clock 0             ; start timing

  start:
    nop                 ; nop is a no content
    nop                 ; instruction mainly
    nop                 ; for testing
    nop
    nop
    nop
    nop
    add #2, 1
    if #2 != #1         ; while #1 is not equal to #2
    goto start          ; loop back to start

    clock 1             ; end timing and store results in #0

    num2str #0 $1       ; convert it to a string

    msgbox "Milliseconds for 1 million instructions" $1 MB_OK

    end

zooba

Quote from: hutch-- on January 25, 2006, 02:35:20 AM
The two fprint functions actually return the written byte count so the value is not useless; it's just that it does not need to be used in every instance, so pointing the return value at #0 does the job.

Quote from: syntax.txt
#0 is always used as a number return value

Why can't it be omitted if it isn't going to be used? If you're parsing a word at a time presumably you have a flag which contains the current line's return destination, so default it to #0.

I got 1072ms for 1M instructions.
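For reference, the arithmetic behind that figure can be written out as a small Python sketch. It assumes the test script above is counted as 10 instructions per pass (7 nops plus add, if and goto) and ignores the few instructions outside the loop.

```python
# Assumption: each pass through the loop fetches 7 nops + add + if + goto.
INSTRUCTIONS_PER_PASS = 7 + 3
PASSES = 100_000            # the value assigned to #1 in the test script
ELAPSED_MS = 1072           # the timing reported above

total_instructions = INSTRUCTIONS_PER_PASS * PASSES
fetches_per_second = total_instructions / (ELAPSED_MS / 1000)

print(total_instructions)
print(round(fetches_per_second))
```

That works out to a little over 0.9 million fetches a second, in the same range as the 1.1 million quoted for the PIV.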

hutch--

I parse each line as either a statement or a function, and as the file IO capacity is being implemented as functions, they require a variable to place the return value in, even if it is not used.

The IF runtime evaluation works by allowing or disallowing the following instruction, and it is a logic I would like to keep in the backend. When I get the backend runtime engine finished, I need to tweak the loader for a number of things, and in that area I can do a reasonable amount of preprocessing and catch errors that are currently being caught at runtime.
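A minimal Python sketch of that allow/disallow logic, for illustration only. The engine itself is written in PowerBASIC; the "set" and "print" statements and the function name run_with_if are hypothetical stand-ins, and only == and != are handled.

```python
def run_with_if(lines):
    """Run a toy script in which a false `if` disallows only the next line."""
    variables = {}
    output = []
    allow_next = True
    for line in lines:
        if not allow_next:
            allow_next = True    # the disallowed line is consumed, not executed
            continue
        words = line.split()
        if words[0] == "set":            # hypothetical: set <name> <integer>
            variables[words[1]] = int(words[2])
        elif words[0] == "if":           # if <name> == <name>  or  !=
            left, op, right = words[1], words[2], words[3]
            equal = variables[left] == variables[right]
            allow_next = equal if op == "==" else not equal
        elif words[0] == "print":        # hypothetical output statement
            output.append(" ".join(words[1:]))
    return output
```

This is why the instruction after an IF has to sit on its own line: the IF only decides whether the next array entry runs.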

I am stuck with a tradeoff between capacity and speed and what I don't want is the loader / pre-processor being too slow. As long as I keep the backend reasonably low level, it should remain reasonably fast for a script engine of its type.

Once this is working it's reasonably easy to emulate prebuilt loop code in the form,

Do while condition
  ; instructions
Loop

  or

Do
  ; instructions
Loop While condition
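
One way a loader could emulate these two forms is to rewrite them into labels plus the existing if and goto before the script runs. The Python sketch below is an assumption about how that might be done, not the engine's actual method; the function name lower and the generated label names (top0, body0, done0) are hypothetical.

```python
def lower(lines):
    """Rewrite Do/Loop blocks into labels, `if` and `goto` lines."""
    out = []
    stack = []      # numbers of the loops that are currently open
    counter = 0
    for raw in lines:
        line = raw.strip()
        low = line.lower()
        if low.startswith("do while "):
            n, counter = counter, counter + 1
            cond = line[len("do while "):]
            # A true condition allows the jump into the body,
            # otherwise execution falls through to the exit jump.
            out += [f"top{n}:", f"if {cond}", f"goto body{n}",
                    f"goto done{n}", f"body{n}:"]
            stack.append(n)
        elif low == "do":
            n, counter = counter, counter + 1
            out.append(f"top{n}:")
            stack.append(n)
        elif low.startswith("loop while "):
            n = stack.pop()
            cond = line[len("loop while "):]
            out += [f"if {cond}", f"goto top{n}"]
        elif low == "loop":
            n = stack.pop()
            out += [f"goto top{n}", f"done{n}:"]
        else:
            out.append(line)
    return out
```

The stack lets loops nest, and the backend never needs to know that the loop forms exist.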

gabor

Hello Hutch!


Since I am really interested in scripting and parsing, I would like to know more about the parser model you are using. For example, I would use a finite state machine for such a task (as I did in my XML parser).
Could you tell me your basic design?

Greets, Gábor

hutch--

gabor,

It is internally a strange animal to describe. I wanted the dynamic string engine from PowerBASIC so I have written it in PowerBASIC which simplifies the dynamic string functions I have written into it so far.

I process the text file by reading line by line into memory, removing each label and storing both its name and line number. The label is not written to the array that stores the lines, so each label points to the following instruction. This is basically a compacting phase that removes the labels from the code.

I do another pass after this to replace the label names with their line numbers so that the branching instructions have the correct line number as a target.
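
The two loader passes described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the PowerBASIC source: the function name load is hypothetical, labels are assumed to be a name followed by a colon, and comments are stripped at the first semicolon even inside quoted strings.

```python
def load(source):
    """Strip labels from the script, then resolve branch targets."""
    # Pass 1: compact the code, recording where each label points.
    code = []
    labels = {}
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        if line.endswith(":"):
            # setdefault keeps the first definition, which mirrors the
            # current behaviour of jumping to the first duplicate label.
            labels.setdefault(line[:-1], len(code))
            continue
        code.append(line)
    # Pass 2: replace label names in branches with their line numbers.
    resolved = []
    for line in code:
        words = line.split()
        if len(words) >= 2 and words[0] in ("goto", "call") and words[1] in labels:
            line = f"{words[0]} {labels[words[1]]}"
        resolved.append(line)
    return resolved
```

A duplicate label check would fit naturally into pass 1, at the point where setdefault finds the name already stored.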

It is then passed to the script engine and this is where the speed is needed for a raw text script engine.

The array index is treated like an IP, so instructions are run one after the other unless there is a branch ("goto" or "call") which changes the IP for the following instruction. With each line, I use a sequential parser originally written in masm that I have ported directly to PB. It reads one word at a time, left to right, and stores the pointer to the next word as its return value so it can be used for the next call.
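
A rough Python equivalent of that kind of sequential word parser, for illustration only. The name next_word is hypothetical, and treating commas as separators and keeping quoted strings together as one word are assumptions, not details taken from the masm or PB code.

```python
def next_word(line, pos):
    """Return (word, next_pos); word is None once the line is exhausted."""
    n = len(line)
    # Skip separators in front of the word.
    while pos < n and line[pos] in " \t,":
        pos += 1
    if pos >= n:
        return None, pos
    if line[pos] == '"':
        # A quoted string is returned as a single word, quotes included.
        end = line.find('"', pos + 1)
        end = n if end == -1 else end + 1
    else:
        end = pos
        while end < n and line[end] not in " \t,":
            end += 1
    return line[pos:end], end


def words_of(line):
    """Collect every word by feeding each returned position into the next call."""
    result = []
    word, pos = next_word(line, 0)
    while word is not None:
        result.append(word)
        word, pos = next_word(line, pos)
    return result
```

Because each call resumes from the position the previous one returned, the line is only scanned once however many words it holds.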

In masm I would have used a static tree (a variety of finite state machine) for the branching based on which word was read but in PB I have the option of a vectorised Switch block which I have used and it seems fast enough.

If the first word is not recognised, it passes to the next parse operation, which checks the second word to see if it's the assignment operator "=". It then branches on the third word after the "=" using a following vectorised switch block.
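
Put together, the run loop could look something like the Python sketch below, with dictionary lookups standing in for the vectorised Switch blocks. It is a simplified illustration under assumptions: the handler names, the "len" function and the max_steps safety limit are all hypothetical, and labels are assumed to have been resolved to line numbers already.

```python
def value(state, token):
    """Read a '#n' variable or an integer literal."""
    return state["vars"][token] if token.startswith("#") else int(token)

def op_nop(state, args):
    pass

def op_add(state, args):
    state["vars"][args[0]] = value(state, args[0]) + value(state, args[1])

def op_goto(state, args):
    state["ip"] = int(args[0])      # target is already a line number

def fn_len(state, args):            # hypothetical function, for illustration
    return len(args[0])

STATEMENTS = {"nop": op_nop, "add": op_add, "goto": op_goto}   # first word
FUNCTIONS = {"len": fn_len}                                    # third word

def run(code, max_steps=1000):
    state = {"vars": {}, "ip": 0}
    for _ in range(max_steps):      # guard against runaway loops
        if state["ip"] >= len(code):
            break
        words = code[state["ip"]].replace(",", " ").split()
        state["ip"] += 1            # default: fall through to the next line
        if words[0] == "end":
            break
        if words[0] in STATEMENTS:
            STATEMENTS[words[0]](state, words[1:])
        elif len(words) >= 3 and words[1] == "=":
            target, third = words[0], words[2]
            if third in FUNCTIONS:
                state["vars"][target] = FUNCTIONS[third](state, words[3:])
            else:
                state["vars"][target] = value(state, third)
        else:
            raise ValueError("unrecognised line: " + code[state["ip"] - 1])
    return state["vars"]
```

The IP is advanced before the instruction runs, so a branch only has to overwrite it.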