Read next statement algo

Started by hutch--, November 23, 2005, 01:39:49 AM

hutch--

I have tried these before, but those versions used a callback in the middle of the algo, which is much harder to use. This version takes an offset variable as a parameter and returns the updated offset, so it is far easier to use. It scans through spaces, tabs, unused lines and comments, then reads the next statement into the user-defined buffer for further work such as tokenising or, for more complex statements, a more complex parser.
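
As a rough illustration of that calling pattern (offset in, updated offset out), here is a minimal Python sketch. It is not the MASM code from the attachment; the names next_statement and all_statements are made up for the example, and it assumes ";" starts a comment and ignores quoted text and split lines.

```python
def next_statement(src, offset):
    """Return (statement, new_offset); statement is None at end of input.

    Hypothetical sketch: ';' is assumed to start a comment, and quoted
    text and line continuation are deliberately not handled here.
    """
    n = len(src)
    while offset < n:
        # Skip spaces, tabs and line ends in front of the next statement.
        while offset < n and src[offset] in " \t\r\n":
            offset += 1
        if offset >= n:
            break
        # Locate the end of the current line.
        end = src.find("\n", offset)
        if end == -1:
            end = n
        if src[offset] == ";":
            # Whole-line comment: skip it and keep scanning.
            offset = end
            continue
        # Cut a trailing comment from the statement.
        line = src[offset:end]
        semi = line.find(";")
        if semi != -1:
            line = line[:semi]
        return line.rstrip(), end
    return None, n


def all_statements(src):
    """Feed the returned offset back in until the source is used up."""
    out, offset = [], 0
    while True:
        stmt, offset = next_statement(src, offset)
        if stmt is None:
            return out
        out.append(stmt)
```

Because the caller just passes the returned offset straight back in, no callback is needed to keep track of the position in the source.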

It supports quoted text with split lines and comments on each split line so if you have source like,


    var = function(arg1,  ; comment
                         arg2,  ; comment
                         arg3)

You get output like,

var = function(arg1,       arg2,     arg3)

No attempt has been made to further format the output as most tokenisers and parsers can handle additional spacing without any problems.

For line continuation, it recognises either a "," or a "\" for splits on word combinations like,

win_style1 or \
win_style2

with an output,

win_style1 or win_style2
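
The split-line handling above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not the original algo: strip_comment and read_joined are hypothetical names, ";" is assumed to be the comment character, only double quotes are treated as quoted text, and unlike the output shown above the pieces are trimmed and joined with a single space.

```python
def strip_comment(line):
    """Cut a ';' comment, ignoring any ';' inside double-quoted text."""
    in_quote = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quote = not in_quote
        elif ch == ";" and not in_quote:
            return line[:i]
    return line


def read_joined(lines, index):
    """Join a statement that may be split over several lines.

    Returns (statement, next_index). A trailing ',' is kept and the
    statement continues; a trailing '\\' is dropped and the statement
    continues. Any other line end finishes the statement.
    """
    parts = []
    while index < len(lines):
        text = strip_comment(lines[index]).rstrip()
        index += 1
        if text.endswith("\\"):
            # Backslash continuation: remove the backslash itself.
            parts.append(text[:-1])
            continue
        parts.append(text)
        if not text.endswith(","):
            break
    # Trim each piece and join with one space (a formatting choice of
    # this sketch only).
    return " ".join(p.strip() for p in parts), index
```

With the two examples from this post as input, the first call gives "var = function(arg1, arg2, arg3)" and the second gives "win_style1 or win_style2".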


No serious effort has been made to optimise the code, but it does the file IO and processing of a 30 meg source file in about 1.5 seconds on my PIV, so it's probably fast enough to start with.

[attachment deleted by admin]
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

hutch--

I have rewritten this algo to make it more reliable. This one is larger but tests faster, and it seems to handle the conditions it was designed for more correctly. As before, it evaluates the end condition of a line to see if it's a split line, strips the comments from each line and writes the result as a single line without comments. Apart from some space removal on following lines, which was added for speed reasons, there is no attempt at formatting the results, as a tokeniser will normally read through unused spaces.

This one does the file IO and processing of a 30 meg file in about 1 second on this PIV.

It will continue a line if there is a trailing comma, and it also recognises the trailing backslash as a line continuator when you have code like ORed window styles split across multiple lines, as is common.

comment * ---------------------------------------------------
It does not attempt to remove comments of this type
as this task requires word recognition which is a later task.
        --------------------------------------------------- *


[attachment deleted by admin]

hutch--

Here is a test piece using the get_ml algo and a slight variation of the MASM32 library module parse_line. This approach is a lot cleaner and easier to work with than the earlier callback methods I tried out.

The combination of these two algorithms is probably powerful enough to load and read a moderately complex scripting language and it appears to be fast enough. The test piece is slowed down by the high count of STDOUT calls to display the tokenised array members. Single line assembler statements are of course very simple to parse.
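
To give an idea of the second stage, here is a small Python sketch of splitting one joined statement into an array of tokens. It does not reproduce the MASM32 parse_line module; the function name tokenise and the choice of delimiters (space, tab, comma, with double-quoted text kept whole) are assumptions made for the example.

```python
def tokenise(statement):
    """Split a statement into tokens on spaces, tabs and commas.

    Hypothetical rules for this sketch: double-quoted text stays
    together as one token, with its quotes included.
    """
    tokens, current, in_quote = [], "", False
    for ch in statement:
        if ch == '"':
            # Toggle quoted mode and keep the quote in the token.
            in_quote = not in_quote
            current += ch
        elif ch in " \t," and not in_quote:
            # A delimiter ends the current token, if there is one.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens
```

Runs of spaces between tokens are simply skipped, which is why the statement reader does not need to tidy up the spacing left behind by split lines.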

[attachment deleted by admin]

ToutEnMasm

Hello,
perhaps I am interrupting your thinking and your testing of this piece of code, but I don't really see what this algo does.
A short explanation in plain English would tell me more.
                                           ToutEnMasm