The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: Grincheux on July 02, 2007, 08:51:04 PM

Title: Syntax Analyzer
Post by: Grincheux on July 02, 2007, 08:51:04 PM
In my new project I need to analyze source code.
I can find such a string : "A/(B-(C+D))*E"

I know how to get the final string : "ABCD+-/E*" but I don't understand how I can compute the temporaries result.

Can anyone help me ?
Title: Re: Syntax Analyzer
Post by: gabor on July 03, 2007, 07:39:18 AM
Hi!

If I recall it correctly you are trying to parsing a post order syntax. I would recommend a recursive method, thus the temporaries are stored on the stack. With very long expressions this couold lead to stack overflow, but this can be fixed by using a "custom stack" (simple memory space with push and pop macros).

Honestly, I was wondering a bit: you already know how to convert from infix to postfix syntax, but you cannot calculate the values? Are you doing simple string manipulations when rewriting the expression into postfix?
Another notice is that with postfix AFAIK the evaluationi of the expressioin should done from the right to the left, that is it should be started at the end.
Prefix syntax might be slightly easier because of this.

To give a real answer I would recommend this:
- read the string from right to left
- there  are 2 cases: operator was read, a symbol was read
- the basic suggestion is this: an operator and 2 symbols can be evalutated (like CD+ in your example)
- if 2 symbols are not next to each other, this means the current operation has to put aside and another operation with higher priority is following (like /E* multiplication will be preceded by the division)

I hope to find some more time for this idea since parsers and syntax analysers are of a great interest to me.

Greets, Gábor
Title: Re: Syntax Analyzer
Post by: zooba on July 03, 2007, 08:14:13 AM
If you want to calculate the result of that, the easiest method is to use a stack.

Read the string (list) from left (first) to right (last). For each value, push it onto the stack. For each operator, perform the function on the top two values. At the end you should be left with one value which is the final result. If you've got more than one value left on your stack, or you run out too early, something went wrong.

I wrote a small library/function to do this and posted it here (http://www.masm32.com/board/index.php?topic=3910.0) last year. Feel free to use whichever parts of it you'd like.

Cheers,

Zooba :U
Title: Re: Syntax Analyzer
Post by: Grincheux on July 03, 2007, 01:07:32 PM
Zooba, I have dowloaded your library and will see how integrate it into my project.
Could you have such operations :

+/ 1,2,3,4 Result 10
-/  1,2,3,4 Result 0
*/ 1,2,3,4 Result 24

They are operators that I used when programming in APL and I found them very useful
Title: Re: Syntax Analyzer
Post by: gabor on July 03, 2007, 09:26:26 PM
Hello!

I was thinking about the pre/in/postfix forms and I found the postfix form to be quite easy to process. My idea is very simple, read the input, push the non-operator values on the stack. If an operator was read, execute it using the last 2 values on the top of the stack. The result is stored on the stack again, instead of the original 2 operands there will be 1 value, the result. This is done until the entire input string, the expression is processed.
Interesting question is are there letters or numbers or symbol names in the expressions? This can make it a bit more difficult.

Greets,
Gábor
Title: Re: Syntax Analyzer
Post by: Grincheux on July 03, 2007, 11:09:06 PM
I don't think, because I don't wan to give the real variable name but a letter that will identify the variable.In fact it is the variable name in an internal line sympbol table. That limits the assembler to 26 variables per line ! So I really have the A, B, C, D, E and F variable. With this method it is shorter to analyze.

Suppose :

VARIABLE1 = VARIABLE2 / ((VARIABLE3 + VARIABLE4) * VARIABLE5) will be translate to :
A = B / ((C + D) * E)

Title: Re: Syntax Analyzer
Post by: zooba on July 04, 2007, 10:12:50 AM
Quote from: Grincheux on July 03, 2007, 01:07:32 PM
Could you have such operations...

It's possible, but I only implemented binary operators. It's been a while since I looked at the code for it, but I imagine it would take quite a bit of work to achieve what you are after. Maybe some form of preprocessor which would turn +/ 1, 2, 3, 4 into 1+2+3+4?

In terms of variable names, if you implement a SymbolCallback function the Evaluate function will pass symbols to your own function and you can return whatever value you like. The downside is you now need to handle string to integer/float conversions yourself.

Quote from: gabor on July 03, 2007, 09:26:26 PM
Read the input, push the non-operator values on the stack. If an operator was read, execute it using the last 2 values on the top of the stack. The result is stored on the stack again, instead of the original 2 operands there will be 1 value, the result. This is done until the entire input string, the expression is processed.

This is very easy once your expression is in RPN (http://en.wikipedia.org/wiki/Reverse_Polish_notation). The hard part is converting from infix to postfix.

Quote from: gabor on July 03, 2007, 09:26:26 PM
Interesting question is are there letters or numbers or symbol names in the expressions? This can make it a bit more difficult.

It's easy enough to pull out anything that's not an operator, then a hash map is the fastest way of resolving it (if it isn't a number).

Cheers,

Zooba :U