News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Syntax Analyzer

Started by Grincheux, July 02, 2007, 08:51:04 PM

Previous topic - Next topic

Grincheux

In my new project I need to analyze source code.
I can find such a string : "A/(B-(C+D))*E"

I know how to get the final string : "ABCD+-/E*" but I don't understand how I can compute the temporaries result.

Can anyone help me ?
Kenavo

Grincheux
_____________________________________________________
http://www.phrio.biz

gabor

Hi!

If I recall it correctly you are trying to parsing a post order syntax. I would recommend a recursive method, thus the temporaries are stored on the stack. With very long expressions this couold lead to stack overflow, but this can be fixed by using a "custom stack" (simple memory space with push and pop macros).

Honestly, I was wondering a bit: you already know how to convert from infix to postfix syntax, but you cannot calculate the values? Are you doing simple string manipulations when rewriting the expression into postfix?
Another notice is that with postfix AFAIK the evaluationi of the expressioin should done from the right to the left, that is it should be started at the end.
Prefix syntax might be slightly easier because of this.

To give a real answer I would recommend this:
- read the string from right to left
- there  are 2 cases: operator was read, a symbol was read
- the basic suggestion is this: an operator and 2 symbols can be evalutated (like CD+ in your example)
- if 2 symbols are not next to each other, this means the current operation has to put aside and another operation with higher priority is following (like /E* multiplication will be preceded by the division)

I hope to find some more time for this idea since parsers and syntax analysers are of a great interest to me.

Greets, Gábor

zooba

If you want to calculate the result of that, the easiest method is to use a stack.

Read the string (list) from left (first) to right (last). For each value, push it onto the stack. For each operator, perform the function on the top two values. At the end you should be left with one value which is the final result. If you've got more than one value left on your stack, or you run out too early, something went wrong.

I wrote a small library/function to do this and posted it here last year. Feel free to use whichever parts of it you'd like.

Cheers,

Zooba :U

Grincheux

Zooba, I have dowloaded your library and will see how integrate it into my project.
Could you have such operations :

+/ 1,2,3,4 Result 10
-/  1,2,3,4 Result 0
*/ 1,2,3,4 Result 24

They are operators that I used when programming in APL and I found them very useful
Kenavo

Grincheux
_____________________________________________________
http://www.phrio.biz

gabor

Hello!

I was thinking about the pre/in/postfix forms and I found the postfix form to be quite easy to process. My idea is very simple, read the input, push the non-operator values on the stack. If an operator was read, execute it using the last 2 values on the top of the stack. The result is stored on the stack again, instead of the original 2 operands there will be 1 value, the result. This is done until the entire input string, the expression is processed.
Interesting question is are there letters or numbers or symbol names in the expressions? This can make it a bit more difficult.

Greets,
Gábor

Grincheux

I don't think, because I don't wan to give the real variable name but a letter that will identify the variable.In fact it is the variable name in an internal line sympbol table. That limits the assembler to 26 variables per line ! So I really have the A, B, C, D, E and F variable. With this method it is shorter to analyze.

Suppose :

VARIABLE1 = VARIABLE2 / ((VARIABLE3 + VARIABLE4) * VARIABLE5) will be translate to :
A = B / ((C + D) * E)

Kenavo

Grincheux
_____________________________________________________
http://www.phrio.biz

zooba

Quote from: Grincheux on July 03, 2007, 01:07:32 PM
Could you have such operations...

It's possible, but I only implemented binary operators. It's been a while since I looked at the code for it, but I imagine it would take quite a bit of work to achieve what you are after. Maybe some form of preprocessor which would turn +/ 1, 2, 3, 4 into 1+2+3+4?

In terms of variable names, if you implement a SymbolCallback function the Evaluate function will pass symbols to your own function and you can return whatever value you like. The downside is you now need to handle string to integer/float conversions yourself.

Quote from: gabor on July 03, 2007, 09:26:26 PM
Read the input, push the non-operator values on the stack. If an operator was read, execute it using the last 2 values on the top of the stack. The result is stored on the stack again, instead of the original 2 operands there will be 1 value, the result. This is done until the entire input string, the expression is processed.

This is very easy once your expression is in RPN. The hard part is converting from infix to postfix.

Quote from: gabor on July 03, 2007, 09:26:26 PM
Interesting question is are there letters or numbers or symbol names in the expressions? This can make it a bit more difficult.

It's easy enough to pull out anything that's not an operator, then a hash map is the fastest way of resolving it (if it isn't a number).

Cheers,

Zooba :U