Hi. For my 6502 monitor I'm going to implement a pseudo-assembler, that is, a command that takes mnemonics as input and assembles them into 6502 RAM.
The accepted format is, of course:
<mnemonic> <operands>
e.g.:
LDA #$09
ADC $FF01
etc.
I was thinking of using the pattern matching routines of the HLA Standard Library, but maybe it's better to use string tokenization functions for this (token 1: opcode, token 2: operands, split on delimiters, etc.).
Which do you think I should use?
Quote from: indiocolifa on October 04, 2005, 05:27:02 AM
I was thinking of using the pattern matching routines of the HLA Standard Library, but maybe it's better to use string tokenization functions for this (token 1: opcode, token 2: operands, split on delimiters, etc.).
Which do you think I should use?
I've actually written a "Y86" assembler using the HLA pattern matching code for the sample exercises in the on-line version of AoA. You might look up the source code for "simy86" (or whatever I've called it) on Webster.
Cheers,
Randy Hyde
Quote from: indiocolifa on October 04, 2005, 05:27:02 AM
I was thinking of using the pattern matching routines of the HLA Standard Library, but maybe it's better to use string tokenization functions for this (token 1: opcode, token 2: operands, split on delimiters, etc.).
Which do you think I should use?
Seems like str.tokenize( stringArray, string ) would be quite helpful. Whatever you do, think about the order in which you process the information you gain. For instance, right after tokenizing the line, I'd determine the addressing mode first, before passing it on to the code that matches up the mnemonic. Here's some rough pseudocode to show what I mean:
asmline - input from user
tok[x]  - string array of tokens (4-byte pointers, hence the *4)

str.tokenize( tok, asmline );       // token count is expected in eax
if (eax = 2)                        // mnemonic + operand, no index register
    if (firstchar(tok[1*4]) = "#")
        flag = immediate
    else
        if (length(tok[1*4]) = 2)   // two hex digits, not counting any "$"
            flag = zeropage
        else
            flag = absolute
        endif
    endif
else                                // mnemonic + operand + index register
    if (tok[2*4] = "X")
        if (length(tok[1*4]) = 2)
            flag = zeropageX
        else
            flag = absoluteX
        endif
    elseif (tok[2*4] = "X)")
        flag = indirectX
    else                            // anything else is taken to be "Y"
        if (firstchar(tok[1*4]) = "(")
            flag = indirectY
        else
            flag = absoluteY
        endif
    endif
endif
Then you pass this on to code to match tok[0*4] with one of about 56 instructions. If it happens to be "ADC", for instance, then the 'flag' will tell you which opcode (69, 65, 75, 6D, 7D, 79, 61, 71) to use. It also tells you how many bytes are expected after the opcode.
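To make the "mode first, mnemonic second" idea concrete, here is a minimal sketch in Python rather than HLA. It only handles ADC with hexadecimal operands, and the names tokenize, hex_digits, addressing_mode and assemble_adc are made up for this illustration; they are not HLA Standard Library routines. The opcodes are the eight listed above.

```python
import re

# ADC opcodes per addressing mode, as listed in the post above.
ADC_OPCODES = {
    "immediate": 0x69, "zeropage": 0x65, "zeropageX": 0x75,
    "absolute": 0x6D, "absoluteX": 0x7D, "absoluteY": 0x79,
    "indirectX": 0x61, "indirectY": 0x71,
}

def tokenize(line):
    """Split on spaces and commas, dropping empty tokens (hypothetical helper)."""
    return [t for t in re.split(r"[ ,]+", line.strip().upper()) if t]

def hex_digits(operand):
    """Count the hex digits, ignoring any leading/trailing '(', ')', '#', '$'."""
    return len(operand.strip("()#$"))

def addressing_mode(tokens):
    """Decide the addressing mode before looking at the mnemonic."""
    operand = tokens[1]
    if len(tokens) == 2:
        if operand.startswith("#"):
            return "immediate"
        return "zeropage" if hex_digits(operand) == 2 else "absolute"
    index = tokens[2]
    if index == "X":
        return "zeropageX" if hex_digits(operand) == 2 else "absoluteX"
    if index == "X)":
        return "indirectX"
    # Anything else is assumed to be a Y index, as in the pseudocode.
    return "indirectY" if operand.startswith("(") else "absoluteY"

def assemble_adc(line):
    """Return the opcode followed by the operand bytes (little-endian)."""
    tokens = tokenize(line)
    if tokens[0] != "ADC":
        raise ValueError("this sketch only knows ADC")
    mode = addressing_mode(tokens)
    value = int(tokens[1].strip("()#$"), 16)
    size = 2 if mode.startswith("absolute") else 1
    return bytes([ADC_OPCODES[mode]]) + value.to_bytes(size, "little")

print(assemble_adc("ADC #$09").hex())
print(assemble_adc("ADC $FF01").hex())
```

The mode name also fixes how many operand bytes follow the opcode: two for the absolute modes, one for everything else in this table.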
Nathan.
How do I search my opcode table using pattern matching? I'm using the following code, which does not work:
// do pattern match
pat.match (asmentry);

    // check for valid mnemonic
    push (ecx);
    push (edi);
    pat.onePat;
        FOR (xor(eax,eax); eax<=256; inc (eax)) DO
            stdout.put ("EaX=",eax);
            intmul (@size(opcode),eax,edi); // calc offset at string table
            mov (OP_STR[edi], esi);
            mov (esi, mnemo); // get string at table
            pat.alternate;
            pat.matchStr(mnemo);
        ENDFOR;
        pop(edi);
        pop(ecx);
    pat.endOnePat;

  pat.if_failure

    stdout.put ("Unknown instruction -",asmentry,nl);

pat.endmatch;
What I'm trying to do is:
pat.alternate
<test for opcode 1>
pat.alternate
<test for opcode 2>
.
.
.
pat.alternate
<test for opcode n>
Thank you very much.
Well, I've had an itch for a while now to do a full-fledged 6502/10 assembler (with some support for macros, a few control constructs, an expression evaluator, and some directives and such), so I've occasionally poked around in places like http://webster.cs.ucr.edu/AsmTools/RollYourOwn/index.html and Sevag's Arayna project (take a look at "CmpAtFuncs.hla") for ideas. A binary tree might be overkill for a simple assembler that only deals with 56 instructions, so I came up with this approach:
1) Put the 56 instructions into one long string, following each with a space, so every entry is exactly 4 bytes (one dword).
2) Take the user-supplied instruction, convert it to the same case, append a space, and load the resulting 4 bytes into EAX.
3) Use SCASD to find it in the string from step 1.
4) Use the resulting position as an index into an array of opcodes.
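The four steps can also be sketched in Python to show the logic without the register details. This is only an illustration: the five-entry mnemonic list is a stand-in for the full set of 56, and find_mnemonic is a name made up for this sketch. Unlike the SCASD count, it returns a zero-based index, or -1 when nothing matches.

```python
# Step 1: pack the mnemonics into one string, each followed by a space,
# so every entry occupies exactly 4 bytes.
MNEMONICS = ["ADC", "AND", "ASL", "LDA", "STA"]  # partial list, for illustration
PACKED = "".join(m + " " for m in MNEMONICS).encode("ascii")

def find_mnemonic(text):
    """Return the zero-based index of text in MNEMONICS, or -1 if absent."""
    # Step 2: normalize the case and append the space.
    key = (text.upper() + " ").encode("ascii")
    if len(key) != 4:
        return -1
    # Step 3: compare 4 bytes at a time on 4-byte boundaries,
    # which is what a SCASD loop does with a dword in EAX.
    for offset in range(0, len(PACKED), 4):
        if PACKED[offset:offset + 4] == key:
            # Step 4: the entry number doubles as an index into an opcode table.
            return offset // 4
    return -1

print(find_mnemonic("sta"))
print(find_mnemonic("xyz"))
```

Stepping in units of 4 matters: a plain substring search could report a match that straddles two entries.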
Here's some test code for the matching part:
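program asm;
#include("stdlib.hhf")

static
    // four test entries, each padded with a space to 4 bytes
    mnem: byte[4*4] := ['A','D','C',' ','S','U','B',' ','S','T','A',' ','L','D','A',' '];
    find: byte[4] := ['S','T','A',' '];

begin asm;

    mov( 4, ecx );                  // number of dwords to scan (not bytes)
    lea( edi, mnem );               // edi -> start of the mnemonic string
    mov( (type dword find), eax );  // the 4 bytes we are looking for
    cld();                          // scan forward
back:
    scasd();                        // compare eax with [edi], advance edi by 4
    loopne back;                    // repeat until equal or ecx reaches 0
    // LOOPNE leaves the flags alone, so ZF is still set here on a match
    mov( 4, eax );
    sub( ecx, eax );                // 1-based position of the entry (3 for "STA")
    stdout.puti32(eax);
    stdout.newln();

end asm;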
EAX will contain the 1-based position of the match (3 for "STA" here), so subtract 1 to get a zero-based index into our array. Just use a 2-dimensional byte array with the first dimension being the mnemonic and the second dimension being the addressing mode. Here's code that I wrote for the aoaprogramming forum (Yahoo Groups) which shows a straightforward way of dealing with 2D arrays:
program afp;
// Array Fill & Print
// low-level example by Nathan Baker
//---
// esi - 1st dimension counter
// edi - 2nd dimension counter
// ebx - base register
// eax - multi-purpose scratchpad

#include("stdlib.hhf")

var
    MyArray: int32[3, 5];
endvar;

begin afp;

    xor( esi, esi );                //clear our first dimension counter
    stdout.puts( nl "Enter 15 integers, 5 per column:" );
lp1:
    stdout.puts( nl "Column " );
    mov( esi, eax );                //get current 1st dim count
    inc( eax );                     //zero-based, so we adjust for the display
    stdout.puti32( eax );           //display it
    stdout.newln();
    mov( esi, eax );                //lets calc offset from basepointer
    shl( 2, eax );                  //multiply by 4 because 'int32' is 4 bytes
    intmul( 5, eax );               //multiply by 5 because second dimension is 5
    lea( ebx, MyArray );            //get array pointer into base register
    add( eax, ebx );                //add the offset
    xor( edi, edi );                //clear our second dimension counter
lp2:
    mov( edi, eax );                //get current 2nd dim count
    inc( eax );                     //zero-based, so we adjust for the display
    stdout.puti32( eax );           //display it
    stdout.puts( ":" );
    stdin.geti32();
    mov( eax, [ebx+edi*4] );        //[base + (1st dim * size_of 2nd)] + 2nd dim * 4
    inc( edi );                     //increase 2nd dim counter
    cmp( edi, 5 );                  //have we reached upper limit?
    jne lp2;                        //no, then jump back -- yes, then continue
    inc( esi );                     //increase 1st dim counter
    cmp( esi, 3 );                  //have we reached upper limit?
    jne lp1;                        //no, then jump back -- yes, then continue

    xor( esi, esi );
    stdout.puts( nl "You entered:" );
lp3:
    stdout.puts( nl "Column " );
    mov( esi, eax );
    inc( eax );
    stdout.puti32( eax );
    stdout.newln();
    mov( esi, eax );
    shl( 2, eax );
    intmul( 5, eax );
    lea( ebx, MyArray );
    add( eax, ebx );
    xor( edi, edi );
lp4:
    mov( edi, eax );
    inc( eax );
    stdout.puti32( eax );
    stdout.puts( ":" );
    mov( [ebx+edi*4], eax );
    stdout.puti32( eax );
    stdout.newln();
    inc( edi );
    cmp( edi, 5 );
    jne lp4;
    inc( esi );
    cmp( esi, 3 );
    jne lp3;

end afp;
You'll want to remove the "*4" from the end of "[ebx+edi*4]" and delete the "shl( 2, eax );" (and maybe make a few other tweaks, such as storing AL rather than EAX) since you only need a byte array. Hope this helps. Have fun!
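For the byte-array version, the offset arithmetic reduces to row * modes_per_row + column. Here is a small Python sketch of that lookup, assuming just two rows (ADC and LDA) and eight addressing modes; the names MNEMONICS, MODES, TABLE and opcode are only for illustration. A full table would also need some marker for mnemonic/mode combinations that don't exist, which this sketch leaves out.

```python
# Row order (1st dimension) and column order (2nd dimension) of the table.
MNEMONICS = ["ADC", "LDA"]
MODES = ["immediate", "zeropage", "zeropageX", "absolute",
         "absoluteX", "absoluteY", "indirectX", "indirectY"]

# One byte per entry, stored row after row like a 2D byte array in memory.
TABLE = bytes([
    0x69, 0x65, 0x75, 0x6D, 0x7D, 0x79, 0x61, 0x71,  # ADC
    0xA9, 0xA5, 0xB5, 0xAD, 0xBD, 0xB9, 0xA1, 0xB1,  # LDA
])

def opcode(mnemonic, mode):
    """Look up the opcode at [row, column] in the flat byte table."""
    row = MNEMONICS.index(mnemonic)
    col = MODES.index(mode)
    # Entries are one byte each, so no scaling by element size is needed.
    return TABLE[row * len(MODES) + col]

print(hex(opcode("ADC", "immediate")))
print(hex(opcode("LDA", "absoluteY")))
```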
Nathan.