Writing an assembler.

Started by travism, June 27, 2009, 06:45:04 PM

travism

I've written some basic interpreters in the past and just finished my fun project of a brainf*ck interpreter. Now I want to step it up a little and actually write an assembler. I've got some documents on the COFF file format and the Intel manuals. I've read about instruction encoding, but I'm still not understanding how you're actually supposed to encode each instruction or how you're actually supposed to write the COFF file. I also have 2 PDFs on compiler technology, but nothing explains anything about actually writing object files or encoding instructions; it's all about parsing, which is the easiest part. Does anyone have any information or anything? I've already been looking at the sources of JWasm and FASM to see how others have done it. Any help would be appreciated, thanks! :)

dedndave

as for the coff format, Vortex is the man - his website has all the proper docs
and, he can assist you if you have a hard time understanding parts
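
As a rough illustration of what "writing the COFF file" starts with, here is a minimal Python sketch that packs the 20-byte COFF file header found at the start of an object file. The function name and its parameters are made up for this example; the field order and the i386 machine value follow the published PE/COFF layout. Section headers, raw section data, relocations, the symbol table and the string table would still have to follow, and none of that is shown here.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C  # machine type value for 32-bit x86


def coff_file_header(num_sections, symtab_offset, num_symbols, timestamp=0):
    """Pack the 20-byte COFF file header (hypothetical helper for illustration)."""
    return struct.pack(
        "<HHIIIHH",               # little-endian: 2 words, 3 dwords, 2 words
        IMAGE_FILE_MACHINE_I386,  # Machine
        num_sections,             # NumberOfSections
        timestamp,                # TimeDateStamp
        symtab_offset,            # PointerToSymbolTable (file offset)
        num_symbols,              # NumberOfSymbols
        0,                        # SizeOfOptionalHeader (0 in object files)
        0,                        # Characteristics
    )


header = coff_file_header(num_sections=1, symtab_offset=0, num_symbols=0)
```

The same `struct.pack` approach would extend to the 40-byte section headers and 18-byte symbol records, with the file offsets filled in once the sizes of all parts are known.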

as for the encoding, it is not too difficult, there are just a lot of instructions to encode
and, you will want to provide support for different extended instruction sets, as well - that part will be confusing
you will also want to look at the amd material - and know what instructions are unique to the different processors
you will make a lot of friends if you make a good 64-bit assembler - lol
an assembler (or compiler, for that matter) is mostly a big text parser
providing macro support will be a big part of the task

which intel docs do you have? - i may have a couple that can help
i think the main one will be the "Intel 64 and IA-32 Architectures Software Developer's Manual" (5 PDFs, i think)
that manual fully describes instruction encoding
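
To make the encoding step concrete, here is a small Python sketch that encodes three 32-bit instructions by hand. It assumes 32-bit protected mode (so no operand-size prefix is needed) and covers only the register and immediate forms shown; the helper names are invented for this example. The opcodes used are `B8+rd` for `MOV r32, imm32`, `01 /r` for `ADD r/m32, r32`, and `C3` for `RET`.

```python
import struct

# Register numbers as they appear in the 3-bit register fields.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm(mod, reg, rm):
    """Build a ModRM byte: mod (2 bits), reg (3 bits), r/m (3 bits)."""
    return bytes([(mod << 6) | (reg << 3) | rm])


def encode_mov_reg_imm32(dst, imm):
    # MOV r32, imm32: opcode B8 + register number, then a little-endian dword
    return bytes([0xB8 + REG32[dst]]) + struct.pack("<I", imm & 0xFFFFFFFF)


def encode_add_reg_reg(dst, src):
    # ADD r/m32, r32: opcode 01, ModRM with mod=11 (register direct),
    # reg = source register, r/m = destination register
    return b"\x01" + modrm(0b11, REG32[src], REG32[dst])


def encode_ret():
    # RET (near return): single byte C3
    return b"\xC3"


# mov eax, 5 / add eax, ebx / ret
code = (encode_mov_reg_imm32("eax", 5)
        + encode_add_reg_reg("eax", "ebx")
        + encode_ret())
print(code.hex())  # b80500000001d8c3
```

A full assembler would replace these one-off functions with a table keyed on mnemonic and operand types, which is where the "a lot of instructions to encode" part comes in.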


travism

Wow, thanks a bunch for those links. Right now I'm still in the planning phase: deciding on the syntax to support and writing an outline of the assembler. I really hope this works out. I just wanted to test encoding instructions and outputting to an object file with my brainf*ck interpreter, to understand how it works before I use it in an assembler. I will have a look at all those manuals. Thank you again!

dedndave

ya know, Travis, i admire you for taking on such a large project
i would like to write a debugger/disassembler/resource viewer/pe editor/etc/etc
but, i am not ready for that, yet - maybe someday before i am too old - lol

travism

Hey, thanks. Vortex, and also Jeremy with his GoAsm assembler, have really inspired me to create my own. It will be a long process, but I will learn a lot. :)

Farabi

If I'm not mistaken, in 2005 there was some information from the Svin about writing an assembler. He wrote about bitfields.
Those who had universe knowledges can control the world by a micro processor.
http://www.wix.com/farabio/firstpage

"Etos siperi elegi"



hutch--

Travis,

I understand the task you have in mind is a large and complex one and it will take a lot of coding skill and work to get it to work properly. Much of the task will be in the PRE-CODING architecture and this will dictate how much success you will get in doing something this complicated. What I would suggest you do before you take on the doing part of the project is get a good idea of what you need to do to make something like this work.

Japheth's rewrite of JWASM is a good start, as his code is well written and very clear. Tomasz Grysztar's FASM is written directly in FASM, so you will need to be able to read the notation, but it is an excellent assembler. Some of our members know about techniques like recursive descent parsers, various temporary data storage methods and the like, and general parsing knowledge will in fact be very useful to you.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php
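
For readers unfamiliar with the recursive descent technique mentioned above, here is a minimal Python sketch that evaluates constant expressions of the kind an assembler meets in operands, such as `(2 + 3) * 4`. The grammar, class and function names are assumptions made for this example and are not taken from JWasm or FASM. Each grammar rule becomes one method, and operator precedence falls out of which method calls which.

```python
import re

# A number (hex or decimal) or any single non-space character.
TOKEN_RE = re.compile(r"\s*(?:(0x[0-9A-Fa-f]+|\d+)|(.))")


def tokenize(text):
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            base = 16 if number.lower().startswith("0x") else 10
            tokens.append(("num", int(number, base)))
        elif op.strip():
            tokens.append(("op", op))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", None)

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor ('*' factor)*
        value = self.factor()
        while self.peek() == ("op", "*"):
            self.take()
            value *= self.factor()
        return value

    def factor(self):
        # factor := number | '(' expr ')' | '-' factor
        tok = self.take()
        if tok[0] == "num":
            return tok[1]
        if tok == ("op", "("):
            value = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return value
        if tok == ("op", "-"):
            return -self.factor()
        raise SyntaxError(f"unexpected token {tok[1]!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return value


print(evaluate("(2 + 3) * 4"))   # 20
print(evaluate("0x10 + 2 * 8"))  # 32
```

Extending `factor` to look names up in a symbol table is one way label and `EQU` references could be handled with the same structure.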

travism

Thank you all for all the information; I have saved every bit of it. Hutch, thank you for responding. I know this task will be very complicated and it will take a lot of time. I am not even thinking about coding it yet, just some design principles and such. I am most confused about what comes right after the parsing, so I am trying to read a lot on that. I hope I'll soon begin to understand :/

Thank you all again

travism

Wow, so I have been reading everything under the sun about compiler technology, and I've started at the beginning: the parse tree. One thing I'm having trouble understanding: is one line at a time read in and tokenized, then sent to the parser (or whatever you choose next), then to code generation? Or is the whole file parsed first and then sent to the next stages?

I've tried looking at the FASM and JWasm sources, and I am trying to see how they store their parse trees and ASTs. But I guess there are a zillion ways to do it?
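
One of those many ways, sketched in Python: because assembly source is line-oriented, each line can be reduced to a flat record (label, mnemonic, operands) instead of a deep tree, and the whole file becomes a list of such records for later stages. The `Statement` type and the function names are assumptions for this example, not how FASM or JWasm store things. The sketch deliberately ignores string literals, segment overrides such as `es:[di]`, and macros.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    label: Optional[str] = None
    mnemonic: Optional[str] = None
    operands: List[str] = field(default_factory=list)


def parse_line(line):
    """Split one source line into label, mnemonic and operand strings."""
    text = line.split(";", 1)[0].strip()      # drop a trailing comment
    stmt = Statement()
    if ":" in text:                           # naive label detection
        label, text = text.split(":", 1)
        stmt.label = label.strip()
        text = text.strip()
    if text:
        parts = text.split(None, 1)           # mnemonic, then the rest
        stmt.mnemonic = parts[0].lower()
        if len(parts) > 1:
            stmt.operands = [op.strip() for op in parts[1].split(",")]
    return stmt


def parse_source(source):
    """Parse the whole file up front, keeping only non-empty statements."""
    statements = [parse_line(line) for line in source.splitlines()]
    return [s for s in statements if s.label or s.mnemonic]


stmts = parse_source("start: mov eax, 5 ; load\n  add eax, ebx\n\n ret\n")
for s in stmts:
    print(s)
```

Operands are left as raw strings here; a later stage would classify each one as a register, immediate or memory reference before picking an encoding.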


dedndave

the entire file
simplify everything for the next pass
i don't know if they expand includes and macros before (i.e. during) or after that pass
it seems like it would make sense to expand them as well - you could ask the RadAsm and GoAsm guys for a little direction, there
of course, a single-pass assembler would be nice, but i dunno how practical it would be to implement
i think MASM is a 3-pass assembler
they don't count the beautifier pass so it is called a 2-pass assembler (i think that's right)
all this seems rather simple until you take into account very large projects
then, you are generating temporary files and balancing memory against file-size and symbol space
also, everyone likes an assembler that is fast, but i like one that works - lol
maybe, better than trying to find documentation,
you might see if you can get the source for some assembler and browse it for ideas and concepts
i think the GNU Assembler (GAS) is open-source
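
The reason for more than one pass can be shown with a toy two-pass assembler in Python. This is only a sketch under heavy assumptions: it knows three instructions, always uses the 5-byte `E9 rel32` form of `jmp`, and says nothing about how MASM itself is structured. Pass 1 walks the whole file to assign addresses to labels; pass 2 emits bytes, by which time a forward reference such as `jmp done` can be resolved.

```python
import struct

# Fixed sizes for the only instructions this toy understands.
SIZES = {"nop": 1, "ret": 1, "jmp": 5}


def assemble(lines):
    # Pass 1: record the address of every label and every statement.
    symbols = {}
    statements = []
    address = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address
            continue
        mnemonic, _, operand = line.partition(" ")
        statements.append((address, mnemonic, operand.strip()))
        address += SIZES[mnemonic]

    # Pass 2: emit machine code now that all label addresses are known.
    out = bytearray()
    for address, mnemonic, operand in statements:
        if mnemonic == "nop":
            out += b"\x90"
        elif mnemonic == "ret":
            out += b"\xC3"
        elif mnemonic == "jmp":
            # rel32 is measured from the end of the jmp instruction
            rel = symbols[operand] - (address + SIZES["jmp"])
            out += b"\xE9" + struct.pack("<i", rel)
    return bytes(out)


program = ["jmp done", "nop", "nop", "done:", "ret"]
code = assemble(program)
print(code.hex())  # e9020000009090c3
```

Once an assembler also picks the 2-byte short form (`EB rel8`) when the target is close enough, instruction sizes depend on label distances, and that is one reason real assemblers may need to repeat passes until the sizes stop changing.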

travism

Hey dedndave, thanks for the quick reply. Yeah, I was kind of figuring it would be more practical to do it that way... I've been studying the JWasm source, but it's a bit much. I'm having an easier time understanding the FASM source code, which is so clean and fast. It's just hard trying to break it down at that low a level, lol. It kind of sucks because I'm only getting bits and pieces of what it takes to make one. I'm not putting myself under any time constraint though, so I have all the time to make one :) Thanks for your help and information.

hutch--

travis,

It's a bit to do with what target you have in mind in terms of assembler complexity. At its simplest, a bare mnemonic grinder of MASM 5 technology or lower is a far simpler task than one that adds any pseudo-high-level capacity to it. A macro engine is another layer of complexity that collectively makes the result larger and a lot harder to code.