microprocessor overview

Started by t48605, March 01, 2006, 01:16:21 PM

t48605

Hi, I'm a newbie here and new to Assembly! I know people here have a lot of experience with it, so you must also know a lot about microprocessors. I'm getting acquainted with them, but I am lost among the many terms such as segments, index registers, offset registers, etc. Could you give me an easy-to-understand explanation of the components of a microprocessor? (Maybe it takes many words.)
Thanks ! :red

cman

That might take quite a few words! :bg You might check out a book on computer architecture. I bought a great book on eBay called Computer Architecture: A Quantitative Approach by Hennessy and Patterson for just a few dollars. You might want to try something like that or find a good tutorial somewhere. Good luck! :U

Mark Jones

Hi t48605, you could try the emu8086 emulator. It can teach you how to make basic DOS apps and exactly what segments, registers, interrupts, etc. are. Windows assembly is a little different (some say easier) but emu8086 would make for a nice start. For Windows assembly, I'd recommend reading the ASMINTRO.HLP file which comes with MASM32, then going through all the Iczelion tutorials. :)
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

BogdanOntanu

Honestly it might help if you study a much simpler CPU first (I suggest the 8-bit Z80). Current CPUs have evolved from such machines. They have evolved a lot and undergone many changes, but the main principles remain the same. Starting with the very complex ones might make you give up.

So let us talk about a hypothetical simple CPU just to make you understand a few points:
(please let me know if you feel like talking about the latest P4/AMD64 beasts)

The CPU is an integrated circuit. It has some internal logic (we will see what kind later on).
It also has a set of external metallic pins used to interact electrically/logically with the external world.

The external pins are:
===============
-DATA pins that serve to read from and write to MEMORY and other DEVICES. We will name those D0, D1, up to D7 for an 8-bit data bus or D0...D31 for a 32-bit data bus.
-ADDRESS pins that serve to form the address of the source or destination DEVICE to be read or written; we will name those pins A0...A15 for a 16-bit address bus.
-CONTROL pins which somehow regulate or help the whole process: examples are the RESET pin, IRQ pin, IO/MEM pin, WR/RD pin
-POWER pins that serve to power up the device (we will not talk about these)

Now the operations are as follows:
==========================
1) When the RESET pin is externally activated, or when power is first applied to the device, the CPU sets its internal INSTRUCTION POINTER to a fixed startup address. Usually this address is either ZERO or a big value near the end of the accessible address space, like F000h. From now on we will call the INSTRUCTION POINTER "IP". All values are written in HEX or binary.

The current IP value is connected to the ADDRESS BUS A0...A15 and the RD control pin is activated.
This way the CPU tells any external device attached to the bus that it wants to READ something from address ZERO.

2) Usually at the startup address there is a ROM memory that contains startup code or a BIOS.
ROM is Read Only Memory that does not lose its content when it is left without electrical power. We could assume that this ROM memory is 16 kilobytes in size and located at addresses 0000h up to 3FFFh. As said before, it contains either boot code or the BIOS or the OS itself (a simple OS of course). We could have RAM memory from 4000h up to FFFFh.

This ROM memory will place the first byte it contains on the DATA bus (D0-D7).
The CPU will read this first BYTE and by convention will consider it to be an INSTRUCTION.
This instruction will be copied from the D0-D7 pins to an internal temporary CPU register and the RD pin deactivated.
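
The memory map above can be sketched in C. This is only an illustration of what the external hardware does on a read or write cycle, assuming the 16 KB ROM at 0000h..3FFFh and RAM at 4000h..FFFFh from the example; the names bus_read and bus_write are made up for this sketch.

```c
#include <stdint.h>

#define ROM_SIZE 0x4000u   /* 16 KB of ROM at 0000h..3FFFh (assumed layout) */
#define RAM_SIZE 0xC000u   /* RAM at 4000h..FFFFh (assumed layout) */

static uint8_t rom[ROM_SIZE];  /* a real ROM would already hold the boot code */
static uint8_t ram[RAM_SIZE];

/* RD cycle: the device that owns the address places one byte on D0..D7. */
uint8_t bus_read(uint16_t address)
{
    if (address < ROM_SIZE)
        return rom[address];
    return ram[address - ROM_SIZE];
}

/* WR cycle: writes aimed at the ROM range are simply ignored. */
void bus_write(uint16_t address, uint8_t value)
{
    if (address >= ROM_SIZE)
        ram[address - ROM_SIZE] = value;
}
```

After RESET the CPU would in effect perform bus_read(0) and treat the returned byte as its first instruction.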

3) Now the CPU will decode the value of this OPCODE BYTE and based on its contents will decide what to do next.
The CPU is basically a huge STATE MACHINE that reacts to INPUT DATA and PREVIOUS STATES. Of course at startup the PREVIOUS STATE is NULL.
So the CPU could do one of 2 things:
a) Decide that the OPCODE is sufficient and can be EXECUTED immediately as it is
or
b) Decide that it needs more DATA/information in order to perform its TASK

In the second case it will increment IP and reactivate RD with the new address (ADDRESS+1), requesting the ROM to present more BYTES to the CPU.
This is what is called an INSTRUCTION FETCH cycle. It can be short or longer.

For the sake of simplicity we will choose path a) --> the CPU decides that the OPCODE BYTE is quite sufficient for now.

Tips on internal structure
Now you can already see that the CPU has a state machine and an instruction DECODER inside, as well as a few temporary registers or buffers.
Clearly it needs one for the OPCODE read from the ROM, and if you are smart enough you have guessed that it might also need one buffer at each bus (ADDRESS and DATA).
Besides, there is an IP register inside that keeps track of the current instruction location in ROM.

What else could there be inside the CPU?

Well, more registers of course. For example our hypothetical simple CPU could have 8 registers for general usage.
Let us name them R0, R1... R7.

Operation code  Encodings
So what would the OPCODE tell the CPU to DO?

Well, it depends on the people who designed the CPU and its internal STATE MACHINE / INSTRUCTION DECODER.
But since we are the creators of this simple CPU, let us decide this:

The OPCODE could be divided like this:
===================================
- 2 bits for the OPERATION CODE (most significant bits)
- 3 bits for the DESTINATION register
- 3 bits for the SOURCE register (least significant bits)

We choose the Operation code like this:
00 - ADD
01 - SUB
10 - XOR
11 - EXTENSION

Registers are encoded like this:
R0 = 000
R1 = 001
R2 = 010
...
R7 = 111

This way IF the OPCODE BYTE in ROM is 10_001_100
it actually means :

XOR R1, R4 because 10=XOR, 001=R1=DESTINATION, 100=R4=SOURCE
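
A minimal C sketch of this field split, assuming the 2/3/3 bit layout just described. The struct and function names are invented for the illustration.

```c
#include <stdint.h>

enum operation { OP_ADD = 0, OP_SUB = 1, OP_XOR = 2, OP_EXTENSION = 3 };

struct decoded {
    enum operation op;   /* top 2 bits */
    unsigned dest;       /* middle 3 bits: destination register number */
    unsigned src;        /* low 3 bits: source register number */
};

/* Split one opcode byte into its three fields with shifts and masks. */
struct decoded decode(uint8_t opcode)
{
    struct decoded d;
    d.op   = (enum operation)((opcode >> 6) & 0x3);
    d.dest = (opcode >> 3) & 0x7;
    d.src  = opcode & 0x7;
    return d;
}
```

Calling decode(0x8C) on the byte 10_001_100 gives op = OP_XOR, dest = 1 and src = 4, that is XOR R1, R4.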

Of course, since the values of R0..R7 are zero or undefined when the CPU starts, this is not a very intelligent start-up for a BIOS :P
Besides, the OPCODE encoding has some problems already... there is too little room for adding more operations

But I guess you will get to the point where you can design a better CPU  ;)

So let us assume that the CPU is blindly XOR'ing its internal registers and storing the result of the operation inside R1

Now: what next?
================
Well, the CPU's internal state machine will notice that all the work for this INSTRUCTION is done, and as a consequence it will INCREMENT the IP register.
In fact it could have done this at the SAME time as the opcode decode phase and so exhibit a primitive "pipeline".

And again the RD pin will get activated and a new instruction will be read from the ROM.
The process repeats itself forever. This is ALL that a basic CPU does:
-read opcodes from ROM or RAM memory,
-decode them and decide what to do next,
-execute the resulting actions and then go for the next opcode fetch.
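
The three steps above can be sketched as one C function. This is a rough model only: it assumes 8-bit registers (the thread does not fix their width), leaves out the flags, and does not handle the "11" extension codes. The names cpu and step are hypothetical.

```c
#include <stdint.h>

struct cpu {
    uint16_t ip;     /* instruction pointer */
    uint8_t  r[8];   /* R0..R7, assumed to be 8 bits wide */
};

/* Run one instruction. Returns 1 if it was executed,
   0 if it was an extension opcode this sketch does not model. */
int step(struct cpu *c, const uint8_t *memory)
{
    uint8_t opcode = memory[c->ip];       /* fetch */
    c->ip++;                              /* point at the next instruction */

    unsigned op   = opcode >> 6;          /* decode */
    unsigned dest = (opcode >> 3) & 7;
    unsigned src  = opcode & 7;

    switch (op) {                         /* execute */
    case 0: c->r[dest] = (uint8_t)(c->r[dest] + c->r[src]); break; /* ADD */
    case 1: c->r[dest] = (uint8_t)(c->r[dest] - c->r[src]); break; /* SUB */
    case 2: c->r[dest] ^= c->r[src];                        break; /* XOR */
    default: return 0;                    /* 11 = extension, not modeled */
    }
    return 1;
}
```

Calling step in a loop is the whole life of this CPU; the caller has to make sure memory is large enough for every address IP can reach.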

Program flow control
To control the process, some instructions will have to be IFs and JUMPS around in ROM.
Otherwise the CPU will run past the end of the ROM and the program will go crazy.
JUMPS could be based on TESTing some FLAGS set up by the CPU after ARITHMETIC operations or LOGICAL operations like our XOR presented above.

More deductions about the internals
So now you get the feeling that we will need some extra registers or bits called FLAGS inside the CPU.

Let us say that we have 2 FLAGS
=======================
01 bit = CARRY FLAG that is automatically set to 1 by the CPU when an operation overflows
10 bit = ZERO FLAG that is automatically set to 1 by the CPU when the destination is ZERO
Those FLAGS might represent the bits of yet another CPU internal register ... More about them later...
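
As a small illustration, the two flags can be modelled in C as bit masks inside one 8-bit flags register. The masks follow the 01 / 10 values above; the helper names are invented for this sketch.

```c
#include <stdint.h>

#define FLAG_CARRY 0x01u   /* binary 01 */
#define FLAG_ZERO  0x02u   /* binary 10 */

/* Set or clear the ZERO bit according to a result, leaving other bits alone. */
uint8_t update_zero_flag(uint8_t flags, uint8_t result)
{
    if (result == 0)
        return (uint8_t)(flags | FLAG_ZERO);
    return (uint8_t)(flags & ~FLAG_ZERO);
}

/* Test one flag bit; returns 1 if it is set, 0 otherwise. */
int flag_is_set(uint8_t flags, uint8_t mask)
{
    return (flags & mask) != 0;
}
```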


JUMPS are of the essence

Let us return to JUMPS:
A JUMP can simply be made by an instruction that directly changes the IP register instead of changing the R0...R7 registers.

So let us extend our CPU with a JMP instruction. Notice how I left operation code 11 as an extension?
Let us use it by defining the fields like this in the "11" case:

2 bits = fixed "11"
3 bits = extended instruction code
3 bits = register or immediate depending on the above extended code

Let us say that we define the 3-bit extended instruction code like this:

000 JMP [here + offset]
001 JMP [here - offset]

010 JC [here + offset]
011 JC [here - offset]

100 JZ [here + offset]
101 JZ [here - offset]

110  JMP Register

111  instruction extension

So with this encoding if the next INSTRUCTION in ROM is: 11_101_001
It will mean Jump if ZERO here - offset=1 so JMP $-1

In fact IF the IP was already incremented and IF the CPU registers start with zero values...
THEN the XOR would have set the ZERO FLAG and the above instruction would loop forever at the same location.

The internal OPCODE decoder and state machine of the CPU would recognize the extension code "11"
and decode the new opcode "101" ... by now it will know that the last 3 bits are in fact an offset to be SUBTRACTED from IP and act accordingly.
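
Here is a hedged C sketch of that jump decoding. It assumes IP has already been incremented past the jump byte, and it leaves the 110 (JMP Register) and 111 (further extension) codes unmodeled. The function name next_ip is made up.

```c
#include <stdint.h>

/* Return the IP to use after an instruction byte, given the current
   (already incremented) IP and the two flags as 0/1 values. */
uint16_t next_ip(uint8_t opcode, uint16_t ip, int carry, int zero)
{
    unsigned ext    = (opcode >> 3) & 0x7;   /* extended instruction code */
    unsigned offset = opcode & 0x7;          /* 3-bit jump distance */
    int taken;

    if ((opcode >> 6) != 0x3)
        return ip;                           /* not an extension opcode */

    switch (ext >> 1) {
    case 0:  taken = 1;     break;           /* 000, 001: JMP */
    case 1:  taken = carry; break;           /* 010, 011: JC  */
    case 2:  taken = zero;  break;           /* 100, 101: JZ  */
    default: return ip;                      /* 110, 111: not modeled here */
    }
    if (!taken)
        return ip;                           /* condition false: fall through */

    /* the lowest bit of the extended code selects the direction */
    return (ext & 1) ? (uint16_t)(ip - offset) : (uint16_t)(ip + offset);
}
```

For the byte 11_101_001 (E9h) stored at address 0001, IP is 0002 after the fetch, so with ZERO set the result is 0001 again: the instruction jumps to itself.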

Our first program

The ASM program would look like this:

ADDR  CODE  Operation
====  ====  ================
0000  8C    XOR R1,R4
0001  E9    JZ $-1       ; could not help myself and made E9 a JMP :P

The job of converting the nice "mnemonics" into the 8C, E9 hex bytes is done by a program called an "Assembler".
A very simple one... but still..
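
The core of such an assembler, once the text has been parsed, is just packing the fields into a byte. A minimal C sketch under the encoding defined in this post; encode_alu and encode_jump are hypothetical names, and parsing the mnemonic text is left out.

```c
#include <assert.h>
#include <stdint.h>

/* op: 0 = ADD, 1 = SUB, 2 = XOR; dest and src: register numbers 0..7 */
uint8_t encode_alu(unsigned op, unsigned dest, unsigned src)
{
    assert(op < 3 && dest < 8 && src < 8);
    return (uint8_t)((op << 6) | (dest << 3) | src);
}

/* ext: extended code 0..5 (the JMP/JC/JZ table); offset: 0..7 */
uint8_t encode_jump(unsigned ext, unsigned offset)
{
    assert(ext < 6 && offset < 8);
    return (uint8_t)(0xC0u | (ext << 3) | offset);   /* C0h = fixed "11" prefix */
}
```

With these, encode_alu(2, 1, 4) yields 8Ch for XOR R1,R4 and encode_jump(5, 1) yields E9h for JZ $-1, matching the table above.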


Summary

You have seen some basic internal CPU architecture and a basic mode of operation.
We have even designed the encoding of the new ASM language, which will in turn dictate the electronics of the state machine.
And we have encoded our first ASM program for this "virtual" CPU

Now you should be able to improve on my "hypothetical" CPU design...
By showing how and what you would improve --> you show that you understood

Let me know if you understand ... and if you want me to continue  ;)

Any questions are welcome ...

 
Ambition is a lame excuse for the ones not brave enough to be lazy.
http://www.oby.ro

t48605

Well, thanks, I will set aside time to read your post ... but I wonder: Assembly runs on DOS, so I think it is very dangerous to write Assembly code (especially for a newbie). How can I overcome this drawback???

BogdanOntanu

Modern Assembly is NOT run in DOS anymore.

It is run in Windows or Linux or custom OSes, as you can see from this forum.
After all, it is a programming language that has advantages like speed, power and full control, and disadvantages like more lines to write.

You can take for example my RTS game HostileEncounter (www.oby.ro), which is a full 32-bit ASM application and still designed for Windows and DirectX interfaces.

It is not dangerous by itself, and modern OSes like Windows will not allow you to do dangerous things anyway ... well, no more dangerous than other languages like C/C++ etc. Just take care, ignore malware and direct hardware access for a while and you should be safe... start with making a simple window ;)

t48605

Quote
000 JMP [here + offset]
001 JMP [here - offset]

010 JC [here + offset]
011 JC [here - offset]

100 JZ [here + offset]
101 JZ [here - offset]

110  JMP Register

111  instruction extension

So with this encoding if the next INSTRUCTION in ROM is: 11_101_001
It will mean Jump if ZERO here - offset=1 so JMP $-1

Could you explain this part in more detail (for example, the $ sign ...)? Thanks!!!

BogdanOntanu

Hi,

The "$" sign in assembler usually represents "here".
So $ = "here" = current value of the IP register.

JUMPS can be towards an absolute address or relative to the current IP, or in ASM notation "$ + some_offset_here".
The advantage of relative jumps is that the code will run even if it is located at another address in ROM/RAM memory.
Also the jump distance can be encoded in a smaller number of bits.

I have exaggerated on purpose here by encoding the jump offset in only 3 bits, so this gives a maximum jump offset of 7 bytes away :D

Also please observe that the IP is assumed to be already incremented by the CPU at the time the jump offset is decoded.
As a consequence a loop that does:
while_1: JMP while_1
is actually a:
while_1: JMP (while_1 + 1) - 1 and since while_1 + 1 = $ it is actually: JMP $-1

Of course that would have been "unconditional", while in our case it was conditioned by the status of the ZERO flag.
So I made a small typo: it should have been: JZ $-1 not JMP $-1 :D

The offset of the jump is clearly 001 = 1 and the code 11_101 means the offset is to be considered negative (aka backwards)
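
The arithmetic an assembler would do to turn a label into such an offset can be sketched in C. This assumes, as above, that "$" is the address right after the one-byte jump and that only 3 bits are available for the distance; relative_offset is a name invented for the sketch.

```c
#include <stdint.h>

/* Compute the distance and direction of a relative jump.
   Returns 1 if the target is reachable with a 3-bit offset, 0 otherwise. */
int relative_offset(uint16_t jump_address, uint16_t target,
                    unsigned *offset, int *backwards)
{
    int here = jump_address + 1;          /* "$": IP after fetching the jump */
    int distance = (int)target - here;

    *backwards = distance < 0;            /* selects the "here - offset" code */
    *offset = (unsigned)(distance < 0 ? -distance : distance);
    return *offset <= 7;                  /* must fit in 3 bits */
}
```

For while_1: JMP while_1 the jump address and the target are equal, so the distance is -1: a backwards jump with offset 1, which is exactly $-1.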


t48605

Quote
01 bit = CARRY FLAG that is automatically set to 1 by the CPU when an operation overflows
10 bit = ZERO FLAG that is automatically set to 1 by the CPU when the destination is ZERO

Thanks, I understand what you say ... basically!!!
Could you explain the quote above more clearly: what exactly do ZERO and OVERFLOW mean here???

BogdanOntanu

"when the destination is ZERO" means simply that. When all of the bits of the destination register (or memory) operand are zero, THEN the ZERO FLAG is set to 1.
OVERFLOW means that either an ADDITION or a SUBTRACTION would generate an extra bit but there is no room for it in the destination register. This is because the registers have a limited width (8 bits, 16 bits, 32 bits or 64 bits depending on the hardware).

For example, on an 8-bit register containing the value 255 decimal = 0xFF hex, adding 1 to it will overflow and set the CARRY flag, and because the result is actually zero it will also set the ZERO FLAG ;)

Also, subtracting 5 from a register that contains the value 3 will set the CARRY flag
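
Both examples can be checked with a small C model of an 8-bit add and subtract. It is a sketch that assumes CARRY doubles as the borrow indicator on subtraction, as described here; the names alu_result, add8 and sub8 are made up.

```c
#include <stdint.h>

struct alu_result {
    uint8_t value;   /* what ends up in the 8-bit destination register */
    int carry;       /* 1 if a 9th bit was produced (add) or a borrow was needed (sub) */
    int zero;        /* 1 if all 8 result bits are zero */
};

struct alu_result add8(uint8_t a, uint8_t b)
{
    unsigned wide = (unsigned)a + b;      /* compute with room for the extra bit */
    struct alu_result r;
    r.value = (uint8_t)wide;              /* keep only the low 8 bits */
    r.carry = wide > 0xFF;
    r.zero  = r.value == 0;
    return r;
}

struct alu_result sub8(uint8_t a, uint8_t b)
{
    struct alu_result r;
    r.value = (uint8_t)(a - b);           /* wraps around modulo 256 */
    r.carry = a < b;                      /* a borrow was needed */
    r.zero  = r.value == 0;
    return r;
}
```

Here add8(255, 1) gives value 0 with both carry and zero set, and sub8(3, 5) gives value 254 with carry set and zero clear.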

t48605

Quote from: BogdanOntanu on March 03, 2006, 07:20:19 AM
For example, on an 8-bit register containing the value 255 decimal = 0xFF hex, adding 1 to it will overflow and set the CARRY flag, and because the result is actually zero it will also set the ZERO FLAG ;)



255 decimal means 11111111; add 1 to it and you get 100000000.
Suppose that R0 contains 00000000 and R4 contains 00000001 (ADD means 00, as you said!),
so the instruction is 00_000_001 :)
After the CPU executes this instruction the value in R0 is 100000000: overflow and zero ...
and you say it will set the Carry flag and also the Zero flag???
But I know that the CPU processes instructions one by one ... so I want to ask where this "setting" happens ... and what is the CPU's next instruction???
I also wonder about your last statement:
Quote
Also subtracting 5 from a register that contains the value 3 will set the CARRY flag
??? Thanks !

BogdanOntanu

Quote
255 decimal means 11111111 , add 1 to it , take to 100000000

NO, if the register is just 8 bits wide there is no room left for storing the extra 1 after the addition.
Actually that extra 1 goes into the CARRY flag. In a way it is a kind of 9th-bit extension of the register.

Because all the other 8 bits of the register become 0, the CPU also sets the ZERO flag.
This is done by internal electronic logic. For example, one way to do this is to use an 8-input OR gate and then a NOT gate to invert the signal.

3 - 5 is obvious: the electronics inside the ALU (Arithmetic Logic Unit) will detect the "borrow" and set the CARRY flag.

Some operations, like setting the ZERO flag, can be performed at the same time as the main operation if they do not require a state transition or the delay is not too big.
One OR gate and one NOT gate will not insert a huge delay.
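
The gate arrangement can be imitated in C, bit by bit. This is only a software picture of the OR-then-NOT idea and of the "9th bit"; real hardware evaluates all inputs in parallel. The function names are invented.

```c
#include <stdint.h>

/* ZERO flag: OR the eight result bits together, then invert. */
int zero_flag(uint8_t value)
{
    int any_bit_set = 0;
    for (int bit = 0; bit < 8; bit++)
        any_bit_set |= (value >> bit) & 1;    /* the 8-input OR gate */
    return !any_bit_set;                      /* the NOT gate */
}

/* CARRY flag on addition: the bit that does not fit in 8 bits. */
int ninth_bit(uint8_t a, uint8_t b)
{
    return (int)((((unsigned)a + b) >> 8) & 1);
}
```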

However, normally operations are executed on CLOCK transitions. One of the control pins is CLOCK.
So let us say that for an addition like ADD R0, R1 it could go like this:
Clock   Operations
1 --- Instruction is read from ROM memory (additional wait clocks can be inserted if the memory device is slow); this is a FETCH CYCLE
2 --- Instruction is decoded internally, IP=IP+1 (2 operations done at the same time)
3 --- R0 = R0 + R1 inside ALU, also CARRY and ZERO are set now --> the ADD instruction is now done
4 --- IP is put on the ADDRESS bus and once again a FETCH cycle starts, instruction at IP is read from ROM memory
5 --- New instruction is decoded internally ... here we go again ...
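
The clock table above can be sketched as a three-phase state machine in C, where each call stands for one clock. It is a simplification that only models ADD, ignores wait states, and assumes 8-bit registers; the names machine and clock_tick are hypothetical.

```c
#include <stdint.h>

enum phase { FETCH, DECODE, EXECUTE };

struct machine {
    enum phase phase;    /* what the next clock will do */
    uint16_t ip;
    uint8_t  opcode;     /* temporary register holding the fetched byte */
    unsigned dest, src;  /* decoded register numbers */
    uint8_t  r[8];       /* R0..R7, assumed 8 bits wide */
    int carry, zero;
};

/* Advance the machine by one clock transition. */
void clock_tick(struct machine *m, const uint8_t *rom)
{
    switch (m->phase) {
    case FETCH:                           /* clock 1: read the instruction */
        m->opcode = rom[m->ip];
        m->phase = DECODE;
        break;
    case DECODE:                          /* clock 2: decode and IP = IP + 1 */
        m->dest = (m->opcode >> 3) & 7;
        m->src  = m->opcode & 7;
        m->ip++;
        m->phase = EXECUTE;
        break;
    case EXECUTE:                         /* clock 3: ALU work and flags */
        if ((m->opcode >> 6) == 0) {      /* only ADD is modeled here */
            unsigned sum = (unsigned)m->r[m->dest] + m->r[m->src];
            m->r[m->dest] = (uint8_t)sum;
            m->carry = sum > 0xFF;
            m->zero  = m->r[m->dest] == 0;
        }
        m->phase = FETCH;                 /* clock 4 starts the next fetch */
        break;
    }
}
```

Three calls to clock_tick complete one ADD; the fourth call is already the fetch of the following instruction.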

t48605

Quote
3 --- R0 = R0 + R1 inside ALU, also CARRY and ZERO are set now --> the ADD instruction is now done
...
5 --- New instruction is decoded internally ... here we go again ...

So what is the ultimate purpose of the Carry and Zero flags? It seems as if they are set for nothing
and do not affect the next instruction???
Thanks!!!

BogdanOntanu

CARRY and ZERO (and other flags in real CPUs) represent the status of the last arithmetic or logical operation.
They will be taken into consideration by the next instruction only if that instruction is conditional, like:
JZ, JC, JNZ, JNC, or an arithmetic operation that depends on them, like ADC (Add with Carry)

So the FLAGS are ALWAYS set but used only by some instructions. And if you take care not to destroy them, you can use the FLAGS even a few instructions later.
For example on the Intel x86 architecture:
-MOV instructions do not affect CARRY.
-INC or DEC instructions do not affect CARRY but they do affect the ZERO flag.

So you could use an instruction that depends on CARRY or ZERO after a few MOV instructions

One typical usage of the flags is to make LOOPS and IFs
For example one could do this:


MOV R0,77h   ; initial random value
MOV R1,13h   ; value used to shuffle things a little
MOV R7,137h   ; counter initial value
@@loop1:
  ADD R0,R0   ; actually does a SHL R0,1
  XOR R0,R1   ; erratic change of bits
  DEC R7        ; decrement counter
  JNZ  @@loop1  ; IF counter!=0 goto @@loop1


This will shuffle the value of R0 in some hopefully erratic ways 137h times before finishing.
The real functionality is irrelevant, and I have used some instructions that we have not (yet) defined for our hypothetical CPU.

However I just wanted to show how you can use the ZERO flag and the JNZ instruction to perform a LOOP:
-DEC R7 makes R7 = R7 - 1 and sets the ZERO flag to either 1 or 0 depending on the new value of R7
-The JNZ @@loop1 instruction after it will check the persistent value of the ZERO flag and determine IF the JUMP is to be performed or NOT.
-When the jump is not performed --> the loop has finished.

This is similar to the following C high-level code:

x = 0x77;
s = 0x13;
for(i=0x137; i>0; i--)
{
  x = x << 1;
  x = x ^ s ;
}

where R7 = i , R0 = x, R1= s .
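
Strictly speaking, DEC followed by JNZ tests the counter at the bottom of the loop, so a closer C rendering is a do-while. A sketch, assuming 16-bit registers so that 137h fits in the counter; the function name and the iteration count output are additions made for this illustration.

```c
#include <stdint.h>

/* x plays the role of R0, s of R1 and counter of R7.
   *iterations reports how many times the loop body ran. */
uint16_t shuffle_loop(uint16_t x, uint16_t s, uint16_t counter,
                      unsigned *iterations)
{
    *iterations = 0;
    do {
        x = (uint16_t)(x + x);   /* ADD R0,R0 (same as a shift left by 1) */
        x ^= s;                  /* XOR R0,R1 */
        counter--;               /* DEC R7: ZERO is set when this reaches 0 */
        (*iterations)++;
    } while (counter != 0);      /* JNZ @@loop1 */
    return x;
}
```

Started with counter = 0x137 the body runs 0x137 times, the same as the for loop. The two forms differ only for a counter of 0: the for loop would not run at all, while this form would wrap around and run 65536 times.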



t48605

@@loop1:
  ADD R0,R0   ; actually does a SHL R0,1
  XOR R0,R1   ; erratic change of bits
  DEC R7        ; decrement counter
  JNZ  @@loop1  ; IF counter!=0 goto @@loop1


What happens if R0's value overflows? What do you think about that???