Title: Curiosity about data segment initialization
Post by: falcon01 on July 31, 2011, 10:49:28 AM

Hi, while reading some 16-bit code I came across these sequences:


DSEG        segment para public 'DATA'
...
DSEG        ends

...other segment definition...

assume	CS:CSEG,DS:DSEG,SS:SSEG

and

Code Select


mov ax,DSEG
mov ds,ax

Now, what's the meaning of the second sequence? I mean, in the first one (the assume line) I already said that I'm associating the DS register with the DSEG area, so why update DS again?
And why is this done just for DS? Are CS and SS different? :S

Title: Re: Curiosity about data segment initialization
Post by: MichaelW on July 31, 2011, 11:29:16 AM

The ASSUME directive sets an assembly-time association that the assembler needs to calculate addresses. The loading of segment registers is a run-time operation. The program loader sets the CS segment register to the segment address of the program's code segment (or the starting code segment if there is more than one), and loads the initial SS and SP values from the EXE header, but leaves the DS and ES segment registers set to the segment address of the Program Segment Prefix (PSP). For an EXE (but not for a .COM file), to access data through the DS segment register (the default segment register for most instructions that access memory) it must be loaded with the segment address of the program's data segment.

Title: Re: Curiosity about data segment initialization
Post by: dedndave on July 31, 2011, 11:35:13 AM

i haven't seen much of it in here, for some reason
but, it's good to save the original contents of DS or ES
that way, you may access the PSP that Michael referred to
the command line is in there, as well as pointers to the environment block, FCB's,
and even the first 2 command line parms are parsed if they resemble DOS filenames

while it's true that you cannot load a segment register directly with an immediate value,
you can store/load it directly to/from memory...

Code Select

        .DATA?

PspSeg  dw ?

        .CODE

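        mov     ax,@DATA
        mov     ds,ax               ;DS must point to the data segment before PspSeg is written
        mov     PspSeg,es           ;ES still holds the PSP segment at this point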
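
A minimal sketch of why the saved PSP segment is useful, assuming a small-model MASM EXE. It relies on the standard PSP layout, where the byte at offset 80h holds the length of the command tail. PspSeg comes from the post above; TailLen and the Start label are hypothetical names added for illustration.

```asm
        .MODEL  small
        .STACK  100h

        .DATA?
PspSeg  dw ?                        ; PSP segment, saved at startup
TailLen db ?                        ; hypothetical: length of the command tail

        .CODE
Start:  mov     ax,@DATA
        mov     ds,ax               ; DS -> program data, ES still -> PSP
        mov     PspSeg,es           ; segment register stored straight to memory

        mov     es,PspSeg           ; segment register loaded straight from memory
        mov     al,byte ptr es:[80h] ; PSP:80h = command tail length
        mov     TailLen,al

        mov     ax,4C00h            ; exit to DOS with return code 0
        int     21h
        END     Start
```

The reload of ES is redundant here (it never changed); it is only there to show the memory-to-segment-register form.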

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on July 31, 2011, 05:57:54 PM

So, the "assume" is basically for saying to the processor: "hey, at execution time I'll give you the right address of DS"?
2Dave:
Thank you for every improvement hint you give ;)

Title: Re: Curiosity about data segment initialization
Post by: FORTRANS on July 31, 2011, 06:57:17 PM

Hi,

ASSUME is to tell the assembler where you expect to find
your data so it can generate the proper address for the code
that follows the assume. And some error checking. So if you
ASSUME DS:DATASEG, you pretty much have to initialize the
DS register to actually point to DATASEG, or the code the
assembler emits will address the wrong memory at run time.

HTH,

Steve N.

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on July 31, 2011, 07:33:29 PM

Mmm...then I can't understand!
If I tell the assembler I expect to find my data at DATASEG, why do I need to set DS=DATASEG manually?
Can't the assembler do it itself after reading DS:DATASEG? Sorry but it's not very clear...

moreover I usually put the assume directive at the END of my code, right before the END directive...is it wrong?

Title: Re: Curiosity about data segment initialization
Post by: dedndave on July 31, 2011, 07:54:52 PM

it wants to know that DS is pointing to where the data is
it is really a type of "strong checking"

if you reference a label in SOMESEG, at offset 2000h....

Code Select

mov ax,SOMELABEL

it creates the following code....

Code Select

mov ax,[2000h]

for data MOV's of this type, DS is the default segment register
so, it will load AX with the contents at DS:[2000h]
the assembler does not know at assembly time what value will be in DS at runtime :P
by using ASSUME, you tell the assembler you are referring to data labels in SOMESEG
now, it is happy - lol
at runtime, DS may actually not point to SOMESEG
then, you get whatever is at DS:[2000h]

let's say we have 2 data segments, DSEG1 and DSEG2
i use ASSUME.....

Code Select

ASSUME DS:DSEG1,ES:DSEG2

and, i load the segment registers...

Code Select

        mov     ax,DSEG1
        mov     ds,ax
        mov     ax,DSEG2
        mov     es,ax

now, let's say that i want to get a value into AX from a label in DSEG2.....

Code Select

mov ax,DSEG2DATA

the assembler knows that ES is pointing to DSEG2
it also knows that DS is not pointing to DSEG2
it knows those things because we told it so with ASSUME
it generates the following code with a segment override....

Code Select

mov ax,ES:[2000h]

a segment override is required because the data is not in the default (DS) segment
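
The pieces above can be put together into one small program. This is only a sketch, assuming MASM with full segment definitions; Value1, Value2, SSEG, CSEG and Start are hypothetical names, and the data sits at offset 0 of each segment rather than 2000h.

```asm
DSEG1   SEGMENT PARA PUBLIC 'DATA'
Value1  dw 1111h                    ; offset 0 in DSEG1
DSEG1   ENDS

DSEG2   SEGMENT PARA PUBLIC 'DATA'
Value2  dw 2222h                    ; offset 0 in DSEG2
DSEG2   ENDS

SSEG    SEGMENT PARA STACK 'STACK'
        dw 64 dup(?)
SSEG    ENDS

CSEG    SEGMENT PARA PUBLIC 'CODE'
        ASSUME  CS:CSEG,DS:DSEG1,ES:DSEG2,SS:SSEG

Start:  mov     ax,DSEG1
        mov     ds,ax               ; make the ASSUME for DS true at run time
        mov     ax,DSEG2
        mov     es,ax               ; make the ASSUME for ES true at run time

        mov     ax,Value1           ; reached through DS, no prefix needed
        mov     bx,Value2           ; reached through ES, override prefix added

        mov     ax,4C00h            ; exit to DOS
        int     21h
CSEG    ENDS
        END     Start
```

A listing file should show the extra ES override prefix byte on the second load only.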

Title: Re: Curiosity about data segment initialization
Post by: FORTRANS on July 31, 2011, 08:38:55 PM

Quote from: falcon01 on July 31, 2011, 07:33:29 PM
Mmm...then I can't understand!
If I tell the assembler I expect to find my data at DATASEG, why do I need to set DS=DATASEG manually?

Well, one more time for redundancy's sake.

Code Select


DATASEG SEGMENT PUBLIC

MyData  DW      42

DATASEG ENDS
EXTRASG  SEGMENT PUBLIC
; more data...
EXTRASG  ENDS

        ASSUME  DS:DATASEG
        MOV     AX,SEG DATASEG
        MOV     DS,AX

; Now you can access MyData

        ASSUME  DS:EXTRASG
        MOV     AX,SEG EXTRASG
        MOV     DS,AX

; Now you can NOT access MyData, and if you try
; you will get an error message.

Quote
Can't the assembler do it itself after reading DS:DATASEG?

Probably it could, but it doesn't.

Quote
moreover I usually put the assume directive at the END of my code, right before the END directive...is it wrong?

If you never change a segment register, you can probably
do it that way. If it's not broken, you don't have to fix anything.
But it is better to put the ASSUME where you initialize the
segment register(s). Easier to see what is happening. At
least for me.

Cheers,

Steve N.

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on July 31, 2011, 09:14:05 PM

Ok, got it thanks to both!:D

Title: Re: Curiosity about data segment initialization
Post by: mineiro on July 31, 2011, 10:37:18 PM

When you code a routine (procedure), you open it with PROC and close it with ENDP.
Your routine lives inside some segment, so naturally you also need to open and close that segment, with SEGMENT and ENDS.
These keywords are pseudo-ops ("false operations"): they tell the assembler something, and the assembler assumes that what you tell it is true.

From your point of view, you know you can call one procedure from inside another, and you know you can have more than one segment.
And what if you want to call a procedure in another segment? You need to inform the assembler so that it can do part of the work for you. This is where pseudo-ops like public, near, far, assume ... come in.
"Assume" gives the assembler information about segments and about how we intend to use the segment registers. To understand "assume", you first need to understand labels and variable names.
Every time you create a label, like "something proc near", or a memory variable, the assembler records not only its symbolic name but also its type (byte, word, ...), its address, and the segment in which it is defined. Assume is concerned with that last piece of information.
The assembler does not automatically assume that all routines are in the same segment.

So, the pseudo code below:
assume cs:code_segment

tells the assembler that cs is pointing to the segment with the symbolic name code_segment. Without this information, the assembler just folds its arms if you try "call someplace", and you get an error message like "No or unreachable cs".
This looks strange, because cs always points to the running code segment. For cs alone we would not really need "assume", but the directive is needed because of a thing called a "segment override".
The processor generally reads data (like "mov al,some_variable") through the data segment register. But it can read that variable through another segment register (e.g. es).
This is why the assembler needs the "assume" pseudo-op: to know which segment register it should use.
That way your program can deal with multiple segments, relocations, ... .
Just one last comment: imagine you are writing a program that has data, code, stack and extra segments, so all the segment registers are in use, right? Now, the point is, how can you write to video memory (segment b800h)? You need to give up one segment register temporarily, load the video segment into it, do your work, and afterwards restore that register.
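
A minimal sketch of that last point, assuming a small-model MASM EXE running in a colour text mode where video memory starts at segment B800h. The choice of ES as the borrowed register and the Start label are illustrative.

```asm
        .MODEL  small
        .STACK  100h

        .CODE
Start:  push    es                  ; keep whatever ES was pointing to
        mov     ax,0B800h           ; colour text-mode video segment
        mov     es,ax               ; borrow ES for video memory
        mov     word ptr es:[0],0741h ; 'A' with attribute 07h in the top-left cell
        pop     es                  ; give ES back its old value

        mov     ax,4C00h            ; exit to DOS
        int     21h
        END     Start
```

The explicit es: override is what tells the assembler which segment register to use, since no ASSUME covers the video segment.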

Some parts of this post were translated by me with my poor English; if you would like a better explanation, see:
"Peter Norton's Assembly Language Book for the IBM PC", by Peter Norton & John Socha, chapter 29 and chapter 11.

Title: Re: Curiosity about data segment initialization
Post by: MichaelW on August 01, 2011, 01:11:21 AM

BTW, starting with ML 6.0 the ASSUME CS was no longer necessary.

Title: Re: Curiosity about data segment initialization
Post by: mineiro on August 01, 2011, 03:36:03 AM

Thank you, Sr. MichaelW, well noted. I admit that when I did that translation, this didn't come to mind.

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on August 01, 2011, 07:51:27 AM

A very interesting explanation mineiro, thanks to you too^^

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on August 01, 2011, 02:16:22 PM

Mmm...ok now a strange thing occurs!
If I simply remove

Code Select

assume DS:DATASEG

but I leave

Code Select

mov AX,DATASEG
mov DS,AX

everything just works...

Title: Re: Curiosity about data segment initialization
Post by: dedndave on August 01, 2011, 02:21:05 PM

that is because you are only using the data area in a simple program :bg

if you load the offset of a label...

Code Select

mov dx,offset String

the assembler does not care about the DS register
it gives you the offset relative to the beginning of the segment

however, if you use something like....

Code Select

mov ax,SomeData

the assembler wants to know that DS matches the segment or group of SomeData

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on August 01, 2011, 02:24:07 PM

Sorry, where would SomeData be? In another data segment?
I tried

Code Select


lea dx,FIRSTR

instead of using offset directive and it works fine...

Title: Re: Curiosity about data segment initialization
Post by: dedndave on August 01, 2011, 02:33:25 PM

sorry - i should have given a clearer example....
notice that, in our keyboard entry program, we do not access any data via labels directly
we load the offset into a register and address the data via register

for most programs you write, however, you will access data directly via the label...

Code Select

        .DATA

SomeData dw 5

        .CODE

        mov     ax,SomeData
        add     ax,3
        mov     SomeData,ax

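for this type of access, the assembler needs an ASSUME for DS
(with .MODEL and the simplified .DATA/.CODE directives, MASM supplies it for you; with full SEGMENT definitions, you write it yourself)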

for this type of access, it is not....

Code Select

        .DATA

SomeData dw 5

        .CODE

        mov     bx,offset SomeData
        mov     ax,[bx]
        add     ax,3
        mov     [bx],ax

Title: Re: Curiosity about data segment initialization
Post by: falcon01 on August 01, 2011, 02:50:20 PM

Sorry again, I'm the newbie here but your "assume is needed" part still works without assume...
here's my code

Code Select

;------------------------------------------------------------------
SSEG    SEGMENT PARA STACK 'STACK'
        db 128 dup(?)
SSEG    ENDS

;------------------------------------------------------------------
DSEG    SEGMENT PARA PUBLIC 'DATA'

SomeData dw 5

DSEG    ENDS

;------------------------------------------------------------------
CSEG    SEGMENT PARA PUBLIC 'CODE'

;------------------------------------------------------------------
; Main proc
;------------------------------------------------------------------
MAIN    proc far
        mov     ax,DSEG
        mov     ds,ax

        ; Program
        mov     ax,SomeData
        add     ax,3
        mov     SomeData,ax

        ; Exit program, return to OS
        mov     ah,4ch
        int     21h
MAIN    endp

CSEG    ends
;------------------------------------------------------------------
        ;ASSUME CS:CSEG,SS:SSEG,DS:DSEG <---commented
        end     MAIN

Maybe it's because I'm using emu8086 for 16-bit programming?

Title: Re: Curiosity about data segment initialization
Post by: dedndave on August 01, 2011, 04:04:46 PM

yah - most of us use MASM
some use JwAsm, GoAsm, a few others
our statements about ASSUME apply primarily to MASM

ASSUME at the end of the source file does nothing, really :P
MASM treats ASSUME in a serial manner, with regard to other text in the source

Code Select

        ASSUME  DS:DSEG1

;the assembler assumes DS is pointing to DSEG1 for any source text here

        ASSUME  DS:DSEG2

;the assembler assumes DS is pointing to DSEG2 for any source text here

        ASSUME  DS:DSEG1

;the assembler assumes DS is pointing to DSEG1 for any source text here

        ASSUME  DS:Nothing

;the assembler assumes nothing about DS for any source text here

ASSUME may also be used in other ways
for example, it may be used to tell the assembler that EBX is pointing to a particular data type or structure

Code Select

        ASSUME  EBX:Ptr BYTE

        mov     [ebx],0             ;stores a byte at [EBX]

        ASSUME  EBX:Ptr DWORD

        mov     [ebx],0             ;stores a dword at [EBX]

the default assumption about EBX is Nothing
so, without the ASSUME's, you would have to...

Code Select

        mov byte ptr [ebx],0        ;stores a byte at [EBX]
        mov dword ptr [ebx],0       ;stores a dword at [EBX]
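
The same mechanism extends to structures. A rough sketch, assuming 32-bit MASM with a flat model; MyPoint, Pt and SumXY are hypothetical names invented for this illustration.

```asm
        .386
        .MODEL  flat,stdcall

MyPoint STRUCT                      ; hypothetical structure
  x     DWORD ?
  y     DWORD ?
MyPoint ENDS

        .DATA
Pt      MyPoint <10,20>

        .CODE
SumXY   PROC
        mov     ebx,OFFSET Pt
        ASSUME  ebx:PTR MyPoint     ; EBX now "is" a pointer to MyPoint
        mov     eax,[ebx].x         ; field offset and size come from the structure
        add     eax,[ebx].y
        ASSUME  ebx:NOTHING         ; drop the assumption once EBX is reused
        ret
SumXY   ENDP
        END
```

Resetting the register to NOTHING afterwards keeps a stale assumption from silently applying to later, unrelated uses of EBX.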

Title: Re: Curiosity about data segment initialization
Post by: mineiro on August 01, 2011, 05:28:53 PM

emu8086 uses fasm

The MASM Forum Archive 2004 to 2012

Miscellaneous Forums => 16 bit DOS Programming => Topic started by: falcon01 on July 31, 2011, 10:49:28 AM