what's up with procedure local labels and ML?!

arafel · July 04, 2005, 11:55:51 PM

I just stumbled on a strange issue I wanted to ask about. Why does ML choose the external label when assembling 'jmp the_label' rather than the local one if both labels have the same name? I have noticed that the internal label is selected only if the procedure has no parameters. Here is an example of what I mean:

Code Select

.386
.model flat, stdcall
option casemap:none

.DATA

correct_lbl	db "correct label",0
wrong_lbl	db "wrong label",0

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

Test_p	PROTO :DWORD

.CODE


Test_p PROC parameter:DWORD                ; <-- commenting out the parameter and the PROTO above
                                           ; makes ML choose the local label
	jmp    the_label
	nop
	nop
	nop

the_label:
	nop
	invoke  MessageBox,0,ADDR correct_lbl,0,0
	ret

Test_p ENDP


start:  invoke  GetModuleHandle, 0
	call    Test_p
	invoke  ExitProcess, 0

the_label:
	invoke  MessageBox,0,ADDR wrong_lbl,0,0
	invoke  ExitProcess, 0

end start

hutch-- · July 05, 2005, 05:41:40 AM

arafel,

There were a few things wrong with the layout: the code was not contained between the "start:" label and the "end start" statement. With the layout rearranged to fix this, MASM reports a duplicate label error, which is correct. The first label is in global scope, the second is local to a procedure.

Code Select


.386
.model flat, stdcall
option casemap:none

.DATA

correct_lbl db "correct label",0
wrong_lbl db "wrong label",0

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

Test_p PROTO :DWORD

.CODE

start:  

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

invoke  GetModuleHandle, 0
call    Test_p
invoke  ExitProcess, 0

the_label:
invoke  MessageBox,0,ADDR wrong_lbl,0,0
invoke  ExitProcess, 0

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

Test_p PROC parameter:DWORD                ; <-- commenting out the parameter and the PROTO above
                                           ; makes ML choose the local label
jmp    the_label
nop
nop
nop

the_label:
nop
invoke  MessageBox,0,ADDR correct_lbl,0,0
ret

Test_p ENDP

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

end start

Here is a modified version that duplicates the label only inside procedures, which has no problems at all.

Code Select


.386
.model flat, stdcall
option casemap:none

.DATA

correct_lbl1 db "correct label 1",0
correct_lbl2 db "correct label 2",0

wrong_lbl db "wrong label",0

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

 ; Test_p PROTO :DWORD
 ; Test_q PROTO :DWORD

.CODE

start:  

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

invoke  GetModuleHandle, 0
call    Test_p
call    Test_q
invoke  ExitProcess, 0

the_labelx:
invoke  MessageBox,0,ADDR wrong_lbl,0,0
invoke  ExitProcess, 0

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

Test_p PROC ;; parameter:DWORD                ; <-- commenting the parameter out and proc prototype definition
                                                             ; will make ML to choose  local label
jmp    the_label
nop
nop
nop

the_label:
nop
invoke  MessageBox,0,ADDR correct_lbl1,0,0
ret

Test_p ENDP

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

Test_q PROC ;; parameter:DWORD                ; <-- commenting the parameter out and proc prototype definition
                                                             ; will make ML to choose  local label
jmp    the_label
nop
nop
nop

the_label:
nop
invoke  MessageBox,0,ADDR correct_lbl2,0,0
ret

Test_q ENDP

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

end start

arafel · July 05, 2005, 06:27:44 AM

I thought the only purpose of 'end start' was to specify the entry point and close any open segment. I didn't know it was mandatory to place procedures between the start label and the end statement.

Rifleman · July 05, 2005, 11:32:07 PM

Arafel,
The assumption you are making is correct and there is nothing wrong with your code. Just because a rearrangement can help the assembler not make an error does not mean your code is incorrect. Hutch is wrong on this one.

Paul

hutch-- · July 06, 2005, 01:59:24 AM

Noting that debates on these issues belong in other subforums, mistakes of this type should not go uncorrected. MASM as an assembler requires that the programmer specify the code entry point and the termination of the module. In the main module that contains the entry point, this is done with a matching start label and an "end start" terminator.

Code Select


begin_label:

   ;ALL of your code goes here

end begin_label

By convention it is usually "start:" but it can be any label name.

The problem that arafel had was one of different SCOPE for two labels of the same name, combined with the entry point and exit terminator being out of order. One label is module level code and the other is procedure level code.

Now this is further compounded by the layout of a procedure with no arguments where the form,

Code Select


name proc

  ; your code

    ret

name endp

generates the code,

Code Select


name:
  ; your code
    ret

This is why a DLL coded in MASM does not require a separate "start:" label: the entry point is specified by the OS as a procedure that takes 3 x DWORD arguments, so a procedure with ANY name does the job here as long as the argument count is right.

Code Select


LibMain proc etc .....

    ; all of the code

end LibMain

is the correct code for the task.
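
To make the elided outline above concrete, here is a minimal sketch of a complete DLL module. The parameter names (hInstDLL, fdwReason, lpvReserved) are chosen for illustration and are not from the post; only the three-DWORD shape and the "end LibMain" terminator come from the description above.

```masm
.386
.model flat, stdcall
option casemap:none

.CODE

; Hypothetical parameter names; what matters is that there are three DWORDs.
LibMain PROC hInstDLL:DWORD, fdwReason:DWORD, lpvReserved:DWORD

    mov     eax, 1          ; return TRUE so the loader accepts the DLL
    ret                     ; stdcall epilogue: MASM emits "ret 12" here

LibMain ENDP

end LibMain                 ; the procedure name doubles as the entry point
```

Nothing here depends on the name LibMain. Renaming the procedure and the END operand together should work just as well.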

It is also normal to be able to write any text you like AFTER the "end start" or similarly named terminator, as the terminator tells MASM that there is no more code. It's a useful place to put notes, build info and the like, as MASM does not treat ANYTHING after the "end start" terminator as code.

Now if anyone wishes to continue this debate, it will be done in the Colosseum, as the rules of the Campus specifically prohibit debate of this type.

Rifleman · July 06, 2005, 02:27:23 AM

It would be pointless.
Paul

MichaelW · July 06, 2005, 07:15:11 AM

Not to muddy the water any further, but a backward jump to the local label will work correctly. This suggests to me that MASM is not scanning the code sequentially (as coded), and the jump is going to the first matching label found.

Code Select


Test_p PROC parameter:DWORD ; <-- commenting out the parameter and the PROTO
                            ; makes ML choose the local label
	;jmp    the_label
	jmp    @F
	nop
	nop
	nop

the_label:
	nop
	invoke  MessageBox,0,ADDR correct_lbl,0,0
	ret

@@:
	jmp    the_label

Test_p ENDP

arafel · July 06, 2005, 01:45:21 PM

Quote from: hutch-- on July 06, 2005, 01:59:24 AM
This is why a DLL coded in MASM does not require a separate "start:" label: the entry point is specified by the OS as a procedure that takes 3 x DWORD arguments, so a procedure with ANY name does the job here as long as the argument count is right.

IMHO an exe does not require a separate 'start' label either. In both exe and dll the entry point is specified by the 'end label' statement regardless of the name of 'label', so the label could be a procedure name as well.

By the way, I found that making the entry point a procedure rather than a standalone label in an exe makes ML handle the labels correctly. I guess it is more proper, then, to arrange the code as:

Code Select

.386
.model flat, stdcall
option casemap:none

.DATA

correct_lbl db "correct label",0
wrong_lbl db "wrong label",0

include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

Test_p PROTO :DWORD

.CODE


Test_p PROC parameter:DWORD          

    jmp    the_label
    nop
    nop
    nop

the_label:
    nop
    invoke MessageBox,0,ADDR correct_lbl,0,0
    ret

Test_p ENDP

entry_procedure PROC 

    call   Test_p
    invoke ExitProcess, 0

the_label:
    invoke MessageBox,0,ADDR wrong_lbl,0,0
    invoke ExitProcess, 0

entry_procedure ENDP

end entry_procedure

Rifleman · July 06, 2005, 01:50:43 PM

Michael,
Yes, it is obviously an error by the assembler. I have run into it before but just developed a habit to not reuse labels, ever, as a result.

Arafel,
Using a procedure name as the entry point is a lot safer as you cannot duplicate procedure names.

Paul

hutch-- · July 06, 2005, 03:50:16 PM

Directly from the MASM online help from masm 6.11 is the following.

Syntax: END [address]

See also: .STARTUP, .EXIT, ORG

Description:

Marks the end of a source file and optionally indicates the
program load address. The optional <address> is a label or
expression identifying where program execution begins. You can
define <address> only once in a program, usually in the main
module.

You cannot specify <address> if you have used the .STARTUP
directive, which automatically sets a start address. If you are
linking with a high-level language, the start address is
typically set by that language's compiler.

END also closes the last segment in the source file.

This is the punch line for the "start:" label.

The optional <address> is a label or expression identifying where program execution begins. You can define <address> only once in a program, usually in the main module.

At the other end is the following.

Marks the end of a source file and optionally indicates the program load address.

Translate this to current 32 bit Windows code and you get the following architecture.

Code Select


start:
  ; ALL of your code
end start

This is not a code logic issue, it's a specific parsing issue with the design of the assembler, and outside of this you fly in uncharted territory with no documentation. All you end up testing with unusual variants is undocumented bits and pieces in the assembler's parser, which may or may not work correctly and have a very good chance of messing up the documented behaviour.

Now with label names, use the tool properly and you have 2 forms of label scope, module level and procedure level. Within procedures you can routinely reuse a name in another procedure because the assembler is designed to handle them at a local level. If you name a label at a module level it cannot be reused anywhere within that module as you get a label redefinition error.

Now with the documented architecture, it's reliable and you can predictably use the documented behaviour for labels, whereas if you deviate you end up with unpredictable behaviour.

You can do it all with the published architecture.

Code Select


start:
  call library code
  call separate procedures within the start and end.
  call separate modules that are later linked into the binary file.
  write direct assembler instructions
end start
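
For the "separate modules" line in the outline above, a secondary module might look like the sketch below. The file name helper.asm and the procedure AddOne are hypothetical. Going by the help text quoted earlier, the start address is defined only once, usually in the main module, so a secondary module would close with a bare END.

```masm
; helper.asm (hypothetical secondary module, linked into the final binary)
.386
.model flat, stdcall
option casemap:none

.CODE

AddOne PROC value:DWORD     ; hypothetical helper, not from the thread
    mov     eax, value
    inc     eax             ; return value + 1 in EAX
    ret
AddOne ENDP

end                         ; no address: the entry point lives in the main module
```

The main module would then declare "AddOne PROTO :DWORD", call it with "invoke AddOne, 41", and keep its own "start:" and "end start" pair.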

I am yet to see the advantage of testing methods of creating unreliable code when it can be produced in a reliable manner according to the assembler's documentation.

Rifleman · July 06, 2005, 04:17:35 PM

Hutch,
The point is the assembler should be able to handle scoping issues and it does not. There is no refuting that. A label in a procedure should never be confused with a global label. You can show all the documentation you want but the weakness of scoping with this assembler has always been there.

Paul

MichaelW · July 06, 2005, 06:27:23 PM

From the MASM 6.0 Programmer's Guide:

Quote
Making a Scoped Label Public
MASM 5.1 considers code labels defined with a single colon inside a procedure to be local to that procedure if the module contains a .MODEL directive with a language type. Although the label is local, MASM 5.1 does not generate an error if it is also declared PUBLIC. MASM 6.0 generates error A2203:

cannot declare scoped code label as public.

If you want to make the label PUBLIC, it must not be local. You can use the double colon operator to define a non-scoped label, as shown in this example:

PUBLIC publicLabel
publicLabel:: ; Non-scoped label MASM 6.0

So starting with MASM 6.0 reusing labels in this way is officially an error, but MASM has problems detecting the error if the local label and the reference to it precede the start label. I would never have found this problem, for two reasons :)
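
A minimal sketch of the two label forms side by side, laid out the way hutch recommends, with everything between "start:" and "end start". The procedure and label names (First, Second, done, shared_exit) are made up for illustration, and the include paths assume a default MASM32 install as in the earlier listings.

```masm
.386
.model flat, stdcall
option casemap:none

include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\kernel32.lib

.CODE

start:
    call    First
    call    Second
    invoke  ExitProcess, 0

First PROC
    jmp     done            ; resolves to First's own "done"
done:                       ; single colon: scoped to First
    ret
First ENDP

Second PROC
    jmp     done            ; resolves to Second's own "done"
done:                       ; same name again, different procedure
    nop
shared_exit::               ; double colon: non-scoped, module-level label
    ret
Second ENDP

end start
```

Neither procedure takes parameters here, so this avoids the case arafel reported rather than demonstrating it.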

Rifleman · July 06, 2005, 09:04:29 PM

Michael,
Errors like that can be very frustrating, and I think they made a mistake in the coding that caused this problem and so just modified the documentation. Pretty cheesy. But it is no big deal as long as you do not make that mistake. I did once a long time ago and so never played with that again. And when documentation is written in that manner it pretty much tells you to forget talking to Microsoft about it. I hope the next version does a better job with scoping. For instance there should be no reason why a variable and a label in the same procedure cannot have the same name. This is something else they never got right. GOASM handles this properly.

Paul

hutch-- · July 07, 2005, 01:20:42 AM

Quote
The point is the assembler should be able to handle scoping issues and it does not. There is no refuting that. A label in a procedure should never be confused with a global label. You can show all the documentation you want but the weakness of scoping with this assembler has always been there.

The problem is that the design of the assembler does not have this hidden agenda. It is, in the end, an assembler, not a compiler, and it is the responsibility of the programmer to write code in the format that it is designed to work in. Before any other consideration an assembler is a source code parsing program, and if you present it with source that is not written in the published format, you can get very strange results.

As an example,

Try and write a procedure with the same reversed logic as is being argued for the "start: -- end start" directives and you end up in all sorts of trouble.

Code Select


; somewhere else in another procedure.
local_label:
; ...........

MyProc proc arglist ......
  ; ............
  jmp local_label
  ; ............
MyProc endp

Scoping issues are in the hands of the author, and the difference between module level scope and procedure level scope is clear and reliable if it's written properly.

Quote
For instance there should be no reason why a variable and a label in the same procedure cannot have the same name. This is something else they never got right.

This is simply nonsense, it is common design for a tool like MASM to use a single data structure as a symbol table for the wordlist it must evaluate and in this context, you cannot have two identical symbols. You would have to ask Jeremy but I imagine he has added additional functionality to his design that handles labels and procedure names in different data structures.

If you want really complex compiler design, look at the technical data for a JAVA compiler, the complexity level will knock you over. It does not mean that all compilers/assemblers need to adopt that design though.

Rifleman · July 07, 2005, 08:59:27 AM

So GOASM is a compiler, then? Anyway, I stand by what I have said and this issue is closed. I am a professional with a proven track record and years of producing sellable, usable code. This gives me an entitlement that you do not have.

Paul
