static linking, dynamic linking, and reduce size of exe

Started by abitlater, February 18, 2008, 03:05:50 PM


abitlater

Hello,

I'm a noob, but the forum description said it would be safe ... :)

I've been going through a book that teaches assembly with masm and MS Vis Studio - all project settings are pre-configured.  Now I want to do it all myself from the command line so I will know what I am doing!

I don't know how to determine whether a library is statically or dynamically linked. Supposedly the book's library, included in all my Visual Studio projects thus far, is statically linked. But how does one know? Is whether a library will be linked statically determined by the library itself? I mean, I was thinking that LIB files were all statically linked, but I don't know. And I'm confused as to why I need to include kernel32.lib if I'm calling functions from kernel32.dll.

Also, how can I make my files smaller?  The following shell asm file produces an exe with 2,560 bytes.  That seems large.  Looking at it in LordPE, the sections are too big for this simple file - can the size of sections be changed in the assembly code?  What are other things that can be done to make files smaller?

TITLE one   (one.asm)
.686P
.model flat, stdcall
.stack 4096
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
mystring BYTE "hello world", 0dh, 0ah

.code
main PROC
    mov eax, offset mystring
    push 0
    call ExitProcess
main ENDP
END main



ToutEnMasm

Hello,
A static link to a DLL has to be recorded somewhere. The library is the object that holds this link: it supplies the reference to the function's address.
A dynamic link doesn't know the address in the DLL and doesn't use a library. You have to look up where the function is in the DLL yourself.
Here is a sample:
Quote
   invoke LoadLibrary,SADR("ntdll.dll")
   mov Hlibrairie,eax
   invoke GetProcAddress,Hlibrairie,SADR("fabs")
Dynamic linking is useful when DLLs are updated often. It avoids having to rebuild the program against the new library.




MichaelW

Run-Time Dynamic Linking is also useful if there is a possibility that the DLL may not be present and you want to present the (possibly ignorant) user with an error message that is more meaningful than the system error message, for example: "The dynamic link library WINIO.dll could not be found on the specified path...", followed by the current directory and the path string.
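
As a rough illustration of that idea, here is a minimal sketch of run-time dynamic linking with its own error message. It assumes the MASM32 include files are installed under \masm32; the export name "SomeFunction" and the message text are hypothetical placeholders, not something WINIO.dll is known to provide.

```asm
.686
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\user32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib

.data
szDll   db "WINIO.dll", 0
szProc  db "SomeFunction", 0        ; hypothetical export name
szErr   db "WINIO.dll or its SomeFunction export could not be found.", 0
szTitle db "Error", 0

.data?
hLib    dd ?
pProc   dd ?

.code
start:
    invoke LoadLibrary, ADDR szDll
    test eax, eax
    jz  report_error                ; DLL not found: show our own message
    mov hLib, eax

    invoke GetProcAddress, hLib, ADDR szProc
    test eax, eax
    jz  unload_and_report           ; DLL loaded, but the export is missing
    mov pProc, eax
    ; push the arguments the function expects, then "call pProc"

    invoke FreeLibrary, hLib
    invoke ExitProcess, 0

unload_and_report:
    invoke FreeLibrary, hLib
report_error:
    invoke MessageBox, NULL, ADDR szErr, ADDR szTitle, MB_OK
    invoke ExitProcess, 1
end start
```

Note that LoadLibrary, GetProcAddress and MessageBox are themselves resolved at load time through kernel32.lib and user32.lib; only the WINIO.dll function is looked up at run time.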
eschew obfuscation

abitlater

maybe you guys assumed I was asking a smarter question than I really am :)

to use loadlibrary, I have to call a function from the kernel32.dll (dynamic link library).  When I include \masm32\kernel32.lib in the linker options (or in the asm file as "includelib \masm32\lib\kernel32.lib"), am I linking that function in statically?... so it becomes part of my exe and all calls to it within my exe are already resolved?

Let's say I want to make my own library for functions I have written, and I want it to be a static library.

First, I don't know how to do that! 
Second, how would I link it in - in the same way as I'd link in a dll? 

What determines whether the library code is copied into my file during linking, or whether the linker leaves it out (leaving it for the loader)?

What is actually happening in the assemble / link process when I have a line like this in my asm file: "extern LoadLibrary@4:PROC" (or "LoadLibrary PROTO :DWORD")?

yes, I'm slow :)



Vortex

abitlater,

To make your executable smaller:

i) Use Pelle's linker, Polink, which comes with the Masm32 package; it creates smaller executables.
ii) Merge the sections so that the final executable has a single section:

\masm32\bin\ml /c /coff Test.asm
\masm32\bin\polink /SUBSYSTEM:WINDOWS /MERGE:.data=.text Test.obj


The size of the executable drops from 2560 bytes to 1024 bytes.

Tedd

And to answer your actual question... :lol

You'll almost always be linking dynamically. Where you see 'include/includelib' for kernel32/user32/gdi32/etc, your function calls are bound to the relevant DLLs, so there are no extra modules/code attached to your exe.
Lib files can be created for static linking, but these will mostly be libraries you create yourself. All of the ones for system DLLs are dynamic. (Yes, it's a property of the lib file itself.)

The 'large' size of your exe is mainly due to padding added to the sections - as you've noticed. Unfortunately that's mainly down to a requirement of the PE format, so although you 'can' get around it (i.e. reduce the padding), your exe becomes technically invalid and may or may not run under any particular Windows version.
The alternatives are to either tell the linker to merge the code and data sections (as Vortex mentioned), which effectively puts the data into the code section, but then you may run into protection problems if you try to modify that data, and your program will crash for no apparent reason. The other is to place the data into a '.const' section instead, which is already used as part of importing the external functions, so it avoids creating an extra section just for the small amount of data; however, you still can't write into this data, so as soon as you need initialised writable data you'll need to use a '.data' section again. (See also the '.data?' section, for uninitialised writable data.)
Short version: don't worry about it :P If your app is so small that it's important, place the constant data into a .const section (instead of a .data section).
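
For illustration, here is a minimal sketch of the .const suggestion applied to the shell program from the first post. The only change is the section directive (plus a terminating zero on the string, which is an addition of mine); how much the file actually shrinks depends on the linker and is not claimed here.

```asm
TITLE one   (one.asm)
.686P
.model flat, stdcall
.stack 4096
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.const                              ; read-only data, instead of .data
mystring BYTE "hello world", 0dh, 0ah, 0

.code
main PROC
    mov eax, offset mystring        ; reading the string is fine
    ; writing to mystring here would fault, since the section is read-only
    invoke ExitProcess, 0
main ENDP
END main
```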
No snowflake in an avalanche feels responsible.

abitlater


MichaelW

Ok, this time I'll try to actually answer the question(s). At least normally, for DLLs you have Run-Time Dynamic Linking and Load-Time Dynamic Linking.

The best simple description of static linking I could find was "In the case of static linking, the linker gets all of the referenced functions from the static link library and places it with your code into your executable file."

This simple source:

    .486
    .model flat, stdcall
    option casemap :none
    include \masm32\include\masm32.inc
    include \masm32\include\kernel32.inc
    includelib \masm32\lib\masm32.lib
    includelib \masm32\lib\kernel32.lib
    .data
      buffer db 100 dup(0)
    .code
start:
    invoke dwtoa, 12345678, ADDR buffer
    invoke szLen, ADDR buffer
    invoke Sleep, 3000
    invoke ExitProcess, 0
end start


Utilizes the dwtoa and szLen procedures from the MASM32 library and calls the Sleep and ExitProcess functions from kernel32.dll. Since the MASM32 library is a static library the functions are statically linked, as described above. For the kernel32.dll functions load-time dynamic linking is used. I'm sure others here can explain load-time dynamic linking in detail, but I can only provide the short form. Basically, the object module is linked with the import library kernel32.lib, and the linker adds information to the EXE that the system will use at load time to load kernel32.dll, or at least the necessary parts of it, into the EXE process's address space, and code that the process will use to call the functions.

If you build the EXE and then disassemble it using DumpPE, in the list of imports you will see:

Imp Addr Hint Import Name from kernel32.dll - Not Bound
-------- ---- ---------------------------------------------------------------
00002000  260 Sleep
00002004   80 ExitProcess


Here is the code that calls the functions:

00401019 68B80B0000             push    0BB8h
0040101E E8C9000000             call    fn_004010EC
00401023 6A00                   push    0
00401025 E8BC000000             call    fn_004010E6


And at the end of the listing, a jump table that is used to transfer execution to the functions.

004010E6                    fn_004010E6:
004010E6 FF2504204000           jmp     dword ptr [ExitProcess]
004010EC                    fn_004010EC:
004010EC FF2500204000           jmp     dword ptr [Sleep]


As described above, the code for the procedures from the MASM32 library is placed directly into the EXE. Here is the code that calls dwtoa and szLen:

00401000 6800304000             push    403000h
00401005 684E61BC00             push    0BC614Eh
0040100A E821000000             call    fn_00401030
0040100F 6800304000             push    403000h
00401014 E887000000             call    fn_004010A0


And here is the code for dwtoa:

00401030                    fn_00401030:
00401030 55                     push    ebp
00401031 8BEC                   mov     ebp,esp
00401033 53                     push    ebx
00401034 56                     push    esi
00401035 57                     push    edi
00401036 8B4508                 mov     eax,[ebp+8]
00401039 8B7D0C                 mov     edi,[ebp+0Ch]
0040103C 85C0                   test    eax,eax
0040103E 7507                   jnz     loc_00401047
00401040 66C7073000             mov     word ptr [edi],30h
00401045 EB47                   jmp     loc_0040108E
00401047                    loc_00401047:
00401047 7908                   jns     loc_00401051
00401049 C6072D                 mov     byte ptr [edi],2Dh
0040104C F7D8                   neg     eax
0040104E 83C701                 add     edi,1
00401051                    loc_00401051:
00401051 B9CDCCCCCC             mov     ecx,0CCCCCCCDh
00401056 8BF7                   mov     esi,edi
00401058 EB18                   jmp     loc_00401072
0040105A                    loc_0040105A:
0040105A 8BD8                   mov     ebx,eax
0040105C F7E1                   mul     ecx
0040105E C1EA03                 shr     edx,3
00401061 8BC2                   mov     eax,edx
00401063 8D1492                 lea     edx,[edx+edx*4]
00401066 03D2                   add     edx,edx
00401068 2BDA                   sub     ebx,edx
0040106A 80C330                 add     bl,30h
0040106D 881F                   mov     [edi],bl
0040106F 83C701                 add     edi,1
00401072                    loc_00401072:
00401072 83F800                 cmp     eax,0
00401075 77E3                   ja      loc_0040105A
00401077 C60700                 mov     byte ptr [edi],0
0040107A EB0E                   jmp     loc_0040108A
0040107C                    loc_0040107C:
0040107C 83EF01                 sub     edi,1
0040107F 8A06                   mov     al,[esi]
00401081 8A27                   mov     ah,[edi]
00401083 8807                   mov     [edi],al
00401085 8826                   mov     [esi],ah
00401087 83C601                 add     esi,1
0040108A                    loc_0040108A:
0040108A 3BF7                   cmp     esi,edi
0040108C 72EE                   jb      loc_0040107C
0040108E                    loc_0040108E:
0040108E 5F                     pop     edi
0040108F 5E                     pop     esi
00401090 5B                     pop     ebx
00401091 C9                     leave
00401092 C20800                 ret     8


And here is the code for szLen:

004010A0                    fn_004010A0:
004010A0 8B442404               mov     eax,[esp+4]
004010A4 83E804                 sub     eax,4
004010A7                    loc_004010A7:
004010A7 83C004                 add     eax,4
004010AA 803800                 cmp     byte ptr [eax],0
004010AD 7430                   jz      loc_004010DF
004010AF 80780100               cmp     byte ptr [eax+1],0
004010B3 7420                   jz      loc_004010D5
004010B5 80780200               cmp     byte ptr [eax+2],0
004010B9 7410                   jz      loc_004010CB
004010BB 80780300               cmp     byte ptr [eax+3],0
004010BF 75E6                   jnz     loc_004010A7
004010C1 2B442404               sub     eax,[esp+4]
004010C5 83C003                 add     eax,3
004010C8 C20400                 ret     4
004010CB                    loc_004010CB:
004010CB 2B442404               sub     eax,[esp+4]
004010CF 83C002                 add     eax,2
004010D2 C20400                 ret     4
004010D5                    loc_004010D5:
004010D5 2B442404               sub     eax,[esp+4]
004010D9 83C001                 add     eax,1
004010DC C20400                 ret     4
004010DF                    loc_004010DF:
004010DF 2B442404               sub     eax,[esp+4]
004010E3 C20400                 ret     4


You can verify that the code is from the library by comparing the listing to the source files, available in the masm32\m32lib directory.

One easy, but I think not infallible, method of determining if a library is an import library or a static library is to examine a hex-ascii dump of the library, or just load it into an editor, and search for ".obj". A static library will usually contain the names of object modules, and an import library will not. For example, the MASM32 library has hundreds of instances of ".obj", but the kernel32 import library has none.
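
To address the earlier question of how to make your own static library, here is a minimal sketch. The module name mylib.asm, the procedure AddTwo and the library name mylib.lib are all hypothetical, and the build commands assume that ML.EXE and the Microsoft librarian LIB.EXE are present in \masm32\bin.

```asm
; mylib.asm : hypothetical module to be placed in a static library
;
; Build (assumed paths):
;   \masm32\bin\ml /c /coff mylib.asm
;   \masm32\bin\lib /OUT:mylib.lib mylib.obj
;
; In the program that uses it:
;   AddTwo PROTO :DWORD, :DWORD
;   includelib mylib.lib
;   invoke AddTwo, 2, 3             ; result is returned in eax

.686
.model flat, stdcall
option casemap:none

.code
; AddTwo(a, b): returns a + b in eax
AddTwo PROC a:DWORD, b:DWORD
    mov eax, a
    add eax, b
    ret
AddTwo ENDP
END
```

Because mylib.lib contains the object module itself, the linker copies the code for AddTwo into the EXE, just as it does for dwtoa and szLen above. A library built this way should also show ".obj" names in a dump.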

eschew obfuscation

jj2007

Quote from: Vortex on February 18, 2008, 07:12:40 PM
To make your executable smaller:

i) Use Pelle's linker, Polink, which comes with the Masm32 package; it creates smaller executables.
ii) Merge the sections so that the final executable has a single section:

\masm32\bin\ml /c /coff Test.asm
\masm32\bin\polink /SUBSYSTEM:WINDOWS /MERGE:.data=.text Test.obj


The size of the executable drops from 2560 bytes to 1024 bytes.

I know it's bad practice to reopen a defunct thread, but I just stumbled over a goodie:

include \masm32\include\masm32rt.inc

.code
start:
    mov ebx, 123
    test ebx, 1                 ; bit 0 set means the number is odd
    .if !zero?
        print str$(ebx), " is odd"
    .else
        print str$(ebx), " is even"
    .endif
    print chr$(13, 10, "That was short and crispy, right?")
    getkey
    exit
end start


Linker option:   /merge:.data=.text   ; <---- fails miserably
Linker option:   /merge:.text=.data   ; <---- works fine

As it seems, the order matters...
The exception happens here:
mov byte ptr [edi], bl with edi=401000h

(edit: replaced OPT_DebugL with "linker option")

Vortex

jj,

What does OPT_DebugL mean? Searching Google for it did not return any results.


jj2007

Quote from: Vortex on October 13, 2008, 05:23:27 PM
What does OPT_DebugL mean?

In RichMasm, I set build options in the source instead of in a separate file. debugL is an option that is added to the linker's commandline.
Sorry for the confusion.

jj2007

Thanks to Vortex, I found out that it was a problem with the build.bat: an invalid character caused it to crash. The two linker options both work fine.
SORRY!