Hello topic, with some questions inside

Started by zer0code, June 08, 2011, 04:06:48 AM

zer0code

Well, hello all!

I'm from Brazil (you'll notice because of my English), and this is my first time with assembly. It has not been easy.

I come from C#; before C#, Delphi, and before Delphi, C. That's about 4 to 5 years of development, not much.

4 days ago I found a "pack" entitled "Programmer's Pack": 18 GB of manuals, PDFs and CHM files that teach everything from advanced C# to ASM. So I decided to start learning asm.

But I must be doing it wrong.
I'm doing it the same way I learned C#: read (or watch, if it's a video) and CODE. I got frustrated after trying to write a "simple" program that reads 2 numbers from the keyboard, adds them, and displays the result.
I got stuck. Now I must ask:

- What EXACTLY should I do? Download a DOS emulator? Read all the books FIRST and CODE AFTER?
Or can I go READING and CODING at the same time?

ASM is not like C# or Delphi, and before I start asking "HEY, HOW DO I READ FROM THE KEYBOARD", I chose to explain the whole situation to get a direction, because I'm honestly lost.

Nice to meet you all, and thanks for taking the time to read this.
Eduardo (that's my name, just call me Ed, it's shorter and easier)

PS: I just decided it would be nice to list all the e-books I have here; it could help.

- 32-64-BIT 80 x 86 Assembly Language Architecture (2005)
- A lot of INTEL manuals
- Irvine - Assembly Language for Intel-based Computers
- Wrox Press - Professional Assembly Language
- Assembly Language for x86 Processors
- MIPS Assembly Language Programming 2003
- PC Assembly Language (reading this one atm, it's easy to follow and explains things in detail)
- The 80x86 IBM PC and Compatible Computers
- The Art of Assembly Language (noticed that this is HLA, and may not be what i'm looking for)
- The Assembly Programming Master Book
- The Zen Of Assembly Language 1990 - Michael Abrash
- Write Great Code -  Volume I - Understanding the Machine
- Write Great Code - Volume II - Thinking Low-Level, Writing High-Level

:bg






hutch--

Hi Eduardo (Ed),

Welcome on board. Try the MASM32 project, it does many useful things and there are many people here who understand 32 bit MASM code. Make sure your Intel manuals are reasonably late versions. To do this stuff you need the Microsoft API reference, which you can get from either MSDN online or, if you can still find it, the old WIN32.HLP file.

Avoid the old DOS stuff, it's almost useless if you are not writing legacy code. The modern 32 bit FLAT memory model is much more powerful, and it is a lot closer to the emerging 64 bit code, so your learning will not be wasted.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

Hi Ed,

Hutch is the boss here, and he is absolutely right: forget DOS, Win32 is a lot more exciting, and easier, too.
Check the "Tips" below to get essential steps for starting your journey with assembler - and come back soon to ask your coding questions.

Welcome to the Forum :thumbu

zer0code

Hutch and JJ, thanks for your advice!
I just checked your "tips", JJ, and they really helped.

As for the literature I've got here, most of it just uses old-style asm, so I guess I'll look for a modern asm book (if one exists).
Thanks again :)

hutch--

Eduardo,

Don't hold out a lot of hope on current books about x86 assembler, most of them are out of date crap. Putting your effort into the Intel manuals and the vast number of code examples will be far more profitable to you far more quickly. Once you start to get up to pace, a good working knowledge of the Windows API will help a lot. The real action for you is online data that you can download and study and play with.

zer0code

Hutch

Thanks. I'll do exactly as you said, especially because I'm "used" to learning this way (short examples + code).

zer0code

As I don't want to flood this forum, I'll post all my questions here and bump the topic.

Well, today, as part of my "asm learning routine" (I ran late at work and started about 30 minutes ago, at 1:00 am),
I'm going to implement the input check to see whether the input is a number or a character.

I found the isnumber procedure, opened isnumber.asm and found:

align 16
  fail:
    xor eax, eax
    ret 4

align 16
isnumber proc char:BYTE

    cmp BYTE PTR [esp+4], 48
    jb fail
    cmp BYTE PTR [esp+4], 57
    ja fail
    mov eax, 1
    ret 4

isnumber endp


By googling it I found that ESP is the register that points to the top of the stack. Considering that I moved an input to a var (or register), will its address be pushed automatically onto the stack?
The ESP+4 refers to the case of pushing a DWORD.
48 is the ASCII code of 0, I guess.

If the pushed value is below 48, it's not a digit, it's anything but a number (jb fail).
Same if it's above 57 (ja fail).

The fail routine just clears EAX and returns (I don't know what the 4 is doing there, anyone mind explaining?).
If the compared value is neither below 48 nor above 57, it does mov eax, 1 and returns.

OK, now I understand why it returns 0 in case of failure and 1 in case of success, but

I've got 2 questions:
1 - Why ret 4, and not just ret?
2 - What's the use of BYTE PTR? Couldn't you have used only

cmp [esp+4], 48

Sorry to bother you, and thanks!

zer0code

Guess I got it, correct me if I'm wrong.

Intel architecture stores values in little-endian order, meaning that the bytes of a value are reversed in memory.
So, if I'm storing the number 00000004h, in memory it would go as

04 00 00 00
and not
00 00 00 04

this explains why if i pass
invoke isnumber, byte ptr [myDWORDvar]

it will always return 0, whether I pass a number or a character

but if i pass
invoke isnumber, byte ptr [myDWORDvar + 4]

it will return 1 if it's a number, and 0 if it's a char, meaning that for a 1-digit char, I have to move N positions forward, where N depends on the data type (excluding byte, which even reversed will be at the same position).

I'm not sure if I'm on the right track, but after over 1 hour of studying and reading about memory and addressing, that's what I could conclude.
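
For reference, the byte order described above can be checked with a short test. This is only a sketch: myValue is a hypothetical variable, and it assumes the MASM32 SDK is installed in \masm32 so that masm32rt.inc and its print and str$ macros are available.

```asm
include \masm32\include\masm32rt.inc

.data
myValue dd 00000004h                    ; hypothetical variable, stored in memory as 04 00 00 00

.code
start:
    movzx eax, byte ptr [myValue]       ; lowest address holds the low-order byte
    print str$(eax), 13, 10             ; prints 4
    movzx eax, byte ptr [myValue+3]     ; highest address holds the high-order byte
    print str$(eax), 13, 10             ; prints 0
    exit
end start
```

On a little-endian machine the low-order byte sits at the lowest address, so byte ptr [myValue] already reads the 04; an offset of +4 would read the first byte after the variable.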

dedndave

isnumber looks to be a bit bloated   :P
there is no reason to 16-align a short branch label
align 16
isnumber proc char:BYTE

    cmp BYTE PTR [esp+4], 48
    jb fail
    cmp BYTE PTR [esp+4], 57
    ja fail
    mov eax, 1
    ret 4

  fail:
    xor eax, eax
    ret 4
isnumber endp

notice that it only looks at the low-order byte
i guess it examines a byte that is passed as a dword
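
On the BYTE PTR question: [esp+4] is only an address and carries no size, and neither does the constant 48, so the assembler cannot tell how many bytes to compare, and MASM rejects cmp [esp+4], 48 for that reason. The override supplies the missing size. A rough sketch of the difference, assuming the MASM32 SDK (masm32rt.inc, print, str$) and a hypothetical variable myValue:

```asm
include \masm32\include\masm32rt.inc

.data
myValue dd 00000130h            ; hypothetical: bytes in memory are 30 01 00 00

.code
start:
    mov edx, offset myValue     ; EDX is a plain address with no size attached
    xor ebx, ebx
    cmp byte ptr [edx], 30h     ; compares one byte (30h) with 30h
    sete bl                     ; BL = 1 when equal
    print str$(ebx), 13, 10     ; prints 1

    mov edx, offset myValue     ; print may have changed EDX, so reload it
    xor ebx, ebx
    cmp dword ptr [edx], 30h    ; compares four bytes (00000130h) with 30h
    sete bl
    print str$(ebx), 13, 10     ; prints 0
    exit
end start
```

The same memory gives different results depending on the size named in the instruction, which is why isnumber has to say BYTE PTR explicitly.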

here is how i would do it - probably every bit as fast as the other one...
        OPTION  PROLOGUE:None
        OPTION  EPILOGUE:None

        ALIGN   16

IsNumber PROC   char:DWORD

        movzx   eax,byte ptr [esp+4]
        xor     al,30h
        cmp     al,0Ah
        mov     al,0
        jae     fail

        inc     eax

fail:   ret     4

IsNumber ENDP

        OPTION  PROLOGUE:PrologueDef
        OPTION  EPILOGUE:EpilogueDef


48 (30h) is the ASCII representation of the number 0
57 (39h) is the ASCII representation of the number 9

i have never tried anything like this...
invoke isnumber, byte ptr [myDWORDvar]
so, i don't know what it would do - lol
when you pass parameters with INVOKE, they are pushed onto the stack
as i mentioned before, always keep the stack 4-aligned in 32-bit code

ESP is a pointer to the "top of stack"
which is actually the lower address   :bg
the stuff at and above the address in ESP is what is stored on the stack
the stuff below the address in ESP is undefined
when you PUSH a dword onto the stack, ESP automatically gets 4 subtracted from it and the value is placed at that address
when you POP a dword, it is the value at the address in ESP, after that - ESP gets 4 added

        RET     4
this code returns, then adds 4 to ESP, discarding the pushed parameter that was passed
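
The push / call / ret 4 sequence described above can be watched by comparing ESP before and after a call. This is a minimal sketch, not the library code: GetArg is a hypothetical proc, and the print and str$ macros assume the MASM32 SDK.

```asm
include \masm32\include\masm32rt.inc

.code

OPTION PROLOGUE:NONE            ; no stack frame, so ESP points at the return address on entry
OPTION EPILOGUE:NONE

; hypothetical proc: returns its single dword argument in EAX
GetArg proc arg:DWORD
    mov eax, [esp+4]            ; [esp] = return address, [esp+4] = pushed argument
    ret 4                       ; pop the return address, then add 4 to ESP to drop the argument
GetArg endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

start:
    mov ebx, esp                ; remember ESP before the call
    push 1234                   ; what INVOKE does: ESP = ESP - 4, value stored at [esp]
    call GetArg                 ; pushes the return address (ESP - 4 again) and jumps
    mov esi, eax                ; keep the result, print changes EAX
    sub ebx, esp                ; 0 if RET 4 left the stack balanced
    print str$(esi), 13, 10     ; prints 1234
    print str$(ebx), 13, 10     ; prints 0
    exit
end start
```

With a plain ret instead of ret 4, the second number would come out as 4, because the pushed argument would still be on the stack.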

jj2007

Here is an even shorter version - a learning piece:


include \masm32\include\masm32rt.inc

.data
ptrBuffer   dd CharBuffer      ; instead of using offset CharBuffer, we declare a dword variable here
CharBuffer   db " x", 0

.code
TestString      db "Masm32 is 10*faster:", 0   ; the string is read-only, so it can reside in the .code section

start:
  mov esi, offset TestString
  .Repeat
   movzx edx, byte ptr [esi]   ; we need this for printing
   .Break .if !dl   ; get out if the string is finished
   mov CharBuffer[1], dl   ; move the char into the x
   cmp byte ptr [esi], "0"   ; everything below "0" sets the carry flag
   .if !Carry?
      cmp byte ptr [esi], ":"
      cmc   ; invert the carry flag - the 2nd comparison was the other way round
   .endif
   .if Carry?
      print ptrBuffer, 9, "not a number", 13, 10
   .else
      print ptrBuffer, 9, "IS a number", 13, 10
   .endif
   inc esi
  .Until 0         ; an endless loop, but we have a .break above
 
exit
end start

M      not a number
a      not a number
s      not a number
m      not a number
3      IS a number
2      IS a number
        not a number
i      not a number
s      not a number
        not a number
1      IS a number
0      IS a number
*      not a number
f      not a number
a      not a number
s      not a number
t      not a number
e      not a number
r      not a number
:      not a number

zer0code

@Dave
Thanks for the detailed explanation and for your patience.
As you noticed, I always try to understand everything in detail....

Like, the first time I invoked isnumber, I checked the syntax and found a byte parameter,
but my var was a DWORD.... I didn't know how to pass it.

tried first: invoke isnumber, byte ptr [myDwordvar]

It failed; it was always returning zero.
Then I tried invoke isnumber, byte ptr [myDwordvar + 4] and it worked. The problem is not whether it works or not, the problem is when it WORKS but YOU DON'T KNOW WHY it worked!!! And this took me another night to START understanding....

You said: when a DWORD is pushed onto the stack, ESP has 4 subtracted from it and the dword value is placed at that address.
But if the address was decreased by 4 (considering a DWORD), why am I checking the byte at [DWORD + 4] and not just the byte at [DWORD]?

Do you get my point? This must sound like a dumb question, but I have to ask, because since you didn't say anything about my very last post (little endian etc.), I don't know if what I said is right. Anyway, that's it.

@JJ
Brother, nice to meet you, and thanks a lot for your example. I'll start implementing it as part of my adding routine.
I must say that, honestly, I'm STILL not able to fully understand your code, but I'll keep it for study and later use.

dedndave

no - it's not dumb at all
on the stack, between the current ESP pointer, and the pushed parameter is the RETurn address
when you start a PROC, ESP always points to this RET address (ignoring any prologues)
above that is the parameter(s)

maybe this link will help a little...
http://www.masm32.com/board/index.php?topic=14381.msg114921#msg114921

zer0code

Now EVERYTHING makes sense. This is not about little or big endian, it's only because ESP was pointing to the return address instead of the pushed value.
So, to test a byte, for example, I'd have to move one byte forward
[byte + 1] to test the value
or
[word + 2]
or
[dword + 4], [qword + 8], and so on....

I'm going to check your link as soon as I finish a report I'm working on. Thanks a lot for that, you just gave me a direction!
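
One caveat on the offsets listed above: in 32-bit code the return address is always 4 bytes, and INVOKE pushes each argument as at least a full dword, so the first parameter starts at [esp+4] whatever its declared size. A minimal sketch, assuming the MASM32 SDK; IsDigit is a hypothetical proc modelled on isnumber, not the library routine itself.

```asm
include \masm32\include\masm32rt.inc

IsDigit PROTO :DWORD                ; hypothetical proc, modelled on isnumber

.code

OPTION PROLOGUE:NONE                ; no stack frame, so ESP still points at the return address
OPTION EPILOGUE:NONE

IsDigit proc char:DWORD
    ; [esp+0] = return address (4 bytes)
    ; [esp+4] = the argument, pushed as a full dword even though only one byte matters
    movzx eax, byte ptr [esp+4]
    sub eax, "0"                    ; "0".."9" become 0..9, anything else becomes 10 or more (unsigned)
    cmp eax, 10
    setb al                         ; AL = 1 if below 10, else 0
    movzx eax, al
    ret 4
IsDigit endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

start:
    invoke IsDigit, "7"
    print str$(eax), 13, 10         ; prints 1
    invoke IsDigit, "x"
    print str$(eax), 13, 10         ; prints 0
    exit
end start
```

The character is read at [esp+4], not [esp+1]: the offset skips the 4-byte return address, not the size of the parameter.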

zer0code

I got the point about ESP, I'm just going crazy with these addresses. I tried to figure it out in many different ways. The order I now understand. What I'm not getting is why Parm3, the 1st param, is at the address [ebp+28].


1) Parm3                   [ebp+28]
2) Parm2                   [ebp+24]
3) Parm1                   [ebp+20]
4) RETurn address for CALL [ebp+16]
5) saved EBX contents      [ebp+12]
6) saved ESI contents      [ebp+8]
7) saved EDI contents      [ebp+4]
8) saved EBP contents      [ebp]
9) Local1                  [ebp-4]
10) Local2                 [ebp-8]
11) Local3                 [ebp-12]


Guess I need more reading about registers and memory addressing, it's not trivial if you've never done it.

jj2007

Quote from: zer0code on June 10, 2011, 11:27:54 PM
What i'm not getting now is why Parm3, the 1st param, is in the address [ebp+28].

A normal proc with params and Locals has a stack frame, i.e. the locals are created by
push ebp
mov ebp, esp
sub esp, 4*9
where 4*9 would e.g. mean nine DWORDs. From then on, ebp offers a "stable" reference pointer, while esp depends on what you are pushn' and pop'n around.
Before the creation of the stack frame,
[esp+0] is your return address
[esp+4] is para1
[esp+8] is para2 etc
Afterwards, your params are better addressed as [ebp+x], where x depends on the difference between ebp and esp.
You can only really understand it if you see your procs in Olly.
:U

MyTest proc arg1:DWORD, arg2:DWORD
LOCAL lv1, lv2, locbuf[260]:BYTE
...
  ret
MyTest endp


... with Olly:
00401032   $   55               push ebp
00401033   .   8BEC             mov ebp, esp
00401035   .   81EC 0C010000    sub esp, 10C
...
0040104A   |.  8BE5             mov esp, ebp
0040104C   |.  5D               pop ebp
0040104D   \.  C2 0800          retn 8
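
The same layout can be checked from inside a proc without a debugger. This is a sketch under assumptions: AddTwo is a hypothetical proc, it relies on MASM's default prologue (push ebp / mov ebp, esp, then room for the locals), and print and str$ come from the MASM32 SDK.

```asm
include \masm32\include\masm32rt.inc

AddTwo PROTO :DWORD, :DWORD     ; hypothetical proc

.code

AddTwo proc arg1:DWORD, arg2:DWORD
    LOCAL sum:DWORD             ; the default prologue saves EBP and reserves 4 bytes
    ; [ebp]    = saved EBP
    ; [ebp+4]  = return address
    mov eax, [ebp+8]            ; same memory as arg1
    add eax, [ebp+12]           ; same memory as arg2
    mov [ebp-4], eax            ; same memory as sum, the first local
    mov eax, sum
    ret                         ; the default epilogue restores ESP and EBP and returns with retn 8
AddTwo endp

start:
    invoke AddTwo, 2, 3
    print str$(eax), 13, 10     ; prints 5
    exit
end start
```

Inside a proc with a default frame, the names arg1, arg2 and sum are just shorthand for these EBP-relative addresses, which is what the disassembly shows.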