upper case to lower case

Started by scooter4483, March 02, 2006, 09:26:36 PM

scooter4483

alright, i did some tweaking around and i thought i had it:

A0: mov al, [edx]       
    cmp al, 0           
    je Display_String
    cmp al, 'A'                            ;upper to lower
    jb A1
    cmp al, 'Z'
    ja A1
    add al, 'a'-'A'
    mov [edx], al

    mov edx, OFFSET String3
    mov al, byte ptr [edx]
    cmp al, 'w'                       ;comparing the pointer to w
    jg A2                              ;if the letter is greater than w, go to A2
    add al, 3                         ;otherwise add 3 to al

A2: sub al, 23                      ;subtract 23 if greater than w
    mov [edx], al                   ;put al back into the edx pointer

A1: inc edx
    jmp A0

Am I close, any suggestions?

PBrennick

scooter4483,

Try this next:

    mov edx, OFFSET String3      ; Point to the string
A0: mov al, byte ptr [edx]       ; Get a character or next character
    cmp al, 0                    ; End of string?
    je Display_String            ; Go show it if yes
    cmp al, 'A'                  ; Check for lower boundary
    jb A1                        ; Jump if not a letter
    cmp al, 'Z'                  ; Uppercase?
    ja A3                        ; Go check for lower, if no
    or al, 20h                   ; Convert to lowercase
    mov byte ptr [edx], al       ; Store it
A3: cmp al, 'w'                  ; Compare character for upper bounds, first check
    jg A2                        ; Jump if possible upper bounds error
    cmp al, 'a'                  ; Check for lowercase
    jb A1                        ; Jump if not a letter
    add al, 3                    ; Otherwise add 3 to al
    mov byte ptr [edx], al       ; Store it
    jmp A1                       ; Done with this character
A2: cmp al, 'z'                  ; Compare character for upper bounds, second check
    ja, A1                       ; Jump if not a letter
    sub al, 23                   ; Otherwise do a wraparound
    mov byte ptr [edx], al       ; Store it
A1: inc edx                      ; Point to next character
    jmp A0                       ; Go get it
Display_String:                  ; You go from here and tell me how you are doing ...


This is untested but looks correct, you try it and tell me.
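
For checking results, here is a small reference model of what the routine above is meant to produce: lowercase every ASCII letter, then shift each letter forward by 3 with x, y, z wrapping to a, b, c. It is a sketch in Python rather than a line-by-line translation of the assembly, and the name encode is made up for illustration.

```python
def encode(text: str) -> str:
    """Lowercase ASCII letters, then rotate each letter forward by 3."""
    out = []
    for ch in text:
        code = ord(ch)
        if ord('A') <= code <= ord('Z'):
            code |= 0x20            # same effect as: or al, 20h
        if ord('a') <= code <= ord('w'):
            code += 3               # add al, 3
        elif ord('x') <= code <= ord('z'):
            code -= 23              # sub al, 23 (wraps x, y, z to a, b, c)
        out.append(chr(code))       # anything that is not a letter is left alone
    return "".join(out)


print(encode("Hello, World! XYZ"))  # khoor, zruog! abc
```

Running the assembled program on the same input and comparing against this output is a quick way to spot a wrong branch.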

Paul

The GeneSys Project is available from:
The Repository or My crappy website

scooter4483

i got the following error when i tried that code:

test.asm<48> : error A2008:  syntax error: ,

You just added a comma in A2, but I fixed it.  Paul, you are my new hero.  I only understood the things you did with that code after seeing my TA.  I was doing something similar, and here are the things she made me realize.  My changing of the letters would only work if I had capital letters.  I was missing the connection that if the letters are already lower case, they still have to go through the changing of the letters.  That's the first thing I was missing.  I'm not familiar with the 'byte ptr' code because the teacher did not go in depth on that part.  I love this site and I do understand what you are doing.  I'm still trying to understand the jump if above/below statements.  I know that my code is looking between A-Z with the ja and jb.  You used the or statement, which just adds 20h, meaning it moves to the lowercase letters, which are 20h ahead, correct?  Was the way I did it, with add al, 'a'-'A', wrong?

Overall, I would like to thank you all for the help and encouragement.  I'm sorry I failed you guys in not figuring it out.  But I'm glad I figured out the last program.  Thanks again, guys.
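
On the or versus add question, a small Python model of the two byte operations may help. It is only a sketch (the function names are invented here): both give the same answer for an uppercase letter, and they differ only when the character is already lowercase.

```python
def lower_with_or(code: int) -> int:
    """Model of: or al, 20h  (sets bit 5 of the byte)."""
    return (code | 0x20) & 0xFF


def lower_with_add(code: int) -> int:
    """Model of: add al, 'a'-'A'  (adds 20h to the byte)."""
    return (code + 0x20) & 0xFF


for ch in "Aa":
    print(ch, hex(lower_with_or(ord(ch))), hex(lower_with_add(ord(ch))))
# A 0x61 0x61
# a 0x61 0x81
```

Since the original code only reaches the add after checking that the character is between 'A' and 'Z', the add form is safe there; the difference matters only if the range check is skipped.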

PBrennick

scooter4483,
Thank you for your nice words, they mean a lot.  I see my error in the ja line, how the heck did I miss that!  But you fixed it!  Your add al, 'a'-'A' is functionally correct as long as it only runs on uppercase letters, which your A-Z check guarantees.  The or al, 20h form is the more common idiom because it just sets bit 5: a character that is already lowercase comes out unchanged, where add would push it out of the letter range.  If I were only interested in the flags and not in changing the actual value, using test would be even better, since it does an and without storing the result.  Also, about using ja rather than jg: this is good programming practice that can prevent hard-to-detect errors in your future coding.  Make it a habit to use jg and jl only when dealing with signed numbers; ja and jb are for unsigned numbers.  Using jg would not create an error in today's code as long as every character is 7Fh or below, but if any value happened to be greater than 7Fh, jg would fail, because jg and jl test the sign and treat those values as negative.  When using jg, 80h-0FFh is actually less than 0-7Fh!  Remember that and play with it until you understand it.
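
The signed/unsigned point can be modeled in a few lines of Python. This is a sketch with made-up helper names: each function answers "would the jump be taken after cmp al, operand?" for byte-sized values.

```python
def as_signed(value: int) -> int:
    """Reinterpret a byte (0..255) as a signed value (-128..127)."""
    value &= 0xFF
    return value - 0x100 if value >= 0x80 else value


def ja_taken(al: int, operand: int) -> bool:
    """ja after cmp al, operand: unsigned 'above'."""
    return (al & 0xFF) > (operand & 0xFF)


def jg_taken(al: int, operand: int) -> bool:
    """jg after cmp al, operand: signed 'greater'."""
    return as_signed(al) > as_signed(operand)


print(ja_taken(ord('x'), ord('w')), jg_taken(ord('x'), ord('w')))  # True True
print(ja_taken(0x80, ord('w')), jg_taken(0x80, ord('w')))          # True False
```

For plain ASCII (0 to 7Fh) the two jumps agree, which is why the mistake stays hidden until a byte of 80h or more shows up.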

And above all, do a lot of coding and have a lot of fun.  Say hi to your teacher from me.  Just before retiring I taught assembly and related classes (Introduction to Computers 101 and Introduction to Computer Programming 101) in a college in New York (Suffern).

Paul

PBrennick

scooter4483,
One more thing, about 'byte ptr' this is another good programming practice.  When assembling, masm pays no attention to the brackets any more so you can, again, get errors if you are not careful.  byte ptr makes sure that the register is an address and not a value so consider it a form of indirect addressing and use it always.

Paul

Mark Jones

Hey Scooter, glad you got it figured out! Not so bad once you understand it, hmm? :)
Paul certainly knows his stuff so listen to him, k? :toothy See \masm32\help\ASMINTRO.HLP under "Addressing and Pointers" for more help with pointers and addressing.
Since you expressed some interest, here is a little more info for you. Come back to it later if needed, don't want to overwhelm ya.

Flags register - this is a special register of the processor which contains a number of bit-flags, indicating such things as zero, carry, parity, etc. Most instructions cause updates to occur to the flag register seamlessly as they execute.

JG, JA, JLE... these are conditional jumps. There are lots of 'em. They all look at the flags register internally and jump based on simple logic. After a compare, JG jumps if the first operand is GREATER (signed), JA jumps if it is ABOVE (unsigned), JLE jumps if it is LESS or EQUAL (signed), etc. Paul explained when to use certain ones. Take this for example:


szCopyMJ1 proc  uses esi edi  szDest:DWORD,szSource:DWORD
    mov esi,szSource                ; szSource is already an offset
    mov edi,szDest                  ; szDest too
@@:
    mov al,byte ptr[esi]            ; fetch a byte
    inc esi                         ; increment source pointer
    mov byte ptr[edi],al            ; put byte
    inc edi                         ; increment dest pointer
    test al,al                      ; null?
    jnz @B                          ; loop if not
    ret
szCopyMJ1 endp


This code loads string offsets into ESI and EDI, then copies bytes from ESI to EDI. But how does it know when to stop? Answer: the TEST instruction sets the flags register. TEST ANDs its two operands and throws the result away, so when TEST AL,AL produces zero (that is, when AL is zero), the Zero Flag is set. The JNZ @B tests the zero flag to see if it is set - and if NOT, code execution jumps back to the previous @@:. JNZ stands for "Jump If Not Zero" so it does exactly that - loops until a zero is found, then falls through to the RET. Note that the zero IS copied before returning - that is a requirement for valid ANSI strings. Incidentally, CMP AL,0 sets the Zero Flag the same way as TEST AL,AL. (TEST is the usual idiom for zero checks: its encoding is never longer than CMP with 0, and for a 32-bit register it is one byte shorter.) Note however that TEST AL,2 is not the same as CMP AL,2: TEST ANDs the operands, so it sets the Zero Flag when bit 1 of AL is clear, while CMP subtracts, so it sets the Zero Flag only when AL equals 2.
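
To see the difference between TEST and CMP without an assembler, here is a rough Python model of how each one decides the Zero Flag for byte operands. The function names are invented for this sketch, and only the Zero Flag is modeled.

```python
def zero_flag_after_test(a: int, b: int) -> bool:
    """TEST a, b: AND the operands, discard the result, ZF = (result == 0)."""
    return ((a & b) & 0xFF) == 0


def zero_flag_after_cmp(a: int, b: int) -> bool:
    """CMP a, b: subtract b from a, discard the result, ZF = (result == 0)."""
    return ((a - b) & 0xFF) == 0


# TEST AL,AL and CMP AL,0 agree for every possible byte value.
print(all(zero_flag_after_test(al, al) == zero_flag_after_cmp(al, 0)
          for al in range(256)))                                 # True

# TEST AL,2 and CMP AL,2 do not: try AL = 2 and AL = 1.
print(zero_flag_after_test(2, 2), zero_flag_after_cmp(2, 2))     # False True
print(zero_flag_after_test(1, 2), zero_flag_after_cmp(1, 2))     # True False
```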

Carry is another very useful flag. Imagine you want to count to 1000 using only AL and CL. Is that even possible? Sure! When you add 1 to AL and it wraps from FFh to 00h, the carry flag will be set. (Use ADD AL,1 for this; INC does not change the carry flag.) Now if you monitor the carry flag, it is possible to then increment CL, effectively allowing you to count up to 65535. That's not a very large number, but if you apply this logic to EAX and CL, suddenly you can count up to 1,099,511,627,775! Your prof will likely have an exercise on this later, so I won't go into too much detail yet. :bg
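
As a rough illustration of the carry idea, the Python sketch below models two 8-bit registers. It assumes the count is done with ADD AL,1 followed by ADC CL,0 (an assumption for this sketch; INC would not update the carry flag), and count_with_carry is a made-up name.

```python
def count_with_carry(steps: int) -> tuple:
    """Count 'steps' times using two 8-bit registers; returns (cl, al)."""
    al = 0
    cl = 0
    for _ in range(steps):
        total = al + 1
        carry = total >> 8          # 1 only when AL wrapped from FFh to 00h
        al = total & 0xFF           # add al, 1
        cl = (cl + carry) & 0xFF    # adc cl, 0
    return cl, al


cl, al = count_with_carry(1000)
print(cl, al, cl * 256 + al)        # 3 232 1000
print((1 << 40) - 1)                # 1099511627775, the EAX-plus-CL limit
```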

The above is a PROC - a procedure, which is MASM syntax for defining a function with typed parameters. To declare this function so it can be called before it is defined, a PROTO statement is used, normally at the beginning of the file:


    szCopyMJ1    PROTO   :DWORD,:DWORD


This tells MASM that there is a function prototype called szCopyMJ1 which is passed two dword-sized parameters. Then the function can be called using MASM INVOKE syntax:


    szCopyMJ1    PROTO   :DWORD,:DWORD
.data
    myString     db  "Hello world!",0
    myBuffer     db  13 dup(0)                     ; 12 characters plus the terminating zero
.code
    invoke szCopyMJ1,addr myBuffer,addr myString   ; string is copied to myBuffer
    invoke MessageBox,0,addr myBuffer,0,MB_OK      ; show copied string


This is basically how most functions are done in assembly. :)

Note, in the PROC line it says "uses ESI EDI". What does that mean? That's a handy way to preserve registers. What's that? Well if your Prof hasn't already said so, there are some registers which a procedure must not modify. ESI and EDI are two of them (EBX and EBP are the others). Windows expects the values in these registers to be the same when one of your procedures, such as a window procedure, returns to it. If they are not, your program might crash! But ESI and EDI are useable, as long as you put the original values back when you're done. That's exactly what "uses esi edi" does in the PROC line - it preserves ESI and EDI so you can use them in your code. It is the equivalent to this:


    push esi
    push edi
; all the other code here
    pop edi
    pop esi


PUSH and POP are commands which save data to and restore data from the stack, which works LAST IN - FIRST OUT. That is why the POPs above come in the reverse order of the PUSHes. All you need to know about it now is that the above works every time to save and restore ESI and EDI.
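
The order of the pops matters, and a Python list makes a reasonable stand-in for the stack to show why. This is only a sketch: call_procedure and the register dictionary are invented for illustration.

```python
def call_procedure(registers: dict, pop_in_reverse: bool = True) -> dict:
    """Model a procedure that saves ESI/EDI, clobbers them, then restores them."""
    regs = dict(registers)
    stack = []
    stack.append(regs["esi"])       # push esi
    stack.append(regs["edi"])       # push edi

    regs["esi"] = 0xAAAA            # "all the other code here" overwrites both
    regs["edi"] = 0xBBBB

    if pop_in_reverse:
        regs["edi"] = stack.pop()   # pop edi (last pushed, first popped)
        regs["esi"] = stack.pop()   # pop esi
    else:
        regs["esi"] = stack.pop()   # wrong order: ESI receives EDI's saved value
        regs["edi"] = stack.pop()
    return regs


before = {"esi": 1, "edi": 2}
print(call_procedure(before))                        # {'esi': 1, 'edi': 2}
print(call_procedure(before, pop_in_reverse=False))  # {'esi': 2, 'edi': 1}
```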

Hope that's interesting and I didn't confuse you too much. Have fun!
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

zooba

Quote from: PBrennick on March 03, 2006, 06:37:22 PM
When assembling, masm pays no attention to the brackets any more so you can, again, get errors if you are not careful.  byte ptr makes sure that the register is an address and not a value so consider it a form of indirect addressing and use it always.

Unless I've misunderstood you, I don't believe this is true:

    .data
        dwTemp DWORD 0

    .code

Main PROC
    ConsoleInit
   
    mov eax, OFFSET dwTemp
    mov DWORD PTR eax, 1
   
    Print "eax = %d\ndwTemp = %d\n", eax, dwTemp
   
       
    CPause
    ret
Main ENDP


The output:

eax = 1
dwTemp = 0


So MASM does pay attention to the brackets, otherwise the value of dwTemp would have changed to 1.

I do agree though, that it is a good habit to get into. At least until you're comfortable with how MASM interprets it.

Cheers,

Zooba

Mark Jones

The whole "brackets and ptr" thing is an unnecessary complexity simply because brackets are ignored in all but one case. That makes a newcomer's experience with brackets highly ambiguous. One could say that "Brackets signify the CONTENTS of" such as "mov al,[myVal]", which does take the value of myVal and move it into AL. But it also does the same thing without the brackets. So why doesn't it give an error for trying to move the DWORD-sized offset of myVal into AL? How confusing, really.

I fully believe MASM should yell and moan about spurious brackets instead of ignoring them, so as to restrict our use of them. Anyways yes, in my limited and very confused experience, brackets are required to access memory contents when used after PTR.

<rant>
Not like PTR shouldn't already do what brackets do... sheesh PTR does stand for POINTER! ::) Yeah, eliminate brackets altogether, yeah... Hmm, ask Pelle to eliminate brackets, hmm..... :bg
</rant>

zooba

'Brackets and ptr' is unnecessary in all but one case. That case being where the assembler can't figure out how much memory you're referring to:

mov [esi], al       ; fine because MASM knows that AL is 8-bits
mov [esi], eax      ; fine because MASM knows that EAX is 32-bits
mov [esi], 3        ; no good because 3 could be anywhere upwards of 2-bits

mov WORD PTR [esi], 3   ; these are identical and both indicate
mov [esi], WORD PTR 3   ; that we're discussing WORDs here
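
Why the size matters for the immediate 3 can be seen by modeling memory as a byte array. The Python sketch below uses a made-up store helper and starts from four bytes of FFh; the same value 3 overwrites one, two or four bytes depending on the size given.

```python
def store(memory: bytearray, address: int, value: int, size: int) -> None:
    """Write 'value' as 'size' little-endian bytes, as an x86 store would."""
    memory[address:address + size] = value.to_bytes(size, "little")


for name, size in (("BYTE", 1), ("WORD", 2), ("DWORD", 4)):
    memory = bytearray(b"\xff" * 4)     # pretend ESI points at these 4 bytes
    store(memory, 0, 3, size)           # mov <size> PTR [esi], 3
    print(name, memory.hex())
# BYTE 03ffffff
# WORD 0300ffff
# DWORD 03000000
```

With a register as the source there is nothing to guess, because the register's width already says how many bytes to write.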

PBrennick

Mark is correct: you need to use the brackets only after byte ptr and word ptr, where they are used for indirect addressing as I explained.  That was what I was trying to say and my example shows the 'proper' use.  Otherwise masm has problems understanding that there used to be a difference between esi and [esi] for example.  Zooba, your code does not even show the point we are making so, yes, you have misunderstood us.  If you need to see examples of the point we are making look at the source codes for the masm32 library.  Hutch uses the same method for the same reason, and this topic was beaten to death many moons ago on this forum, IIRC.  Please, let's not confuse our friend who is just learning.  The example code I gave him is correct as written (except for one spurious comma) so it does not need to be discussed at this point any longer.

Paul

zooba

I spent time looking for the 'beating-to-death' of this issue and thought I'd post the links I found.

http://www.old.masmforum.com/viewtopic.php?t=4699
http://www.old.masmforum.com/viewtopic.php?t=409

PBrennick

Zooba,
Thank you for the help.  I did not think to search the old forum.  I could not remember when it happened, looks like it was longer into the past than I thought.  I think it says it well, though, and Tedd says what I said, just use them and use them always.  It makes your code more readable, also.

Again, thank you for the help.
Paul

Mark Jones

Tenkey's post at the bottom of  http://www.old.masmforum.com/viewtopic.php?t=4699  is very interesting. Perhaps we should make that into a BRACKETS.HLP file. :lol

Thanks for searching for that Zooba.

zooba

To briefly summarise the findings of those two posts I posted above:

When you define a variable, MASM gives you the brackets for free. This makes using them similar to C (i.e. the name accesses the contents; in some assemblers the name accesses the address). For example:

.data
    dwValue DWORD 0
    ; dwValue EQU [00403000h]

.code
    mov dwValue, 1
    ; mov [00403000h], 1


Note that the commented code above won't actually work, since MASM does not support dereferencing of a constant (for good reason IMO).

Once the assembler sees a set of brackets, it knows it's dealing with an address (since to access the contents of the variable it needs to use the address) and so whether or not the rest of the expression is in brackets is irrelevant. However, more bracketed expressions will be treated as addition:

mov dwValue+1, 1
; mov [00403001h], 1

mov dwValue[1], 1
; mov [00403001h], 1


Multiple sets of brackets around an expression achieves nothing, since they are treated as additions with zero:

mov [dwValue], 1
; mov [[00403000h]], 1


Including a variable in an effective address (bracketed expression) uses the address rather than the value (because the extra brackets are treated as an addition):

mov [eax+dwValue], 1
; mov [eax+[00403000h]], 1


MASM stores type information for variables, so when it encounters dwValue it knows that it is dealing with a DWORD. The PTR directive is similar to a type-cast, in that it overrides any other type information. Remember, dwValue came with brackets for free, so they don't need to be included again.

mov dwValue, 1
; mov DWORD PTR [00403000h], 00000001h

mov BYTE PTR dwValue, 1
; mov  BYTE PTR [00403000h], 01h


Multiple type-casts are valid but the right-most overrides all others:

mov DWORD PTR [eax] + BYTE PTR [ecx], 1
; mov BYTE PTR [eax+ecx], 1


Registers and variables are different animals. Variables have an address and value. Registers have a value but no address. A register can be used as a pointer, dereferencing its contents, while a variable cannot:

mov eax, OFFSET dwValue
; mov eax, 00403000h

mov DWORD PTR [eax], 1
; mov DWORD PTR [00403000h], 1


Note that DWORD PTR must be included in this case; since we are no longer using the variable directly, MASM no longer knows how big it is. Also, since we don't get the brackets for free with a register, they need to be included.
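
The register-versus-memory distinction can be sketched in Python by treating memory as a byte array and EAX as an ordinary variable. Everything named here (DW_VALUE_ADDR, the helpers, the two functions) is hypothetical and only models the idea; the address 8 is arbitrary.

```python
DW_VALUE_ADDR = 8   # hypothetical address standing in for OFFSET dwValue


def load_dword(memory: bytearray, address: int) -> int:
    return int.from_bytes(memory[address:address + 4], "little")


def store_dword(memory: bytearray, address: int, value: int) -> None:
    memory[address:address + 4] = value.to_bytes(4, "little")


def without_brackets() -> tuple:
    memory = bytearray(16)
    eax = DW_VALUE_ADDR             # mov eax, OFFSET dwValue
    eax = 1                         # mov DWORD PTR eax, 1  -> overwrites the register
    return eax, load_dword(memory, DW_VALUE_ADDR)


def with_brackets() -> tuple:
    memory = bytearray(16)
    eax = DW_VALUE_ADDR             # mov eax, OFFSET dwValue
    store_dword(memory, eax, 1)     # mov DWORD PTR [eax], 1 -> writes through the pointer
    return eax, load_dword(memory, DW_VALUE_ADDR)


print(without_brackets())   # (1, 0): register changed, variable untouched
print(with_brackets())      # (8, 1): register still the address, variable now 1
```

The first result matches the "eax = 1, dwTemp = 0" output shown earlier in the thread.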

Hopefully this clears up some MASM inconsistencies and better explains how it interprets what we're telling it :bg

Cheers,

Zooba :U

PBrennick

Zooba,

Very nicely done.  You are a very thorough person and a good speaker.  I think that our new users should save this portion of the thread as a webpage for future referencing and hopefully Hutch will include this information in one of his excellent help files in the future.  :U

Paul