upper case to lower case

scooter4483 · March 03, 2006, 03:20:44 PM

alright, i did some tweaking around and i thought i had it:

A0: mov al, [edx]
cmp al, 0
je Display_String
cmp al, 'A' ;upper to lower
jb A1
cmp al, 'Z'
ja A1
add al, 'a'-'A'
mov [edx], al

mov edx, OFFSET String3
mov al, byte ptr [edx]
cmp al, 'w' ;comparing the pointer to w
jg A2 ;if the letter is greater than w, go to A2
add al, 3 ;otherwise add 3 to al

A2: sub al, 23 ;subtract 23 if greater than w
mov [edx], al ;put al back into the edx pointer

A1: inc edx
jmp A0

Am I close, any suggestions?

PBrennick · March 03, 2006, 03:55:23 PM

scooter4483,

Try this next:

Code Select

    mov edx, OFFSET String3      ; Point to the string
A0: mov al, byte ptr [edx]       ; Get a character or next character
    cmp al, 0                    ; End of string?
    je Display_String            ; Go show it if yes
    cmp al, 'A'                  ; Check for lower boundary
    jb A1                        ; Jump if not a letter
    cmp al, 'Z'                  ; Uppercase?
    ja A3                        ; Go check for lower, if no
    or al, 20h                   ; Convert to lowercase
    mov byte ptr [edx], al       ; Store it
A3: cmp al, 'w'                  ; Compare character for upper bounds, first check
    jg A2                        ; Jump if possible upper bounds error
    cmp al, 'a'                  ; Check for lowercase
    jb A1                        ; Jump if not a letter
    add al, 3                    ; Otherwise add 3 to al
    mov byte ptr [edx], al       ; Store it
    jmp A1                       ; Done with this character
A2: cmp al, 'z'                  ; Compare character for upper bounds, second check
    ja, A1                       ; Jump if not a letter
    sub al, 23                   ; Otherwise do a wraparound
    mov byte ptr [edx], al       ; Store it
A1: inc edx                      ; Point to next character
    jmp A0                       ; Go get it
Display_String:                  ; You go from here and tell me how you are doing ...

This is untested but looks correct, you try it and tell me.

Paul

scooter4483 · March 03, 2006, 05:35:10 PM

i got the following error when i tried that code:

test.asm<48> : error A2008: syntax error: ,

You just added a comma in A2, but I fixed it. Paul, you are my new hero. The things you did with that code, I just understood after seeing my TA. I was doing something similar, and here are the things she made me realize. My changing of the letters would only work if I had capital letters. I was missing the connection that if the letters are already lower case, they still have to go through the changing of the letters. That's the first thing I was missing. I'm not familiar with the 'byte ptr' code because the teacher did not go in depth on that part. I love this site and I do understand what you are doing. I'm still trying to understand the jump if above/below statements. I know that my code is looking between A-Z with the ja and jb. You used the or statement, which just adds, meaning it moves to the lowercase letter, which is 20h ahead, correct? The way I did it, with the add al, 'a'-'A', was it wrong?

Overall, I would like to thank you all for the help and encouragement. I'm sorry I failed you guys in not figuring it out. But I'm glad I figured out the last program. Thanks again, guys.

PBrennick · March 03, 2006, 06:02:32 PM

scooter4483,
Thank you for your nice words, they mean a lot. I see my error in the ja line, how the heck did I miss that! But you fixed it! Your add al, 'a'-'A' is functionally correct for an uppercase letter, but the or instruction does a better job: it only sets bit 5, so a letter that is already lowercase comes out unchanged, whereas add would push it past 'z'. If I were only interested in the flags and not in changing the actual value, using test would be even better. Remember that and play with it until you understand it. Also, about changing jg to ja: this is good programming practice that can prevent hard-to-detect errors in your future coding. Make it a habit to use jg and jl only when dealing with signed numbers; ja and jb are for unsigned numbers. Using jg would not create an error with today's data, but if any of the values happened to be greater than 7Fh, jg would fail, because those are negative numbers when you test the sign, which is what jg and jl do. When using jg, 80h-0FFh is actually less than 0-7Fh!
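To make the or/add and jg/ja points concrete, here is a small sketch. It is a set of stand-alone fragments, not part of the program in this thread, and the labels SignedGreater and UnsignedAbove are made up for illustration.

```asm
    ; or vs. add on an uppercase letter: same result
    mov al, 'A'          ; al = 41h
    add al, 'a'-'A'      ; al = 61h = 'a'
    mov al, 'A'
    or  al, 20h          ; al = 61h = 'a'

    ; or vs. add on a letter that is already lowercase: different result
    mov al, 'a'          ; al = 61h
    or  al, 20h          ; al = 61h, unchanged (bit 5 was already set)
    mov al, 'a'
    add al, 'a'-'A'      ; al = 81h, no longer a letter

    ; jg vs. ja on a byte above 7Fh
    mov al, 0E9h         ; 233 if read as unsigned, -23 if read as signed
    cmp al, 'w'          ; 'w' = 77h = 119
    jg  SignedGreater    ; NOT taken: signed, -23 is less than 119
    ja  UnsignedAbove    ; taken: unsigned, 233 is above 119
SignedGreater:
UnsignedAbove:
```

The conditional jumps do not change the flags, which is why both can be tried after a single cmp.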

And above all, do a lot of coding and have a lot of fun. Say hi to your teacher from me, just before retiring I taught assembly and related classes (Introduction to Computers 101 and Introduction to Computer Programming 101) in a college in New York (Suffern).

Paul

PBrennick · March 03, 2006, 06:37:22 PM

scooter4483,
One more thing, about 'byte ptr' this is another good programming practice. When assembling, masm pays no attention to the brackets any more so you can, again, get errors if you are not careful. byte ptr makes sure that the register is an address and not a value so consider it a form of indirect addressing and use it always.

Paul

Mark Jones · March 03, 2006, 07:57:15 PM

Hey Scooter, glad you got it figured out! Not so bad once you understand it, hmm? :)
Paul certainly knows his stuff so listen to him, k? :toothy See \masm32\help\ASMINTRO.HLP under "Addressing and Pointers" for more help with pointers and addressing.
Since you expressed some interest, here is a little more info for you. Come back to it later if needed, don't want to overwhelm ya.

Flags register - this is a special register of the processor which contains a number of bit-flags, indicating such things as zero, carry, parity, etc. Most instructions cause updates to occur to the flag register seamlessly as they execute.

JG,JA,JLE... these are conditional jumps. There are lots of 'em. They all look at the flags register internally and jump based on simple logic. JG jumps if the value is GREATER only, JA jumps if the value is ABOVE only, JLE jumps if the value is LESS or EQUAL, etc. Paul explained when to use certain ones. Take this for example:

Code Select


szCopyMJ1 proc  uses esi edi  szDest:DWORD,szSource:DWORD
    mov esi,szSource                ; szSource is already an offset
    mov edi,szDest                  ; szDest too
@@:
    mov al,byte ptr[esi]            ; fetch a byte 
    inc esi                         ; increment source pointer
    mov byte ptr[edi],al            ; put byte
    inc edi                         ; increment dest pointer
    test al,al                      ; null?
    jnz @B                          ; loop if not
    ret
szCopyMJ1 endp

This code loads string offsets into ESI and EDI, then copies bytes from ESI to EDI. But how does it know when to stop? Answer: the TEST instruction sets the flags register. When the result of TEST AL,AL is zero (that is, when AL is zero), the Zero Flag is set. The JNZ @B tests the zero flag to see if it is set - and if NOT, code execution jumps back to the previous @@:. JNZ stands for "Jump If Not Zero" so it does exactly that - loops until a zero is found, then falls through to the RET. Note that the zero IS copied before returning - that is a requirement for valid zero-terminated (ANSI) strings. Incidentally, CMP AL,0 sets the Zero Flag exactly as TEST AL,AL does. (TEST reg,reg is the usual idiom for testing for zero; for the 32-bit registers it also encodes one byte shorter than CMP reg,0.) Note however that TEST AL,2 is not the same as CMP AL,2: TEST ANDs its operands, so it checks whether bit 1 is set, while CMP subtracts, so it checks whether AL equals 2.
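A few stand-alone lines show the TEST/CMP difference. This is only a sketch with arbitrarily chosen values, and the comments give the Zero Flag (ZF) after each instruction.

```asm
    mov  al, 0
    test al, al          ; 0 AND 0 = 0  -> ZF = 1
    cmp  al, 0           ; 0 - 0   = 0  -> ZF = 1 (same answer as test al,al)

    mov  al, 6           ; 00000110b
    test al, 2           ; 6 AND 2 = 2  -> ZF = 0, bit 1 is set
    cmp  al, 2           ; 6 - 2   = 4  -> ZF = 0, al is not equal to 2

    mov  al, 2
    test al, 2           ; 2 AND 2 = 2  -> ZF = 0, bit 1 is set
    cmp  al, 2           ; 2 - 2   = 0  -> ZF = 1, al is equal to 2
```

Neither instruction changes AL; both only update the flags.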

Carry is another very useful flag. Imagine you want to count to 1000 using only AL and CL. Is that even possible? Sure! When you add 1 to AL and it wraps from FFh to 00h, the carry flag will be set. (Use ADD AL,1 for this; INC does not change the carry flag.) Now if you monitor the carry flag, it is possible to then increment CL, effectively allowing you to count up to 65535. That's not a very large number, but if you apply this logic to EAX and CL, suddenly you can count up to 1,099,511,627,775! Your prof will likely have an exercise on this later, so I won't go into too much detail yet. :bg
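One possible way to write that counter, as a rough sketch only: the label CountLoop is made up, and ADC (add with carry) is one of several ways to "monitor" the carry flag.

```asm
    xor al, al           ; low byte of the counter = 0
    xor cl, cl           ; high byte of the counter = 0
CountLoop:
    add al, 1            ; sets the carry flag when al wraps from FFh to 00h
    adc cl, 0            ; cl = cl + 0 + carry, so cl goes up once per wrap
    cmp cl, 03h          ; 1000 decimal = 03E8h
    jne CountLoop
    cmp al, 0E8h
    jne CountLoop
    ; falls through here with cl:al = 03h:E8h, i.e. after 1000 steps
```

Note that ADD is used instead of INC because INC leaves the carry flag alone, so the ADC after it would not see the wrap.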

szCopyMJ1 above is a PROC - a procedure, which is MASM syntax for defining a function with typed parameters. To declare this function so that it can be called before MASM has seen the PROC itself, a PROTO statement is used, normally at the beginning of the file:

Code Select


    szCopyMJ1    PROTO   :DWORD,:DWORD

This tells MASM that there is a function prototype called szCopyMJ1 which is passed two dword-sized parameters. Then the function can be called using MASM's INVOKE syntax:

Code Select


    szCopyMJ1    PROTO   :DWORD,:DWORD
.data
    myString     db  "Hello world!",0
    myBuffer     db  13 dup(0)               ; 12 characters plus the terminating zero
.code
    invoke szCopyMJ1,addr myBuffer,addr myString   ; string is copied to myBuffer
    invoke MessageBox,0,addr myBuffer,0,MB_OK      ; show copied string

This is basically how most functions are done in assembly. :)

Note, in the PROC line it says "uses ESI EDI". What does that mean? That's a handy way to preserve registers. What's that? Well if your Prof hasn't already said so, there are some registers which you must not modify. ESI and EDI are two of them (EBX and EBP are the others). Windows expects the values in these registers to be the same when your procedure returns as when it was called. If they are not, your program might crash! But ESI and EDI are usable, as long as you put the original values back when you're done. That's exactly what "uses esi edi" does in the PROC line - it preserves ESI and EDI so you can use them in your code. It is equivalent to this:

Code Select


    push esi
    push edi
; all the other code here
    pop edi
    pop esi

PUSH and POP are instructions which save data to, and restore it from, the stack. The stack works FIRST IN - LAST OUT, which is why the POPs above come in the reverse order of the PUSHes. All you need to know about it now is that the above works every time to save and restore ESI and EDI.
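To see why the order of the POPs matters, here is a tiny sketch with arbitrarily chosen registers and values. It deliberately pops in the same order as it pushed.

```asm
    mov  eax, 1
    mov  ecx, 2
    push eax             ; stack, top first: 1
    push ecx             ; stack, top first: 2, 1
    pop  eax             ; eax = 2 (the last value pushed comes off first)
    pop  ecx             ; ecx = 1
    ; eax and ecx have swapped values, because the pops were not in reverse order
```

Popping ecx first and eax second would have put both values back where they started.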

Hope that's interesting and I didn't confuse you too much. Have fun!

zooba · March 03, 2006, 10:14:12 PM

Quote from: PBrennick on March 03, 2006, 06:37:22 PM
When assembling, masm pays no attention to the brackets any more so you can, again, get errors if you are not careful. byte ptr makes sure that the register is an address and not a value so consider it a form of indirect addressing and use it always.

Unless I've misunderstood you, I don't believe this is true:

Code Select

    .data
        dwTemp DWORD 0

    .code

Main PROC
    ConsoleInit
    
    mov eax, OFFSET dwTemp
    mov DWORD PTR eax, 1
    
    Print "eax = %d\ndwTemp = %d\n", eax, dwTemp
    
        
    CPause
    ret
Main ENDP

The output:

Code Select

eax = 1
dwTemp = 0

So MASM does pay attention to the brackets, otherwise the value of dwTemp would have changed to 1.

I do agree though, that it is a good habit to get into. At least until you're comfortable with how MASM interprets it.

Cheers,

Zooba

Mark Jones · March 03, 2006, 11:38:48 PM

The whole "brackets and ptr" thing is an unnecessary complexity simply because brackets are ignored in all but one case. That makes a newcomer's experience with brackets highly ambiguous. One could say that "Brackets signify the CONTENTS of" such as "mov al,[myVal]", which does take the value of myVal and move it into AL. But it also does the same thing without the brackets. So why doesn't it give an error for trying to move the DWORD-sized offset of myVal into AL? How confusing, really.

I fully believe MASM should yell and moan about spurious brackets instead of ignoring them, so as to restrict our use of them. Anyways yes, in my limited and very confused experience, brackets are required to access memory contents when used after PTR.

<rant>
Not like PTR shouldn't already do what brackets do... sheesh PTR does stand for POINTER! ::) Yeah, eliminate brackets altogether, yeah... Hmm, ask Pelle to eliminate brackets, hmm..... :bg
</rant>

zooba · March 03, 2006, 11:58:45 PM

'Brackets and ptr' is unnecessary in all but one case. That case being where the assembler can't figure out how much memory you're referring to:

Code Select

mov [esi], al       ; fine because MASM knows that AL is 8-bits
mov [esi], eax      ; fine because MASM knows that EAX is 32-bits
mov [esi], 3        ; no good because 3 could be anywhere upwards of 2-bits

mov WORD PTR [esi], 3   ; these are identical and both indicate
mov [esi], WORD PTR 3   ; that we're discussing WORDs here

PBrennick · March 04, 2006, 05:03:58 AM

Mark is correct: you need to use the brackets only after byte ptr and word ptr, where they are used for indirect addressing as I explained. That was what I was trying to say, and my example shows the 'proper' use. Otherwise masm has problems understanding that there used to be a difference between esi and [esi], for example. Zooba, your code does not even show the point we are making so, yes, you have misunderstood us. If you need to see examples of the point we are making, look at the source code for the masm32 library. Hutch uses the same method for the same reason, and this topic was beaten to death many moons ago on this forum, IIRC. Please, let's not confuse our friend who is just learning. The example code I gave him is correct as written (except for one spurious comma), so it does not need to be discussed any longer at this point.

Paul

zooba · March 04, 2006, 07:19:00 AM

I spent time looking for the 'beating-to-death' of this issue and thought I'd post the links I found.

http://www.old.masmforum.com/viewtopic.php?t=4699
http://www.old.masmforum.com/viewtopic.php?t=409

PBrennick · March 04, 2006, 12:37:39 PM

Zooba,
Thank you for the help. I did not think to search the old forum. I could not remember when it happened, looks like it was longer into the past than I thought. I think it says it well, though, and Tedd says what I said, just use them and use them always. It makes your code more readable, also.

Again, thank you for the help.
Paul

Mark Jones · March 04, 2006, 06:01:10 PM

Tenkey's post at the bottom of http://www.old.masmforum.com/viewtopic.php?t=4699 is very interesting. Perhaps we should make that into a BRACKETS.HLP file. :lol

Thanks for searching for that Zooba.

zooba · March 05, 2006, 01:24:06 AM

To briefly summarise the findings of those two posts I posted above:

When you define a variable, MASM gives you the brackets for free. This makes using them similar to C (i.e. the name accesses the contents; in some assemblers the name accesses the address). For example:

Code Select

.data
    dwValue DWORD 0
    ; dwValue EQU [00403000h]

.code
    mov dwValue, 1
    ; mov [00403000h], 1

Note that the commented code above won't actually work, since MASM does not support dereferencing of a constant (for good reason IMO).

Once the assembler sees a set of brackets, it knows it's dealing with an address (since to access the contents of the variable it needs to use the address) and so whether or not the rest of the expression is in brackets is irrelevant. However, more bracketed expressions will be treated as addition:

Code Select

mov dwValue+1, 1
; mov [00403001h], 1

mov dwValue[1], 1
; mov [00403001h], 1

Multiple sets of brackets around an expression achieve nothing, since they are treated as additions with zero:

Code Select

mov [dwValue], 1
; mov [[00403000h]], 1

Including a variable in an effective address (bracketed expression) uses the address rather than the value (because the extra brackets are treated as an addition):

Code Select

mov [eax+dwValue], 1
; mov [eax+[00403000h]], 1

MASM stores type information for variables, so when it encounters dwValue it knows that it is dealing with a DWORD. The PTR directive is similar to a type-cast, in that it overrides any other type information. Remember, dwValue came with brackets for free, so they don't need to be included again.

Code Select

mov dwValue, 1
; mov DWORD PTR [00403000h], 00000001h

mov BYTE PTR dwValue, 1
; mov  BYTE PTR [00403000h], 01h

Multiple type-casts are valid but the right-most overrides all others:

Code Select

mov DWORD PTR [eax] + BYTE PTR [ecx], 1
; mov BYTE PTR [eax+ecx], 1

Registers and variables are different animals. Variables have an address and value. Registers have a value but no address. A register can be used as a pointer, dereferencing its contents, while a variable cannot:

Code Select

mov eax, OFFSET dwValue
; mov eax, 00403000h

mov DWORD PTR [eax], 1
; mov DWORD PTR [00403000h], 1

Note that DWORD PTR must be included in this case; since we are no longer using the variable directly, MASM no longer knows how big it is. Also, since we don't get the brackets for free with a register, they need to be included.
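Applied to the byte-sized string problem at the top of this thread, the same rules might look like the fragment below. It is only a sketch: the contents given to String3 here are a stand-in, since the original declaration was never posted.

```asm
.data
    String3 db "Hello",0         ; stand-in contents, assumed for illustration

.code
    mov edx, OFFSET String3      ; edx holds the address of the string
    or  BYTE PTR [edx], 20h      ; size needed: neither [edx] nor 20h says how big the operand is
    mov al, [edx]                ; no PTR needed: al already says 8 bits
    mov [edx], al                ; same here
```

The or line turns the first character from 'H' into 'h' directly in memory, and the two mov lines show the case where the register operand supplies the size.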

Hopefully this clears up some MASM inconsistencies and better explains how it interprets what we're telling it :bg

Cheers,

Zooba :U

PBrennick · March 05, 2006, 01:21:21 PM

Zooba,

Very nicely done. You are a very thorough person and a good speaker. I think that our new users should save this portion of the thread as a webpage for future referencing and hopefully Hutch will include this information in one of his excellent help files in the future. :U

Paul
