The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: Herakles on May 17, 2010, 04:40:50 PM

Title: Manipulation of strings
Post by: Herakles on May 17, 2010, 04:40:50 PM
Quote: Write a SPIM programme which is able to read in a string, scrolling character by character through it and replacing every small letter with its capital, without touching the other characters. The programme must be able to print the modified string on the shell. Also describe the algorithm of your conversion function with pseudo code in the form of a comment right before your actual programme starts. Don't forget to write comments wherever necessary.
Hints:

- In case you are using a UNIX system, you can get an overview of ASCII codes with man ascii and on the internet you can find it on http://www.asciitable.com.
- The actual conversion from small to capital letters can be implemented through the subtraction of a constant.


Hi, folks! :wink

First of all I want to say again a big THANKS for your unequalled help recently.

As I just started to work on the new exercise, I'll be able to post my code later on.

BTW: Unfortunately I am not a native English speaker nor am I living in a country where English is an official language, so please feel free to ask me in case parts of the task above sound strange or absurd.

Regards,
Herakles
Title: Re: Manipulation of strings
Post by: redskull on May 17, 2010, 04:45:25 PM
while not end-of-string
     if (ascii_value >= 97) and (ascii_value <=122)
          ascii_value = ascii_value - 0x20

edit - nice catch, clive
Title: Re: Manipulation of strings
Post by: clive on May 17, 2010, 04:56:23 PM
Quote from: redskull on May 17, 2010, 04:45:25 PM
while not end-of-string
     if (ascii_value >= 97) and (ascii_value <=122)
          ascii_value = ascii_value - 0x30

That should be 0x20 (32), 'A' = 0x41, 'a' = 0x61
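
For anyone who wants to check the constant, here is a small C sketch. The function names `case_offset` and `wrong_offset` are made up for illustration, and the code assumes an ASCII character set.

```c
/* Hypothetical helper: distance between a lowercase ASCII letter and
   its uppercase counterpart. Assumes an ASCII character set. */
int case_offset(void)
{
    return 'a' - 'A';            /* 0x61 - 0x41 = 0x20 (32) */
}

/* What the original 0x30 would have done: it maps letters to
   unrelated characters instead of to their capitals. */
char wrong_offset(char c)
{
    return (char)(c - 0x30);     /* 'a' (0x61) becomes '1' (0x31) */
}
```

So subtracting 0x30 from 'a' lands on the digit '1', while 0x20 lands on 'A'.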

SPIM is a MIPS simulator, for those unfamiliar with the question
http://pages.cs.wisc.edu/~larus/spim.html

It would be appropriate to post your initial stab at the code first, before we all pile on..


Title: Re: Manipulation of strings
Post by: Herakles on May 17, 2010, 09:03:51 PM
Sorry, I should have mentioned that we are talking about the MIPS simulator SPIM.

In the meantime I've read the documents they gave us for help, but they are completely useless.

I didn't really get whether they want us to write a programme that reads in a string typed into the shell by a user, or whether it is enough to define a string once and for all in the code.


I am really desperate since I have no clue how to start this exercise, not to mention how to implement it.
You must imagine that I am a total layman. I hear the task "only small letters to capital letters conversion" and my first reaction is OMG.

The code which redskull posted must be the pseudo code the task asked for, right?
Title: Re: Manipulation of strings
Post by: redskull on May 17, 2010, 09:34:17 PM
Check out the ASCII chart (http://www.asciitable.com/), which is what maps the numerical representation of characters that the CPU uses to the little pictures of letters that the shell will spit out.  If you peruse it further, you'll see that all the lowercase letters fall in one range, and all the uppercase in another.  The project consists of scanning a string (a sequential range of bytes, terminated by a '00') for numbers in the one range (lowercase), and then converting them to the equivalent number in the other (uppercase).  As the hint in the problem statement says, they are separated by hex 20.  If my MIPS class memory serves me, the SPIM simulator emulates some pseudo "syscall" instructions to do input and output to the simulator, but that was many moons ago; google SPIM and SYSCALL

-r
Title: Re: Manipulation of strings
Post by: clive on May 17, 2010, 09:49:44 PM
print_string is syscall # 4, read_string is syscall #8

By scrolling, I suspect it means just traverse the input string, and perhaps printing out each char as you process it.

Pseudo code in C of conversion task

char chartoupper(char a)
{
  if ((a >= 'a') && (a <= 'z'))
    a = a - 32;

  return(a);
}

void stringtoupper(char *s)
{
  while(*s)
  {
    *s = chartoupper(*s);   /* convert in place */
    s++;                    /* advance separately; *s++ = f(*s) is undefined behaviour */
  }
}


Convert char to upper case, branching

        andi    $t0,$t0,0xFF    # Mask it to a character
        addiu   $t1,$t0,-0x61   # 'a'
        sltu    $t1,$t1,26      # a-z 26 character span
        beq     $t1,$0, upper   # $t1 == 0?
        nop                     # stuff branch delay slot

        addiu   $t0,$t0,-0x20   # -('a' - 'A')

upper:


Converting char to upper case, non branching (MIPS IV)

        andi    $t0,$t0,0xFF    # Mask it to a character
        addiu   $t1,$t0,-0x61   # 'a'
        sltu    $t1,$t1,26      # a-z 26 character span
        addiu   $t2,$t0,-0x20   # -('a' - 'A')
        movn    $t0,$t2,$t1     # if $t1 NZ then $t0 = $t2
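
For comparison, the same range-check trick can be sketched in C. This is only an illustration, not a translation the thread itself gives: `to_upper_branchless` is a made-up name, the code assumes ASCII, and it uses arithmetic on the 0/1 flag rather than a conditional move like MOVN. A C compiler is also free to emit a branch anyway.

```c
/* Hypothetical helper mirroring the register steps above (ASCII assumed). */
unsigned char to_upper_branchless(unsigned char c)
{
    /* Like addiu + sltu: (c - 'a') wraps to a huge unsigned value for
       anything below 'a', so one unsigned compare covers both bounds.
       in_range is 1 for a-z and 0 otherwise. */
    unsigned in_range = (unsigned)(c - 'a') < 26u;

    /* Subtract 0x20 only when in_range is 1 (1 << 5 == 0x20). */
    return (unsigned char)(c - (in_range << 5));
}
```

The single unsigned comparison is the point: it replaces the two tests (>= 'a' and <= 'z') from the pseudo code.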
Title: Re: Manipulation of strings
Post by: Herakles on May 17, 2010, 10:04:10 PM
Thanks redskull. :U

If I really understood it, I have to say it is easy.
I have to take the hex value of the small character and then subtract hex 20 to get the hex value of the capital letter...?

The most difficult thing now is to implement the idea...


Thanks clive. :U
Yes, "to traverse" puts it in a nutshell.

I am not totally sure if they will accept the pseudo code in C... I think they want pseudo code in MODULA. But it's not that important (I hope :lol ).

clive, are the last two code snippets you posted two alternatives, and which one would you say is easier for a layman to implement...?

In the German language there are some "strange" letters, e.g. the so-called "umlauts" (Ü ü, Ä ä, Ö ö). Is there a way to let SPIM handle them, too?


EDIT:
The commands sltu and movn aren't mentioned in the alphabetical command overview they gave us... Are there different SPIM versions?
Title: Re: Manipulation of strings
Post by: clive on May 18, 2010, 12:02:01 AM
Correct, subtracting 32 is how to do it with ASCII.

With umlauts, etc, I think I would probably use a case conversion table as that would be more efficient than doing multiple comparisons/conversions. I think the IBM PC (ASCII) character set has them in the high order characters (>=128)

a = uppertbl[a];
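
A rough C sketch of the table idea follows. Everything in it is an assumption made for illustration: the names `init_uppertbl` and `table_toupper`, and in particular the three umlaut entries, which use the code page 437 positions and would need changing for any other encoding (Latin-1, UTF-8, etc).

```c
/* 256-entry case conversion table, indexed by the character value. */
static unsigned char uppertbl[256];

/* Hypothetical setup routine: identity by default, a-z mapped to A-Z,
   plus three umlauts at their code page 437 positions (assumption). */
void init_uppertbl(void)
{
    int i;

    for (i = 0; i < 256; i++)
        uppertbl[i] = (unsigned char)i;          /* unchanged by default */

    for (i = 'a'; i <= 'z'; i++)
        uppertbl[i] = (unsigned char)(i - 0x20); /* ASCII letters */

    uppertbl[0x84] = 0x8E;   /* a-umlaut -> A-umlaut (CP437) */
    uppertbl[0x94] = 0x99;   /* o-umlaut -> O-umlaut (CP437) */
    uppertbl[0x81] = 0x9A;   /* u-umlaut -> U-umlaut (CP437) */
}

/* Hypothetical conversion routine: one table lookup per character,
   no comparisons at all. */
void table_toupper(unsigned char *s)
{
    for (; *s; s++)
        *s = uppertbl[*s];
}
```

Adding another special character then costs one more table entry rather than one more comparison in the loop.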

SLTU is a MIPS 3000 instruction, MOVN/MOVZ are MIPS IV, they both work on the current SPIM cited earlier.

I come at this from an ARM perspective, I would try to avoid branches whenever possible. The pipeline penalty is reasonably cheap for the original MIPS design, but as more pipeline stages get added, the more important things like branch prediction become. It was reasonably easy for me to demonstrate a ~13 (16) cycle stall on the Intel Atom.

The layman would probably use the compare/branch approach. On ARM which has always had conditional execution, one would probably try to use that instead of branches.
Title: Re: Manipulation of strings
Post by: Herakles on May 18, 2010, 07:56:45 AM
Thank you very much so far.

The only thing is that we are using the MIPS R2000... Do you think it's allowed to use the two commands?
Title: Re: Manipulation of strings
Post by: clive on May 18, 2010, 02:23:42 PM
Well the solution to that is probably to provide multiple answers to the question, providing timing estimates of each.

You could also ask the professor.

I don't have definitive documents in front of me, but Googling some more suggests SLT, SLTI, SLTU and SLTIU are present in the MIPS R2000 instruction set. Most of the SoC stuff I deal with is M4K, MIPS32, MIPS16, so I'll apologize for not having the antiquated stuff all straight.

Title: Re: Manipulation of strings
Post by: Herakles on May 18, 2010, 05:20:19 PM
Never mind! :wink

I am too silly to write a working programme! :'(

That's what I have so far:

.data
str1: .ascii "Please type in chars.\n"
.ascii "In case you enter 0 the programme will stop \n"
.asciiz "the input and convert all small letters to capital ones.\n"
askstr: .asciiz "\n?-> "
answstr: .asciiz "The new string is: "
str2: .asciiz "\n\n"

.text
main: li $v0, 4
la $a0, str1       # load str1
syscall                 # PRINT_STR(str1)

loop: li $v0, 4
la $a0, askstr     
syscall               # PRINT_STR(askstr)

li $v0, 8
syscall # $v0 := READ_STRING;
li $t2, 41
bgeu $v0, $t2, upper # jump to upper if v0 >= t2


upper:

exit: li $v0, 4
la $a0, answstr   
syscall                 # PRINT_STR(answstr)



li $v0, 4
la $a0, str2       
syscall # PRINT_STR(str2)

li $v0, 10 # syscall number 10 = EXIT
syscall


I think it's best to describe my issues, so you can see where I have problems.

1.) Is it possible to subtract hex 20 from an entered character without "saying" to the programme "Hey, this is a character, you have to find its hex value and then subtract hex 20!"? For example, is it allowed to say "c - hex 20"?
2.) How do I implement the idea of "if the hex value of the character is >= 61 and <= 7A then subtract hex 20, else don't change anything"?
3.) How do I implement the idea of traversing the input string character by character, converting to capital letters when necessary, without deleting the capitals checked/converted so far?
For example: the entered string is "AbCcDd" and the output string must be "ABCCDD", not "bcd" or "d"...
4.) I think it's best if the programme presents the converted string on one line and not character by character (vertically). But how do I implement this?

As you can see, my mind's IT architectural thinking is very poor.
I know that the whole thing about SPIM is the implementation with labels and that shouldn't be very difficult, but although the course started three weeks ago, I still have huge problems. :'(

Thank you very much for your help and your patience.
Title: Re: Manipulation of strings
Post by: clive on May 19, 2010, 04:01:03 AM
1) You have to load the value into a register. Once you have it in a register you can add, subtract, compare the value

2) Once the value is in a register, you can compare it against values in other registers (constants), or immediate values. You can test whether the value is equal, different, above, below, etc. and make decisions/branches based on those comparisons.

3) Again, you need to load the address into a register. Then use that address to read the content into another register. This second register holds the character you want to process. If the character is zero (NUL), it means the end of the string, if you encounter this character you stop. Otherwise you add one to the register holding the address, and then read the content from the new address, and so on.

4) You need to look at the Read String SYSCALL function some more. You can specify a buffer, and a length. This way it can read multiple characters into a buffer, zero terminate it, and return. So you type the sentence and hit enter/return. Once you have the whole string in a buffer, you process it the manner covered in 3)
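
The four points above can be sketched in C before attempting them in SPIM. This is a minimal illustration under stated assumptions: `upcase_buffer` is a made-up name, ASCII is assumed, and the pointer `p` plays the role of the address register while `c` plays the role of the register holding the character.

```c
#include <stddef.h>

/* Hypothetical helper: walk a NUL-terminated buffer and upper-case it
   in place. Returns how many bytes were changed. */
size_t upcase_buffer(char *p)
{
    size_t changed = 0;

    while (*p != '\0') {                     /* 3) NUL marks the end of the string */
        unsigned char c = (unsigned char)*p; /* 1) load the value first */

        if (c >= 'a' && c <= 'z') {          /* 2) compare against the range */
            *p = (char)(c - 0x20);           /* store back only when it changed */
            changed++;
        }
        p++;                                 /* 3) add one to the address */
    }
    return changed;
}
```

Because the conversion happens in the buffer itself, characters already handled stay where they are, and the whole string can be printed in one go afterwards. In C the buffer would be filled with something like fgets(buf, sizeof buf, stdin), which stands in here for the Read String syscall in 4).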
Title: Re: Manipulation of strings
Post by: clive on May 19, 2010, 04:49:37 AM
Please type in chars, then ENTER.
The input is converted to upper case.

?-> The Quick Brown Fox Jumped Over The Lazy Dog
The original string is: The Quick Brown Fox Jumped Over The Lazy Dog

The new string is: THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG


# Assumes no Load Delays, and no Delay Slots for branches

        .data

str1:   .ascii "Please type in chars, then ENTER.\n"
        .asciiz "The input is converted to upper case.\n"

askstr: .asciiz "\n?-> "

orgstr: .asciiz "The original string is: "

answstr:.asciiz "The new string is: "

crlf:   .asciiz "\n\n"

strbuf: .asciiz "12345678901234567890123456789012345678901234567890123456789012345678901234567890"

        .text

main:   li      $v0, 4                  # Print String
        la      $a0, str1               # load address str1
        syscall

        li      $v0, 4                  # Print String
        la      $a0, askstr             # load address askstr
        syscall

        li      $v0, 8                  # Read String
        la      $a0, strbuf             # Buffer Address
        li      $a1, 81                 # Characters to read + 1
        syscall

        li      $v0, 4                  # Print String
        la      $a0, orgstr             # Original Message
        syscall

        li      $v0, 4                  # Print String
        la      $a0, strbuf
        syscall

        li      $v0, 4                  # Print String
        la      $a0, crlf
        syscall

        la      $a0, strbuf             # load address of string buffer

loop:

        lbu     $t0, 0($a0)             # Load Byte (unsigned) from [$a0 + 0] to $t0
        beq     $t0, $zero, done        # NUL encountered

        addiu   $t1, $t0, -0x61         # 'a'
        sltu    $t1, $t1, 26            # a-z 26 character span
        beq     $t1, $zero, upper       # $t1 == 0?

        addiu   $t0, $t0, -0x20         # -('a' - 'A')
        sb      $t0, 0($a0)             # Store Byte (only when we change it)

upper:

        addi    $a0, $a0, 1             # $a0++
        b       loop

done:

        li      $v0, 4                  # Print String
        la      $a0, answstr            # Answer Message
        syscall

        li      $v0, 4                  # Print String
        la      $a0, strbuf
        syscall

        li      $v0, 4                  # Print String
        la      $a0, crlf
        syscall

        li      $v0, 10                 # Exit
        syscall
Title: Re: Manipulation of strings
Post by: Herakles on May 19, 2010, 09:04:50 PM
Thank you very much, clive!

Concerning my issues: You hit that nail square on the head! :U

I read your second-to-last post first and intentionally ignored your last post in order to implement your explanations on my own (learning effect). Like mad I made it with branches, and now I do see why you said "avoid branches whenever possible." My code works, but I'll always keep your warning in mind.

BTW: Are you from the USA?
Title: Re: Manipulation of strings
Post by: clive on May 19, 2010, 09:32:32 PM
Technically I'm English, and just live/work in the US.

I dug up my copy of "MIPS RISC ARCHITECTURE, KANE 1998" ISBN 0-13-584749-4
http://www.alibris.com/booksearch?binding=&mtype=&keyword=0-13-584749-4&hs.x=0&hs.y=0&hs=Submit
http://www.amazon.com/MIPS-R2000-RISC-Architecture-Gerry/dp/0135847494/ref=sr_1_1?ie=UTF8&s=books&qid=1274303952&sr=8-1

It's a 1st edition, 4th printing from 1989. This covers mostly the R2000, with a bit on the R3000. The SLTxx instructions look to be pretty safe.
Title: Re: Manipulation of strings
Post by: Herakles on May 20, 2010, 06:18:07 PM
Very interesting books, but as they just want us to learn the basics (the course lasts only three months), I'll stick to the document they gave us. I realized that the most important and difficult aspect is the architectural thinking and not the implementation.

I'm thinking of working in the USA after I finish university. The IT world over there is just amazing and many of the most important IT companies are in the USA. Would you say it's difficult for Europeans to find a job in the USA, especially because of the current economic crisis?


Thanks.

Regards,
Herakles
Title: Re: Manipulation of strings
Post by: clive on May 20, 2010, 06:56:37 PM
Quote from: Herakles
Would you say it's difficult for Europeans to find a job in the USA, especially because of the current economic crisis?

For someone fresh out of college, I would think that it would currently be next to impossible. No offense, but there are probably tens of thousands of better qualified/skilled people looking for jobs who wouldn't require vast amounts of paperwork (permits, visas, etc) to hire tomorrow. This alone would make it very difficult to secure a work permit. Plus there are the equally or less qualified.

The trick would be to have a US company come look for you, as coming to the US to look for work isn't going to fly (even less so now, but historically a quick way to get deported on a visitor visa). You wouldn't be able to get a work visa unless a company wanted to fight for you.

You would probably want to do one of the following

a) Start a very successful open source project
b) Create/build some industry leading technology (hardware, software, security, analysis, forensics)
c) Work for a US multi-national company in your own country (IBM, Cisco, Microsoft, etc)
d) Work for a European multi-national company that has offices in the US
e) Get a Masters/PhD from a US University
f) Have a couple of million dollars to invest, or ideas/investors
g) Be a skilled/professional Goat Herder (used to be on the H1B visa list as I recall)
Title: Re: Manipulation of strings
Post by: dedndave on May 20, 2010, 07:25:47 PM
it's hard for anyone to find a good job here right now
Title: Re: Manipulation of strings
Post by: Herakles on May 20, 2010, 09:39:01 PM
Thanks for the replies.

The current situation must be very difficult for many people living in the USA. I knew that there is a crisis over there, but I didn't know that it is so dramatic.
I don't want to stay forever in the USA, but my aim is to stay there for approximately one year, in order to improve my English skills and also my IT skills. Well, I could also go to England, but it would be nice to live and work in a non-European country.

I also don't expect to find a job that is heaven on earth, but it would be nice if that job could give me as much money as I need to pay for accommodation and food. My ultimate ambition is to improve my English skills and also my IT skills during that year.

Do you think it's easier to find a paid work placement than a job in the USA?

clive, how did you manage to go from England to the USA?
(Sorry, if I am too personal...)
Title: Re: Manipulation of strings
Post by: clive on May 21, 2010, 01:14:37 AM
Probably want to do it for 3 years, because the paperwork might take you months. You could always go back earlier. Back when I did it in 1991 it took about 3 months. I suspect it is a lot worse now. Figure at least 6 months, if immigration/employment agencies don't throw roadblocks.

Not sure I'd want to go back to England, pretty depressing there these days.

I'm a hardware guy that does software (drivers, embedded, tools). I built several computers from scratch in my teens with chips and wire. I was doing IC Design at Philips Semi (NXP), and had been on the technical staff at another company for 7 years that distributed hw/sw in Europe for the company I went to work with in the USA.

These days I work on the software for several GPS SoC devices using ARM, MIPS and Sparc processors. I've done development on a couple of GPS receivers, media players, WiFi and wireless modems. Before that I worked on PC storage peripherals.