The MASM Forum Archive 2004 to 2012

Miscellaneous Forums => The Orphanage => Topic started by: shankle on August 21, 2009, 02:58:25 AM

Title: Question about Orientals usage of Computers
Post by: shankle on August 21, 2009, 02:58:25 AM
As a US user I have 26 letters in the alphabet. All the characters can be handled in the 256-character EBCDIC code.
How in the world do the Orientals cram thousands of their characters into this EBCDIC code?
Title: Re: Question about Orientals usage of Computers
Post by: dedndave on August 21, 2009, 03:11:54 AM
i don't think you use ebcdic
i think you use ascii
anyways, they "cram" all those characters in by using unicode
each character uses 2 bytes (or possibly more)
Title: Re: Question about Orientals usage of Computers
Post by: mitchi on August 21, 2009, 02:17:40 PM
Did you know that you could print Japanese to the Console if you use UTF-8 text?
I actually noticed that when doing CGI web programming. UTF-8 strings are not weird: they contain no embedded zero bytes and are zero-terminated like any C string, so we can use good old ASCII APIs to print UTF-8 characters  :green

(Of course, you won't see them in the old barebones console of Windows but on the webpage)
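
A minimal C sketch of that point, assuming the input is valid UTF-8: byte-oriented functions such as strlen or printf("%s") pass the bytes through unchanged, but they count bytes, not characters. The helper name utf8_count is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Count the code points in a zero-terminated UTF-8 string.
   Continuation bytes have the form 10xxxxxx, so every byte that does
   NOT match that pattern starts a new character.
   Assumption: the input is valid UTF-8 (no validation is done here). */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the five hiragana of "konnichiwa", the string occupies 15 bytes (three per character), while utf8_count reports 5.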
Title: Re: Question about Orientals usage of Computers
Post by: hutch-- on August 21, 2009, 02:18:57 PM
Jack,

Dave is right, with other alphabets and writing systems they generally use UNICODE. Its original 16-bit form (UCS-2) can handle 64k characters, not the 256 in the ASCII/ANSI set, and the current standard has room for 1,114,112 code points. It allows for traditional Chinese (the repertoire of the older Big 5 encoding), Japanese kanji, Arabic, Greek and a whole host of other character sets that cannot easily be handled in the 256 available in ASCII/ANSI.
Title: Re: Question about Orientals usage of Computers
Post by: mitchi on August 21, 2009, 02:28:54 PM
There are various types of UNICODE as well.

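UTF-16 : the NT Kernel operates in UTF-16 (2 bytes for most characters, 4 bytes for those beyond the first 64k, which use a surrogate pair). And so do all the Unicode APIs of Windows.
UTF-8   :  Most of the web, if not all of it, operates in UTF-8 (1 to 4 bytes per character)
UTF-32 :  Fixed 4 bytes per character. Rarely used for files or interchange; it mostly shows up as an in-memory form (for example the 4-byte wchar_t on Linux).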
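
To make the size differences concrete, here is a minimal C sketch that encodes one Unicode code point into UTF-8 and into UTF-16. The function names utf8_encode and utf16_encode are made up for this illustration; they are not part of any Windows or C library API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode code point cp as UTF-8 into out.
   Returns the number of bytes written (1 to 4), or 0 if cp is not a
   valid Unicode scalar value. */
int utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode range */
}

/* Encode code point cp as UTF-16 into out.
   Returns the number of 16-bit units written (1 or 2), or 0 if cp is
   not a valid Unicode scalar value. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

With these, the kanji U+6F22 takes 3 bytes in UTF-8 (E6 BC A2) and one 16-bit unit in UTF-16, while U+20000, an ideograph outside the first 64k, takes 4 bytes in UTF-8 and a surrogate pair (D840 DC00) in UTF-16.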
Title: Re: Question about Orientals usage of Computers
Post by: shankle on August 21, 2009, 04:03:54 PM
Thanks for the explanation guys.

As an aside, I would think that would increase the size of programs by 2 to 4 times.

So on the Puter screen for English I can get 80 characters on one line and 40 or 20 in Chinese.
Is that a correct assumption?
Just curious.

Title: Re: Question about Orientals usage of Computers
Post by: dedndave on August 21, 2009, 04:38:28 PM
normally, displayed text is only a small part of a program, so it doesn't affect the size that much
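but - no - the byte count doesn't set the line width: a normal console still has 80 character cells on a line, although full-width east asian characters usually take two cells each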
windows loads what is called a "code page" that translates characters for the display screen
it doesn't behave like the traditional DOS screens of old
Title: Re: Question about Orientals usage of Computers
Post by: tenkey on August 24, 2009, 04:52:40 PM
There are also traditional character encodings, such as JIS and Shift-JIS for Japanese, and Big 5 for traditional Chinese.

Don't know the details.
Title: Re: Question about Orientals usage of Computers
Post by: shankle on August 28, 2009, 08:04:41 PM
Ok, I can see how each character is coded as far as zeros and ones but how do you put
3,000 or so characters on a keyboard???
Title: Re: Question about Orientals usage of Computers
Post by: dedndave on August 28, 2009, 08:26:29 PM
simple - they have small hands and can type twice as fast ? - lol
Title: Re: Question about Orientals usage of Computers
Post by: Tedd on August 29, 2009, 11:43:49 AM
Quote from: shankle on August 28, 2009, 08:04:41 PM
Ok, I can see how each character is coded as far as zeros and ones but how do you put
3,000 or so characters on a keyboard???

Obviously, you don't.
There's a common subset that you can get by with, which reduces it to maybe 200.
Then, for actually inputting symbols, you essentially spell them out (since each one represents a word/notion) and as you type they're auto-completed so you can pick out the one you want.
Title: Re: Question about Orientals usage of Computers
Post by: shankle on August 29, 2009, 12:38:23 PM
Thanks for the response guys.
Seems to me there are 2 solutions for these very difficult languages.

1: learn English
2: Get a Scottie Puter (like in Star Trek)

I would wonder how they ever get anything done on a Puter with such
difficult languages.
Title: Re: Question about Orientals usage of Computers
Post by: dedndave on August 29, 2009, 03:44:55 PM
many of them know enough english to get past a command prompt - lol
windows "codepages" help them a lot, i am sure
Title: Re: Question about Orientals usage of Computers
Post by: tenkey on August 29, 2009, 06:44:34 PM
Quote from: shankle on August 28, 2009, 08:04:41 PM
Ok, I can see how each character is coded as far as zeros and ones but how do you put
3,000 or so characters on a keyboard???

Multiple key combinations for the most common stuff. As I don't do Chinese, I don't know the logic of the key combinations.

In the case of Japanese, IME uses "Romanized" notation. Enter the word using Latin-1 characters. It will convert to Hiragana, the preferred phonetic notation. Sometimes it will attempt to convert to Kanji. At any time after conversion, you can hit the space bar to select alternate notations. It's similar to the way the Japanese wapuro (word processors) worked.
Title: Re: Question about Orientals usage of Computers
Post by: Farabi on October 11, 2009, 03:04:23 AM
I bet that in every language fewer than 4 Gig (2^32) distinct words are used in daily conversation. If only we could make a word database where each unique word is represented by a dword. It would decrease the size of the text file.

Maybe  :green
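
A minimal C sketch of that idea, with made-up names (WordTable, word_id) and a deliberately tiny fixed-size table searched linearly. It only shows the word-to-dword mapping. Whether the file actually shrinks depends on the text, since a dword costs 4 bytes even for a word shorter than that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_WORDS    1024   /* illustrative capacity, not a real limit */
#define MAX_WORD_LEN 31

typedef struct {
    char     words[MAX_WORDS][MAX_WORD_LEN + 1];
    uint32_t count;         /* number of words stored so far */
} WordTable;

/* Return the dword ID of word, adding it to the table if it has not
   been seen before. IDs are assigned in order of first appearance.
   Returns UINT32_MAX if the table is full or the word is too long. */
uint32_t word_id(WordTable *t, const char *word)
{
    uint32_t i;

    assert(t != NULL && word != NULL);
    for (i = 0; i < t->count; ++i) {
        if (strcmp(t->words[i], word) == 0)
            return i;       /* already known: reuse its ID */
    }
    if (t->count == MAX_WORDS || strlen(word) > MAX_WORD_LEN)
        return UINT32_MAX;
    strcpy(t->words[t->count], word);
    return t->count++;
}
```

Encoding "the cat saw the dog" word by word with an empty table would give the ID sequence 0, 1, 2, 0, 3. The table itself has to be stored or shared as well, which eats into any saving.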
Title: Re: Question about Orientals usage of Computers
Post by: hutch-- on October 11, 2009, 10:40:16 AM
 :bg

こんばんは。漢字タイピング平易。

Good evening. Kanji typing is easy.

Load all of the East Asian fonts, get an IME and the biggest collection of dictionaries you can find, and start typing. I found a program called WAKAN 1.76 that seems to work fine. Word order is SUBJECT, OBJECT, VERB [particle]. Particles will take some time to get the swing of. I have it set up so that typing romaji produces phonetic hiragana text; watch the word to see if you got it right, then press the space bar to insert the kanji character.
Title: Re: Question about Orientals usage of Computers
Post by: ecube on October 19, 2009, 08:36:35 PM
this stuff is confusing, but the UTF-8 looks interesting, you have any c/asm/delphi etc functions that can convert em, or play with utf-8
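
As one possible starting point for playing with UTF-8 in C, here is a minimal sketch of a decoder that reads a single code point from a byte buffer. The name utf8_decode is made up for this illustration. On Windows, the existing MultiByteToWideChar function with the CP_UTF8 code page converts whole UTF-8 strings to UTF-16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from the first bytes of s (len bytes available).
   Stores the result in *cp and returns the number of bytes consumed
   (1 to 4), or 0 if the input is truncated or malformed. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t   need, i;
    uint32_t v, min;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                       /* plain ASCII byte */
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0)      { need = 2; v = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; v = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; v = s[0] & 0x07; min = 0x10000; }
    else
        return 0;                            /* stray continuation or bad lead byte */
    if (len < need)
        return 0;                            /* sequence cut short */
    for (i = 1; i < need; ++i) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                        /* not a continuation byte */
        v = (v << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (v < min || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
        return 0;
    *cp = v;
    return need;
}
```

Calling it in a loop and advancing by the returned count walks a string one character at a time. For the bytes E6 BC A2 it yields U+6F22 and consumes 3 bytes.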