Question about Orientals usage of Computers

Started by shankle, August 21, 2009, 02:58:25 AM

shankle

As a US user I have 26 letters in the alphabet. All the characters can be handled in the 256-character EBCDIC code.
How in the world do the Orientals cram thousands of their characters into this EBCDIC code?
The greatest crime in my country is our Congress

dedndave

i don't think you use ebcdic
i think you use ascii
anyways, they "cram" all those characters in by using unicode
each character uses 2 bytes (or possibly more)

mitchi

Did you know that you could print Japanese to the Console if you use UTF8 text?
I actually noticed that when doing CGI web programming. UTF8 strings are not weird and they are all zero terminated, so we can use good old ASCII APIs to print UTF8 characters.

(Of course, you won't see them in the old barebones console of Windows but on the webpage)
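
The zero-termination point can be checked quickly. In UTF-8 every byte of a multi-byte character has its high bit set, so a zero byte can only ever stand for the NUL character itself. The following is a minimal Python sketch, with a sample string chosen purely for illustration:

```python
# Sample text chosen for illustration: ASCII mixed with Japanese.
text = "Hello, \u65e5\u672c\u8a9e"  # "Hello, 日本語"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# UTF-8: ASCII characters keep their single-byte values, and every byte
# of a multi-byte sequence is >= 0x80, so no zero byte appears.
utf8_has_zero = 0 in utf8_bytes

# UTF-16: each ASCII character is padded with a zero byte, which would
# cut the string short in a byte-oriented, zero-terminated API.
utf16_has_zero = 0 in utf16_bytes

print(utf8_bytes)
print("zero byte in UTF-8: ", utf8_has_zero)
print("zero byte in UTF-16:", utf16_has_zero)
```

This is why byte-oriented string routines can pass UTF-8 through untouched, while UTF-16 needs the wide-character versions.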

hutch--

Jack,

Dave is right, with other alphabets and writing systems they generally use UNICODE, which can handle 64K characters in a single 16-bit unit (and more with surrogate pairs), not the 256 in the ASCII/ANSI set. It allows for traditional Chinese, Big 5 Chinese, Japanese Kanji, Arabic, Greek and a whole host of other character sets that cannot easily be handled in the 256 available in ASCII/ANSI.
Download site for MASM32: https://masm32.com
New MASM Forum: https://masm32.com/board/index.php

mitchi

There are various types of UNICODE as well.

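UTF-16 : the NT Kernel operates in UTF-16 (2 bytes per character, or 4 bytes as a surrogate pair for characters outside the 16-bit range). And so do all the Unicode APIs of Windows.
UTF-8  : Most web pages, if not the whole web, operate in UTF-8 (1 to 4 bytes)
UTF-32 : fixed 4 bytes per character; rarely used for files or the web.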
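
To make the byte counts concrete, here is a small Python sketch that encodes a few characters in all three forms. The sample characters and the helper name `encoded_sizes` are my own choices for illustration, not anything from the posts above:

```python
# Sample characters chosen for illustration (written as escapes so the
# exact code points are unambiguous).
samples = {
    "A": "Latin letter",
    "\u00e9": "accented Latin letter",
    "\u65e5": "CJK ideograph",
    "\U0001F600": "emoji outside the 16-bit range",
}

def encoded_sizes(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),  # -le avoids the byte-order-mark prefix
        len(ch.encode("utf-32-le")),
    )

for ch, label in samples.items():
    u8, u16, u32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X} {label}: UTF-8={u8} UTF-16={u16} UTF-32={u32}")
```

Plain English text stays at one byte per character in UTF-8, so only the non-ASCII strings in a program grow.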

shankle

Thanks for the explanation guys.

As an aside, I would think that would increase the size of programs by 2 to 4 times.

So on the Puter screen for English I can get 80 characters on one line and 40 or 20 in Chinese.
Is that a correct assumption?
Just curious.


dedndave

normally, displayed text is only a small part of a program, so it doesn't affect the size that much
but - no - they get 80 chars on a line in a normal console
windows loads what is called a "code page" that translates characters for the display screen
it doesn't behave like the traditional DOS screens of old
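
A rough way to see what a code page does: the same byte value maps to a different character depending on which table is loaded. This Python sketch uses Python's own codec names (`cp1252`, `cp437`, `cp1251`) as stand-ins for the Windows and DOS code pages; the byte value 0xE9 is picked only as an example:

```python
# One raw byte value, as it might sit in a file or a text buffer.
raw = bytes([0xE9])

# Decode the same byte under three different code pages.
as_cp1252 = raw.decode("cp1252")  # Western European Windows
as_cp437 = raw.decode("cp437")    # original IBM PC / DOS
as_cp1251 = raw.decode("cp1251")  # Cyrillic Windows

print(as_cp1252, as_cp437, as_cp1251)
```

The byte never changes; only the lookup table used to interpret it does.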

tenkey

There are also traditional character encodings, such as JIS and Shift-JIS for Japanese, and Big 5 for traditional Chinese.

Don't know the details.
A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8
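
For anyone curious about those legacy encodings, a short Python sketch shows them at work. It relies on Python's built-in `shift_jis` and `big5` codecs, and the sample words are chosen only for illustration:

```python
# Sample text chosen for illustration.
japanese = "\u65e5\u672c\u8a9e"   # 日本語
chinese = "\u4e2d\u6587"          # 中文

sjis = japanese.encode("shift_jis")
big5 = chinese.encode("big5")

# Each of these characters takes two bytes in its legacy encoding.
print(sjis.hex(" "), len(sjis))
print(big5.hex(" "), len(big5))

# The bytes only make sense under the matching encoding: decoding
# Shift-JIS bytes as Big5 gives different text or fails outright.
try:
    wrong = sjis.decode("big5")
except UnicodeDecodeError:
    wrong = None
print(wrong)
```

Unlike Unicode, each of these schemes covers one language family, so a file has to be read with the encoding it was written in.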

shankle

Ok, I can see how each character is coded as far as zeros and ones, but how do you put
3,000 or so characters on a keyboard???

dedndave

simple - they have small hands and can type twice as fast ? - lol

Tedd

Quote from: shankle on August 28, 2009, 08:04:41 PM
Ok, I can see how each character is coded as far as zeros and ones, but how do you put
3,000 or so characters on a keyboard???

Obviously, you don't
There's a common subset that you can get by with, which reduces it to maybe 200.
Then, for actually inputting symbols, you essentially spell them out (since each one represents a word/notion) and as you type they're auto-completed so you can pick out the one you want.
No snowflake in an avalanche feels responsible.

shankle

Thanks for the response guys.
Seems to me there are 2 solutions for these very difficult languages.

1: learn English
2: Get a Scottie Puter (like in Star Trek)

I would wonder how they ever get anything done on a Puter with such
difficult languages.

dedndave

many of them know enough english to get past a command prompt - lol
windows "codepages" help them a lot, i am sure

tenkey

Quote from: shankle on August 28, 2009, 08:04:41 PM
Ok, I can see how each character is coded as far as zeros and ones, but how do you put
3,000 or so characters on a keyboard???

Multiple key combinations for the most common stuff. As I don't do Chinese, I don't know the logic of the key combinations.

In the case of Japanese, IME uses "Romanized" notation. Enter the word using Latin-1 characters. It will convert to Hiragana, the preferred phonetic notation. Sometimes it will attempt to convert to Kanji. At any time after conversion, you can hit the space bar to select alternate notations. It's similar to the way the Japanese wapuro (word processors) worked.
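
The romanized input step can be sketched in a few lines of Python. This is a toy, not how any real IME is implemented: the table `ROMAJI_TO_HIRAGANA` covers only a handful of syllables, and the `CANDIDATES` list is a made-up stand-in for the dictionary lookup an IME performs when you hit the space bar.

```python
# Tiny, hypothetical romaji table; a real IME covers the full syllabary
# and then offers kanji candidates from a dictionary.
ROMAJI_TO_HIRAGANA = {
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "ni": "に", "ho": "ほ", "n": "ん",
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
}

def to_hiragana(romaji):
    """Greedy longest-match conversion; unknown letters pass through."""
    out = []
    i = 0
    while i < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            # No table entry matched: keep the letter as typed.
            out.append(romaji[i])
            i += 1
    return "".join(out)

# Hypothetical candidate list an IME might show after the space bar.
CANDIDATES = {"にほん": ["日本", "二本"]}

kana = to_hiragana("nihon")
print(kana, CANDIDATES.get(kana, []))
```

The keyboard only ever needs the Latin letters; the conversion and the candidate list do the rest.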

Farabi

I bet that in every language there are fewer than 4 Gig (2^32) words used in daily conversation. If only we could make a word database where each unique word is represented by a dword, it would decrease the size of the text file.

Maybe
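
The dword-per-word idea is easy to try out. Below is a minimal Python sketch under my own assumptions: words are split on whitespace, IDs are handed out in order of first appearance, and the function names `build_vocab`, `encode` and `decode` are invented for this example.

```python
import struct

def build_vocab(words):
    """Assign each distinct word a 32-bit ID in order of first appearance."""
    vocab = {}
    for w in words:
        if w not in vocab:
            vocab[w] = len(vocab)
    return vocab

def encode(words, vocab):
    """Pack each word as one little-endian unsigned dword (4 bytes)."""
    return struct.pack(f"<{len(words)}I", *(vocab[w] for w in words))

def decode(blob, vocab):
    """Invert encode() by looking IDs back up in the vocabulary."""
    by_id = {i: w for w, i in vocab.items()}
    ids = struct.unpack(f"<{len(blob) // 4}I", blob)
    return [by_id[i] for i in ids]

text = "the quick brown fox jumps over the lazy dog the end"
words = text.split()
vocab = build_vocab(words)
blob = encode(words, vocab)

print(len(text.encode("utf-8")), "bytes as text,", len(blob), "bytes as dwords")
```

Two caveats apply: a word of three letters plus its space already fits in 4 bytes, so the gain comes only from longer words, and the word database itself has to be stored or shared somewhere.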
Those who had universe knowledges can control the world by a micro processor.
http://www.wix.com/farabio/firstpage

"Etos siperi elegi"