Unicode problem

Started by WYVERN666, January 07, 2012, 03:53:00 PM

Hello. I can't get Chinese characters to display correctly. Here is my code:

INCLUDE \include\   
INCLUDE \macros\ucmacros.asm

        WSTR title, "標題"        ;window title

start PROC
        INVOKE MessageBoxW, NULL, uni$("消息"), offset title, MB_OK + MB_ICONINFORMATION
start ENDP

END start

The file is saved as UTF-8 without a BOM. It compiles fine, but the characters shown in the message box at runtime aren't correct.


MASM doesn't understand UTF-8, so each byte of a character is interpreted as a separate ASCII character.
Add your strings as a string resource and load them with LoadStringW.
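To make the point above concrete, here is a small Python sketch (not MASM, and not part of anyone's posted code). It assumes the macros simply widen each source byte to a 16-bit unit, which would explain the garbage in the message box. The helper names `widen_bytes`, `utf16_units` and `masm_dw`, and the label `wTitle`, are made up for this illustration. The last helper prints a `dw` line with the real UTF-16 code units, which you could paste into an ASCII-only source file instead of typing the Chinese text.

```python
def widen_bytes(text):
    """What you get if every UTF-8 byte is zero-extended to one 16-bit unit."""
    return "".join(chr(b) for b in text.encode("utf-8"))


def utf16_units(text):
    """The 16-bit code units that the wide-character (W) API functions expect."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def masm_dw(label, text):
    """Format text as a zero-terminated MASM dw line (hex always gets a leading 0)."""
    units = ", ".join("0%04Xh" % u for u in utf16_units(text))
    return "%s dw %s, 0" % (label, units)


if __name__ == "__main__":
    # Two characters become six unrelated ones when the bytes are widened.
    print(len("標題"), "characters ->", len(widen_bytes("標題")), "after widening")
    print(masm_dw("wTitle", "標題"))
    print(masm_dw("wMsg", "消息"))
```

A `dw` list like this does not depend on how the .asm file is saved, but it is unreadable for anything longer than a few characters, so the resource route is still the more practical one.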
As qWord rightly wrote, WSTR and uni$ are not meant to produce Chinese. You may try MasmBasic:

include \masm32\MasmBasic\   ; download
wLet esi=wChr$("This is Unicode:")+wCrLf$+wRes$(1)
   wPrint esi
   wMsgBox 0, esi, "Hello", MB_OK
end start

If your fonts and configuration are ok, i.e. they allow the display of Chinese in principle, the console should show this:

This is Unicode:

If you can't see this, try setting the console font under properties to Lucida Console - but there is no guarantee that it works. In any case you will see a nice MsgBox with Chinese characters. Windows controls work fine, but the console is tricky.

You need this in your resource file:
   1,   "按一下這個按鈕"      ; "Click on this button" in Chinese


The way to display Unicode character sets is to create a UNICODE resource file and place your text in a string table. I use a US English version of Windows and have the East Asian fonts loaded.

        250,  "早上好计算机程序员。\0"
        251,  "おはようのコンピュータのプログラマー。\0"
        252,  "Хороший программист утром.\0"
        253,  "Καλή προγραμματιστής ηλεκτρονικών υπολογιστών πρωί.\0"
        254,  "सुप्रभात कंप्यूटर प्रोग्रामर.\0"
        255,  "Chào buổi sáng lập trình máy tính.\0"
        256,  "დილა მშვიდობისა, კომპიუტერული პროგრამისტი.\0"
        257,  "Добро јутро компјутерски програмер.\0"
        258,  "Բարի լույս ծրագրավորող.\0"
        259,  "안녕하세요 컴퓨터 프로그래머.\0"
Interesting - I copied this from my console, where #259 shows only rectangles. There were also two chars replaced by rectangles in #255 ::)
But then, my console reacts even to changes of text or background colours. Console and Unicode is great fun in Windows :toothy


OK, my attempt with "string tables" failed  :( :

INCLUDE c:\masm32\include\

        lpBuffer db 25 dup(0)        ;some buffer
        winTitle db 'Title',0

start PROC
        invoke GetModuleHandle, NULL   ;really needed?
        invoke LoadStringW, eax, 258, offset lpBuffer, 25

        invoke  MessageBoxW, NULL, offset lpBuffer, offset winTitle, MB_OK + MB_ICONINFORMATION

start ENDP

END start

STRINGTABLE    ;hutch string table example
That is not working; I think I'm missing something here. When I debug it I can't find the "string table" anywhere. Where is it supposed to be stored? The LoadStringW function can't find any string.

By the way, if the macros don't "really" support Unicode, what is their purpose?


Hi. Your string table is a resource; it should be in the RC file, not in the ASM file, and be compiled with rc.exe along with dialogs etc.
Quote from: WYVERN666 on January 08, 2012, 02:37:22 AM
By the way, if the macros don't "really" support Unicode, what is their purpose?
Some parts of the WinAPI only work with Unicode strings (e.g. GDI+).
The problem is that an ASCII character set does not support UNICODE characters. This is why you put the UNICODE text in the RESOURCE file, which CAN be UNICODE. Use a UNICODE editor to create the RC resource file, then compile it with RC.EXE, and you can store and display any character in the UNICODE set. You will of course need to have the right fonts loaded to display them.
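If your editor cannot save Unicode files, a few lines of Python can write the resource script for you. This is only a sketch: the file name `rsrc.rc` and the helper `write_rc_utf16` are assumptions for the example, and the two IDs are sample values. It writes the script as UTF-16 LE with a BOM, which is an encoding RC.EXE reads as Unicode.

```python
import codecs

# Sample string table; the IDs and the file name below are illustrative only.
RC_TEXT = (
    "STRINGTABLE\n"
    "BEGIN\n"
    '    100, "標題"\n'
    '    101, "消息"\n'
    "END\n"
)


def write_rc_utf16(path, text):
    """Write text as UTF-16 LE with a BOM and CRLF line endings."""
    data = text.replace("\n", "\r\n").encode("utf-16-le")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF16_LE + data)


if __name__ == "__main__":
    write_rc_utf16("rsrc.rc", RC_TEXT)
```

The resulting file is then compiled with RC.EXE like any other resource script, and the strings are fetched at runtime with LoadStringW using the same IDs.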
OK, thank you all. Now I have a simple working example:

INCLUDE \include\

        winTitleBuffer dw 3 dup(0)
        winMsgBuffer  dw 3 dup(0)

start PROC
        PUSH   NULL
        CALL    GetModuleHandle    ;is this necessary?
        MOV    edi, eax
        invoke  LoadStringW, edi, 100, offset winTitleBuffer, 3
        invoke  LoadStringW, edi, 101, offset winMsgBuffer, 3

        invoke  MessageBoxW, NULL, offset winMsgBuffer, offset winTitleBuffer, MB_OK + MB_ICONINFORMATION
start ENDP

END start

        100, "標題\0"
        101, "消息\0"

Anyway, I think I'm going to take a look at MasmBasic to make these things easier xD.


@jj2007 I'm using MasmBasic now. When I link your code I get a warning:

MasmBasic.lib(libtmpAC.obj) : warning LNK4078: multiple ".drectve" sections found with different attributes (00000240)


I have seen the LNK4078 warning occasionally, it's harmless. It has to do with obj files created by previous versions of the linker. For example, if your Masm32 installation has been performed with the 6.14 linker, and the MasmBasic *.lib with a newer one (version 9 in this case), the warning may be issued. But it will not affect the code.

Still, I would be curious to find out how to avoid it. @Hutch: Is that the reason why Masm32 builds on the target PC instead of having the lib files in the archive?

MSDN: This warning can be caused by an import library or exports file that was created by a previous version of LINK or LIB


the LIB's are created at installation time
i think the INC files are used to create DEF files - not sure about that
but, it might explain why all the EQUates are in, rather than the individual INC files   :P

poke around for batch files in the masm32 folder   :bg