Hex to ASCII

Started by ChillyWilly, September 13, 2008, 07:01:05 AM

ChillyWilly

I have a string of two words in hex that I need to convert to ASCII. The only problem is that there are NULL chars in the string between each letter, and 3 NULL chars to identify the end of each word.

for example:
68 00 65 00 6c 00 6c 00 6f 00 00 00 77 00 6f 00 72 00 6c 00 64 00 00 00
would translate to:
hello world

Is there a quick, streamlined function to convert the string, return each word to its own destination buffer, and replace the 3 NULLs with a terminator for each string?

BlackVortex

There are many solutions to this. Maybe you should use the szappend macro to append them one by one in a temporary buffer.

EDIT: Errr, I didn't read your post very well, I believe. Those look like 2 unicode zstrings. There are unicode functions in masmlib.

ChillyWilly

the problem i run into is that the nulls terminate the string when trying to convert
so instead of returning 'hello' it just returns 'h'

BlackVortex

Quote from: ChillyWilly on September 13, 2008, 08:43:19 AM
the problem i run into is that the nulls terminate the string when trying to convert
so instead of returning 'hello' it just returns 'h'
Use the unicode functions, not the asciiz functions. These are 2 unicode strings.

ChillyWilly

is there a function in masm that returns the string?
i dont know which unicode function will return them to ascii
and separate both words


in delphi i would do this


function ConvertDataToAscii(Buffer: pointer; Length: Word): string;
var
  Iterator: integer;
  AsciiBuffer: string;
begin
  AsciiBuffer := '';
  for Iterator := 0 to Length - 1 do
  begin
    if char(pointer(integer(Buffer) + Iterator)^) in [#32..#127] then
      AsciiBuffer := AsciiBuffer + ' ' + char(pointer(integer(Buffer) + Iterator)^) + ' '
    else
      AsciiBuffer := AsciiBuffer + ' . ';
  end;
  Result := AsciiBuffer;
end;

hutch--

chilly,

You need to know what the text format is and what you want it to end up as. Normal unicode does not use 00 00 as a word separator, so this looks like a unicode list that is zero separated and zero terminated, which makes it difficult to process because you don't know where the text ends.

If you could get it to be consistent, it's an easy enough algorithm to write. The form that would be useful is like the common dialog box path string pairs, which are zero separated and double zero terminated.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

ChillyWilly

its a string in the registry

ChillyWilly

invoke WideCharToMultiByte, CP_ACP, 0, addr Source, -1, addr Destination, ecx, NULL, NULL

only returns the first string "hello"
is there a way to separate the two unicode strings from the string before calling WideCharToMultiByte

BlackVortex

Quote from: ChillyWilly on September 13, 2008, 05:48:52 PM
invoke WideCharToMultiByte, CP_ACP, 0, addr Source, -1, addr Destination, ecx, NULL, NULL

only returns the first string "hello"
is there a way to separate the two unicode strings from the string before calling WideCharToMultiByte
Add the length of the first string, terminator included, to the source address (in bytes that is (length + 1) * 2), then call WideCharToMultiByte again.
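A minimal sketch of that pointer step in portable C, in case it helps to see the arithmetic. The names `wide_len` and `next_wide` are made up for illustration, and `uint16_t` stands in for the Windows `WCHAR` type so the snippet compiles without windows.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of a zero-terminated UTF-16 string in 16-bit units,
   not counting the 00 00 terminator. */
static size_t wide_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Address of the string that follows s in the same buffer:
   skip the characters of s plus its one terminator unit.
   The caller must know that something actually follows. */
static const uint16_t *next_wide(const uint16_t *s)
{
    return s + wide_len(s) + 1;
}
```

For the data in this thread, `next_wide` applied to the start of the buffer lands on the 77 00 of "world", which is the address to pass as the source of the second conversion call.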

RuiLoureiro

ChillyWilly,

Quote
I have a string of two words in hex that i need to convert to ascii
the only problem is that there is NULL chars in the string between each letter

and 3 NULL chars to identify the end of each word
   ( ? )

for example:
68 00   65 00   6c 00   6c 00   6f 00   00 00    -> means hello  (end in 00 00)
77 00   6f 00   72 00   6c 00   64 00   00 00   ->     "     world  (end in 00 00)

;-----------------------------------------------------------------------
1st: each char is 1 word: e.g. 68 00 is the char "h". Yes?
     So each text word (e.g. "hello") ends in one word 00 00. Yes?

     This is the way we can see your problem.

2nd: now, you want to convert each word (e.g. 68 00) to a byte
     and return each text word to its own destination buffer.
     So we should have 2 destination buffers.

You can do this with the code below. Each buffer ends with 00.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
_unicode    db 68h, 00, 65h, 00, 6ch, 00, 6ch, 00, 6fh, 00,    00, 00   ; text word 1
            db 77h, 00, 6fh, 00, 72h, 00, 6ch, 00, 64h, 00,    00, 00   ; text word 2
           
_ascii1     db 120 dup (?)          ; buffer for text word 1
_ascii2     db 120 dup (?)          ; buffer for text word 2
.code
; ««««««««««««««««««««««««««««««««««««««««««««««««««««««
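        mov     edi, offset _ascii1         ; buffer for text word 1
        mov     esi, offset _unicode        ; source: two zero-terminated unicode strings
        ;
@@:     movzx   eax, word ptr [esi]         ; read one 16-bit character
        mov     byte ptr [edi], al          ; store its low byte (the terminator is copied too)

        add     esi, 2                      ; next unicode character
        inc     edi                         ; next ascii byte

        cmp     ax, 0                       ; was that the 00 00 terminator?
        jne     @B                          ; no: keep copying

        mov     edi, offset _ascii2         ; buffer for text word 2
                                            ; esi already points at text word 2
@@:     movzx   eax, word ptr [esi]         ; same loop again for the second word
        mov     byte ptr [edi], al

        add     esi, 2
        inc     edi

        cmp     ax, 0
        jne     @B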
;»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»
Good luck with your work.

RuiLoureiro

PBrennick

Alright, it is my turn for the stupid question of the day.

Why are the zeroes in there at all? I use the method described by Hutch in my tables. Using his method, it is a simple matter to step around each zero and test for another zero. If there is none, decode another word, concatenate it to the string you are building, and so on. You could also replace the zeroes with spaces if you are truly building a string.

--Paul
The GeneSys Project is available from:
The Repository or My crappy website

hoverlees

ChillyWilly,
This is a unicode string array. The first string is "hello" and the second string is "world".
If you pass WideCharToMultiByte the character count of the whole buffer (instead of -1, which stops after the first terminator), the destination buffer looks like this:
'hello',0,'world',0  the hex is  68 65 6c 6c 6f 00 77 6f 72 6c 64 00

So, read as a zero-terminated string, the destination buffer is only "hello".
You can access "world" like this:

lea esi,destination
invoke lstrlen,addr destination
add esi,eax
inc esi
;esi now points to "world"
invoke MessageBox,0,esi,0,0
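The same skip can be sketched in portable C, assuming the whole buffer (both terminators included) was converted in one go. `narrow_units` is a hypothetical stand-in for a WideCharToMultiByte call with an explicit character count, not the API itself, and it simply replaces anything above 0x7F with '?'.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Narrow `count` UTF-16 units to single bytes. Embedded 00 00
   terminators are copied as 00, so the string boundaries survive.
   Units above 0x7F have no ASCII form and become '?'. */
static void narrow_units(const uint16_t *src, size_t count, char *dst)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = (src[i] < 0x80) ? (char)src[i] : '?';
}

/* Second string in a narrow buffer laid out as "first",0,"second",0:
   skip the first string and its terminator. */
static const char *second_string(const char *dst)
{
    return dst + strlen(dst) + 1;
}
```

With the 12 units from this thread as input, the destination holds 'hello',0,'world',0 and `second_string` returns the address of the 'w'. It only makes sense to call it when a second string is known to exist.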


hutch--

This technique when applied to the registry is usually used to hide other data behind the first pair of zeros. Now for someone who has a valid reason to do this, it will not be done with a standard API or simple reusable algo, it needs to be written to read a zero separated sequence while having some method of determining where the end of the string lies.

It's a problem of this type, using pseudo ANSI notation.


"string1",0,"hidden string2",0,"hidden string3",0 etc ....


If you know for certain that it is written like common dialog path string pairs, you read the single zero as a separator and terminate the read at a pair of zeros. But if this registry entry is non-standard, which appears to be the case, the person reading the data will need to know how many items there are, because they are zero separated but not double zero terminated.
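For the well-formed case (zero separated, double zero terminated), the read loop can be sketched in portable C as below. `count_strings` is a made-up name, `uint16_t` stands in for a unicode character, and the code assumes the double terminator really is there; on data without it the loop would run off the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the strings in a list laid out as
   "string1",0,"string2",0,...,0 where an empty string
   (a zero right after a separator) marks the end of the list. */
static size_t count_strings(const uint16_t *list)
{
    size_t count = 0;
    while (*list != 0) {        /* a zero here means the list is finished */
        while (*list != 0)      /* walk to the end of the current string */
            list++;
        list++;                 /* step over its separator */
        count++;
    }
    return count;
}
```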

ChillyWilly

there are two items in the registry entry. if i convert from unicode to hex
the string looks like this:

"68 00 65 00 6c 00 6c 00 6f 00 00 00 77 00 6f 00 72 00 6c 00 64 00 00 00"

where the first word 'hello' is 68 00 65 00 6c 00 6c 00 6f 00 00 00
so the string ends with  00 00 00

the second word is 77 00 6f 00 72 00 6c 00 64 00 00 00
and also ends with 00 00 00

is there some way to separate the two before calling WideCharToMultiByte
and also to check if there is indeed a second string
because sometimes it does not have one


sinsi

Quote from: hutch-- on September 14, 2008, 03:36:56 AM
This technique when applied to the registry is usually used to hide other data behind the first pair of zeros.
If it ends with a double null (00 00 00 00) then it could be a REG_MULTI_SZ.

Quote from: ChillyWilly on September 14, 2008, 04:03:23 AM
is there some way to separate the two before calling WideCharToMultiByte
and also to check if there is indeed a second string
because sometimes it does not have one
If you are using RegQueryValueEx then you get the length of the data returned in lpcbData. The only way to see if there's more than one string is to step through each one until you reach the end.
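A rough sketch of that stepping in portable C, bounded by the byte count so it never reads past the data. Everything named here (`split_words`, `MAX_WORD`) is hypothetical, `cb` is assumed to be the byte length reported through lpcbData, and characters above 0x7F are replaced by '?' rather than converted properly.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_WORD 120   /* same size as the 120-byte buffers used earlier in the thread */

/* Split `cb` bytes of little-endian UTF-16 into up to `max_out`
   zero-terminated ASCII strings. Returns how many strings were found,
   so a value with only one word returns 1. */
static size_t split_words(const uint8_t *data, size_t cb,
                          char out[][MAX_WORD], size_t max_out)
{
    size_t found = 0;   /* strings completed so far */
    size_t len = 0;     /* characters in the current string */

    if (max_out == 0)
        return 0;

    for (size_t i = 0; i + 1 < cb; i += 2) {
        unsigned unit = data[i] | ((unsigned)data[i + 1] << 8);
        if (unit == 0) {
            if (len == 0)
                break;                  /* empty string: end of the list */
            out[found][len] = '\0';
            found++;
            len = 0;
            if (found == max_out)
                break;                  /* no more destination buffers */
        } else if (len < MAX_WORD - 1) {
            out[found][len++] = (unit < 0x80) ? (char)unit : '?';
        }
    }
    if (len > 0) {                      /* data ended without a terminator */
        out[found][len] = '\0';
        found++;
    }
    return found;
}
```

Working from raw bytes avoids any alignment assumptions about the registry buffer. The return value answers the "is there a second string" question directly, because the caller just checks whether it is 1 or 2.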
Light travels faster than sound, that's why some people seem bright until you hear them.