G'day everyone,
Well it seems I'm not as intelligent as I once thought. I really need help.
The code below is from the Unicode Consortium and converts UTF-32 to UTF-16. It's one of six similar routines. I figure if I can get help with the first, I should be able to wriggle through the rest without losing too much skin.
sourceStart is a pointer to a pointer to an array of unsigned 32-bit values.
sourceEnd is a pointer to the end of the same array.
targetStart is a pointer to a pointer to an array of unsigned 16-bit values.
targetEnd is a pointer to the end of the same array.
flags is an enum: either strictConversion or lenientConversion.
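To check my own understanding, I believe a call from C ends up looking something like this (the buffer names and values are just mine for illustration):

UTF32 src[] = { 0x41, 0x10437 };   /* 'A' plus one supplementary character */
UTF16 dst[8];
const UTF32 *srcPtr = src;         /* a pointer to the source array...     */
UTF16 *dstPtr = dst;               /* ...and one to the target array       */
ConversionResult res = ConvertUTF32toUTF16(
    &srcPtr, src + 2,              /* pass the ADDRESSES of those pointers */
    &dstPtr, dst + 8, strictConversion);
/* on return, srcPtr and dstPtr have been advanced past whatever got converted */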
This stuff is from the header file and the main source:
typedef unsigned long UTF32; /* at least 32 bits */
typedef unsigned short UTF16; /* at least 16 bits */
typedef unsigned char UTF8; /* typically 8 bits */
typedef unsigned char Boolean; /* 0 or 1 */
/* Some fundamental constants */
#define UNI_REPLACEMENT_CHAR (UTF32)0x0000FFFD
#define UNI_MAX_BMP (UTF32)0x0000FFFF
#define UNI_MAX_UTF16 (UTF32)0x0010FFFF
#define UNI_MAX_UTF32 (UTF32)0x7FFFFFFF
#define UNI_MAX_LEGAL_UTF32 (UTF32)0x0010FFFF
typedef enum {
    conversionOK,    /* conversion successful */
    sourceExhausted, /* partial character in source, but hit end */
    targetExhausted, /* insuff. room in target for conversion */
    sourceIllegal    /* source sequence is illegal/malformed */
} ConversionResult;
typedef enum {
    strictConversion = 0,
    lenientConversion
} ConversionFlags;
static const int halfShift = 10; /* used for shifting by 10 bits */
static const UTF32 halfBase = 0x0010000UL;
static const UTF32 halfMask = 0x3FFUL;
#define UNI_SUR_HIGH_START (UTF32)0xD800
#define UNI_SUR_HIGH_END (UTF32)0xDBFF
#define UNI_SUR_LOW_START (UTF32)0xDC00
#define UNI_SUR_LOW_END (UTF32)0xDFFF
#define false 0
#define true 1
My biggest hassle at the moment is how to handle the incoming pointer-to-pointer-to-array parameters.
Any ideas?
Kind regards,
Bruce.
The actual routine:
ConversionResult ConvertUTF32toUTF16 (
    const UTF32** sourceStart, const UTF32* sourceEnd,
    UTF16** targetStart, UTF16* targetEnd, ConversionFlags flags) {
    ConversionResult result = conversionOK;
    const UTF32* source = *sourceStart;
    UTF16* target = *targetStart;
    while (source < sourceEnd) {
        UTF32 ch;
        if (target >= targetEnd) {
            result = targetExhausted; break;
        }
        ch = *source++;
        if (ch <= UNI_MAX_BMP) { /* Target is a character <= 0xFFFF */
            /* UTF-16 surrogate values are illegal in UTF-32; 0xffff or 0xfffe are both reserved values */
            if (ch >= UNI_SUR_HIGH_START && ch <= UNI_SUR_LOW_END) {
                if (flags == strictConversion) {
                    --source; /* return to the illegal value itself */
                    result = sourceIllegal;
                    break;
                } else {
                    *target++ = UNI_REPLACEMENT_CHAR;
                }
            } else {
                *target++ = (UTF16)ch; /* normal case */
            }
        } else if (ch > UNI_MAX_LEGAL_UTF32) {
            if (flags == strictConversion) {
                result = sourceIllegal;
            } else {
                *target++ = UNI_REPLACEMENT_CHAR;
            }
        } else {
            /* target is a character in range 0xFFFF - 0x10FFFF. */
            if (target + 1 >= targetEnd) {
                --source; /* Back up source pointer! */
                result = targetExhausted; break;
            }
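            /* e.g. U+10437: 0x10437 - 0x10000 = 0x0437;
               high = 0xD800 + (0x0437 >> 10) = 0xD801,
               low  = 0xDC00 + (0x0437 & 0x3FF) = 0xDC37 */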
            ch -= halfBase;
            *target++ = (UTF16)((ch >> halfShift) + UNI_SUR_HIGH_START);
            *target++ = (UTF16)((ch & halfMask) + UNI_SUR_LOW_START);
        }
    }
    *sourceStart = source;
    *targetStart = target;
    return result;
}
Take UTF32** sourceStart as the example:
mov eax,sourceStart ; sourceStart
mov ecx,[eax] ; *sourceStart
mov eax,[ecx] ; **sourceStart
You could do it all through eax, but to step to the next element of the array all you need to do is:
add ecx,4 ; UTF32 elements are 4 bytes each
mov eax,[ecx]
...and so on for each element of the array.
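If it helps to see it the other way round, the same thing written back out in C is roughly this (a throwaway helper with my own names, assuming the UTF32 typedef from your post):

/* hypothetical helper, just mirroring the asm above in C */
static UTF32 demo(const UTF32 **sourceStart) {
    const UTF32 *p = *sourceStart;  /* mov ecx,[eax]  - the array pointer       */
    UTF32 ch = *p;                  /* mov eax,[ecx]  - the first element       */
    ch = *++p;                      /* add ecx,4 / mov eax,[ecx] - the next one */
    *sourceStart = p;               /* the routine's final *sourceStart = source;
                                       is just a store back through the double pointer */
    return ch;
}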
That simple??!! Wow. So much for that panic-fest.
Thanks.
Bruce.
It's okay, pointers seem to cause people so many problems, but when you get down to them at a low level they're pretty simple :wink