Float-to-half conversion

Started by c0d1f1ed, January 16, 2008, 10:22:10 PM

c0d1f1ed

Hi All,

I'm trying to write an SSE routine to convert 32-bit floating-point numbers to 16-bit floating-point numbers (a.k.a. the 'half' variable type). I found some C code to convert a single value and I simplified it to the following:


static const unsigned int integer = 0x52000000;

short convert(int x)
{
    short sign = (x >> 16) & 0x8000;

    int absolute = x & 0x7FFFFFFF;

    int X = absolute | (absolute << 10) | (absolute << 13);
    int Y = 0x0F7FE000;
    int Z = absolute + 0xC8000000;
    volatile float F = (float&)absolute * (float&)integer;
    int W = (int)F;

    int result;

    if(absolute >= 0x7F800000)   // NaN
    {
        result = X;
    }
    else if(absolute > 0x477FE000)   // Infinity
    {
        result = Y;
    }
    else if(absolute >= 0x38800000)   // Normal
    {
        result = Z;
    }
    else   // Denormal
    {
        result = W;
    }

    return sign | ((result >> 13) & 0x7FFF);
}


Translating the calculation of X, Y, Z and W to SSE assembly is easy; it's the conditional code I'm having a little trouble with. Since I want to convert four values in parallel, I need to do it by masking things out with AND operations and OR-ing them together again. But since each number falls into one of four categories, the straightforward approach takes a very high number of operations.

So I was wondering if anyone has experience with this sort of thing and can come up with an optimized solution. There might also be other opportunities to optimize float-to-half conversion that I'm not seeing right now.
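For illustration, the straightforward masking approach can be sketched in scalar C as below. This is only a sketch of the idea, not code from the original routine: `to_mask` and `select_by_category` are hypothetical names, and X, Y, Z and W are passed in as already-computed candidates. Each comparison becomes an all-ones or all-zeros mask, the masks are made mutually exclusive, and the four candidates are AND-ed and OR-ed together.

```c
#include <stdint.h>

/* Hypothetical helper: all-ones when cond is non-zero, all-zeros otherwise. */
static uint32_t to_mask(int cond)
{
    return 0u - (uint32_t)(cond != 0);
}

/* Hypothetical helper: pick one of the four candidates without branching.
   'absolute' is the float's bit pattern with the sign bit cleared. */
static uint32_t select_by_category(int32_t absolute,
                                   uint32_t X, uint32_t Y,
                                   uint32_t Z, uint32_t W)
{
    /* Raw comparison masks; these are nested, not exclusive. */
    uint32_t ge_nan  = to_mask(absolute >= 0x7F800000);
    uint32_t gt_max  = to_mask(absolute >  0x477FE000);
    uint32_t ge_norm = to_mask(absolute >= 0x38800000);

    /* Make the four category masks mutually exclusive. */
    uint32_t is_nan    = ge_nan;
    uint32_t is_max    = gt_max  & ~ge_nan;
    uint32_t is_normal = ge_norm & ~gt_max;
    uint32_t is_denorm = ~ge_norm;

    /* Exactly one mask is all-ones, so the OR keeps one candidate. */
    return (X & is_nan) | (Y & is_max) | (Z & is_normal) | (W & is_denorm);
}
```

Counting the comparisons, the mask fix-ups, the four ANDs and the three ORs gives roughly a dozen operations per group of values, which is the cost the question is about.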

Thanks,

Nicolas

raymond

Quote
if(absolute >= 0x7F800000)   // NaN
else if(absolute > 0x477FE000)   // Infinity
else if(absolute >= 0x38800000)   // Normal

I don't know what would be the use of 16-bit floating-point numbers. And I don't know who would have written that C code. However, there seems to be some serious lack of knowledge about the standard 32-bit floating point format (unless it is based on some other float format which would not be supported by most modern computers).

For a start, the value of INFINITY is 0x7F800000. NaNs would have a value greater than that.
Values between 0x477FE000 and 0x7F800000 are valid floats.
Values between 0x800000 and 0x38800000 are normal floats. Only values <0x800000 are denormals.

NOTE: All of the above are for 32-bit floats only.
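The 32-bit boundaries listed above can be written down as a small classifier. This is a minimal sketch in C, assuming the input is the raw bit pattern of an IEEE 754 single; the names `f32_class` and `classify_f32_bits` are made up for illustration, and zero is split out from the denormals as a separate case.

```c
#include <stdint.h>

/* Hypothetical category names for a 32-bit float bit pattern. */
typedef enum {
    F32_ZERO,
    F32_DENORMAL,
    F32_NORMAL,
    F32_INFINITY,
    F32_NAN
} f32_class;

/* Classify a 32-bit float from its raw bits, ignoring the sign. */
static f32_class classify_f32_bits(uint32_t bits)
{
    uint32_t absolute = bits & 0x7FFFFFFFu;   /* clear the sign bit */

    if (absolute > 0x7F800000u)  return F32_NAN;       /* exponent all ones, mantissa != 0 */
    if (absolute == 0x7F800000u) return F32_INFINITY;  /* exponent all ones, mantissa == 0 */
    if (absolute >= 0x00800000u) return F32_NORMAL;    /* exponent field is non-zero */
    if (absolute != 0u)          return F32_DENORMAL;  /* exponent zero, mantissa != 0 */
    return F32_ZERO;
}
```

With this, 0x477FE000 and 0x38800000 both classify as normal 32-bit floats, which matches the description above.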

If you should want more information about the standard floating point format, check some of the Intel documentation or have a quick look at:
http://www.ray.masmcode.com/tutorial/fpuchap2.htm#floats

Raymond
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

c0d1f1ed

Quote from: raymond on January 17, 2008, 02:29:11 AM
I don't know what would be the use of 16-bit floating-point numbers.

It is used extensively in graphics for storing high dynamic range color values.

Quote
And I don't know who would have written that C code.

Industrial Light & Magic.

Also note that AMD added float-to-half and half-to-float conversion instructions to its SSE5 specification.

Quote
However, there seems to be some serious lack of knowledge about the standard 32-bit floating point format (unless it is based on some other float format which would not be supported by most modern computers).

For a start, the value of INFINITY is 0x7F800000. NaNs would have a value greater than that.
Values between 0x477FE000 and 0x7F800000 are valid floats.
Values between 0x800000 and 0x38800000 are normal floats. Only values <0x800000 are denormals.

The comments can be a little misleading. They refer to the category of the output value, not the input value. Also, "Infinity" there actually means the maximum representable half-precision value, while an input in the "NaN" category that has a zero mantissa (a 32-bit infinity) returns an actual half-precision infinity.
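To make that concrete, the thresholds can be read back as ordinary float values. The snippet below is a small sketch in C; `bits_to_float` is a hypothetical helper, not part of the original code, and it assumes `float` is an IEEE 754 single.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a float.
   memcpy avoids the aliasing problems of a pointer or reference cast. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under that assumption, `bits_to_float(0x477FE000)` is 65504.0, the largest finite half-precision value, and `bits_to_float(0x38800000)` is 2^-14, the smallest normal half-precision value. The scale constant `bits_to_float(0x52000000)` is 2^37, so multiplying by it and then shifting right by 13 leaves the input scaled by 2^24, which is the half-precision denormal mantissa.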

Quote
If you should want more information about the standard floating point format...

Don't worry, I know IEEE 754 inside and out. And the above code is correct; I compared its output to the reference for all 2^32 inputs.

All I need a little help with is optimizing the SSE code.

Thanks,

Nicolas

zooba

I knocked up this quasi-SSE2 code which may help you find a direction to go. I haven't tested it, nor can I vouch for its speed. However, I can guarantee that there are no branches involved :D

Some work will be required to make it workable, but you seem to be on top of that side of things. Good luck.

Cheers,

Zooba :U

movaps xmm0, {absolutes x 4}
movaps xmm1, xmm0
movaps xmm2, xmm0
movaps xmm3, xmm0

pcmpgtd xmm1, {7F7FFFFFh x 4}   ; subtract 1 since comparison is greater-than
pcmpgtd xmm2, {477FE000h x 4}
pcmpgtd xmm3, {387FFFFFh x 4}   ; subtract 1 since comparison is greater-than

; Take account of priorities by stripping true flags where a higher
; condition also has a true flag
; (This can be swapped with the block below)
pxor xmm3, xmm2
pxor xmm2, xmm1

; OR all true-conditions together; the lanes left clear in xmm0 are the
; ones where no condition matched (the else case)
; (This can be swapped with the block above)
movaps xmm0, xmm1
por xmm0, xmm2
por xmm0, xmm3

; Mask out conditions, using and-not for the else condition
; (pandn inverts its destination: xmm0 = (NOT xmm0) AND W)
; Only one of xmm0/xmm1/xmm2/xmm3 is non-zero in each lane
pandn xmm0, {W x 4}
pand xmm1, {X x 4}
pand xmm2, {Y x 4}
pand xmm3, {Z x 4}

por xmm0, xmm1
por xmm0, xmm2
por xmm0, xmm3
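For comparison, the same branch-free idea can be sketched with SSE2 intrinsics in C. This is an untested-on-real-workloads sketch, not a tuned routine: the names `float_to_half_x4`, `blend` and `half_of` are made up for illustration. It computes X, Y, Z and W as in the original C code and, instead of stripping the overlapping flags, applies the three nested masks as a chain of selects from lowest to highest priority, so the later select simply overwrites the earlier one. It assumes the default MXCSR state (denormals not flushed to zero).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Per lane: take a where mask is all-ones, b where it is all-zeros. */
static __m128i blend(__m128i mask, __m128i a, __m128i b)
{
    return _mm_or_si128(_mm_and_si128(mask, a), _mm_andnot_si128(mask, b));
}

/* Convert four floats; each 32-bit lane of the result holds one half
   bit pattern in its low 16 bits. */
static __m128i float_to_half_x4(__m128 f)
{
    const __m128i bits     = _mm_castps_si128(f);
    const __m128i absolute = _mm_and_si128(bits, _mm_set1_epi32(0x7FFFFFFF));
    const __m128i sign     = _mm_and_si128(_mm_srli_epi32(bits, 16),
                                           _mm_set1_epi32(0x8000));

    /* The four candidate results, as in the scalar code. */
    const __m128i X = _mm_or_si128(absolute,
                          _mm_or_si128(_mm_slli_epi32(absolute, 10),
                                       _mm_slli_epi32(absolute, 13)));
    const __m128i Y = _mm_set1_epi32(0x0F7FE000);
    const __m128i Z = _mm_add_epi32(absolute, _mm_set1_epi32((int)0xC8000000u));
    const __m128  scale = _mm_castsi128_ps(_mm_set1_epi32(0x52000000)); /* 2^37 */
    const __m128i W = _mm_cvttps_epi32(
                          _mm_mul_ps(_mm_castsi128_ps(absolute), scale));

    /* Nested masks; signed compares are safe because absolute >= 0. */
    const __m128i mX = _mm_cmpgt_epi32(absolute, _mm_set1_epi32(0x7F7FFFFF));
    const __m128i mY = _mm_cmpgt_epi32(absolute, _mm_set1_epi32(0x477FE000));
    const __m128i mZ = _mm_cmpgt_epi32(absolute, _mm_set1_epi32(0x387FFFFF));

    /* Lowest priority first; each later select overrides the earlier one. */
    __m128i result = blend(mZ, Z, W);
    result = blend(mY, Y, result);
    result = blend(mX, X, result);

    result = _mm_and_si128(_mm_srli_epi32(result, 13), _mm_set1_epi32(0x7FFF));
    return _mm_or_si128(sign, result);
}

/* Hypothetical convenience wrapper for spot checks: converts one value. */
static uint16_t half_of(float f)
{
    return (uint16_t)_mm_cvtsi128_si32(float_to_half_x4(_mm_set1_ps(f)));
}
```

As a spot check, `half_of(1.0f)` gives 0x3C00 and `half_of(65504.0f)` gives 0x7BFF. Whether the chained selects beat the XOR-stripping variant above would need measuring on the target CPU.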

u

Insomniac Games needed something similar, though it's CellBE SPE code (at least the SPE is like an SSE-only CPU).

http://www.insomniacgames.com/tech/articles/0807/beware_of_statics.php

raymond

Quote
The comments can be a little misleading.

That's what threw me off. However, with your clarification, my post will at least have brought them out and prevented others from using information which may have been wrongly interpreted.

Good luck with your project.