Please help unpack and PACK the data in WORD / DWORD

Started by BytePtr, December 13, 2009, 12:03:55 AM


BytePtr

Hi all. Haven't been here for a while.

There is a simple program that takes real numbers and saves them to a file.
By real numbers i mean numbers with a decimal point, for example: 1.2, 3.4, 124.9 etc.

For example the input numbers are
1.0, 0.0, 0.0

When it writes them to the file it packs or swaps them somehow, and the result is this:

00 00 40


If i do 40h shr 6 then i get the correct value, 1 (or whatever else is in that place).
Why 6? Because the Bit-Set view has 6 zeroes at the right.

Here is a shot:


When something else is used instead of zeroes at the right, for example:
1.2, 0.0, 0.0


Things get complicated, it saves this like so:
00 CC 4C

Here is a shot:



I'm always watching the Bit-Set binary data in that window.

The problem is that i don't get how it packs the numbers in that case.
Also it flips the bytes: it should be 4C CC, which is 19660 in decimal and 01001100 11001100 in binary, i.e. 2 bytes (a 16-bit WORD).

0.1 is saved like this: 66 06
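
The byte order is easier to follow when each pair of bytes is read as a little-endian 16-bit WORD. A minimal Python sketch, where `read_word_le` is a hypothetical helper name and not part of the compiler tool:

```python
def read_word_le(raw: bytes) -> int:
    """Interpret the first two bytes as an unsigned little-endian 16-bit WORD."""
    return int.from_bytes(raw[:2], byteorder="little")

# Byte pairs as they appear in the file for 1.2 and 0.1
print(hex(read_word_le(bytes([0xCC, 0x4C]))))  # 0x4ccc
print(hex(read_word_le(bytes([0x66, 0x06]))))  # 0x666
```

Read this way, CC 4C is the WORD 4CCCh (19660) and 66 06 is 0666h (1638), so nothing is really flipped: the low byte is simply stored first.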

Assembler gurus here, tell me how the real numbers are saved?

If i could at least pack the real numbers to get the same bits, then unpacking would basically be the reverse of this.
Something is probably ANDed, XORed and maybe ORed and then shifted.

I need to unpack the data and later pack it again.

Any help is greatly appreciated.

dedndave

i don't know if they are using intel standard real format - they may be (it would seem convenient)
i didn't evaluate the numbers you posted to see
but, here is a site that has the info...
http://www.ray.masmcode.com/tutorial/fpuchap2.htm
they look like they might be "real4" values - 4 byte reals

raymond

Quote: they look like they might be "real4" values - 4 byte reals

That is impossible. The ".2" decimal fraction is not an exact multiple of fractional powers of 2 and bits would be set in all 4 consecutive bytes of a REAL4. His 1.2 would look more like 3F99999A according to the IEEE format of REAL4 floating points.
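
The REAL4 bit pattern quoted above can be checked with Python's standard `struct` module. This is only a verification sketch; `real4_hex` is a hypothetical helper name:

```python
import struct

def real4_hex(x: float) -> str:
    """Return the IEEE 754 single-precision bit pattern of x as 8 hex digits."""
    return struct.pack(">f", x).hex().upper()

print(real4_hex(1.2))  # 3F99999A
print(real4_hex(1.0))  # 3F800000
```

Neither pattern resembles the 4CCC and 4000 values seen in the file, which supports the point that the data is not stored as REAL4.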

BytePtr must provide some additional information on what is generating those displayed numbers if he expects an answer.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

well - that's what i said - i didn't take the time to evaluate the numbers he posted
but, the format may be similar - at least it gives him a place to start
the thing is, he has posted only a few values
to reverse-engineer the format would require several more values as examples
and, even then, you might not have it right
it shouldn't be all that hard to find the ms word file format definition
keep in mind that the format may be from ms excel, instead - you may want to search for that file definition also
that could save you a lot of headaches and wasted effort

BytePtr

Thanks for the replies. The numbers above are just examples. The user can use any number in the range 0.0 to 255.8.
And it's not Excel or any such app.

It's a game script compiler that takes user scripts, parses them with Lex or Yacc, and generates a compiled script.
Compiled scripts are always 80.7KB in size. This is what game uses.
And unfortunately the specs are not available. They were never released.

It seems that this compiler checks whether the part after the dot is greater than 0. If it is, it uses some other method to save the number; if the part after the dot is 0, it saves the real number as an integer.
At least it seems so.


But the fact is: the dot must always be present in the number. Otherwise the compiler throws an error.

FORTRANS

Hi,

   Plugging the (little endian) numbers in a calculator, it seems
that you have fixed point numbers, not floating point.  At least
they sure look that way, given the examples.

Regards,

Steve N.

BytePtr

Yeah fixed point for sure.

I don't think that they made this so complex by using floating point. No.
The numbers after dot are just for adjusting, so they can handle things more precisely.

dedndave

well - we don't call that "flipping" - lol
for us, it is normal for higher-order bytes of values to have higher addresses (called "little-endian", as opposed to "big-endian")
but - the 255.8 clue doesn't help much
the reason is, that could be stored as a 16-bit integer of 2558
that tells me that the format probably allows larger values than 255.8, and/or more than 2 decimal places (25580 also fits into 16 bits)
the problem you may have is describing special values like infinity, if it allows it and other special values
and - how does the program recognize whether it is stored as an integer or a real ?
the answer to that is: it is likely that the format is the same - i.e. values under 256 with no decimals just happen to be integers
if you look at Ray's tutorial i linked above, you will see some characteristics that may help you
that is, the general form in how real numbers are stored
in the case of intel reals, there is a sign bit, some number of exponent bits, and some number of mantissa bits
i do see a pattern with which i am familiar
that is the CCCCC pattern (in binary, that's ...110011001100...)
that pattern results when you divide a power of two (like 65536 or 4294967296) by 10
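
The repeating pattern is easy to reproduce. A small Python sketch (`tenth_of_power_of_two` is a hypothetical helper name used only for illustration):

```python
def tenth_of_power_of_two(bits: int) -> int:
    """Integer part of 2**bits divided by 10."""
    return (1 << bits) // 10

for bits in (16, 32):
    q = tenth_of_power_of_two(bits)
    # Show the quotient in hex and in binary to expose the 1100 repetition
    print(f"2**{bits} // 10 = {q:X}h = {q:b}b")
```

Both quotients are runs of 9s in hex (1999h and 19999999h), which in binary is the repeating ...110011001... pattern mentioned above.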

BytePtr

It may allow larger values but they are never used.
Everything is limited to user. Nothing can be infinite in scripting, that's for sure. Scripting manual also tells that.
Otherwise the game will crash with an access violation.
The game itself is limited to 256x256x8 area. The real numbers i posted are the coordinates.

I will try more different numbers and see what it generates from them.

dedndave

in "4CCCh", the bit from the 4 may be the "1.0" part and the CCC bits may be the "0.2" part (2/10)

Quote: The game itself is limited to 256x256x8 area.
well - there is more to it than that, if you have a number like 1.2 or 255.8, then the real grid is something like 2558 x 2558 x 78

perhaps they are storing the three coordinates as separate integers

dedndave

you need to make a table of values

0.0
0.2
0.4
0.6
0.8
1.0
1.2
.
.
10.0
12.0

and so on - do enough decades until you see the pattern

FORTRANS

Hi,

   To elaborate, given the examples:


  Given    Hex      Dec          Binary
   1.0 => 4000H => 16384 => 100000000000000B
   1.2 => 4CCCH => 19660 => 100110011001100B
   0.1 => 0666H =>  1638 =>     11001100110B

  You (apparently) have a fixed point number created by
multiplying a real number by 16384 and saving the integer
part.

HTH?

Steve
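
The scheme described above can be sketched in a few lines of Python. The names `SCALE`, `pack_fixed` and `unpack_fixed` are hypothetical, and truncating (rather than rounding) is an assumption based on the three sample values posted in the thread:

```python
SCALE = 16384  # 2**14, the multiplier suggested above

def pack_fixed(x: float) -> int:
    """Multiply by 2**14 and keep the integer part (truncation is assumed)."""
    return int(x * SCALE)

def unpack_fixed(v: int) -> float:
    """Reverse of pack_fixed: divide the stored integer by 2**14."""
    return v / SCALE

for x in (1.0, 1.2, 0.1):
    v = pack_fixed(x)
    print(f"{x} -> {v:04X}h ({v})")
```

For the posted samples this gives 4000h, 4CCCh and 0666h. Unpacking is not exact for values like 1.2, because the fraction below 1/16384 is lost when packing.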

dedndave

enter the maximum 3D coordinate 255.8, 255.8, 7.8 - or whatever it is
show us that representation

EDIT - looks like Steve is on the right track   :U

BytePtr



I entered (255.7, 255.7, 7.7), because 0 is also used. So: 0-255 = 256 values and 0-7 = 8 values.


Steve:
Well it seems that you are correct. I took one simple real number like 1.0 and multiplied it with the number: 1.0 * 16384 and really got the same value as in the compiled script.

But more tests must be done. To be sure. I will try more numbers. But i think that this is it.
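
One thing worth checking in those tests is the storage width. Assuming the same 16384 multiplier also applies to the largest coordinates (not confirmed anywhere in this thread), a quick Python sketch shows how many bits the scaled values need; `bits_needed` is a hypothetical helper name:

```python
SCALE = 16384  # 2**14, assumed to apply to every coordinate

def bits_needed(x: float) -> int:
    """Bits required to hold x scaled by 2**14 and truncated to an integer."""
    return int(x * SCALE).bit_length()

for x in (1.2, 7.7, 255.7):
    print(f"{x} -> {int(x * SCALE):X}h needs {bits_needed(x)} bits")
```

Under that assumption 255.7 becomes 3FECCCh, which needs 22 bits, so it cannot fit in a 16-bit WORD and would have to be stored in at least three bytes or a DWORD.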



dedndave

yup - Steve got it - take the real, mul by 16384 and keep the integer part (1.2 * 16384 = 19660.8 is stored as 19660 = 4CCCh, so it truncates rather than rounds)