Really need to understand bitpacking and unpacking and masking.

Started by BytePtr, February 14, 2010, 11:51:57 AM


dedndave

well - i thought i knew my way around masm 5.10 pretty well back in the 16-bit days
i don't think that supported record/mask - i dunno - lol

it looks so "non-masm-ish", if you know what i mean
it is as though a C-compiler guy got his hand on the masm source or something - lol

jj2007

For playing around, here is a simple version of bin$. Usage:

print bin$(ebx), 9, "this is ebx", 13, 10
print bin$(MyDword), 9, "this is MyDword", 13, 10
print bin$(MyDword, f), 9, "the same but formatted", 13, 10

The f option adds a string that makes identifying the position count easier.

include \masm32\include\masm32rt.inc

bin$ MACRO dwArg:REQ, tgt:=<0>
  ifndef binbuffer
.data
binbuffer db 32 dup(32), 13, 10, "10987654321098765432109876543210", 0
.code
  endif
  invoke dw2bin_ex, dwArg, offset binbuffer
  ifidni <tgt>, <f> ;; option f, formatted
mov byte ptr binbuffer[32], 13
  else
mov byte ptr binbuffer[32], 0
  endif
  EXITM <offset binbuffer>
ENDM

.code
start: mov ebx, 12345
print bin$(ebx), 13, 10
shl ebx, 2
print bin$(ebx), 13, 10
shr ebx, 2
print bin$(ebx, f), 13, 10
getkey
exit

end start


Output:
00000000000000000011000000111001
00000000000000001100000011100100
00000000000000000011000000111001
10987654321098765432109876543210

P.S.: Do not forget that print and bin$ will modify the registers eax, ecx, edx. If you need eax etc to be preserved, use
pushad
print bin$(eax), " that was eax, and I need it below", 13, 10
popad
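
For anyone who wants to see the idea behind bin$ outside of MASM, here is a minimal sketch in C of turning a dword into 32 ASCII digits. The function name dword_to_bin is made up for illustration; it is not the MASM32 dw2bin_ex routine, only a guess at the same kind of conversion.

```c
#include <stdint.h>

/* Hypothetical helper (not dw2bin_ex itself): writes 32 ASCII digits
   plus a terminating NUL into out, which must hold at least 33 bytes. */
void dword_to_bin(uint32_t value, char out[33])
{
    for (int i = 0; i < 32; i++) {
        /* Test bit 31 first so the most significant bit lands leftmost. */
        uint32_t mask = UINT32_C(1) << (31 - i);
        out[i] = (value & mask) ? '1' : '0';
    }
    out[32] = '\0';
}
```

Calling it with 12345 and then with 12345 shifted left by 2 should give the first two lines of the output shown above.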

FORTRANS

Quote from: dedndave on February 15, 2010, 05:46:34 AM
well - i thought i knew my way around masm 5.10 pretty well back in the 16-bit days
i don't think that supported record/mask - i dunno - lol

it looks so "non-masm-ish", if you know what i mean
it is as though a C-compiler guy got his hand on the masm source or something - lol

Hi Dave,

    Bzzzt!  It was supported back to the MACRO-86 assembler
in ZDOS days.  For the rest that means MASM version 1.??.
And checking one of my oldest books "Assembly Language
Programming for the IBM Personal Computer", David J. Bradley,
1984, shows a RECORD example.  And the printouts don't
show a version number.

   Bradley, Holzner, and others had examples that almost
looked useful, but I never got around to using RECORD and
MASK.  By the time I run into something that could use
it, I've forgotten all about it.


   Here's an example from Holzner.

        .MODEL SMALL
        .CODE
        .8087
        ORG     100H
ENTRY:  JMP     PROG
        A       DQ 10.0
        B       DQ 5.0
        ST8087  DW  0
        M RECORD BY:1,C3:1,TOP:3,C2:1,C1:1,C0:1,IT:1,X:1,P:1,U:1,O:1,Z:1,D:1,I:1
PROG:   FINIT
        FLD     A
        FCOMP   B
        FSTSW   ST8087
        MOV     DL,'A'
        MOV     AH,2
        INT     21H
        ;C3=0  C0=0 --> A>B
        ;C3=0  C0=1 --> A<B
        ;C3=1  C0=0 --> A=B
        TEST    ST8087,MASK C3
        JZ      NOTEQ
        MOV     DL,'='
        JMP     SHORT PRINT
NOTEQ:  TEST    ST8087,MASK C0
        JNZ     LESS
        MOV     DL,'>'
        JMP     SHORT PRINT
LESS:   MOV     DL,'<'
PRINT:  MOV     AH,2
        INT     21H
        MOV     DL,'B'
        INT     21H
        FCLEX
        INT     20H
        END     ENTRY     


Cheers,

Steve N.
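
The branch logic of the Holzner listing can be sketched in C as follows. The bit positions are read off the RECORD line (BY is bit 15, C3 bit 14, C0 bit 8); the macro and function names are invented here for illustration. Like the listing, the sketch does not treat the unordered case separately.

```c
#include <stdint.h>

/* Bit positions of C3 and C0 in the 16-bit status word,
   following the RECORD layout in the listing (fields run high to low). */
#define SW_C3_SHIFT 14
#define SW_C0_SHIFT 8
#define SW_C3_MASK  (1u << SW_C3_SHIFT)   /* 0x4000, same idea as MASK C3 */
#define SW_C0_MASK  (1u << SW_C0_SHIFT)   /* 0x0100, same idea as MASK C0 */

/* Hypothetical helper: returns '=', '<' or '>' for A compared with B,
   testing C3 first and then C0, in the same order as the listing. */
char compare_result(uint16_t status_word)
{
    if (status_word & SW_C3_MASK)
        return '=';                       /* C3=1 --> A=B */
    if (status_word & SW_C0_MASK)
        return '<';                       /* C3=0, C0=1 --> A<B */
    return '>';                           /* C3=0, C0=0 --> A>B */
}
```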

jj2007

Quote from: FORTRANS on February 15, 2010, 06:04:18 PM
but I never got around to using RECORD and MASK

It's also explained in Masm Programmer's Guide, at the end of chapter 5, "Defining Record Variables". Rarely seen a more obscure and clumsy documentation of a feature that seems to be fairly simple. Here is an attempt to clarify some of the mysteries:


include \masm32\MasmBasic\MasmBasic.inc    ; download

.data
MyRECORD RECORD SubHigh:2, SubMid:7, SubLow:7, SubRest:32-16   ; the order is high to low
MyRec1   MyRECORD <1, 3, 7, 15>
MyRec2   MyRECORD <1, 0, 7-2, 15-2-4>

.code
start:
   Print "MyRec1, MyRec2: ", CrLf$, "__..mid..__low__.......rest.....", CrLf$
   Print Bin$(MyRec1), CrLf$
   Print Bin$(MyRec2, f), CrLf$, CrLf$
   .if MyRec1 & MASK SubMid
      Print "SubMid has data in MyRec1", CrLf$
   .else
      Print "SubMid has no data in MyRec1", CrLf$
   .endif
   .if MyRec2 & MASK SubMid         ; test dword ptr [MyRec2], 3F800000 (right code with single &)
      Print "SubMid has data in MyRec2", CrLf$
   .else
      Print "SubMid has no data in MyRec2", CrLf$
   .endif
   .if MyRec2 && MASK SubMid      ; cmp dword ptr [MyRec2], 0 (wrong code with double &&)
      Print "SubMid has data in MyRec2 (wrong!)", CrLf$
   .else
      Print "SubMid has no data in MyRec2", CrLf$
   .endif

   getkey
   Exit

end start
Output:
MyRec1, MyRec2:
__..mid..__low__.......rest.....
01000001100001110000000000001111
01000000000001010000000000001001
10987654321098765432109876543210

SubMid has data in MyRec1
SubMid has no data in MyRec2
SubMid has data in MyRec2 (wrong!)
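
The same & versus && trap exists in C, so a small sketch may help. The mask constant is the 3F800000 from the comment above; the two record values are the bit patterns from the output, written in hex. The function names are invented for illustration.

```c
#include <stdint.h>

/* SubMid in the 2/7/7/16 layout: 7 bits starting at bit 23. */
#define SUBMID_MASK UINT32_C(0x3F800000)

/* Right: bitwise AND isolates the SubMid bits before testing. */
int has_submid_bitwise(uint32_t rec)
{
    return (rec & SUBMID_MASK) != 0;
}

/* Wrong: logical AND only asks "is rec nonzero and is the mask nonzero?",
   so it is true for any nonzero record. */
int has_submid_logical(uint32_t rec)
{
    return rec && SUBMID_MASK;
}
```

With MyRec1 = 0x4187000F and MyRec2 = 0x40050009, the bitwise version reports data only for MyRec1, while the logical version reports data for both.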

jj2007

Try inserting the following above:
.data
MyRECORD RECORD SubHigh:4, SubMid:7, SubLow:7, SubRest:32-18 ; the order is high to low
...
.code
start:
tmp$ CATSTR <SubHigh=>, %SubHigh
% echo tmp$
tmp$ CATSTR <SubMid=>, %SubMid
% echo tmp$
tmp$ CATSTR <SubLow=>, %SubLow
% echo tmp$
tmp$ CATSTR <SubRest=>, %SubRest
% echo tmp$
tmp$ CATSTR <SubHigh=>, %mask SubHigh
% echo tmp$
tmp$ CATSTR <SubMid=>, %mask SubMid
% echo tmp$
tmp$ CATSTR <SubLow=>, %mask SubLow
% echo tmp$
tmp$ CATSTR <SubRest=>, %mask SubRest
% echo tmp$


Result (in the IDE's output window):
SubHigh=28
SubMid=21
SubLow=14
SubRest=0
SubHigh=4026531840
SubMid=266338304
SubLow=2080768
SubRest=16383


So the field name by itself evaluates to the bit position of the field's lowest bit (the shift count), while mask field is the integer we must AND with to isolate the field's bits.
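
As a cross-check, the shift counts and masks can be derived from the field widths alone. This is a sketch in C under the assumption that fields are listed high to low and the last field starts at bit 0, as in the results above; field_info and layout_record are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    unsigned shift;   /* bit position of the field's lowest bit */
    uint32_t mask;    /* ones over the field's bits, zeros elsewhere */
} field_info;

/* widths[] lists the fields high to low, like the RECORD line.
   Assumes the widths sum to at most 32. */
void layout_record(const unsigned *widths, int count, field_info *out)
{
    unsigned shift = 0;
    for (int i = count - 1; i >= 0; i--) {     /* walk from the lowest field up */
        uint32_t ones = (widths[i] >= 32)
                      ? UINT32_MAX
                      : ((UINT32_C(1) << widths[i]) - 1);
        out[i].shift = shift;
        out[i].mask  = ones << shift;
        shift += widths[i];
    }
}
```

Feeding it the widths 4, 7, 7, 14 should reproduce the eight numbers echoed above.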

Another little test:

.data
MyRECORD RECORD SubHigh:4, SubMid:7, SubLow:7, SubRest:32-18 ; the order is high to low
MyRec1 MyRECORD <1, 3, 7, 15>
...
mov eax, MyRec1
and eax, mask SubMid
shr eax, SubMid
Print Str$("The value of MyRec1.SubMid is %i\n", eax)

mov eax, MyRec1
and eax, mask SubLow
shr eax, SubLow
Print Str$("The value of MyRec1.SubLow is %i\n", eax)


Output:
The value of MyRec1.SubMid is 3
The value of MyRec1.SubLow is 7
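
The and/shr pair generalizes to a getter, and a setter can follow the same pattern in reverse. Here is a minimal sketch in C; get_field and set_field are hypothetical helpers, not macros from this thread, and the caller passes the mask and shift that MASM would supply via mask field and field.

```c
#include <stdint.h>

/* Extract a field: isolate its bits, then move them down to bit 0. */
uint32_t get_field(uint32_t rec, uint32_t mask, unsigned shift)
{
    return (rec & mask) >> shift;
}

/* Store a field: clear the old bits, then OR in the new value.
   The final "& mask" drops any bits of value that do not fit the field. */
uint32_t set_field(uint32_t rec, uint32_t mask, unsigned shift, uint32_t value)
{
    return (rec & ~mask) | ((value << shift) & mask);
}
```

For the 4/7/7/14 layout, MyRec1 <1, 3, 7, 15> is 0x1061C00F, so get_field with mask 266338304 and shift 21 yields 3, matching the output above.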

jj2007

I have written two macros for setting & getting fields in a record, and they work just fine. However, I don't understand what happens when there is a record inside a structure. Consider this:
TheRec   RECORD   Format:4, Reserved1:3, Msf:1  ; defines the record

CDROM_READ_TOC_EX   struct
TheRec <> ; WHAT EXACTLY DOES THIS MEAN ???
SessionTrack   UCHAR   ?
Reserved2   UCHAR   ?
Reserved3   UCHAR   ?
; OneMore UCHAR   ? ; fails
CDROM_READ_TOC_EX   ends

.data
tocex CDROM_READ_TOC_EX <255,12h,34h,56h> ; initial values
tocex2 CDROM_READ_TOC_EX <-1,-1,-1,-1> ; have no effect
NextVar dd 78563412h

This assembles just fine:
  mov edi, offset tocex
  SetField [edi.CDROM_READ_TOC_EX].Format, 15

What I would have expected, though, was
SetField [edi.CDROM_READ_TOC_EX.TheRec].Format, 15
... and that one fails miserably. So what exactly does a record in a structure mean...?

Note it doesn't matter where TheRec <> sits, this version produces the same code:
CDROM_READ_TOC_EX   struct
   SessionTrack   UCHAR   ?
   Reserved2   UCHAR   ?
   Reserved3   UCHAR   ?
   TheRec <>
CDROM_READ_TOC_EX   ends

Testbed attached.

dedndave

i just may use this stuff, Jochen
it might come in handy with CPUID feature bits   :bg
it will be interesting to see what the generated code looks like

jj2007

Quote from: dedndave on March 02, 2010, 12:07:24 PM
i just may use this stuff, Jochen
it might come in handy with CPUID feature bits   :bg
it will be interesting to see what the generated code looks like

It works fine for non-structure records. Here is another mysterious problem:
TheRec   RECORD   Format:4, Reserved1:3, Msf:1

CDROM_READ_TOC_EX   struct
; TheRec <15, 7, 1> ; assembly fails
SessionTrack   UCHAR   <12h>
Reserved2   UCHAR   <34h>
Reserved3   UCHAR   <56h>
TheRec <15, 7, 1> ; works, FF as fourth byte in 402000h
CDROM_READ_TOC_EX   ends

.data
tocex CDROM_READ_TOC_EX <22h,33h,44h,55h> ; initial values: 55h overwritten with FF from TheRec


Now it works only if TheRec is in the last row of the structure... ::)

dedndave

maybe you can put the record in a separate structure, then reference that in the main structure

WryBugz

Way back when, we used byte records for almost everything. Keeping program data size down was crucial, so creative use of the bitwise operators was standard. It was a game to see who could come up with the most clever operations. Most power for the least bytes.
Under 32 bit and 32 align, I suspect that converting all your data to dwords would actually be more efficient than navigating bit fields. That said, I have not tested it. It might be interesting to see which is faster: changing 8 dword values, or eight single bits with one mask per bit.
hmm, a project...

joemc

Quote from: WryBugz on March 02, 2010, 11:09:52 PM
Way back when, we used byte records for almost everything. Keeping program data size down was crucial, so creative use of the bitwise operators was standard. It was a game to see who could come up with the most clever operations. Most power for the least bytes.
Under 32 bit and 32 align, I suspect that converting all your data to dwords would actually be more efficient than navigating bit fields. That said, I have not tested it. It might be interesting to see which is faster: changing 8 dword values, or eight single bits with one mask per bit.
hmm, a project...

It can still be very important when reading from or writing to disk or network. Disk access and network connections tend to be big bottlenecks. I am sure that in memory it is best to keep everything 32 bits wide, and I am pretty sure changing 8 dwords in memory is faster. But it is much faster to read 1 dword from a hard drive, or receive it from a network, than 8. Even better, 1,000,000 instead of 8,000,000. I'd rather use the CPU you have sitting idle than flood your network cable.
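
To make the trade-off concrete, here is a rough sketch in C of keeping eight flags as dwords in memory and packing them into a single byte only when they go to disk or over the wire. pack_flags and unpack_flags are hypothetical names, and the bit order (flag 0 in bit 0) is just one possible convention.

```c
#include <stdint.h>

/* Pack eight dword-sized flags (zero / nonzero) into one byte.
   flags[0] goes to bit 0, flags[7] to bit 7. */
uint8_t pack_flags(const uint32_t flags[8])
{
    uint8_t packed = 0;
    for (int i = 0; i < 8; i++) {
        if (flags[i])
            packed |= (uint8_t)(1u << i);
    }
    return packed;
}

/* Expand the byte back into eight dwords holding 0 or 1. */
void unpack_flags(uint8_t packed, uint32_t flags[8])
{
    for (int i = 0; i < 8; i++)
        flags[i] = (packed >> i) & 1u;
}
```

The in-memory form costs 32 bytes and the packed form 1 byte, which is the 8,000,000 versus 1,000,000 ratio mentioned above.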

Mincho Georgiev

Bit packing and unpacking is not only about getting the 'Nth' bit when you know what you are looking for.
Consider the next scenario. You have a bitmap, a 64 bit number, but you don't know which bits are set and you want to extract them.
BSF or BSR are the solutions for you in that case.

http://chessprogramming.wikispaces.com/BitScan
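
A sketch in C of that scenario, for illustration only: the two scan functions below are slow, portable loops standing in for the BSF and BSR instructions (real code would use the instruction or a compiler intrinsic), and all three function names are invented here.

```c
#include <stdint.h>

/* Stand-in for BSF: index of the lowest set bit. x must be nonzero. */
int bit_scan_forward(uint64_t x)
{
    int index = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Stand-in for BSR: index of the highest set bit. x must be nonzero. */
int bit_scan_reverse(uint64_t x)
{
    int index = 0;
    while (x >>= 1)
        index++;
    return index;
}

/* Writes the indices of all set bits, lowest first, into out[]
   and returns how many were found. */
int list_set_bits(uint64_t bitmap, int out[64])
{
    int count = 0;
    while (bitmap != 0) {
        out[count++] = bit_scan_forward(bitmap);
        bitmap &= bitmap - 1;      /* clear the lowest set bit */
    }
    return count;
}
```

The loop in list_set_bits runs once per set bit rather than once per bit position, which is why bit scanning is attractive for sparse bitmaps.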