Graphics a la FPU

Started by donkey, January 09, 2005, 04:02:45 AM


donkey

Hi All,

I have been playing with the idea of doing some YUV <-> RGB conversions using the FPU, mostly for fun but maybe also to include in the graphics library. I am not very well versed in the use of the FPU, so I am posting the code to see if anyone can A: fix it up and make it look pretty, B: convert it to integer math with approximately the same level of accuracy, or C: suggest a completely different way to do this...

Note that this is probably rife with bad assumptions about the FPU...

/*
Y = 0.299 R + 0.587 G + 0.114 B
U = 0.492 (B - Y)
V = 0.877 (R - Y)
*/

YUV STRUCT
Y DD ?
U DD ?
V DD ?
ENDS

DATA SECTION
n877 DQ 0.877
n492 DQ 0.492
n114 DQ 0.114
n299 DQ 0.299
n587 DQ 0.587
n236 DQ 236.1
CODE SECTION

YUV2RGB FRAME pYUV
LOCAL RED :D
LOCAL GREEN :D
LOCAL BLUE :D

mov esi,[pYUV]

; finit
; ########## RED (R = (V+0.877Y)/0.877)
fild D[esi+YUV.Y]
fld Q[n877]
fmulp ;ST0, ST1
fiadd D[esi+YUV.V]
fdiv Q[n877]
fistp D[RED]
mov eax,[RED]
or eax,eax
jns >
; add one for the zero crossing if necessary
inc D[RED]
:
and D[RED],0FFh

; ########## BLUE (B = (U+0.492Y)/0.492)
fild D[esi+YUV.Y]
fld Q[n492]
fmulp
fiadd D[esi+YUV.U]
fdiv Q[n492]
fist D[BLUE]
mov eax,[BLUE]
or eax,eax
jns >
; add one for the zero crossing if necessary
inc D[BLUE]
:
and D[BLUE],0FFh

; ########## GREEN (G = (Y-(0.299R + 0.114B))/0.587)
fild D[BLUE]
fld Q[n114]
fmul
fild D[RED]
fld Q[n299]
fmul
fadd st0,st1
fild D[esi+YUV.Y]
fsub ST0, ST1
fld Q[n587]
fdiv
fistp D[GREEN]
mov eax,[GREEN]
or eax,eax
jns >
; add one for the zero crossing if necessary
inc D[GREEN]
:
and D[GREEN],0FFh

mov eax,[BLUE]
shl eax,8
mov al,[GREEN]
shl eax,8
mov al,[RED]
RET
ENDF

RGB2YUV FRAME clrRGB, pYUV
LOCAL RED :D
LOCAL GREEN :D
LOCAL BLUE :D
LOCAL Y :Q

mov esi, [pYUV]
mov eax,[clrRGB]
and eax,0FFh
mov [RED],eax

mov eax,[clrRGB]
shr eax,8
and eax,0FFh
mov [GREEN],eax

mov eax,[clrRGB]
shr eax,16
and eax,0FFh
mov [BLUE],eax

; finit
; ######### Y
fild D[RED]
fld Q[n299]
fmul
fild D[BLUE]
fld Q[n114]
fmul
fild D[GREEN]
fld Q[n587]
fmul
fadd ST0,ST1
fadd ST0,ST2
fist D[esi+YUV.Y]

; ######### U
fild D[BLUE]
fsub ST0,ST1
fld Q[n492]
fmul
fistp D[esi+YUV.U]

; ######### V
fild D[RED]
fsub ST0,ST1
fld Q[n877]
fmul
fistp D[esi+YUV.V]

RET
ENDF
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

daydreamer

Well, you only need the upper 8 bits for each channel in the usual 0-255 per channel range.
First convert your constants to 8:24 fixed-point dword constants, then:
pseudocode:
ALU mul for 0.114 * B
packed dword mul for 0.299 * R and 0.587 * G at the same time (ALU and MMX use different pipelines)
two adds to get Y, and store two copies of Y for the later packed operation
packed sub for B-Y and R-Y at the same time
packed mul by 0.492 and 0.877


BTW, at first glance I thought this topic was about FPU graphics, as in rendering to an A32R32G32B32 texture on newer GPUs that support floating-point textures.


dioxin

Donkey,
   why would you want to use the FPU to do this? (unless it's just an exercise in familiarising yourself with the FPU).


I'd do it with normal integer maths although MMX may be better, I assume you want to avoid that for compatibility with older CPUs.

Things to look at:

MUL does EAX*r/m32 and puts the result in EDX:EAX pair.

Suppose EAX contains the COLOUR (in this case R, G or B) and r/m32 contains 2^32 (I know it doesn't quite fit, but stick with it for now!)
The result in EDX:EAX will then be:
EDX= COLOUR, EAX=0

If instead I put 2^32*0.5 i.e. 2^31 in r/m32 and do the same then the result will be:
EDX=COLOUR/2, EAX= possibly a leftover bit in the msb.

i.e. I halve the value since I loaded r/m32 with half of 2^32.


Now, if instead I'd used r/m32= 2^32*0.299 = 4C8B4395h then the result would be:
EDX= COLOUR*0.299 and EAX= remainder.

EDX is the result you want!

So, you can do 32 bit accuracy multiplies by your "FP" constants using just integer maths and you'll get accuracy way beyond what you need.
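
The multiply-and-keep-EDX idea above can be tried out away from the assembler. What follows is only a minimal Python sketch, not the posted routine: `mul_high` and `FRAC_0299` are names invented here, and Python's unbounded integers stand in for the EDX:EAX register pair.

```python
# Model of the 32-bit MUL trick: scale the factor by 2^32, multiply,
# and keep only the high dword of the 64-bit product (what MUL leaves in EDX).

FRAC_0299 = int(2**32 * 0.299)    # 4C8B4395h, the constant quoted in the post

def mul_high(value, frac):
    """Return the high 32 bits of the 64-bit product value * frac."""
    product = value * frac         # EDX:EAX viewed as one 64-bit number
    return product >> 32           # EDX; the low dword (EAX) is the leftover fraction

print(hex(FRAC_0299))
print(mul_high(200, FRAC_0299))    # integer part of 200 * 0.299
```

Note that the result is truncated rather than rounded, so a sum of three such products can come out slightly low.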


Your code for RGB to YUV would become something like,(completely untested)

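mov eax,  2^32*0.299
mul [red]
mov ebx,edx                     'ebx=0.299R

mov eax,  2^32*0.114
mul [blue]
add ebx,edx                     'ebx=0.299R+0.114B

mov eax,  2^32*0.587
mul [green]
add ebx,edx                     'ebx=0.299R+0.114B+0.587G = Y

mov Y,ebx                       'save Y

mov eax,[blue]
sub eax,ebx                     'eax=B-Y
mov ecx,  2^32*0.492            'one-operand imul needs a register or memory operand
imul ecx                        'edx=U
mov U,edx                       'save U

mov eax,[red]
sub eax,ebx                     'eax=R-Y
mov ecx,  2^31*0.877            '2^32*0.877 would overflow into the sign bit
imul ecx                        'edx:eax=(R-Y)*0.877*2^31
shl eax,1                       'double the result to correct for the half-size constant
rcl edx,1                       'edx=V
mov V,edx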


I'm sure the same sort of thing could be done with YUV to RGB

Paul.

donkey

Hi Paul,

I found that the level of inaccuracy was too high when I was not working in floating point; in fact the routines I posted are too inaccurate for any real use even with the FPU. They are already the modified versions, changed to get rid of a spurious FFh in place of 0 for some combinations. The problem is that as soon as you convert the YUV to integers you get a whole mess of problems. Yes, it was just an exercise for fun, but I would also very much like to find a 100% accurate way of doing it, and so far only the FPU has offered me that.

I also did an approximation in integer form to get the YUV, but my output values were trashed. For example, I would get the following:

in                         out
RGB = 000000CEh  RGB = 00FF01CEh

Only off by one in Blue and Green, not bad for an approximation, but useless for color.

I had used the YUV approximation suggested by Microsoft...

Y = ( (  66 * R + 129 * G +  25 * B + 128) >> 8) +  16
U = ( ( -38 * R -  74 * G + 112 * B + 128) >> 8) + 128
V = ( ( 112 * R -  94 * G -  18 * B + 128) >> 8) + 128
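
The three integer formulas above translate directly into multiply, add and shift. This is a minimal Python sketch for experimenting with them; the function name is made up here, and it relies on Python's `>>` being an arithmetic (flooring) shift for negative values, which an assembly version would need SAR to match. These formulas produce offset values (Y from 16 to 235, U and V centred on 128), so they are not on the same scale as the 0.299/0.492/0.877 set and need their own matching inverse.

```python
def rgb_to_yuv_ms(r, g, b):
    """8-bit integer approximation, with the +128 rounding term before each shift."""
    y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
    u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
    v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y, u, v

print(rgb_to_yuv_ms(0, 0, 0))        # black
print(rgb_to_yuv_ms(255, 255, 255))  # white
print(rgb_to_yuv_ms(0xCE, 0, 0))     # the 000000CEh test colour
```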

I also used a formula similar to the one you have used, where I did the math and then shifted out the lower contents. But in all tests it failed to consistently yield the same RGB pattern as I input. This has led me to believe that to be accurate I must pass the YUV as a float and deal with the speed on that basis. I am hoping that someone can come up with a way to convert to integer math and still maintain at least 99% accuracy with no roll-overs (i.e. 00 becomes FF or 01).

The following yields a perfect RGB>YUV>RGB conversion, which is what I need:

YUV STRUCT
Y DQ ?
U DQ ?
V DQ ?
ENDS

DATA SECTION
n877 DQ 0.877
n492 DQ 0.492
n114 DQ 0.114
n299 DQ 0.299
n587 DQ 0.587

CODE SECTION

YUV2RGB FRAME pYUV
uses esi
LOCAL RED :D
LOCAL GREEN :D
LOCAL BLUE :D

mov esi,[pYUV]

finit

; ########## RED (R = (V+0.877Y)/0.877)
fld Q[esi+YUV.Y]
fld Q[n877]
fmulp ;ST0, ST1
fadd Q[esi+YUV.V]
fdiv Q[n877]
fistp D[RED]
and D[RED],0FFh

; ########## BLUE (B = (U+0.492Y)/0.492)
fld Q[esi+YUV.Y]
fld Q[n492]
fmulp
fadd Q[esi+YUV.U]
fdiv Q[n492]
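fistp D[BLUE] ; pop, so the value is not left behind on the FPU stack
and D[BLUE],0FFh

; ########## GREEN (G = (Y-(0.299R + 0.114B))/0.587)
fild D[BLUE]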
fld Q[n114]
fmul
fild D[RED]
fld Q[n299]
fmul
fadd st0,st1
fld Q[esi+YUV.Y]
fsub ST0, ST1
fld Q[n587]
fdiv
fistp D[GREEN]
and D[GREEN],0FFh

mov eax,[BLUE]
shl eax,8
mov al,[GREEN]
shl eax,8
mov al,[RED]
RET
ENDF

RGB2YUV FRAME clrRGB, pYUV
uses esi
LOCAL RED :D
LOCAL GREEN :D
LOCAL BLUE :D
LOCAL Y :Q

/*
Y = 0.299 R + 0.587 G + 0.114 B
U = 0.492 (B - Y)
V = 0.877 (R - Y)
*/
finit

mov esi, [pYUV]
mov eax,[clrRGB]
and eax,0FFh
mov [RED],eax

mov eax,[clrRGB]
shr eax,8
and eax,0FFh
mov [GREEN],eax

mov eax,[clrRGB]
shr eax,16
and eax,0FFh
mov [BLUE],eax

; finit
; ######### Y
fild D[RED]
fld Q[n299]
fmul
fild D[BLUE]
fld Q[n114]
fmul
fild D[GREEN]
fld Q[n587]
fmul
fadd ST0,ST1
fadd ST0,ST2
fst Q[esi+YUV.Y]

; ######### U
fild D[BLUE]
fsub ST0,ST1
fld Q[n492]
fmul
fstp Q[esi+YUV.U]

; ######### V
fild D[RED]
fsub ST0,ST1
fld Q[n877]
fmul
fstp Q[esi+YUV.V]

RET
ENDF

dioxin

Donkey,
   Things to look at, part 2

Your equations for R, G and B should be rearranged to avoid the DIVs. Instead, multiply by the reciprocal. Since you only ever divide by constants, this needs no extra programming but will run a lot quicker.


e.g. you have R = (V+0.877Y)/0.877

i.e. R = V/0.877 + Y
i.e. R = V*1.140 + Y

We now have 1 MUL and 1 ADD instead of 1 ADD, 1 MUL and 1 DIV.

The same can be done with your other equations to give

R = V*1.140 + Y
B = U*2.032 + Y
G = Y*1.703 - R*0.509 - B*0.194
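
To check that the rearrangement changes nothing numerically, the divided and reciprocal forms can be compared side by side. This is a small Python sketch with invented function names; it computes the reciprocals at full precision instead of using three-digit values (the exact reciprocals are about 1.1403, 2.0325 and 1.7036).

```python
K_V, K_U, K_G = 0.877, 0.492, 0.587

def yuv_to_rgb_div(y, u, v):
    """Original form: one divide per channel."""
    r = (v + K_V * y) / K_V
    b = (u + K_U * y) / K_U
    g = (y - (0.299 * r + 0.114 * b)) / K_G
    return r, g, b

def yuv_to_rgb_mul(y, u, v):
    """Rearranged form: multiply by constant reciprocals instead."""
    r = v * (1 / K_V) + y
    b = u * (1 / K_U) + y
    g = y * (1 / K_G) - r * (0.299 / K_G) - b * (0.114 / K_G)
    return r, g, b

print(yuv_to_rgb_div(100.0, -20.0, 30.0))
print(yuv_to_rgb_mul(100.0, -20.0, 30.0))
```

In assembly the reciprocals would be precomputed constants, so the divisions inside `yuv_to_rgb_mul` cost nothing at run time.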


   I might look into the rounding problems later..

Paul.

donkey

Hi dioxin,

Thanks, that works very well and is 100% accurate for the range 0..0FFFFFFh.

YUV2RGB_NEW FRAME pYUV
uses esi
LOCAL RED :D
LOCAL GREEN :D
LOCAL BLUE :D
LOCAL garbage :Q

DATA SECTION
n2p032 DQ 2.032
n1p703 DQ 1.703
n1p14 DQ 1.14
np509 DQ 0.509
np194 DQ 0.194

CODE SECTION

mov esi,[pYUV]

finit

fld Q[esi+YUV.V]
fld Q[n1p14]
fmul
fld Q[esi+YUV.Y]
fadd ST0, ST1
fist D[RED]
fxch ST0,ST1
fstp Q[garbage]

fld Q[esi+YUV.U]
fld Q[n2p032]
fmul
fld Q[esi+YUV.Y]
fadd ST0, ST1
fist D[BLUE]
fxch ST0,ST1
fstp Q[garbage]

fld Q[esi+YUV.Y]
fld Q[n1p703]
fmul

; Bring RED to the batters box...
fxch ST0,ST2
fld Q[np509]
fmul

; Bring BLUE to the batters box...
fxch ST0,ST1
fld Q[np194]
fmul

; Bring Y to the batters box
fxch ST0,ST2
fsub ST0,ST1
fsub ST0,ST2
fistp D[GREEN]

mov eax,[BLUE]
shl eax,8
mov al,[GREEN]
shl eax,8
mov al,[RED]

RET
ENDF


I'm not completely sure whether using fxch is optimal, but I assumed that it had to be better than anything that moved data out of or back into the FPU.

MichaelW

Hi Donkey,

I don't know what you intend to do with the YUV values, but assuming they will ultimately end up as integers, it seems to me that an RGB-YUV-RGB conversion followed by a test for matching input and output values is not a valid measure of conversion accuracy. It works when the YUV values are stored as real numbers, but it should fail if the YUV values are stored as integers. I think judging the accuracy of integer-based conversion routines will require something more sophisticated.
eschew obfuscation
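
The difference between storing YUV as reals and as integers is easy to see on a single colour. The following Python sketch is an illustration only: the function names are invented, `round()` stands in for the FPU's round-to-nearest store, and G is rebuilt from the already rounded R and B as in the posted routines.

```python
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

def yuv_to_rgb(y, u, v):
    r = round(v / 0.877 + y)
    b = round(u / 0.492 + y)
    g = round((y - 0.299 * r - 0.114 * b) / 0.587)
    return r, g, b

def round_trip(rgb, store_as_int):
    """RGB -> YUV -> RGB, optionally rounding YUV to integers in between."""
    yuv = rgb_to_yuv(*rgb)
    if store_as_int:
        yuv = tuple(round(c) for c in yuv)
    return yuv_to_rgb(*yuv)

print(round_trip((206, 0, 0), store_as_int=False))  # YUV kept as reals
print(round_trip((206, 0, 0), store_as_int=True))   # YUV rounded to integers
```

With real storage the colour 0CEh comes back unchanged; with integer storage red and blue each come back one too high, which is the kind of mismatch a plain equality test would report as a failure.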

dioxin

Donkey,
It can certainly be done without the FPU, and in a similar number of instructions to your FPU solution, but the solution I have is a bit fiddly and gives the YUV values as fixed-point integers.
I have a working version (PowerBASIC syntax unfortunately!) which I'll try to tidy up a bit before I post it.

Paul.

raymond

Y = 0.299 R + 0.587 G + 0.114 B
U = 0.492 (B - Y)
V = 0.877 (R - Y)

The values of R, G and B are always positive integers ranging from 0 to 255.

According to the above equations,
- the value of Y would always be positive and also range from 0 to 255,
- the value of U can range from -111 for pure yellow (R=255,G=255,B=0) to +111 for pure blue (R=0,G=0,B=255),
- the value of V can range from -157 for pure cyan (R=0,G=255,B=255) to +157 for pure red (R=255,G=0,B=0)

This is a typical application for the use of fixed point math.
- Using the lower 16 bits of a DWORD for the fractional part provides an accuracy equivalent to almost 5 decimal places (1/65536 is about 0.000015); ALL the numbers used above only have an accuracy of 3 digits.
- This leaves the upper 16 bits of the DWORD for the signed integer part with a range of -32768 to +32767, ALL the numbers used above being well within that range.
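
The ranges and the 16.16 constants above can be reproduced with a few lines of Python. This is a sketch with invented helper names; it relies on Y, U and V being linear in R, G and B, so their extremes occur at the eight corners of the RGB cube.

```python
from itertools import product

def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

def channel_range(index):
    """Min and max of one YUV component over the eight corners of the RGB cube."""
    values = [rgb_to_yuv(*corner)[index] for corner in product((0, 255), repeat=3)]
    return min(values), max(values)

def to_fixed_16_16(numerator, denominator):
    """Integer constant holding numerator/denominator with 16 fractional bits."""
    return numerator * 65536 // denominator

for name, index in (("Y", 0), ("U", 1), ("V", 2)):
    low, high = channel_range(index)
    print(name, round(low), round(high))

print(to_fixed_16_16(299, 1000), to_fixed_16_16(877, 1000))
```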

(MASM syntax is more familiar for most readers and is used for the following code. The YUV struct being a simple one, offsets within the struct were used instead of the more complex struct type addressing.)

YUV STRUCT
Y DD ?
U DD ?
V DD ?
YUV ENDS

.data
; the various factors are initialized with the
; decimals converted to binary fractions

   n114  dd  114*65536/1000  ;0.114
   n299  dd  299*65536/1000  ;0.299
   n492  dd  492*65536/1000  ;0.492
   n587  dd  587*65536/1000  ;0.587
   n877  dd  877*65536/1000  ;0.877
   n236  dd  2361*65536/10   ;236.1

; although that last variable n236 is not used, it has
; been included as an additional example of initialization.

;%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

RGB2YUV proc USES esi clrRGB:DWORD, pYUV:DWORD

LOCAL  RED    :DWORD
LOCAL  BLUE   :DWORD

   mov  esi,pYUV

;compute Y = 0.299 R + 0.587 G + 0.114 B

   mov  eax,clrRGB
   and  eax,0FFh   ;isolate the RED
   mov  RED,eax
   imul n299
   mov  ecx,eax    ;use ECX as the accumulator

   mov  eax,clrRGB
   shr  eax,8
   and  eax,0FFh   ;isolate the GREEN (will not be reused)
   imul n587
   add  ecx,eax

   mov  eax,clrRGB
   shr  eax,16
   and  eax,0FFh   ;isolate the BLUE
   mov  BLUE,eax
   imul n114
   add  ecx,eax    ;ECX = Y = 0.299 R + 0.587 G + 0.114 B

   shr  ecx,16     ;shift out the fractional portion
                   ;The last bit shifted out is copied to
                   ;the CARRY flag and is the one equivalent
                   ;to a decimal fraction of 0.5
   adc  ecx,0      ;ECX = Y rounded to the nearest integer
   mov  [esi],ecx  ;store it

;compute U = 0.492 (B - Y)

   mov  eax,BLUE
   sub  eax,ecx    ;(B - Y)
   imul n492
   or   eax,eax    ;test for negative sign
   pushf           ;keep flags
   jns  @F
   neg  eax        ;make it positive for rounding
@@:
   shr  eax,16     ;shift out the fractional portion
   adc  eax,0
   popf            ;retrieve sign flag
   jns  @F
   neg  eax
@@:
   mov  [esi+4],eax ;store the U value

;compute V = 0.877 (R - Y)

   mov  eax,RED
   sub  eax,ecx    ;(R - Y)
   imul n877
   or   eax,eax    ;test for negative sign
   pushf           ;keep flags
   jns  @F
   neg  eax        ;make it positive for rounding
@@:
   shr  eax,16     ;shift out the fractional portion
   adc  eax,0
   popf            ;retrieve sign flag
   jns  @F
   neg  eax
@@:
   mov  [esi+8],eax ;store the V value
   ret
RGB2YUV endp
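
For anyone who wants to check values by hand, the arithmetic of the RGB2YUV proc can be modelled in a few lines of Python. This is a hedged sketch, not a translation of every instruction: the names are invented, and `round_16_16` imitates the NEG / SHR 16 / ADC sequence by rounding the magnitude and restoring the sign afterwards.

```python
N114 = 114 * 65536 // 1000
N299 = 299 * 65536 // 1000
N492 = 492 * 65536 // 1000
N587 = 587 * 65536 // 1000
N877 = 877 * 65536 // 1000

def round_16_16(x):
    """Signed 16.16 value -> nearest integer, rounding the magnitude."""
    magnitude = abs(x)
    # bit 15 is the last bit shifted out by SHR 16, i.e. the 0.5 bit
    rounded = (magnitude >> 16) + ((magnitude >> 15) & 1)
    return -rounded if x < 0 else rounded

def rgb_to_yuv_fixed(r, g, b):
    y = round_16_16(r * N299 + g * N587 + b * N114)
    u = round_16_16((b - y) * N492)   # uses the already rounded Y, as the proc does
    v = round_16_16((r - y) * N877)
    return y, u, v

print(rgb_to_yuv_fixed(206, 0, 0))
print(rgb_to_yuv_fixed(255, 255, 255))
```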

;%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

YUV2RGB proc USES esi pYUV:DWORD

LOCAL RED     :DWORD
LOCAL GREEN   :DWORD
LOCAL BLUE    :DWORD

   mov esi,pYUV

; RED (R = (V/0.877+Y))
   mov  eax,[esi+8]
   shl  eax,16      ;scale V up by 65536 to match n877
   mov  edx,n877
   shr  edx,1       ;half the divisor, used for rounding
   or   eax,eax     ;test for sign
   jns  @F
   neg  edx         ;round away from zero if V is negative
@@:
   add  eax,edx
   cdq              ;extend sign to EDX
   idiv n877        ;EAX = V/0.877 rounded to the nearest
                    ;integer; the remainder left in EDX is
                    ;relative to n877, not a binary fraction
   add  eax,[esi]   ;+Y = R
   jns  @F          ;sign flag set by the ADD
   xor  eax,eax     ;negative values can only be due to
                    ;rounding errors
@@:
   cmp  eax,255
   jbe  @F
   mov  eax,255
@@:
   mov  RED,eax

; BLUE (B = (U/0.492+Y))
   mov  eax,[esi+4]
   shl  eax,16      ;scale U up by 65536 to match n492
   mov  edx,n492
   shr  edx,1       ;half the divisor, used for rounding
   or   eax,eax     ;test for sign
   jns  @F
   neg  edx         ;round away from zero if U is negative
@@:
   add  eax,edx
   cdq              ;extend sign to EDX
   idiv n492        ;EAX = U/0.492 rounded to the nearest
                    ;integer; the remainder left in EDX is
                    ;relative to n492, not a binary fraction
   add  eax,[esi]   ;+Y = B
   jns  @F          ;sign flag set by the ADD
   xor  eax,eax     ;negative values can only be due to
                    ;rounding errors
@@:
   cmp  eax,255
   jbe  @F
   mov  eax,255
@@:
   mov  BLUE,eax

; GREEN (G = (Y-(0.299R + 0.114B))/0.587)
   mov  ecx,[esi]   ;use ECX as accumulator and
                    ;initialize it with Y
   shl  ecx,16      ;shift it to the integer bits
   mov  eax,RED
   mul  n299
   sub  ecx,eax
   mov  eax,BLUE
   mul  n114
   sub  ecx,eax     ;ECX = Y-(0.299R + 0.114B)
   mov  eax,n587
   shr  eax,1       ;half the divisor, used for rounding
   or   ecx,ecx     ;test for sign
   jns  @F
   neg  eax         ;round away from zero if negative
@@:
   add  eax,ecx
   cdq              ;extend sign to EDX
   idiv n587        ;EAX = G rounded to the nearest integer
   or   eax,eax     ;test for sign
   jns  @F
   xor  eax,eax     ;negative values can only be due to
                    ;rounding errors
@@:
   cmp  eax,255
   jbe  @F
   mov  eax,255
@@:
   mov  GREEN,eax

   mov  eax,BLUE
   shl  eax,8
   add  eax,GREEN
   shl  eax,8
   add  eax,RED
   ret
YUV2RGB endp
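
One point about the IDIV steps in YUV2RGB is worth spelling out: the remainder IDIV leaves in EDX is measured against the divisor (n877, n492 or n587), not against 65536, so one way to round the quotient to the nearest integer is to add half the divisor, with the sign of the dividend, before dividing. A minimal Python sketch of that idea, with names invented for the illustration:

```python
def div_round_nearest(numerator, divisor):
    """Signed division rounded to nearest (ties away from zero); divisor > 0."""
    half = divisor // 2
    adjusted = numerator + half if numerator >= 0 else numerator - half
    quotient = abs(adjusted) // divisor       # truncates toward zero, like IDIV
    return quotient if adjusted >= 0 else -quotient

def clamp8(x):
    return max(0, min(255, x))

def channel_from_diff(diff, y, k_fixed):
    """R or B from an integer colour difference, integer Y and a 16.16 factor."""
    return clamp8(div_round_nearest(diff << 16, k_fixed) + y)

N492 = 492 * 65536 // 1000
N877 = 877 * 65536 // 1000

print(channel_from_diff(126, 62, N877))   # red from V=126, Y=62
print(channel_from_diff(-31, 62, N492))   # blue from U=-31, Y=62
```

The blue value works out to -1 before clamping, which is the "black minus one" case; clamping returns 0 instead of letting it wrap to FFh.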


This was based on your original post. It could be modified easily to yield an accuracy equivalent to 5 decimal places if those YUV values are to be used only internally.

If you would rather use the FPU, let me know and I will prepare my comments on your code.

Raymond

EDIT: Had forgotten to add code to check for overflow in YUV2RGB proc. Now added

When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dioxin

It works but it's still a bit untidy. PowerBASIC syntax but it's close enough to MASM to see how it's done.
No FPU used.
Runs an RGB -> YUV -> RGB conversion in under 60 clocks: about 20 for RGB -> YUV and about 40 for YUV -> RGB.



'R, G, B are integers from 0-255
'Y, U, V are fixed point integers, 9bits before and 23 bits after the point

'define constants
v0299&=2^23*0.299
v0587&=2^23*0.587
v0114&=2^23*0.114

v0492&=2^32*0.492
v0877&=2^31*0.877  'not 2^32 otherwise it overflows into sign bit. Correct after using it

v1140&=2^24/0.877    '1/0.877   
v2032&=2^24/0.492    '1/0.492   
v1703&=2^24/0.587    '1/0.587   



'RGB -> YUV
!movzx  eax,byte ptr col[1]    'eax=green
!movzx  edi,byte ptr col[2]    'edi=blue
!movzx  esi,byte ptr col       'esi=red

!mul v0587&      ;G*0.587       ;green
!mov ebx,eax     ;accumulate result in ebx

!mov eax,edi     ;blue
!mul v0114&      ;B*0.114
!add ebx,eax

!mov eax,esi     ;red
!mul v0299&      ;R*0.299
!add ebx,eax

!mov y,ebx      ;store Y        'Y done


!mov eax,edi    ;blue
!shl eax,23     ;line up with Y
!sub eax,ebx    ;(B-Y)
!imul v0492&    ;0.492*(B-Y)
!mov U,edx      ;U done


!mov eax,esi    ;red
!shl eax,23     ;line up with Y
!sub eax,ebx    ;(R-Y)
!imul v0877&    ;0.877*(R-Y)

!shl eax,1      ;double result to correct for v0877& being half size to prevent overflow
!rcl edx,1

!mov V,edx      ;V done



'YUV -> RGB
!mov eax,U
!imul v2032&        ;U*1/0.492

!mov ebx,Y          ;line up Y
!shr ebx,8

!adc edx,ebx        ;Y+U/0.492

!add edx,&h4000     ;round if needed
!shr edx,15         ;scale
!mov blue,edx       ;blue done

   

!mov eax,V
!imul v1140&        ;V * 1/0.877

!mov ebx,Y          ;line up Y
!shr ebx,8

!adc edx,ebx        ;Y+V/0.877

!add edx,&h4000     ;round if needed
!shr edx,15         ;scale
!mov red,edx        ;red done



!mov eax,blue
!mul v0114&         ;B*0.114
!mov ebx,eax        ;accumulate in ebx

!mov eax,red
!mul v0299&         ;red*0.299
!add ebx,eax        ;accumulate in ebx

!mov eax,Y

!sub eax,ebx        ;Y-(B*0.114 + red*0.299)
!imul v1703&        ;(Y-(B*0.114 + red*0.299))/0.587


!add edx,&h4000     ;round if needed
!shr edx,15         ;scale

!mov green,edx      ;green done


!mov eax,blue       ;merge RGB into a single value, col2
!shl eax,8
!or eax,green
!shl eax,8
!or eax,red
!mov col2,eax
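
The 9.23 fixed-point scheme above can be modelled in Python to check round trips without an assembler. This is a sketch under stated assumptions, not the PowerBASIC routine itself: the function names are invented, the constants are rounded to the nearest integer (how PowerBASIC converts them on assignment is assumed here), and `>> 32` on a Python integer stands in for taking EDX after a signed IMUL.

```python
V0299 = round(2**23 * 0.299)
V0587 = round(2**23 * 0.587)
V0114 = round(2**23 * 0.114)
V0492 = round(2**32 * 0.492)
V0877 = round(2**31 * 0.877)      # half scale, corrected by shifting 31 instead of 32
V1140 = round(2**24 / 0.877)
V2032 = round(2**24 / 0.492)
V1703 = round(2**24 / 0.587)

def rgb_to_yuv_923(r, g, b):
    y = g * V0587 + b * V0114 + r * V0299           # 9.23 fixed point
    u = (((b << 23) - y) * V0492) >> 32
    v = (((r << 23) - y) * V0877) >> 31
    return y, u, v

def scale(edx):
    """Round a value with 15 fractional bits to an integer (add 4000h, shift 15)."""
    return (edx + 0x4000) >> 15

def yuv_to_rgb_923(y, u, v):
    y15 = (y >> 8) + ((y >> 7) & 1)   # SHR 8, then ADC adds the last bit shifted out
    blue = scale(((u * V2032) >> 32) + y15)
    red = scale(((v * V1140) >> 32) + y15)
    green = scale(((y - blue * V0114 - red * V0299) * V1703) >> 32)
    return red, green, blue

for colour in [(0, 0, 0), (255, 255, 255), (206, 0, 0), (12, 200, 77)]:
    print(colour, yuv_to_rgb_923(*rgb_to_yuv_923(*colour)))
```

Because U and V are derived from the same Y that is added back in, the small error in Y cancels when R and B are rebuilt, which is why the sampled round trips come back unchanged.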

donkey

Hi Raymond and Dioxin,

Thanks, I will take a look at them as soon as I thaw out, darn Calgary winter.

MichaelW,

Not really planning on anything right now. I have a grayscale routine and am testing an MMX-based contrast, so outside of those 2 functions I can't even imagine an application. However, I came across the formulae when I did the grayscale some time ago and decided to have another look. I am sure there is some practical application, perhaps a unique kind of processing or something. For now, though, I took it on as an exercise in graphics and the FPU. However, an integer or, better still, MMX version would be nice to add to the library of graphics functions.

raymond

donkey

On further analysis of your requirements relative to the accuracy of the conversion to/from RGB/YUV, and running some actual code, here are my conclusions:

RGBs are normally converted to YUVs to perform some transformation of the overall color (such as brightness) and then converted back to RGBs for proper display. Conversions are not done simply for the sake of conversion.

Using 3 significant digits for the conversion factors and the YUV values is generally sufficient to return RGBs within +/-1 of the original RGB values, which is almost impossible to detect visually.

Running the conversion back and forth 10 consecutive times would definitely produce variations larger than the +/-1. However, even if the computations are performed with an accuracy of 19 digits (such as with the FPU), exactly the same variations would still be observed if the results of the computations are simply stored as 3-digit integers. The only way to obtain the maximum accuracy would be to store the YUV values as 80-bit floats, which may make it difficult to handle when performing the overall color adjustments.

Raymond

P.S. In case you copied my code posted originally, I edited it to add the necessary code to prevent overflow of the RGB values.

donkey

Hi Raymond,

Yes, I tested my integer algo and came up with +/- 1 as well. However, as an example, black minus 1 yields FFh, and that is not really acceptable in graphics applications. The algorithm seemed to fail when a color was in the middle-to-top part of the range (0CEh and up). Of course clipping is the solution, but I did not find a reliable way to clip the rollover based on direction (i.e. whether it rolls over 0 going down or going up).
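
The rollover comes from masking a signed result with 0FFh; clamping the signed value before packing avoids it in either direction. A short Python sketch of the difference, with helper names made up for the illustration:

```python
def mask8(x):
    """What AND 0FFh does: keeps the low byte, so -1 wraps to FFh and 256 to 0."""
    return x & 0xFF

def clamp8(x):
    """Saturate to 0..255 while the value is still signed."""
    return max(0, min(255, x))

def pack_masked(r, g, b):
    return (mask8(b) << 16) | (mask8(g) << 8) | mask8(r)

def pack_clamped(r, g, b):
    return (clamp8(b) << 16) | (clamp8(g) << 8) | clamp8(r)

# Red 0CEh with green off by +1 and blue off by -1
print(f"{pack_masked(0xCE, 1, -1):08X}")
print(f"{pack_clamped(0xCE, 1, -1):08X}")
```

The direction is known as long as the test is done on the full signed dword (negative means it went below 0, above 255 means it went over the top) and before anything is truncated to a byte.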

Quote: The only way to obtain the maximum accuracy would be to store the YUV values as 80-bit floats

Which is why I finally ended up at the FPU and stored the YUV as 3 floats. Using the FPU but storing as integers required too many fixes to the code, hence my second attempt with a modified FPU algo. Brightness is a good application for this; I guess it would involve adjusting only the luma while leaving the color difference relatively untouched. I will have to investigate the possibilities; although I already have a pretty decent luma adjustment function that I wrote for TBPaint, there is always room for improvement.

donkey

BTW,

I agree that there is little need to convert to YUV just for the sake of converting, which is why I have no problem with using floats in the YUV structure. But accuracy is very important when dealing with images, which is why I am looking for 99% or so. I have 100%, but at the cost of speed: every number in the range works perfectly. I admit that I have not properly timed the FPU version yet, though converting RGB>YUV>RGB without error takes about 1.5 seconds on my P3-700 for all 16.7 million colors. That is still too slow for any kind of image processing, but OK for small bitmaps. I hope to get the time to run a speed vs accuracy test on all the algos sometime this week.
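
As a rough cross-check of the timing quoted above, the figure can be turned into clocks per conversion. This is only back-of-the-envelope arithmetic in Python: it assumes a nominal 700 MHz clock and exactly 2^24 colours, and the result includes whatever loop and call overhead the test had.

```python
def clocks_per_item(seconds, clock_hz, items):
    """Average CPU clocks spent per item, loop overhead included."""
    return seconds * clock_hz / items

# Assumed figures: 1.5 s, nominal 700 MHz, 2^24 colours
per_colour = clocks_per_item(1.5, 700e6, 2**24)
print(f"{per_colour:.1f} clocks per RGB>YUV>RGB round trip")
```

That works out to a little over 60 clocks per round trip, in the same region as the "under 60" quoted for the integer version, although the two figures were measured on different machines and are not directly comparable.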

raymond

This is getting interesting.

I would suspect that graphics editors mostly use 32-bit floats to compute YUVs, perform their modifications on those floats and then convert from the modified floats back to RGBs.

I thus ran some tests for that assumption and found that the RGBs seem to be reconstructed with 100% accuracy based on several different samples I tested throughout the RGB ranges.

I then modified the "CPU" code to store the YUV values with the 16 bits of the fraction and, as expected, the RGBs seem to be reconstructed with 100% accuracy based on several similar samples throughout the RGB ranges.

Attached are the two files of the source code for doing the testing (one RGB at a time), each containing the two necessary procedures.

If your tests should prove that the CPU route may be the fastest, I can provide more help for the computations you would need to do on that type of fixed point data.

Raymond

Edit: Had changed the location of some labels before zipping the source files and had forgotten to remove one of the redundant labels. New zip file is corrected.


[attachment deleted by admin]