Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistent rgb/rgba behavior #4

Open
shekh opened this issue Oct 31, 2017 · 14 comments
Open

inconsistent rgb/rgba behavior #4

shekh opened this issue Oct 31, 2017 · 14 comments

Comments

@shekh
Copy link

shekh commented Oct 31, 2017

I see different maximum values depending on requested decoding format.

decode to CFHD_PIXEL_FORMAT_BGRA:
result: r/g/b/a = 0xFF as expected

decode to CFHD_PIXEL_FORMAT_R210:
result: r/g/b = 0x3FC (expected 0x3FF)

decode to CFHD_PIXEL_FORMAT_B64A:
sample: CFHD_ENCODED_FORMAT_RGB_4444
result: r/g/b = 0xFF00, a = 0xFFFF (expected r/g/b 0xFFFF)
sample: CFHD_ENCODED_FORMAT_RGB_444
result: r/g/b = 0xFF00, a = 0xFFF0

Main issue is underscaled rgb, alpha seems ok where it is present.

@dnewman-gpsw
Copy link
Collaborator

Internally all 444 formats are 12-bit, not 16-bit. So when 0xffff is input using CFHD_PIXEL_FORMAT_B64A, it is shifted to 0x0fff, encoded as 0x0fff, which is shifted back to 0xfff0 for the pixel container precision (not 0xff00 as you have suggested, I think that was a typo.) For alpha channel the codec will clip to 0xffff as a special case, but I'm not sure that is the correct behavior for RGB data. The issue is round-trip losses if 0x0fff is scaled to 0xffff, with a wavelet encode in between. Round-tripping quality is a big concern for codecs, but an optional scaled input and output pixel format would be straight forward to add, but the PSNR would drop slightly over round-trip encodes.

For CFHD_PIXEL_FORMAT_R210 I got 0x3FF for 0x3FF input (using a white frame.)

@shekh
Copy link
Author

shekh commented Oct 31, 2017

Maybe it depends on how to encode sample? I used vfw codec feeding rgba32.
There is some dithering/noise but after some averaging I get 0xFF00, not 0xFFF0.
If I understand correctly the values are shifted without scaling both at encoding and decoding.
So to interpret the decoded data I also need to know how many bits was fed into the encoder?

@dnewman-gpsw
Copy link
Collaborator

dnewman-gpsw commented Oct 31, 2017

If you used VFW you only feed 0xff into the codec. I'm not convinced that should be 0x3ff (10-bit) or 0xffff (16-bit) should be on the output. 0x3fc is 0xff shifted up by 2 bits. Seems to be working correctly. The ambiguity over the value of white when switching bit depths has been long known. After Effects used to use 15+1bit representation where white was 0x8000, maybe to help with this. As codec's performance is measured by its ability to match its input, I think CineForm is correct here. Currently it matches the input, but if you want 0xff00 to represent white (or know 8-bit input 0xff, should be 0xffff on 16-bit output) then metadata would be the solution. Fortunately CineForm has an existing metadata engine that could be used, we just need to define a suitable FourCC.

@shekh
Copy link
Author

shekh commented Nov 1, 2017

It seems SAMPLE_HEADER.input_format can be used as a clue.

@dnewman-gpsw
Copy link
Collaborator

Yes it might work, however I'm still not convinced that gaining up the image should be the automatic behavior for 8-bit source encodes -- particularly if it impacts performance or round-trip quality.

@shekh
Copy link
Author

shekh commented Nov 1, 2017

It is not uncommon approach to scale rgb values when changing bitdepth.
Some examples:
Convert 8-bit image in Photoshop to 16-bit. 0xFF is scaled to 0xFFFF.
r210 decoder in FFmpeg expands to 16-bit:
*dst++ = r | (r >> 10); *dst++ = g | (g >> 10); *dst++ = b | (b >> 10);
Up-down conversion with precise rounding is quite trivial and cannot be a performance issue.
I don't claim it is necessary to change Cineform behavior. Scaling or not can be left to application.
However if the 8-bit source was encoded without scaling, it is best to deal with it as 8-bit on decoding (in comparison, the codecs found in FFmpeg explicitly differentiate each bitdepth as separate internal format).
One more question: if active metadata is applied to modify decoded picture, is the scaling the same (skipped)?

@dnewman-gpsw
Copy link
Collaborator

Interesting, so for 444 decodes to 12-bit, the upscale to 16-bit would be:
*dst++ = (r<<4)| (r>>8); *dst++ = (g<<4)| (g>>8); *dst++ = (b<<4) | (b>>8);

For 8-bit input to 12-bit encodes:
*dst++ = (r<<4)| (r>>4); *dst++ = (g<<4)| (g>>4); *dst++ = (b<<4) | (b>>4);

Changing inputs before encoding is really weird, but I understand the logic. There would need to new tuning for DC offsets, but worth the experiment. Still expect 8-bit in to 8-bit out will decrease in PSNR, can;t know until trying.

@dnewman-gpsw
Copy link
Collaborator

If Active Metadata was used, the would be more compute expensive as it involves additional buffer copies and the use of multiples rather shifts. CDL values for RGB Gain would be 1.00391 to take 0xff00 to 0xffff.

@olafmatt
Copy link

olafmatt commented Nov 2, 2017

Multipliers is a bad idea, IMHO, since it would introduce rounding errors and you would not be able anymore to recover the "original" data. Filling the lower bits with the upper bits from the same value is a common thing to do in image applications. It makes sure you can retrieve the original data and it makes sure you get full-scale output when the input was also full-scale (or zero when input was zero).

But I'd vote against changing anything. Just add a comment to the header that defines the CFHD_PixelFormats (i.e. CFHDTypes.h) saying that on decode only the upper 12 or 10bit contain valid data. That way the user can decide how to expand them to true 16bit. Some people might want to convert the output to float anyway, so any extra ORing-in of lower bits is wasted processing time if you can instead just use the proper multiplier to turn it into float directly.

@shekh
Copy link
Author

shekh commented Nov 24, 2017

I am trying to verify something.
I made white rectangle and saved it as rgba using rgba32 source format.
On decoding to B64A it is seen as: r,g,b=0xFF00, a=0xFFFF.
But when I apply active metadata (framing: vertical offset) it becomes r,g,b=0xFFDF, a=0xFFE0.
I don't understand what happens.

@olafmatt
Copy link

olafmatt commented Nov 24, 2017

I can not fully confirm this, but I do see also some strange stuff going on.

Edit: Spoke to soon...:
Just made two files encoding pure white (0xFFFF for all channels) and both with quality set to High or Filmscan1 the file decode to 0xFFF0 RGB and 0xFFFF for A. A file encoded as just RGB has the alpha channel set to 0xFFF0 when decoded to RGBA.

But since we're dealing with lossy compression, I would somehow understand when not all values come out at exactly the same numerical value they went in. But seeing something higher than 12bits looks somehow wrong indeed.

I have a feeling that creating a new decoder will not give you a fully initialized decoder. Some month ago (before this was open-sourced) I was writing a plugin for a commercial application where we reuse a limited set of decoders for whatever clip the user might be playing. So sometimes a decoder that was previously used on an RGB encoded clip would then be used for an YUV clip and so on. This often gave YUV frames that were too light. Only solution we found was to destroy the decoder and create a new one when switching between RGB(A) and YUV clips.
Similar stuff with the encoder... try to first encode YUV (from YU64 format) and then try to use the same encoder (after a CFHD_PrepareToEncode(), of course) to make RGB from RG64 format. You'll get something that looks like RGB 4:2:2.

@dnewman-gpsw
Copy link
Collaborator

dnewman-gpsw commented Nov 24, 2017

CFHD_PrepareToEncode() must be call between any format change, so that is not surprising.

shekh, Once you enable active metadata with any change, it does go through more processing than just a vertical offset, but that error seems more pronounce than I would have expected. That will take some effort to track down.

@olafmatt
Copy link

olafmatt commented Aug 6, 2018

There seems to be one more thing wrong with alpha channel encoding. It seems the alpha channel gets a curve applied on encode that maps the 0-4095 range into 256-3823. Then it excluded 0 and "very high" values from being run through that curve. Presumably in order to make sure that even with rounding errors (i.e. lossy compression) fully transparent and fully opaque always come out of the decoder undamaged.

However, the condition for applying the curve is this (in ConvertBGRA64ToFrame_4444_16s() and similar functions in frame.c):

if(a > 0 && a < (255<<4))

I'd say this should be changed to:

if(a > 0 && a < 4095)

Otherwise all alpha values from 4080 on will come out of the decoder as 4095.

This curve applied to the alpha channel also explains why identical values for R, G ,B and A show different "rounding errors" for RGB and A when being encoded and decoded again.

Also in frame.c there are cases where decoding from YUV or RGB into an RGBA output format causes the alpha channel being filled with 0xffff (instead of taking the max. value for the respective bit depth).
In case there is an alpha channel present, it removes the curve, shifts up by 4 bits and then clips max. value to 0xffff. See ConvertLowpass16sRGBA64ToRGBA64() in frame.c. Since 4095 got encoded without the curve applied, such a value would now get the curve "removed" which means we end up with a value above 4095.
In both cases it should allow a max. alpha channel value of either 0xFFF0 or 0xFFC0, depending on bit depth.

Edit: There are also several cases in convert.c where the max. value gets limited to USHRT_MAX.

@dnewman-gpsw
Copy link
Collaborator

I agree ConvertBGRA64ToFrame_4444_16s() and ConvertRGBA64ToFrame16s() should be using
if(a > 0 && a < 4095)

Fixing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants