HDR Conversion Numerical Report

Before You Start

Thanks to everyone who has tried out HDR Conversion and provided lots of valuable feedback.

When working with multiple code libraries, the most painful problems are “black boxes” and inconsistency. This is especially true for HDR, an emerging and rather complex image format for which different libraries may produce different numerical results.

For this reason, I tested HDR Conversion version 0.1.5 against Google’s open-source libultrahdr version 1.4.0 as the baseline. The tests revealed some areas for improvement and prompted a few thoughts, both of which are recorded here.

The three samples used are:

A: A JPG taken with Phone V, containing a 1/4-resolution Gainmap, with a colour space of Display P3.
B: An HDR JPEG exported from Adobe Lightroom, containing a full-resolution Gainmap, with a colour space of sRGB.
C: A JPG taken with Phone H, containing a 1/4-resolution Gainmap, with a Baseline space of Display P3 and an Alternate space of BT.2020.

Metadata Reading

Gainmap metadata is a crucial part of this HDR format. HDR Conversion’s current behaviour is as follows:

When reading ISO 21496-1, it follows the data structure in the standard document and parses the binary data directly. In the binary, some floating-point metadata is stored as numerators and denominators, which HDR Conversion converts to floating-point numbers when reading.
When reading UltraHDR, it uses xml.etree.ElementTree to parse the XML portion.
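
The two reading paths can be sketched roughly as follows. This is a minimal illustration, not HDR Conversion's actual code: the function names are hypothetical, the byte layout in `read_rational` (big-endian signed 32-bit numerator followed by an unsigned 32-bit denominator) is an assumption to check against the standard, and the XMP packet is a hand-written sample.

```python
import struct
import xml.etree.ElementTree as ET

# Namespace of the hdrgm:* attributes in UltraHDR XMP packets.
HDRGM_NS = "http://ns.adobe.com/hdr-gain-map/1.0/"


def read_rational(buf: bytes, offset: int = 0) -> float:
    """Decode one numerator/denominator pair into a float.

    Assumed layout: big-endian int32 numerator, then uint32 denominator.
    """
    numerator, denominator = struct.unpack_from(">iI", buf, offset)
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return numerator / denominator


def read_hdrgm_attributes(xmp_text: str) -> dict:
    """Collect every hdrgm:* attribute in an XMP packet, as strings."""
    prefix = "{" + HDRGM_NS + "}"
    found = {}
    for element in ET.fromstring(xmp_text).iter():
        for key, value in element.attrib.items():
            if key.startswith(prefix):
                found[key[len(prefix):]] = value
    return found


# Hand-written sample packet, for illustration only.
SAMPLE_XMP = (
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description xmlns:hdrgm="http://ns.adobe.com/hdr-gain-map/1.0/" '
    'hdrgm:Version="1.0" hdrgm:GainMapMax="2.5"/>'
    "</rdf:RDF></x:xmpmeta>"
)

print(read_rational(struct.pack(">iI", 5, 2)))
print(read_hdrgm_attributes(SAMPLE_XMP))
```

Per-channel values stored as child elements rather than attributes are not handled in this sketch.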

The libultrahdr baseline values were taken from its probe output, using:

ultrahdr_app -m 1 -P

The maximum absolute errors of the metadata values read from the three samples, relative to libultrahdr, are as follows:

Sample	maxContentBoost	minContentBoost	hdrCapacityMax
A	2.91e-6	2.12e-7	2.91e-6
B	2.27e-6	4.69e-7	3.31e-6
C	3.29e-7	0	3.31e-6

All errors are below 4e-6 and likely come from decimal truncation in libultrahdr’s probe output, so HDR Conversion’s metadata reading can be considered reliable.

Reading JPEG

An ISO 21496-1 or UltraHDR file contains at least two JPEG images: the Baseline and the Gainmap. HDR Conversion currently uses Pillow to read JPEG images.

In libultrahdr, the raw YCbCr planes are output first, then converted to RGB using BT.601 coefficients. In testing, some pixels showed a difference of one code value, possibly due to integer rounding errors. I tried reading the JPEGs with imagecodecs, and the results were identical to Pillow.
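
To show how a one-code-value difference can arise from rounding alone, here is a small sketch of the full-range BT.601 conversion used by JFIF. The `nearest` switch is a hypothetical parameter that only illustrates two rounding policies; it does not describe how Pillow, imagecodecs or libultrahdr actually round.

```python
import math


def ycbcr_to_rgb(y, cb, cr, nearest=True):
    """Convert one full-range 8-bit YCbCr pixel to RGB (JFIF / BT.601)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)

    def quantise(value):
        # Round to nearest or truncate, then clamp to the 8-bit range.
        code = math.floor(value + 0.5) if nearest else math.floor(value)
        return min(255, max(0, code))

    return quantise(r), quantise(g), quantise(b)


pixel = (100, 128, 130)
print(ycbcr_to_rgb(*pixel, nearest=True))   # (103, 99, 100)
print(ycbcr_to_rgb(*pixel, nearest=False))  # (102, 98, 100)
```

The same YCbCr sample yields RGB values that differ by one code value in two channels, purely because of the rounding policy.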

After the Gainmap is applied, this one-code-value difference can grow to a maximum absolute difference of about 0.034 in linear RGB. Because errors introduced by JPEG decoding depend entirely on the decoder implementation, they will not be optimised further.

Gainmap Application

After reading the two images and the metadata, HDR Conversion applies them according to the ISO 21496-1 standard. The main steps are:

Convert the Baseline image from the JPEG-decoded integer code values to linear RGB.
Resample the Gainmap to the Baseline’s resolution.
If the colour spaces don’t match, convert one of them according to the flag in the metadata.
Apply the Gainmap to the Baseline image to obtain the HDR linear image.

HDR Conversion’s current behaviour is as follows:

Linearisation: if an ICC profile is present, linearise according to the TRC in the ICC; otherwise, linearise using the sRGB EOTF.
Resampling: uses the Lanczos4 interpolation provided by OpenCV.
Colour space conversion: reads the ICC and builds an RGB-to-RGB conversion matrix for the conversion.
Gainmap application: applies the Gainmap according to the formula in the ISO 21496-1 standard.
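
The linearisation and application steps can be sketched for a single channel value as below. This follows the commonly published form of the gain map equations. The function names, dictionary keys and the values in `DEMO_META` are hypothetical, and resampling and colour space conversion are left out.

```python
def srgb_eotf(v):
    """Parametric sRGB EOTF for an encoded value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def apply_gain(base_linear, gain_code, meta, weight=1.0):
    """Apply one 8-bit Gainmap sample to one linear Baseline value.

    weight is the display-dependent factor in [0, 1]; 1.0 applies the
    full gain, 0.0 returns the Baseline unchanged.
    """
    encoded = gain_code / 255.0
    # Undo the gamma applied to the Gainmap when it was encoded.
    log_recovery = encoded ** (1.0 / meta["gamma"])
    # Map [0, 1] back to the log2 gain range given in the metadata.
    log_gain = meta["gain_min_log2"] + (
        meta["gain_max_log2"] - meta["gain_min_log2"]
    ) * log_recovery
    scaled = (base_linear + meta["offset_base"]) * 2.0 ** (log_gain * weight)
    return scaled - meta["offset_alt"]


# Made-up metadata for illustration, not taken from samples A, B or C.
DEMO_META = {
    "gamma": 1.0,
    "gain_min_log2": 0.0,
    "gain_max_log2": 2.0,
    "offset_base": 1 / 64,
    "offset_alt": 1 / 64,
}

base = srgb_eotf(200 / 255)
print(base, apply_gain(base, 255, DEMO_META))
```

With a `weight` of 0 the function returns the Baseline value unchanged, which is a convenient sanity check for any implementation.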

The absolute deviations in linear RGB for the three samples are as follows, measured against libultrahdr’s raw linear RGBA half-float output:

Sample	max	mean	median	p99	diff > 0.1
A	1.7926	0.002780	0.000161	0.04654	0.3396%
B	0.03392	0.000933	0.000633	0.003876	0%
C	0.5564	0.004344	0.001120	0.04942	0.1461%
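
For reference, statistics of this kind can be computed along the following lines. This is a sketch under assumptions: `deviation_stats` is a hypothetical helper, p99 is taken with the nearest-rank definition, and "diff > 0.1" is read as the percentage of values whose absolute deviation exceeds 0.1. The exact definitions behind the table may differ.

```python
import statistics


def deviation_stats(reference, candidate, threshold=0.1):
    """Summarise absolute deviations between two flat value sequences."""
    diffs = sorted(abs(r - c) for r, c in zip(reference, candidate))
    n = len(diffs)
    # Nearest-rank 99th percentile: rank = ceil(0.99 * n), 1-based.
    p99_rank = (99 * n + 99) // 100
    return {
        "max": diffs[-1],
        "mean": statistics.fmean(diffs),
        "median": statistics.median(diffs),
        "p99": diffs[p99_rank - 1],
        "over_threshold_pct": 100.0 * sum(d > threshold for d in diffs) / n,
    }


print(deviation_stats([0.0, 0.0, 0.0, 0.0], [0.0, 0.05, 0.2, 0.3]))
```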

The sources of difference, ranked by impact, are as follows:

Gainmap resampling: sample B, which needs no resampling, shows significantly smaller differences than samples A/C, which do. In libultrahdr, Gainmap resampling uses Shepard IDW, whereas HDR Conversion uses Lanczos4. Lanczos4 can overshoot at local edges, leading to larger local differences.
sRGB EOTF: HDR Conversion uses a parameterised sRGB EOTF, while libultrahdr uses a 1D LUT, which may account for some differences.
ICC CMS: HDR Conversion currently includes a simple ICC parser that can handle any compliant ICC file and perform colour space conversion based on its actual contents. libultrahdr categorises colour spaces into three types (BT.709, Display-P3 and BT.2100) and uses built-in matrices for conversion.
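
As an illustration of what a fixed RGB-to-RGB matrix looks like, the sketch below derives a Display P3 to BT.709 matrix from the published chromaticities and the D65 white point. It is neither HDR Conversion's ICC-based method nor libultrahdr's built-in tables, and all function names are hypothetical.

```python
D65 = (0.3127, 0.3290)
BT709 = ((0.640, 0.330), (0.300, 0.600), (0.150, 0.060))
DISPLAY_P3 = ((0.680, 0.320), (0.265, 0.690), (0.150, 0.060))


def xy_to_xyz(x, y):
    """Chromaticity (x, y) to XYZ with Y normalised to 1."""
    return [x / y, 1.0, (1.0 - x - y) / y]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def rgb_to_xyz(primaries, white):
    """Build the RGB-to-XYZ matrix so that RGB (1, 1, 1) maps to white."""
    columns = [xy_to_xyz(*p) for p in primaries]
    unscaled = [[columns[c][r] for c in range(3)] for r in range(3)]
    inv = inverse_3x3(unscaled)
    white_xyz = xy_to_xyz(*white)
    scale = [sum(inv[r][k] * white_xyz[k] for k in range(3)) for r in range(3)]
    return [[unscaled[r][c] * scale[c] for c in range(3)] for r in range(3)]


def rgb_to_rgb(src, dst, white=D65):
    """Linear RGB conversion matrix between two sets of primaries."""
    return matmul(inverse_3x3(rgb_to_xyz(dst, white)), rgb_to_xyz(src, white))


P3_TO_709 = rgb_to_rgb(DISPLAY_P3, BT709)
for row in P3_TO_709:
    print([round(v, 4) for v in row])
```

A matrix built from ICC colorant tags can differ slightly from one built this way, for example because ICC colorants are expressed relative to a D50 connection space. That is one plausible source of small numerical differences between the two approaches.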

To address the resampling issue, HDR Conversion has introduced a manually implemented Shepard IDW interpolation method in the develop branch, along with other optional resampling algorithms. Test results using sample A are as follows:

resize method	max	mean	median	p99	p99.9	diff > 0.1
Shepard IDW	1.0203	0.001052	0.000104	0.01756	0.05921	0.0293%
Lanczos4	1.7926	0.002780	0.000161	0.04654	0.21732	0.3396%

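With Shepard IDW, the output is much closer to the libultrahdr results: the share of values differing by more than 0.1 drops from 0.3396% to 0.0293%. If ISO 21496-1 specifies a resampling algorithm in the future, HDR Conversion will use it by default.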
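
The idea behind inverse distance weighting can be sketched as follows. This is a generic illustration over the four nearest source samples, with a hypothetical `power` parameter and a simplified corner-aligned sample grid. It is not the exact scheme used by libultrahdr or by the develop branch.

```python
import math


def _idw_sample(gain, sx, sy, neighbours, power):
    """Blend neighbouring samples with weights 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for nx, ny in neighbours:
        distance = math.hypot(sx - nx, sy - ny)
        if distance == 0.0:
            # Exactly on a source sample: return it unchanged.
            return gain[ny][nx]
        weight = 1.0 / distance ** power
        weighted_sum += weight * gain[ny][nx]
        weight_total += weight
    return weighted_sum / weight_total


def idw_upsample(gain, factor, power=2.0):
    """Upsample a 2D list of floats by an integer factor."""
    src_h, src_w = len(gain), len(gain[0])
    out = []
    for oy in range(src_h * factor):
        row = []
        for ox in range(src_w * factor):
            # Output position expressed in source coordinates.
            sx, sy = ox / factor, oy / factor
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, src_w - 1), min(y0 + 1, src_h - 1)
            corners = ((x0, y0), (x1, y0), (x0, y1), (x1, y1))
            row.append(_idw_sample(gain, sx, sy, corners, power))
        out.append(row)
    return out


upsampled = idw_upsample([[0.0, 1.0], [1.0, 0.0]], 4)
print(min(map(min, upsampled)), max(map(max, upsampled)))
```

Because every output value is a weighted average with non-negative weights, it stays within the range of its neighbours. A Lanczos kernel has negative lobes and gives no such guarantee, which is consistent with the overshoot at edges noted above.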

As for the differences caused by the EOTF and CMS, I believe HDR Conversion’s approach is better suited to research and development use, and more closely follows the ISO 21496-1 definition.
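
To get a feel for the size of the EOTF-related difference, the sketch below compares the parametric sRGB EOTF with a nearest-entry lookup table over all 8-bit code values. The table size of 1024 and the nearest-entry indexing are arbitrary assumptions for illustration. They are not taken from libultrahdr.

```python
def srgb_eotf(v):
    """Parametric sRGB EOTF for an encoded value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def build_lut(size):
    """Sample the EOTF at `size` evenly spaced points in [0, 1]."""
    return [srgb_eotf(i / (size - 1)) for i in range(size)]


def lut_eotf(v, lut):
    """Look up the nearest table entry, without interpolation."""
    return lut[round(v * (len(lut) - 1))]


def max_lut_error(size):
    """Largest |parametric - LUT| over the 256 possible 8-bit codes."""
    lut = build_lut(size)
    return max(
        abs(srgb_eotf(code / 255) - lut_eotf(code / 255, lut))
        for code in range(256)
    )


print(max_lut_error(256))   # table entries coincide with the 8-bit codes
print(max_lut_error(1024))  # entries fall between the 8-bit codes
```

When the table entries do not line up with the input code values, the lookup introduces a small quantisation error that the parametric form avoids.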

About CICP

In the ICC of Gainmap JPEGs, you can often find some CICP tags. Apart from the ones explicitly described in the standard, their exact purpose remains unclear.

If the Baseline and Alternate need to use different colour spaces, the ICC in the Gainmap must include a CICP tag to indicate the colour space.

The transfer function in the CICP tag doesn’t seem to have any effect, since Gainmap linearisation uses the Gainmap metadata rather than the ICC. Some images have no TRC field in their Gainmap ICC and instead specify PQ or HLG in the CICP tag, but the purpose of doing so is unclear.
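
For anyone who wants to inspect these tags, here is a minimal sketch of locating and decoding a `cicp` tag in an ICC profile. The layout used (a 128-byte header, a tag table of 12-byte entries, and a tag body of a 4-byte type signature, 4 reserved bytes and four 8-bit fields) reflects my reading of the ICC specification and should be checked against it. The helper names and the synthetic profile are hypothetical, and the name tables list only a few code points.

```python
import struct

# A few ITU-T H.273 code points, for readability only.
PRIMARIES = {1: "BT.709", 9: "BT.2020", 12: "Display P3"}
TRANSFER = {13: "sRGB", 16: "PQ", 18: "HLG"}


def read_cicp(profile: bytes):
    """Return the CICP fields of an ICC profile, or None if absent."""
    (tag_count,) = struct.unpack_from(">I", profile, 128)
    for index in range(tag_count):
        signature, offset, size = struct.unpack_from(
            ">4sII", profile, 132 + 12 * index
        )
        if signature == b"cicp" and size >= 12:
            # Skip the 4-byte type signature and 4 reserved bytes.
            primaries, transfer, matrix, full_range = struct.unpack_from(
                "4B", profile, offset + 8
            )
            return {
                "primaries": PRIMARIES.get(primaries, primaries),
                "transfer": TRANSFER.get(transfer, transfer),
                "matrix": matrix,
                "full_range": bool(full_range),
            }
    return None


def build_demo_profile(primaries, transfer):
    """Build a synthetic, otherwise empty profile holding one cicp tag."""
    header = bytes(128)
    tag_table = struct.pack(">I4sII", 1, b"cicp", 144, 12)
    tag_body = b"cicp" + bytes(4) + bytes([primaries, transfer, 0, 1])
    return header + tag_table + tag_body


print(read_cicp(build_demo_profile(9, 16)))
```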
