CIE, XYZ, Yxy, RGB, and gamuts

Despite digital images are often given in RGB, RGB is not always a convenient space for image processing. Splitting RGB colors into their luminance and chrominance, and doing the opposite, are very common operations when it comes to image filtering.

The Yxy encoding is a very good solution due to its strong physical/perceptual background. One can go from RGB to XYZ (selecting a certain color-space transform matrix), and then go from XYZ to Yxy using the following formulas:

x = \frac{ X }{ X + Y + Z }

y = \frac{ Y }{ X + Y + Z }

The inverse transform to go from Yxy to XYZ is given by the following formulas:

X = Y \cdot \frac{ x }{ y }

Z = \frac{ X }{ x } - X - Y

The color-space transform matrix that turns RGB values into XYZ values (and its inverse) is a simple 3×3 affine transform matrix. The exact values of the matrix depend on the color-space you are working in. For example, the matrix for sRGB can be found in the Wikipedia sRGB page. Actually, one can build a custom RGB-to-XYZ matrix by defining the xy coordinates of the three primary colors (R, G, and B) and the position of the white-point. There is an excellent explanation of this on this page by Ryan Juckett.

XYZ colors, and the Yxy encoding have some very interesting practical properties. Below are some of them.

1- All human-visible colors have positive X, Y, and Z values.

This means that matrices to go from RGB to XYZ can only have positive coefficients, and a valid (positive) RGB color can only map to a positive XYZ triplet. The opposite is generally not true.

2- The Y value of an XYZ color represents the relative luminance of the color as percieved by the human eye.

So if one computes the Y (luminance) field of an RGB image, one gets a grayscale version of the original image.

3- Y (luminance) and xy (chrominance) are fully independent.

This means that if one encodes an RGB color in Yxy, then amplifies or diminishes the luminance Y’=k·Y, and then go back to RGB from Y’xy, the result is k·RGB.

This is a fundamental property in tonemapping algorithms, which typically do range compression on the luminance, and then just re-plug the original chrominance field to the compressed luminance field.

This property also means that one can do any alteration on the chrominance values (xy) and then go back to RGB while preserving the original luminance (Y). This comes handy when implementing image filters such as Tint (xy rotation around the whitepoint) or Saturation (xy amplification away from the whitepoint).

4- The range of Y is [0..INF), but the range of xy is constrained to [0..1]x[0..1].

Since the chrominance values (xy) are normalized they are independent of the luminance value, and hence they have a limited range. Here’s what happens when one selects a fixed luminance (Y=1, for example) and draws the RGB colors corresponding to all the chrominances in x=[0..1] and y=[0..1]:

sRGB triangle (gamut)

The darkened colors are invalid (not plausible) results where at least one of the RGB components is negative. So the only valid values that xy can get are actually constrained to this triangle, which is, in fact, much smaller than the unit square. As a matter of fact, the maximum distance between two chrominance values in Yxy (in sRGB as displayed above) is around 0.6, derived from the length of the longest edge of the gamut triangle.

Different color-space transforms have larger or smaller gamut triangles, as depicted in this CIE 1931 xy gamut comparison.

Note that the gamut triangle is the same regardless of the color luminance.

Glare patterns

Glare in photography is due to Fraunhofer diffraction as light from distant objects passes through the camera diaphragm.

There is a magical connection between Fraunhofer diffraction (physics) and the Fourier Transform (math). As a matter of fact, the intensity of the Fraunhofer diffraction pattern of a certain aperture is given by the squared modulus of the Fourier Transform of said aperture.

Assuming a clean and unobstacled camera, the aperture is the diaphragm shape. Here you have the diffraction patterns that correspond to some basic straight-blade (polygonal) diaphragms.

Glare patterns

Interestingly, the Fourier Transform produces one infinite decaying streak perpendicular to each polygon edge. When the number of edges is even, the streaks overlap in pairs. That is why an hexagonal diaphragm produces 6 streaks, and an heptagonal diaphragm produces 14.

The leftmost pattern happens to be the Airy disk. The Airy disk is a limit case where the number of polygon edges/streaks is infinite.

The examples above were generated at 256×256. The visual definition of the pattern naturally depends on the resolution of the buffers involved in the computation of the Fourier Transform. However, note that the FT has an infinite range. This means that for ideal polygonal shapes, the streaks are infinitely long.

In the practical case, buffers are far from infinite, and you hit one property of the Fourier Transform that is often nothing but an annoyance: the FT is cyclic. The image below depicts what happens when one pumps up the intensity of one of the glare patterns obtained above: the (infinite) streaks, warp-around the (finite) FT buffer.

Cyclic glare pattern

Bonus: Here’s some real-life glare I captured this evening at the European Athletics Championships.

Real-life glare