I am doing some work in the new ArionFX tonemapping system (v3). This is an example that abuses a bit of lens radial blur with spectral chromatic aberration:
EDIT: I added an in-depth explanation in the following post.
Some times one needs to detach the luminance from the chrominance in an image. For example, some tonemapping operators compress the luminance of HDR radiances down to a displayable range, and then add the chrominances back. Another situation when splitting information in luminance and chrominance is fundamental is image/video compression (more on this at the end of this post).
Assuming that one starts with colors in RGB, there are different ways to extract the luminance and chrominance values. Let’s start by saying that color spaces are three-dimensional, so no matter what basis you use to encode colors, you will always end up with (at least) three numbers. Actually, typical encodings for luminance/chrominance use one coordinate for the luminance, and two coordinates for the chrominance (e.g., YCrCb, YUV, Yxy, …).
I will use this Arion 3 render for this little dissertation. The image uses green predominantly, and there’s little or no blue, so it probably isn’t the best choice ever to talk about colors. But I find the image pretty anyway.
If we split the three R,G,B components of the image, this is what we get:
Each component looks like a separate grayscale image.
The RGB->YUV transform is a simple 3×3 matrix product (i.e., an affine three-dimensional transform). The exact coefficients of the 3×3 matrix vary from one standard to another. Take a look at Wikipedia for more details.
Interestingly, the Y (luminance) field is a grayscale version of the original image (as expected), but the U,V chrominance fields look like a mid-tone, dumbed-down version of the original.
Note that the 3×3 matrix used to convert from RGB to YUV has negative values which affect the chrominance output. Actually, the U,V values are typically centered around 0 and can be either positive or negative. The luminance component can only be positive, though. For the images shown on this post, U and V were offset to +0.5.
The Yxy system is a bit more complicated. First, one must go from RGB to CIE XYZ. Again, the exact coefficients depend on the color space, but we’re speaking of another affine transform. One that only has positive coefficients, this time. Wikipedia explains how to go from XYZ to Yxy in this page.
The chrominance values x,y are normalized. We’ll get to the implications of this in a minute.
Again, the chrominance fields look like mostly mid-gray images, with little contrast.
I mentioned normalization a few sentences ago. Let’s see what happens when we compute the YUV and Yxy fields when the intensity of the original image is divided by 2:
As both luminance fields got half as bright, the chrominance fields got less contrasted in YUV, but remained completely unaffected in Yxy. This is the effect of normalization. The chrominance fields in Yxy are independent of the luminance power. This is not true to YUV.
This is a very important property that makes Yxy suitable for cases of use where one is going to modify the luminance field, and then expects to be able to restore colors regardless of what the changes in luminance have been. A perfect example of this is Arion’s framebuffer tonemapping. Other examples worth mentioning are luminance-based image sharpening and denoising. Sharpening the luminance of an image leads to better results (fewer chromatic artifacts) than sharpening the RGB channels directly.
Below you can see what YUV and Yxy look like in RCSDK’s color chart, which covers the whole RGB range.
Bonus remark: Back in the late 90’s I spent a lot of time doing research on image and video compression using fractals, wavelets, and other exotic techniques. So I love the subject of data encoding.
Standards such as JPEG and MPEG make use of color spaces like YUV for two reasons:
1- Our eyes are far less sensitive to changes in chrominance than they are to changes in luminance. So it makes sense to spend fewer bits encoding the chrominance fields. It is even normal to downsample the chrominance fields to kill directly some information that your eyes and brain will (literally) overlook.
2- As seen above, it is evident that an encoding like YUV exhibits less entropy than the raw RGB channels, which look like 3 separate grayscale images. Once properly color-encoded, the same information retains most of the variance in the luminance field.
So, from at least two points of view (the human perceptual system and pure Information Theory) encodings that split luminance and chrominance are key.
If you code Computer Graphics stuff, or if you work in any field of science, then you are necessarily familiar with the Gaussian function (a.k.a. Normal distribution, Gaussian point-spread function, …).
The formulations below, where s=sigma is the standard deviation, assume that the mean m is 0:
The Gaussian function is commonly used as a convolution kernel in Digital Image Processing to blur an image. Computing a convolution is generally very slow, so choosing a convolution kernel that is as small as possible is always desirable.
The image below depicts a canonical Gaussian function where m=0 and s=1 in the range x=[-4..+4]. Changing the mean shifts the function to the left or right, and changing the standard deviation stretches or compresses the bell-shaped curve, but always leaving its surface (integral in [-INF..+INF]) unaffected and equal to 1.
The red, green, and blue regions are the intervals of radius sigma, 2·sigma, and 3·sigma around the mean. Interestingly, the red region comprises around 68% of the total integral of the Gaussian. The red+blue regions comprise around 95%, and the red+blue+green regions comprise more than 97% of the total.
(Wikipedia has a fantastic explanation of this in the Percentile page).
In other words, ignoring a Gaussian function beyond a radius of 3·sigma still leaves you with more than 97% of its total information.
In 2D, the Gaussian function can be described as the product of two perpendicular 1D Gaussians, and due to radial symmetry, the same principle applies:
This knowledge is very valuable when building a Gaussian-based blur convolution kernel.
As a summary:
Say that you intend to do a Gaussian blur of sigma=5 pixels. The optimal kernel dimensions would be [(15+1+15)x(15+1+15)] => [31×31].
The difference between using an infinite or a size-limited Gaussian kernel is negligible to the naked eye. The animated image below presents two Gaussian convolutions of the same sigma=5. One of them uses a kernel of radius 3·sigma=15, and the other one uses an infinite radius. As you can see, the difference can only be slightly noticed around areas of very high power. The base image used for this example intentionally features a very high dynamic range of approx. [0..625].
More on convolutions soon…
I feel fortunate to work on a field where not only I deal on a daily basis with plenty of algorithms worth visualizing, but my job itself is to develop algorithms -to- visualize. This is, the end goal of the algorithms I work with is to visualize (photo-real) stuff.
For example, here’s a visualization of the highly efficient fiber-field rendering routines (hair/fur) that I’ve been working on during the past months. This image was rendered by Erwann Loison in Arion 3. But, if you want to put it this way, an Arion render is nothing but a visualization of the light simulation algorithms that are running inside. 🙂
Now, here’s a much more spartan visualization of some of the routines involved in Arion’s hair/fur rendering. This is an image taken from the RCSDK Unit Testing system. It visualizes some of my custom ray-to-primitive intersection routines (e.g., infinite planes, spheres, cylinders, cones, and connectable fiber (hair) segments).