Downsampling and Gaussian blur

I talked about several strategies to optimize convolutions in some of my previous posts. I still got to talk about how to approximate a Gaussian blur using a multi-step Box blur in a future post. However, there is yet another good technique to optimize a Gaussian blur that may come handy in some cases.

This post is inspired by a need that I had some days ago: Say that you need to do a 3D Gaussian blur on a potentially humongous 3D data buffer. Working with downsampled data sounds ideal in terms of storage and performance. So that’s what I am going to talk about here:

What happens when downsampling is mixed with a Gaussian blur?

The idea:

Here’s the 0-centered and un-normalized 1D Gaussian function:

G( x , \sigma ) = e^\frac{ - x^2 }{ 2 \sigma ^2 }

The sigma parameter in the Gaussian function stretches the bell shape along the x axis. So it is quite straightforward to understand that if one downsamples the input dataset by a scale factor k, then applies a (smaller) Gaussian where sigma’=s/k, and finally upscales the result by the same scale factor k, the result will approximate a true Gaussian on the original dataset where sigma=s.

In cleaner terms: if one has an input dataset (e.g., an image) I and wants to have it blurred by a Gaussian where sigma=s:

1- I’<=I downsampled by a certain scale factor k.
2- I”<=I' blurred by a small Gaussian where s’=s/k.
3- I”’<=I'' upscaled by a scale factor k.

How good is this approximation?

The Sampling Theorem states that sampling a signal at (at least) twice its smallest wavelength is enough. Which means that downsampling cuts frequencies above the Nyquist limit (half the sampling rate). In other words: Downsampling means less data to process, but at the expense of introducing an error.

Fortunately, a Gaussian blur is a form of low-pass frequency filter. This means that blurring is quite tolerant to alterations in the high part of the frequency spectrum.

Visual evaluation:

In the examples below I am downsampling with a simple pixel average, and I am upscaling with a simple bilinear filter. The 2×2 grids below compare:

1- Top-left – The original image I.
2- Top-rightI downsampled and upscaled by k (note the blocky bilinear filter look).
3- Bottom-left – The resulting image I”’.
4- Bottom-rightI blurred by a true Gaussian where sigma=s.

In these examples, I chose k=sigma for simplicity. This means that the small Gaussian uses sigma’=1.

Gaussian blur where sigma=4

Gaussian blur where sigma=16

Gaussian blur where sigma=64

Conclusion:

As shown, the approximation (bottom-left vs. bottom-right) is pretty good.

The gain in speed depends on multiple implementation factors. However, as I explained above, this post was inspired by a need to cope with a cubic memory storage problem when doing Gaussian blurs on a 3D buffer. Working with a heavily downsampled buffer clearly helps in that sense. And it is needless to say that decreasing the amount of data to process by k^3 also brings a dramatic speed boost, making it possible to use tiny separable convolutions along the 3 (downsampled) axes.

Note that one might pick any downsampling scale factor k>=1. The higher the value of k, the higher the approximation error and the smaller and faster the convolution.

The choice k=sigma offers a good trade-off between approximation error and speed gain as shown above.

Glare patterns

Glare in photography is due to Fraunhofer diffraction as light from distant objects passes through the camera diaphragm.

There is a magical connection between Fraunhofer diffraction (physics) and the Fourier Transform (math). As a matter of fact, the intensity of the Fraunhofer diffraction pattern of a certain aperture is given by the squared modulus of the Fourier Transform of said aperture.

Assuming a clean and unobstacled camera, the aperture is the diaphragm shape. Here you have the diffraction patterns that correspond to some basic straight-blade (polygonal) diaphragms.

Glare patterns

Interestingly, the Fourier Transform produces one infinite decaying streak perpendicular to each polygon edge. When the number of edges is even, the streaks overlap in pairs. That is why an hexagonal diaphragm produces 6 streaks, and an heptagonal diaphragm produces 14.

The leftmost pattern happens to be the Airy disk. The Airy disk is a limit case where the number of polygon edges/streaks is infinite.

The examples above were generated at 256×256. The visual definition of the pattern naturally depends on the resolution of the buffers involved in the computation of the Fourier Transform. However, note that the FT has an infinite range. This means that for ideal polygonal shapes, the streaks are infinitely long.

In the practical case, buffers are far from infinite, and you hit one property of the Fourier Transform that is often nothing but an annoyance: the FT is cyclic. The image below depicts what happens when one pumps up the intensity of one of the glare patterns obtained above: the (infinite) streaks, warp-around the (finite) FT buffer.

Cyclic glare pattern

Bonus: Here’s some real-life glare I captured this evening at the European Athletics Championships.

Real-life glare

Circular & radial blur

Circular and radial blur can be implemented in different ways. The method I am going to describe here is reasonably efficient, provided that there is a hyper-fast 1D box-based blur routine at one’s disposal (more on that in a future post). The algorithm is quite straightforward to implement, and also has the beautiful property of being able to do both circular and radial blur at the same time.

I will work on grayscale images, although as usual the process can by extended to color images by performing the same operations on the three R, G, and B channels.

The key to circular/radial blur is to not work on the image space directly, but on a dual space, where cartesian co-ordinates are transformed to polar co-ordinates around the central pixel. Like this:

Cartesian-to-polar transform

Each column in the transformed image is one of the ‘spokes’ that go from the center to one of the pixels at the perimeter in the original image. The length of the largest spoke is half a diagonal, and the perimeter of a WxH image has 2·(W+H-2) pixels. So the transformed image is a buffer of dimensions ceil(sqrt(W^2+H^2)/2) and 2·(W+H-2).

We also need an inverse transform that restores the original image from its polar form.

Note that, for better results, the transform and also its inverse must do proper filtering. Since the spokes are diagonals that do not follow the arrangement of the pixels in the original image, the process of transforming and un-transforming is not exactly reciprocal. i.e., un-transforming the transformed image does not restore the original image identically. In simpler words: this process adds some little blur due to filtering. However, this is ok, because we’re aiming at circular/radial blur after all.

Below are the schematics of the type of filtering I am using in RCSDK. When I sample a pixel centered at (xm,ym) along a spoke, I integrate the original image, constrained to a 1×1 square area. This integration simply takes the (up to) 4 overlapped pixels, and weighs each of them by the corresponding surface overlapping factor:

Sample 1×1 px area

Note also that the inverse transform must ‘undo’ the filtering, by splatting contributions to the final image using the same surface overlapping factors.

…and here comes the interesting part.

1- If we 1D-blur the rows in the polar-space image, and then apply the inverse transform, we get a circular blur of the original image.

2- If we 1D-blur the columns in the polar-space image, and then apply the inverse transform, we get a radial blur of the original image.

A fast box-based 1D blur implementation can run in O(n), regardless of the radius of the blur. Let’s assume a square image of side S. The size of the transformed image is 2·(S+S-2)·sqrt(2)·S/2, which means a quadratic complexity, or linear with respect to the number of pixels. The algorithm is made of fairly simple arithmetic operations, and allows for multi-threading.

Here you can take a look at a couple of filtering examples taken from my RCSDK Unit Testing System:

Circular blur

Radial blur

Some bonus remarks:

1- If the amount of blur becomes progressively small as you approach the center, radial blur becomes lens blur.
2- If the amount of blur used in radial blur is different for each color component, you get chromatic aberration.
3- If you work with spectral colors, instead of RGB, chromatic aberration looks great even when the blur stretches colors along long streaks.

Lens blur & heavy chromatic aberration

Some more final remarks:

– In general, blur operations are clamped to the boundaries of the buffer they operate on. However, in the case of circular blur, one must warp-around from the first to the last spoke.
– It is not really necessary to create the transformed polar image, which is (much) larger than the original. One can feed the 1D blur with transformed pixels directly, and save some memory. Doing so doesn’t imply a performance penalty, because the algorithm runs through each spoke only once.