This project builds intuition behind image filtering and investigates different methods of leveraging frequencies to alter, process, blend, and combine images.
I defined the two finite difference operators D_x and D_y to be np.array([[1, -1]]) and np.array([[1, -1]]).T respectively. I then convolved these two kernels with the image using scipy.signal.convolve2d to produce the partial derivatives with respect to x and y, named gx and gy. To compute the gradient magnitude image, I computed np.sqrt(gx ** 2 + gy ** 2), which treats the corresponding pair of pixel values of the two partial-derivative images as a gradient vector and calculates its L2 norm to obtain the final pixel value. I then binarized this image with thresholds 0.1 and 0.2 to obtain edge images.
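The pipeline above can be sketched as follows; the tiny step-edge image and the threshold value are illustrative stand-ins, not the actual project inputs:

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators
D_x = np.array([[1, -1]])
D_y = np.array([[1, -1]]).T

def gradient_edges(img, threshold):
    """Partial derivatives, gradient magnitude, and a binarized edge image."""
    gx = convolve2d(img, D_x, mode="same")
    gy = convolve2d(img, D_y, mode="same")
    # Treat (gx, gy) at each pixel as a gradient vector and take its L2 norm
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return gx, gy, magnitude, magnitude > threshold

# A tiny synthetic image with a vertical step edge (for illustration only)
img = np.hstack([np.zeros((4, 4)), np.ones((4, 4))])
gx, gy, mag, edges = gradient_edges(img, threshold=0.1)
```

With mode="same" the outputs keep the input's shape; the binarized image is True along the vertical step.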
Partial derivative with respect to x
Partial derivative with respect to y
Gradient magnitude image
Binarized gradient magnitude image with threshold = 0.1
Binarized gradient magnitude image with threshold = 0.2
I created a Gaussian kernel with kernel_size = 10 and sigma = kernel_size / 6 using cv2.getGaussianKernel. I then blurred the image by convolving it with this Gaussian kernel, and the blurred image underwent the same process and operations as in part 2.1.
There are some differences between the images produced by this method and the results from the previous part. The partial derivatives with respect to x and y are smoother in this case, matching the output of convolving the result from the previous part directly with the Gaussian (because convolution is associative). The edges in the binarized gradient magnitude image (the edge image) are also thicker and fuller. Last but not least, given the same threshold of 0.1, some of the edges (for example, the ones in the camera) are absent in the blurred version.
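A minimal sketch of the blur-then-differentiate step, assuming the 2D Gaussian is built as the outer product of cv2's 1D kernel (the synthetic step image is illustrative):

```python
import cv2
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(ksize, sigma):
    """2D Gaussian kernel as the outer product of cv2's 1D kernel."""
    g1d = cv2.getGaussianKernel(ksize, sigma)
    return g1d @ g1d.T

D_x = np.array([[1, -1]])
D_y = np.array([[1, -1]]).T

kernel_size = 10
g2d = gaussian_2d(kernel_size, kernel_size / 6)

# Blur first, then take finite differences of the smoothed image
img = np.hstack([np.zeros((20, 20)), np.ones((20, 20))])
blurred = convolve2d(img, g2d, mode="same")
gx = convolve2d(blurred, D_x, mode="same")
gy = convolve2d(blurred, D_y, mode="same")
mag = np.sqrt(gx ** 2 + gy ** 2)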
Partial derivative with respect to x
Partial derivative with respect to y
Gradient magnitude image
Binarized gradient magnitude image with threshold = 0.05
Binarized gradient magnitude image with threshold = 0.1
I first blurred the finite difference operators D_x and D_y using the same Gaussian kernel as in part 2.2.1, producing the partial derivatives of the Gaussian kernel with respect to x (named gdx) and y (named gdy) respectively. Then, I employed the same method as in part 2.1, using gdx and gdy to produce the following images. Inspecting the results closely, I can verify that the images obtained in this part are similar, if not identical, to those from the method in 2.2.1, aside from minor differences in some edges due to noise.
Partial derivative of image with respect to x
Partial derivative of image with respect to y
Gradient magnitude image
Binarized gradient magnitude image with threshold = 0.05
Binarized gradient magnitude image with threshold = 0.1
I first blurred the image with a Gaussian kernel, which acts as a low-pass filter removing the higher frequencies. I then subtracted the blurred image from the original image to obtain the high frequencies of the image. Finally, I added the high frequencies back to the original image to acquire an image with sharpened, or emphasized, edges. Concretely, I computed sharpened_img = img + alpha * (img - blurred_img), where blurred_img = convolve2d(img, gaussian_kernel).
Notice that
\[f + \alpha( f-f*g) = f * ((1+\alpha)e - \alpha g), \;\;\;\;\;\; \text{where \(e\) denotes the unit impulse, or an identity kernel.}\]
I created the identity kernel with numpy and used it to construct the modified kernel sharp_kernel = (1 + alpha) * unit_impulse - alpha * gaussian_kernel. Then, I convolved the original image with this modified kernel to obtain the sharpened image.
Original Image of Taj Mahal
Blurred Image of Taj Mahal
Sharpened Image of Taj Mahal (\(\alpha = 1\))
Sharpened Image of Taj Mahal (\(\alpha = 2\))
Original Image of Sydney Opera House
Blurred Image of Sydney Opera House
Sharpened Image of Sydney Opera House (\(\alpha = 1\))
Sharpened Image of Sydney Opera House (\(\alpha = 2\))
I first blurred the image with a Gaussian kernel of kernel_size = 20 and sigma = kernel_size / 6. I then resharpened the blurred image using the same Gaussian kernel and alpha = 2.
Original Image of Notre-Dame
Blurred Image of Notre-Dame
Resharpened Image of Notre-Dame
The resharpened image contains many of the high frequencies of the original image, as evidenced by the well-defined edges of the cathedral. That is, sharpening does well at recovering from the blurring to an extent. However, the effects of blurring still persist in the resharpened image, as sharpening was unable to recover some of the lost information/frequencies (for example, the trees at the bottom of the cathedral still look a little blurry, and some columns on the left of the image do not look sharp).
Given two images, I extracted the low frequencies from one image using a Gaussian filter with sigma = sigma1 and kernel_size = 6 * sigma1, and the high frequencies from the other image by subtracting a Gaussian filter (with sigma = sigma2 and kernel_size = 6 * sigma2) from the impulse filter. I then created the hybrid image by adding the low and high frequencies together. After trying different combinations of grayscale and color, I noticed that it works better to use color for both the high-frequency and low-frequency components.
Nutmeg
Derek
Nutmeg + Derek
For this case, I used sigma1 = 6.5 to extract the low frequencies from Derek and sigma2 = 15 to extract the high frequencies from Nutmeg.
Cristiano Ronaldo
Lionel Messi
Cristiano Messi
For this failed case, I used sigma1 = 1 to extract the low frequencies from Messi and sigma2 = 3 to extract the high frequencies from Ronaldo. This case is a failure to some extent, primarily because the resolutions of the two inputs are too different, making the resulting aligned and combined image seem less natural.
Confused Bean
Happy Bean
Ambiguous Bean
For this case, I used sigma1 = 1 to extract the low frequencies from Confused Bean and sigma2 = 2.5 to extract the high frequencies from Happy Bean.
Mark Wahlberg
Matt Damon
Mark Damon
For this case, I used sigma1 = 4 to extract the low frequencies from Mark and sigma2 = 6 to extract the high frequencies from Matt.
We can see from the Fourier transforms that the hybrid image is indeed the sum of the low frequencies of Mark and the high frequencies of Matt.
To create the Gaussian stack, I initialized level 0 of the stack with the original image. For each successive level, I blurred the previous level using a Gaussian kernel, ultimately resulting in a stack of images of the same size. Within each iteration of the loop that builds the Gaussian stack, I subtracted the newly blurred Gaussian level from the previous Gaussian level to obtain an entry for the Laplacian stack. Finally, I appended the last image from the Gaussian stack to the Laplacian stack, resulting in two stacks with the same number of levels.
Here are the Laplacian and Gaussian stacks for the apple image respectively:
Laplacian stack of apple
Gaussian stack of apple
Here are the Laplacian and Gaussian stacks for the orange image respectively:
Laplacian stack of orange
Gaussian stack of orange
I created the Laplacian stacks for each of the two input images. I also constructed a Gaussian stack for the mask input image to smooth out the transition between the two images. Then, I generated the stack for the blended image by computing blended_stack = (1 - mask_gaussian) * image_1_laplacian + mask_gaussian * image_2_laplacian at each level. Finally, I collapsed the stack to get the final blended result: np.sum(blended_stack, axis = 0).
For all of the more irregular masks (i.e. aside from the linear mask in Oraple), I used Meta AI's Segment Anything to generate them by feeding in an input image and extracting the desired binary mask as a jpg file.
Image of Berkeley
Image of snowing sky
Mask input image
Laplacian stack of Berkeley
Laplacian stack of the snowing sky
Gaussian stack of the mask input image
Stack of the blended image
Final blended image
All of the displayed results on the website, including the Gaussian/Laplacian stacks and multiresolution blending, are implemented in color, i.e. I applied the Gaussian filter to each channel separately and then stacked the channels back together for the resulting image.
The most important thing I learned from this project is image processing and manipulation through the perspective of frequency. Prior to this project, I had worked with a variety of image processing techniques such as filtering, compression, and segmentation, but all of those methods work directly with the raw pixel values of the image. In this project, I had the opportunity to extract the low and high frequencies of an image and manipulate them appropriately to produce effects that are often provided by photo editing applications, such as blending images or creating hybrid images.