
[AUTO]STITCHING PHOTO MOSAICS

Quang Nguyen, SID: 3036566521

Project Overview

This project aims to build intuition behind creating a large composite image by stitching together photographs. More specifically, in the first part of the project, I shot pairs of overlapping photographs, manually defined correspondence points between them, recovered the homographies relating them, warped the images, rectified rectangular surfaces, and blended the results into mosaics.

In the second part, I automated the correspondence-determination process from Part A so that I can stitch images into panoramas automatically. That is, I detected interest points with the Harris corner detector, filtered them with Adaptive Non-Maximal Suppression, extracted and matched feature descriptors using Lowe's ratio test, and robustly estimated the homography with RANSAC.

PART A: IMAGE WARPING AND MOSAICING

1. Shoot The Pictures

Here are the sets of pictures that I took to blend into mosaics:

my room 1
my room 2

Photos of my room

gateway construction 1
gateway construction 2

Photos of the construction of the new Gateway building

6th floor soda 1
6th floor soda 2

Photos of 6th floor of Soda Hall

common room 1
common room 2
common room 3

Photos of the common room of my apartment

anchor house 1
anchor house 2

Photos of the Helen Diller Anchor House before sunrise

bampfa 1
bampfa 2

Photos of the Berkeley Art Museum and Pacific Film Archive (BAMPFA) before sunrise

2. Recover Homographies

From lecture, we define a projective transformation using a homography matrix acting on homogeneous coordinates, so the output is only defined up to a scale factor \(w\): \[H\mathbf{p} = w\mathbf{p'}\] \[\begin{bmatrix} a & b & c\\ d & e & f\\ g & h & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix}\]

Expanding this matrix multiplication out and eliminating the scale factor \(w = gx + hy + 1\), we get the following system of equations: \[ \begin{cases} ax + by + c = wx' \\ dx + ey + f = wy' \\ gx + hy + 1 = w \end{cases} \Rightarrow \begin{cases} ax + by + c - gxx' - hyx' = x' \\ dx + ey + f - gxy' - hyy' = y' \end{cases} \]

Rewriting the system of equations as matrix multiplication, we obtain \[\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -xx' & -yx'\\ 0 & 0 & 0 & x & y & 1 & -xy' & -yy' \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} = \begin{bmatrix} x' \\ y' \end{bmatrix} \]

Since each point gives us two equations and \(H\) has eight unknowns, we need at least 4 correspondence points. With exactly 4 points, the matrix equation becomes \[\begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1x_1' & -y_1x_1' \\ 0 & 0 & 0 & x_1 & y_1 & 1 & -x_1y_1' & -y_1y_1' \\ x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2x_2' & -y_2x_2' \\ 0 & 0 & 0 & x_2 & y_2 & 1 & -x_2y_2' & -y_2y_2' \\ x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3x_3' & -y_3x_3' \\ 0 & 0 & 0 & x_3 & y_3 & 1 & -x_3y_3' & -y_3y_3' \\ x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4x_4' & -y_4x_4' \\ 0 & 0 & 0 & x_4 & y_4 & 1 & -x_4y_4' & -y_4y_4' \\ \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} = \begin{bmatrix} x_1' \\ y_1' \\ x_2' \\ y_2' \\ x_3' \\ y_3' \\ x_4' \\ y_4' \end{bmatrix} \]

Since I used more than 4 correspondence points (on average, about 14 per pair of images), the system of equations is overdetermined. As a result, I used np.linalg.lstsq to find the least-squares solution, then reorganized the entries of the resulting vector (appending a 1) to obtain an approximate \(H\).
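The procedure above can be sketched as follows (compute_homography is an illustrative helper name, not necessarily the one used in my code):

```python
import numpy as np

def compute_homography(pts, pts_prime):
    """Recover H from n >= 4 correspondences via least squares.

    pts, pts_prime: (n, 2) arrays of (x, y) coordinates, where H maps
    pts to pts_prime up to scale. Builds two rows of the design matrix
    per correspondence, exactly as in the 8-column matrix above.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)  # reattach the fixed entry H[2,2] = 1
```

With exactly 4 non-collinear points the system is square and the least-squares solution is exact; with more points it minimizes the total squared algebraic error.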

3. Warp the Images

Given an image to be warped im and a homography matrix \(H\), I warp the image with inverse warping: I first map the corners of im through \(H\) to determine the bounding box of the output canvas, then, for every pixel in that canvas, sample im at the location given by \(H^{-1}\) applied to the pixel's coordinates.
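A minimal sketch of the inverse-warping step, assuming nearest-neighbor sampling for brevity (bilinear interpolation, e.g. via scipy.ndimage.map_coordinates, gives smoother results; warp_image and out_shape are illustrative names):

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Inverse-warp im onto an out_shape canvas using homography H.

    For each output pixel p', sample im at H^{-1} p' (nearest neighbor).
    Pixels that map outside im are left black.
    """
    H_inv = np.linalg.inv(H)
    ys, xs = np.indices(out_shape[:2])
    # Homogeneous coordinates of every output pixel, as columns.
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = H_inv @ coords
    src_x = np.round(src[0] / src[2]).astype(int)  # divide out the scale w
    src_y = np.round(src[1] / src[2]).astype(int)
    valid = (0 <= src_x) & (src_x < im.shape[1]) & \
            (0 <= src_y) & (src_y < im.shape[0])
    out = np.zeros(out_shape, dtype=im.dtype)
    out.reshape(-1, *im.shape[2:])[valid] = im[src_y[valid], src_x[valid]]
    return out
```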

4. Image Rectification

To rectify an image, I selected the four corners of a known rectangular object in the image, defined the corresponding target points as the corners of an axis-aligned rectangle, computed the homography between the two sets of points, and warped the image with it.

Here are some results of me applying rectification on different rectangular objects/surfaces:

rectified image of a monitor

Rectification of my monitor

rectified image of a laptop

Rectification of my laptop

rectified image of a book

Rectification of a book

5. Blend the images into a mosaic

To blend two images together, I warped the first image into the reference frame of the second, placed both on a common canvas, and combined the overlapping region with a smoothly varying weighted average so that the seam fades out.

To blend three images together where the center image is chosen to be the reference image, I warped the left and right images into the center image's frame and applied the same blending procedure to each overlap.
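One common way to realize the smoothly varying weights is distance-transform feathering; the sketch below is my assumed implementation detail (feather_blend is a hypothetical name), not necessarily the exact scheme used:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_blend(im1, im2):
    """Blend two aligned color images on the same canvas.

    Each image's weight grows with its pixel's distance from that image's
    own boundary, so in the overlap the contribution fades smoothly from
    one exposure to the other instead of leaving a hard seam.
    """
    m1 = (im1.sum(axis=-1) > 0).astype(float)  # coverage masks
    m2 = (im2.sum(axis=-1) > 0).astype(float)
    w1 = distance_transform_edt(m1)
    w2 = distance_transform_edt(m2)
    total = w1 + w2
    total[total == 0] = 1.0  # avoid division by zero outside both images
    alpha = (w1 / total)[..., None]
    return alpha * im1 + (1 - alpha) * im2
```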

Notice that the color is a little off in some of the blended mosaics because the lighting conditions of the input images weren't consistent (and potentially because the iPhone camera automatically adjusts the focus and exposure of each photo).

Here are the results of blending the images into a mosaic where for each group of images, the first row contains the input images and the second row displays the resulting warped and blended image:

my room 1
my room 2
warped and blended image of my room

My Room

the gateway 1
the gateway 2
warped and blended image of the gateway

The Construction of the Gateway

6th floor soda 1
6th floor soda 2
warped and blended image of 6th floor of soda

6th Floor of Soda Hall

common room 1
common room 2
common room 3
warped and blended image of common room

The Common Room of my Apartment

anchor house 1
anchor house 2
warped and blended image of the anchor house

Helen Diller Anchor House before Sunrise

bampfa 1
bampfa 2
warped and blended image of bampfa

Berkeley Art Museum and Pacific Film Archive before Sunrise

PART B: FEATURE MATCHING FOR AUTOSTITCHING

1. Harris Corner Detection

To identify the corners/interest points of an image, I used the function provided by the course staff, which outputs a list of coordinates whose corner strength is higher than that of their immediate neighbors. Notice that without further filtering, there are far too many candidate Harris corners to work with efficiently. Here are the Harris corners on two example images:

harris corners of images of soda hall

Images of Soda Hall with labeled Harris Corners

2. Adaptive Non-Maximal Suppression

I then implemented Adaptive Non-Maximal Suppression (ANMS) on the list of Harris corners, which caps the number of interest points and ensures that the surviving points are spatially well distributed over the input image. Here are the remaining keypoints after ANMS:
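The idea behind ANMS can be sketched as follows: each corner gets a suppression radius, the distance to the nearest sufficiently stronger corner, and only the corners with the largest radii are kept (anms and its parameter names are illustrative; the O(n²) loop is fine for a few thousand candidates):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (n, 2) corner coordinates; strengths: (n,) corner responses.
    For corner i, r_i = distance to the nearest corner j whose damped
    strength c_robust * s_j still exceeds s_i. Keeping the n_keep largest
    radii yields points that are both strong and spatially spread out.
    """
    n = len(coords)
    radii = np.full(n, np.inf)  # the globally strongest corner keeps r = inf
    for i in range(n):
        stronger = strengths * c_robust > strengths[i]
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```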

harris corners of images of soda hall with anms

Images of Soda Hall with labeled Harris Corners after ANMS

3. Feature Descriptor Extraction

To extract a feature at each Harris corner surviving ANMS, I took an axis-aligned 40x40 patch and downsampled it (with anti_aliasing=True) to an 8x8 feature patch, avoiding aliasing. I then bias/gain-normalized each descriptor so that its mean is 0 and its variance is 1. Here are some examples of the feature descriptors of an image of Soda Hall:
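A sketch of this extraction step, assuming a grayscale image and keypoints far enough from the border to fit a full patch (extract_descriptors is an illustrative name; resize with anti_aliasing=True applies the Gaussian pre-blur):

```python
import numpy as np
from skimage.transform import resize

def extract_descriptors(im, coords, patch=40, out=8):
    """Extract 40x40 axis-aligned patches around each keypoint, downsample
    each to 8x8 with anti-aliasing, and bias/gain-normalize it.

    im: grayscale image; coords: (n, 2) array of (row, col) keypoints.
    Returns an (n, out, out) array of normalized descriptors.
    """
    half = patch // 2
    descs = []
    for r, c in coords:
        window = im[r - half:r + half, c - half:c + half]
        small = resize(window, (out, out), anti_aliasing=True)
        # Normalize to zero mean, unit variance (invariance to bias/gain).
        descs.append((small - small.mean()) / small.std())
    return np.array(descs)
```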

some feature descriptors of an image of soda hall

Examples of feature descriptors of an image of Soda Hall

4. Feature Matching

To match the features, I first flattened each 8x8 patch into a 64-dimensional vector. For each vector in the first descriptor set, I computed the Euclidean distance to every vector in the second set, sorted the distances in ascending order to obtain the 1-NN and 2-NN distances, and applied Lowe's ratio test: if the ratio of the 1-NN distance to the 2-NN distance is below a threshold \(t\), the 1-NN is accepted as a match. Here are the matched features between two images of Soda Hall:
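The matching step can be sketched as follows (match_features is an illustrative name; descriptors are assumed already flattened to rows):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.5):
    """Match descriptors with Lowe's ratio test.

    desc1: (n1, d), desc2: (n2, d) with n2 >= 2. Candidate i is matched to
    its nearest neighbor j in desc2 only when the 1-NN distance is clearly
    smaller than the 2-NN distance, i.e. dist(1-NN) / dist(2-NN) < ratio.
    Returns a list of (i, j) index pairs.
    """
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        nn1, nn2 = dists[order[0]], dists[order[1]]
        if nn1 / nn2 < ratio:
            matches.append((i, order[0]))
    return matches
```

The ratio test rejects ambiguous features: if the second-best match is almost as close as the best one, the best match is probably not distinctive enough to trust.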

feature matches with threshold 0.5

Feature matches with threshold \(t=0.5\)

5. RANSAC

Even after using Lowe's ratio test to reduce the number of outliers, some inaccurate correspondence pairs remain, mainly because least-squares fitting on Euclidean distances is vulnerable to outliers. As a result, I implemented RANSAC: repeatedly sample 4 random correspondences, compute the exact homography they define, count how many of the remaining correspondences it maps to within a small pixel threshold (the inliers), and keep the largest inlier set, from which the final homography is recomputed with least squares.
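The loop above can be sketched as follows (ransac_homography, fit_homography, and the parameter defaults are illustrative, not the exact values I used):

```python
import numpy as np

def fit_homography(p, q):
    # Least-squares homography mapping p -> q (same setup as in Part A).
    A, b = [], []
    for (x, y), (xp, yp) in zip(p, q):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=1000, eps=2.0, seed=0):
    """4-point RANSAC: fit H to random minimal samples and keep the
    hypothesis with the most inliers (reprojection error below eps)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), size=4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        # Project all of pts1 through H and compare against pts2.
        proj = (H @ np.column_stack([pts1, np.ones(len(pts1))]).T).T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - pts2, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    return best  # boolean inlier mask; refit H on these with least squares
```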

Here is the result of running RANSAC on the set of matched features from Lowe's technique:

feature matches with threshold 0.5 after ransac

Feature matches with threshold \(t=0.5\) after RANSAC

6. Results and Comparisons

Here are the automatically and manually stitched results side by side for comparison:

auto warped and blended image of my room

Automatically stitched image of my room

manual warped and blended image of my room

Manually stitched image of my room

auto warped and blended image of gateway

Automatically stitched image of the construction of the Gateway

manual warped and blended image of gateway

Manually stitched image of the construction of the Gateway

auto warped and blended image of soda hall

Automatically stitched image of the 6th floor of Soda Hall

manual warped and blended image of soda hall

Manually stitched image of the 6th floor of Soda Hall

auto warped and blended image of common room

Automatically stitched image of my apartment's common room

manual warped and blended image of common room

Manually stitched image of my apartment's common room

auto warped and blended image of the anchor house

Automatically stitched image of Helen Diller Anchor House before sunrise

manual warped and blended image of the anchor house

Manually stitched image of Helen Diller Anchor House before sunrise

auto warped and blended image of bampfa

Automatically stitched image of BAMPFA before sunrise

manual warped and blended image of bampfa

Manually stitched image of BAMPFA before sunrise

We notice that the automatically and manually stitched mosaics yield very similar results, illustrating the robustness of the algorithms involved, such as Harris corner detection, ANMS, and RANSAC.

7. What have I learned?

The coolest things I have learned from this project are rectification and automatic stitching through feature descriptors extraction and matching. After project 3 and course lectures, I was already intrigued by the extended application of linear algebra in warping one shape into another. While working on this project, I was particularly surprised and amazed by how much we could stretch the image with the homography matrix to accurately rectify a rectangular surface without encountering aliasing problems even when the surface is extremely skewed.

Similarly, from project 3 through project 4A, the typical procedure for determining the transformation matrix between two images involved manually defining correspondence points through a mouse-clicking interface, a decent amount of tedious work, and then recovering the transformation matrix using the numpy library. This procedure is fully automated in project 4B, where I implemented interest point detection with Harris corners and ANMS, feature descriptor extraction with Gaussian blurring, and robust feature matching with Lowe's ratio test and RANSAC. Thanks to this much-needed automation, I was able to explore mosaicing images with fewer corners (I didn't include results for these images in this writeup).