In this project we continue to work with warping images. Now,
            rather than considering affine transformations of triangles, we are
            considering global homographies that model changing the viewing
            angle while keeping the camera in the same location.
            We first capture multiple images from the same location, with the
            camera pointing in different directions. These images are all
capturing the same light field, so in regions of overlap they differ only by a perspective shift, which we model as a homography. The task now is to align and join the images to create a single image with a wider field of view.
            
            
             Homographies and Image Rectification 
Before we move to image stitching, we first look at image rectification, which is the task of applying the homography that rectifies a selected quadrilateral, mapping it to an axis-aligned rectangle. We compute this homography as the least squares solution to a linear system that describes how points transform under the homography. The key thing here is that we are using projective coordinates, which allow us to describe how images change as the image sensor is rotated. These projective coordinates are very interesting and connected to some math that I would like to learn about. The equation we start with is
            
[w x_i', w y_i', w]^T = H [x_i, y_i, 1]^T           (*)
            
            This is an example of computing with equivalence classes, as there
            is an entire 1d subspace of R^3 that we make correspond to a single
            point in R^2. From (*) we write:
            
x_i' = (h_11 x_i + h_12 y_i + h_13) / (h_31 x_i + h_32 y_i + h_33)

and

y_i' = (h_21 x_i + h_22 y_i + h_23) / (h_31 x_i + h_32 y_i + h_33)
            
Then, clearing denominators and organizing this into a system of equations, we have:

[-x_i, -y_i, -1,    0,    0,   0, x_i' x_i, x_i' y_i, x_i'] vec(H) = 0

[   0,    0,  0, -x_i, -y_i,  -1, y_i' x_i, y_i' y_i, y_i'] vec(H) = 0

where vec(H) = [h_11, h_12, ..., h_33]^T stacks the rows of H into a column vector.
            
            
            Then we stack these equations to get a system that we solve by least squares.
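As a minimal sketch of this step (assuming NumPy; compute_homography is a hypothetical helper name, not the project's actual code), we can build the stacked matrix from N >= 4 correspondences and take the least squares solution under ||vec(H)|| = 1, which is the right singular vector associated with the smallest singular value:

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate H with [w x', w y', w]^T ~ H [x, y, 1]^T.

    src, dst: (N, 2) arrays of corresponding points, N >= 4.
    Solves the stacked system A vec(H) = 0 in the least squares
    sense under the constraint ||vec(H)|| = 1.
    """
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, xp * x, xp * y, xp])
        rows.append([0, 0, 0, -x, -y, -1, yp * x, yp * y, yp])
    A = np.asarray(rows, dtype=float)
    # The constrained least squares minimizer is the right singular
    # vector for the smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so that h_33 = 1
```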
            
In the case of image rectification, we can look at the case where we have four points (x_i, y_i), and compute the homography that takes these points to the vertices of the unit square.
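For instance (with made-up corner coordinates, reusing the compute_homography sketch above):

```python
import numpy as np

# Illustrative corners of a rectangle seen in perspective, listed in
# the same order as the unit square vertices they should map to.
quad = np.array([[102.0, 80.0], [410.0, 95.0],
                 [398.0, 300.0], [95.0, 285.0]])
unit_square = np.array([[0.0, 0.0], [1.0, 0.0],
                        [1.0, 1.0], [0.0, 1.0]])
H = compute_homography(quad, unit_square)  # from the sketch above
```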
            
Below, we look at an example of rectifying an off-center rectangle:
            
            
            
            
            
Now, we can look at taking three images and aligning them via homographies so that they agree on the location of the white rectangle at the end of the aisle in the image.
            
            
            
            After transforming images to agree on overlapping regions, we can
            begin stitching the images together, using the image blending
            technique we developed in project 2.
            
This story works for a three panel panorama, where we can clearly align the two side images to the center image. But when there are an arbitrary number of images, this method becomes more challenging. Rather than aligning all images to a center image and using a notion of global coordinates, we prefer to perform the stitching iteratively, adding one new image to the stitch at a time.
At each iteration, however, we consider the current stitched image as a function from the unit square to pixel space. Thus each iteration introduces a new coordinate system, which complicates any global parameterization.
When using this iterative approach, we do not need to track where the source images land in the final stitch. Rather, we warp each image to the in-progress stitched image, and expand the bounds of the sampling region to accommodate the growing stitch. The new bounds are computed by looking at where the current homography takes the corners of the unit square. By enlarging the interpolation domain to include the entire image of the unit square under the homography, we ensure that the resulting stitch includes the entire new image.
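A sketch of this bounds computation (assuming NumPy; stitch_bounds is a hypothetical helper name):

```python
import numpy as np

def stitch_bounds(H, old_bounds):
    """Grow the sampling region to cover the image of the unit
    square under H. old_bounds: (xmin, ymin, xmax, ymax)."""
    corners = np.array([[0, 0, 1], [1, 0, 1],
                        [1, 1, 1], [0, 1, 1]], dtype=float)
    mapped = corners @ H.T                   # push corners through H
    mapped = mapped[:, :2] / mapped[:, 2:3]  # back to inhomogeneous coords
    xmin, ymin = mapped.min(axis=0)
    xmax, ymax = mapped.max(axis=0)
    return (min(xmin, old_bounds[0]), min(ymin, old_bounds[1]),
            max(xmax, old_bounds[2]), max(ymax, old_bounds[3]))
```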
This can lead to problems when the computed homography has a high condition number, causing the image of the unit square to become elongated. As the bounds grow to accommodate this, the majority of the content of the stitched image gets relegated to a small region of the final image, as shown below.
            
              
             
            
            
To mitigate this, we prefer to compute homographies between images with large overlaps and small perspective changes, ensuring that the homographies are sufficiently gentle on the parts of the image that lack point correspondence information.
            
Now we see the results of a couple of successfully stitched images. These stitches were made with manually selected correspondences.
            
            
            
            
I introduced various hyperparameters in the stitching code: epsilon tolerances for the blending masks, masks for updating pixels, and all the hyperparameters associated with the blending itself (such as Gaussian kernel size and standard deviation). As a result, we can achieve different blending effects depending on the hyperparameters we use, each producing different artifacts in the final image.
            
              
             
            
            
Now we tackle the task of automatically finding correspondences. Previously we manually selected the points, but this was not a robust method, as it is slow and sensitive to small user errors.
We first compute the Harris response of our image by taking horizontal and vertical derivatives, combining them to form the second moment matrix, and then computing the response as the determinant of the second moment matrix divided by its trace.
In the final implementation I used the provided code that calls the skimage Harris response function, but for illustration purposes I show the construction process below.
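That construction is roughly the following sketch (assuming NumPy and SciPy; the smoothing scale is an illustrative choice, not the provided code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=1.0):
    """Harris response as det / trace of the second moment matrix."""
    Iy, Ix = np.gradient(im.astype(float))  # vertical, horizontal derivatives
    # Entries of the second moment matrix, averaged over a Gaussian window.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det / (trace + 1e-12)            # epsilon avoids division by zero
```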
            
              
             
            
            
Once we have the Harris response, we take the Harris corners to be the points that are local maxima in their (3, 3) neighborhoods. This gives a set of points that is too dense to work with.
Next, we use non-maximal suppression to select a subset of well-spaced points with strong Harris response. We do so by first sorting the points according to their response, and then iteratively adding points from the list to a final subset. When we encounter a point that is within a suppression radius of any point already in the final subset, we skip it. This ensures that we take the most interesting points that are sufficiently spread apart.
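A sketch of this greedy suppression (assuming NumPy; the radius is a tunable hyperparameter):

```python
import numpy as np

def non_maximal_suppression(points, responses, radius):
    """Keep the strongest points that are at least `radius` apart.

    points: (N, 2) array, responses: (N,) array of Harris responses.
    """
    order = np.argsort(-responses)  # strongest response first
    kept = []
    for i in order:
        if all(np.linalg.norm(points[i] - points[j]) >= radius
               for j in kept):
            kept.append(i)
    return points[kept]
```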
            
            
After selecting a subset of points, we treat patches around each point as a descriptor for it. We take (40, 40) patches, then blur and downsample to (8, 8). We perform bias/gain normalization so that the descriptors are invariant to affine lighting transformations.
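A sketch of the descriptor extraction (assuming SciPy for the blur, and that points lie at least 20 pixels from the image border):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(im, x, y):
    """(40, 40) patch around (x, y), blurred and subsampled to (8, 8),
    then bias/gain normalized (zero mean, unit standard deviation)."""
    patch = im[y - 20:y + 20, x - 20:x + 20].astype(float)
    patch = gaussian_filter(patch, sigma=2.0)  # anti-alias before subsampling
    desc = patch[::5, ::5]                     # 40 / 5 = 8 samples per axis
    desc = (desc - desc.mean()) / (desc.std() + 1e-12)
    return desc.ravel()
```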
            
            
Then we view each patch as a vector, and compute pairwise inner products between patches. Since the descriptors are bias/gain normalized, this is equivalent (up to a monotone map) to a distance between points, which we now use for defining correspondences.
For each point, we look at the ratio of (the distance to its nearest neighbor in descriptor space) to (the distance to its second nearest neighbor). If this ratio is low, it means that the pair of points under consideration is significantly more similar than the next closest match, which means that we have likely found an accurate correspondence between the points.
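A sketch of this ratio test (assuming NumPy; I use squared Euclidean distance here, which for bias/gain-normalized descriptors induces the same ordering as the inner product):

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.6):
    """Ratio-test matching. desc1: (N, D), desc2: (M, D) with M >= 2.

    Returns (i, j) index pairs whose nearest-neighbor distance is
    clearly smaller than the second-nearest.
    """
    matches = []
    for i, d in enumerate(desc1):
        dists = np.sum((desc2 - d) ** 2, axis=1)
        j1, j2 = np.argsort(dists)[:2]     # nearest and second nearest
        if dists[j1] / dists[j2] < ratio:  # unambiguous match
            matches.append((i, j1))
    return matches
```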
            
              
             
            
            
Now we have a decent handle on which points correspond to each other; however, there will still be points that are misidentified as corresponding.
To account for this, we must adapt our approach for computing homographies. The presence of bad outliers in the set of correspondences means that a least squares estimate will be pulled away from the correct homography, but we would ideally like to find the "correct" homography.
To do so, we employ the RANSAC method. This is a powerful method that exemplifies the "robustness through randomness" paradigm. The idea is to repeatedly select subsets of 4 corresponding points, compute the homography that maps one side of the correspondence to the other, and then evaluate how well this homography explains the transformations undergone by the rest of the points. Specifically, we apply the proposed homography to all source points and count how many get mapped epsilon-close to their corresponding points; we keep the candidate homography with the most such inliers. This turns out to be an extremely effective approach, while also being beautifully simple!
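A sketch of this loop, reusing the compute_homography helper sketched earlier (the iteration count and epsilon are illustrative hyperparameters):

```python
import numpy as np

def ransac_homography(src, dst, n_iters=1000, eps=2.0):
    """Fit H robustly: repeatedly fit to a random 4-point sample and
    keep the candidate with the most epsilon-close inliers.

    src, dst: (N, 2) arrays of putative correspondences.
    """
    rng = np.random.default_rng(0)
    src_h = np.column_stack([src, np.ones(len(src))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=4, replace=False)
        H = compute_homography(src[idx], dst[idx])
        mapped = src_h @ H.T
        mapped = mapped[:, :2] / mapped[:, 2:3]
        inliers = np.linalg.norm(mapped - dst, axis=1) < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best candidate for the final estimate.
    return compute_homography(src[best_inliers], dst[best_inliers])
```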
            
Now that we have automatic correspondences implemented, we can stitch together some larger panoramas.