https://www.sicara.ai/blog/2019-07-16-image-registration-deep-learning
How the field has evolved from OpenCV to Neural Networks.
Written by Emna Kamoun & Jeremy Joslove
Image Registration is a fundamental step in Computer Vision. This article presents OpenCV feature-based methods before diving into Deep Learning.
Image registration is the process of transforming different images of one scene into the same coordinate system. These images can be taken at different times (multi-temporal registration), by different sensors (multi-modal registration), and/or from different viewpoints. The spatial relationships between these images can be rigid (translations and rotations), affine (adding shears, for example), homographies, or complex large-deformation models.
Source: CorrMap
Image registration has a wide variety of applications: it is essential as soon as the task at hand requires comparing multiple images of the same scene. It is very common in the field of medical imagery, as well as for satellite image analysis and optical flow.
In this article, we will focus on a few different ways to perform image registration between a reference image and a sensed image. We choose not to go into iterative / intensity-based methods because they are less commonly used.
Since the early 2000s, image registration has mostly used traditional feature-based approaches. These approaches are based on three steps: Keypoint Detection and Feature Description, Feature Matching, and Image Warping. In brief, we select points of interest in both images, associate each point of interest in the reference image with its equivalent in the sensed image, and transform the sensed image so that both images are aligned.
Feature-based methods for an image couple associated by a homography transformation | Source: Unsupervised Deep Homography: A Fast and Robust Homography Estimation Model
A keypoint is a point of interest. It defines what is important and distinctive in an image (corners, edges, etc). Each keypoint is represented by a descriptor: a feature vector containing the keypoint's essential characteristics. A descriptor should be robust against image transformations (localization, scale, brightness, etc). Many algorithms perform both keypoint detection and feature description.
These algorithms are all available and easily usable in OpenCV. In the example below, we used the OpenCV implementation of AKAZE. The code remains roughly the same for the other algorithms: only the name of the algorithm needs to be modified.
Image Keypoints
For more details on feature detection and description, you can check out this OpenCV tutorial.
Once keypoints are identified in both images that form a couple, we need to associate, or “match”, keypoints from both images that correspond in reality to the same point. One possible method is BFMatcher.knnMatch(). This matcher measures the distance between each pair of keypoint descriptors and returns for each keypoint its k best matches with the minimal distance.
We then apply a ratio filter to keep only the reliable matches. To be considered reliable, the best match of a keypoint should be significantly closer than its second-best candidate, which is most likely an incorrect match.
Check out this documentation for other feature matching methods implemented in OpenCV.
After matching at least four pairs of keypoints, we can transform one image relative to the other one. This is called image warping. Any two images of the same planar surface in space are related by a homography. Homographies are geometric transformations that have 8 free parameters and are represented by a 3x3 matrix. They represent any distortion made to an image as a whole (as opposed to local deformations). Therefore, to obtain the transformed sensed image, we compute the homography matrix and apply it to the sensed image.
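To see why a 3x3 matrix only has 8 free parameters, note that a homography acts on homogeneous coordinates, so multiplying the matrix by any nonzero scalar leaves the mapping unchanged. Each pair of matched points then provides two equations, which is why four pairs are the minimum. The sketch below illustrates this with NumPy; `apply_homography` and the matrix values are arbitrary illustrations, not taken from the article.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through a 3x3 homography."""
    points = np.asarray(points, dtype=np.float64)
    # Lift (x, y) to homogeneous coordinates (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Arbitrary illustrative homography.
H = np.array([[1.0, 0.1, 5.0],
              [0.0, 1.2, -3.0],
              [0.001, 0.0, 1.0]])
points = np.array([[0.0, 0.0], [100.0, 50.0]])

mapped = apply_homography(H, points)
mapped_scaled = apply_homography(2.0 * H, points)  # same mapping as H
```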
To make the warping robust, we use the RANSAC algorithm to detect outliers and remove them before determining the final homography. It is built directly into OpenCV’s findHomography method. There are alternatives to RANSAC, such as LMEDS, the Least-Median robust method.
If you are interested in more details about these three steps, OpenCV has put together a series of useful tutorials.
Most research nowadays in image registration concerns the use of deep learning. In the past few years, deep learning has allowed for state-of-the-art performance in Computer Vision tasks such as image classification, object detection, and segmentation. There is no reason why this couldn’t be the case for Image Registration.
The first way deep learning was used for image registration was for feature extraction. Convolutional neural networks’ successive layers manage to capture increasingly complex image characteristics and learn task-specific features. Since 2014, researchers have applied these networks to the feature extraction step rather than SIFT or similar algorithms.
The code for this last paper can be found here. While we were able to test this registration method on our own images within 15 minutes, the algorithm is approximately 70 times slower than the SIFT-like methods implemented earlier in this article.
Instead of limiting the use of deep learning to feature extraction, researchers tried to use a neural network to directly learn the geometric transformation to align two images.
Supervised Learning
In 2016, DeTone et al. published Deep Image Homography Estimation, which describes Regression HomographyNet, a VGG-style model that learns the homography relating two images. This algorithm has the advantage of learning the homography and the CNN model parameters simultaneously in an end-to-end fashion: no need for the previous two-stage process!
The network produces eight real-valued numbers as an output. It is trained in a supervised fashion thanks to a Euclidean loss between the output and the ground-truth homography.
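As a rough illustration of such a loss, the sketch below computes half the squared Euclidean distance between an eight-value prediction and its ground truth. The 0.5 scaling factor, the function name, and the example values are assumptions on our part; we also assume the eight numbers are the (x, y) displacements of the four image corners, which is one common way to parameterize a homography.

```python
import numpy as np

def euclidean_loss(predicted, target):
    """Half the squared L2 distance between two 8-value vectors (assumed scaling)."""
    diff = np.asarray(predicted, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return 0.5 * float(np.sum(diff ** 2))

# Illustrative values: corner displacements in pixels, flattened as
# (dx1, dy1, dx2, dy2, dx3, dy3, dx4, dy4).
ground_truth = np.array([4.0, -2.0, 1.0, 3.0, -5.0, 0.0, 2.0, 2.0])
prediction = ground_truth + 1.0  # every value is off by one pixel

loss = euclidean_loss(prediction, ground_truth)
```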
Like any supervised approach, this homography estimation method requires labeled pairs of data. While it is easy to obtain the ground truth homographies for artificial image pairs, it is much more expensive to do so on real data.
With this in mind, Nguyen et al. presented an unsupervised approach to deep image homography estimation. They kept the same CNN but had to use a new loss function adapted to the unsupervised approach: they chose the photometric loss that does not require a ground-truth label. Instead, it computes the similarity between the reference image and the sensed transformed image.
Their approach introduces two new network structures: a Tensor Direct Linear Transform and a Spatial Transformation Layer. We will not go into the details of these components here. We can simply consider that they are used to obtain a transformed sensed image from the homography parameters output by the CNN model, and that this transformed image is then used to compute the photometric loss.
The authors claim that this unsupervised method obtains comparable or better accuracy and robustness to illumination variation than traditional feature-based methods, with faster inference speed. In addition, it has superior adaptability and performance compared to the supervised method.
Deep reinforcement learning is gaining traction as a registration method for medical applications. As opposed to a pre-defined optimization algorithm, in this approach, we use a trained agent to perform the registration.
A significant proportion of current research in image registration concerns the field of medical imagery. Often, the transformation between two medical images cannot simply be described by a homography matrix because of the local deformations of the subject (due to breathing, anatomical changes, etc.). More complex transformation models are necessary, such as diffeomorphisms that can be represented by displacement vector fields.
Researchers have tried to use neural networks to estimate these large deformation models that have many parameters.
We hope you enjoyed our article! Image registration is a vast field with numerous use cases. There is plenty of other fascinating research on this subject that we could not mention in this article; we tried to keep it to a few fundamental and accessible approaches. This survey on deep learning in Medical Image Registration could be a good place to look for more information.
If you want to learn more about OpenCV, check out our article Edge Detection in OpenCV 4.0, A 15 Minutes Tutorial.
Thanks to Raphaël Meudec, Bastien Ponchon, and Pierre-Henri Cumenge.
Image Registration: From SIFT to Deep Learning
Original: https://www.cnblogs.com/jiangkejie/p/15221173.html