UWIT: Underwater Image Toolbox for Optical Image Processing and
Mosaicking in Matlab
This page is adapted from a poster by Ryan Eustice, Oscar Pizarro,
Christopher Roman, and Hanumant Singh.
This work was supported in part by CenSSIS, the Center for Subsurface
Sensing and Imaging Systems, under the Engineering Research Centers
Program of the National Science Foundation.
For the complete pdf version, which includes equations, click
here (1.5mb).
Synopsis
This poster shows results from our development of an extended MATLAB
image processing toolbox that implements optical image processing
and mosaicking algorithms from the literature. We surveyed the
field and selected algorithms that showed promise for the
underwater environment, then extended them to deal explicitly with
the unique constraints of underwater imagery. The algorithms
implemented include:
1. Contrast limited adaptive histogram specification (CLAHS) to deal
   with the inherent nonuniform lighting in underwater imagery
2. Fourier based methods for scale, rotation, and translation recovery
   which provide robustness against dissimilar image regions
3. Local normalized correlation for image registration to handle
   the unstructured environment of the seafloor
4. Multiresolution pyramidal blending of images to form a composite
   seamless mosaic without blurring or loss of detail near image
   borders
In keeping with the CenSSIS theme of "Diverse Problems, Similar
Solutions," many of the algorithms are useful to the rest of the
CenSSIS community. See the normalized correlation section of the
poster for some recent applications of our algorithm to medical
imaging.
Contrast Limited Adaptive Histogram Specification
The propagation of light underwater suffers from rapid attenuation
and extreme scattering. These effects, in combination with the
limited camera-to-light separation available on most underwater
imaging platforms, place severe limitations on underwater imagery.
To deal with the lighting artifacts of nonuniform illumination and
low contrast, we use the classical techniques associated with
contrast limited adaptive histogram equalization (CLAHE)
(Zuiderveld, 1994). With this technique the image is divided into
sub-regions. For each sub-region, the gray-level mapping is
calculated from its histogram and a transfer function determined by
the desired histogram of the sub-region. Each pixel of the image is
then adjusted by interpolating between the mappings of the
neighboring sub-regions. Our extensive work with underwater imagery
suggests that a Rayleigh distribution is the best-suited desired
histogram.
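The per-region step can be illustrated with a minimal Python/NumPy
sketch. It is not the toolbox's MATLAB code: the function name, the
clip limit, and the Rayleigh scale `alpha` are assumptions chosen for
illustration, and the interpolation between neighboring sub-regions
is left out. It clips the histogram of one 8-bit sub-region, then maps
gray levels through the inverse Rayleigh cumulative distribution.

```python
import numpy as np

def rayleigh_tile_mapping(tile, clip_limit=4.0, alpha=0.4, n_bins=256):
    """Lookup table that pushes one sub-region toward a Rayleigh histogram.

    tile       : 2-D uint8 array (one sub-region of the image)
    clip_limit : maximum bin count as a multiple of the mean count (assumed)
    alpha      : Rayleigh scale on a [0, 1] intensity axis (assumed)
    """
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(float)
    # Contrast limiting: clip tall bins and spread the excess uniformly.
    # This is a single pass, so bins may end slightly above the limit.
    limit = clip_limit * hist.mean()
    excess = np.sum(np.maximum(hist - limit, 0.0))
    hist = np.minimum(hist, limit) + excess / n_bins
    cdf = np.cumsum(hist) / hist.sum()
    # Invert the Rayleigh CDF F(v) = 1 - exp(-v^2 / (2 alpha^2)).
    cdf = np.clip(cdf, 0.0, 1.0 - 1e-6)  # avoid log(0) at the top bin
    v = alpha * np.sqrt(-2.0 * np.log(1.0 - cdf))
    return np.round(255.0 * np.clip(v, 0.0, 1.0)).astype(np.uint8)

# Synthetic low-contrast sub-region with gray levels in 90..129.
rng = np.random.default_rng(0)
tile = rng.integers(90, 130, size=(64, 64), dtype=np.uint8)
lut = rayleigh_tile_mapping(tile)
out = lut[tile]  # apply the mapping to every pixel of the sub-region
```

The mapping is monotonic, so gray-level ordering is preserved while
the narrow input range is stretched. With this `alpha` the brightest
few percent of the distribution saturate, which is one reason the
scale would need tuning on real imagery.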
Fourier Based Image Translation, Scale, and Rotation Recovery
Many image processing problems involve the fundamental task of
registering a pair of images. Methods fall into three broad classes:
1) correlation methods, which use pixel values directly; 2) fast
Fourier transform methods, which use frequency domain information;
and 3) feature based methods, which use low-level features such as
edges and corners. This particular algorithm uses Fourier domain
methods for scale, rotation, and translation recovery, making use of
the phase shift property of Fourier transforms (Reddy, 1996).
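The phase shift property can be seen in a short Python/NumPy sketch
of phase correlation. It covers translation only and assumes a
circular, integer-pixel shift; the function name is illustrative and
not taken from the toolbox. In the approach of Reddy (1996), rotation
and scale are reduced to the same translation problem by resampling
the magnitude spectra on a log-polar grid, a step omitted here.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer (row, col) shift d with b == np.roll(a, d)."""
    Fa = np.fft.fft2(a)
    Fb = np.fft.fft2(b)
    # A spatial shift appears as a linear phase ramp in the cross-power
    # spectrum; dividing by the magnitude keeps only that phase.
    cross = Fb * np.conj(Fa)
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, (5, -9), axis=(0, 1))
estimate = phase_correlation_shift(image, shifted)
```

Because only the phase is kept, the correlation peak stays sharp even
when the two images differ in brightness or contrast.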
Local Normalized Correlation
Normalized correlation is a practical measure of similarity
(Brown, 1992). The normalized correlation of two signals is
invariant to local changes in mean and contrast. When two signals
are linearly related with positive gain, their normalized
correlation is 1. When the two signals are not linearly related but
do contain similar spatial variations, normalized correlation still
yields a value close to unity (Irani, 1996).
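Both properties are easy to check numerically. The Python/NumPy
sketch below uses the zero-mean form of normalized correlation on a
pair of patches; the function name and the test patches are
illustrative assumptions.

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equally sized patches."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    # A constant patch has no variation to correlate; report 0 for it.
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
patch = rng.random((16, 16))

# Linear change in contrast (gain 2.5) and mean (offset 40).
nc_linear = normalized_correlation(patch, 2.5 * patch + 40.0)

# Nonlinear but monotonic change: the spatial variation is similar.
nc_nonlinear = normalized_correlation(patch, patch ** 2)
```

Here `nc_linear` equals 1 up to rounding error, while `nc_nonlinear`
stays below 1 but close to it.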
The lack of rich features in underwater imagery precludes indirect
feature based methods, and experimental evidence suggests that
direct correlation based methods yield good results. We employ a
dense local normalized correlation to determine correspondence
between images. For a valid match, the local normalized correlation
surface is concave with a prominent peak at the correct
displacement. As a method of outlier rejection, we fit a quadratic
surface near the surface peak and analytically check it for
concavity (Mandelbaum, 1999).
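One way to carry out such a check is sketched below in Python/NumPy.
The 3x3 neighbourhood, the least-squares fit, and the tolerance `eps`
are assumptions made for illustration; the source does not state how
the toolbox fits the surface. The fitted quadratic is concave exactly
when its Hessian is negative definite, and in that case its maximum
also gives a sub-pixel displacement.

```python
import numpy as np

def quadratic_peak(nbhd, eps=1e-9):
    """Fit z = a x^2 + b y^2 + c xy + d x + e y + f to a 3x3 neighbourhood
    centred on a correlation peak (x = column offset, y = row offset).

    Returns (is_concave, (dy, dx)); the offset is None when rejected.
    """
    ys, xs = np.mgrid[-1:2, -1:2]
    x, y = xs.ravel().astype(float), ys.ravel().astype(float)
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones(9)])
    z = np.asarray(nbhd, dtype=float).ravel()
    a, b, c, d, e, _ = np.linalg.lstsq(A, z, rcond=None)[0]
    hessian = np.array([[2 * a, c], [c, 2 * b]])
    # Negative definite Hessian <=> strictly concave fitted surface.
    if not (2 * a < -eps and np.linalg.det(hessian) > eps):
        return False, None
    # Stationary point of the quadratic: hessian @ [dx, dy] = -[d, e].
    dx, dy = np.linalg.solve(hessian, [-d, -e])
    return True, (float(dy), float(dx))

# Synthetic concave surface whose true peak is at row 4.6, column 3.3.
rows, cols = np.mgrid[0:9, 0:9]
surface = 1.0 - 0.05 * (cols - 3.3) ** 2 - 0.1 * (rows - 4.6) ** 2
r, c = np.unravel_index(np.argmax(surface), surface.shape)
ok, (dy, dx) = quadratic_peak(surface[r - 1:r + 2, c - 1:c + 2])
peak_row, peak_col = r + dy, c + dx

# A saddle-shaped neighbourhood is rejected as an outlier.
sy, sx = np.mgrid[-1:2, -1:2]
saddle_ok, _ = quadratic_peak(sx ** 2 - sy ** 2)
```

A caller would skip peaks that fall on the border of the correlation
surface, where a full 3x3 neighbourhood is not available.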
Multiresolution Pyramidal Based Blending
Due
to the rapid attenuation of light underwater, the only way to
get a large scale view of the seafloor is to build up a mosaic
from smaller local images, such as in Figure 7. The mosaic technique
is used to construct an image with a far larger field of view
and level of resolution than could be obtained with a single photograph.
Once the mosaic is generated, a technical problem in image
representation is joining image borders so that the edge between
them is not visible. The two images to be joined may be considered
as two surfaces, where the image intensity I(x,y) is viewed as the
elevation above the (x,y) plane. The problem then is how to gently
distort the images near their common border so that the seam is
smooth.
We
implement a multiresolution pyramidal blending approach where
the two images are decomposed into different band-pass frequency
components, merged on those levels, and then reassembled into
a single seamless composite image (Burt, 1983). The idea is that
with this technique the transition zone between band-pass image
components can be appropriately chosen to match the scale of features
in that band-pass component.
First, a Gaussian pyramid is constructed for each image, where the
base level of the pyramid, $G_0$, is the original image. Each
successive level is a low-pass filtered copy of the previous level,
down-sampled by a factor of two:

$$G_l(i,j) = \sum_{m,n} w(m,n)\, G_{l-1}(2i+m,\, 2j+n)$$

for an appropriately chosen kernel $w(m,n)$. Next, the different
band-pass components are formed by generating the Laplacian pyramid.
Each level of the Laplacian pyramid is generated from the Gaussian
pyramid by expanding the next higher level to the resolution of the
current level and subtracting it from the current level:

$$L_l = G_l - G_{l+1,1}$$

where $G_{l+1,1}$ denotes $G_{l+1}$ expanded once. The top level has
no higher level to subtract and keeps the low-pass residual,
$L_{N-1} = G_{N-1}$. As a result, each level of the Laplacian pyramid
contains a separate band-pass component of the original image. The
two Laplacian pyramids are then merged at each level of the pyramid,
and the resulting new seamless image $S$ is constructed from the
merged pyramid levels via

$$S = \sum_{l=0}^{N-1} L_{l,l}$$

where $N$ is the number of pyramid levels and the notation $L_{l,l}$
implies expansion of the level $L_l$, $l$ times, to the resolution
of $G_0$.
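The construction, merge, and reassembly steps can be sketched in
Python/NumPy as follows. This is an illustration, not the toolbox
code: the 5-tap kernel, the reflected borders, the function names,
and the use of a Gaussian pyramid of the mask as per-level merge
weights (as in Burt, 1983) are all assumptions.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # assumed kernel w

def blur(img):
    """Separable low-pass filter with reflected borders."""
    h, w = img.shape
    p = np.pad(img, 2, mode="reflect")
    horiz = sum(k * p[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * horiz[i:i + h, :] for i, k in enumerate(KERNEL))

def expand(img, shape):
    """Up-sample to `shape` by zero insertion followed by filtering."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * blur(up)  # factor 4 restores the mean after zero insertion

def gaussian_pyramid(img, levels):
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(blur(pyr[-1])[::2, ::2])  # filter, then drop every other pixel
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    bands = [g[l] - expand(g[l + 1], g[l].shape) for l in range(levels - 1)]
    return bands + [g[-1]]  # top level keeps the low-pass residual

def collapse(lap):
    """Expand and add from the top level down to full resolution."""
    img = lap[-1]
    for band in reversed(lap[:-1]):
        img = band + expand(img, band.shape)
    return img

def blend(img_a, img_b, mask, levels=4):
    """mask is 1 where img_a should dominate and 0 where img_b should."""
    la = laplacian_pyramid(img_a, levels)
    lb = laplacian_pyramid(img_b, levels)
    gm = gaussian_pyramid(mask, levels)  # smoother weights at coarser levels
    return collapse([m * a + (1.0 - m) * b for a, b, m in zip(la, lb, gm)])

rng = np.random.default_rng(1)
left, right = rng.random((64, 64)), rng.random((64, 64))
mask = np.zeros((64, 64))
mask[:, :32] = 1.0  # left half from `left`, right half from `right`
mosaic = blend(left, right, mask)
```

Because the mask is itself low-pass filtered at each level, coarse
bands are merged over a wide transition zone and fine bands over a
narrow one, which is the scale matching described above. Collapsing
an unmodified Laplacian pyramid returns the original image, so the
decomposition itself loses no detail.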
References
Brown, L. G. (1992). "A Survey of Image Registration Techniques."
ACM Computing Surveys 24(4): 325-376.
Burt, P. J. and E. H. Adelson (1983). "A Multiresolution Spline
with Application to Image Mosaics." ACM Transactions on
Graphics 2(4): 217-236.
Eustice, R., O. Pizarro, et al. (2002). UWIT: Underwater Image Toolbox
for Optical Image Processing and Mosaicking in MATLAB. Proceedings
of the Third Underwater Technology Symposium, 2002, Tokyo, Japan.
(to be presented)
Irani, M. and P. Anandan (1996). Robust Multi-Sensor Image Alignment.
Sixth International Conference on Computer Vision, 1998.
Mandelbaum, R., G. Salgian, et al. (1999). Correlation-Based Estimation
of Ego-Motion and Structure from Motion and Stereo. Proceedings
of the Seventh IEEE International Conference on Computer Vision,
1999, Kerkyra, Greece.
Reddy, B. S. and B. N. Chatterji (1996). "An FFT-Based Technique
for Translation, Rotation, and Scale-Invariant Image Registration."
IEEE Transactions on Image Processing 5(8): 1266-1271.
Zuiderveld, K. (1994). Contrast Limited Adaptive Histogram Equalization.
Graphics Gems IV. P. Heckbert. Boston, Academic Press. IV: 474-485.