Automatic real time image mosaicing

OUTLINE Image registration and mosaicing were first considered in the mid-nineteenth century as a way to combine overlapping aerial photographs. When the first satellites began sending images of the Earth, mosaicing became a necessity, and improvements in computer technology have since boosted research in the field. Mosaicing techniques are applied in many fields, from computer graphics to medical imaging and, more generally, wherever either image resolution or field of view must be enhanced.
Nowadays, image mosaicing attracts considerable interest in the research community, both for its scientific significance and for its potential spin-offs in real world applications. Automatic mosaicing could enable a wide range of higher level image processing tasks, such as scene depth computation, resolution enhancement, and motion detection and tracking with a non-stationary camera. Most real time mosaicing approaches need prior information or feedback signals from the camera (e.g. angle, focus or exposure information) to perform real time registration or to correct geometric deformations, and are therefore hardware dependent. On the other hand, high quality mosaics are commonly achieved using global registration approaches that exploit both the spatial and the temporal contiguity of the whole set of images to be mosaiced. These approaches require off-line computation, since the whole set has to be known in advance.

METHOD Our research addresses real time methods that are fully image based and self-calibrating, while achieving the high quality results typical of global registration. The solution we propose does not exploit any a priori information regarding scene geometry, acquisition device properties or feedback signals, and is therefore fully image based and general purpose.
We developed an innovative, fully automated, real time on-line mosaicing algorithm able to build high quality seam-free mosaics. It relies on pair-wise (frame to frame) image registration using feature based methods in order to offer real time performance [M4]. High quality registration is usually achieved through cost function minimization, at the expense of real time performance. We therefore devised a two-stage spatial registration method that removes the cumulative drift errors introduced by the suboptimal pair-wise placement of the first stage. The second stage performs a sort of "back registration" (frame to mosaic) that refines the first-stage estimate (Fig. 1).
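
Why chaining pair-wise estimates drifts, and why a frame-to-mosaic correction stops the drift from propagating, can be illustrated with a minimal sketch. It is not the algorithm of [M4]: frames are reduced to pure 2D translations, the bias value is invented, and the second stage is represented by a hypothetical `refine` callback that simply returns the correct position.

```python
def place_frames(pairwise, refine=None):
    """Chain frame-to-frame translations into mosaic coordinates.

    pairwise[i] is the estimated (dx, dy) from frame i to frame i + 1.
    refine(index, guess), if given, stands in for a frame-to-mosaic
    ("back registration") step and returns a corrected position.
    """
    positions = [(0.0, 0.0)]  # frame 0 defines the mosaic origin
    for index, (dx, dy) in enumerate(pairwise, start=1):
        x, y = positions[-1]
        guess = (x + dx, y + dy)  # stage 1: pair-wise placement
        positions.append(refine(index, guess) if refine else guess)
    return positions


# Simulated sequence (assumed values): the camera really moves 10 px per
# frame, but each pair-wise estimate carries a bias of 0.2 px.
TRUE_STEP, BIAS, N_FRAMES = 10.0, 0.2, 50
estimates = [(TRUE_STEP + BIAS, 0.0)] * N_FRAMES


def simulated_refine(index, guess):
    # Hypothetical stand-in: a real system would search the mosaic around
    # `guess`; here the correct position is simply returned.
    return (index * TRUE_STEP, 0.0)


true_end = N_FRAMES * TRUE_STEP
drift_without = place_frames(estimates)[-1][0] - true_end
drift_with = place_frames(estimates, simulated_refine)[-1][0] - true_end
print(drift_without, drift_with)
```

Without refinement the bias accumulates linearly with the number of frames. When every placement is corrected against the mosaic before the next frame is chained onto it, the error of one frame is not inherited by the following ones.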

Fig. 1: Mosaic built with (right) and without (left) our back registration.

Phase correlation is used to provide the initial guess, which gives a reliable estimate of the camera motion even for large inter-frame displacements, a critical issue for feature based methods.
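
The principle of phase correlation can be shown with a small sketch under simplifying assumptions: a 1D signal, a circular shift, and a naive DFT written with the standard library. A real implementation would use a 2D FFT on image data. The function names are illustrative and do not come from [M4].

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a tiny demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlation_shift(a, b):
    """Return the circular shift s such that b[t] == a[(t - s) % n]."""
    fa, fb = dft(a), dft(b)
    cross = []
    for u, v in zip(fa, fb):
        p = v * u.conjugate()  # cross-power spectrum
        mag = abs(p)
        # Keep only the phase; skip bins that carry no energy.
        cross.append(p / mag if mag > 1e-12 else 0)
    corr = [c.real for c in dft(cross, inverse=True)]
    # The correlation surface peaks at the displacement.
    return max(range(len(corr)), key=corr.__getitem__)


signal = [0, 1, 3, 7, 2, 0, 0, 5, 1, 0, 0, 4, 9, 2, 0, 1]
shifted = [signal[(t - 5) % len(signal)] for t in range(len(signal))]
estimated = phase_correlation_shift(signal, shifted)
print(estimated)
```

Because only the phase of the cross-power spectrum is kept, the peak stays sharp whatever the size of the displacement, which is why the technique suits large inter-frame motion.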

RESULTS The implemented algorithm runs in real time regardless of the scene being mosaiced; we have therefore focused on the quality of the outcome, which may depend on the scene being shot. To this end, several image sequences have been used to assess the performance of our approach, and Fig. 2 reports the outcome for three different environments.

Fig. 2: First row: a classic panorama, obtained by mosaicing a few tens of frames. Second row: mosaic of the inside of our lab (left), the plan highlighting the camera's field of view (middle), a colour mosaic with a narrower field of view (right). Third row: the outside of our lab (left) and the sequence of frames used to generate the mosaic (right).

The panorama in the first row is probably the easiest of the test cases considered, since parallax effects are usually absent. Even so, it is worth noting that the mosaic shows no drift errors although a long sequence of frames is used. In the second row, the inside of our lab, acquired in gray levels with pure pan (left) and in colour with pan and tilt (right), is shot under fairly uniform lighting. However, reconstructing a scene with highly varying depths (the camera's field of view is shown in the middle) is a considerable challenge, which is met using back registration. Finally, the third row shows the most challenging environment, acquired using the PTZ camera of the video surveillance system we set up. The scene is outdoors, highly structured and with a high dynamic range. The mosaic is built from about 250 frames that overlap by more than 90% (the frame map is on the right). Although the entrance door jamb (on the left) and the gate (on the right) are 1 and 30 meters from the camera, respectively, our back registration approach together with our joint spatial and tonal registration [M1] prevents any drift error in the mosaic, which is of excellent quality.

Joint spatial and tonal registration

OUTLINE Real time mosaicing techniques often deal with spatial registration only, while tonal registration is often left out because no reliable algorithms compliant with real time requirements are available. However, especially when the mosaic must be kept up to date, global changes in scene illumination, due either to environmental causes or to the camera's integrated Automatic Gain Control (AGC), auto-iris or auto-shutter, make corresponding regions framed at different times look different. Accordingly, to be used effectively, general purpose image mosaicing must treat the overall alignment problem in a space of higher dimension than the geometric 2D space, so that all the frames making up the mosaic appear under the same illumination.
Histogram modelling techniques modify the dynamic range and contrast of an image by altering each individual pixel so that the intensity histogram assumes a desired shape. Histogram specification, or histogram matching, is a basic histogram modelling technique that transforms one histogram into another by remapping the pixel values to control the relative frequency of their occurrence. Histogram specification methods can be classified according to computational complexity, image distortion and accuracy in reproducing the target histogram. Real time histogram specification methods look for a continuous function that transforms a source image to match the distribution of a target image, with a degree of accuracy compatible with the real time requirements. However, only exact specification, which exploits multi-valued ordering functions, offers the highest accuracy, and the approaches that use it are computationally expensive.
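
Classic histogram matching through cumulative distributions can be sketched as follows. This is the textbook one-to-one mapping, shown only to make the limitation concrete; the function names and the tiny 4-level example are assumptions for illustration.

```python
def histogram(pixels, levels):
    """Count how many pixels fall on each gray level."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def cdf(hist):
    """Normalized cumulative distribution of a histogram."""
    total, acc, out = sum(hist), 0, []
    for count in hist:
        acc += count
        out.append(acc / total)
    return out


def match_histogram(pixels, target_hist):
    """Map each source level to the first target level whose CDF reaches it."""
    levels = len(target_hist)
    src_cdf = cdf(histogram(pixels, levels))
    tgt_cdf = cdf(target_hist)
    mapping, j = [], 0
    for g in range(levels):
        while j < levels - 1 and tgt_cdf[j] < src_cdf[g]:
            j += 1
        mapping.append(j)
    return [mapping[p] for p in pixels], mapping


source = [0, 0, 0, 0, 1, 1, 3, 3]  # flattened 4-level image
target = [2, 2, 2, 2]              # desired flat histogram
matched, mapping = match_histogram(source, target)
print(mapping, histogram(matched, 4))
```

All pixels sharing a source level necessarily land on the same output level, so the four pixels at level 0 cannot be split over two target bins and the flat target is only approximated.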

Fig. 1: Example images (far left, the original) after histogram specification with our method using synthetic target histograms (linear, Gaussian and logarithmic, from left to right).

METHOD In this research we have developed an innovative exact specification method that is compliant with real time requirements. Practically speaking, our method offers the following features:

capability to achieve exact matching
simplicity yielding a fast algorithm
independence of the properties of both the source image and the target histogram

Usually, the mapping between the original and target histograms is carried out by means of functions that can involve 1-dim features (gray levels) or N-dim features, where different metrics (e.g. neighborhood average) make it possible to discriminate among pixels that have the same values for the remaining (N-1)-dim features. Our approach abandons the concept of an intensity mapping function and replaces it with a one-to-many mapping relationship. This enables us to map pixels that have the same value for one feature in the source histogram to different gray levels. Besides, the random approach we use to assign the source gray levels to the target ones prevents the introduction of structured pattern noise, which is common to other approaches that use given metrics instead.
Together with spatial registration, this constitutes the core of our algorithm, which automatically builds a background mosaic without exploiting any prior information regarding the scene or the acquisition device, despite the presence of moving objects in the scene [M1].
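
The one-to-many idea can be sketched as follows. This is an illustrative reading of the description above, not the implementation of [M2]: pixels are ordered by gray level, ties are broken at random, and target levels are then handed out in that order according to the target counts. The function name and the seed parameter are assumptions.

```python
import random


def exact_specify(pixels, target_counts, seed=0):
    """Remap pixels so the output histogram equals target_counts exactly.

    pixels: flat list of integer gray levels.
    target_counts[g]: desired number of output pixels at level g; the
    counts must sum to len(pixels).
    """
    if sum(target_counts) != len(pixels):
        raise ValueError("target histogram must account for every pixel")
    rng = random.Random(seed)
    # Order by gray level; equal levels are ordered at random, so one
    # source level can be split over several target levels.
    order = sorted(range(len(pixels)),
                   key=lambda i: (pixels[i], rng.random()))
    out = [0] * len(pixels)
    pos = 0
    for level, count in enumerate(target_counts):
        for i in order[pos:pos + count]:
            out[i] = level
        pos += count
    return out


source = [0, 0, 0, 0, 1, 1, 3, 3]  # flattened 4-level image
target = [2, 2, 2, 2]              # desired flat histogram
result = exact_specify(source, target)
result_hist = [result.count(g) for g in range(len(target))]
print(result, result_hist)
```

The four pixels at source level 0 are split between output levels 0 and 1, so the target histogram is met exactly while the gray level ordering of the source is preserved. Which of the tied pixels goes where is random, hence unstructured.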

RESULTS We have carried out comparative experiments using the well known Baboon and Boat images. Fig. 2 reports the results of the quasi-exact methods that are most similar to ours in terms of speed.

[Figure panels. Rows: Classic, Heramian, Our algorithm. Columns: Original, Linear, Gaussian, Logarithmic.]

Fig. 2: At the top left, the original source PDF. From top to bottom, results with the classic method, Heramian's and ours when the target PDFs are linear, Gaussian and logarithmic (from left to right, respectively).

To assess the quality and time performance of our method we use three indicators that we have implemented: computational speed, histogram distortion and image distortion. Although here we use just a 1-dim feature (i.e. gray levels) for the source histogram, our algorithm attains the same quality as the reference exact methods. Besides, it always achieves the exact target histogram where other methods fail. Moreover, our algorithm runs two orders of magnitude faster than the reference exact methods and more than one order of magnitude faster than the other quasi-exact approaches. We can thus conclude that our method shows very low image distortion (see Fig. 1), while achieving exact specification (Fig. 2) faster than the other methods do. More details are reported in [M2].

The effects of our method are more clearly perceptible in image registration. The lack of tonal registration is visible in Fig. 3, left, both indoors (top) and outdoors (bottom). Indoors, the joint registration (right) mainly improves the geometric properties of the mosaic, adjusting its deformation. Outdoors, where lighting conditions can change heavily, the lack of tonal registration makes the mosaic unusable, since the single frames keep their original illumination (left). By contrast, our approach copes with the heavy lighting differences, making it possible to build a mosaic even outdoors (right). A more extensive analysis is reported in [M1].

Fig. 3: Mosaic with spatial registration only (left) and with joint tonal and spatial registration (right). Indoors (top), the even illumination limits the effect of tonal registration to the adjustment of deformations. Outdoors (bottom), the changing lighting conditions are visible in the single frames composing the mosaic on the left. Here, our method makes it possible to build an effective seamless mosaic (right).

Motion Detection with Pan-Tilt-Zoom Cameras using Background Mosaics

OUTLINE Segmenting the scene into background and foreground (moving) regions is the first stage of many applications, such as visual surveillance, traffic monitoring and human activity understanding. With one stationary camera, the background subtraction technique (BST) is known to yield the highest quality moving masks. Using this approach basically means comparing the current frame with a reference scene (a previously computed background). Moving "blobs" (aggregates of pixels) are identified by thresholding these differences.
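
The basic comparison and thresholding step described above can be written in a few lines. The sketch assumes gray level images stored as nested lists and an arbitrary threshold of 25; both choices, like the function name, are illustrative only.

```python
def foreground_mask(frame, background, threshold):
    """Return a binary mask: 1 where |frame - background| > threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [11, 10, 9, 10],   # small differences: sensor noise
    [10, 80, 82, 10],  # large differences: a moving object
    [10, 79, 81, 12],
]
mask = foreground_mask(frame, background, threshold=25)
for row in mask:
    print(row)
```

Connected groups of 1s in the mask are the moving blobs. With a PTZ camera the same comparison applies once the current frame has been registered onto the corresponding region of the background mosaic.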
Many attempts have been made to improve the overall performance of motion detection systems, by improving background detection techniques or exploiting colour information. Pan/tilt/zoom (PTZ) cameras widen the field of view over a surveyed area, but they require less accurate methods (e.g. optical flow) to detect motion. Having a background mosaic at one's disposal could allow a system to exploit BST as well, but the existing attempts to do so rely on prior assumptions that limit the camera motion or restrict the algorithm to a deep field of view only. In addition, the lack of effective real time tonal, or photometric, registration has prevented the existing approaches from being used in real world applications.

METHOD Our group proposes an innovative solution to achieve a real time colour [M3] background mosaic suited to work with existing colour background subtraction algorithms and to yield excellent foreground object masks [M4]. The two-stage real time algorithm we have developed for joint spatial and tonal registration [M1] also accounts for photometric misalignments induced by varying light conditions and exposure. It prevents error accumulation and produces globally coherent mosaics, leading to near optimal results without resorting to computationally intensive global adjustment procedures.
Multiple objects are allowed to move in the scene while the mosaic is being built, since they are recognized as "motion outliers" (with respect to camera motion) using fast and robust statistical estimators (e.g. IRANSAC) and iterative clustering methods. Recovery of the device intrinsics and lens distortion parameters (e.g. focal length and distortion coefficients) has been addressed as well, with independent modules that estimate them directly from the sequence.
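
How feature matches on moving objects end up as motion outliers can be illustrated with a plain RANSAC sketch. It is a simplification rather than the estimator used in [M1] or [M4]: camera motion is modelled as a pure translation, a single match defines each hypothesis, and the tolerance, iteration count and sample coordinates are invented.

```python
import random


def ransac_translation(matches, tolerance=1.0, iterations=50, seed=0):
    """Estimate the dominant translation among feature matches.

    matches: list of ((x0, y0), (x1, y1)) correspondences.
    Returns (translation, inlier_indices, outlier_indices).
    """
    rng = random.Random(seed)
    displacements = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in matches]
    best = []
    for _ in range(iterations):
        tx, ty = rng.choice(displacements)  # one match gives a hypothesis
        inliers = [i for i, (dx, dy) in enumerate(displacements)
                   if abs(dx - tx) <= tolerance and abs(dy - ty) <= tolerance]
        if len(inliers) > len(best):
            best = inliers
    # Refine the estimate on the largest consensus set.
    tx = sum(displacements[i][0] for i in best) / len(best)
    ty = sum(displacements[i][1] for i in best) / len(best)
    outliers = [i for i in range(len(matches)) if i not in best]
    return (tx, ty), best, outliers


# Eight background features follow the camera motion (3, -2); three
# features on a walking person move differently (invented coordinates).
camera = [((x, y), (x + 3.0, y - 2.0)) for x, y in
          [(5, 5), (40, 12), (70, 30), (15, 60),
           (55, 48), (90, 75), (25, 90), (80, 8)]]
walker = [((x, y), (x + 15.0, y + 4.0)) for x, y in
          [(50, 50), (52, 58), (49, 66)]]
translation, inliers, outliers = ransac_translation(camera + walker)
print(translation, outliers)
```

The matches that disagree with the dominant motion are exactly those on the moving object, so they can be excluded from registration and kept out of the background mosaic.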

RESULTS Several challenging indoor and outdoor case studies involving panning and tilting have been considered. In Fig. 1, three different scenes show one person walking indoors (first row), one person walking outdoors (second row) and two people tracked outdoors (third row).

Fig. 1: Walking people are detected using a background mosaic. First row: motion detection in our lab, panning and tilting with small angles. Second row: a person is detected outdoors, with a highly accurate mask, while panning manually. Third row: example of automatic people tracking in our surveillance system, with wide pan and tilt angles.

People are detected through background subtraction, where the background is the mosaic built using our joint spatial and tonal approach. In the first row, the camera performs narrow pan and tilt movements, while in the second row it is panned manually across the scene. The moving masks thus detected have been compared with those obtained by a fixed camera employing a static background, and they show substantially the same quality. The third row shows two people tracked by our surveillance system (in the right frame, the person re-enters the scene after having left it). In conclusion, the quality of the mosaics attained permits a novel approach to motion detection, based on background subtraction even with PTZ cameras. Moreover, to our knowledge, this approach is the first in the literature to be used for automatic tracking as well.

INDUSTRIAL APPLICATION

automatic 3D digital image metrology
quality control (e.g. 3D pose estimation, object alignment, offset displacement)
image stabilization
automatic tracking (video surveillance, quality control, etc.)
imaging of full objects (built incrementally by moving a camera)

REFERENCES

[M1] P. Azzari, A. Bevilacqua, Joint Spatial and Tonal Mosaic Alignment for Motion Detection with PTZ Camera, Lecture Notes in Computer Science (LNCS), Vol. 4142 (2006) 764-775
[M2] A. Bevilacqua, P. Azzari, A High Performance Exact Histogram Specification Algorithm, 14th International Conference on Image Analysis and Processing (ICIAP 2007), September 10-14, 2007, Modena, Italy, pp.623-628
[M3] A. Bevilacqua, P. Azzari, High-quality real time motion detection using PTZ cameras, IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS 2006), November 22-24, 2006, Sydney, NSW, Australia, pp.23-29
[M4] A. Bevilacqua, P. Azzari, A Fast and Reliable Image Mosaicing Technique with Application to Wide Area Motion Detection, Lecture Notes in Computer Science (LNCS), Vol. 4633 (2007) 501-512
