Edges are a robust feature for object detection. In this paper, we present an edge-based background modeling method for the detection of moving objects. The edges in the image frames are extracted using the robust Canny edge detector. Two edge maps are created and combined to compute the ultimate moving-edge map. A temporary background-edge map is created by selecting all the edge pixels of the current frame that lie beyond a defined threshold distance from the ultimate moving edges. If the frequencies of these temporary background edge pixels over several frames exceed a threshold, the corresponding edge pixels are treated as background edge pixels. Existing edge-based moving-object detection algorithms have difficulty with changes in background motion, object shape, illumination variation, and noise. We conducted a performance comparison with previous works, and the results show that the proposed algorithm can detect moving objects efficiently in real-world scenarios.
Moving-object detection methods detect moving objects by subtracting the background from the current image, and their performance depends on background modeling. A background model should cope with illumination variation of the background and with noise. Such modeling methods can be classified into two types, pixel-based and edge-based, depending on the features used for the detection of a moving object. Pixel-based background modeling must account for changes in illumination and noise; these changes can produce spurious moving regions in the background, which affect moving-object detection. There has been a large amount of work addressing the issues of background model representation and adaptation in pixel-based methods [1-5]. Edge-based methods use edges, which are less sensitive to intensity changes [6-10], and they work with fewer pixels than pixel-based methods.
Dailey et al. [6] presented an algorithm that uses the interframe differences of three consecutive frames to obtain two difference images. A Sobel edge detector was applied to the difference images, which were then thresholded to create binary images. Finally, the two binary images were intersected to obtain a moving-edge map. Kim and Hwang [7] presented an algorithm for the segmentation of moving objects with a robust double-edge map derived from the difference between two successive frames. After removing the edge points that belong to the previous frame, the remaining-edge map and the moving-edge map were combined to compute the final moving-edge map. Absolute background edges could be extracted from the first frame or by counting the number of edge occurrences for each pixel over the first several frames. This initialization of the background creates false-positive edges because it is impossible to obtain a background without moving objects in a real-world environment. Furthermore, these two methods are sensitive to variations in the shape of the moving objects and to noise. Because they do not apply any background modeling, they cannot detect slow-moving objects. These limitations can be overcome by background modeling.
In this paper, we present an edge-based background modeling method based on the edge map of an interframe difference image; it overcomes the problems caused by illumination variation, moving objects present during background initialization, and scattered edges. We use a Canny edge detector to extract the edges in the image frames and create two edge maps: the changing moving edge and the stationary moving edge. These two edge maps are combined to compute the ultimate moving-edge map. A temporary background-edge map is created by selecting all the edge pixels of the current frame that lie beyond a threshold distance from the ultimate moving edges. The frequencies of the temporary background edge pixels are counted over several frames; if a frequency exceeds the threshold, the corresponding edge pixel is treated as a background edge pixel and stored in the new background-edge map. Using this updated background, we can detect stationary moving edges efficiently.
The rest of this paper is organized as follows: In Section II, we discuss the details of the proposed algorithm. Section III presents the results of the performance evaluation, followed by the overall conclusion in Section IV.
The first stage of the proposed background modeling is edge detection. We use a Canny edge detector [11] in this stage, which executes five separate steps: smoothing, finding gradients, non-maximum suppression, double thresholding, and edge tracking by hysteresis. In the smoothing step, the image is blurred to remove noise. In the gradient-finding step, a gradient operator ∇ is applied to the Gaussian-convolved image G*F. Non-maximum suppression is applied to the gradient magnitude to thin the edges. Double thresholding with hysteresis is applied to detect and link edges. The Canny edge map can be expressed as follows:
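The original equation is not reproduced in this text; a compact sketch consistent with the five steps above, using assumed operator symbols, is

$$
E_n \;=\; \Phi(I_n) \;=\; H_{T_{low},\,T_{high}}\!\Big(\mathrm{NMS}\big(\lVert \nabla (G_\sigma * I_n) \rVert\big)\Big),
$$

where I_n is the n-th frame, G_σ * I_n is the Gaussian-smoothed frame (the G*F above), NMS(·) denotes non-maximum suppression of the gradient magnitude, and H_{T_low, T_high}(·) denotes double thresholding with hysteresis-based edge linking; Φ(·) abbreviates the whole Canny operator in the remainder of this section.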
Edge extraction from the difference image of two successive frames results in a noise-robust difference edge map.
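Under the same assumed notation, this difference edge map can be sketched as

$$
DE_n \;=\; \Phi\big(\lvert I_n - I_{n-1} \rvert\big),
$$

that is, the Canny operator applied to the absolute difference of the current frame I_n and the previous frame I_{n-1}.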
Fig. 1 shows the block diagram of the proposed moving-object detection algorithm. We extract the moving edges of the current frame by combining two edge maps, the changing moving edge and the stationary moving edge, as defined below.
We first define the changing moving-edge map, which captures the edges of the current frame produced by interframe change.
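The original definition is not preserved in this text; a plausible sketch, selecting the edge pixels of the current frame that lie close to the difference edge map (with T_chg an assumed distance threshold), is

$$
E^{change}_n \;=\; \big\{\, e \in \Phi(I_n) \;:\; \operatorname{dist}(e, DE_n) \le T_{chg} \,\big\}.
$$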
For selecting the stationary moving edges, all the edge points of the current frame that belong to the previous moving-edge map are removed. The stationary moving edge is then defined as the set of remaining edges of the current frame that lie within a threshold distance of the previous moving-edge map.
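A sketch of this selection in the notation above (with T_still an assumed threshold and ME_{n-1} the previous moving-edge map) is

$$
E^{still}_n \;=\; \big\{\, e \in \Phi(I_n)\setminus ME_{n-1} \;:\; \operatorname{dist}(e, ME_{n-1}) \le T_{still} \,\big\}.
$$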
The ultimate moving-edge map for the current frame is obtained by combining these two maps.
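Assuming the combination is a simple union of the two maps, this can be written as

$$
ME_n \;=\; E^{change}_n \,\cup\, E^{still}_n .
$$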
The temporary background-edge map E_tb is obtained by selecting all the edge pixels of the current frame that lie beyond the distance T_back from the ultimate moving-edge map ME_n.
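In the notation above, this selection can be sketched as

$$
E_{tb} \;=\; \big\{\, e \in \Phi(I_n) \;:\; \operatorname{dist}(e, ME_n) > T_{back} \,\big\}.
$$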
For modeling the background, we count the frequency of each temporary background-edge pixel over 200 frames. If the frequency of an edge pixel exceeds the threshold, that pixel is considered a background edge pixel and stored in the new background-edge map.
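A minimal sketch of this frequency-based update is given below; the function name, the 200-frame window passed in as a list, and the particular threshold value are illustrative assumptions rather than details from the paper.

```python
import numpy as np

def update_background_edges(temp_edge_maps, freq_threshold=150):
    """Build a new background-edge map from the temporary background-edge
    maps accumulated over the last 200 frames (hypothetical sketch).

    temp_edge_maps : list of binary H x W arrays, one per frame (E_tb)
    freq_threshold : minimum number of frames in which a pixel must appear
                     as a temporary background edge (assumed value)
    """
    # Count, for every pixel, how often it was marked as a temporary
    # background edge over the accumulation window.
    freq = np.sum(np.stack(temp_edge_maps, axis=0) > 0, axis=0)

    # Pixels whose occurrence frequency exceeds the threshold are kept
    # as background edge pixels in the updated background-edge map.
    return (freq > freq_threshold).astype(np.uint8)
```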
This updated background is used for detecting the stationary moving-edge map. For the extraction of moving objects, we use the connected-component algorithm [6]. After the extraction of the moving objects, morphological operations are applied in the post-processing step to remove noise regions.
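The paper does not give implementation details for this step; the following OpenCV-based sketch illustrates one way to combine connected-component grouping with simple morphological clean-up (the kernel size and minimum area are assumed parameters).

```python
import cv2
import numpy as np

def extract_object_mask(moving_edge_map, min_area=50):
    """Hypothetical post-processing sketch for a binary moving-edge map."""
    # Close small gaps between nearby edge pixels so that the edges of a
    # single object form one connected component.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    closed = cv2.morphologyEx(moving_edge_map, cv2.MORPH_CLOSE, kernel)

    # Label connected components and keep only regions large enough to
    # correspond to real objects; small components are treated as noise.
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(closed)
    mask = np.zeros_like(closed)
    for label in range(1, num_labels):  # label 0 is the background
        if stats[label, cv2.CC_STAT_AREA] >= min_area:
            mask[labels == label] = 255
    return mask
```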
We used the Performance Evaluation of Tracking and Surveillance (PETS) 2001 [12] dataset 3 (DS3) and dataset 4 (DS4), and PETS 2009 views 1, 5, and 6 [13]. The PETS 2001 datasets are composed of five separate data sequences; all of them are multi-view (two cameras) and contain moving people and vehicles. DS3 has a more challenging sequence in terms of multiple targets and significant lighting variation. The PETS 2009 datasets are multi-sensor sequences containing different crowd activities (walking around, standing, etc.) captured from eight viewpoints. The PETS 2009 datasets lack ideal (object-free) frames and are more challenging for background modeling [14]. Using the ground truth [15], we built the ground-truth edges of these datasets. We compared our test results with those of two other edge-based methods, Dailey et al. [6] and Kim and Hwang [7], as shown in Fig. 2.
Fig. 2 shows the column-wise comparison of the proposed method with the methods developed by Dailey et al. [6] and Kim and Hwang [7]. The first column shows the image frames; the second column, the ground-truth images; the third column, the object-edge maps detected by the method of Dailey et al. [6]; the fourth column, the object-edge maps detected by the method of Kim and Hwang [7]; and the last column, the object-edge maps detected by the proposed method. The comparison shows that the methods of Dailey et al. [6] and Kim and Hwang [7] produce more scattered edges than the proposed method. The proposed method suppresses false-positive edges and removes scattered edges from the object-edge map through background modeling.
We proposed an edge-based method to model the background for the detection of moving objects. We applied background modeling and updated the background after 200 frames. The proposed method is more robust than earlier methods in detecting slow-moving objects, handling changes in object shape, and suppressing noise. It can be used in different applications, such as surveillance and content-based video coding.