Skin Region Detection Using a Mean Shift Algorithm Based on the Histogram Approximation
 Author: Byun KiWon, Nam KiGon, Ye SooYoung
 Organization: Byun KiWon; Nam KiGon; Ye SooYoung
 Publish: Transactions on Electrical and Electronic Materials Volume 13, Issue 1, p10~15, 25 Feb 2012

ABSTRACT
Conventional skin detection methods rely on skin color definitions based on prior knowledge, and the threshold value that divides the background from the skin region is determined subjectively by experimentation. A drawback of such techniques is that their performance depends on a threshold value estimated from repeated experiments. To overcome this, the present paper introduces a skin region detection method that uses a histogram approximation based on the mean shift algorithm. The proposed method applies the mean shift procedure to the histogram of a skin map of the input image, which is generated by comparing the image with standard skin colors in the C_{b}C_{r} color space. It divides the background from the skin region by selecting the maximum value according to the brightness level. Because the histogram, which is accumulated according to the brightness values of the pixels, has the form of a discontinuous function, it is approximated by a Gaussian mixture model (GMM) using the Bezier curve technique. The proposed method then detects the skin region by using the mean shift procedure to determine a maximum value, which becomes the dividing point in place of the manually selected threshold value used in existing techniques. Experiments confirm that the new procedure effectively detects the skin region.

KEYWORD
Skin region detection, Mean shift, Histogram approximation

1. INTRODUCTION
Skin detection plays an important role in various areas of human-computer interaction, such as face detection, face tracking, content-based image data search systems, and gesture analysis. Recently, skin detection methods based on skin color data have attracted considerable attention because of their computational efficiency and their robustness to rotation, size, and partial obstruction of the relevant region. Skin color is used to complement geometric data in designing accurate face detection systems [1-4]. Skin detection via skin color is a significant preprocessing tool in face detection and recognition.
Research on skin detection is generally conducted using visible spectrum images. However, skin detection in visible spectrum images is limited by ambient illumination, camera characteristics, ethnicity, and personal characteristics. Procedures using non-visual spectrum images, such as infrared images, have been considered as a means of resolving such issues, but these procedures require prohibitively expensive hardware devices or extremely limited environments [5-9].
As a means of classification, skin detection distinguishes two categories: the skin region and the non-skin region. For skin detection based on color data, an efficient classification procedure requires (1) the selection of an appropriate color model, (2) the selection of a suitable model distribution for skin and non-skin pixels, and (3) the consideration of the actual distribution being modeled. Selection of an appropriate color model determines the efficiency with which a given skin color distribution can be modeled. The skin color distribution is usually modeled by a histogram or a Gaussian distribution. Various techniques for obtaining the actual distribution have been researched, from the simple use of a skin color lookup table to complex pattern recognition. In existing research on skin color detection, RGB images are transformed to a color space in which intensity and color are separated, so that the effects of external illumination can be excluded. TSL, NCC, HSV, and YC_{b}C_{r} are the color spaces that are usually considered. Techniques for transforming the color spaces include linear and nonlinear transformations of RGB. Linear transformations include YIQ, YUV, and YC_{b}C_{r}. Nonlinear transformations include NCC, HSV, and HSL [10-13]. Recent research efforts have generally used techniques that transform higher-dimensional color spaces into lower-dimensional color spaces to save computational time [14,15]. When skin detection is conducted using predefined skin color data, the skin similarity threshold value that divides the background region from the skin region is determined from repeated experimentation [16]. Such methods are limited in that the threshold values vary according to the experimental environment and skin color data. Also, these threshold values are not standardized objectively, and are partly based on subjective user judgment. To overcome the weaknesses in the existing procedures, this study introduces a technique of skin color detection using histogram data based on the mean shift algorithm in a lower-dimensional color space. Unlike the existing procedures, this technique does not use experimentally determined threshold values. Instead, it uses the mean shift algorithm to find local maxima, which are used as segmentation points to detect the skin region.
Methods of skin color detection can be classified into those based on physical characteristics and those based on statistical characteristics. The latter can be subdivided into parametric approaches [17-19] and nonparametric approaches [20-22]. Generally, methods based on statistical characteristics detect the skin region in a lower-dimensional color space to reduce the effects of illumination. The skin color distributions used in parametric statistical approaches tend to follow a Gaussian mixture model (rather than a simple Gaussian model), and thus methods of skin detection based on these approaches usually employ a Gaussian mixture model [18,23]. Selection of the model dimensions is one of the important problems that arise in such procedures. The selection is usually accomplished inductively, based on a priori environmental data, and the parameters of the skin color distribution are determined variously on the basis of ethnicity and illumination. Recent studies have made frequent use of the Expectation-Maximization (EM) algorithm as an applied technique for estimating the parameters of the Gaussian mixture model [24]. The EM algorithm uses a probability density function, determining the parameters of a skin color distribution by inductive estimation, based on ambient illumination and ethnicity. Nonparametric approaches use a model that expresses the general shape of the skin color distribution more easily than parametric approaches [21,22]. Such techniques usually employ a histogram to represent the characteristics of the skin color distribution in color space. One major advantage of using a histogram is that the probability density function can be calculated quantitatively from the quantization levels, even when the skin color distribution is complex.
Unlike the statistical approach, skin detection based on physical characteristics uses a physical model of inherent skin color. Such physical models are often employed in research methods to detect skin regions because they neglect background-based changes in illumination, and use permanent skin color characteristics.
2. METHOD OF SKIN REGION DETECTION
Figure 1 shows the block diagram of the proposed skin region detection method.
The proposed technique includes color transformation, skin map histogram generation, histogram approximation, and skin region detection via a mean shift. To accomplish the color transformation, input images are transformed from the RGB color space to the YC_{b}C_{r} color space using a color transformation formula. The skin map histogram is generated by expressing the skin region distribution in terms of brightness values. This is calculated from a standard skin color table and similarities established in advance by using the skin color characteristics of the C_{b}C_{r} color space. Histogram approximation is carried out by regarding the histogram as a discontinuous function, which is approximated by a continuous Gaussian function using the Bezier curve theorem. Finally, the mean shift algorithm is applied to the histogram to find the Gaussian local maxima in regions having similar brightness distributions. The brightness values of the pixels in the relevant regions are made uniform with these local maxima, and the regions having the maximum brightness value are detected as skin regions via region growing.
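The color transformation step can be illustrated with a minimal sketch. The coefficients below are the full-range ITU-R BT.601 (JFIF) ones, which is an assumption: the text does not state which RGB-to-YC_{b}C_{r} variant was used, and the function name rgb_to_ycbcr is hypothetical.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB image with values in 0..255 to YCbCr.

    Assumption: full-range BT.601 (JFIF) coefficients; the paper's exact
    transformation formula is not reproduced in the text.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b                    # brightness
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b       # blue difference
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b       # red difference
    return np.stack([y, cb, cr], axis=-1)

# Only the last two channels (Cb, Cr) are used for the skin color analysis.
image = np.array([[[128, 128, 128], [255, 0, 0]]], dtype=np.uint8)
ycbcr = rgb_to_ycbcr(image)
cbcr = ycbcr[..., 1:]
```

Dropping the Y channel after the conversion is what reduces the problem to the lower-dimensional C_{b}C_{r} space described above.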
2.1 Skin color analysis
The skin region occupies certain parts of a color space, and this characteristic enables skin color to be separated from other background colors. The skin region distribution varies according to color space, and thus the choice of color space affects detection performance [16]. The YC_{b}C_{r} color space includes Y, which indicates the brightness value, and C_{b} and C_{r}, which represent color differences. With Y excluded, the pixels of the C_{b}C_{r} color space contain only color data, and are less affected by illumination. Thus, the skin color region in the C_{b}C_{r} color space is effective in various illumination environments, as brightness has less effect on color values than in other color spaces. The first step in skin color region detection is to define the skin color region. This definition is accomplished by using existing skin color region images made from images of various people's faces in varying ambient illumination. The effects of illumination should be considered in detecting skin color regions. In this study, the lower-dimensional C_{b}C_{r} color space is used to minimize the effects of illumination [14,15]. Figure 2 shows the distribution regions of the standard skin color table in the C_{b}C_{r} color space. In the images used in this study, the C_{b} skin color values were distributed primarily between 102 and 118, while the C_{r} color values were between 137 and 152. A standard skin color table was constructed on the basis of 100 Korean male and female adults illuminated by fluorescent lights in ordinary buildings. According to the results of research on face detection, skin color distribution is similar in form to a Gaussian distribution. Thus, skin color distribution can be expressed as a 2D Gaussian function G(μ_{CbCr}, ∑_{CbCr}). Here, C_{b} and C_{r} denote pixel color values, μ_{CbCr} denotes the Gaussian mean color values, and ∑_{CbCr} denotes the 2D Gaussian covariance matrix.
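As a rough illustration of how the 2D Gaussian model could be estimated, the sketch below computes a sample mean and covariance from C_{b}C_{r} skin pixels. The function name fit_skin_gaussian is hypothetical, and the samples are synthetic values drawn uniformly from the ranges reported above; they merely stand in for the standard skin color table, which is not available here.

```python
import numpy as np

def fit_skin_gaussian(cbcr_samples):
    """Estimate the mean (2,) and covariance (2, 2) of (N, 2) Cb/Cr samples."""
    samples = np.asarray(cbcr_samples, dtype=np.float64)
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)   # rows are observations
    return mean, cov

# Synthetic stand-in for skin pixels (assumption, not the authors' data):
# Cb in [102, 118] and Cr in [137, 152], as reported in the text.
rng = np.random.default_rng(0)
samples = np.column_stack([
    rng.uniform(102, 118, size=500),
    rng.uniform(137, 152, size=500),
])
mean, cov = fit_skin_gaussian(samples)
```

With real skin pixels the covariance would normally have non-zero off-diagonal terms, since C_{b} and C_{r} values of skin tend to be correlated.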
2.2 Skin map generation
The skin map used in this study is calculated from the skin region similarity between a standard skin color table and input images, and then normalized to brightness values between 0 and 255. The skin map is generated by applying the Mahalanobis distance to the Gaussian mean and covariance of predefined skin color images.
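The skin map computation described above can be sketched as follows. The squared Mahalanobis distance follows the description in the text; the mapping of that distance to the 0 to 255 range via exp(-d²/2) is an assumption chosen for illustration, since the paper's own normalization formula is not reproduced here. The name skin_map and the example mean and covariance are hypothetical.

```python
import numpy as np

def skin_map(cbcr, mean, cov):
    """Return a uint8 map in which brighter pixels are more skin-like.

    cbcr: (H, W, 2) array of Cb/Cr values; mean: (2,); cov: (2, 2).
    """
    diff = np.asarray(cbcr, dtype=np.float64) - mean
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every pixel to the skin color mean.
    d2 = np.einsum("...i,ij,...j->...", diff, inv_cov, diff)
    # Assumed normalization: Gaussian-shaped similarity scaled to 0..255.
    similarity = np.exp(-0.5 * d2)
    return np.round(255.0 * similarity).astype(np.uint8)

# Hypothetical model parameters, for illustration only.
mean = np.array([110.0, 145.0])
cov = np.array([[20.0, 0.0], [0.0, 20.0]])
pixels = np.array([[[110.0, 145.0], [0.0, 0.0]]])
result = skin_map(pixels, mean, cov)
```

A pixel whose color equals the mean maps to 255, and colors far from the skin cluster fall toward 0, which matches the behavior the text attributes to the normalized map.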
Equation (6) is the formula for calculating the Mahalanobis distance. ∑^{-1}_{CbCr} denotes the inverse of the 2D Gaussian covariance matrix, and N is the total number of pixels in the input image. The values obtained from Equation (6) indicate the degree of similarity to the skin region, but they require normalization to values between 0 and 255 to be expressed as image intensities. Equation (7) provides a formula for normalizing the values obtained from Equation (6). It produces brightness values that are close to 255 when the similarity to the skin region is high.
2.3 Skin region detection using histogram approximation
The proposed technique uses skin map histogram approximation to efficiently detect skin regions in environments with varying or complex illumination. The procedure is carried out in three steps. First, the skin map histogram is regarded as a discontinuous function to be approximated by a continuous Gaussian function, using the Bezier curve theorem. In the second step, the mean shift algorithm is used to find Gaussian local maxima in regions having similar brightness distributions, and the brightness values of pixels in the relevant regions are set to the local maxima. In the third step, the uniform brightness value of each region is examined, and the region with the highest brightness value is detected as the skin region via region growing.
2.3.1 Histogram approximation using Bezier curve
Generally, a skin map histogram has the form of a discontinuous function determined by the accumulated brightness values of the pixels. In the proposed method, this histogram is approximated by a continuous Bezier curve, using the brightness value of each level of the histogram as a Bezier control point. Equation (8) is used to obtain a histogram from a skin map.
Here, N indicates the size of the skin map.
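The histogram construction and its Bezier approximation can be sketched as below, under stated assumptions. The histogram counts pixels per brightness level, and each count serves as one control value of a single Bezier curve. The curve is evaluated by repeated linear interpolation (de Casteljau's algorithm), which avoids computing high-order Bernstein coefficients directly. All function names and the sampling density are hypothetical.

```python
import numpy as np

def skin_map_histogram(skin_map, levels=256):
    """Count how many pixels fall on each brightness level."""
    values = np.asarray(skin_map).astype(np.intp).ravel()
    return np.bincount(values, minlength=levels)

def de_casteljau(control_values, t):
    """Evaluate a Bezier curve at t in [0, 1] by repeated linear interpolation."""
    pts = np.asarray(control_values, dtype=np.float64).copy()
    for _ in range(len(pts) - 1):
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

def bezier_smooth(hist, num_samples=256):
    """Sample the Bezier curve whose control values are the histogram bins.

    Because the control points are equally spaced along the brightness axis,
    the sample at parameter t corresponds to brightness (len(hist) - 1) * t.
    """
    ts = np.linspace(0.0, 1.0, num_samples)
    return np.array([de_casteljau(hist, t) for t in ts])

hist = skin_map_histogram(np.array([[0, 0], [255, 3]]))
smooth = bezier_smooth(hist, num_samples=16)
```

A single curve over all 256 control points smooths very strongly, so in practice the degree of smoothing would need to be checked against the histogram's actual peaks.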
Equations (9) and (10) are the Bernstein function equations for the Bezier curve [18].
Here, P_{i} is a control point for generating the Bezier curve, and u denotes a variable for controlling distance (smaller values of u correspond to shorter distances along the curve). The Bezier curve is approximated by a Gaussian curve with Bezier control points given by h(level, value), which denotes the frequency of a given brightness level of the histogram. The order of the Bezier curve formula is determined by the number of control points, and thus the Bernstein function formula has order 256 (one control point per brightness level). Computational errors cause such high-order Bernstein functions to generate unstable Bezier curves, and thus this study repeatedly applies the first-degree (linear) de Casteljau interpolation instead of the Bernstein functions [27]. Equation (11) is the Bezier curve formula using the de Casteljau algorithm. The control point P_{S}(x) indicates the frequency at each brightness level of the histogram, and t is a distance control variable calculated via Equation (12) (smaller values of t correspond to shorter curves).
2.3.2 Establishment of threshold value using mean shift algorithm
In the mean shift algorithm, the mode of the probability density function is found by hill climbing. The probability density function indicates the brightness distribution of pixels in the intensity image. The algorithm converges on a local maximum point within the kernel by repeatedly calculating the mean locations and mean brightness values of pixels having a similar brightness distribution in a neighborhood of the given pixel. In other words, the pixel value at the current location is transformed to the brightness value at the local maximum, and thus the brightness values in the spatial region are made uniform. The optimal segmentation threshold value is therefore obtained by using the mean shift algorithm: it is the valley point on the boundary between the uniform regions, or equivalently in the Gaussian histogram approximation. Equation (13) expresses the mean shift algorithm.
Equation (14) describes the transformation of the current pixel brightness value to the local maximum brightness value via the mean shift algorithm.
Here, x denotes the current pixel brightness value and k denotes the weight variable. Thus, P_{M}(X') transforms the current pixel brightness value to the local maximum brightness value in a given region via Equation (13). Equation (15) gives the optimal threshold value for segmenting the background region and the skin region via Equations (13) and (14). The maximum value is used because the skin region is the brightest region in the skin map.
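The mode seeking and threshold selection described in this subsection can be illustrated with a one-dimensional sketch. It assumes a Gaussian kernel with a fixed bandwidth and treats the smoothed histogram as the density; the paper's exact kernel and weight definitions are not reproduced in the text, so the functions mean_shift_modes and skin_threshold and all parameter values are hypothetical.

```python
import numpy as np

def mean_shift_modes(density, bandwidth=8.0, max_iter=200, tol=1e-3):
    """Map every brightness level to the mode it climbs to under mean shift."""
    density = np.asarray(density, dtype=np.float64)
    levels = np.arange(len(density), dtype=np.float64)
    modes = levels.copy()
    for i in range(len(levels)):
        x = levels[i]
        for _ in range(max_iter):
            # Kernel weights around x, multiplied by the histogram density.
            w = density * np.exp(-0.5 * ((levels - x) / bandwidth) ** 2)
            total = w.sum()
            if total == 0.0:
                break
            new_x = (w * levels).sum() / total   # weighted mean = shifted position
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        modes[i] = x
    return modes

def skin_threshold(modes, merge_tol=1.0):
    """Lowest brightness level that converges to the brightest mode."""
    brightest = modes.max()
    return int(np.argmax(modes >= brightest - merge_tol))

# Synthetic bimodal density: a dark background peak and a bright skin peak.
levels = np.arange(256)
density = (np.exp(-0.5 * ((levels - 50) / 10.0) ** 2)
           + np.exp(-0.5 * ((levels - 201) / 10.0) ** 2))
modes = mean_shift_modes(density)
threshold = skin_threshold(modes)
```

In this sketch the threshold falls in the valley between the two peaks without any manually chosen value, which is the behavior the text attributes to the proposed method; the bandwidth remains a free parameter.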
Equation (16) is the equation for region growing.
Here, I_{R} indicates the skin region. The proposed technique is realized as follows.
step 1. The RGB input image is transformed to a YC_{b}C_{r} image.
step 2. From the analysis of standard skin color, skin color similarity is calculated using the following formula and a skin map is generated.
Here, μ_{CbCr} denotes the mean, ∑_{CbCr} denotes the covariance, and the pixel color vector is expressed as the C_{b}C_{r} component of I_{CbCr}.
step 3. After I_{S} is quantized, a skin map histogram H_{S} is obtained, and is approximated by a Gaussian function using the Bezier curve of de Casteljau's algorithm [27].
step 4. The mean shift algorithm is used to find Gaussian local maxima in regions having similar brightness distributions. The brightness values of pixels in the relevant region are made uniform with the local maximum.
step 5. After the brightness values of the segmented regions are investigated, the regions having the maximum brightness value are detected as skin regions via region growing.
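The region growing used in the final step can be sketched as a breadth-first flood fill. This is a generic 4-connected variant, assumed here for illustration; the paper's own region growing equation is not reproduced in the text, and the function name, seed choice, and threshold are hypothetical.

```python
from collections import deque

import numpy as np

def region_grow(image, seeds, threshold):
    """Grow 4-connected regions from seeds into pixels with value >= threshold."""
    image = np.asarray(image)
    height, width = image.shape
    mask = np.zeros((height, width), dtype=bool)
    queue = deque()
    for r, c in seeds:
        if image[r, c] >= threshold and not mask[r, c]:
            mask[r, c] = True
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not mask[nr, nc] and image[nr, nc] >= threshold:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask

# A bright 2 x 2 block and one isolated bright pixel that is not connected to it.
image = np.zeros((5, 5), dtype=np.uint8)
image[0:2, 0:2] = 250
image[4, 4] = 250
mask = region_grow(image, seeds=[(0, 0)], threshold=200)
```

Seeding from the pixels that carry the maximum mode brightness, for example via np.argwhere(image == image.max()), would follow the description in step 5.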
3. RESULTS
In the experiment, RGB color images 320 × 240 in size were captured with an ordinary digital camera. Figure 3 shows the process of skin region detection via the proposed method.
Figure 3(a) shows the input image, and Figure 3(b) shows the normalization of skin similarity to brightness values of 0 to 255 by applying Equations (6) and (7) to the input image. In Figure 3(b), the face is brighter than the background because the face of the input image was accurately identified as the skin region by Equations (6) and (7). Figure 3(c) shows the brightness data of Figure 3(b) converted to a histogram, and Figure 3(d) shows the continuous approximation of the histogram via de Casteljau's algorithm. Figure 3(e) illustrates the result of skin region detection via the proposed method. Figure 4 compares the proposed method to the existing method [16] using the skin color model. The same skin color model was used in this study for the objectivity of verification.
As Figure 4 indicates, the proposed method accurately detected the skin regions via the mean shift procedure without a user-supplied threshold value. Also, the figure shows that compared to the existing method, the proposed method can accurately segment the skin and lip regions.
Figure 5 shows the results of skin region detection using the existing method [25,26] which establishes an appropriate threshold value based on ambient illumination, and the proposed method, in which segmentation points are determined via the mean shift algorithm.
Images used in skin detection experiments are generally captured in an indoor environment under fluorescent light, where the skin color contamination by illumination is insignificant. In this study, strong illumination was deliberately projected onto part of the left side of the human face to investigate the performance of the proposed method. The same skin color model was used with both the existing and the proposed method for the sake of performance objectivity.
Figure 5(a) shows input images in which illumination was projected onto the faces from a certain direction. Figure 5(b) shows skin detection results using the existing method, and Figure 5(c) shows the results using the proposed method. In this experiment, the threshold values for the existing method were selected from the optimum skin detection values determined by experiment. As Figure 5 indicates, the proposed method detected skin regions more effectively than the existing method, even though the skin color was changed by the illumination from certain directions. The existing technique detected the region by calculating skin similarity and applying a threshold value at each pixel. On the other hand, the proposed method applies the mean shift algorithm to the skin map histogram to find Gaussian local maxima in regions having similar brightness distributions, and assigns uniform brightness values to pixels in the relevant regions.
To evaluate the performance of the proposed method, we used Precision, Recall, and Accuracy, which are defined by the following equations.
Recall is defined as the ratio between the number of skin pixels correctly classified by the proposed method and the total number of actual skin pixels. Accuracy indicates how closely the skin region detected by the proposed method matches the real skin region. In Equations (18) and (19), N(S_{M}) and N(S_{A}) denote the number of pixels in the real skin region and in the detected skin region, respectively. N(S_{M}∩S_{A}) denotes the number of matched pixels between the real skin region and the detected skin region. N(S_{U}) denotes the number of pixels in the real skin region that are not detected, and N(S_{O}) denotes the number of pixels detected as skin in the non-skin region.
Table 1 shows the performance of the proposed method; the x axis indicates the input images and the y axis indicates the performance percentage. Recall was 95.8% and accuracy was 97.8%. Recall is lower than accuracy because the skin color is contaminated by changes in illumination. These results indicate that the proposed method is a robust way to detect the skin region.
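The pixel counts defined above can be computed from two binary masks, as in the sketch below. Recall and precision follow the usual definitions and agree with the description of Recall in the text. The accuracy line uses the standard pixel accuracy (fraction of pixels labeled correctly), which is an assumption: the paper's own accuracy equation is not reproduced here and may differ. The function name evaluate is hypothetical.

```python
import numpy as np

def evaluate(detected, truth):
    """Compare a detected skin mask with a ground-truth skin mask."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    matched = np.count_nonzero(detected & truth)     # N(S_M ∩ S_A)
    missed = np.count_nonzero(~detected & truth)     # N(S_U): skin not detected
    over = np.count_nonzero(detected & ~truth)       # N(S_O): non-skin detected
    n_truth = np.count_nonzero(truth)                # N(S_M)
    n_detected = np.count_nonzero(detected)          # N(S_A)
    recall = matched / n_truth if n_truth else 0.0
    precision = matched / n_detected if n_detected else 0.0
    # Assumed definition: share of all pixels that are labeled correctly.
    accuracy = 1.0 - (missed + over) / truth.size
    return {"precision": precision, "recall": recall, "accuracy": accuracy}

truth = np.array([[1, 1], [0, 0]])
detected = np.array([[1, 0], [1, 0]])
scores = evaluate(detected, truth)
```

Reporting the missed and over-detected counts separately makes it easier to see whether errors come from skin lost to illumination or from background accepted as skin.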
4. CONCLUSION
This study introduces a method of skin detection by applying the mean shift algorithm to histogram data. In the existing methods using standard skin color models, skin similarity threshold values for segmenting the background region and the skin region are determined by repeated experimentation. A weakness of these techniques is that the threshold values vary according to illumination and environment. Also, established threshold values cannot be standardized objectively, and include subjective factors, determined by individual users.
In the proposed method, a skin map histogram of an input image is created by using standard skin color characteristics of the C_{b}C_{r} color space. The accumulated data at each brightness level are analyzed via the mean shift algorithm, and the skin region is detected by finding the regional segmentation points. Even when the skin color is contaminated by illumination, this procedure can accurately segment the skin region and the background region. The proposed method may be useful in detecting facial regions as a preprocessing step for face recognition under various types of illumination.

[Fig. 1.] Block diagram of proposed skin region detection method.

[Fig. 2.] Analysis of skin color in the CbCr space.

[Fig. 3.] Skin detection process using the proposed method: (a) input image, (b) skin map of (a), (c) histogram of (b), (d) histogram of (c) smoothed by de Casteljau's algorithm, (e) skin region detection of (d) by mean shift algorithm.

[Fig. 4.] Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.

[Fig. 5.] Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.

[Table 1.] Performance evaluation of skin color detection.