Skin Region Detection Using a Mean Shift Algorithm Based on the Histogram Approximation

  • ABSTRACT

    Conventional skin detection methods rely on skin color definitions based on prior knowledge, and the threshold value that separates the background from the skin region is determined subjectively through experimentation. A drawback of such techniques is that their performance depends on a threshold value estimated from repeated experiments. To overcome this, the present paper introduces a skin region detection method that uses a histogram approximation based on the mean shift algorithm. The proposed method applies the mean shift procedure to a histogram of a skin map of the input image. The skin map is generated by comparing the input image with standard skin colors in the CbCr color space, and the background is separated from the skin region by selecting the maximum value according to the brightness level. Because the histogram, which is accumulated from the brightness values of the pixels, has the form of a discontinuous function, it is approximated by a Gaussian mixture model (GMM) using the Bezier curve technique. The proposed method then detects the skin region by using the mean shift procedure to determine a maximum value, which becomes the dividing point in place of the manually selected threshold value used in existing techniques. Experiments confirm that the new procedure effectively detects the skin region.


  • KEYWORDS

    Skin region detection, Mean shift, Histogram approximation

  • 1. INTRODUCTION

    Skin detection plays an important role in various areas of human-computer interaction, such as face detection, face tracking, content-based image data search systems, and gesture analysis. Recently, skin detection methods based on skin color data have attracted considerable attention because they are computationally efficient and cope well with rotation, changes in size, and partial obstruction of the relevant region. Skin color is used to complement geometric data in designing accurate face detection systems [1-4]. Skin detection via skin color is a significant preprocessing tool in face detection and recognition.

    Research on skin detection is generally conducted using visible spectrum images. However, skin detection in visible spectrum images is limited by ambient illumination, camera characteristics, ethnicity, and personal characteristics. Procedures using non-visual spectrum images, such as infrared images, have been considered as a means of resolving such issues, but these procedures require prohibitively expensive hardware devices or extremely restricted environments [5-9].

    As a means of classification, skin detection distinguishes two categories: the skin region and the non-skin region. For skin detection based on color data, an efficient classification procedure requires (1) the selection of an appropriate color model, (2) the selection of a suitable model distribution for skin and non-skin pixels, and (3) the consideration of the actual distribution being modeled. Selection of an appropriate color model determines the efficiency with which a given skin color distribution can be modeled. The skin color distribution is usually modeled by a histogram or a Gaussian distribution. Various techniques for obtaining the actual distribution have been researched, from the simple use of a skin color lookup table to complex pattern recognition. In existing research on skin color detection, RGB images are transformed to a color space in which intensity and color are separated, so that the effects of external illumination can be excluded. TSL, NCC, HSV, and YCbCr are the color spaces that are usually considered. Color space transformations are either linear or nonlinear transformations of RGB. Linear transformations include YIQ, YUV, and YCbCr. Nonlinear transformations include NCC, HSV, and HSL [10-13]. Recent research efforts have generally used techniques that transform higher-dimensional color spaces into lower-dimensional color spaces to save computational time [14,15]. When skin detection is conducted using predefined skin color data, the skin similarity threshold value that divides the background region from the skin region is determined from repeated experimentation [16]. Such methods are limited in that the threshold values vary according to the experimental environment and skin color data. Also, these threshold values are not standardized objectively, and are partly based on the subjective judgment of the user. To overcome the weaknesses in the existing procedures, this study introduces a technique of skin color detection using histogram data based on the mean shift algorithm in a lower-dimensional color space. Unlike the existing procedures, this technique does not use experimentally determined threshold values. Instead, it uses the mean shift algorithm to find local maxima, which are used as segmentation points to detect the skin region.

    Methods of skin color detection can be classified into those based on physical characteristics and those based on statistical characteristics. The latter can be subdivided into parametric approaches [17-19] and nonparametric approaches [20-22]. Generally, methods based on statistical characteristics detect the skin region in a lower-dimensional color space to reduce the effects of illumination. The skin color distributions used in parametric statistical approaches tend to follow a Gaussian mixture model (rather than a simple Gaussian model), and thus methods of skin detection based on these approaches usually employ a Gaussian mixture model [18,23]. Selection of model dimensions is one of the important problems that arise in such procedures. The selection is usually accomplished inductively, based on a priori environmental data, and the parameters of the skin color distribution are determined variously on the basis of ethnicity and illumination. Recent studies have made frequent use of the Expectation-Maximization (EM) algorithm as an applied technique for estimating the parameters of the Gaussian mixture model [24]. The EM algorithm uses a probability density function, determining the parameters of a skin color distribution by inductive estimation, based on ambient illumination and ethnicity. Nonparametric approaches use a model that expresses the general shape of the skin color distribution more easily than parametric approaches [21,22]. Such techniques usually employ a histogram to represent the characteristics of the skin color distribution in color space. One major advantage of using a histogram is that the probability density function can be calculated quantitatively from a quantization level figure, even when the skin color distribution is complex.

    Unlike the statistical approach, skin detection based on physical characteristics uses a physical model of inherent skin color. Such physical models are often employed in research methods to detect skin regions because they neglect background-based changes in illumination, and use permanent skin color characteristics.

    2. METHOD OF SKIN REGION DETECTION

    Figure 1 shows the overall block diagram of the proposed skin region detection method.

    The proposed technique includes color transformation, skin map histogram generation, histogram approximation, and skin region detection via a mean shift. To accomplish the color transformation, input images are transformed to the YCbCr color space using a color transformation formula in RGB color space. The skin map histogram is generated by expressing the skin region distribution in terms of brightness values. This is calculated from a standard skin color table and similarities established in advance by using the skin color characteristics of the CbCr color space. Histogram approximation is carried out by regarding the histogram as a discontinuous function, which is approximated by a continuous Gaussian function using the Bezier curve theorem. Finally, the mean shift algorithm is applied to the histogram to find the Gaussian local maxima in certain regions having similar brightness distributions. The brightness values of the pixels in the relevant regions are made uniform with these local maxima, and the regions having the maximum brightness value are detected as skin regions via region growing.

       2.1 Skin color analysis

    The skin region occupies certain parts of a color space, and this characteristic enables skin color to be divided from other background colors. The skin region distribution varies according to color space, and thus the choice of color space affects detection performance [16]. The YCbCr color space includes Y, which indicates the brightness value, and Cb and Cr, which represent color differences. Except for Y, the pixels of CbCr color space contain only color data, and are less affected by illumination. Thus, the skin color region in CbCr color space is effective in various illumination environments, as brightness has less effect on color values than in other color spaces. The first step in skin color region detection is to define the skin color region. This definition is accomplished by using the existing skin color region images made from images of various people's faces in varying ambient illumination. The effects of illumination should be considered in detecting skin color regions. In this study, lower-dimensional CbCr color space is used to minimize the effects of illumination [14,15]. Figure 2 shows the distribution regions of the standard skin color table in CbCr color space.
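
    The color transformation described above can be sketched as follows. The paper does not state which RGB-to-YCbCr formula it uses, so the full-range ITU-R BT.601 (JFIF) weights below are an assumption, and the function names `rgb_to_ycbcr` and `chroma_plane` are hypothetical. The sketch only illustrates that a change in brightness leaves the (Cb, Cr) pair of a neutral pixel unchanged.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: BT.601 / JFIF weights; the paper's exact formula is not given.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def chroma_plane(rgb_image):
    """Drop Y and keep only (Cb, Cr) for every pixel of a nested-list RGB image."""
    return [[rgb_to_ycbcr(*pixel)[1:] for pixel in row] for row in rgb_image]


if __name__ == "__main__":
    # A dark and a bright gray pixel differ in Y but share the same chroma.
    print(rgb_to_ycbcr(50, 50, 50))
    print(rgb_to_ycbcr(200, 200, 200))
```

    Working on the output of `chroma_plane` rather than on the full YCbCr triple corresponds to the lower-dimensional CbCr representation used in this study.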

    In the images used in this study, the Cb skin color values were distributed primarily between 102 and 118, while the Cr color values were between 137 and 152. A standard skin color table was constructed on the basis of 100 Korean male and female adults illuminated by fluorescent lights in ordinary buildings. According to the results of research on face detection, skin color distribution is similar in form to a Gaussian distribution.

    Thus, skin color distribution can be expressed as a 2D Gaussian function G(μ CbCr, ∑ CbCr).

    [Equations (1)-(5): images omitted]

    Here, Cb and Cr denote pixel color values, μ Cb and μ Cr denote Gaussian mean color values, and ∑ CbCr denotes a 2D Gaussian covariance matrix.
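
    Since the original equations are not available here, the following is only the standard textbook form of a 2D Gaussian skin color model, not necessarily the authors' exact notation. The symbols c_i (the CbCr vector of the i-th training skin pixel) and n (the number of training skin pixels) are assumptions introduced for illustration.

```latex
% Sample mean of the training skin pixels in CbCr space
\mu_{CbCr} = \begin{pmatrix} \mu_{Cb} \\ \mu_{Cr} \end{pmatrix}
           = \frac{1}{n} \sum_{i=1}^{n} c_i ,
\qquad c_i = \begin{pmatrix} Cb_i \\ Cr_i \end{pmatrix}

% Sample covariance (a 2 x 2 matrix)
\Sigma_{CbCr} = \frac{1}{n} \sum_{i=1}^{n}
    (c_i - \mu_{CbCr})(c_i - \mu_{CbCr})^{T}

% 2D Gaussian density for a pixel with chroma vector c
G(c;\, \mu_{CbCr}, \Sigma_{CbCr}) =
    \frac{1}{2\pi\,\lvert \Sigma_{CbCr} \rvert^{1/2}}
    \exp\!\left( -\tfrac{1}{2}
        (c - \mu_{CbCr})^{T}\, \Sigma_{CbCr}^{-1}\, (c - \mu_{CbCr}) \right)
```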

       2.2 Skin-map generation

    The skin map used in this study is calculated from the skin region similarity between a standard skin color table and input images, and then normalized to brightness values between 0 and 255. The skin map is generated by applying the Mahalanobis distance to the Gaussian mean and covariance of predefined skin color images.

    [Equation (6): image omitted]

    Equation (6) is the formula for calculating the Mahalanobis distance. It uses the inverse of the 2D Gaussian covariance matrix ∑ CbCr, and N is the total number of pixels in the input image. The values obtained from Equation (6) indicate the degree of similarity to the skin region, but they must be normalized to the range 0 to 255 to be expressed as image intensities. Equation (7) provides the formula for normalizing the values obtained from Equation (6). It produces brightness values that are close to 255 when the similarity to the skin region is high.

    [Equation (7): image omitted]
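
    A minimal sketch of the skin-map computation, under stated assumptions: the Mahalanobis distance is evaluated per pixel from a given mean and 2 × 2 covariance, and the normalization maps the smallest distance to 255 and the largest distance in the image to 0. The paper's actual Equation (7) is not available, so this normalization is only one plausible choice, and the mean and covariance in the usage example are illustrative values, not the paper's measured statistics.

```python
import math


def mahalanobis_sq(cb, cr, mean, cov):
    """Squared Mahalanobis distance of (cb, cr) from a 2D Gaussian (mean, cov)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = cb - mean[0], cr - mean[1]
    # x^T * inverse(cov) * x, with the 2x2 inverse written out in closed form.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def skin_map(chroma, mean, cov):
    """Map every (Cb, Cr) pixel to a 0..255 similarity value (255 = most skin-like).

    Assumption: linear normalization by the largest distance in the image.
    """
    dist = [[math.sqrt(mahalanobis_sq(cb, cr, mean, cov)) for cb, cr in row]
            for row in chroma]
    d_max = max(max(row) for row in dist)
    if d_max == 0:
        return [[255 for _ in row] for row in dist]
    return [[int(round(255 * (1 - d / d_max))) for d in row] for row in dist]


if __name__ == "__main__":
    mean = (110.0, 144.5)                 # illustrative, not from the paper
    cov = ((16.0, 0.0), (0.0, 14.0))      # illustrative, not from the paper
    chroma = [[(110.0, 144.5), (114.0, 146.0), (128.0, 128.0)]]
    print(skin_map(chroma, mean, cov))
```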

       2.3 Skin region detection using histogram approximation

    The proposed technique uses skin map histogram approximation to efficiently detect skin regions in environments with varying or complex illumination. The procedure is carried out in three steps. First, the skin map histogram is regarded as a discontinuous function to be approximated by a continuous Gaussian function, using the Bezier curve theorem. In the second step, the mean shift algorithm is used to find Gaussian local maxima in certain regions having similar brightness distributions, and the brightness values of pixels in the relevant regions are set to the values of the local maxima. In the third step, the uniform brightness value of each region is examined, and the region with the highest brightness value is detected via region growing.

    2.3.1 Histogram approximation using Bezier curve

    Generally, a skin map histogram has the form of a discontinuous function determined by the accumulated brightness values of the pixels. In the proposed method, this histogram is approximated by a continuous Bezier curve, using the brightness value of each level of the histogram as a Bezier control point. Equation (8) is used to obtain a histogram from a skin map.

    [Equation (8): image omitted]

    Here, N indicates the size of the skin map.

    Equations (9) and (10) are Bernstein function equations for the Bezier curve [18].

    [Equations (9) and (10): images omitted]

    Here, Pi is a control point for generating the Bezier curve, and u denotes a variable for controlling distance (smaller values of u correspond to shorter distances along the curve). The histogram is approximated by a Gaussian-like curve whose Bezier control points are given by h(level, value), which denotes the frequency of a given brightness level of the histogram. The degree of the Bezier curve is determined by the number of control points; with 256 brightness levels, the Bernstein formula has 256 terms of degree 255. Computational errors cause such high-degree Bernstein functions to generate unstable Bezier curves, and thus this study repeatedly applies the first-degree (linear) De Casteljau algorithm instead of evaluating the Bernstein functions directly [27]. Equation (11) is the Bezier curve formula using the De Casteljau algorithm.

    [Equation (11): image omitted]

    The control point PS(x) indicates the frequency at each brightness level of the histogram, and t is a distance control variable calculated via Equation (12) (smaller values of t correspond to shorter curves).

    [Equation (12): image omitted]
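
    The histogram approximation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the histogram counts are used directly as Bezier control values, the curve is evaluated by repeated linear interpolation (the De Casteljau recursion), and the parameter is assumed to be t = x / (levels - 1) for brightness level x, since the paper's Equation (12) is not available. The function names are hypothetical.

```python
def histogram(skin_map, levels=256):
    """Count how many pixels of the skin map fall on each brightness level."""
    hist = [0] * levels
    for row in skin_map:
        for value in row:
            hist[value] += 1
    return hist


def de_casteljau(control, t):
    """Evaluate the Bezier curve with the given control values at t in [0, 1]."""
    work = [float(v) for v in control]
    n = len(work)
    for r in range(1, n):
        # Each pass replaces neighbours by their linear interpolation at t.
        for i in range(n - r):
            work[i] = (1 - t) * work[i] + t * work[i + 1]
    return work[0]


def smooth_histogram(hist):
    """Sample the Bezier curve once per brightness level.

    Assumption: level x corresponds to parameter t = x / (len(hist) - 1).
    """
    n = len(hist) - 1
    if n <= 0:
        return [float(v) for v in hist]
    return [de_casteljau(hist, x / n) for x in range(len(hist))]


if __name__ == "__main__":
    small = [0, 9, 1, 0, 0, 2, 8, 3]      # toy 8-level histogram
    print([round(v, 2) for v in smooth_histogram(small)])
```

    The recursion only ever forms convex combinations of neighbouring values, which is why it avoids the large binomial coefficients that make a direct degree-255 Bernstein evaluation numerically fragile.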

    2.3.2 Establishment of threshold value using mean shift algorithm

    In the mean shift algorithm, the mode of the probability density function is found by hill climbing. The probability density function indicates the brightness distribution of pixels in the intensity image. The algorithm converges on a local maximum point within the kernel by repeatedly calculating the mean location and mean brightness value of pixels having a similar brightness distribution in a neighborhood of the given pixel. In other words, the pixel value at the current location is transformed to the brightness value at the local maximum, and thus the brightness values in the spatial region are made uniform. The optimal segmentation threshold value is therefore obtained by using the mean shift algorithm: it is the valley point on the boundary between the uniform regions, or in the Gaussian histogram approximation. Equation (13) expresses the mean shift algorithm.

    [Equation (13): image omitted]

    Equation (14) describes the transformation of the current pixel brightness value to the local maximum brightness value via the mean shift algorithm.

    [Equation (14): image omitted]

    Here, x denotes the current pixel brightness value and k denotes the weight variable. Thus, PM(X') transforms the current pixel brightness value to the local maximum brightness value in a given region via Equation (13). Equation (15) gives the optimal threshold value for segmenting the background region and the skin region via Equations (13) and (14).

    [Equation (15): image omitted]

    The maximum value is used because the skin region is the brightest region in the skin map.
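
    The mode seeking described above can be sketched in one dimension on the smoothed histogram. The kernel, bandwidth, and stopping rule of the paper are not available, so the Gaussian kernel, the `bandwidth` value, the tolerance, and the `merge` parameter below are all assumptions, and `mean_shift_mode` and `skin_threshold` are hypothetical names. The threshold is taken as the lowest brightness level whose mean shift trajectory ends at the brightest mode, which is one reading of "the maximum value is used".

```python
import math


def mean_shift_mode(density, start, bandwidth=8.0, max_iter=200, tol=1e-3):
    """Climb from brightness level `start` to a local maximum of `density`.

    Assumption: Gaussian kernel weights multiplied by the histogram density.
    """
    x = float(start)
    for _ in range(max_iter):
        weights = [d * math.exp(-0.5 * ((i - x) / bandwidth) ** 2)
                   for i, d in enumerate(density)]
        total = sum(weights)
        if total == 0:
            return x
        new_x = sum(i * w for i, w in enumerate(weights)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def skin_threshold(density, bandwidth=8.0, merge=1.0):
    """Lowest level that converges to the brightest mode (within `merge`)."""
    modes = [mean_shift_mode(density, level, bandwidth)
             for level in range(len(density))]
    top = max(modes)
    return min(level for level, m in enumerate(modes) if top - m <= merge)


if __name__ == "__main__":
    # Synthetic two-peak histogram: background near 30, skin near 100.
    density = [100 * math.exp(-0.5 * ((i - 30) / 6.0) ** 2)
               + 50 * math.exp(-0.5 * ((i - 100) / 6.0) ** 2)
               for i in range(128)]
    print(skin_threshold(density, bandwidth=5.0))
```

    In this sketch no threshold is supplied by the user; the dividing level falls out of where the trajectories toward the two modes separate, which lies in the valley between the peaks.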

    Equation (16) is the equation for region growing.

    [Equation (16): image omitted]

    Here, IR indicates the skin region. The proposed technique is realized as follows.

    step 1. The RGB input image is transformed to a YCbCr image.

    step 2. From the analysis of standard skin color, skin color similarity is calculated using the following formula, and a skin map is generated.

    [Equation: image omitted]

    Here, μ CbCr denotes the mean, ∑ CbCr denotes the covariance, and [symbol: image omitted] is expressed as a CbCr component of ICbCr.

    step 3. After IS is quantized, a skin-map histogram HS is obtained, and is approximated by a Gaussian function using the Bezier curve of De Casteljau's algorithm [27].

    step 4. The mean shift algorithm is used to find Gaussian local maxima in certain regions having similar brightness distributions. The brightness values of pixels in the relevant region are made uniform with the local maximum.

    step 5. After the brightness values of the segmented regions are investigated, the regions having the maximum brightness value are detected as skin regions via region growing.
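
    The region growing of step 5 can be sketched as a breadth-first flood fill. The paper's Equation (16) is not available, so the 4-connected neighbourhood and the "value >= threshold" membership test are assumptions, and `grow_regions` is a hypothetical name.

```python
from collections import deque


def grow_regions(skin_map, threshold):
    """Label 4-connected regions of pixels whose value is >= threshold.

    Returns (labels, count); label 0 marks background pixels.
    """
    height = len(skin_map)
    width = len(skin_map[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for sy in range(height):
        for sx in range(width):
            if skin_map[sy][sx] < threshold or labels[sy][sx]:
                continue
            # Start a new region from this unlabelled seed pixel.
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and skin_map[ny][nx] >= threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    toy = [[200, 200, 0],
           [0,   0,   0],
           [0,   0, 220]]
    print(grow_regions(toy, threshold=131))
```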

    3. RESULTS

    In the experiment, RGB color images 320 × 240 in size were captured with an ordinary digital camera. Figure 3 shows the process of skin region detection via the proposed method.

    Figure 3(a) shows the input image, and Figure 3(b) shows the normalization of skin similarity to brightness values of 0 to 255 by applying Equations (6) and (7) to the input image. In Figure 3(b), the face is brighter than the background because the face of the input image was accurately identified as the skin region by Equations (6) and (7). Figure 3(c) shows the brightness data of Figure 3(b) converted to a histogram, and Figure 3(d) shows the continuous approximation of the histogram via De Casteljau's algorithm. Figure 3(e) illustrates the result of skin region detection via the proposed method. Figure 4 compares the proposed method to the existing method [16] using the skin color model. The same skin color model was used in this study for the objectivity of verification.

    As Figure 4 indicates, the proposed method accurately detected the skin regions via the mean shift procedure without a user-supplied threshold value. Also, the figure shows that, compared to the existing method, the proposed method can accurately segment the skin and lip regions.

    Figure 5 shows the results of skin region detection using the existing method [25,26], which establishes an appropriate threshold value based on ambient illumination, and the proposed method, in which segmentation points are determined via the mean shift algorithm.

    Images used in skin detection experiments are generally captured in an indoor environment under fluorescent light, where the skin color contamination by illumination is insignificant. In this study, strong illumination was deliberately projected onto part of the left side of the face to investigate the performance of the proposed method. The same skin color model was used with both the existing and the proposed method for the sake of an objective comparison.

    Figure 5(a) shows input images in which illumination was projected onto the faces from a certain direction. Figure 5(b) shows skin detection results using the existing method, and Figure 5(c) shows the results using the proposed method. In this experiment, the threshold values for the existing method were selected from the optimum skin detection values determined by experiment. As Figure 5 indicates, the proposed method detected skin regions more effectively than the existing method, even though the skin color was changed by the illumination in certain directions. The existing technique detected the region by calculating skin similarity and establishing a threshold value at each pixel. On the other hand, the proposed method applies the mean shift algorithm to the skin map histogram to find Gaussian local maxima in certain regions having similar brightness distributions, and assigns uniform brightness values to pixels in the relevant regions.

    To evaluate the performance of the proposed method, we used precision, recall, and accuracy, defined as follows.

    [Equations (18) and (19): images omitted]

    Recall is defined as the ratio of the number of skin pixels correctly classified by the proposed method to the total number of actual skin pixels. Accuracy indicates how much of the skin region detected by the proposed method matches the real skin region. In Equations (18) and (19), N(SM) and N(SA) denote the numbers of pixels in the real skin region and in the detected skin region, respectively. N(SM∩SA) denotes the number of pixels matched between the real skin region and the detected skin region. N(SU) denotes the number of pixels of the real skin region that were not detected, and N(SO) denotes the number of pixels detected as skin in the non-skin region.
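
    The pixel counts named above can be computed from two binary masks as in the sketch below. Because the original equations are not available, the two ratios are the standard recall and precision definitions; treating the paper's "accuracy" as the matched fraction of the detected region (that is, precision) is an assumption, and `detection_scores` is a hypothetical name.

```python
def detection_scores(truth, detected):
    """Compare a ground-truth skin mask with a detected skin mask.

    Both masks are nested lists of 0/1 (or False/True) with the same shape.
    """
    n_truth = n_detected = n_matched = 0
    for t_row, d_row in zip(truth, detected):
        for t, d in zip(t_row, d_row):
            n_truth += bool(t)               # N(SM): real skin pixels
            n_detected += bool(d)            # N(SA): detected skin pixels
            n_matched += bool(t and d)       # N(SM ∩ SA): matched pixels
    return {
        "missed": n_truth - n_matched,       # N(SU): real skin not detected
        "false_skin": n_detected - n_matched,  # N(SO): non-skin detected as skin
        "recall": n_matched / n_truth if n_truth else 0.0,
        "precision": n_matched / n_detected if n_detected else 0.0,
    }


if __name__ == "__main__":
    truth = [[1, 1, 0],
             [1, 1, 0]]
    detected = [[1, 1, 1],
                [1, 0, 0]]
    print(detection_scores(truth, detected))
```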

    Table 1 shows the performance of the proposed method; the x axis indicates the input images and the y axis indicates the performance as a percentage. Recall was 95.8% and accuracy was 97.8%. Recall is lower than accuracy because of the contamination of skin color by illumination changes. These results indicate that the proposed method is a robust method for detecting the skin region.

    4. CONCLUSION

    This study introduces a method of skin detection by applying the mean shift algorithm to histogram data. In the existing methods using standard skin color models, skin similarity threshold values for segmenting the background region and the skin region are determined by repeated experimentation. A weakness of these techniques is that the threshold values vary according to illumination and environment. Also, established threshold values cannot be standardized objectively, and include subjective factors, determined by individual users.

    In the proposed method, a skin map histogram of an input image is created by using standard skin color characteristics of the CbCr color space. The accumulated data at each brightness level are analyzed via the mean shift algorithm, and the skin region is detected by finding the regional segmentation points. Even when the skin color is contaminated by illumination, this procedure can accurately segment the skin region and the background region. The proposed method may be useful in detecting facial regions as a preprocessing step for face recognition in various types of illumination.

  • 1. Jiang Z, Wu Z, Yao M 2008 Skin Detection on Images with Color Deviation. [IEEE Trans Congress on Services, Part Ⅱ] P.171-174
  • 2. Kherchaoui S, Houacine A 2010 Face Detection Based on A Model of the Skin Color with Constraints and Template Matching. [Int'l Conf. Machine and Web Intell.] P.469-472
  • 3. Zhengming L, Tong Z, Jin Z 2010 Skin Detection in Color Images. [Int'l Conf. ICCET.] P.156-159
  • 4. Uongqiu T, Faling Y, Guohua C, Shizhong J 2010 Skin Color Detection by Illumination Estimation and Normalization in Shadow Regions. [IEEE. Conf. ICIA.] P.1082-1085
  • 5. Socolinsky D.A, Selinger A, Neuheisel J.D 2003 Face Recognition with Visible and Thermal Infrared Imagery. [Computer Vision Image Understanding] Vol.91 P.72-114
  • 6. Kong S.G, Heo J, Abidi B.R, Paik J, Abidi M.A 2005 Recent Advances in Visual and Infrared Face Recognition: A Review. [Computer Vision Image Understanding] Vol.97 P.103-135
  • 7. Nunez A. S, Mendenhall M. J 2008 Detection of Human Skin in Near Infrared Hyperspectral Imagery. [IEEE. Int'l IGARSS.] Vol.2 P.621-624
  • 8. Liensberger C, Stottinger J, Kampel M 2009 Color-Based and Context-Aware Skin Detection for Online Video Annotation. [IEEE. Trans. Int'l MMSP] P.1-6
  • 9. Pan Z, Healey G, Prasad M, Tromberg B 2003 Face Recognition in Hyperspectral Images. [IEEE Trans. Pattern Anal. Mach. Intell] Vol.25 P.1552-1559
  • 10. Hjelm E, Low B.K 2001 Face Detection: A Survey. [Computer Vision and Image Understanding] Vol.83 P.236-274
  • 11. Niazi M, Jafar S 2010 Hybrid Face Detection with HSV Color method and HAAR Classifier. [Int'l Conf. Software Technology and Engineering] P.325-329
  • 12. Popov A, Dimitrova D 2008 A New Approach for Finding Face Features in Color Images. [IEEE. Int'l. Intelligent Systems] P.33-37
  • 13. Adachi Y, Imai A, Ozaki M, Ishii N 2000 Extraction of face region by using characteristics of color space and detection of face direction through an eigenspace. [Int'l Conf. Knowledge-Based Intelligent Engineering Systems and Allied Technologies] P.393-396
  • 14. Xinyu W, Huosheng X, Heng W, Heng L 2008 Robust Real-Time Face Detection with Skin Color Detection and The Modified Census Transform. [Int'l Conf. ICIA.] P.590-595
  • 15. Sebastian P, Vooi V 2007 Tracking using Normalized Cross Correlation and Color Space. [Int'l Conf. Intelligent and Advanced Systems] P.770-774
  • 16. Hsu R. L, Abdel-Mottaleb M, Jain A. K 2002 Face Detection in Color Images. [IEEE Trans. on PAMI] Vol.24 P.696-706
  • 17. Darrell T, Gordon G. G, Harville M, Woodfill J 1998 Integrated Person Tracking Using Stereo Color and Pattern Detection. [Proc. IEEE Conf. CVPR] P.601-607
  • 18. Zhu X, Yang J, Waibel A 2000 Segmenting Hands of Arbitrary Color. [in Proc. Int'l Conf. Automatic Face and Gesture Recognition] P.446-453
  • 19. Yang M. H, Ahuja N 1999 Gaussian Mixture Model for Human Skin Color and Its Application in Image and Video Databases. [in Proc. SPIE Conf. Storage and Retrieval for Image and Video Databases] P.458-466
  • 20. Saxe D, Foulds R 1996 Toward Robust Skin Identification in Video Image. [in Proc. Int'l Conf. Automatic Face and Gesture Recognition] P.379-384
  • 21. Schwerdt K, Crowley J. L 2000 Robust Face Tracking Using Color. [in Proc. Int'l Conf. Automatic Face and Gesture Recognition] P.90-95
  • 22. Soriano M, Martinkauppi B, Huovinen S, Laaksonen M 2000 Skin Detection in Video under Changing Illumination Conditions. [in Proc. Int'l Conf. Pattern Recognition] Vol.1 P.839-842
  • 23. Pal A 2008 Multicues Face Detection in Complex Background for Frontal Faces. [Int'l. Machine Vision and Image Processing Conf.] P.57-62
  • 24. Diplaros A, Gevers T, Vlassis N 2004 Skin Detection using The EM Algorithm with Spatial Constraints. [IEEE. Int'l. Conf. Systems, Man and Cybernetics] Vol.4 P.3071-3075
  • 25. Ukil Y, Minsung K, Kar-Ann T, Kwanghoon S 2010 An Illumination Invariant Skin-Color Model for Face Detection. [IEEE. Int'l Conf. Biometrics: Theory Applications and Systems] P.1-6
  • 26. D. Hyun-Chul, Y. Ju-Yeon, C. Sung-Il 2007 Skin Color Detection through Estimation and Conversion of Illuminant Color under Various Illuminations. [IEEE. Trans. Consumer Electronics] P.1103-1108
  • 27. Ding R, Zhang Y 2003 The Extension of The Dual De Casteljau Algorithm. [Int'l Conf. on PDCAT] P.688-692
  • [Fig. 1.] Block diagram of proposed skin region detection method.
  • [Fig. 2.] Analysis of skin color in the CbCr space.
  • [Fig. 3.] Skin detection process using the proposed method: (a) input image, (b) skin-map of (a), (c) histogram of (b), (d) histogram of (c) smoothed by De Casteljau's algorithm, (e) skin region detection of (d) by mean shift algorithm.
  • [Fig. 4.] Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.
  • [Fig. 5.] Comparison of the existing and proposed methods: (a) input images, (b) results obtained by the existing method, (c) results obtained by the proposed method.
  • [Table 1.] Performance evaluation of skin color detection.