Identification and Determination of Oil Pollutants Based on 3D Fluorescence Spectrum Combined with Self-weighted Alternating Trilinear Decomposition Algorithm
 Author: Cheng Pengfei, Wang Yutian, Chen Zhikun, Yang Zhe
 Publish: Current Optics and Photonics Volume 20, Issue1, p204~211, 25 Feb 2016

ABSTRACT
Oil pollution seriously endangers the biological environment and human health. Because oils are diverse and their composition is complex, identifying oil contaminants is of great significance. The 3D fluorescence spectrum combined with a second-order correction algorithm was adopted to measure oil mixtures with overlapping fluorescence spectra. Self-weighted alternating trilinear decomposition (SWATLD) is a second-order correction method that has developed rapidly in recent years. Micellar solutions of #0 diesel, #93 gasoline and ordinary kerosene at different concentrations were prepared. The 3D fluorescence spectra of the mixed oil solutions were measured with an FLS920 fluorescence spectrometer, and the SWATLD algorithm was applied to decompose the spectral data. The predicted concentrations and recovery rates obtained in the experiment show that the SWATLD algorithm is insensitive to the component number and has high resolution for mixed oils.

KEYWORD
SWATLD, 3D fluorescence spectrum, Oil mixture, Component number

I. INTRODUCTION
Oil pollution seriously threatens human safety and the ecological environment, and measuring oil pollutants has become an urgent problem [1]. Fluorescence detection has the advantages of good selectivity and high precision [2-5]. D. Patra applied three-dimensional fluorescence spectrometry to identify diesel containing trace amounts of kerosene and gasoline, obtaining good results [6]. R. J. Kavanagh used synchronous fluorescence spectra to measure a variety of polycyclic aromatic hydrocarbons in water and successfully distinguished their types [7]. Polycyclic aromatic hydrocarbons usually fluoresce strongly [8-13], and the fluorescence spectra of oil pollutants are formed by the superposition of the fluorescence spectra of various aromatic hydrocarbons. It is relatively difficult to identify the type and determine the content of each component of an oil mixture using chemical methods [14, 15]. The parallel factor (PARAFAC) analysis method is an iterative three-dimensional matrix decomposition algorithm based on the alternating least squares principle. It can quantitatively measure specific components in the presence of interferents [16-18], but it gives accurate results only when the component number is estimated correctly [19, 20]. The self-weighted alternating trilinear decomposition (SWATLD) algorithm, developed in recent years, is insensitive to the component number, requires fewer iterations and has high stability [20-22]. Hu and Yin applied PARAFAC-alternating least squares (PARAFAC-ALS) and SWATLD to measure vancomycin and cephalexin in human plasma; the experimental results showed that the SWATLD algorithm performed well in determining drugs in a complex plasma matrix [23].
Zhang and Wang determined sulfonamides in synthetic samples and pharmaceutical tablets using SWATLD, showing that the SWATLD algorithm is easy to perform and solves second-order calibration problems quickly and accurately [24].
II. THEORIES
2.1. PARAFAC Model
The PARAFAC model states that a three-dimensional data matrix can be decomposed into three two-dimensional loading matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. The decomposition is shown in Fig. 1. In Fig. 1, $\mathbf{X}$ is a three-dimensional data matrix of size $I \times J \times K$. The core matrix is a three-dimensional matrix of size $F \times F \times F$; in this cubical matrix, the superdiagonal elements are 1 and all other elements are 0. The PARAFAC model formula is expressed as:
$$x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}, \qquad i = 1, \ldots, I;\; j = 1, \ldots, J;\; k = 1, \ldots, K \tag{1}$$
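The trilinear structure of the PARAFAC model can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions: the array sizes, the random loading matrices and the noise level are hypothetical and are not taken from the experiment, and the function name `parafac_reconstruct` is introduced here for convenience.

```python
import numpy as np

def parafac_reconstruct(A, B, C):
    """Return the I x J x K array whose entries are sum_f a_if * b_jf * c_kf."""
    return np.einsum("if,jf,kf->ijk", A, B, C)

# Hypothetical sizes and loadings, for illustration only
rng = np.random.default_rng(0)
I, J, K, F = 4, 6, 5, 2
A = rng.random((I, F))   # relative concentrations (samples x components)
B = rng.random((J, F))   # emission profiles (emission wavelengths x components)
C = rng.random((K, F))   # excitation profiles (excitation wavelengths x components)

X_model = parafac_reconstruct(A, B, C)
E = 0.01 * rng.standard_normal((I, J, K))   # synthetic residual term e_ijk
X = X_model + E                             # data = trilinear part + residual

# Sum of squared residuals between the data and the trilinear part
ssr = float(np.sum((X - X_model) ** 2))
```

Each frontal slice `X[:, :, k]` is then approximately `A @ np.diag(C[k]) @ B.T`, which is the property that alternating algorithms exploit.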
The PARAFAC model has a specific chemical meaning in three-dimensional fluorescence analysis. Scanning $I$ samples over $J$ emission wavelengths and $K$ excitation wavelengths produces $I$ excitation-emission matrices, which are stacked sequentially to form the three-dimensional matrix $\mathbf{X}$ in formula (1). Here, $x_{ijk}$ is the element of $\mathbf{X}$, standing for the fluorescence intensity of the $i$th sample at the $j$th emission wavelength and the $k$th excitation wavelength; $a_{if}$ is the element of $\mathbf{A}$, standing for the concentration of the $f$th factor in the $i$th sample; $b_{jf}$ is the element of $\mathbf{B}$, standing for the fluorescence intensity of the $f$th factor at the $j$th emission wavelength, so the matrix $\mathbf{B}$ can be used to estimate the emission spectrum; $c_{kf}$ is the element of $\mathbf{C}$, standing for the fluorescence intensity of the $f$th factor at the $k$th excitation wavelength, so the matrix $\mathbf{C}$ can be used to estimate the excitation spectrum; $F$ is the column number of the loading matrices, standing for the number of components, namely the number of principal components in the samples; $e_{ijk}$ is the element of the residual matrix $\mathbf{E}$, standing for the residual value caused by noise or other unexplained variations.
The PARAFAC analysis algorithm uses the alternating least squares (ALS) method to decompose the three-dimensional matrix $\mathbf{X}$, aiming to minimize the sum of squared residuals [25]:
$$\sigma = \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} \left( x_{ijk} - \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} \right)^{2} \tag{2}$$
where $\sigma$ is the sum of squared residuals and $F$ is the component number.
2.2. SWATLD Algorithm
The PARAFAC algorithm has been widely used in the analysis of three-dimensional fluorescence spectra, but it is sensitive to the component number, which must be estimated accurately: a good result is obtained only when the correct component number is selected. Many improved algorithms have therefore been developed, among which the SWATLD algorithm has the advantages of insensitivity to the component number, fewer iterations, fast operation and good stability.
The SWATLD algorithm is an improvement of the PARAFAC algorithm. It optimizes three objective functions that are closely related but not completely equivalent. The functions are as follows:
[Formulas (3)~(5)]
where $\|\cdot\|_F$ represents the Frobenius matrix norm and ./ denotes array division.
Iterating formulas (3)~(5) yields the following formulas:
[Formulas (6)~(8)]
First, initialize $\mathbf{A}$ and $\mathbf{B}$, then calculate $\mathbf{C}$, $\mathbf{B}$ and $\mathbf{A}$ according to formulas (6), (7) and (8), and repeat the iteration until convergence. The convergence criterion is:
[Formula for the convergence criterion]
The process of the SWATLD algorithm is shown in Fig. 2.
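The iteration described above can be sketched in NumPy. The update formulas are not reproduced in this copy of the paper, so the sketch uses the form in which the SWATLD updates are commonly stated in the chemometrics literature: each loading is the average of two estimates, each obtained as the diagonal of a pseudo-inverse product divided element-wise by the diagonal of a cross-product matrix. This form, the function names `swatld_sweep` and `swatld`, and the relative-change stopping rule are all assumptions of the sketch, not the authors' implementation.

```python
import numpy as np

def swatld_sweep(X, A, B):
    """One sweep over an I x J x K array X: update C, then B, then A."""
    def sq_norms(M):
        return np.sum(M * M, axis=0)              # diagonal of M^T M

    Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
    # C from the frontal slices X[:, :, k]
    C = 0.5 * (np.einsum("fi,ijk,jf->kf", Ap, X, B) / sq_norms(B)
               + np.einsum("fj,ijk,if->kf", Bp, X, A) / sq_norms(A))
    Cp = np.linalg.pinv(C)
    # B from the slices X[:, j, :], using the updated C
    B = 0.5 * (np.einsum("fk,ijk,if->jf", Cp, X, A) / sq_norms(A)
               + np.einsum("fi,ijk,kf->jf", Ap, X, C) / sq_norms(C))
    Bp = np.linalg.pinv(B)
    # A from the slices X[i, :, :], using the updated B and C
    A = 0.5 * (np.einsum("fj,ijk,kf->if", Bp, X, C) / sq_norms(C)
               + np.einsum("fk,ijk,jf->if", Cp, X, B) / sq_norms(B))
    return A, B, C

def swatld(X, F, n_iter=500, tol=1e-8, seed=0):
    """Repeat sweeps from random A and B until the residual sum of squares
    stops changing (assumed stopping rule) or n_iter is reached."""
    rng = np.random.default_rng(seed)
    I, J, _ = X.shape
    A, B = rng.random((I, F)), rng.random((J, F))
    prev = np.inf
    for _ in range(n_iter):
        A, B, C = swatld_sweep(X, A, B)
        ssr = np.sum((X - np.einsum("if,jf,kf->ijk", A, B, C)) ** 2)
        if abs(prev - ssr) <= tol * max(ssr, np.finfo(float).tiny):
            break
        prev = ssr
    return A, B, C

# Consistency check on hypothetical noise-free data: the true loadings
# are left unchanged by one sweep.
rng = np.random.default_rng(1)
A0, B0, C0 = rng.random((6, 2)), rng.random((8, 2)), rng.random((5, 2))
X0 = np.einsum("if,jf,kf->ijk", A0, B0, C0)
A1, B1, C1 = swatld_sweep(X0, A0, B0)
```

On exactly trilinear data the true loadings are a fixed point of the sweep, which is a quick way to check that an implementation is consistent with the trilinear model before applying it to measured spectra.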
III. EXPERIMENT AND DISCUSSION
3.1. Experiment Setting and Sample Preparation
An Edinburgh Instruments FLS920 full-function fluorescence spectrometer is used as the fluorescence detector. The excitation and emission slit width is 1.11 mm, the spectral resolution is 2 nm, the integration time is 0.1 s, the excitation wavelength range is 250~400 nm, the emission wavelength range is 270~500 nm, and the step length is 5 nm. The starting emission wavelength is 20 nm longer than the starting excitation wavelength to avoid interference from Rayleigh scattering. The three-dimensional fluorescence spectrum of each solution is measured twice, and the background spectrum of the solvent is subtracted to eliminate the effect of Raman scattering.
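The background correction can be written as a one-line array operation. The sketch below assumes that the two measurements of a sample are averaged before the solvent spectrum is subtracted; the text only states that each spectrum is measured twice, so the averaging step, the function name `correct_eem` and the small arrays are hypothetical.

```python
import numpy as np

def correct_eem(scan_1, scan_2, solvent_blank):
    """Average two scans of one sample and subtract the solvent background.

    All inputs are excitation-emission matrices on the same wavelength grid.
    """
    mean_scan = 0.5 * (np.asarray(scan_1, dtype=float) + np.asarray(scan_2, dtype=float))
    return mean_scan - np.asarray(solvent_blank, dtype=float)

# Hypothetical 2 x 3 intensity matrices, for illustration only
scan_a = np.array([[10.0, 12.0, 14.0], [20.0, 22.0, 24.0]])
scan_b = np.array([[12.0, 12.0, 16.0], [20.0, 26.0, 24.0]])
blank = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])

corrected = correct_eem(scan_a, scan_b, blank)
```

The corrected matrices of all samples can then be stacked along a new first axis, for example with `np.stack`, to form the three-dimensional array that is decomposed later.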
#97 gasoline, #0 diesel and ordinary kerosene are the pollutants under test, and a sodium dodecyl sulfate (SDS) micellar solution is used as the solvent. The solubility of oil in water is very low, which reduces the accuracy of the determination. As a solvent for oily substances, the SDS micellar solution increases the fluorescence intensity of oil in water and is widely used in fluorometric analysis [26, 27].
First, prepare the SDS solution. Weigh different quantities of SDS and prepare 20 SDS solutions in deionized water, then add 0.05 g of kerosene to each. The measured fluorescence intensities of the 20 kerosene-SDS solutions are shown in Table 1. The fluorescence intensity of the kerosene-SDS solution is largest when the SDS concentration is 0.1 mol/L, so 0.1 mol/L SDS solution is chosen as the solvent.
Prepare 18 samples with different concentrations and number them. Samples #1 ~ #10 are the calibration samples, and #11 ~ #18 are the prediction samples. The concentrations in each sample are shown in Table 2.
3.2. Three-dimensional Fluorescence Spectra
Figures 3(a), (b) and (c) show the three-dimensional fluorescence spectra of the diesel, gasoline and kerosene solutions, respectively. Figure 4 shows the three-dimensional fluorescence spectra of mixed solutions of two and three kinds of oil. Although the fluorescence intensities of the oils are not identical, the spectra of the two-oil and three-oil mixtures overlap seriously, so it is difficult to separate the spectra and predict the concentrations.
3.3. SWATLD for Mixed Solution of Diesel and Gasoline
Scan the calibration samples #1 ~ #5 and prediction samples #11 ~ #14 to get a three-dimensional data array of size 9×37×21, and use the SWATLD algorithm to decompose it. First, use the core consistency diagnostic method to estimate the component number. The core consistency as a function of the component number is shown in Fig. 5. When the component number is 1 or 2, the core consistency is 100%. As the component number increases further, the core consistency decreases gradually, which indicates deviation from the trilinear model, so 2 is the optimal component number. The excitation and emission spectra of the two components are shown in Fig. 6. “OOOO” represents the actual spectrum of gasoline, and “ΔΔΔΔ” represents the actual spectrum of diesel. Component 1 and component 2 are obtained by the SWATLD algorithm. We can conclude that component 1 is gasoline and component 2 is diesel. The resolved diesel and gasoline spectra show high similarity to the actual spectra and good repeatability. The experimental result qualitatively shows that SWATLD has high resolution for the oil mixture.
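The core consistency used above can be computed from a fitted set of loadings. The paper does not give the formula, so the sketch below assumes the commonly used definition: the least-squares core for fixed loadings is compared with the ideal superdiagonal core of ones, and the squared deviation is expressed as a percentage. The function name `core_consistency` and the synthetic data are hypothetical.

```python
import numpy as np

def core_consistency(X, A, B, C):
    """Core consistency (%) of a trilinear model with loadings A, B, C for X."""
    F = A.shape[1]
    # Least-squares core for fixed loadings: X multiplied in each mode
    # by the pseudo-inverse of the corresponding loading matrix
    G = np.einsum("fi,gj,hk,ijk->fgh",
                  np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C), X)
    # Ideal core: ones on the superdiagonal, zeros elsewhere
    T = np.zeros((F, F, F))
    idx = np.arange(F)
    T[idx, idx, idx] = 1.0
    return 100.0 * (1.0 - np.sum((G - T) ** 2) / F)

# Hypothetical noise-free two-component data, for illustration only
rng = np.random.default_rng(2)
A = rng.random((9, 2))
B = rng.random((12, 2))
C = rng.random((7, 2))
X = np.einsum("if,jf,kf->ijk", A, B, C)

cc = core_consistency(X, A, B, C)   # close to 100 for exactly trilinear data
```

A value near 100% means the fitted loadings describe the data with a superdiagonal core, as the trilinear model requires; values that drop as components are added indicate that the extra components are modelling non-trilinear variation.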
Table 3 lists the actual concentrations of diesel and gasoline in samples #11 ~ #14, the concentrations predicted by SWATLD and the recovery rates. Figure 7 shows the least-squares fitting lines between the actual concentrations of samples #11 ~ #14 and the concentrations predicted by SWATLD. The correlation coefficients are 0.9845 (diesel) and 0.9926 (gasoline). This quantitatively shows that SWATLD has high resolution for the oil mixture.
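The quantities reported here (recovery rate, least-squares fitting line and correlation coefficient) can be computed as in the following sketch. It assumes the usual definition of recovery as predicted concentration divided by actual concentration, expressed in percent; the concentration values are hypothetical and are not the values of Table 3.

```python
import numpy as np

def recovery_and_fit(actual, predicted):
    """Return recovery rates (%), fitted slope and intercept, and correlation coefficient."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    recovery = 100.0 * predicted / actual                 # per-sample recovery in percent
    slope, intercept = np.polyfit(actual, predicted, 1)   # line: predicted = slope * actual + intercept
    r = np.corrcoef(actual, predicted)[0, 1]              # Pearson correlation coefficient
    return recovery, slope, intercept, r

# Hypothetical concentrations in mg/L, for illustration only
actual_conc = [2.0, 4.0, 6.0, 8.0]
predicted_conc = [2.1, 3.9, 6.2, 7.8]

recovery, slope, intercept, r = recovery_and_fit(actual_conc, predicted_conc)
```

A slope near 1, an intercept near 0 and recoveries near 100% together indicate that the predicted concentrations track the actual ones; the correlation coefficient alone does not reveal a systematic bias.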
3.4. SWATLD for Mixed Solution of Diesel, Gasoline and Kerosene
Scan the calibration samples #6 ~ #10 and prediction samples #15 ~ #18 to get a three-dimensional data array of size 9×37×21, and use the SWATLD algorithm to decompose it. First, use the core consistency diagnostic method to estimate the number of components, as shown in Fig. 8. There is little difference between the core consistency for 3 components and for 4 components. Because the SWATLD algorithm is not sensitive to the component number, there is no need to estimate it accurately; it only needs to be greater than or equal to the actual component number to obtain good results. The component number is set to 3 to study the resolution of SWATLD for the mixed oil. The excitation and emission spectra are shown in Fig. 9. “Diesel resolved”, “kerosene resolved” and “gasoline resolved” represent the spectra of diesel, kerosene and gasoline obtained by the SWATLD algorithm, respectively. “Diesel actual”, “kerosene actual” and “gasoline actual” represent the actual spectra of diesel, kerosene and gasoline, respectively. We can conclude that the SWATLD algorithm has high resolution for the three-component oil mixture.
Table 4 lists the actual concentrations of diesel, gasoline and kerosene in samples #15 ~ #18, the concentrations predicted by SWATLD and the recovery rates. Figure 10 shows the least-squares fitting lines between the actual concentrations and the concentrations predicted by SWATLD. The correlation coefficients are 0.9034 (diesel), 0.9252 (gasoline) and 0.9643 (kerosene). This shows that SWATLD has high resolution for mixtures of three or more kinds of oil.
IV. CONCLUSIONS
To address the difficulty of distinguishing oil pollutants, a second-order correction method was put forward. Because the SWATLD algorithm is insensitive to the component number, there is no need to estimate it accurately; it only needs to be greater than or equal to the actual component number to obtain good results. First, mixed solutions of diesel, gasoline and kerosene were prepared as calibration samples and prediction samples. Second, the FLS920 fluorescence spectrometer was used to measure the three-dimensional fluorescence spectra of the samples. The spectra of the mixed solutions overlapped seriously, making it difficult to distinguish each specific oil. The SWATLD algorithm was then adopted to decompose the mixed-oil spectra. The decomposed spectra were highly similar to the measured spectra of each kind of oil, and the concentration recovery rates were high. The experimental results show that the SWATLD algorithm performs well in separating oil mixtures with overlapping spectra.
NOMENCLATURE
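$\mathbf{X}$: three-dimensional data matrix
$\mathbf{E}$: three-dimensional residual matrix
$\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$: two-dimensional loading matrices
$x_{ijk}$: the $ijk$th element of $\mathbf{X}$
$a_{if}$, $b_{jf}$, $c_{kf}$: the $if$th, $jf$th and $kf$th elements of the loading matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$
$e_{ijk}$: the $ijk$th element of the three-dimensional residual matrix $\mathbf{E}$
$\sigma$: the sum of squared residuals
$F$: the component number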

[FIG. 1.] The decomposition diagram of PARAFAC model.

[FIG. 2.] Process flow chart of SWATLD.

[TABLE 1.] The concentration of SDS micellar solution and the fluorescence intensity of kerosene SDS micellar solution

[TABLE 2.] The oil concentration in each sample (mg/L)

[FIG. 3.] 3D fluorescence spectra of three standard samples. (a) 3D fluorescence spectrum of diesel standard sample. Concentration: 10 mg/L; Width between excitation and emission slit: 1.11 mm; Excitation and emission step length: 5 nm. (b) 3D fluorescence spectrum of gasoline standard sample. Concentration: 50 mg/L; Width between excitation and emission slit: 1.11 mm; Excitation and emission step length: 5 nm. (c) 3D fluorescence spectrum of kerosene standard sample. Concentration: 10 mg/L; Width between excitation and emission slit: 1.11 mm; Excitation and emission step length: 5 nm.

[FIG. 4.] 3D fluorescence spectra of mixed samples. (a) 3D fluorescence spectrum of a mixed solution of diesel and gasoline. Concentration of diesel and gasoline is 10 mg/L and 10 mg/L, respectively. Width between excitation and emission slit: 1.11 mm; Excitation and emission step length: 5 nm. (b) 3D fluorescence spectrum of a mixed solution of diesel, gasoline and kerosene. Concentration of diesel, gasoline and kerosene is 10 mg/L, 10 mg/L and 10 mg/L, respectively. Width between excitation and emission slit: 1.11 mm; Excitation and emission step length: 5 nm.

[FIG. 5.] Core consistency value of the diesel-gasoline data array.

[FIG. 6.] The spectra of actual solution and SWATLD analyzed solution. (a) Fluorescence excitation spectra. “OOOO” : the actual spectrum of gasoline; “ΔΔΔΔ” : the actual spectrum of diesel; Factor 1 and factor 2: the components obtained by SWATLD algorithm. (b) Fluorescence emission spectra. “OOOO” : the actual spectrum of gasoline; “ΔΔΔΔ” : the actual spectrum of diesel; Factor 1 and factor 2: the components obtained by SWATLD algorithm.

[TABLE 3.] The predicted concentration and recovery of diesel and gasoline obtained by SWATLD

[FIG. 7.] The fitting lines between actual concentration and predicted concentration of diesel and gasoline. (a) R² = 0.9835, (b) R² = 0.9622.

[FIG. 8.] Core consistency value of the diesel-gasoline-kerosene data array.

[FIG. 9.] The spectra of actual solution and SWATLD analyzed solution. (a) Fluorescence excitation spectra. The solid lines represent the spectra of the 3 components obtained by the SWATLD algorithm; the dashed lines represent the actual spectra of diesel, kerosene and gasoline, respectively. (b) Fluorescence emission spectra. The solid lines represent the spectra of the 3 components obtained by the SWATLD algorithm; the dashed lines represent the actual spectra of diesel, kerosene and gasoline, respectively.

[TABLE 4.] The predicted concentration and recovery of diesel, gasoline and kerosene obtained by SWATLD

[FIG. 10.] The fitting lines between actual concentration and predicted concentration for diesel, gasoline and kerosene. (a) Diesel: R² = 0.9536, (b) Gasoline: R² = 0.9107, (c) Kerosene: R² = 0.9646.