3D Integral Imaging Display using Axially Recorded Multiple Images
- Author: Cho Myungjin, Shin Donghak
- Organization: Cho Myungjin; Shin Donghak
- Publish: Journal of the Optical Society of Korea, Volume 17, Issue 5, pp. 410-414, 25 Oct 2013
In this paper, we propose a 3D display method combining a pickup process using axially recorded multiple images and an integral imaging display process. First, we extract the color and depth information of 3D objects for displaying 3D images from axially recorded multiple 2D images. Next, using the extracted depth map and color images, elemental images are computationally synthesized based on a ray mapping model between 3D space and an elemental image plane. Finally, we display 3D images optically by an integral imaging system with a lenslet array. To show the usefulness of the proposed system, we carry out optical experiments for 3D objects and present the experimental results.
3D display , Depth extraction , Multiple images , Elemental images
Integral imaging (II) has been actively researched as one of the next-generation 3D display techniques [1-9] because it provides full parallax, continuous viewing points, and full-color 3D images. Compared with other glasses-free 3D displays, II has a simple structure for both pickup and display. It uses a lenslet array to capture and display 3D information: a 3D object is recorded through a lenslet array, and the recorded images are called elemental images. The 3D images are then formed by integrating the rays from the 2D elemental images through a lenslet array.
Recently, several 3D imaging methods have been studied to extract high-resolution 3D information from a 3D scene. Among them, the axially distributed image sensing (ADS) method [10-14] has been proposed. In this method, a single camera is translated along its optical axis to obtain longitudinal perspective information of a 3D scene. Compared with other 3D imaging techniques, it has a simple structure that requires only a single movement, and it provides high-resolution elemental images. Therefore, it can be used to extract a high-resolution depth map in practical environments.
In this paper, we propose a 3D display method that combines a pickup process using axially recorded multiple images with an integral imaging display process. First, we extract the color and depth information of 3D objects from axially recorded multiple 2D images. Next, using the extracted depth map and color images, elemental images are computationally synthesized based on a ray mapping model between 3D space and the elemental image plane. Finally, we display 3D images optically with an integral imaging system that uses a lenslet array. We perform preliminary experiments and present the results.
In this section, we present a 3D integral imaging display system using axially recorded multiple 2D images, as shown in Fig. 1. It consists of four processes: the ADS pickup process, the depth extraction process, the computational elemental image synthesis process, and the integral imaging display process.
Figure 2 illustrates the ADS pickup stage. Multiple 2D images with slightly different perspectives are recorded by translating a single camera along its optical axis. The focal length of the imaging lens is g. When the camera moves along the z axis (its optical axis), multiple 2D images with different perspectives are captured (e.g., the first camera position is z_1 and the last camera position is z_N, where z_1 < z_N). Thus, N images are recorded by shifting the camera N-1 times. The separation between two adjacent camera positions is ∆z in Fig. 2, so the i-th camera position is z_i = z_1 + (i-1)∆z, where z_1 is the distance between the origin (z=0) and the first image sensor. Each recorded 2D image therefore has a different magnification ratio. The 2D image with the smallest magnification ratio is obtained when i=1, because that camera position is farthest from the 3D object. This difference in magnification ratio between the recorded 2D images is what allows the depth information of the 3D object to be estimated.
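The position formula can be checked with a short sketch. The numbers are illustrative assumptions (z_1 = 0, ∆z = 5 mm, N = 41, g = 50 mm, an object point at Z = 300 mm), and the magnification g/(Z - z_i - g) is read off from the pinhole projection relation used in the depth extraction section, with the sign dropped.

```python
def camera_positions(z1, dz, n):
    """Sensor positions z_i = z1 + (i - 1) * dz for i = 1..n."""
    return [z1 + (i - 1) * dz for i in range(1, n + 1)]


def magnification(Z, z_i, g):
    """Lateral magnification g / (Z - z_i - g) of a point at depth Z."""
    return g / (Z - z_i - g)


# Hypothetical values, chosen only for illustration.
positions = camera_positions(z1=0.0, dz=5.0, n=41)
mags = [magnification(Z=300.0, z_i=z, g=50.0) for z in positions]

print("first / last position (mm):", positions[0], positions[-1])
print("first / last magnification:", mags[0], mags[-1])
```

Under these assumptions the first image (i=1) has the smallest magnification, and the magnification grows as the sensor approaches the object.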
In this section, we explain the point-based depth extraction method using profilometry for axially recorded multiple images. The basic principle is to find the depth of 3D objects from the statistical variance of the intensities of the integrated rays gathered from all recorded images. Rays from a 3D object point are recorded in different images according to the camera position, as shown in Fig. 3. The pixel corresponding to the 3D object point is recorded with the same intensity level in each camera. When the integrated rays are reconstructed at the original position of the 3D object point, as shown in Fig. 4(a), all ray intensities are the same. In contrast, the rays have different intensities when the estimated position differs from the original position of the 3D object point, as shown in Fig. 4(b). Based on this principle, the depth can be estimated from the statistical variance of the intensity distribution of the integrated rays. However, this point-wise method does not consider the spatial variation of pixel intensity in the local area around the 3D object point. It may therefore introduce noise, in the form of point-wise depth fluctuation, into the extracted depth map.
Let the reconstruction point to be tested be (x, y, Z). We can then find the corresponding pixel in each camera and obtain its intensity value. The position of each corresponding pixel depends on the distance between the reconstruction point and the image sensor. Let the local coordinates and the intensity of the corresponding pixel on the i-th image sensor be denoted by (ξ_i, η_i) and I_i(ξ_i, η_i, z_i), respectively. Here, z_i is the position of the i-th image sensor from the origin (z=0). As shown in Fig. 3, ξ_i = -gx/(Z-z_i-g) and η_i = -gy/(Z-z_i-g), so (ξ_i, η_i) varies with Z. To determine whether the reconstruction point is a 3D object point, the statistical variance of the intensity values over all cameras is used. First, we average the intensities of all corresponding pixels at the reconstruction point (x, y, Z). It is calculated by the following:
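One way to write this average, consistent with the definitions above (the symbol E for the mean is an assumption, and N is the number of recorded images):

```latex
E(x, y, Z) = \frac{1}{N} \sum_{i=1}^{N} I_i(\xi_i, \eta_i, z_i) \tag{1}
```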
Then, we define the variance metric D as
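A form consistent with a statistical variance over the N recorded images, writing the mean intensity at the reconstruction point as E(x, y, Z) (an assumed symbol), would be:

```latex
D(x, y, Z) = \frac{1}{N} \sum_{i=1}^{N}
  \left[ I_i(\xi_i, \eta_i, z_i) - E(x, y, Z) \right]^2 \tag{2}
```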
From Eq. (2), the variance metric D varies with the distance Z. The 3D object point can be estimated by finding the local minimum of the variance metric D. This can be formulated as
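Under the assumption that the estimate is written as a function of the lateral position (x, y), with the hat notation introduced here for illustration, the minimization can be expressed as:

```latex
\hat{Z}(x, y) = \arg\min_{Z} D(x, y, Z) \tag{3}
```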
Finally, when the local minimum variance metric is found, the intensity value of the 3D object point can be obtained using the mean value as described in the following equation:
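Writing the mean intensity as E and the estimated depth as \hat{Z} (both assumed symbols), a form consistent with this statement is:

```latex
I(x, y, \hat{Z}) = E(x, y, \hat{Z}) \tag{4}
```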
To obtain the final depth map and the color image of the 3D object, the depth estimation process is repeated for all x, y, and Z. Here, the search range of Z may be limited by the system performance, because a large search range imposes a heavy computational load.
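The point-wise search can be sketched for a single (x, y) as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the recorded images are replaced by a synthetic textured plane at a known depth (`Z_TRUE`, `texture`, and `sample` are hypothetical stand-ins), all numeric values are illustrative, and the global minimum over the candidate range is taken where the paper speaks of a local minimum.

```python
import numpy as np

G = 50.0                     # lens-to-sensor distance g in mm (assumed)
CAMS = 5.0 * np.arange(41)   # sensor positions z_i in mm, assuming z_1 = 0
Z_TRUE = 400.0               # depth of the synthetic textured plane (assumed)


def texture(x, y):
    """Hypothetical surface intensity of the synthetic plane."""
    return np.sin(0.3 * x) + np.cos(0.2 * y)


def sample(i, xi, eta):
    """Stand-in for I_i(xi, eta, z_i): back-project the sensor coordinate
    onto the plane at Z_TRUE and read the texture there."""
    m = -(Z_TRUE - CAMS[i] - G) / G
    return texture(m * xi, m * eta)


def project(x, y, Z, z_i):
    """Sensor coordinates xi = -g x / (Z - z_i - g), likewise for eta."""
    s = -G / (Z - z_i - G)
    return s * x, s * y


def estimate_depth(x, y, z_candidates):
    """Return (Z, mean, variance) of the candidate with minimum variance."""
    best = None
    for Z in z_candidates:
        vals = np.array([sample(i, *project(x, y, Z, z_i))
                         for i, z_i in enumerate(CAMS)])
        mean, var = vals.mean(), vals.var()   # population variance (1/N)
        if best is None or var < best[2]:
            best = (float(Z), float(mean), float(var))
    return best


# Candidates start at 300 mm so that Z - z_i - g stays positive here.
z_hat, intensity, d_min = estimate_depth(80.0, 20.0, np.arange(300, 601, 10))
print(z_hat, intensity, d_min)
```

The cost grows with the number of pixels times the number of Z candidates times the number of images, which is why the search range has to be bounded in practice.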
Now, we present the generation process of elemental images for the 3D display from the calculated depth map and color image. Figure 5 shows this process, which is based on the geometry of pixel mapping between 3D object points and the elemental image plane through the lenslet array (in practice, a virtual pinhole array). The rays from a 3D object point pass through the center of each lenslet (virtual pinhole) and are recorded at the corresponding pixels in the elemental image plane. The pixel coordinates of the recorded pixels in each elemental image are given by Eq. (5), where (x', y') is the rescaled version of (x, y) in Eq. (3) after the scale change of the 3D images in the display space, f is the focal length of the lenslet, and P_{k,h} and P_{k,v} are the horizontal and vertical positions of the k-th lenslet, respectively. The final elemental images are obtained by repeating the pixel mapping process for all pixels.
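The mapping through a virtual pinhole array can be sketched with plain similar-triangle geometry. The form used below, u = P + (P - x) f / z for a point at signed distance z from the lenslet plane, is a generic pinhole relation assumed for illustration; the paper's own expression and sign convention may differ. The function names and the rule that a lenslet only writes into the elemental image directly behind it are likewise assumptions.

```python
def lenslet_centers(n, pitch):
    """Centers of n lenslets, symmetric about the optical axis."""
    return [(k - (n - 1) / 2.0) * pitch for k in range(n)]


def map_point(x, y, z, n, pitch, f):
    """Map one display-space point to the elemental image plane.

    Returns {(kh, kv): (u, v)} for every lenslet whose own elemental
    image (a pitch x pitch square behind it) receives the ray.
    """
    if z == 0:
        raise ValueError("point lies in the lenslet plane")
    centers = lenslet_centers(n, pitch)
    hits = {}
    for kh, ph in enumerate(centers):
        u = ph + (ph - x) * f / z          # ray through the pinhole center
        if abs(u - ph) > pitch / 2.0:
            continue                       # falls outside this elemental image
        for kv, pv in enumerate(centers):
            v = pv + (pv - y) * f / z
            if abs(v - pv) <= pitch / 2.0:
                hits[(kh, kv)] = (u, v)
    return hits


# Illustrative values: 45 x 45 lenslets, 1.08 mm pitch, f = 3 mm,
# and an on-axis point 30 mm in front of the array.
hits = map_point(x=0.0, y=0.0, z=30.0, n=45, pitch=1.08, f=3.0)
print(len(hits), hits[(22, 22)], hits[(23, 22)])
```

Converting (u, v) to integer pixel indices and writing the color of the source point there, for every point in the depth map, would complete the synthesis loop.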
In a typical II system, depth-priority integral imaging (DPII) and resolution-priority integral imaging (RPII) are distinguished by the gap g between the lenslet array and the display panel [4,5]. In DPII, g is equal to the focal length of the lenslet (f). Since the displayed or reconstructed voxel size is equal to the lenslet size in DPII, the lateral resolution is low. However, a large depth can be provided because the depth of focus of the lenslet in DPII is large, and the 3D image can be displayed in the real and virtual image fields simultaneously. In RPII, on the other hand, g is not equal to the focal length of the lenslet. Since the voxel size is small in RPII, high lateral resolution can be provided. However, the 3D image has a shallow depth because the depth of focus of the lenslet in RPII is small. In this paper, we use the DPII system for display to provide a large depth for the 3D object.
Figure 6 illustrates the DPII system setup used to display 3D images from the elemental images computationally synthesized from axially recorded multiple images. To implement the DPII system, the projector focuses the elemental images onto the focal plane of the lenslet array. The 3D images are then displayed in free space through the lenslet array. We capture the 3D images from different viewpoints to show that the DPII system provides full parallax of the 3D objects.
To demonstrate the proposed scheme, we performed preliminary experiments. Figure 7 shows the entire experimental setup, including the ADS pickup process and the DPII display process. The 3D object is composed of two toys, a ‘car’ and a ‘sign board’, as shown in Fig. 7. They have different shapes and colors and are located at 300 mm and 530 mm from the image sensor, respectively. In the ADS pickup process, the objects should be placed off the optical axis, because little perspective information is collected for objects near the optical axis. Therefore, the two toys are located approximately 80 mm from the optical axis. Their size is approximately 40 mm×30 mm. We record multiple 2D images by moving a single image sensor along its optical axis. We use a Nikon D70 camera with 2184(H)×1426(V) pixels and an imaging lens with focal length f=50 mm. The camera is translated in increments of ∆z=5 mm to record a total of K=41 elemental images over a total displacement of 200 mm. Two of the recorded 2D images (the 1st and the 41st) are shown in Fig. 7(b) and (c).
For the synthesis of elemental images, we extract the depth map and color image using the depth extraction process (i.e., profilometry) described in Section 2.2. The depth search range of Z was from 200 mm to 600 mm in steps of 10 mm. The extracted color image and depth map are shown in Fig. 8. Their resolution is 640(H)×480(V).
Then, the extracted depth map and color image are used to synthesize the elemental images for the DPII display, as explained in Fig. 5. The 3D object points are located within 20 mm(H)×20 mm(V). To match the depth range of the real 3D objects to that of the displayed 3D images, we applied a rescaling process: the ‘car’ object was relocated to 30 mm and the ‘sign board’ object to -20 mm in the display space. In addition, depth inversion was applied to avoid the pseudoscopic image problem of the integral imaging display. The computationally modeled lenslet array has 45(H)×45(V) lenslets, and each elemental image is mapped with 20(H)×20(V) pixels through a single lenslet. Thus, we synthesize an elemental image array of 900(H)×900(V) pixels, as shown in Fig. 8(c).
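The numbers in this step can be reproduced with a small sketch. The paper does not state the form of the rescaling, so the linear map below, fixed by the two stated object depths (300 mm to 30 mm and 530 mm to -20 mm), is purely an assumption; the function names are hypothetical.

```python
def rescale_depth(Z, Z_near=300.0, Z_far=530.0, z_near=30.0, z_far=-20.0):
    """Assumed linear map from pickup depth Z to display depth (mm)."""
    slope = (z_far - z_near) / (Z_far - Z_near)
    return z_near + slope * (Z - Z_near)


def array_size(n_lenslets, pixels_per_lenslet):
    """Pixel count of the elemental image array along one axis."""
    return n_lenslets * pixels_per_lenslet


car_z = rescale_depth(300.0)
sign_z = rescale_depth(530.0)
side = array_size(45, 20)
print(car_z, sign_z, side)
```

Under this assumed map the slope is negative, so the object nearer to the camera lands at the larger display coordinate, and 45 lenslets of 20 pixels each give the stated 900 pixels per side.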
Finally, we carried out optical experiments to display 3D images in the DPII system using the synthesized elemental image array shown in Fig. 8(c). In the experimental setup shown in Fig. 6, the elemental images are displayed through a projector. The lenslet array has 45(H)×45(V) lenslets whose focal length and diameter are 3 mm and 1.08 mm, respectively. The size of the reconstructed images is approximately 5 mm. The ‘car’ image is observed at z=30 mm (real image) and the ‘sign board’ image at z=-20 mm (virtual image). Figure 9 shows the experimental results of the displayed 3D images. The measured viewing angle is approximately 16°. The results show that the 3D images can be observed correctly in both the real and virtual fields of the DPII system.
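For orientation, the measured value can be set against a commonly used geometric estimate of the II viewing angle, 2·arctan(p/(2g)), where p is the lenslet pitch and g the gap. This estimate is not taken from the paper; the sketch assumes that the pitch equals the stated lenslet diameter (1.08 mm) and that the gap equals the focal length (3 mm), as in DPII.

```python
import math


def viewing_angle_deg(pitch, gap):
    """Geometric viewing-angle estimate 2 * atan(pitch / (2 * gap))."""
    return math.degrees(2.0 * math.atan(pitch / (2.0 * gap)))


angle = viewing_angle_deg(pitch=1.08, gap=3.0)   # assumed pitch = diameter
pixel_pitch = 1.08 / 20                          # mm per elemental-image pixel
print(round(angle, 1), pixel_pitch)
```

The estimate comes out at roughly 20°, somewhat above the measured 16°, so the measurement is at least of the expected order under these assumptions.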
In conclusion, a novel integral imaging display method using axially recorded multiple images has been proposed. The proposed method extracts 3D information based on the statistical variance of rays from a 3D object point. With the extracted 3D information, the elemental images are computationally synthesized based on a ray mapping model between 3D space and the elemental image plane. The 3D images are then displayed optically in a depth-priority integral imaging system with a lenslet array. The experimental results show that the proposed method can display 3D images using elemental images synthesized from axially recorded multiple 2D images.
[FIG. 1.] Procedure of the proposed 3D display method.
[FIG. 2.] Pickup process to obtain axially recorded multiple images.
[FIG. 3.] Ray model for recording of 3D object point.
[FIG. 4.] (a) Ray relation at the original position of the 3D object point. (b) Ray relation at a different position.
[FIG. 5.] Ray mapping for elemental image generation.
[FIG. 6.] Optical integral imaging display with large depth.
[FIG. 7.] (a) Experimental setup. (b) 1st recorded elemental image. (c) 41st recorded elemental image.
[FIG. 8.] (a) Extracted color image. (b) Extracted depth map. (c) Generated elemental images.
[FIG. 9.] Experimental result. (a) Left view (-8 deg). (b) Right view (8 deg).