In multiview and integral imaging, the image in the image plane can be treated as a two-dimensional representation of the three-dimensional objects to be displayed. This image is called an integral image [1]; it consists of multiple elemental images [1] (image cells) and, with proper optics, it can be observed from different directions [2]. Each cell corresponds to a light source or its optical equivalent (lens, pinhole, barrier stripe, etc.). In turn, the content of a cell can be treated as a set of directional (view) images [3,4]. Correspondingly, based on the concept of view images, an integral image can be decomposed into a set of view images; conversely, a set of view images can be combined into a single integral image [5]. This known technique of expanding the integral image into view images is schematically illustrated in Fig. 1, where the cells are shown by segments of bold lines and the directions are indicated by lines with arrowheads. Note that throughout this paper, the projected coordinates are used; Figs. 1 and 2 are the only illustrations in which the regular (non-projected) coordinates are in use.
Due to their similarity [6-8], and in spite of possible geometric distortions [9], in this paper we intentionally do not distinguish between integral imaging and multiview imaging, two methods of autostereoscopic imaging. The particular features of each method do not seem essential for the present discussion. The similarity allows us to describe an integral image in terms of multiview imaging and vice versa. Thus, in this paper we prefer to consider the common features of integral imaging and multiview imaging. In particular, the image in the image plane is referred to as the integral image in both cases.
On the other hand, the useful and well-known concept of view images [10], [11] is not the only possible way to describe the integral image; alternative principles for representing the same phenomenon may also exist. In this paper, such an alternative concept for the integral image is proposed and illustrated with examples.
Considering the integral image itself, one can recognize similar patterns across various integral images. For instance, peculiar patterns consisting of regularly repeated patches can often be recognized; see, e.g., the illustrations in papers of other authors: Fig. 12(b) in [12], Figs. 2 and 3 in [13], Fig. 2 in [14], Fig. 9(f) in [15], and Fig. 7(a) in [16]. Most of these patterns seem to represent a point of an object; however, Fig. 12(b) in [12] shows a line. These and many other integral and multiview images suggest that the integral image may consist of certain “elemental particles”.
Based on that idea, a pattern-based representation can also be built in parallel with the existing concepts of view images in multiview displays and elemental images in integral imaging. In particular, we propose reference functions which represent basic three-dimensional objects (points) as they are mapped onto the two-dimensional image plane. Such functions (whose physical meaning is the brightness or transparency distribution across the image plane) look like a set of repeated patches distributed across several cells.
The paper is organized as follows. Sections II and III introduce the one- and two-dimensional reference functions for a square (rectangular) layout of the image cells. Two general applications of the reference functions, the synthesis and the analysis of images, together with the numerical experiments, are presented in Sections IV and V. The related effect of the discrete (pixelated) integral image is considered in Sec. VI. Sec. VII contains discussions covering, in particular, the reference functions for the hexagonal and random layouts. The conclusions and acknowledgements complete the paper.
II. ONE-DIMENSIONAL REFERENCE FUNCTIONS
In this section, the one-dimensional construction blocks (the units, bricks, or even “elemental particles”) of integral and multiview imaging are introduced. The description is based on the linear geometry model [17]. The model describes the geometry of a three-dimensional autostereoscopic display in terms of two planes (the image plane and the light source plane) and two regions (the image region and the observer region). The former region is located near the image plane (a screen), the latter at a certain distance in front of it. In Euclidean coordinates, the quasi-horizontal cross-sections of the regions have a deltoid shape, and their geometrical structure is not uniform.
To obtain an extendable form, projective transformations are used. It turns out that a transformation can be built which yields a square-shaped region with a periodic structure. In this case, the target plane is perpendicular to the quasi-horizontal source plane, with the centers of projection located on the normals to the centers of the light source array and of the sweet spot (where the best visual image can be observed), as shown in Fig. 2. Note that the plane
The projective form is especially convenient for collinear points on parallel lines; in particular, it ensures uniform locations of the discrete distance planes (depth planes), as shown in Fig. 3, where
Mathematically, these “particles” are described through the reference functions (in two dimensions, the patterns; see Sec. III) of the integral image. The proposed reference functions (in this paper, step functions with two levels) are based on the rectangular unit impulse function. Essential to this description is that the number of unit impulses in a pattern distinctly depends on the distance. For the sake of simplicity, only the shapes are considered, not amplitudes or colors. Therefore, throughout this paper, all reference functions are of unity height, and all images are also step functions with two levels, i.e., black-and-white images without intermediate grades.
Similarly to the view images, the representation by the reference functions works in two directions, composition and decomposition. Both operations can be expressed through the convolution of the integral image with the reference functions. The convenience of this approach lies in providing both synthesis and analysis of the integral image in a quite similar way.
A three-dimensional object is “split” into slices by the discrete distance planes defined in the model. In the projected form [17], all distance planes are equidistant.
The integral image can be considered as the result of a transformation which maps three-dimensional objects from the image region onto the two-dimensional surface of the image plane. Once the mapping is found explicitly, the reference functions can be built.
A spatial point can be mapped onto the image plane in various ways, e.g., based on the visibility of the light sources. This gives a key to building the impulse mapping functions [18,19], which consist of rectangular impulses of identical amplitude; see Fig. 4. The shape of the functions depends on the index
Based on the geometry of Fig. 3, the visibility of the light sources can be estimated as follows. A displayed object is located in the image region. A given location in a three-dimensional space within the projected image region between the distance planes
Generally, the points of an object are themselves infinitely small: they have zero effective area in the cross-section of the image region, and the amount of required locations is virtually infinite. In order to reduce this amount, a discrete representation based on line segments can be used instead of the continuous one. In this case, the effective area becomes larger but remains limited (finite), and the amount of required locations is reduced. In this way, functions with different numbers of separated points for different distance planes can be built.
Under the above assumptions, the
For example, the second function
The width of all impulses is identical, as is the gap between them. Fig. 4 shows examples of the one-dimensional reference functions for |
The period of impulses of the
where
is the period of projected light sources,
The width of the impulse involves an adjustment parameter
where 0 <
The center of
where
Based on Eqs. (1)-(4), the
where
where
is the distance between the projected nodal planes,
The total area of all impulses of any function is the same and equal to
From (5) and Fig. 4, it can be seen that the positive and negative functions differ only in the locations within the cells, while the number of cells and the width of the impulses are the same in both cases.
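Since the explicit formulas (1)-(5) are given above, a short numerical sketch may help the reader. The construction below is only an illustration of the stated properties (|m| unit-height impulses spanning |m| cells, constant total area, mirrored locations for ±m) and does not reproduce Eqs. (1)-(5) exactly; the function name, the linear shift of the impulse centers, and the sampling grid are hypothetical choices.

```python
import numpy as np

def reference_function_1d(m, w=1.0, samples_per_cell=120):
    """Sketch of the m-th one-dimensional reference function (two levels).

    Illustrative assumptions, not the paper's exact Eqs. (1)-(5):
    - the pattern spans |m| unit-width cells and contains |m| rectangular
      impulses of unity height, one per cell;
    - every impulse has width w/|m|, so the total area |m| * (w/|m|) = w
      is the same for all m (cf. the remark on the total area above);
    - the impulse centers shift linearly from cell to cell; the positive
      and negative functions differ only in these locations (mirrored).
    """
    n = max(abs(m), 1)
    x = np.linspace(0.0, n, n * samples_per_cell, endpoint=False)  # cell units
    f = np.zeros_like(x)
    if m == 0:
        return x, f          # the 0th function is identically zero (Sec. VII)
    width = w / n
    for k in range(n):
        offset = (k + 0.5) / n                     # position inside the k-th cell
        center = k + (offset if m > 0 else 1.0 - offset)
        f[np.abs(x - center) < width / 2.0] = 1.0  # unit-height impulse
    return x, f
```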
The influence of the parameter
only needed; this particular case is considered in Sections IV and V. In this way, the size of the mapping unit is controlled; the spatial objects are split into smaller or larger parts and thus can be described with different accuracy for different
The one-dimensional reference functions (5) can be used for transparency
III. TWO-DIMENSIONAL REFERENCE FUNCTIONS
In displays, the one-dimensional functions describe stereoscopic images with one-dimensional parallax (horizontal parallax only). For two-dimensional (full-parallax) imaging, two-dimensional reference functions are needed. A straightforward extension of (5) into two dimensions is the multiplication of the one-dimensional functions, as graphically shown in Fig. 6.
This extension is based on the 90°-rotation symmetry of the optical element; examples are two crossed lenticular plates of identical periods arranged orthogonally, or a lens array with lenses arranged in a square matrix. Fig. 6 shows examples of the patterns of 3 × 3 and 7 × 7 cells, which will be used for the synthesis and analysis of the integral images in Sections IV and V.
For the two-dimensional functions (7), the width, period and displacement of impulses in
In this paper, we intentionally use simple patterns like those of Fig. 6 to explain the principle. In Fig. 6 and the related illustrations below, the white color means a zero value (zero brightness or an opaque area, depending on the type of the particular screen), while black means one unit (maximum brightness, a completely transparent area). For instance, the functions in Figs. 6(b) and 6(c) have zero values at the perimeter and one unit in the center.
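Numerically, the separable extension into (7) can be sketched as the outer product of two one-dimensional functions; a minimal illustration, reusing the hypothetical reference_function_1d helper sketched in Sec. II:

```python
import numpy as np

def reference_pattern_2d(m, w=1.0, samples_per_cell=120):
    """Two-dimensional pattern as the product of one-dimensional functions,
    per the separable extension of (5) described above.  Because both
    factors are two-level (0/1), their product is two-level as well,
    giving the |m| x |m| arrangement of square patches of Fig. 6."""
    _, fx = reference_function_1d(m, w, samples_per_cell)
    _, fy = reference_function_1d(m, w, samples_per_cell)
    return np.outer(fy, fx)
```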
What can be done with these elemental particles? It appears that both synthesis and analysis of integral images are possible using the same functions. That is, the composition and decomposition of the integral image can be conveniently expressed through the same operation, the convolution of the integral image with the proposed reference functions. Preliminary results were reported in [18-20]. The particular significance of the proposed approach is that it provides both synthesis and analysis of the integral image in a quite similar way.
The two-dimensional patterns (7) (see also Fig. 6) allow synthesizing images. Since the brightness model assumed in this paper is based on step functions with two levels (zero and one), occlusion cannot actually be supported, and the brightness of the resulting image can be written as the point-by-point logical summation (logical OR) of the reference two-level patterns,
where the integer numbers
By using the patterns (7) as construction blocks, an integral image of any three-dimensional object can be built based on, e.g., its wireframe model. The merged patterns (8) form the resulting integral image.
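A minimal sketch of the synthesis (8), assuming the hypothetical pattern helpers above; the placement of the patterns at integer cell coordinates and the canvas size are arbitrary choices for this illustration.

```python
import numpy as np

def synthesize_integral_image(points, canvas_cells=16, samples_per_cell=120, w=1.0):
    """Sketch of the synthesis (8): point-by-point logical OR of the
    two-level patterns pasted at the sampled object points.

    `points` is an iterable of (i, j, m) triples: integer cell coordinates
    of a sampled point and its distance-plane index m.  Two-level model,
    hence no occlusion (see the discussion in Sec. VII).
    """
    size = canvas_cells * samples_per_cell
    image = np.zeros((size, size), dtype=bool)
    for i, j, m in points:
        pattern = reference_pattern_2d(m, w, samples_per_cell) > 0.5
        half = pattern.shape[0] // 2
        r0 = i * samples_per_cell + samples_per_cell // 2 - half
        c0 = j * samples_per_cell + samples_per_cell // 2 - half
        if r0 >= 0 and c0 >= 0 and r0 + pattern.shape[0] <= size \
                and c0 + pattern.shape[1] <= size:
            image[r0:r0 + pattern.shape[0],
                  c0:c0 + pattern.shape[1]] |= pattern   # logical OR of (8)
    return image
```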
In the simulation examples, the spatial objects are constructed from two nonintersecting skew lines lying in different planes. For the first object, the first line connects the points (4, 8, 1) and (8, 4, 1) in the first plane, and the second line connects (4, 4, 3) and (8, 8, 3) in the third plane; for the second object, the same
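As a usage illustration under the same assumptions, the first testing object can be assembled from integer samples along the two lines; the sampling density is an arbitrary choice here.

```python
def sample_line(p0, p1, n=5):
    """Integer (i, j, m) samples along the segment between two grid points."""
    return [tuple(round(p0[k] + (p1[k] - p0[k]) * t / (n - 1)) for k in range(3))
            for t in range(n)]

# first testing object: two skew lines in the 1st and 3rd distance planes
points = (sample_line((4, 8, 1), (8, 4, 1))
          + sample_line((4, 4, 3), (8, 8, 3)))
image = synthesize_integral_image(points)
```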
To confirm that the testing image represents the intended three-dimensional object, the synthesized image can be printed on paper and displayed using a pinhole array, a lens array, or two crossed lenticular plates. In our experiments, two lenticular plates of 25 lenticules per inch with a focal distance of 3.63 mm were used. Such a layout is typical for autostereoscopic displays.
In the displayed testing image, the two skew lines can be clearly localized visually by distance, although they look crossed in each photograph of Fig. 8. The visible regular grid in Fig. 8 is formed by the lenticules comprising the square cells.
The reference functions also make it possible to analyze the integral image by distance, in other words, to find the distance from the image plane to a given point of an object. Similarly to the patterns of particular three-dimensional objects, the reference functions can be treated as the patterns of the most basic objects, points lying at predefined distances. Together with the known methods, e.g., [21,22], the reference functions can be applied to extract the distance to three-dimensional objects or their parts based on the integral image. This can be done, for instance, by convolution of the integral image with the reference functions. The convolution is a mathematical operation on two functions defined by an integral [23]; in the one-dimensional case it looks as follows,
In two-dimensional convolution analysis, the function
reference function (7). Generally, the convolution shows the similarity between the image and the reference pattern. The convolution analysis of the synthesized testing images of Fig. 7 can be performed using the two-dimensional reference functions across a set of fixed discrete locations, i.e., with a lateral step equal to the cell size
The resulting convolution surface can be represented in various ways: in cross-sections, in three-dimensional projections, or in shades of gray. In the latter case, the values of the functions are encoded by gray levels similarly to Fig. 6, with additional levels representing the intermediate values. Note that the gray levels are used in this paper only to represent the results of the convolution analysis in the current section and to draw the example of partial pixels in Sec. VI; the reference functions and the images themselves are always step functions with two levels. An example of the image analysis within a distance plane is shown in Fig. 9.
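A numerical sketch of this analysis, assuming the hypothetical helpers above; scipy's fftconvolve is used only for speed, and the normalization exploits the constant total impulse area noted in Sec. II.

```python
from scipy.signal import fftconvolve

def convolution_analysis(image, m, samples_per_cell=120, w=1.0):
    """Sketch of the convolution analysis: convolve the integral image with
    the m-th reference pattern and sample the surface with a lateral step
    of one cell.  Because the total impulse area is the same for every
    reference function (Sec. II), the absolute-match maximum is identical
    for all planes, so a single normalization works uniformly."""
    pattern = reference_pattern_2d(m, w, samples_per_cell)
    surface = fftconvolve(image.astype(float), pattern, mode="same")
    surface /= pattern.sum()            # 1.0 corresponds to an absolute match
    half = samples_per_cell // 2        # sample at the cell centers
    return surface[half::samples_per_cell, half::samples_per_cell]
```

Thresholding the sampled surface (e.g., at the 60-80% levels quoted in the experiments of this section) then marks the locations recognized as points of the m-th plane.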
Furthermore, the image can be analyzed across several planes using several patterns corresponding to different distance planes. An example is shown in Fig. 10 for five planes (and five patterns, three of which are shown in the illustration). In this case, besides the features of each plane, the local maxima of the convolution across several planes indicate the distances of the three-dimensional points of interest.
In the numerical simulation, the testing integral images of the skew lines (like Fig. 7) were analyzed by convolution as described above. In this analysis, the reference functions (7) with
The cross-sections of the convolution surface along the ±45° diagonals (i.e., along the original lines) are shown in Fig. 12 for the planes 1 through 7. The area where the calculated convolution was not identically zero appeared to be 9 × 9 cells for the first testing image of this simulation and 11 × 11 cells for the second.
As shown in Fig. 12, the convolution with the proper reference function gives a relatively flat response (its non-flatness is about 2-7%). This could be a distinctive feature for recognizing line segments within a plane.
The convolution with a neighboring reference function (e.g., the first pattern for the third plane, or the fifth pattern for the third and seventh planes) may sometimes exceed a certain critical value (for example, the level of 60% of the maximum convolution was used in this numerical experiment). Generally, this may lead to inexact recognition. However, the indication of an improper distance plane for this reason happened rather rarely in the numerical experiments: in 2 cells for the first image and in 4 for the second, i.e., only in 2.5% and 3.3% of the whole tested area, respectively. In this example, the cell size was 53 pixels, a value which may produce some inexactness due to the pixelated structure; see Sec. VI.
The convolution of the integral image with the reference functions indicates, by its maxima, the points which lie in particular planes; this effectively separates the distance planes. The original planes of the two lines appear to be restored correctly in both testing images of Fig. 7. Similarly looking spatial distributions (see Figs. 4, 6-9) can be found, e.g., in [13, 16]. In common with the patterns of digital holography [24, 25], the proposed reference functions depend on the distance only; a discrete displacement within a plane does not affect their shape. Especially important for this analysis is the projection, because the period of the cells is identical for all planes in the projected form but varies from plane to plane in the regular coordinates.
In order to further confirm the above statements, a supplementary computer simulation was performed. In it, two integral images of the diagonals of an 8 × 8 × 8 cube were synthesized for |
For the case of the spatial diagonals, the results of the convolution analysis (the restored cross-sections) are shown plane by plane in Fig. 15.
Figure 15 shows the cross-sections of the cube, and the two crossing diagonals can be recognized in these successive cross-sections. Alternatively, several cross-sections superposed with some artificial displacements are shown together in Fig. 16; the front and rear faces of the cube are highlighted, as well as the spatial diagonals.
In Fig. 16, the spatial structure of the crossed spatial diagonals of the cube is clearly recognizable.
In this example of the convolution analysis, we intentionally used the exact integer patterns (see Sec. VI), and no errors or displaced locations were found; all points of the face and spatial diagonals were restored correctly in both images; refer to Figs. 15, 16. The related effect of non-integer patterns may produce some errors, as happens in the previous example of this section. The effect of the integer and non-integer patterns is partially covered in the next section.
Nevertheless, the convolution results can be estimated in terms of the signal-to-noise ratio (SNR). In particular, the estimated average SNR of the restored face diagonals is 5.3 (varying from 4.42 to 7.16 in particular planes); for the restored image of the spatial diagonals, the SNR is 5.8 (between 5.18 and 6.43). These values show that in this simulation, the desired signal is notably above the noise level and is therefore well recognizable.
For the practical analysis, it is essential that the total area of all impulses of any reference function is the same (see Sec. II). Therefore, the maximum value of the convolution (meaning an absolute match) is the same for all planes. In the recognition of points, lines, etc., this is convenient, because the same criterion can be applied to any plane. To consider a location a recognized point in the second (supplementary) simulation of this section, the level of 80% of the maximum convolution was used for all planes.
In the previous sections, the description of the reference functions was given with no relation to the pixels of a screen, where the images are displayed. At the present time, a typical case is a relatively small number of pixels along one dimension of the cell [26], and therefore the discreteness of the digital screen may become significant.
To represent a connected three-dimensional object occupying a volume between -
parts (impulses) distributed across
To be displayed on a digital screen, all parts must be expressed in the screen units, the pixels. How to implement the
For instance, the sequence A003418 [27] has the value LCM(1, 2, …, 10) = 2520 for n = 10.
This means that, strictly speaking, it is impossible to build all the pulses (2) on a digital screen exactly. Therefore, an approximate solution should be found, and care should be taken about the non-exact partitions. An example of a non-exact partition (1/3 of a cell) and a possible way to approximate the partial pixels by gray levels are given in Fig. 18.
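The arithmetic behind the exact and non-exact partitions can be checked directly with the standard library (math.lcm requires Python 3.9+); a minimal sketch:

```python
import math

# LCM(1, 2, ..., n): the smallest cell size (in pixels) that is divisible
# into 1, 2, ..., n equal integer parts (OEIS A003418).
def lcm_upto(n: int) -> int:
    return math.lcm(*range(1, n + 1))

print(lcm_upto(10))   # 2520, the value quoted above for n = 10
print(53 % 3)         # 2: a 53-pixel cell (Sec. V) has no exact 1/3 partition
```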
The inexactness may lead to visual distortions (a less sharp three-dimensional visual image). The partitions should be evaluated from this perspective, expecting that some numbers could produce less distortion.
Among all partitions, the exact ones can be counted using the divisor function, which takes a natural number and gives the total number of its divisors, including 1 and itself; refer to [27,28]. Thus, the number of exact partitions is given by
The behavior of the divisor function
To estimate the number of exact partitions for various cell sizes in an approximately uniform manner, the following relative divisor function can be used
This function is derived from Dirichlet’s formula for the average order of the divisor function [28]; it gives an asymptotic estimate of the share of exact partitions among all partitions for a given
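A sketch of this estimate is given below; since the exact normalization of Eq. (11) is not reproduced here, the form d(n)/ln n, suggested by the average order ln n quoted above, is an assumption.

```python
import math

def d(n: int) -> int:
    """Divisor function: the number of divisors of n, including 1 and n."""
    return sum(1 for k in range(1, n + 1) if n % k == 0)

def relative_d(n: int) -> float:
    """Assumed form of the relative divisor function (11): d(n) normalized
    by its average order ln n (Dirichlet's formula)."""
    return d(n) / math.log(n)

# cell sizes with relatively high maxima (relative divisor function > 2),
# cf. Fig. 20 and the list (12)
preferable = [n for n in range(3, 241) if relative_d(n) > 2.0]
print(preferable)   # dominated by 4- and 6-fold numbers, e.g. 12, 24, 36, ...
```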
In Fig. 20, the relatively high local maxima are indicated by bold dots; for them, the relative divisor function (11) is higher than 2. The corresponding list of abscissas of the relatively high maxima of
The preferable values (12) provide the highest share of exact partitions; this leads to more accurate reference functions and, therefore, to fewer visual distortions. From (12) and Fig. 20, one may observe that the listed values are 4-fold and 6-fold numbers, at least in the interval from 3 to 240. Consequently, the 12-fold numbers are automatically included in the list. However, not all 4- and 6-fold numbers are in the list (12), because for some of them, the value of
where
Strictly speaking, the definition of the reference functions (5) does not provide a mechanism to distinguish between the +1st and -1st planes. Strange as it may seem, this is correct, because the ±1st functions are actually defined differently from the others. A similar situation occurs with the 0th function, which is identically zero by definition. This inexactness may result in some ambiguity of the image analysis near
The two-level brightness model is used here for the sake of simplicity. In practice, the number of levels can be increased. Then, in the image generation, the patterns from different planes should overlap each other (replace point by point), starting from the farthest distance plane. This guarantees the occlusion conditions. The logical summation in (8) is the formal representation of the absence of overlapping in the two-level model.
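A minimal sketch of this multi-level rule, reusing the hypothetical helpers of the earlier sections; the sign convention of the distance-plane index is an assumption and is flagged in the code.

```python
import numpy as np

def synthesize_with_occlusion(points, brightness, canvas_cells=16,
                              samples_per_cell=120, w=1.0):
    """Sketch of the multi-level extension of (8): the patterns are pasted
    starting from the farthest distance plane, so nearer patterns replace
    farther ones point by point (occlusion) instead of being OR-merged.
    The sort key assumes larger m is nearer to the observer; swap it if
    the sign convention of the planes differs."""
    size = canvas_cells * samples_per_cell
    image = np.zeros((size, size))
    order = sorted(zip(points, brightness), key=lambda t: t[0][2])
    for (i, j, m), b in order:                    # farthest plane first
        mask = reference_pattern_2d(m, w, samples_per_cell) > 0.5
        half = mask.shape[0] // 2
        r0 = i * samples_per_cell + samples_per_cell // 2 - half
        c0 = j * samples_per_cell + samples_per_cell // 2 - half
        if r0 >= 0 and c0 >= 0 and r0 + mask.shape[0] <= size \
                and c0 + mask.shape[1] <= size:
            block = image[r0:r0 + mask.shape[0], c0:c0 + mask.shape[1]]
            block[mask] = b                       # nearer replaces farther
    return image
```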
The pixelated structure is a serious limitation for the discontinuous rectangular pulses [13]. For instance, not more than 20% of all partitions (for image cells smaller than 60 pixels) can be exact; the other 80% cannot be expressed in integer numbers and are inexact in principle. Thus, most partitions are essentially non-integer, even when using the preferable numbers (13). This means that the problem of discreteness is only partially solved here and needs further investigation.
Formula (13) clearly states that the 12-fold image cells are preferable. Generally speaking, formulas similar to (13) can be written for 6- and 4-fold numbers too, but they would be valid only conditionally. From the perspective of fewer visual distortions, an autostereoscopic display device with 12 or 24 pixels (or sub-pixels) per cell width is definitely better than a hypothetical three-dimensional device with all other parameters identical but, say, 11, 13, or 23 pixels per cell. At the same time, it should not be forgotten that local maxima also exist for some of the 6- and 4-fold numbers which satisfy the criterion (the relative divisor function higher than 2).
When designing the two-dimensional functions, we implicitly relied upon the 90°-rotation symmetry, which, of course, is not the general case. This means that other layouts of optical elements in three-dimensional displays may require different functions, although we expect them to be similar to the reference functions (5). The basic properties of the functions (such as the area) are kept for any layout of microlenses or pinholes. The number of split impulses depends only on the distance, not on the layout of the cells (light sources); this is a permanent distinctive feature of the geometry of the reference functions.
A particular geometry of the pinhole array affects the shape of the two-dimensional reference functions. As can be seen from (3) and (4), the distances from the center of the pattern to the center of
According to (14), the layouts of the cells and of the impulses (namely, the centers of the cells and of the pulses) are geometrically similar to each other. This means that one of them can be transformed into the other by a uniform scaling (resizing) with the coefficient (14). This suggests the concept of the relative distance. The relative distance in the image plane is the ratio of the lengths of two collinear vectors from the origin to the center of a cell and to the center of the impulse; this ratio is the same for all spatial points at the same distance, i.e., for the same
Furthermore, the relative distance gives a key to building the reference functions for an arbitrary (irregular) layout of cells, as shown in Fig. 22, where circular cells of equal area are distributed randomly across the plane. In this example, roughly speaking, the radial displacement of the impulses within the cells is approximately equal to the cell size for the cells crossing the larger reference circle, and to one half of that for the cells crossing the smaller circle (of half the radius).
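A sketch of this construction, assuming the scaling coefficient of Eq. (14) is supplied externally (its exact dependence on the plane index is given by (14) above and is not reproduced here):

```python
import numpy as np

def impulse_centers_irregular(cell_centers, scale):
    """Sketch of the relative-distance construction for an irregular layout
    (Fig. 22).  `cell_centers` is an (N, 2) array of arbitrarily (e.g.,
    randomly) placed cell centers; `scale` stands for the uniform scaling
    coefficient of Eq. (14) for the distance plane of interest.  The impulse
    centers are the cell centers scaled uniformly about the origin, so the
    two layouts are geometrically similar and the relative distance is the
    same for every cell."""
    return scale * np.asarray(cell_centers, dtype=float)
```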
For future work, we plan to investigate more complicated grayscale testing images and the influence of the adjustment parameter
In order to describe an integral image by elemental building blocks, we have suggested the one- and two-dimensional reference functions. The proposed functions provide the synthesis/analysis of an integral image by distance with controllable accuracy, as an alternative to the known technique of composition/decomposition by view images (directions). The results are confirmed by simulation. It is also experimentally confirmed that the synthesized image can be displayed on an autostereoscopic display. In the simulations and experiments, step functions with two levels are used for the reference functions and for the testing geometric objects. Beyond the general interest (the structural elements of multiview and integral images), the proposed reference functions can be used in practical applications such as depth extraction [14], three-dimensional shape extraction [15], transformations of integral images, and so on. The effect of discreteness due to the finite size of pixels is analyzed, and the preferable sizes of cells are determined. Layouts other than rectangular are also discussed.
The proposed analysis could probably substitute for the search of corresponding points in rectified images, which is aimed at the same goal: reconstructing a three-dimensional structure from images. Also, a direct measurement of depth in a three-dimensional scene photographed through a lens array (instead of a regular camera lens) could be useful.