Multiple Description Coding Using Directional Discrete Cosine Transform
 Author: Lama Ramesh Kumar, Kwon GooRak
 Organization: Lama Ramesh Kumar; Kwon GooRak
 Publish: Journal of information and communication convergence engineering, Volume 11, Issue 4, p293~297, 31 Dec 2013

ABSTRACT
Delivering high-quality video over a wide area network with a large number of users poses great challenges for video communication systems. To ensure video quality, multiple description coding has recently attracted considerable attention as a way of encoding and delivering visual information over wireless networks. In this paper, we propose a new, efficient multiple description coding (MDC) technique. Quincunx lattice subsampling is used to generate multiple descriptions of an image, and a directional discrete cosine transform (DCT) is applied to each subsampled quincunx lattice to create the MDC representation. On the decoder side, the image is decoded from the received descriptions. If all the descriptions arrive successfully, the image is reconstructed by combining them. However, if only one side description is received, the missing samples are recovered by interpolation. The experimental results show that the directional DCT achieves a better coding gain and energy packing efficiency than the conventional DCT with realignment.

KEYWORD
Directional discrete cosine transform, Image coding, Multiple description coding

I. INTRODUCTION
Owing to network congestion and delay sensitivity, video transmission over a lossy network remains a great challenge. Multiple description coding (MDC) [1] is an attractive approach to this problem, as shown in Fig. 1. It can combat packet loss efficiently without any retransmission, thus satisfying the demands of real-time services and relieving network congestion.
MDC encodes the source message into several bit streams (descriptions) carrying different information, which can then be transmitted over multiple channels [2]. In the simplest form of MDC, two parallel channels are assumed to connect the source with the destination. If only one channel works, the description it carries can be decoded on its own with a guaranteed minimum fidelity in the reconstruction at the receiver [3]. When both channels work, the descriptions from the two channels are combined to yield a higher-fidelity reconstruction.
Numerous MDC techniques have been proposed in recent years, such as the multiple description scalar quantization (MDSQ) proposed in [2]. In MDSQ, two descriptions are created by two coarse quantizers, each ensuring an acceptable distortion when only one of them is received. These two coarse quantizers can be combined to produce a finer quantizer if both descriptions are received. Further, MDC has also been implemented with various types of coding techniques such as subband coding and wavelet coding [4-7].
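The idea of two coarse quantizers that refine each other can be illustrated with a pair of staggered uniform quantizers. This is only a minimal sketch of the principle, not the index-assignment design of [2]; the function names and the half-step offset are assumptions made for illustration.

```python
import math

def md_encode(x, step):
    """Quantize x with two uniform quantizers whose cells are offset by step/2."""
    i1 = math.floor(x / step)               # cell of quantizer 1: [i1*step, (i1+1)*step)
    i2 = math.floor((x + step / 2) / step)  # cell of quantizer 2: [i2*step - step/2, i2*step + step/2)
    return i1, i2

def side_decode_1(i1, step):
    """Reconstruct from description 1 only (midpoint of its cell)."""
    return (i1 + 0.5) * step

def side_decode_2(i2, step):
    """Reconstruct from description 2 only (midpoint of its cell)."""
    return i2 * step

def central_decode(i1, i2, step):
    """Reconstruct from both descriptions: midpoint of the intersection of the two cells."""
    lo = max(i1 * step, i2 * step - step / 2)
    hi = min((i1 + 1) * step, i2 * step + step / 2)
    return (lo + hi) / 2
```

In this sketch each side decoder has a maximum error of half a step, while the central decoder, which intersects the two cells, has a maximum error of a quarter of a step.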
In this paper, we revisit the MDC scheme based on pixel-domain subsampling. In particular, we focus on the quincunx subsampling lattice. Instead of applying a horizontal or vertical realignment so as to form regular square blocks, we retain the quincunx lattice and apply the directional discrete cosine transform (DDCT). Both theoretical analysis and simulation results are discussed to confirm that the DDCT achieves an improved coding efficiency compared with the traditional DCT with horizontal or vertical realignment.
In Section II, we briefly introduce the traditional pixel-domain sublattice approach to MDC. The proposed directionally sampled discrete cosine transform (DSDCT) for the quincunx subsampling lattice is presented in Section III. Further, we explain how to handle some boundary blocks that remain after the DSDCT. In Section IV, we describe the experimental setup and present some simulation results. Finally, some conclusions are presented in Section V.
II. SUBSAMPLING ON MDC
In this section, we discuss the sublattice technique used in the proposed method. Given the source image I, which is typically defined on a subset of Z^{2}, the signal samples are partitioned into two subsets.
There are two different methods to partition the image into two parts. Fig. 2(a) shows orthogonal subsampling, and Fig. 2(b) illustrates quincunx subsampling. In the proposed method, the descriptions are generated by the second scheme, quincunx subsampling. One of the major advantages of this scheme is the increase in correlation between samples. Under this scheme, two descriptions are generated according to a checkerboard pattern, and the Euclidean distance between two neighboring samples within a description is always √2. After the splitting process, each description is converted to the transform domain.
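The quincunx partition can be sketched as follows. This is a minimal illustration that assumes description 1 takes the samples where the row and column indices have an even sum; the function names and the zero fill for unowned positions are hypothetical choices, not taken from the paper.

```python
import numpy as np

def quincunx_split(image):
    """Split a 2D image into two quincunx (checkerboard) descriptions.

    Description 1 keeps samples where (row + col) is even, description 2
    keeps the rest. Positions a description does not own are set to 0.
    """
    rows, cols = np.indices(image.shape)
    mask1 = (rows + cols) % 2 == 0
    d1 = np.where(mask1, image, 0)
    d2 = np.where(mask1, 0, image)
    return d1, d2, mask1

def quincunx_merge(d1, d2, mask1):
    """Recombine two descriptions when both are received (central decoding)."""
    return np.where(mask1, d1, d2)
```

Without quantization, merging the two descriptions returns the original image exactly; with quantization, each merged pixel is only an approximation of the original.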
III. DIRECTIONAL COSINE TRANSFORM
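The DCT and the discrete wavelet transform used in image compression are implemented by separable one-dimensional (1D) transforms along the rows and columns of images.
The conventional N × N 2D DCT is implemented separably by two N-point 1D transforms. Let B(i, j)_{N × N} and C_{N × N} be the image block and the transform matrix, respectively. Then, the corresponding block of transformed coefficients B(u, v) can be expressed as follows:
B(u, v) = C \, B(i, j) \, C^{T}
where, in the standard orthonormal form,
C(u, i) = c(u) \cos\left( \frac{(2i + 1) u \pi}{2N} \right), with c(0) = \sqrt{1/N} and c(u) = \sqrt{2/N} for u = 1, ..., N - 1.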
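The separable 2D DCT described here can be checked numerically with a short sketch. It assumes the standard orthonormal DCT-II matrix; the function names are illustrative and not part of the paper.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C with C[u, i] = c(u) * cos((2i + 1) * u * pi / (2n))."""
    u = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * u * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # the DC row uses the smaller scale factor
    return c

def dct2(block):
    """Separable 2D DCT of a square block: 1D DCT on columns, then on rows."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def idct2(coeffs):
    """Inverse 2D DCT; C is orthogonal, so its inverse is its transpose."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c
```

Because the matrix is orthogonal, the transform preserves energy, and a constant block maps to a single nonzero DC coefficient.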
Naturally, the conventional 2D DCT is the best choice for image blocks in which vertical and/or horizontal edges dominate. However, it is less effective when applied to an image block in which edges in other directions dominate. The major shortcoming of the separable transform is that it cannot sparsely represent the anisotropic edges in the image. In order to obtain a better representation of edges in all directions, the given image block is transformed on the basis of the directional DCT shown in Fig. 3.
In the proposed method, there are five directional modes in total. Among these modes, one is the vertical prediction (mode 0), and the remaining ones are labeled diagonal down-right (mode 1), diagonal down-left (mode 2), vertical-right (mode 3), and horizontal-down (mode 4), as shown in Fig. 4(a), (b), (c), (d), and (e), respectively.
On the encoder side, the input image is first analyzed block by block to decide the transform directions. In the first step, a 1D DCT is performed in each block along the selected direction, taking the place of the vertical transform. In the second step, the horizontal DCT is applied.
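The first step of such a directional transform can be sketched for one mode. The example below applies a 1D DCT along each anti-diagonal of a square block, which is assumed here to correspond to the diagonal down-left mode; the paper's handling of the quincunx lattice, the second (horizontal) step, and the mode decision are not reproduced, and all names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    u = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * u * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def diagonal_first_step(block):
    """Apply a 1D DCT along each anti-diagonal (row + col = k) of a square block.

    Returns a list of 1D coefficient arrays, one per anti-diagonal, whose
    lengths run 1, 2, ..., N, ..., 2, 1.
    """
    n = block.shape[0]
    coeffs = []
    for k in range(2 * n - 1):
        rows = np.arange(max(0, k - n + 1), min(k, n - 1) + 1)
        cols = k - rows
        samples = block[rows, cols].astype(float)
        coeffs.append(dct_matrix(len(samples)) @ samples)
    return coeffs
```

When the block content is constant along the chosen direction, every anti-diagonal produces only a DC coefficient, which is the energy compaction that motivates transforming along the dominant edge direction.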
On the decoder side, when only one side description is received, the main task of the decoder is to interpolate the missing subimage from the received subimage. Although the proposed method involves the use of directional data, all pixels in a partitioned block share a common direction, and the lost description is estimated from the four connected neighbors by using the conventional bilinear interpolation method. Since there are numerous interpolation algorithms for preserving the original texture content of an image, the quality of the reconstructed image can be enhanced by selecting a more appropriate interpolation scheme.
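Side decoding by four-neighbor interpolation can be sketched as follows. The sketch assumes the received description is stored at full resolution together with a boolean mask of received positions, and that border pixels are averaged over whichever of their four neighbors exist; the names and the border handling are assumptions for illustration.

```python
import numpy as np

def interpolate_missing(desc, mask):
    """Fill positions where mask is False with the mean of the received 4-connected neighbors."""
    vals = np.where(mask, desc, 0).astype(float)
    weights = mask.astype(float)
    pv = np.pad(vals, 1, mode="constant")
    pw = np.pad(weights, 1, mode="constant")
    # Sum of the up, down, left, and right neighbors, and the count of received ones
    num = pv[:-2, 1:-1] + pv[2:, 1:-1] + pv[1:-1, :-2] + pv[1:-1, 2:]
    den = pw[:-2, 1:-1] + pw[2:, 1:-1] + pw[1:-1, :-2] + pw[1:-1, 2:]
    est = num / np.maximum(den, 1.0)
    return np.where(mask, vals, est)
```

On a quincunx lattice, all four connected neighbors of a missing interior pixel belong to the received description, so the estimate is a plain four-point average there.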
Similarly, when two descriptions are simultaneously available at the decoder, a straightforward method is to decode the two descriptions simultaneously and then merge the two subimages. Since each side description is compressed using quantization, any decoded pixel value from one description is only an approximation of the original.
IV. EXPERIMENTAL RESULTS
Several experiments were conducted, and their results are presented in this section in order to evaluate the performance of the proposed image MDC scheme. The implementation of the new MDC scheme is integrated into JPEG coding, with the directional transform replacing the original 2D separable rectilinear discrete cosine transform.
Naturally, the JPEG MDC scheme is used as a benchmark for performance comparisons in terms of the peak signal-to-noise ratio (PSNR), where the input image is first split into two descriptions by quincunx lattice subsampling and then coded individually by JPEG coding.
To make a fair comparison, we also adopt the proposed texture-oriented interpolation and data fusion algorithms for the central decoding of JPEG MDC.
Two JPEG test images (Lena and Barbara) with a resolution of 512 × 512 are used in our experiments. They are split into two descriptions, each of which is a quincunx lattice. Each description is compressed using JPEG coding and the proposed scheme.
We evaluate the PSNR performance of a side decoder when only one description is received. As proposed, the full-resolution images are reconstructed from the received side description by the texture-oriented interpolation method. For the sake of comparison, we also compute the PSNR results of the widely used linear interpolation method when applied to the received quincunx image, as shown in Fig. 5. Since the two descriptions are balanced in our experiments, it suffices to list the PSNR values for description 1 of the proposed MDC scheme. The PSNR values shown in this figure are calculated over all samples, including both the decoded and the interpolated pixels. The rates are also calculated over all samples in terms of a full-resolution image. One can observe that at low rates (e.g., 0.125 bpp), the two interpolation methods perform roughly the same, with only a small advantage to the texture-oriented method. This is due to the lack of high-frequency components in the received side description at a low rate.
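For reference, PSNR can be computed as in the following sketch. It assumes 8-bit images with a peak value of 255, which the paper does not state explicitly; the function name is illustrative.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a test image."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

The mean squared error is taken over every sample of the full-resolution image, so for side decoding it covers both the decoded and the interpolated pixels.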
V. CONCLUSIONS
In this paper, we proposed a new MDC scheme using the directional DCT. The input image was directly split into two descriptions in the pixel domain using quincunx lattice subsampling. Using the DDCT, we obtained a compact representation of image content oriented in different directions. The experimental results confirmed that the proposed directional MDC scheme could outperform the JPEG MDC scheme by up to 0.9 dB in PSNR in the cases of both side decoding and central decoding.

[Fig. 1.] Multiple description (MD) coding with two descriptions and three decoders.

[Fig. 2.] Two different pixel-domain subsampling lattices for multiple description coding: (a) orthogonal subsampling and (b) quincunx subsampling.

[Fig. 3.] Exemplified elementary matrix operation: (a) non-directional and (b) directional. The circles denote pixels, and the squares represent half-pixels.

[Fig. 4.] Five direction modes: (a) vertical prediction, (b) diagonal down-right, (c) diagonal down-left, (d) vertical-right, and (e) horizontal-down. The circles denote pixels, and the dashed lines represent direction lines.

[Fig. 5.] Experimental results. The horizontal axis is bits per pixel and the vertical axis is peak signal-to-noise ratio (PSNR). (a) PSNR of the received interpolated image with side decoder 1 for the Lena image. (b) PSNR of the received interpolated image with side decoder 1 for the Barbara image. (c) PSNR of the received interpolated image with side decoder 1 for the Boat image. MDC: multiple description coding.