HAL Id: hal-00340194
https://hal-supelec.archives-ouvertes.fr/hal-00340194
Submitted on 20 Nov 2008
Onboard Hyperspectral images compression with
exogenous quasi optimal coding transforms
Isidore Paul Akam Bita, Michel Barret, Florio Dalla Vedova, Jean-Louis
Gutzwiller
To cite this version:
Isidore Paul Akam Bita, Michel Barret, Florio Dalla Vedova, Jean-Louis Gutzwiller. Onboard Hyperspectral images compression with exogenous quasi optimal coding transforms. On-Board Payload Data Compression Workshop (ESA OBPDC 2008), Jun 2008, Noordwijk, Netherlands. ⟨hal-00340194⟩
Onboard Hyperspectral Images Compression with Exogenous Quasi
Optimal Coding Transforms
Isidore Paul Akam Bita(1), Michel Barret(2), Florio Dalla Vedova(1), Jean-Louis Gutzwiller(2)
(1) LUXSPACE Sarl, Chateau de Betzdorf, L-6815, Betzdorf, Luxembourg Tel. +352 267.890.8046/267.890.4024 – Fax. +352 267.890.4029
[email protected], [email protected]
(2) SUPELEC Campus de Metz , IMS Team, 2 rue Edouard Belin, 57070 Metz, France Tel. +33 387.764.731/387.764.709 – Fax.+33 3.87.76.47.22
[email protected], [email protected]
This work is financially supported by ESA through the Innovation Triangle Initiative (ITI)
INTRODUCTION
In recent years, research on multi-component image compression has expanded, driven by the development of multi-spectral and hyper-spectral image sensors that supply ever larger amounts of data.
The end-users of such images are also increasingly numerous and diverse. Future earth observation systems will use multi-, super- or hyper-spectral image sensors with higher resolutions, leading to larger amounts of transmitted data. However, the channel bandwidth for transmission is limited, and it is therefore of great interest to design onboard compression systems for such images that are compatible with the diversity of end-users’ needs.
The JPEG2000 standard is well known and widely used today. Moreover, the Karhunen-Loève Transform (KLT) used in Part 2 of JPEG2000 is considered the best existing lossy compression technique for hyper-spectral images at medium and high bit rates [1],[2]. The KLT consists of a Principal Component Analysis (PCA), well known to statisticians, in which all the components are kept. However, the rather high computational complexity of the KLT hinders its adoption in practice, especially on satellite platforms, and recent works propose different solutions to circumvent this problem. One approach consists in reducing the complexity of the covariance matrix computation. This is done by randomly sampling the entire image in order to obtain a small sample of the pixel population on which the covariance matrix is computed [2], [3]. Another approach consists in computing a kind of average KLT on a set of images (the learning basis) acquired by a single sensor and using it on other images obtained with the same sensor. This sub-optimal transform is called exogenous KLT in [4], and the computational complexity of this second approach seems compatible with satellite platforms.
Both approaches are fruitful: the rate-distortion performance sacrifice compared with the true KLT is very slight, whereas the computational burden is significantly reduced. In the second approach, the exogenous KLT matrix is known by the decoder, hence there is no need to transmit it.
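The first approach can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions, not the procedure of [2] or [3]: the function name `klt_matrix`, the `sample_fraction` parameter and the synthetic data are hypothetical, and the image is assumed to be stored as an N × L array with one row per spectral band.

```python
import numpy as np

def klt_matrix(X, sample_fraction=1.0, seed=None):
    """Return an orthogonal N x N KLT matrix for an N x L image matrix X.

    Rows of the result are eigenvectors of the spectral covariance matrix,
    sorted by decreasing eigenvalue. If sample_fraction < 1, the covariance
    is estimated on a random subset of the pixels (columns) only.
    """
    rng = np.random.default_rng(seed)
    N, L = X.shape
    if sample_fraction < 1.0:
        n_keep = min(L, max(N + 1, int(sample_fraction * L)))
        cols = rng.choice(L, size=n_keep, replace=False)
        X = X[:, cols]
    cov = np.cov(X)                          # N x N, each row is one band
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T

# Synthetic 4-band "image" with strongly correlated bands (hypothetical data).
rng = np.random.default_rng(0)
N, L = 4, 20000
mixing = np.ones((N, N)) + 0.5 * np.eye(N)
X = mixing @ rng.standard_normal((N, L))

A_full = klt_matrix(X)                               # covariance on all pixels
A_sub = klt_matrix(X, sample_fraction=0.05, seed=1)  # covariance on 5% of pixels

cov_full = np.cov(A_full @ X)   # diagonal up to rounding errors
cov_sub = np.cov(A_sub @ X)     # only approximately diagonal
```

The sub-sampled transform is still exactly orthogonal; only the decorrelation of the full image becomes approximate, which is the trade-off described above.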
It is often taught in coding courses that the KLT is optimal in transform coding. Nevertheless, it is well known that the optimality of the KLT is proven only for Gaussian data [5], [6] and that it can be sub-optimal for non-Gaussian data [7]. Now, under the high resolution quantization hypothesis, nearly everything is known about the performance of transform coding with entropy-constrained scalar quantization and mean square distortion. It is then straightforward to find a criterion that, when minimized, gives the optimal linear transform under the above-mentioned conditions. Nevertheless, computing the optimal transform is generally considered a difficult task [8], and the Gaussian assumption is then used in order to simplify the calculations. In [9] the problem of computing the optimal transform for still images is solved under the high-rate entropy-constrained scalar quantization hypothesis. Moreover, in [10] these optimal transforms are applied to multi- and hyper-spectral images for reducing spectral redundancy when the quasi-orthonormal 2D discrete wavelet decomposition (DWT), known as Daubechies (9,7), is applied to each component for reducing spatial redundancy. Furthermore, for multi-component images and under the same hypotheses as above, the authors clarified in [11] the criterion that, when minimized, gives the optimal linear transform for reducing spectral redundancy when it is associated with the Daubechies (9,7) 2D DWT, according to a compression scheme compatible with the JPEG2000 Part 2 standard.
They also introduced two algorithms that compute the optimal transform, one with the orthogonality constraint and the other with no constraint other than invertibility. In the following, we shall call these transforms OCT, for Optimal Coding Transform.
In this paper we adapt the notion of exogenous KLT to the OCT, introducing the exogenous quasi optimal coding transform (EQOCT). The rate-distortion performance of the EQOCT is compared with that of the KLT, the exogenous KLT and the OCT on Hyperion images. First, in our tests we used the Verification Model 9.1 (VM9) [12] of the JPEG2000 group, which is an efficient implementation of the JPEG2000 Part 2 standard.
We shall see that with this codec the EQOCT outperforms the exogenous KLT and even the KLT. However, the entropy coder EBCOT [13] of JPEG2000, associated with its Post Compression Rate-Distortion (PCRD) optimizer for the optimal bit-rate allocation between all the blocks of all the components, is too complex to be used by onboard compression systems. Consequently, the Consultative Committee for Space Data Systems (CCSDS) recommends the use of the Bit Plane Encoding (BPE) standard described in [14]. Hence, in our tests we then used an implementation of the CCSDS 122.0-B-1 Recommended Standard [15] by Hongqiang Wang, called myBPE, which is available at [16]. Although the CCSDS recommended standard can control the tradeoff between compressed data volume and reconstructed image quality, we have not yet succeeded in using mybpe in quality mode. This is why we applied an allocation between components which is not optimal at low and medium bit-rates and which tends to be asymptotically optimal at high bit-rates.
The first section gives a brief theoretical overview of the OCT (for more details see [17] or [11]). Then, we explain the exogenous transform concept and we give some details on its computation. The final section presents some results obtained on hyper-spectral images.
THE OPTIMAL CODING TRANSFORM
In order to be self-contained, we recall in this section previous results presented in [11] and [17]. We first give the compression scheme studied. Then, based on the Bennett approximation for high rate quantization, we deduce the criterion minimized by an optimal coding transform (OCT) of this compression scheme. The section ends with a brief overview of the algorithm used to obtain the linear transform that minimizes the criterion.
Some Definitions and Notations
We denote by $X$ a multicomponent image with $N$ spectral components $X_1, \ldots, X_N$. Each component $X_i$ is a 2D image with $N_L$ rows and $N_C$ columns. In order to simplify the notations and the mathematical expressions, we assume that each component is written as a row vector by scanning all its pixels row by row. Then $X$ is an $N \times L$ matrix, with $L = N_L N_C$. In the following, depending on the context, we shall interpret $X_i$ as a 2D image or as a row vector of length $L$.
In the following compression scheme, the 2D DWTs we use all have fixed coefficients (the Daubechies (9,7) DWT in our tests), whereas the linear transform applied in order to reduce spectral redundancy is adapted to the data.
The Separable Scheme
The description of that scheme can be summarized as follows:
• Coding. The same DWT is applied to each component $X_i$. We denote by $W$ the invertible $L \times L$ matrix associated with that DWT. The result of the 2D DWT applied to the entire image $X$ is $XW^T$. Then, a linear transform $A$ is applied in order to reduce the spectral redundancy between the components. Hence, the transformed coefficients are the elements of the matrix $Y = AXW^T$. The aim of the DWT (resp. transform $A$) is to reduce the spatial (resp. spectral) redundancy in each (resp. between) component(s). Then, the transformed coefficients are quantized and entropy coded.
• Decoding. Let $Y^q$ denote the matrix with the same dimensions as $Y$ containing the dequantized transformed coefficients. The mathematical inverse transforms are applied to $Y^q$ in order to reconstruct an approximation $\hat{X} = A^{-1} Y^q W^{-T}$ of the original image $X$.
Remark 1 It is worth noting that the order of the transformations (i.e., applying first the DWT then $A$, or first $A$ then the DWT) has no effect on the result, since $Y = A(XW^T) = (AX)W^T$. This is why we call that scheme separable.
Remark 2 The separable scheme is compatible with the JPEG2000 Part 2 standard and with the CCSDS recommended standard.
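The separability stated in Remark 1 and the decoding formula can be checked numerically with a small NumPy sketch. The matrices below are hypothetical stand-ins: `W` is a random orthogonal matrix used in place of the Daubechies (9,7) DWT matrix, `A` is an arbitrary well-conditioned invertible matrix, and quantization is skipped, so the reconstruction is exact up to rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 16                                      # 4 bands, 16 pixels per band
X = rng.standard_normal((N, L))                   # image, one row per band
A = rng.standard_normal((N, N)) + N * np.eye(N)   # invertible spectral transform
W, _ = np.linalg.qr(rng.standard_normal((L, L)))  # stand-in for the DWT matrix

# Coding: both orders of application give the same coefficients Y = A X W^T.
Y_dwt_first = A @ (X @ W.T)
Y_spectral_first = (A @ X) @ W.T

# Decoding without quantization: X_hat = A^{-1} Y W^{-T}.
Y_q = Y_dwt_first
X_hat = np.linalg.inv(A) @ Y_q @ np.linalg.inv(W).T
```

With a real codec, `Y_q` would differ from `Y` by the quantization error, and `X_hat` would only approximate `X`.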
As already mentioned, the 2D DWT has fixed coefficients and we look for a linear spectral transform that adapts to the encoded image in order to reduce the spectral redundancy as much as possible for coding, under the assumption of one scalar quantizer per subband and per spectral band. In the next subsection we state a criterion which gives, when minimized, an optimal linear transform for the separable scheme under high rate quantization assumptions.
The Criterion Minimized by an OCT
Theorem 1 For the separable scheme with a fixed 2D DWT, assuming high resolution quantization, the optimal spectral transform $A$ is an $N \times N$ matrix that minimizes the criterion:
$$C(A) = \sum_{j=1}^{N} \sum_{m=1}^{M} \pi_m H\big(Y_j^{(m)}\big) + \frac{1}{2}\log_2 \det \operatorname{diag}\big(A^{-T}A^{-1}\big), \qquad (1)$$
where $H(Y_j^{(m)})$ is the differential entropy of the $m$-th subband of the $j$-th spectral band, $M$ the number of subbands and $\pi_m$ the ratio of wavelet coefficients located in the $m$-th subband ($\sum_{m=1}^{M} \pi_m = 1$).
A proof can be found in [11] or [17], based on the high resolution quantization theory [18] and the following assumption: the components of the quantization noise are zero mean and uncorrelated.
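To make the structure of criterion (1) concrete, the following NumPy sketch evaluates it for a given transform. It is not the estimator used in [11] or [17]: as a simplifying assumption, each differential entropy is replaced by the entropy of a Gaussian with the same sample variance, and the function names, the subband partition and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_entropy_bits(v):
    """Differential entropy (bits) of a Gaussian with the sample variance of v."""
    return 0.5 * np.log2(2.0 * np.pi * np.e * np.var(v))

def criterion(A, XW, subbands):
    """Evaluate C(A) of Eq. (1) with Gaussian plug-in entropies (assumption).

    A        : N x N invertible spectral transform
    XW       : N x L matrix of wavelet coefficients X W^T
    subbands : list of M index arrays partitioning the L columns
    """
    Y = A @ XW
    N, L = Y.shape
    entropy_term = 0.0
    for idx in subbands:
        pi_m = len(idx) / L   # ratio of coefficients in subband m
        entropy_term += pi_m * sum(gaussian_entropy_bits(Y[j, idx]) for j in range(N))
    A_inv = np.linalg.inv(A)
    G = A_inv.T @ A_inv       # A^{-T} A^{-1}
    # det diag(G) is the product of the diagonal entries of G.
    penalty = 0.5 * np.sum(np.log2(np.diag(G)))
    return entropy_term + penalty

# Hypothetical data: 4 strongly correlated bands, 4 equally sized "subbands".
rng = np.random.default_rng(0)
N, L = 4, 8192
mixing = np.ones((N, N)) + 0.1 * np.eye(N)
XW = mixing @ rng.standard_normal((N, L))
subbands = np.array_split(np.arange(L), 4)

eigvals, eigvecs = np.linalg.eigh(np.cov(XW))
A_klt = eigvecs.T   # orthogonal decorrelating transform

c_identity = criterion(np.eye(N), XW, subbands)
c_klt = criterion(A_klt, XW, subbands)
```

On this Gaussian toy data the decorrelating transform gives a much lower criterion value than the identity. For real, non-Gaussian wavelet coefficients the entropies would have to be estimated from the data, which is where an OCT can differ from the KLT.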
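Remark 3 Since $\sum_{m=1}^{M} \pi_m = 1$, the previous criterion $C(A)$ can be expressed as
$$C(A) = \sum_{m=1}^{M} \pi_m \left[ \sum_{i=1}^{N} H\big(Y_i^{(m)}\big) - \log_2 |\det A| \right] + \frac{1}{2}\log_2 \frac{\det \operatorname{diag}\big(A^{-T}A^{-1}\big)}{\det\big(A^{-T}A^{-1}\big)} = \sum_{m=1}^{M} \pi_m C_{ICA}^{(m)}(A) + C_O(A), \qquad (2)$$
where, for $1 \le m \le M$, $C_{ICA}^{(m)}(A) = \sum_{i=1}^{N} H\big(Y_i^{(m)}\big) - \log_2 |\det A|$ is the criterion of ICA applied to the $N$ components of the transformed coefficients that belong to the subband $m$.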
The relation (2) shows that the criterion C(A) of the separable scheme takes into consideration the fact that one quantizer per subband and per component is allocated. It is also important to notice that the criterion C(A) involves the transformed coefficients Y. Therefore, even for the separable scheme (where the order of application of the 2D DWT and the spectral transform does not matter), the search for the optimal spectral transform must be done after the 2D DWT.
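The passage from (1) to (2) can be verified in a few lines. The following is a sketch of that verification; it uses only the identity $\sum_{m} \pi_m = 1$ and elementary determinant properties, with $C_O(A)$ written out explicitly.

```latex
\begin{align*}
\sum_{m=1}^{M} \pi_m \log_2 |\det A| &= \log_2 |\det A|
  \qquad \text{since } \sum_{m=1}^{M} \pi_m = 1, \\
\tfrac{1}{2} \log_2 \det\!\left(A^{-T}A^{-1}\right)
  &= \tfrac{1}{2} \log_2 (\det A)^{-2} = -\log_2 |\det A|, \\
C_O(A) = \tfrac{1}{2} \log_2
  \frac{\det \operatorname{diag}\!\left(A^{-T}A^{-1}\right)}{\det\!\left(A^{-T}A^{-1}\right)}
  &= \tfrac{1}{2} \log_2 \det \operatorname{diag}\!\left(A^{-T}A^{-1}\right) + \log_2 |\det A|.
\end{align*}
```

Subtracting $\log_2 |\det A|$ inside the weighted sum and adding it back through $C_O(A)$ therefore leaves (1) unchanged, which gives (2). When $A$ is orthogonal, $A^{-T}A^{-1}$ is the identity, so $C_O(A) = 0$ and $C(A)$ reduces to the weighted sum of the per-subband ICA criteria. More generally, Hadamard's inequality for positive definite matrices gives $C_O(A) \ge 0$.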
Minimization of the Criterion
A quasi-Newton method has been introduced in [11] to minimize the criterion of (1) with a variant that minimizes the criterion under the constraint of an orthogonal transform. These algorithms are based on [20] and [19] and are quite similar to the ones presented in [9]. In the following, we shall consider only the orthogonal optimal transform, which will be called OCT.
THE EXOGENOUS OPTIMAL TRANSFORM
In this section, we explain how the Exogenous Quasi-Optimal Coding Transform (EQOCT) can be computed from a set of several hyper- (or multi-) spectral images acquired with the same camera and, consequently, having the same number of spectral bands. First, the set of images is partitioned into two disjoint subsets (i.e., their intersection is the empty set): one is called the learning basis and the other the test basis. Then, all the images of the learning basis are concatenated to give a very large image still having $N_C$ columns. Finally, the EQOCT is returned by the algorithm mentioned at the end of the previous section when it is applied to this very large image. Note that computing the EQOCT in this way is very complex (the complexity of the algorithm computing the OCT is evaluated in [11] or [17]); this complexity could be significantly reduced by randomly sub-sampling the pixels of the large image, as is done in [3] or [2].
Remark 4 It is clear that the exogenous transform computed in this way is not optimal for each image of the test basis. However if the pixels of the learning basis images constitute a statistically significant sample of pixels that can be acquired with the camera, one can expect that the EQOCT performances in coding could be close to the ones obtained by the true OCT for each image of the test basis.
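The learning procedure can be sketched as follows. Since the quasi-Newton OCT algorithm of [11] is not reproduced here, the sketch swaps in a plain KLT, so what it computes is an exogenous KLT rather than the EQOCT. The function name `exogenous_klt`, the array layout (bands, rows, columns) and the synthetic images are assumptions made for illustration.

```python
import numpy as np

def exogenous_klt(learning_images):
    """Learn one orthogonal N x N transform from several images.

    learning_images: list of arrays of shape (N, rows_k, cols) sharing the
    same number of bands N and of columns; rows_k may differ per image.
    """
    N = learning_images[0].shape[0]
    # Stack the images vertically: one very large image with the same columns.
    big = np.concatenate(learning_images, axis=1)
    X = big.reshape(N, -1)                   # row-by-row scan of each band
    eigvals, eigvecs = np.linalg.eigh(np.cov(X))
    return eigvecs[:, np.argsort(eigvals)[::-1]].T

# Hypothetical "sensor": all images share the same spectral mixing matrix.
rng = np.random.default_rng(0)
N, cols = 4, 64
mixing = rng.standard_normal((N, N))

def synthetic_image(rows):
    return (mixing @ rng.standard_normal((N, rows * cols))).reshape(N, rows, cols)

images = [synthetic_image(r) for r in (80, 64, 72, 64)]
learning, test = images[:2], images[2:]      # disjoint learning and test bases

A_exo = exogenous_klt(learning)
Y = A_exo @ test[0].reshape(N, -1)           # apply to an unseen test image
corr = np.corrcoef(Y)
max_offdiag = np.abs(corr - np.eye(N)).max() # residual spectral correlation
```

On this toy data the residual correlation on the test image is small because learning and test pixels follow the same statistics, which is the condition discussed in Remark 4.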
SIMULATIONS AND RESULTS
We have freely downloaded ten Hyperion hyperspectral images (radiometrically corrected Level 1 images) from the NASA website [21]. Each image has 242 spectral bands (from the visible to the infra-red spectrum), of which only 198 are calibrated channels. Each image has 256 columns; however, the number of rows varies from image to image. In our simulations, we chose to use 45 of the 50 calibrated channels in the visible and near infrared (VNIR) spectrum, since all these spectral bands come from a single spectrometer. The images are originally encoded as two-byte signed integers, i.e. with 16 bits per pixel and per band (bpppb).
We have partitioned the set of ten Hyperion images into two disjoint subsets of five images each (one subset is the learning basis and the other the test basis). Among the $\binom{10}{5} = 252$ different possibilities for choosing the learning basis, we tried four, corresponding to the bases n°1 to 4. Then, for each learning basis, we computed the Exogenous KLT (EKLT) and the EQOCT. Finally, we evaluated the bit-rate versus Signal to Noise Ratio (SNR) of the KLT, the OCT, the EKLT and the EQOCT for each image of the test basis. Table 1 lists the Hyperion images used for the study, with the number of rows per component for each image, and Table 2 gives the images that constitute each learning basis from which the EKLT and the EQOCT are calculated.
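The count of possible learning bases and the test basis associated with a given learning basis can be reproduced with a few lines of Python. The image names are those of Table 1 and the learning basis is basis n°1 of Table 2; the variable names are only illustrative.

```python
from math import comb

images = ["Iranbam", "Tucson", "Maryland", "Cuprite", "Dongting",
          "Maine", "Oklahoma", "Bay", "Caledonie", "Paloalto"]

# Number of ways to choose a 5-image learning basis among the 10 images.
n_choices = comb(len(images), 5)

# Learning basis n°1 (Table 2); the test basis is its complement.
learning_basis_1 = ["Maine", "Oklahoma", "Bay", "Caledonie", "Paloalto"]
test_basis_1 = [name for name in images if name not in learning_basis_1]
```

Here the test basis of learning basis n°1 consists of the five images that form learning basis n°2.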
The evaluation of the bit-rate versus SNR was made with the Verification Model VM9, which is an efficient implementation of the JPEG2000 Part 2 standard, and with mybpe which has been introduced in the introduction of this paper.
Performances obtained with the VM9
The VM9 performs both the quantization and the allocation between blocks with its Post Compression Rate-Distortion optimizer, leading to an optimal allocation at all rates. The matrices of the spectral transform and its inverse can be given to the VM9, which computes the transformed image $Y = AXW^T$.
We introduce the Generalized Coding Gain (GCG) of a spectral transform A with respect to another one A0: it is the function of the bit-rate defined as the SNR (at that bit-rate) obtained with A for reducing spectral redundancy minus the SNR (at the same bit-rate) obtained with A0. It is expressed in dB. For a given learning basis, we have computed the GCG of the EQOCT with respect to the KLT, the EKLT and the OCT. In Fig. 1 we show these three GCGs averaged over all the images of the test bases for the learning bases n°1 and n°2. In Tab. 3, we show the average GCG with respect to the KLT, obtained by averaging the GCG over all the images of the test basis, for four different learning bases. We also show the best and the worst GCG obtained with respect to the KLT. Note that the distortion obtained with the VM9 at bit-rates higher than about 3 bpppb is degraded by a quantization to 16 bpppb of the image after the spectral transform and before the 2D DWT (this can be observed on the curves of Fig. 2).
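The GCG definition translates directly into code once both rate-SNR curves are available. The sketch below assumes that each curve is given as sampled (rate, SNR) points and uses linear interpolation to evaluate both at the same bit-rates; the interpolation choice, the function name and the numbers are illustrative assumptions, not measurements from this study.

```python
import numpy as np

def generalized_coding_gain(rates, rate_a, snr_a, rate_ref, snr_ref):
    """GCG (dB) of a transform with respect to a reference transform.

    Each curve is a set of (rate, SNR) samples with increasing rates; both
    are linearly interpolated at the requested bit-rates (bpppb).
    """
    snr_a_at = np.interp(rates, rate_a, snr_a)
    snr_ref_at = np.interp(rates, rate_ref, snr_ref)
    return snr_a_at - snr_ref_at

# Made-up curves, for illustration only.
rate_a, snr_a = np.array([0.5, 1.0, 2.0]), np.array([30.0, 36.0, 44.0])
rate_ref, snr_ref = np.array([0.5, 1.5, 2.0]), np.array([29.0, 39.0, 43.0])

rates = np.array([1.0, 2.0])
gcg = generalized_coding_gain(rates, rate_a, snr_a, rate_ref, snr_ref)
```

A positive value means that the first transform reaches a higher SNR than the reference at that bit-rate. Averaging such vectors over the images of a test basis gives the kind of averaged GCG described above.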
Performances obtained with mybpe
The input data accepted by the program mybpe must be 2D images; therefore mybpe does not apply any spectral transform, and the bit-rate allocation between spectral bands must be done before sending the data to mybpe. As mentioned in the introduction, since we have not yet been able to use mybpe in quality mode, we chose in our tests the following rate allocation between spectral bands, which is optimal only at high bit-rates and not at low or medium rates. Let us introduce the variance $\sigma_{[AX]_i}^2$ of the $i$-th component of the image $AX$ obtained after applying the spectral transform and before applying the 2D DWT. The rate $R_i$ ($1 \le i \le 45$) of the $i$-th spectral band is taken as $R_i = \log_2(\sigma_{y_i}) + \mathrm{Const}$, where $\sigma_{y_i}$ is the corresponding standard deviation and Const is a real constant that does not depend on the spectral band. The bit-rate versus SNR curve is then obtained by varying Const. One must also take care to avoid very high or very low rates for some components. Indeed, a rate greater than the lossless compression rate for a component is of no interest. Likewise, one must avoid rates below zero for any component, since a negative rate is meaningless in compression. Accordingly, for the purpose of evaluating rate versus SNR at different rates, we define a minimum and a maximum threshold for each component.
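The allocation rule and its thresholds can be sketched in a few lines of NumPy. The function name, the threshold values and the standard deviations below are hypothetical; in particular, the maximum threshold stands in for the lossless rate of each component, which would have to be measured.

```python
import numpy as np

def allocate_rates(sigma, const, r_min, r_max):
    """Per-band rates R_i = log2(sigma_i) + const, clipped to [r_min, r_max].

    sigma        : standard deviations of the spectrally transformed bands
    const        : band-independent constant, swept to trace the rate-SNR curve
    r_min, r_max : scalars or per-band arrays of rate thresholds
    """
    rates = np.log2(np.asarray(sigma, dtype=float)) + const
    return np.clip(rates, r_min, r_max)

# Hypothetical standard deviations for four transformed bands.
sigma = np.array([64.0, 16.0, 4.0, 1.0])

# Unclipped rates would be [4, 2, 0, -2]; clipping removes the negative
# rate and caps the first band at the (assumed) maximum threshold.
rates = allocate_rates(sigma, const=-2.0, r_min=0.0, r_max=3.5)
mean_rate = rates.mean()
```

Sweeping `const` over a range of values and coding each band at its allocated rate yields one point of the rate versus SNR curve per value of `const`.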
The scalar quantization step is chosen so that all the coefficients in each component, after applying the spectral transform to the original image, remain in the range of values of the original image. Fig. 2 shows the rate vs SNR of four different images for different spectral transforms with the VM9 and mybpe.
Discussion
It is not surprising to see that the OCT performs better than the KLT with the VM9. The pleasant surprise is that the EQOCT performs very well: it is close to the OCT at all rates (the sacrifice is generally less than 1 dB) and it outperforms the EKLT and even the KLT on average. The advantage is actually larger than it appears, since the VM9 includes the coefficients of the inverse transform in the bitstream, which is unnecessary for exogenous transforms known by the decoder. It is also relevant to see (as is well known [1], [2]) the importance of applying a spectral transform, by comparing the performance of the OCT with that of the Identity transform (i.e., no spectral transform).
The performances we have obtained so far with mybpe are not favorable to the EQOCT and are very far from the ones obtained with the VM9, especially at low and medium bit-rates. We think that this is mainly due to the allocation between spectral bands, which can be improved with a computational complexity compatible with satellite platforms (see e.g. [22]). In the future, we shall investigate that point.
CONCLUSION
In this paper we have proposed an approach using a fixed spectral transform for multi-component image compression. The exogenous transform is computed from a learning basis of images. The results obtained with the implementation of the JPEG2000 Part 2 standard are encouraging, since the performance is close to that obtained with the OCT for each image and outperforms the KLT on average. The first results obtained with an implementation of the CCSDS 122.0-B-1 recommended standard are less favorable; however, we think that they can be significantly improved with a better allocation between spectral bands. The tests have been made on calibrated images. Next, we plan 1) to test raw Level-0 images in order to be as close as possible to real applications, 2) to improve the bit allocation between spectral bands with the BPE, 3) to extend this lossy method to lossless compression in order to propose a full on-board compression system and 4) to evaluate other distortion measures, such as the ones presented in [23].
References
[1] Q. Du and J. E. Fowler, “Hyperspectral image compression using JPEG2000 and principal component analysis”, IEEE Geoscience and Remote Sensing Letters, vol. 4, pp. 201–205, Apr. 2007.
[2] B. Penna, T. Tillo, E. Magli and G. Olmo, “Transform coding techniques for lossy hyperspectral data compression”, IEEE Trans. Geoscience and Remote Sensing, vol. 45, no. 5, May 2007.
[3] Q. Du and J. E. Fowler, “Low-complexity principal component analysis for hyperspectral image compression”, International Journal of High Performance Computing Applications, to appear.
[4] C. Thiebaut, D. Lebedeff, C. Latry and Y. Bobichon, “On-board compression algorithm for satellite multispectral images”, Proceedings of the Data Compression Conference, Snowbird, Mar. 28-30, 2006.
[5] J.-Y. Huang and P. M. Schultheiss, “Block quantization of correlated Gaussian random variables,” IEEE Trans. Communication, vol. COM-11, pp. 289–296, Sep. 1963.
[6] V. K. Goyal, J. Zhuang and M. Vetterli, “Transform coding with backward adaptive updates,” IEEE Trans. Information Theory, vol. 46, pp. 1623–1633, Jul. 2000.
[7] M. Effros, H. Feng and K. Zeger, “Suboptimality of the Karhunen-Loève transform for transform coding,” IEEE Trans. on Information Theory, vol. 50, no. 8, pp. 1605–1619, Aug. 2004.
[8] S. Mallat and F. Falzon, “Analysis of Low Bit Rate Image Transform Coding,” IEEE Trans. Signal Processing, vol. 46, no. 4, pp. 1027–1042, Apr. 1998.
[9] M. Narozny, M. Barret and D.-T. Pham, “ICA based algorithms for computing optimal 1-D linear block transforms in variable high-rate source coding”, Signal Processing, vol. 88, no. 2, pp. 268–283, Feb. 2008.
[10] I. P. Akam Bita, M. Barret and D.-T. Pham, “Compression of Multicomponent Satellite Images Using Independent Component Analysis”, Proceedings of the Sixth International Conference on Independent Component Analysis and Blind Signal Separation (ICA 2006), Charleston, USA, vol. LNCS 3889, pp. 335–342, Mar. 2006.
[11] I. P. Akam Bita, M. Barret and D.-T. Pham, “Transformations linéaires optimales à hauts débits pour la compression d’images multi-composantes selon la norme JPEG2000”, Proceedings of the 21st GRETSI colloquium, pp. 489–492, Troyes (France), Sep. 11-14, 2007.
[12] JPEG2000 Verification Model 9.1 (Technical description), ISO/IEC JTC 1/SC 29/WG 1 WG1 N2165, Jun. 2001.
[13] D. S. Taubman and M. W. Marcellin, JPEG2000 Image compression fundamentals, standards and practice, Kluwer Academic Publishers, London, 2002.
[14] Image data compression, Report Concerning Space Data System Standards, CCSDS 120.1-G-1, Green Book, June 2007.
[15] Image Data Compression, Recommendation for Space Data System Standards, CCSDS 122.0-B-1, Blue book, Nov. 2005.
[16] http://hyperspectral.unl.edu/
[17] I.P. Akam Bita, Sur l’application de l’analyse en composantes indépendantes à la compression des images multi composante, thèse de l’Université J. Fourier, Grenoble 2007.
[18] A. Gersho and R. M. Gray, Vector quantization and signal compression, Kluwer Academic Publisher, 1992.
[19] D.-T. Pham, “Fast Algorithms for Mutual Information Based Independent Component Analysis”, IEEE Transaction on Signal Processing, Vol. 52, No. 10, pp. 2690–2700, 2004.
[20] D.-T. Pham, “Entropy of Random Variable Slightly Contaminated with Another”, IEEE Signal Processing Letters, Vol. 12, No. 7, pp. 536–539, 2005.
[21] http://eo1.usgs.gov/samples.php
[22] C. Thiebaut, E. Christophe, D. Lebedeff and C. Latry, CNES Studies of On-Board Compression for Multispectral and Hyperspectral Images, SPIE, Satellite Data Compression, Communications and Archiving III, 2007.
[23] E. Christophe, D. Léger and C. Mailhes, “Quality criteria benchmark for hyperspectral imagery”, IEEE Trans. on Geoscience and Remote Sensing, vol. 43, no. 9, pp. 2103–2114, Sep. 2005.
Table 1: The number of rows per band for the 10 Hyperion Hyperspectral images.
Images Iranbam Tucson Maryland Cuprite Dongting Maine Oklahoma Bay Caledonie Paloalto
Rows 7149 5585 3352 3184 3129 3129 3129 3128 3127 2905
Table 2: The images of learning bases used to calculate the exogenous transforms.
Basis n°1 Maine Oklahoma Bay Caledonie Paloalto
Basis n°2 Iranbam Tucson Maryland Cuprite Dongting
Basis n°3 Maine Cuprite Bay Tucson Paloalto
Basis n°4 Maine Oklahoma Maryland Cuprite Dongting
Figure 1: Generalized Coding Gain (GCG) of the EQOCT with respect to the KLT, with respect to the EKLT and with respect to the OCT, averaged on all the test basis images, for learning bases n°1 (left) and n°2 (right).
Table 3: The best, worst and averaged Generalized Coding Gains (in dB) of the EQOCT with respect to the KLT obtained with the VM9.
Rate (bpppb) 0.25 0.5 0.75 1 1.5 2 2.5 3 4
best case 0.48 0.52 0.46 0.43 0.4 0.39 0.38 0.38 0.34
worst case -1.27 -0.77 -0.59 -0.49 -0.44 -0.46 -0.48 -0.49 -0.48
average (basis n°1) 0.05 0.2 0.2 0.19 0.18 0.17 0.17 0.17 0.15
average (basis n°2) 0.04 0.18 0.19 0.19 0.19 0.18 0.18 0.17 0.15
average (basis n°3) 0.29 0.33 0.28 0.26 0.25 0.25 0.25 0.25 0.23
average (basis n°4) 0.27 0.38 0.36 0.34 0.33 0.32 0.31 0.32 0.29
[Figure 2, four panels: SNR (dB) versus rate (bpppb), rates from 0 to 4 bpppb; curves VM9-Id, VM9-KLT, VM9-OTC, VM9-ET, CCSDS-Id, CCSDS-ET.]
Figure 2: Rate vs SNR for different images, from left to right and from top to bottom: Paloalto, Maine, Maryland and Tucson.