CHAPTER 11
Compression
The information age exists because of the ease with which information travels. The fact that sound
and images can be sent around the world so easily is due to ingenious ways of compressing the
data that represents them. In this chapter we describe methods of data compression, and as
applications of the methods, we describe some modern ways to shrink sound and image files.
11.1 The Discrete Cosine Transform
The most heavily used image compression methods rely on a variant of the discrete Fourier
transform called the discrete cosine transform. This transform interpolates data using basis
functions that are all cosine functions, and involves only real computations.
The discrete cosine transform (DCT) is often used to compress small blocks of an image, as
small as 8 × 8 pixels. The compression is lossy, meaning that some information from the block is
ignored. The key feature of the DCT is that it helps organize the information so that the part that is
ignored is the part that the human eye is least sensitive to. More precisely, the DCT will show us
how to interpolate the data with a set of basis functions that are in descending order of importance
as far as the human visual system is concerned. Then we can drop the less important interpolation
terms, like a newspaper editor cuts a long story on deadline.
We will apply what we have learned about the DCT to compress full images. Using the added
tools of quantization and entropy-based coding, each 8 × 8 block of an image can be reduced to a bit
stream that is stored with bit streams from the other blocks of the image. The complete bit stream
is decoded when the image needs to be uncompressed and displayed, by reversing the encoding
process. We will describe this approach, called Baseline JPEG, the default method for storing JPEG
images.
The Modiﬁed Discrete Cosine Transform (MDCT) is the current standard for compression of
sound ﬁles. We will introduce MDCT and investigate its application for coding and decoding, which
provides the core technology of ﬁle formats such as MP3 and AAC. Finally, in analogy with the Fast
Fourier Transform of Chapter 10, we will show how to develop fast versions of the DCT and MDCT.
11.1.1 One-dimensional DCT
Let n be a positive integer. The one-dimensional discrete cosine transform of order n is defined by
the n × n matrix C whose entries are

(C)_{ij} = \sqrt{\frac{2}{n}} \begin{cases} \dfrac{1}{\sqrt{2}} & \text{if } i = 0,\ j = 0, \dots, n-1, \\[4pt] \cos\dfrac{i(2j+1)\pi}{2n} & \text{if } i > 0,\ j = 0, \dots, n-1 \end{cases} \qquad (11.1)
or

C = \sqrt{\frac{2}{n}}
\begin{bmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \cdots & \frac{1}{\sqrt{2}} \\
\cos\frac{\pi}{2n} & \cos\frac{3\pi}{2n} & \cdots & \cos\frac{(2n-1)\pi}{2n} \\
\cos\frac{2\pi}{2n} & \cos\frac{6\pi}{2n} & \cdots & \cos\frac{2(2n-1)\pi}{2n} \\
\vdots & \vdots & & \vdots \\
\cos\frac{(n-1)\pi}{2n} & \cos\frac{(n-1)3\pi}{2n} & \cdots & \cos\frac{(n-1)(2n-1)\pi}{2n}
\end{bmatrix}. \qquad (11.2)
When working with two-dimensional images, the convention is to begin with 0 instead of 1. The
notation will be easier if we extend this convention to matrix numbering, as you see we have done
in (11.1). For Chapter 11 only, we will use subscripts for n × n matrices that go from 0 to n − 1.
We will treat only the case where n is even in the following discussion.
Definition 11.1 The discrete cosine transform (DCT) of x = (x_0, \dots, x_{n-1})^T is the n-dimensional
vector y = (y_0, \dots, y_{n-1}), where

y = Cx. \qquad (11.3)
Note that C is a real orthogonal matrix, meaning that its transpose is its inverse:
C^{-1} = C^T = \sqrt{\frac{2}{n}}
\begin{bmatrix}
\frac{1}{\sqrt{2}} & \cos\frac{\pi}{2n} & \cdots & \cos\frac{(n-1)\pi}{2n} \\
\frac{1}{\sqrt{2}} & \cos\frac{3\pi}{2n} & \cdots & \cos\frac{(n-1)3\pi}{2n} \\
\vdots & \vdots & & \vdots \\
\frac{1}{\sqrt{2}} & \cos\frac{(2n-1)\pi}{2n} & \cdots & \cos\frac{(n-1)(2n-1)\pi}{2n}
\end{bmatrix}. \qquad (11.4)
The rows of an orthogonal matrix are pairwise orthogonal unit vectors. The orthogonality of C
follows from the fact that the columns of C T are the unit eigenvectors of the real symmetric n × n
matrix
\begin{bmatrix}
1 & -1 & & & \\
-1 & 2 & -1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & -1 & 2 & -1 \\
 & & & -1 & 1
\end{bmatrix}. \qquad (11.5)
Exercise 11.1.7 asks the reader to verify this fact.
This puts us right back in the context of Assumption 10.1 of Chapter 10. The Orthogonal
Function Interpolation Theorem 10.8 applied to the discrete cosine matrix C implies:
Theorem 11.2 DCT Interpolation Theorem. Let tj = j for j = 0, . . . , n − 1 and let x =
(x0 , . . . , xn−1 ) be a vector of n real numbers. Deﬁne y = (y0 , . . . , yn−1 ) = Cx, where C is the discrete
cosine transform. Then the real function
P_n(t) = \frac{1}{\sqrt{n}}\, y_0 + \frac{\sqrt{2}}{\sqrt{n}} \sum_{k=1}^{n-1} y_k \cos\frac{k(2t+1)\pi}{2n}
satisﬁes Pn (j) = xj for j = 0, . . . , n − 1.
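Theorem 11.2 is easy to check numerically. The sketch below uses Python with NumPy rather than the chapter's Matlab, and the helper names dct_matrix and dct_interpolant are ours, not the book's. It builds the matrix C of (11.1) for n = 4, confirms the orthogonality C Cᵀ = I, and confirms that the interpolant passes through the data points.

```python
import numpy as np

def dct_matrix(n):
    """Orthogonal DCT matrix of (11.1); rows and columns indexed from 0."""
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == 0:
                C[i, j] = np.sqrt(2.0 / n) / np.sqrt(2.0)
            else:
                C[i, j] = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    return C

def dct_interpolant(y, t):
    """P_n(t) of Theorem 11.2, built from the DCT coefficients y."""
    n = len(y)
    total = y[0] / np.sqrt(n)
    for k in range(1, n):
        total += np.sqrt(2.0 / n) * y[k] * np.cos(k * (2 * t + 1) * np.pi / (2 * n))
    return total

n = 4
C = dct_matrix(n)
print(np.allclose(C @ C.T, np.eye(n)))   # orthogonality of C
x = np.array([1.0, 0.0, -1.0, 0.0])
y = C @ x
print(np.allclose([dct_interpolant(y, j) for j in range(n)], x))  # P_n(j) = x_j
```

Since t is a free variable in dct_interpolant, the same routine evaluates the interpolating curve between the nodes as well.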
Thus the n × n matrix C transforms n data points into n interpolation coefﬁcients. Like the discrete
Fourier transform, the discrete cosine transform gives coefﬁcients for a trigonometric interpolation
function. Unlike the DFT, the DCT uses cosine terms only, and is deﬁned entirely in terms of real
arithmetic.
Figure 11.1: DCT interpolation and least squares approximation. The data points are (j, x_j)
where x = (1, 0, −1, 0). The DCT interpolating function P_4(t) of (11.7) is shown as a solid curve,
along with the least squares DCT approximation function P_3(t) of (11.8) as a dotted curve.
Example 11.1 Use the DCT to interpolate the points (0, 1), (1, 0), (2, −1), (3, 0).
The order-4 DCT multiplied by the data x = (1, 0, −1, 0) is
\frac{1}{\sqrt{2}}
\begin{bmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\cos\frac{\pi}{8} & \cos\frac{3\pi}{8} & \cos\frac{5\pi}{8} & \cos\frac{7\pi}{8} \\
\cos\frac{2\pi}{8} & \cos\frac{6\pi}{8} & \cos\frac{10\pi}{8} & \cos\frac{14\pi}{8} \\
\cos\frac{3\pi}{8} & \cos\frac{9\pi}{8} & \cos\frac{15\pi}{8} & \cos\frac{21\pi}{8}
\end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ -1 \\ 0 \end{bmatrix}
\approx
\begin{bmatrix} 0.0000 \\ 0.9239 \\ 1.0000 \\ -0.3827 \end{bmatrix}. \qquad (11.6)
According to Theorem 11.2, the function
P_4(t) = \frac{1}{2}\cdot 0 + \frac{1}{\sqrt{2}}\left[ 0.9239 \cos\frac{(2t+1)\pi}{8} + \cos\frac{2(2t+1)\pi}{8} - 0.3827 \cos\frac{3(2t+1)\pi}{8} \right] \qquad (11.7)
interpolates the four data points. The function P4 (t) is plotted in Figure 11.1.
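The arithmetic of Example 11.1 takes only a few lines to reproduce. This NumPy sketch (our translation, not the book's code) forms the order-4 DCT matrix and applies it to x = (1, 0, −1, 0); the coefficients agree with (11.6) to four decimal places.

```python
import numpy as np

n = 4
# Order-4 DCT matrix from (11.1): sqrt(2/n) times cosines, row 0 scaled by 1/sqrt(2).
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
C[0, :] /= np.sqrt(2.0)

x = np.array([1.0, 0.0, -1.0, 0.0])
y = C @ x          # approximately [0, 0.9239, 1, -0.3827], matching (11.6)
print(np.round(y, 4))
```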
Just as the DCT Interpolation Theorem 11.2 is an immediate consequence of Theorem 10.8, the
least squares result Theorem 10.9 of Chapter 10 shows how to find a DCT least squares approximation
of the data, using only part of the basis functions. Because of the orthogonality of the basis
functions, this can be accomplished by simply dropping the higher frequency terms.
Theorem 11.3 DCT Least Squares Approximation Theorem. Let tj = j for j = 0, . . . , n − 1 and
let x = (x0 , . . . , xn−1 ) be a vector of n real numbers. Deﬁne y = (y0 , . . . , yn−1 ) = Cx, where C
is the discrete cosine transform. Let 0 ≤ m ≤ n be another integer. Then the choice of coefﬁcients
y0 , . . . , yn−1 in
P_m(t) = \frac{1}{\sqrt{n}}\, y_0 + \frac{\sqrt{2}}{\sqrt{n}} \sum_{k=1}^{m-1} y_k \cos\frac{k(2t+1)\pi}{2n}

minimizes the squared approximation error \sum_{j=0}^{n-1} (P_m(j) - x_j)^2 of the n data points.
SPOTLIGHT ON: Orthogonality
The idea behind least squares approximations is that finding the shortest distance from a point to a plane
(or subspace in general) means constructing the perpendicular from the point to the plane. This
construction is carried out by the normal equations, as we saw in Chapter 2. In Chapters 10 and 11, this
concept is applied again and again to approximate as closely as possible with a relatively small set of
basis functions, resulting in compression. The message of the later chapters is to carefully choose the basis
functions to be orthogonal, as reflected in the rows of the discrete transform matrix. Then the normal
equations become computationally very simple.
Referring to Example 11.1, if we require the best least squares approximation to the same four
data points, but use the three basis functions
1, \quad \cos\frac{(2t+1)\pi}{8}, \quad \cos\frac{2(2t+1)\pi}{8}

only, the solution is

P_3(t) = \frac{1}{2}\cdot 0 + \frac{1}{\sqrt{2}}\left[ 0.9239 \cos\frac{(2t+1)\pi}{8} + \cos\frac{2(2t+1)\pi}{8} \right]. \qquad (11.8)
Figure 11.1 compares the least squares solution P3 with the interpolating function P4 .
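By orthogonality, the squared error that P_3 commits at the nodes is exactly the square of the dropped coefficient, here y_3^2: the residual P_3(j) − x_j is the discarded term of P_4, and the cosine basis row has unit length. The NumPy sketch below (our code; the helper name P is ours) checks this for the data of Example 11.1.

```python
import numpy as np

n = 4
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
C[0, :] /= np.sqrt(2.0)

x = np.array([1.0, 0.0, -1.0, 0.0])
y = C @ x

def P(t, y, m, n=4):
    """Least squares approximant of Theorem 11.3, keeping the first m coefficients."""
    s = y[0] / np.sqrt(n)
    for k in range(1, m):
        s += np.sqrt(2.0 / n) * y[k] * np.cos(k * (2 * t + 1) * np.pi / (2 * n))
    return s

# Squared error of P3 at the nodes; by orthogonality it equals y3^2.
err = sum((P(j, y, 3) - x[j]) ** 2 for j in range(n))
print(np.isclose(err, y[3] ** 2))
```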
Exercises 11.1
11.1.1. Write out the 2 × 2 DCT matrix and the general two-term DCT interpolation function P_2(t). Then find the
1D-DCT and the interpolating function for the data points (a) [3, 3] (b) [2, −2]. [Ans.: The DCT
matrix is C = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} and P_2(t) = \frac{1}{\sqrt{2}}\, y_0 + y_1 \cos\frac{(2t+1)\pi}{4}. (a) y = [3\sqrt{2}, 0], P_2(t) = 3 (b) y =
[0, 2\sqrt{2}], P_2(t) = 2\sqrt{2}\cos\frac{(2t+1)\pi}{4}.]
11.1.2. For n = 2, describe the m = 1 least squares DCT approximation in terms of the input data x = (x0 , x1 ).
11.1.3. Carry out the trigonometry to establish equations (11.10) and (11.11).
11.1.4. Find the 1D-DCT of the following data vectors x, and find the corresponding interpolating function P_n(t) for the
data points (i, x_i), i = 0, \dots, n − 1. You may state your answers in terms of the a, b, c defined in (11.11).
(a) [1, 0, −1, 0]
(b) [1, 1, 1, 1]
(c) [2, 0, −2, 0]
(d) [1, 2, 3, 4]
[Ans.: (a) y = [0, a(b + c), 1, a(c − b)], P_4(t) = a[a(b + c)\cos\frac{(2t+1)\pi}{8} + \cos\frac{2(2t+1)\pi}{8} + a(c − b)\cos\frac{3(2t+1)\pi}{8}] (b) y = [2, 0, 0, 0], P_4(t) = 1 (c) y = [0, 2a(b + c), 2, 2a(c − b)], P_4(t) = 2a[a(b +
c)\cos\frac{(2t+1)\pi}{8} + \cos\frac{2(2t+1)\pi}{8} + a(c − b)\cos\frac{3(2t+1)\pi}{8}] (d) y = [10a^2, −a(3b + c), 0, a(b − 3c)], P_4(t) =
5a^2 + a^2[−(3b + c)\cos\frac{(2t+1)\pi}{8} + (b − 3c)\cos\frac{3(2t+1)\pi}{8}]]
11.1.5. Find the DCT least squares approximation with m = 2 terms for the data in Exercise 11.1.4.
11.1.6. Find the 1D-DCT of the data vectors (a) [1, 0, −1, 0, 1, 0, −1, 0] (b) [0.3, 0.6, 0.8, 0.5, 0.6, 0.4, 0.2, 0.3]
11.1.7. Show that the columns of C T are the unit eigenvectors of the matrix T in (11.5).
11.1.8.
Computer Problems 11.1
11.1.1.
11.2 Two-dimensional DCT and image compression
The discrete cosine transform lies at the core of modern compression techniques for images and
audio. To discuss the former, we need to develop a two-dimensional version of the DCT. Later
we show how to transform and quantize rectangular grids of pixel values that make up an image.
11.2.1 The two-dimensional discrete cosine transform
The two-dimensional discrete cosine transform is simply the one-dimensional DCT applied in two
dimensions, one after the other. It can be used to interpolate or approximate data given on a
two-dimensional grid, in a straightforward analogy to the one-dimensional case. In the context of image
processing, the two-dimensional grid represents a block of pixel values, say grayscale intensities or
color intensities.
Figure 11.2: Two-dimensional grid of data points. Values x_{ij}, i, j = 0, \dots, 3, are assigned to the
grid points (s_i, t_j), with the i index increasing vertically and the j index increasing horizontally.
The 2D-DCT can be used to interpolate function values on a square grid, such as pixel values of an image.
In Chapter 11 only, we will list the vertical coordinate first, and the horizontal coordinate second,
when referring to a two-dimensional point, as shown in Figure 11.2. Although radical, our
goal is to be completely consistent with the usual matrix convention, where the i index of entry x_{ij}
changes along the vertical direction, and j along the horizontal. A major application of this section
is to pixel files representing images, which are most naturally viewed as matrices of numbers.
Figure 11.2 shows a grid of (s, t) points in the two-dimensional plane with assigned values
x_{ij} at each rectangular grid point (s_i, t_j). For concreteness, we will often use the integer grid
s_i ∈ {0, 1, \dots, n − 1} (remember, along the vertical axis) and t_j ∈ {0, 1, \dots, n − 1} along the
horizontal axis. The purpose of the 2D-DCT is to construct an interpolating function F(s, t) that fits
the n² points (s_i, t_j, x_{ij}) for i, j = 0, \dots, n − 1. The 2D-DCT accomplishes this in an optimal way
from the point of view of least squares, meaning that the ﬁt degrades gracefully as basis functions
are dropped from the interpolating function.
The 2D-DCT is the one-dimensional DCT applied successively in both the horizontal and vertical
directions. Consider the matrix X consisting of the values x_{ij} as in Figure 11.2. To apply the
1D-DCT in the horizontal t-direction, we first need to transpose X, then multiply by C. The resulting
columns are the 1D-DCTs of the rows of X; each column of CX^T corresponds to a fixed s_i. To do
a 1D-DCT in the vertical s-direction means moving down the columns, so again transposing and multiplying
by C yields

C(CX^T)^T = CXC^T. \qquad (11.9)
Definition 11.4 The two-dimensional discrete cosine transform (2D-DCT) of the n × n matrix X is
the matrix Y = CXC^T, where C is defined in (11.1).
If instead we perform the DCT vertically first, then horizontally, the result is the transpose of the
definition. See Exercise 11.2.3.
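The equivalence of the two one-dimensional passes and the single matrix product CXCᵀ in Definition 11.4 can be seen directly in code. A NumPy sketch (ours, not the book's; the random test matrix is arbitrary):

```python
import numpy as np

def dct_matrix(n):
    """Orthogonal DCT matrix of (11.1)."""
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

n = 4
C = dct_matrix(n)
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n))

# Transform the rows, then the columns, as in (11.9).
one_pass_each_way = C @ (C @ X.T).T
print(np.allclose(one_pass_each_way, C @ X @ C.T))
```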
Example 11.2 Find the 2D discrete cosine transform of the data in Figure 11.3(a).
It is helpful to notice, using elementary trigonometry, that the 4 × 4 DCT matrix can be viewed as
C = \frac{1}{\sqrt{2}}
\begin{bmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\cos\frac{\pi}{8} & \cos\frac{3\pi}{8} & \cos\frac{5\pi}{8} & \cos\frac{7\pi}{8} \\
\cos\frac{2\pi}{8} & \cos\frac{6\pi}{8} & \cos\frac{10\pi}{8} & \cos\frac{14\pi}{8} \\
\cos\frac{3\pi}{8} & \cos\frac{9\pi}{8} & \cos\frac{15\pi}{8} & \cos\frac{21\pi}{8}
\end{bmatrix}
= \frac{1}{\sqrt{2}}
\begin{bmatrix}
a & a & a & a \\
b & c & -c & -b \\
a & -a & -a & a \\
c & -b & b & -c
\end{bmatrix} \qquad (11.10)

where

a = \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}}, \qquad
b = \cos\frac{\pi}{8} = \frac{\sqrt{2+\sqrt{2}}}{2}, \qquad
c = \cos\frac{3\pi}{8} = \frac{\sqrt{2-\sqrt{2}}}{2}. \qquad (11.11)
From the definition, the 2D-DCT is the matrix

Y = CXC^T = \frac{1}{2}
\begin{bmatrix} a & a & a & a \\ b & c & -c & -b \\ a & -a & -a & a \\ c & -b & b & -c \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} a & b & a & c \\ a & c & -a & -b \\ a & -c & -a & b \\ a & -b & a & -c \end{bmatrix}
= \begin{bmatrix} 3 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \qquad (11.12)
The inverse of the 2D-DCT is easy to express in terms of the DCT matrix C. Since Y = CXC^T,
X is recovered as X = C^T Y C.
Definition 11.5 The inverse two-dimensional discrete cosine transform of the n × n matrix Y is the
matrix X = C^T Y C.
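A quick round trip confirms Definition 11.5 on the data of Example 11.2: transforming X produces the Y of (11.12), and CᵀYC returns X. (A NumPy sketch, ours rather than the book's Matlab.)

```python
import numpy as np

def dct_matrix(n):
    """Orthogonal DCT matrix of (11.1)."""
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

C = dct_matrix(4)
X = np.array([[1., 1., 1., 1.],
              [1., 0., 0., 1.],
              [1., 0., 0., 1.],
              [1., 1., 1., 1.]])

Y = C @ X @ C.T                       # the pattern of (11.12): 3, ±1, zeros elsewhere
print(np.round(Y, 10))
print(np.allclose(C.T @ Y @ C, X))    # inverse 2D-DCT recovers X
```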
Figure 11.3: Two-dimensional data for Example 11.2. (a) The 16 data points (i, j, x_{ij}), with values

1 1 1 1
1 0 0 1
1 0 0 1
1 1 1 1

(b) Values of the least squares approximation (11.15) at the grid points:

1.25 0.75 0.75 1.25
0.75 0.25 0.25 0.75
0.75 0.25 0.25 0.75
1.25 0.75 0.75 1.25
There is a close connection between inverting a transform like the 2D-DCT and interpolation.
After all, the goal of interpolation is to recover the original data points from functions that are
constructed with the interpolating coefficients that came out of the transform. Recall that C is an
orthogonal matrix, so that C^{-1} = C^T. The definition of the 2D-DCT, Y = CXC^T, can be rewritten as
a fact about interpolation, X = C^T Y C, since in this equation the x_{ij} are being expressed in terms
of products of cosines.
To write a useful expression for the interpolating function, note that the definition of C in (11.1)
can be viewed as

(C)_{ij} = \sqrt{\frac{2}{n}}\, a_i \cos\frac{i(2j+1)\pi}{2n} \qquad (11.13)

for i, j = 0, \dots, n-1, where

a_i \equiv \begin{cases} 1/\sqrt{2} & \text{if } i = 0, \\ 1 & \text{if } i > 0. \end{cases}
According to the rules of matrix multiplication, the equation X = C^T Y C translates to

x_{ij} = \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} C^{T}_{ik}\, y_{kl}\, C_{lj}
       = \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} C_{ki}\, y_{kl}\, C_{lj}
       = \frac{2}{n} \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} y_{kl}\, a_k a_l \cos\frac{k(2i+1)\pi}{2n} \cos\frac{l(2j+1)\pi}{2n}. \qquad (11.14)
This is a statement about interpolation:
Theorem 11.6 2D-DCT Interpolation Theorem. Let s_j = j, t_i = i for i, j = 0, \dots, n-1, and let
X = (x_{ij}) be a matrix of n² real numbers. Let Y = (y_{kl}) be the two-dimensional discrete cosine
transform of X. Define a_0 = 1/\sqrt{2} and a_k = 1 for k > 0. Then the real function

P_n(s, t) = \frac{2}{n} \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} y_{kl}\, a_k a_l \cos\frac{k(2s+1)\pi}{2n} \cos\frac{l(2t+1)\pi}{2n}

satisfies P_n(i, j) = x_{ij} for i, j = 0, \dots, n-1.
Returning to Example 11.2, the only nonzero interpolation coefficients are y_{00} = 3, y_{02} = y_{20} = 1,
and y_{22} = −1. Writing out the interpolation function of Theorem 11.6 yields

P_4(s, t) = \frac{2}{4}\left[ \frac{1}{2}\, y_{00} + \frac{1}{\sqrt{2}}\, y_{02} \cos\frac{2(2t+1)\pi}{8} + \frac{1}{\sqrt{2}}\, y_{20} \cos\frac{2(2s+1)\pi}{8} + y_{22} \cos\frac{2(2s+1)\pi}{8} \cos\frac{2(2t+1)\pi}{8} \right]
         = \frac{1}{2}\left[ \frac{1}{2}(3) + \frac{1}{\sqrt{2}}(1) \cos\frac{2(2t+1)\pi}{8} + \frac{1}{\sqrt{2}}(1) \cos\frac{2(2s+1)\pi}{8} + (-1) \cos\frac{2(2s+1)\pi}{8} \cos\frac{2(2t+1)\pi}{8} \right]
         = \frac{3}{4} + \frac{1}{2\sqrt{2}} \cos\frac{(2t+1)\pi}{4} + \frac{1}{2\sqrt{2}} \cos\frac{(2s+1)\pi}{4} - \frac{1}{2} \cos\frac{(2s+1)\pi}{4} \cos\frac{(2t+1)\pi}{4}.
Checking the interpolation, we get for example

P_4(0, 0) = \frac{3}{4} + \frac{1}{4} + \frac{1}{4} - \frac{1}{4} = 1

and

P_4(1, 2) = \frac{3}{4} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4} = 0.
The constant term y_{00}/n of the interpolation function is called the "DC" component of the expansion
(for "direct current," not "discrete cosine"). It is the simple average of the data; the nonconstant
terms contain the fluctuations of the data about this average value. In this example, the average of
the 12 ones and 4 zeros is y_{00}/4 = 3/4.
Least squares approximations with the 2D-DCT are done in the same way as with the 1D-DCT.
The "higher-frequency" basis functions, those whose coefficients have larger indices, are dropped
from the interpolating function. Consider the example we have been working on. The best least
squares fit to the basis functions

\cos\frac{i(2s+1)\pi}{8} \cos\frac{j(2t+1)\pi}{8}

for i + j ≤ 2 is given by dropping the i = j = 2 term, leaving

P(s, t) = \frac{3}{4} + \frac{1}{2\sqrt{2}} \cos\frac{(2t+1)\pi}{4} + \frac{1}{2\sqrt{2}} \cos\frac{(2s+1)\pi}{4}. \qquad (11.15)
This least squares approximation is shown in Figure 11.3(b).
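Evaluating (11.15) at the sixteen grid points reproduces the values shown in Figure 11.3(b). A NumPy sketch (the function name P follows the text; the code itself is ours):

```python
import numpy as np

def P(s, t):
    """The least squares fit (11.15): DC term plus the two remaining cosine terms."""
    return (0.75
            + np.cos((2 * t + 1) * np.pi / 4) / (2 * np.sqrt(2))
            + np.cos((2 * s + 1) * np.pi / 4) / (2 * np.sqrt(2)))

grid = np.array([[P(s, t) for t in range(4)] for s in range(4)])
print(np.round(grid, 2))   # corners 1.25, center 0.25, edges 0.75, as in Figure 11.3(b)
```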
The Matlab command for the one-dimensional DCT of a vector x is
>> y=dct(x)
The input to dct can be a vector or a matrix. In the latter case, the DCT of each column is
returned. Surprisingly, Matlab does not compute the DCT by matrix multiplication by the appropriate
DCT matrix C, but by way of the Fast Fourier Transform, to maximize efficiency. We will
investigate this approach in the last section of the chapter. Of course, Matlab can be forced to display the
n × n DCT matrix with the command
>> C=dct(eye(n))
To have Matlab carry out the 2D-DCT, we fall back to our definition of it in (11.9). Therefore,
for a matrix X, the command
>> Y=dct(dct(X')')
computes the 2D-DCT with two applications of the 1D-DCT.
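Outside Matlab, the same orthonormal transform is available elsewhere; for instance, SciPy's scipy.fft.dctn with norm='ortho' computes the 2D-DCT of Definition 11.4 (this aside is ours, not the book's). A sketch checking it against CXCᵀ:

```python
import numpy as np
from scipy.fft import dctn

def dct_matrix(n):
    """Orthogonal DCT matrix of (11.1)."""
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

C = dct_matrix(8)
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 8))

# SciPy's orthonormalized type-II DCT along both axes matches C X C^T.
print(np.allclose(dctn(X, norm="ortho"), C @ X @ C.T))
```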
11.2.2 Image compression
The concept of orthogonality, as represented in the discrete cosine transform, is crucial to
performing image compression. Images consist of pixels, each represented by a number (or a vector, for
color images). The convenient way that methods like the DCT can carry out least squares
approximation makes it easy to reduce the number of bits needed to represent the pixel values, while
degrading the picture only slightly, and perhaps imperceptibly to human viewers.
Figure 11.4: Grayscale image. Each pixel is represented by an integer between 0 and 255. The
pixels form a 64 by 64 grid.
Figure 11.4 shows a grayscale rendering of a 64 × 64 array of pixels. The grayness of each pixel
is represented by one byte, a collection of 8 bits representing 0 to 255, with 0 being black and 255
being white. We can think of the information shown in the ﬁgure as a 64 × 64 array of integers.
Represented in this way, the picture requires 8(64)^2 = 2^{15} = 32K bits of information.
Matlab imports grayscale or RGB values of images using standard image formats. For example,
given a grayscale ﬁle picture.jpg, the commands
>> picture = imread(’picture.jpg’);
>> picture = double(picture);
puts the matrix of grayscale values into the variable picture. If the image is RGB color, the array
variable will have a third dimension to index the three colors. We will restrict attention to grayscale
for most of the section. Extensions to color are easy, and we indicate the details at the end.
To begin, we simplify the problem to a single 8 × 8 block of pixels, as shown in Figure 11.5(a).
This block was taken from an area of Figure 11.4 just to the left of the nose. Figure 11.5(b) shows the
one-byte integers that represent the grayscales of the 64 pixels. In Figure 11.5(c) we have subtracted
256/2 = 128 from the pixel numbers to make them approximately centered around zero. This step
is not essential, but better use of the 2D-DCT will result because of this centering.
Figure 11.5: Example of 8 × 8 block. (a) Grayscale view. (b) Grayscale pixel values:

204 206 213 209 199 193 185 200
210 215 206 197 204 193 165 171
228 195 212 214 124 104 161 197
212 199 200  95  46 168 202 189
203 199 103  41  80 180 194 181
179  88  46  72 164 120  68  72
127 102  97 169 217 190 174 115
 79 133 158 189 192 201 196 190

(c) Grayscale pixel values minus 128:

 76  78  85  81  71  65  57  72
 82  87  78  69  76  65  37  43
100  67  84  86  −4 −24  33  69
 84  71  72 −33 −82  40  74  61
 75  71 −25 −87 −48  52  66  53
 51 −40 −82 −56  36  −8 −60 −56
 −1 −26 −31  41  89  62  46 −13
−49   5  30  61  64  73  68  62
To compress the 8 × 8 pixel block shown, we will transform the matrix of grayscale pixel values

X =
−49   5  30  61  64  73  68  62
 −1 −26 −31  41  89  62  46 −13
 51 −40 −82 −56  36  −8 −60 −56
 75  71 −25 −87 −48  52  66  53
 84  71  72 −33 −82  40  74  61
100  67  84  86  −4 −24  33  69
 82  87  78  69  76  65  37  43
 76  78  85  81  71  65  57  72