| Games and SDL SDL Installation SDL for Embedded SDL API SDL Events | SDL Graphics SDL Threads Thread Example SDL Animation SDL Sound | Raw Video Player Video Formats Video Compression | Game Trees About The Author |
Video Compression
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
We discussed before that quantization is the process of limiting the value of a sample one of a predetermined number of allowed values. In a video coder, the step of DCT that we discussed in the previous section does not cause a loss in information besides the numerical rounding errors; the main purpose of applying DCT to the video data of a frame is to cluster the data so that they can be more easily processed in subsequent steps. The video data can be recovered by applying IDCT to the DCT coefficients.
To achieve high compression, we do not want to retain the full range of DCT coefficients as we did in the previous section where we have used 16-bit ( i.e. data type short ) to save a DCT coefficient. We can quantize the DCT coefficients and represent a coefficient with significantly less number of bits. Quantization can be done using a scalar quantizer or vector quantizer; the latter in general yields better results but consumes a lot more computing power. A scalar quantizer operates on a sample value and generates a quantized output value. So we can have several or many different input values map to the same output value. A vector quantizer operates on a vector consisting of a group of sample values to produce an output vector consisting of a group of quantized values.
Scalar Quantization
A simple example of scalar quantization is a uniform quantizer where an input value X is divided by a quantization parameter ( or step size ) q and rounded to the nearest integer Fq.
The output levels Y are spaced uniformly with step size q as shown in the following example.
Example:
| Y | ||||
|---|---|---|---|---|
| X | q = 1 | q = 2 | q = 3 | q = 4 |
| -4 | -4 | -4 | -3 | -5 |
| -3 | -3 | -4 | -3 | -5 |
| -2 | -2 | -2 | -3 | 0 |
| -1 | -1 | -2 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 2 | 0 | 0 |
| 2 | 2 | 2 | 3 | 0 |
| 3 | 3 | 4 | 3 | 5 |
| 4 | 4 | 4 | 3 | 5 |
| 5 | 5 | 6 | 6 | 5 |
| 6 | 6 | 6 | 6 | 5 |
| 7 | 7 | 8 | 6 | 5 |
| 8 | 8 | 8 | 9 | 10 |
| 9 | 9 | 10 | 9 | 10 |
| 10 | 10 | 10 | 9 | 10 |
The following figure shows two examples of scalar quantizer. The linear scalar quantizer shown on the left shows linear mapping between input and output values. The nonlinear quantizer on the right shows a dead zone where small input values are mapped to zero.
Linear scalar quantizer |
Nonlinear scalar quantizer |
More precisely, one can define an N-point scalar quantizer Q as a mapping Q : R → C where R is the real line and
Every quantizer can be viewed as making up of two successive operations ( mappings ), an encoder, E, ( or forward quantizer FQ ), and a decoder, D ( or inverse quantizer IQ ). The encoder E is a mapping
Vector Quantization
Vector quantization ( VQ ) is a generalization of scalar quantization to the quantization of a vector, an ordered set of real numbers. Speech or image samples can be grouped together to form a vector. Thus vector quantization can be regarded as a form of pattern recognition where an input pattern is "approximated" by one of a predetermined set of patterns stored in a codebook. The application of VQ to image compression can be summarized as follows:
The following diagram summarizes the concept of vector quantization.
|
→ | Encoder: Find best match |
→ | Transmit index of chosen vector |
→ | Decoder: Look up |
|||
| ↑ | ↓ | ||||||||
| Codebook: Vector 1 Vector 2 ... Vector N |
Codebook: Vector 1 Vector 2 ... Vector N |
→ |
|
MPEG-4 Quantization
Video compression standard MPEG-4 allows two methods to quantize DCT coefficients. A parameter called quantizer_scale ( quantization_parameter ) is used to control how much information is discarded during the quantization process. The parameter can take values from 1 to 31 in the case of 8-bit textures and 1 to 2quant_precision - 1 in the case of no 8-bit textures. Each frame may use a different value of quantizer_scale. The two methods are referred to as Method 2 ( basic method ) and Method 1 ( more flexible but more complex ). Method 2, which is the default method, specifies the quantization of the DC component, F[0][0] using a fixed quantizer step:
| Forward Quantization: | Fq[0][0] = | F[0][0] / dc_scalar |
| Inverse Quantization: | F'[0][0] = | Fq[0][0] * dc_scalar |
| quantizer_parameter (Qp) | 1-4 | 5-8 | 9-24 | 25-31 |
| dc_scalar (luminance) | 8 | 2Qp | Qp+8 | 2Qp - 16 |
| dc_scalar (chrominance) | 8 | (Qp+13)/2 | (Qp+13)/2 | Qp - 6 |
All other coefficients are rescaled as follows:
| |F| | = | Qp * ( 2 * |FQ| + 1 ) | ( if Qp is odd and FQ ≠ 0 ) | |
| |F| | = | Qp * ( 2 * |FQ| + 1 ) - 1 | ( if Qp is even and FQ ≠ 0 ) | |
| |F| | = | 0 | ( if FQ = 0 ) |
In Method 1, which is also referred to as alternate quantizer,, MPEG-4 uses a weighting factor to exploit properties of the human visual system ( HVS ). Since human eyes are less sensitive to some frequencies, we can quantize these frequencies with a coarser step-size, which results in a more compactly coded bit-stream and minimizes the image distortion. MPEG-4 recommends different weight matrices for the quantization of intra and inter blocks. The forward and inverse quantization can be described as follows.
| k = | { | 0 sign ( FQ(u,v) ) |
for intra coded blocks for inter coded blocks |
Inverse Quantization:
| F'(u,v) = | { | 0 ( 2 * FQ(u,v) + k ) * W(u,v) * Qp ) / 16 |
if FQ(u,v) = 0 if FQ(u,v) &ne 0 |
8 17 18 19 21 23 25 27 17 18 19 21 23 25 27 28 20 21 22 23 24 26 28 30 21 22 23 24 26 28 30 32 22 23 24 26 28 30 32 35 23 24 26 28 30 32 35 38 25 26 28 30 32 35 38 41 27 28 30 32 35 38 41 45Default weighting matrix for intra coded MBs |
16 17 18 19 20 21 22 23 17 18 19 20 21 22 23 24 18 19 20 21 22 23 24 25 19 20 21 22 23 24 26 27 20 21 22 23 25 26 27 28 21 22 23 24 26 27 28 30 22 23 24 26 27 28 30 31 23 24 25 27 28 30 31 33Default weighting matrix for inter coded MBs |
For simplicity, in our sample code, we implement a uniform quantizer. The code is simple and straight forward and the following shows an example of implementation; the array coef[] holds an 8x8 sample block of DCT coefficients; Qstep is the quantization parameter we discussed above; the code shows both forward quantization ( FQ ) and inverse quantization ( IQ ).
const short Qstep = 12;
//forward quantization
void quantize_block ( short coef[] )
{
for ( short k = 0; k i < 64; ++k ) {
coef[k] = ( short ) round ( (double)coef[k] / Qstep );
}
}
//inverse quantization
void inverse_quantize_block ( short coef[] )
{
for ( short k = 0; k < 64; ++k ) {
coef[k] = coef[k] * Qstep;
}
}
|
After DCT transform and forward quantization, a sample block may have only a few nonzero coefficients and all others are zeros. It is desirable to to group zero coefficients together so that they can be represented effectively. The optimum reordering path ( scan order ) depends on the distribution of nonzero DCT coefficients. A commonly used scan order is a zigzag path starting from the DC coefficient at the top left corner of an 8x8 sample block as shown in the following figure; after such a reordering, nonzero coefficients tend to cluster together at the beginning of the reordered array, followed by long sequences of zeros.
|
0, 1, 8, 16, 9, 2, 3, 10, 17, 24, 32, 25, 18, 11, 4, 5, 12, 19, 26, 33, 40, 48, 41, 34, 27, 20, 13, 6, 7, 14, 21, 28, 35, 42, 49, 56, 57, 50, 43, 36, 29, 22, 15, 23, 30, 37, 44, 51, 58, 59, 52, 45, 38, 31, 39, 46, 53, 60, 61, 54, 47, 55, 62, 63 | ||||||
The alternate scan where the left-hand side coefficients are scan before those on the right side may be more effective the zigzag scan for a field block where the coefficient distribution is often skewed. The following figure shows the alternate scan.
|
Increasing Vertical
Frequency ↓ |
| Increasing Horizontal Frequency → | |
| Alternate scan order ( field block, MPEG2 ) | |
Again, for simplicity, in our examples here, we shall use the zizag scan. The following piece of code shows the implementations of such a reordering and the reverse of it.
unsigned char zigzag[64] = {
0, 1, 8, 16, 9, 2, 3, 10,
17, 24, 32, 25, 18, 11, 4, 5,
12, 19, 26, 33, 40, 48, 41, 34,
27, 20, 13, 6, 7, 14, 21, 28,
35, 42, 49, 56, 57, 50, 43, 36,
29, 22, 15, 23, 30, 37, 44, 51,
58, 59, 52, 45, 38, 31, 39, 46,
53, 60, 61, 54, 47, 55, 62, 63
};
void reorder ( short Y[], short Yr[] )
{
for ( short k = 0; k < 64; ++k ) {
Yr[k] = Y[zigzag[k]];
}
}
void reverse_reorder ( short Yr[], short Y[] )
{
for ( short k = 0; k < 64; ++k ) {
Y[zigzag[k]] = Yr[k];
}
}
|