Linux Game Programming for PC & Embedded Systems using SDL
Presented by
Fore June
Author of Windows Fan, Linux Fan

Games and SDL
SDL Installation
SDL for Embedded
SDL API
SDL Events
 SDL Graphics
SDL Threads
Thread Example
SDL Animation
SDL Sound
 Raw Video Player
Video Formats
Video Compression
 Game Trees
About The Author
@Copyright by Fore June, 2006

Video Compression

1 2 3 4 5 6 7 8
  1. Quantization

    We discussed before that quantization is the process of limiting the value of a sample one of a predetermined number of allowed values. In a video coder, the step of DCT that we discussed in the previous section does not cause a loss in information besides the numerical rounding errors; the main purpose of applying DCT to the video data of a frame is to cluster the data so that they can be more easily processed in subsequent steps. The video data can be recovered by applying IDCT to the DCT coefficients.

    To achieve high compression, we do not want to retain the full range of DCT coefficients as we did in the previous section where we have used 16-bit ( i.e. data type short ) to save a DCT coefficient. We can quantize the DCT coefficients and represent a coefficient with significantly less number of bits. Quantization can be done using a scalar quantizer or vector quantizer; the latter in general yields better results but consumes a lot more computing power. A scalar quantizer operates on a sample value and generates a quantized output value. So we can have several or many different input values map to the same output value. A vector quantizer operates on a vector consisting of a group of sample values to produce an output vector consisting of a group of quantized values.

    Scalar Quantization

    A simple example of scalar quantization is a uniform quantizer where an input value X is divided by a quantization parameter ( or step size ) q and rounded to the nearest integer Fq.

    The quantized output level is given by

    The output levels Y are spaced uniformly with step size q as shown in the following example.

    Example:

    The following figure shows two examples of scalar quantizer. The linear scalar quantizer shown on the left shows linear mapping between input and output values. The nonlinear quantizer on the right shows a dead zone where small input values are mapped to zero.


    Linear scalar quantizer
     
    Nonlinear scalar quantizer

    More precisely, one can define an N-point scalar quantizer Q as a mapping Q : R → C where R is the real line and

    C ≡ { y1, y2, ...., yN } ⊂ R is the output set or codebook with size |C| = N. The output values, yi, are referred to as output levels, output points, or reproduction values. Quite often, we choose the indexing of output values so that y1 < y2 < ... < yN The resolution or code rate, r, of a scalar quantizer is defined as r = log2 N which measures the number of bits required to uniquely specify the quantized value.

    Every quantizer can be viewed as making up of two successive operations ( mappings ), an encoder, E, ( or forward quantizer FQ ), and a decoder, D ( or inverse quantizer IQ ). The encoder E is a mapping

    E : R → I where I = { 1, 2, 3, ..., N }, and the decoder is the mapping D : I → C . Therefore, if Q(x) = yi, then E(x) = i and D(i) = yi. Consequently, Q(x) = D(E(x)). Note that the decoder can be implemented by a table-lookup process, where the table or codebook contains the output set which can be stored with very high precision without affecting the transmission rate R. The decoder is also referred to as an inverse quantization and the encoder is sometimes referred to as forward quantization.

    Vector Quantization

    Vector quantization ( VQ ) is a generalization of scalar quantization to the quantization of a vector, an ordered set of real numbers. Speech or image samples can be grouped together to form a vector. Thus vector quantization can be regarded as a form of pattern recognition where an input pattern is "approximated" by one of a predetermined set of patterns stored in a codebook. The application of VQ to image compression can be summarized as follows:

    The following diagram summarizes the concept of vector quantization.

    Input
    block
      →   Encoder:
    Find best
    match
      →   Transmit index
    of chosen vector
      →   Decoder:
    Look up
             
        Codebook:
    Vector 1
    Vector 2
    ...
    Vector N
     
          Codebook:
    Vector 1
    Vector 2
    ...
    Vector N
     
      →  
    Output
    block

    MPEG-4 Quantization

    Video compression standard MPEG-4 allows two methods to quantize DCT coefficients. A parameter called quantizer_scale ( quantization_parameter ) is used to control how much information is discarded during the quantization process. The parameter can take values from 1 to 31 in the case of 8-bit textures and 1 to 2quant_precision - 1 in the case of no 8-bit textures. Each frame may use a different value of quantizer_scale. The two methods are referred to as Method 2 ( basic method ) and Method 1 ( more flexible but more complex ). Method 2, which is the default method, specifies the quantization of the DC component, F[0][0] using a fixed quantizer step:

    Forward Quantization: Fq[0][0] = F[0][0] / dc_scalar
    Inverse Quantization: F'[0][0] = Fq[0][0] * dc_scalar
    where dc_scalar which has a value of 8 in the short header mode and depends on the quantizer_parameter determined from the following table.

    quantizer_parameter (Qp)     1-4 5-8 9-24 25-31
    dc_scalar (luminance) 8   2Qp     Qp+8   2Qp - 16
    dc_scalar (chrominance) 8   (Qp+13)/2     (Qp+13)/2   Qp - 6

    All other coefficients are rescaled as follows:

    |F| = Qp * ( 2 * |FQ| + 1 )   ( if Qp is odd and FQ ≠ 0 )
    |F| = Qp * ( 2 * |FQ| + 1 ) - 1   ( if Qp is even and FQ ≠ 0 )
    |F| = 0   ( if FQ = 0 )

    where FQ is the forward-quantized coefficient and F is the rescaled ( inverse-quantized ) coefficient.

    In Method 1, which is also referred to as alternate quantizer,, MPEG-4 uses a weighting factor to exploit properties of the human visual system ( HVS ). Since human eyes are less sensitive to some frequencies, we can quantize these frequencies with a coarser step-size, which results in a more compactly coded bit-stream and minimizes the image distortion. MPEG-4 recommends different weight matrices for the quantization of intra and inter blocks. The forward and inverse quantization can be described as follows.

    For simplicity, in our sample code, we implement a uniform quantizer. The code is simple and straight forward and the following shows an example of implementation; the array coef[] holds an 8x8 sample block of DCT coefficients; Qstep is the quantization parameter we discussed above; the code shows both forward quantization ( FQ ) and inverse quantization ( IQ ).

    const short Qstep = 12;
    //forward quantization
    void quantize_block ( short coef[] )
    {
      for ( short k = 0; k i < 64; ++k ) {
        coef[k] = ( short ) round ( (double)coef[k] / Qstep );
      }
    }
    
    //inverse quantization
    void inverse_quantize_block ( short coef[] )
    {
      for ( short k = 0; k < 64; ++k ) {
        coef[k] = coef[k] * Qstep;
      }
    }
    	

  2. Reordering

    After DCT transform and forward quantization, a sample block may have only a few nonzero coefficients and all others are zeros. It is desirable to to group zero coefficients together so that they can be represented effectively. The optimum reordering path ( scan order ) depends on the distribution of nonzero DCT coefficients. A commonly used scan order is a zigzag path starting from the DC coefficient at the top left corner of an 8x8 sample block as shown in the following figure; after such a reordering, nonzero coefficients tend to cluster together at the beginning of the reordered array, followed by long sequences of zeros.

    Increasing
    Vertical
    Frequency
    Increasing Horizontal Frequency
    Zigzag scan order ( frame block, MPEG-1, MPEG2 )
       0,  1,  8, 16,  9,  2,  3, 10,
       17, 24, 32, 25, 18, 11,  4,  5,
       12, 19, 26, 33, 40, 48, 41, 34,
       27, 20, 13,  6,  7, 14, 21, 28,
       35, 42, 49, 56, 57, 50, 43, 36,
       29, 22, 15, 23, 30, 37, 44, 51,
       58, 59, 52, 45, 38, 31, 39, 46,
       53, 60, 61, 54, 47, 55, 62, 63
      

    The alternate scan where the left-hand side coefficients are scan before those on the right side may be more effective the zigzag scan for a field block where the coefficient distribution is often skewed. The following figure shows the alternate scan.

    Increasing Vertical
    Frequency
    Increasing Horizontal Frequency
    Alternate scan order ( field block, MPEG2 )

    Again, for simplicity, in our examples here, we shall use the zizag scan. The following piece of code shows the implementations of such a reordering and the reverse of it.

    unsigned char zigzag[64] = {
       0,  1,  8, 16,  9,  2,  3, 10,
       17, 24, 32, 25, 18, 11,  4,  5,
       12, 19, 26, 33, 40, 48, 41, 34,
       27, 20, 13,  6,  7, 14, 21, 28,
       35, 42, 49, 56, 57, 50, 43, 36,
       29, 22, 15, 23, 30, 37, 44, 51,
       58, 59, 52, 45, 38, 31, 39, 46,
       53, 60, 61, 54, 47, 55, 62, 63
    };
    
    void reorder ( short Y[], short Yr[] )
    {
      for ( short k = 0; k < 64; ++k ) {
        Yr[k] = Y[zigzag[k]];
      }
    }
    
    void reverse_reorder ( short Yr[], short Y[] )
    {
      for ( short k = 0; k < 64; ++k ) {
        Y[zigzag[k]] = Yr[k];
      }
    }
    	

    <<Prev   Next >>