Linux Game Programming for PC & Embedded Systems using SDL
Presented by
Fore June
Author of Windows Fan, Linux Fan

Games and SDL
SDL Installation
SDL for Embedded
SDL API
SDL Events
 SDL Graphics
SDL Threads
Thread Example
SDL Animation
SDL Sound
 Raw Video Player
Video Formats
Video Compression
 Game Trees
About The Author
@Copyright by Fore June, 2006

Video Compression

1 2 3 4 5 6 7 8
  1. Image Compression

    In previous sections, we discussed that by transforming the representation of an image from RGB to 4:2:0 YCbCr format, which separates the intensity from color components, we can easily compress an image by a factor of two without much downgrading of the image quality. The underlying principles in this stage of compression is that human eyes are more sensitive to brightness than to color and in practice, we do not need to retain as much information of the color components as we need in presenting an image. However, even after this change in format, we still represent the image in the spatial domain, where a sample value depends on the positon of the two-dimensional space; that is, it is a function of the coordinates (x, y) of a two-dimensional plane. Our eyes do not have any crucial discrimination of a point at any special position in space and thus it is difficult for us to further pick the irrelevant information and throw it away if we need to. On the other hand, if we could represent the image in the frequency domain, we know that our eyes are not very sensitive to high frequency components and if we have to get rid of any information, we would like to get rid of those components first. It turns out that we can indeed transform an image from the spatial domain to the frequency domain with two well-studied and researched methods called Discrete Cosine Transform ( DCT ) and Discrete Wavelet Transform (DWT). Researchers have found that in most cases DWT works best for static image compression while DCT works best for video compression. One characteristics of DCT is that it has good 'enery' compact capability, meaning that it clusters information into a small region. After DCT, the next step in compressing an image is to quantize the DCT coefficients and reorder the quantized coefficients to make them easier to compress. We summarize the compression steps that we have been discussing in the following figure.

  2. Discrete Cosine Transform ( DCT )

    The Discrete Cosine Transform ( DCT ) is a derivation of Fourier analysis, which is often used to decompose continuous, periodic signals into frequency components. The Discrete Fourier Transform ( DFT ), makes use of sine and cosine basis functions to deal with sampled signals. DCT only has cosine basis, which is an even function, meaning that cos ( -x ) = cos ( x ). It turns out that DCT is effective in handling images. In image processing, DCT operates on X, a block of N x N image samples to create an N x N block of coefficients Y. A coefficient Y(u, v) of Y at coordinates ( u, v ) can be calculated from sample values X(i, j) of X by the following formula.

    DCT:

    Y(u, v) = auav N-1

    i=0
    N-1

    j=0
    X(i, j) cos cos ( 10.1 )
    where a0 = √(1/N),     and
    ak = √(2/N)     for k > 0

    The samples values X(i, j) can be recovered from Y(u, v) with Inverse DCT ( IDCT ) according to the following formula.

    IDCT ( Inverse DCT ) :

    X(i, j) = N-1

    u=0
    N-1

    v=0
    auavY(u, v) cos cos ( 10.2 )

    The DCT and IDCT can be also expressed in matrix form as follows.

    DCT:

    IDCT:

    where X is a matrix of samples, Y is a matrix of coefficients, and AT is the transpose of A = an N x N transform matrix whose elements are:

    For example, when N = 4, the transform matrix is given by

    Evaluating the cosines gives

    Note that which can be factorized to: where ⊗ means element-by-element multiplication and d = c/b = √2 - 1 ≈ 0.414

    As we discussed in previous sections, our sample blocks are 8x8. So in the following discussions of DCT, we consider N = 8 most of the time.

  3. Implementation of DCT and IDCT

    A direct implementation of DCT and IDCT is very simple and easy. All we need to do is to have two for-loops to do summations. However, such an implementation is inefficient and impractical in image or video compression, which require numerous operations of DCT and IDCT. Nevertheless, to get a feeling of how DCT works, we present dct_direct.cpp below, which uses a direct implementation of DCT and IDCT with floating point computation. In the program, the functions dct_direct() and idct_direct() do the actual work of DCT, and IDCT; they use floating point ( double ) in calculations; the functions dct() and idct() simply cast short values to double and call dct_direct() or idct_direct() to do the transformations. This program can be used for checking when we later implement DCT and IDCT using other more efficient methods.

    /*
      dct_direct.cpp
      A straight forward implementation of DCT and IDCT for the purpose of learning and testing.
      Floating-point arithmetic is used.  Such an implementation should not be used
      in practical applications.
      
      Compile: g++ -o dct_direct dct_direct.cpp -lm
      Execute: ./dct_direct
    */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    
    using namespace std;
    
    #define PI 3.141592653589
    
    //input: X, N; output: Y
    short dct_direct( short N, double *X, double *Y )
    {
      double a[32], sum, coef;
      short i, j, u, v;
    
      if ( N > 32 || N <= 0 ) {
        printf ("\ninappropriate N\n");
        return -1;
      }
      a[0] = sqrt ( 1.0 / N );
      for ( i = 1; i < N; ++i ) {
        a[i] = sqrt ( 2.0 / N );
      }
      for ( u = 0; u < N; ++u ) {
        for ( v = 0; v < N; ++v ) {
          sum = 0.0;
          for ( i = 0; i < N; ++i ) {
            for ( j = 0; j < N; ++j ) {
    	  coef =  cos ((2*i+1)*u*PI / (2*N)) * cos ((2*j+1)*v*PI/(2*N));
    	  sum += *(X+i*N+j) * coef; 		//X[i][i] * coef
    	} //for j
            *(Y+u*N+v) = a[u] * a[v] * sum;
          } //for i
        } //for u
      } //for v
      
      return 1;
    }
    
    
    //input: N, Y; output X
    short idct_direct( short N, double *Y, double *X )
    {
      double a[32], sum, coef;
      short i, j, u, v;
    
      if ( N > 32 || N <= 0 ) {
        printf ("\ninappropriate N\n");
        return -1;
      }
      a[0] = sqrt ( 1.0 / N );
      for ( i = 1; i < N; ++i ) {
        a[i] = sqrt ( 2.0 / N );
      }
      for ( i = 0; i < N; ++i ) {
        for ( j = 0; j < N; ++j ) {
          sum = 0.0;
          for ( u = 0; u < N; ++u ) {
            for ( v = 0; v < N; ++v ) {
    	  coef =  cos ((2*j+1)*v*PI/(2*N)) * cos ((2*i+1)*u*PI / (2*N));
    	  sum += a[u] * a[v] *( *(Y+u*N+v) ) * coef; 	//a[u] * a[v] * Y[u][v] * coef
    	} //for j
            *(X+i*N+j) =  sum;
          } //for i
        } //for u
      } //for v
      
      return 1;
    }
    
    /*
      change values from short to double and vice versa.
    */
    short dct ( short N, short *X, short *Y )
    {
      double  tempx[1024], tempy[1024];
      int  total, i;
     
      if ( N > 32 || N <= 0 ) {
        printf ("\ninappropriate N\n");
        return -1;
      }
      total = N * N;
      for ( i = 0; i < total; ++i ) {
        tempx[i] = (double) *(X+i);
      }
      dct_direct ( N, tempx, tempy );		//DCT operation
      for ( i = 0; i < total; ++i ) {
        *(Y+i) = (short ) ( floor (tempy[i]+0.5) );
      }
      
      return 1; 
    }
    
    /*
      change values from short to doulbe, and vice versa.
    */
    short idct ( short N, short *Y, short *X )
    {
      double  tempx[1024], tempy[1024];
      int  total, i;
    
      if ( N > 32 || N <= 0 ) {
        printf ("\ninappropriate N\n");
        return -1;
      }
      total = N * N;
      for ( i = 0; i < total; ++i ) {
        tempy[i] = (double) *(Y+i);
      }
      idct_direct ( N, tempy, tempx );		//IDCT operation
      for ( i = 0; i < total; ++i ) {
        *(X+i) = (short ) ( floor (tempx[i]+0.5) );	//rounding
      }
    
      return 1;
    }
    
    void print_elements ( short N,  short *X )
    {
      short i, j;
    
      for ( i = 0; i < N; ++i ){
        printf("\n");
        for ( j = 0; j < N; ++j ) {
          printf ("%4d, ", *(X+N*i+j) );
        }
      }
    }
    
    int main()
    {
      short X[8][8], Y[8][8];
      int i, j, x, y, N;
      char temp[8][8];
      N = 8;
    
      //try some values for testing
      for ( i = 0; i < N; ++i ) {
        for ( j = 0; j < N; ++j ) {
            X[i][j] = i + j;
        }
      }
      
      printf("\nOriginal sample values");
      print_elements ( N, &X[0][0] );
      printf("\n--------------------\n");
      
     
      dct ( N, &X[0][0], &Y[0][0] );		//performing DCT
      printf("\nCoefficients of DCT:"); 
      print_elements ( N, &Y[0][0] );
      printf("\n--------------------\n");
      
    
      idct ( N, &Y[0][0], &X[0][0] );		//performing IDCT
      printf("\nValues recovered by IDCT:");
      print_elements ( N, &X[0][0] );
      printf("\n");
    }
    

    The following is the output from the program.

    Original sample values
       0,    1,    2,    3,    4,    5,    6,    7,
       1,    2,    3,    4,    5,    6,    7,    8,
       2,    3,    4,    5,    6,    7,    8,    9,
       3,    4,    5,    6,    7,    8,    9,   10,
       4,    5,    6,    7,    8,    9,   10,   11,
       5,    6,    7,    8,    9,   10,   11,   12,
       6,    7,    8,    9,   10,   11,   12,   13,
       7,    8,    9,   10,   11,   12,   13,   14,
    --------------------
    
    Coefficients of DCT:
      56,  -18,    0,   -2,    0,   -1,    0,    0,
     -18,    0,    0,    0,    0,    0,    0,    0,
       0,    0,    0,    0,    0,    0,    0,    0,
      -2,    0,    0,    0,    0,    0,    0,    0,
       0,    0,    0,    0,    0,    0,    0,    0,
      -1,    0,    0,    0,    0,    0,    0,    0,
       0,    0,    0,    0,    0,    0,    0,    0,
       0,    0,    0,    0,    0,    0,    0,    0,
    --------------------
    
    Values recovered by IDCT:
       0,    1,    2,    3,    4,    5,    6,    7,
       1,    2,    3,    4,    5,    6,    7,    8,
       2,    3,    4,    5,    6,    7,    8,    9,
       3,    4,    5,    6,    7,    8,    9,   10,
       4,    5,    6,    7,    8,    9,   10,   11,
       5,    6,    7,    8,    9,   10,   11,   12,
       6,    7,    8,    9,   10,   11,   12,   13,
       7,    8,    9,   10,   11,   12,   13,   14,
    

    You can see from the output that there are only a few nonzero coefficient values after DCT and they are clustered at the upper left corner. So after DCT, it becomes clear to us that the sample block X actually does not contain as much information as it appears and it is a lot easier to carry out data compression in the transformed domain.

    The value of the DCT coefficient at position (0, 0), Y(0, 0) is in general referred to as the DC value and others are referred to as AC values. This is because Y(0, 0) is essentially the average of all the sample values and you can easily see this if you write down the formula for calculating its value ( note that a0 * a0 = 1 / N, and cos ( 0 ) = 1. ):

    Y(0, 0) =( 1/N ) N-1

    i=0
    N-1

    j=0
    X(i, j)

  4. Fast DCT

    As DCT is so important in signal processing, a lot of research has been done to speed up its calculations. One main idea behind fast DCT is to break down the summation into stages and in each stage, intermediates sums of two quantities are formed, which will be used in later stages to obtain the final sum. If we consider only small values of N in the form of N = 2n, it is not difficult to understand how this is done. For instance, consider N = 8, which is what we need in our video compression. We can rewrite ( 10.1 ) as follows.

    Y(u, v) = auav 7

    i=0
    7

    j=0
    X(i, j) cos cos ( 12.1 )
    with a0 = 1/(2√2),     and
    ak = 1/2     for k > 0
    Actually, we can express the two-dimensional ( 2-D ) DCT as two one-dimensional ( 1-D ) DCT as ( 12.1 ) can be expressed as
    Y(u, v) = auav 7

    i=0
    yiv cos ( (2i+1)uπ/16 ) ( 12.2 )

    where

    yiv = 7

    j=0
    X(i, j) cos ( (2j+1)vπ/16 ) ( 12.3 )
    For convenience of writing, we suppress writing the index i; we let xj = X(i, j) and yk = yik. So yk is essentially the one-dimensional DCT of row elements of of X(i, j). If we let θ = π/16, we can list all the coefficients of yk in a table as follows.

      x0 x1 x2 x3 x4 x5 x6 x7
      y0 =   1 1 1 1 1 1 1 1
      y1 =     cos (1θ)     cos (3θ)     cos (5θ)     cos (7θ)     cos (9θ)     cos (11θ)     cos (13θ)     cos (15θ)  
      y2 =     cos (2θ)     cos (6θ)     cos (10θ)     cos (14θ)     cos (18θ)     cos (22θ)     cos (26θ)     cos (30θ)  
      y3 =     cos (3θ)     cos (9θ)     cos (15θ)     cos (21θ)     cos (27θ)     cos (33θ)     cos (39θ)     cos (45θ)  
      y4 =     cos (4θ)     cos (12θ)     cos (20θ)     cos (28θ)     cos (36θ)     cos (44θ)     cos (52θ)     cos (60θ)  
      y5 =     cos (5θ)     cos (15θ)     cos (25θ)     cos (35θ)     cos (45θ)     cos (55θ)     cos (65θ)     cos (75θ)  
      y6 =     cos (6θ)     cos (18θ)     cos (30θ)     cos (42θ)     cos (54θ)     cos (66θ)     cos (78θ)     cos (90θ)  
      y7 =     cos (7θ)     cos (21θ)     cos (35θ)     cos (49θ)     cos (63θ)     cos (77θ)     cos (91θ)     cos (105θ)  

    Table 1

    Since θ = π/16, we have 16θ = π; by making use of some basic cosine properties such as cos ( π - α ) = -cos ( α ) and cos ( 2π - α ) = cos ( α ), we can always reduce cos ( nθ ) when n ≥ 8 to a form of ±cos ( kθ ) with k < 8. For example,

    cos ( 9θ ) = cos ( 16θ - 7θ ) = cos ( π - 7θ ) = -cos ( 7θ )
    cos ( 35θ ) = cos ( 32θ + 3θ ) = cos ( 2π + 3θ ) = cos ( 3θ )

    Table 1 can then be simplified to Table 2 shown below.

      x0 x1 x2 x3 x4 x5 x6 x7
      y0 =   1 1 1 1 1 1 1 1
      y1 =     cos (1θ)     cos (3θ)     cos (5θ)     cos (7θ)     -cos (7θ)     -cos (5θ)     -cos (3θ)     -cos (1θ)  
      y2 =     cos (2θ)     cos (6θ)     -cos (6θ)     -cos (2θ)     -cos (2θ)     -cos (6θ)     cos (6θ)     cos (2θ)  
      y3 =     cos (3θ)     -cos (7θ)     -cos (1θ)     -cos (5θ)     cos (5θ)     cos (1θ)     cos (7θ)     -cos (3θ)  
      y4 =     cos (4θ)     -cos (4θ)     -cos (4θ)     cos (4θ)     cos (4θ)     -cos (4θ)     -cos (4θ)     cos (4θ)  
      y5 =     cos (5θ)     -cos (1θ)     cos (7θ)     cos (3θ)     -cos (3θ)     -cos (7θ)     cos (1θ)     -cos (5θ)  
      y6 =     cos (6θ)     -cos (2θ)     cos (2θ)     -cos (6θ)     -cos (6θ)     cos (2θ)     -cos (2θ)     cos (6θ)  
      y7 =     cos (7θ)     -cos (5θ)     cos (3θ)     -cos (1θ)     cos (1θ)     -cos (3θ)     cos (5θ)     -cos (7θ)  

    Table 2

    We observe in Table 2 that there are some symmetries in the columns; besides the first row, whenever cos ( kθ ) appears in an i-th column, it also appears in another j-th column with i + j = 7. This implies that we can always group the i-th and j-th column together in our summing operations to compute ( xi ± xj ) cos ( kθ ). For example, we can rewrite y0 and y2 as follows, where we have abbreviated cos ( kθ ) as ck.

    y0 = ( x0 + x7 ) + ( x1 + x6 ) + ( x2 + x5 ) + ( x3 + x4 )
    y2 = ( x0 + x7 )c2 + ( x1 + x6 )c6 - ( x2 + x5 )c6 - ( x3 + x4 )c2
    So once we have computed ( xi + xj ) with i + j = 7, the value can be used in calculating both y0 and y2. Moreover, we can continue this process recursively until we obtain the final values; suppose we let x'i = ( xi + xj ), and x'j = ( xi - xj ), we can rewrite y0 and y2 as follows.
    y0 = ( x'0 + x'3 ) + ( x'1 + x'2 ) = ( x''0 + x''1 )
    y2 = ( x'0 - x'3 )c2 + ( x'1 - x'2 )c6 = x''3c2 + x''2c6

    It is obvious that y4 and y6 can be decomposed in the same way as y4 only involves c4 and y6 only involves c2 and c6. The other yk's ( y1, y3,y5,and y7 ) appear to be more difficult to decompose as four ci's are involved ( c1, c3,c5, and c7 ). However, it turns out that the coefficients can be expressed in terms of each other by making use of the cosine properties. For example,

  5. cos ( α - β ) = cos ( α ) cos ( β ) + sin ( α ) sin ( β )
  6. 8θ = 8π/16 = π/2
  7. cos ( 4θ ) = cos ( π/4 ) = sin ( π/4 ) = 1/√2
  8. cos ( 1θ ) = cos ( 8θ - 7θ ) = cos ( π/2 - 7θ ) = sin ( 7θ )
  9. c3 = cos ( 3θ ) = cos ( 7θ - 4θ ) = cos ( 7θ ) cos ( 4θ ) + sin ( 7θ ) sin ( 4θ ) = ( c7 + c1 ) / √2
  10. c5 = cos ( 5θ ) = cos ( 1θ + 4θ ) = cos ( 1θ ) cos ( 4θ ) - sin ( 1θ ) sin ( 4θ ) = ( c1 - c7 ) / √2
  11. Making use of these identities, we can similarly apply the decomposition steps to all the yi's. For example, we can express y1 as

    y1 = ( x0 - x7 )c1 + ( x1 - x6 )c3 + ( x2 - x5 )c5 + ( x3 - x4 )c7
      = x'7c1 + x'6c3 + x'5c5 + x'4c7
      = x'7c1 + x'6( c1 + c7 ) / √2 + x'5 ( c1 - c7 ) / √2 + x'4c7
      = [ x'7 + ( x'6 + x'5 ) / √2 ] c1 + [x'4 + ( x'6 - x'5 ) / √2 ] c7
      = [ x'7 + x''6 ] c1 + [x'4 + x''5 ] c7
      = x'''7 c1 + x'''4 c7
    We also observe that in the process we often have calculations in the form,
    c = ( a + b )
    d = ( a - b )
      ( 1 )
    Such a computation can be represented by a diagram shown below, which is referred to as a butterfly computation because of its appearance; actually, it is a simple flow graph.

    Figure 1

    If we include the constants ak in our calculation of yk and let

    Ck = ckak = ck / 2 = cos ( θ ) / 2     k ≥ 1, and
    C0 = 1 / √2
    we arrive at the following flow graph, where "-" represents "-1". ( Note that a0 = 1/( 2√2 ) = C4. )

    Figure 2
    We can obtain yk by tracing the paths of getting to it. For example, according to our notation, y2 is given by
    y2 = x''2C6 + x''3C2
      = [ x'1 - x'2 ]C6 + [ x'0 - x'3 ]C2
      = [ ( x1 + x6 ) - ( x2 + x5 ]C6 + [ ( x0 + x7 ) - ( x3 + x4 ) ]C2

    The following piece of codes shows how y[k]'s are computed from x[k]'s.

       for (j = 0; j < 4; j++) {                 //first stage transform, see flow-graph
            j1 = 7 - j;
            x1[j] = x[j] + x[j1];
            x1[j1] = x[j] - x[j1];
       }
       x[0] = x1[0] + x1[3];                     //second stage transform
       x[1] = x1[1] + x1[2];
       x[2] = x1[1] - x1[2];
       x[3] = x1[0] - x1[3];
       x[4] = x1[4];
       x[5] = ( x1[6] - x1[5] ) * C0; 
       x[6] = ( x1[6] + x1[5]) * C0
       x[7] = x1[7];
    
       y[0] = ( x[0] + x[1] ) * C4;             //upper-half of third (final) stage, see flow-graph
       y[4] = ( x[0] - x[1] ) * C4;
       y[2] = x[2] * C6 + x[3] * C2;
       y[6] = x[3] * C6 - x[2] * C2;
    
       x1[4] = x[4] + x[5];                      //lower-half of third stage
       x1[5] = x[4] - x[5];
       x1[6] = x[7] - x[6];
       x1[7] = x[7] + x[6];
    
       y[1] = x1[4] * C7 + x1[7] * C1;        //lower-half of fourth ( final ) stage
       y[7] = x1[7] * C7 - x1[4] * C1;
       y[5] = x1[5] * C3 + x1[6] * C5;
       y[3] = x1[6] * C3 - x1[5] * C5;
    

    This code works a lot faster than the direct implementation of DCT. Moreover, we can make further improvement if we use integer-arithmetic in the calculations.

  12. Integer Arithmetic

    In practice, we need to use integer arithmetic to implement a Fast DCT as that will speed up the code significantly. To see how this works, without loss of generality, let us consider 16-bit integers and consider a simple example with a number x = 2.75. We cannot express this number as an integer directly, but we can imagine that there is a binary point at the right side of bit 0 of a 16-bit integer. In this imaginary format, x can be represented as a bit vector with imaginary digits ( i.e. 2.75 = 21 + 2-1 + 2-2 ) depicted in the following diagram.

    bit position: 9 8 7 6 5 4 3 2 1 0  
    x =     0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 .1 1

    Of course, a 16-bit integer does not have the part shown in pink. To retain the information of the part in pink, we can shift the whole thing left by 8 bits, which is the same as multiplying 2.75 by 256 ( = 28 ). After this shift, the binary point will be at position between bit 8 and bit 7 and the number becomes an integer shown below.

    bit position: 9 8   7 6 5 4 3 2 1 0
    x' =     0 0 0 0 0 0 1 0 . 1 1 0 0 0 0 0 0

    The value of x becomes

    x' = 29 + 27 + 26 = 704

    which is the same as 2.75 x 256. If we now divide x' by 256, or right-shift it by 8 bits, we obtain the integer 2; that is, we have truncated the pink digits, the fraction part of 2.75. In many cases, we would prefer to obtain a value of 3 after the shift operations. Such an operation is the same as a round operation. We can achieve this by first adding 0.5 to 2.75 before the truncation. If we express 0.5 in our imaginary format and left-shift it by 8 bits, we obtain the value 128 ( = 27 ). So adding 0.5 to x corresponds to the action of adding 27 to x'. This is illustrated in the following figure.

    bit position: 9 8   7 6 5 4 3 2 1 0
    x' =     0 0 0 0 0 0 1 0 . 1 1 0 0 0 0 0 0
    + 0.5 x 27 =     0 0 0 0 0 0 0 0 . 1 0 0 0 0 0 0 0
    -----------------------------------------------------
    x'' =     0 0 0 0 0 0 1 1 . 0 1 0 0 0 0 0 0

    Obviously, when we right-shift x'' by 8 bits, we obtain our desired value 3. So in general, if we multiply real positive numbers by 2n to convert them to integers and we want to achieve the rounding effect, we can add the value 2n-1 to the results before shifting them right n bits. The above discussion covers positive numbers. How are about negative numbers? Should we still add 0.5 to the a number or should we subtract 0.5 from it before truncation? The following example tells us what we should do.

    Most modern-day computers use two's complement to represent negative numbers. Binary numbers that can have negative values are referred to as signed numbers, otherwise they are referred to as unsigned numbers. In two's complement representation, if we right-shift an unsigned number one bit, a 0 is always shifted into the leftmost bit position. However, if we right-shift a negative signed-number, a 1 is shifted in. For example, consider the following piece code.

    int main()
    {
      char sc = 0x82;		//8-bit signed number
      unsigned uc = 0x82;		//8-bit unsigned number
    
      sc >>= 1;     //right-shift one bit
      uc >>= 1;     //right-shift one bit
    
      printf("\nsigned shift:   0x%x", sc );
      printf("\nunsigned shift: 0x%x", uc );
    }
    
    The outputs will be something like,
    signed shift:   0xc1
    unsigned shift: 0x41
    
    as a 1 has been shifted into sc and a 0 into uc. Because of this property, integer-division of a negative signed-number by 2n is different from right-shifting it by n bits. For example,

    -1 / 2 = 0     but   -1 >> 1 = -1
    -3 / 2 = -1     but   -3 >> 1 = -2

    Now consider the negative signed-number x = -2.75. Its 2s complement representation can be obtained by complementing all bits of the corresponding positive number ( 2.75 ) and adding 1 to the right most bit. Thus it has the following binary form.

    bit position: 9 8 7 6 5 4 3 2 1 0  
    x =     1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 .0 1
    When we round -2.75, we would like to obtain a value of -3 rather -2 as the former is closer to its real value. Suppose we perform the same operations that we did to positive numbers discussed above, shifting it left 8 bits, adding 0.5x27 and then right shifting 8 bits. The following diagram illustrates the situation.
    bit position: 9 8   7 6 5 4 3 2 1 0
    x' =     1 1 1 1 1 1 0 1 . 0 1 0 0 0 0 0 0
    + 0.5 x 27 =     0 0 0 0 0 0 0 0 . 1 0 0 0 0 0 0 0
    -----------------------------------------------------
    x'' =     1 1 1 1 1 1 0 1 . 1 1 0 0 0 0 0 0
    When we right-shift x'' 8 bits, we have
    bit position: 9 8 7 6 5 4 3 2 1 0
    x'' >> 8 =     1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1
    which has a value of -3 and is what we want. So we can conclude that to obtain rounding effect, we always add 0.5 to the number before truncating ( right-shifting ) regardless if the number is positive or negative.

    When we add two numbers using integer arithmetic, we must align the binary points of the two numbers before addition; if the first number has been left-shifted by n bits, the second one must also be shifted by the same amount. In our implementation of Fast DCT, we use 32-bit integers; all the coefficients Ci is pre-multiplied by 2048 ( = 211 ). After the calculations, we right-shift the integers to obtain the final results. The code of the implementation of Fast DCT is listed below.

    #include <string.h>
    #include <stdio.h>
    #include <math.h>
    
    #define PI 3.141592653589
    const short shift = 10;			//10 bits precision for fixed-point arithmetic
    const short shift1 = 2 * shift;		//at the final stage, values have been shifted twice
    const int fac = 1 << shift ;		//multiply all constants by 2^10 ( = 1024 )
    const int delta = 1 << (shift-1) ;	//for rounding adjustment ~ 0.5x2^10
    const int delta1 = 1 << (shift1-1);	//for final rounding adjustment ~ 0.5x2^20
    const double a = PI / 16.0;		//angle theta
    
    //DCT constants; use integer-arithmetic.
    const int c0 = (int) ( 1 / sqrt ( 2 ) * fac );
    const int c1 = (int) ( cos ( a ) / 2 * fac );
    const int c2 = (int) ( cos ( 2*a ) / 2 * fac );
    const int c3 = (int) ( cos ( 3*a ) / 2 * fac );
    const int c4 = (int) ( cos ( 4*a ) / 2 * fac );
    const int c5 = (int) ( cos ( 5*a ) / 2 * fac );
    const int c6 = (int) ( cos ( 6*a ) / 2 * fac );
    const int c7 = (int) ( cos ( 7*a ) / 2 * fac );
    
    /*
      DCT function.
      Input: X, array of 8x8, containing data with values in [0, 255].
      Ouput: Y, array of 8x8 DCT coefficients.
    */
    void dct(int *X, int *Y)
    {
       int   i, j, j1, k;
       int x[8], x1[8], m[8][8];
          
    
       /*
         Row transform 
         i-th row, k-th element 
       */
       for (i = 0, k = 0; i < 8; i++, k += 8) {
          for (j = 0; j < 8; j++) 
            x[j] = X[k+j];				//data for one row
          
          for (j = 0; j < 4; j++) {			//first stage transform, see flow-graph
            j1 = 7 - j;
            x1[j] = x[j] + x[j1];
            x1[j1] = x[j] - x[j1];
          }
          x[0] = x1[0] + x1[3];			//second stage transform
          x[1] = x1[1] + x1[2];
          x[2] = x1[1] - x1[2];
          x[3] = x1[0] - x1[3];
          x[4] = x1[4];
          x[5] = ((x1[6] - x1[5]) * c0 + delta ) >> shift;	//after multiplication, add delta for rounding,
          x[6] = ((x1[6] + x1[5]) * c0 + delta ) >> shift;	//  shift-right to undo 'x fac' to line up
          x[7] = x1[7];					//  binary points of all x[i]
    
          m[i][0] = (x[0] + x[1]) * c4;		//upper-half of third (final) stage, see flow-graph
          m[i][4] = (x[0] - x[1]) * c4;
          m[i][2] = x[2] * c6 + x[3] * c2;
          m[i][6] = x[3] * c6 - x[2] * c2;
    
          x1[4] = x[4] + x[5];			//lower-half of third stage
          x1[5] = x[4] - x[5];
          x1[6] = x[7] - x[6];
          x1[7] = x[7] + x[6];
    	
          m[i][1] = x1[4] * c7 + x1[7] * c1;	//lower-half of fourth ( final ) stage
          m[i][7] = x1[7] * c7 - x1[4] * c1;
          m[i][5] = x1[5] * c3 + x1[6] * c5;
          m[i][3] = x1[6] * c3 - x1[5] * c5;
       } //for i
    
       /*
    	At this point, coefficients of each row ( m[i][j] ) has been multiplied by 2^10.
    	We can undo the multiplication by << 10 here before doing the column transform.
    	However, as we are using int variables, which are 32-bit to do multiplications, we
    	can tolerate another multiplication of 2^10. So we delay our undoing until the end
    	of the vertical transform and we undo all left-shift operations by shifting
    	the results right 20 bits ( i.e. << 2 * 10 ).
       */
       // Column transform 
       for (i = 0; i < 8; i++) {			//eight columns
    
          //consider one column
          for (j = 0; j < 4; j++) {		//first-stage operation
            j1 = 7 - j;
            x1[j] = m[j][i] + m[j1][i];
            x1[j1] = m[j][i] - m[j1][i];
          }
    						//second-stage operation
          x[0] = x1[0] + x1[3];
          x[1] = x1[1] + x1[2];
          x[2] = x1[1] - x1[2];
          x[3] = x1[0] - x1[3];
          x[4] = x1[4];
          //undo one shift for x[5], x[6] to avoid overflow
          x1[5] = (x1[5] + delta) >> shift;
          x1[6] = (x1[6] + delta) >> shift;
    
          x[5] = (x1[6] - x1[5]) * c0;	
          x[6] = (x1[6] + x1[5]) * c0;
          x[7] = x1[7];
    
          m[0][i] = (x[0] + x[1]) * c4;		//upper-half of third (final) stage, see flow-graph
          m[4][i] = (x[0] - x[1]) * c4;
          m[2][i] = ( x[2] * c6 + x[3] * c2 ) ;
          m[6][i] = ( x[3] * c6 - x[2] * c2 );
          
          x1[4] = x[4] + x[5];			//lower-half of third stage
          x1[7] = x[7] + x[6];
          x1[5] = x[4] - x[5];
          x1[6] = x[7] - x[6];
    
          m[1][i] = x1[4] * c7 + x1[7] * c1;	//lower-half of fourth stage
          m[5][i] = x1[5] * c3 + x1[6] * c5;
          m[3][i] = x1[6] * c3 - x1[5] * c5;
          m[7][i] = x1[7] * c7 - x1[4] * c1;
       } // for i
    
      //we have left-shift (multiplying constants) twice
      for ( i = 0; i < 8; ++i ) {
        for ( j = 0; j < 8; ++j ) {
          *Y++ =  ( m[i][j] + delta1 ) >> shift1; 	//round the value by adding delta
        }
      }
    }
    

  13. Inverse DCT ( IDCT ) Implementation

    Once we have written the code for DCT, the implementation of IDCT becomes easy. All we need to do is to reverse the steps in the DCT program. In the flow graph shown in Figure 2, when we traverse from left to right, we obtain the DCT; if we traverse from right to left, we obtain the IDCT. To understand why this is so, lets examine a butterfly computation, where we obtain (c, d) from (a, b). If we reverse the direction of the butterfly flow-graph, going from right to left, we recover (a, b) from (c, d) by

    a = ( c + d ) / 2
    b = ( c - d ) / 2
      ( 2 )
    Figure 3 shows the reversed butterfly except that we have suppressed writing the constant 1/2 in the diagram.

    Figure 3
    Basically, (1) and (2) have the same form. If we had multiplied the right side of (1) by 1/√2, then the forms for (1) and (2) would become identical; there's no difference in going forward ( from left to right ) or going backward ( from right to left ) in the flow graph. Another case shown in Figure 2 is of the form
    c = aCi - bCj
    d = aCj + bCi
      ( 3 )
    where Ck = cos (kθ), and θ = π / 16. ( If the minus sign occurs in the second equation of (3), we can simply interchange the roles of c, and d to make it look the same as (3). ) It turns out that we always have i + j = 8. Therefore,
       
    and we can express (3) in the following matrix form,
     
    c
    d
    = cos(iθ) - sin(iθ)
    sin(iθ) + cos(iθ)
    a
    b
      ( 4 )

    You may now recognize that the operation of (4) is a rotation of iθ on a plane about the origin, which rotates the point (a, b) to the point (c, d). Obviously, the inverse of such a transformation is a rotation of -iθ which will bring the point (c, d) back to (a, b). That is,

    a
    b
    = cos(iθ) + sin(iθ)
    -sin(iθ) + cos(iθ)
    c
    d
      ( 5 )
     
    as sin(-iθ) = -sin(iθ) and cos(-i&theta) = cos(iθ). We can then easily obtain the inverse of (3) by simply changing the Ci's associated with negative signs; we replace any -Ck by Ck and the corresponding Ck by -Ck. So the inverse of ( 3 ) is
     
    a = cCi + dCj
    b = -cCj + dCi
      ( 6 )

      By doing this substitution for all -Ck's in Figure 2, we can easily obtain the flow graph for IDCT. In developing the IDCT program, we start with the DCT code, starting from the bottom of the DCT function, and working our way up with the replacing strategy; the result is the IDCT code. We list the complete program, dct_video.cpp that implements both DCT and IDCT below; they are straight forward implementations of what we have discussed. The functions are ready for use for video compression.

    /*
      dct_video.cpp
      An implementation of 8x8 DCT and IDCT for video compression.
      32-bit integer-arithmetic is assumed.
      For DCT operation:
    	dct ( int *X, int *Y );
    	X is the input pointing to an 8x8 array of samples; values must be within [0, 255]
    	Y is the output pointing to an 8x8 array of DCT coefficients.
      For IDCT operation:
    	idct ( int *Y, int *X );
    	Y is the input pointing to an 8x8 array of DCT coefficients.
    	X is the output pointing to an 8x8 array of sample values.
      For more detailed explanations, see
    	http://www.webkinesia.com/games/vcompress3.php
    
      compile: g++ -c dct_video.cpp
      To use them, include the the following header statements in your application.
    	void dct(int *X, int *Y);
    	void idct(int *Y, int *X);
    	
    */
    
    #include <string.h>
    #include <stdio.h>
    #include <math.h>
    
    #define PI 3.141592653589
    const short shift = 10;			//10 bits precision for fixed-point arithmetic
    const short shift1 = 2 * shift;		//at the final stage, values have been shifted twice
    const int fac = 1 << shift ;		//multiply all constants by 2^10 ( = 1024 )
    const int delta = 1 << (shift-1) ;	//for rounding adjustment ~ 0.5x2^10
    const int delta1 = 1 << (shift1-1);	//for final rounding adjustment ~ 0.5x2^20
    const double a = PI / 16.0;		//angle theta
    
    //DCT constants; use integer-arithmetic.
    const int c0 = (int) ( 1 / sqrt ( 2 ) * fac );
    const int c1 = (int) ( cos ( a ) / 2 * fac );
    const int c2 = (int) ( cos ( 2*a ) / 2 * fac );
    const int c3 = (int) ( cos ( 3*a ) / 2 * fac );
    const int c4 = (int) ( cos ( 4*a ) / 2 * fac );
    const int c5 = (int) ( cos ( 5*a ) / 2 * fac );
    const int c6 = (int) ( cos ( 6*a ) / 2 * fac );
    const int c7 = (int) ( cos ( 7*a ) / 2 * fac );
    
    /*
      DCT function.
      Input: X, array of 8x8, containing data with values in [0, 255].
      Ouput: Y, array of 8x8 DCT coefficients.
    */
    void dct(int *X, int *Y)
    {
       int   i, j, j1, k;
       int x[8], x1[8], m[8][8];
          
       /*
         Row transform 
         i-th row, k-th element 
       */
       for (i = 0, k = 0; i < 8; i++, k += 8) {
          for (j = 0; j < 8; j++) 
            x[j] = X[k+j];				//data for one row
          
          for (j = 0; j < 4; j++) {			//first stage transform, see flow-graph
            j1 = 7 - j;
            x1[j] = x[j] + x[j1];
            x1[j1] = x[j] - x[j1];
          }
          x[0] = x1[0] + x1[3];			//second stage transform
          x[1] = x1[1] + x1[2];
          x[2] = x1[1] - x1[2];
          x[3] = x1[0] - x1[3];
          x[4] = x1[4];
          x[5] = ((x1[6] - x1[5]) * c0 + delta ) >> shift;	//after multiplication, add delta for rounding,
          x[6] = ((x1[6] + x1[5]) * c0 + delta ) >> shift;	//  shift-right to undo 'x fac' to line up
          x[7] = x1[7];					//  binary points of all x[i]
    
          m[i][0] = (x[0] + x[1]) * c4;		//upper-half of third (final) stage, see flow-graph
          m[i][4] = (x[0] - x[1]) * c4;
          m[i][2] = x[2] * c6 + x[3] * c2;
          m[i][6] = x[3] * c6 - x[2] * c2;
    
          x1[4] = x[4] + x[5];			//lower-half of third stage
          x1[5] = x[4] - x[5];
          x1[6] = x[7] - x[6];
          x1[7] = x[7] + x[6];
    	
          m[i][1] = x1[4] * c7 + x1[7] * c1;	//lower-half of fourth ( final ) stage
          m[i][7] = x1[7] * c7 - x1[4] * c1;
          m[i][5] = x1[5] * c3 + x1[6] * c5;
          m[i][3] = x1[6] * c3 - x1[5] * c5;
       } //for i
    
       /*
    	At this point, coefficients of each row ( m[i][j] ) has been multiplied by 2^10.
    	We can undo the multiplication by << 10 here before doing the vertical transform.
    	However, as we are using int variables, which are 32-bit to do multiplications, we
    	can tolerate another multiplication of 2^10. So we delay our undoing until the end
    	of the vertical transform and we undo all left-shift operations by shifting
    	the results right 20 bits ( i.e. << 2 * 10 ).
       */
       // Column transform 
       for (i = 0; i < 8; i++) {			//eight columns
    
          //consider one column
          for (j = 0; j < 4; j++) {			//first-stage operation
            j1 = 7 - j;
            x1[j] = m[j][i] + m[j1][i];
            x1[j1] = m[j][i] - m[j1][i];
          }
    						//second-stage operation
          x[0] = x1[0] + x1[3];
          x[1] = x1[1] + x1[2];
          x[2] = x1[1] - x1[2];
          x[3] = x1[0] - x1[3];
          x[4] = x1[4];
          //undo one shift for x[5], x[6] to avoid overflow
          x1[5] = (x1[5] + delta) >> shift;
          x1[6] = (x1[6] + delta) >> shift;
    
          x[5] = (x1[6] - x1[5]) * c0;	
          x[6] = (x1[6] + x1[5]) * c0;
          x[7] = x1[7];
    
          m[0][i] = (x[0] + x[1]) * c4;		//upper-half of third (final) stage, see flow-graph
          m[4][i] = (x[0] - x[1]) * c4;
          m[2][i] = ( x[2] * c6 + x[3] * c2 ) ;
          m[6][i] = ( x[3] * c6 - x[2] * c2 );
          
          x1[4] = x[4] + x[5];			//lower-half of third stage
          x1[7] = x[7] + x[6];
          x1[5] = x[4] - x[5];
          x1[6] = x[7] - x[6];
    
          m[1][i] = x1[4] * c7 + x1[7] * c1;	//lower-half of fourth stage
          m[5][i] = x1[5] * c3 + x1[6] * c5;
          m[3][i] = x1[6] * c3 - x1[5] * c5;
          m[7][i] = x1[7] * c7 - x1[4] * c1;
       } // for i
    
      //we have left-shift (multiplying constants) twice
      for ( i = 0; i < 8; ++i ) {
        for ( j = 0; j < 8; ++j ) {
          *Y++ =  ( m[i][j] + delta1 ) >> shift1; 	//round the value by adding delta
        }
      }
    }
    
    /*
      Implementation of idct() is to reverse the operations of dct().
      We first do vertical transform and then horizontal; this is easier
      for debugging as the operations are just the reverse of those in dct();
      of course, it works just as well if you do the horizontal transform first.
      So in this implementation, the first stage of idct() is the final stage of dct()
      and the final stage of idct() is the first stage of dct().
    */
    
    void idct(int *Y, int *X)
    {
       int   j1, i, j, k;
       int x[8], x1[8], m[8][8], y[8];
       k = i = 0;
    
      //column transform
      for ( i = 0; i < 8; ++i ) {
        for (j = 0; j < 8; j++) 
            y[j] = Y[i+8*j];				
        //print_x ( y );
        //getchar();
    
        x1[4] = y[1] * c7 - y[7] * c1;		//lower-half final stage of dct
        x1[7] = y[1] * c1 + y[7] * c7;
        x1[6] = y[3] * c3 + y[5] * c5;
        x1[5] = -y[3] * c5 + y[5] * c3;
          
        x[4] = ( x1[4] + x1[5] );			//lower-half of third stage of dct
        x[5] = ( x1[4] - x1[5] );
        x[6] = ( x1[7] - x1[6] );
        x[7] = ( x1[7] + x1[6] );
        
        x1[0] = ( y[0] + y[4] ) * c4;		//upper-half of third ( final ) stage of dct
        x1[1] = ( y[0] - y[4] ) * c4;
        x1[2] = y[2] * c6 - y[6] * c2;
        x1[3] = y[2] * c2 + y[6] * c6;
        
        
        x[0] = ( x1[0] + x1[3] );			//second stage of dct 
        x[1] = ( x1[1] + x1[2] );
        x[2] = ( x1[1] - x1[2] );
        x[3] = ( x1[0] - x1[3] );
    
    
        //x[4], x[7] no change
        x1[5] = ((x[6] - x[5]) * c0 + delta ) >> shift;  	//after multiplication, add delta for rounding,
        x1[6] = ((x[6] + x[5]) * c0 + delta ) >> shift;  	//  shift-right to undo 'x fac' to line up x[]s
        x[5] = x1[5];
        x[6] = x1[6];  
        
        for (j = 0; j < 4; j++) {				//first stage transform of dct
            j1 = 7 - j;
            m[j][i] = (x[j] + x[j1] );
            m[j1][i] = (x[j] - x[j1] );
        }
      } //for i
    
       //row transform
      for ( i = 0; i < 8; i++ ) {
        for (j = 0; j < 8; j++) 
            y[j] = m[i][j] ;				//data for one row
        
        x1[4] = y[1] * c7 - y[7] * c1;
        x1[7] = y[1] * c1 + y[7] * c7;
        x1[6] = y[3] * c3 + y[5] * c5;
        x1[5] = -y[3] * c5 + y[5] * c3;
    
        x[4] = ( x1[4] + x1[5] );				//lower-half of third stage
        x[5] = ( x1[4] - x1[5] );
        x[6] = ( x1[7] - x1[6] );
        x[7] = ( x1[7] + x1[6] );
        
        x1[0] = ( y[0] + y[4] ) * c4;
        x1[1] = ( y[0] - y[4] ) * c4;
        x1[2] = y[2] * c6 - y[6] * c2;
        x1[3] = y[2] * c2 + y[6] * c6;
        
        //undo one shift for x[5], x[6] to avoid overflow
        x1[5] = (x[5] + delta) >> shift;
        x1[6] = (x[6] + delta) >> shift;
        x[5] = (x1[6] - x1[5]) * c0;  		
        x[6] = (x1[6] + x1[5]) * c0;  		
          //x[4], x[7] no change
        x[0] = ( x1[0] + x1[3] );			//second stage transform
        x[1] = ( x1[1] + x1[2] );
        x[2] = ( x1[1] - x1[2] );
        x[3] = ( x1[0] - x1[3] );
    
        for (j = 0; j < 4; j++) {			//first stage transform, see flow-graph
            j1 = 7 - j;
            m[i][j] = (x[j] + x[j1]);
    	m[i][j1] = (x[j] - x[j1]);
          }
        }
        //we have left-shift (multiplying constants) twice
        for ( i = 0; i < 8; ++i ) {
          for ( j = 0; j < 8; ++j ) {
            *X++ =  ( m[i][j] + delta1 ) >> shift1;    //round the value by adding delta
          }
        }
    }
    

    We supply test_dct.cpp listed below for you to test the DCT and IDCT routines in dct_video.cpp listed above. Note that the sample values must be within the range [0, 255].

    /*
      test_dct.cpp
      For testing dct() and idct() routines in dct_video.cpp.
      Compile: g++ -o test_dct dct_video.o test_dct.cpp -lm
      Execute: ./test_dct
    */
    
    #include <stdio.h>
    #include <string.h>
    
    using namespace std;
    
    void dct(int *X, int *Y);
    void idct(int *Y, int *X);
    
    void print_block( int *X )
    {
      int k = 0;
      for ( int i = 0; i < 8; ++i ){
        printf("\n");
        for ( int j = 0; j < 8; ++j ) {
    	printf("%d, ", X[k++] );
        }
      }
    }
    	
    int main()
    {
      int X[64], Y[64], i, j, k;
      
      //some sample data
      k = 0;
      for ( i = 0; i < 8; ++i ) 
        for ( j = 0; j < 8; ++j ) 
            X[k++] =  ( 3 * i *j  + j + 1 );
      
      printf("\nOriginal Data:");
      printf("\n ------------------");
      print_block ( X );
    
      dct(  X,  Y );
    
      printf("\n\nData after dct:");
      print_block ( Y );
    
      bzero ( X, 256 );		//clear X
      
      idct ( Y, X );                                                                             
      
      printf("\n\nData recovered by idct:");
      printf("\n ------------------");
      print_block ( X );
      printf("\n");
      return 0;
    }
    

    The following is the output when you execute test_dct.

    Original Data:
     ------------------
    1, 2, 3, 4, 5, 6, 7, 8, 
    1, 5, 9, 13, 17, 21, 25, 29, 
    1, 8, 15, 22, 29, 36, 43, 50, 
    1, 11, 21, 31, 41, 51, 61, 71, 
    1, 14, 27, 40, 53, 66, 79, 92, 
    1, 17, 33, 49, 65, 81, 97, 113, 
    1, 20, 39, 58, 77, 96, 115, 134, 
    1, 23, 45, 67, 89, 111, 133, 155, 
    
    Data after dct:
    330, -210, 0, -22, 0, -6, 0, -2, 
    -191, 124, 0, 13, 0, 4, 0, 1, 
    0, 0, 0, 0, 0, 0, 0, 0, 
    -20, 13, 0, 1, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 
    -6, 4, 0, 1, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 
    -1, 1, 0, 1, 0, 0, 0, 0, 
    
    Data recovered by idct:
     ------------------
    1, 2, 3, 4, 5, 6, 7, 8, 
    1, 5, 9, 13, 17, 21, 25, 29, 
    1, 8, 15, 22, 29, 36, 43, 50, 
    1, 11, 21, 31, 41, 51, 61, 71, 
    1, 14, 27, 40, 53, 66, 79, 92, 
    1, 17, 33, 49, 65, 81, 97, 113, 
    1, 20, 39, 58, 77, 96, 115, 134, 
    1, 23, 45, 67, 89, 111, 133, 155, 
    

    <<Prev   Next >>