Linux Game Programming for PC & Embedded Systems using SDL
Presented by
Fore June
Author of Windows Fan, Linux Fan



@Copyright by Fore June, 2006

Video Compression

  1. A Simple Video Codec for Intra Frames

    In the previous sections, we discussed the coding and decoding of a single video frame. We refer to such a frame as an intra-frame ( I-frame ) because we have not considered any correlation between frames. Throughout, including this chapter, the code presented is mainly for explaining and illustrating concepts. We have not discussed or included Motion Estimation ( ME ) and Motion Compensation ( MC ) in the compression process, and the steps, in particular the quantization and Huffman encoding, are far from optimized. However, the implementation uses integer arithmetic to speed up the computation and the C++ STL to simplify the coding and make it more robust.

    In this section, we summarize what we have discussed and put together a simple codec ( coder-decoder ) that can compress an uncompressed AVI file and play it back. We assume that the data of the AVI file are saved using the 24-bit RGB colour model.

    First, we define a header for our compressed file. The header contains the basic information about the file, as listed in the following table.

    Bytes      Information
    0 - 9      contains "FORJUNEV" as I.D. of file
    10 - 11    frame rate ( frames per second )
    12 - 15    number of frames
    16 - 19    width of an image frame
    20 - 23    height of an image frame
    24         bits per pixel
    25         quantization method
    26         extension, 0 for uncompressed data

    The default extension of such a compressed file is .fjv. This header can be implemented by defining a struct as follows.

    typedef struct {
      char id[10];          //I.D. of file, should be "FORJUNEV"
      short fps;            //frames per second
      int   nframes;        //number  of frames
      short width;          //width of video frame
      short height;         //height of video frame
      char  bpp;            //bits per pixel
      char  qmethod;        //quantization method
      char  ext;            //extension
    } VHEADER;
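
    Writing a struct to disk with a single fwrite() makes the file layout depend on the compiler's padding and on the machine's byte order. A minimal sketch of a field-by-field writer that follows the byte offsets in the table is given below. It assumes little-endian byte order on disk and four-byte width and height fields as listed in the table ( the struct declares them as short ). The names FjvHeader, put_le, pack_header and make_header are hypothetical and are not part of the codec's sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical in-memory header; field widths follow the byte table.
struct FjvHeader {
  char          id[10];   // "FORJUNEV", zero padded
  std::uint16_t fps;      // frames per second
  std::uint32_t nframes;  // number of frames
  std::uint32_t width;    // ASSUMPTION: 4 bytes, as in the table
  std::uint32_t height;   // ASSUMPTION: 4 bytes, as in the table
  std::uint8_t  bpp;      // bits per pixel
  std::uint8_t  qmethod;  // quantization method
  std::uint8_t  ext;      // 0 for uncompressed data
};

// Append the low nbytes bytes of v, least significant byte first
// (ASSUMPTION: little-endian byte order on disk).
static void put_le(std::vector<std::uint8_t> &out, std::uint32_t v, int nbytes) {
  for (int i = 0; i < nbytes; ++i)
    out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFFu));
}

// Serialize the header into exactly 27 bytes (offsets 0 to 26).
std::vector<std::uint8_t> pack_header(const FjvHeader &h) {
  std::vector<std::uint8_t> out(h.id, h.id + sizeof(h.id));  // bytes 0 - 9
  put_le(out, h.fps, 2);       // bytes 10 - 11
  put_le(out, h.nframes, 4);   // bytes 12 - 15
  put_le(out, h.width, 4);     // bytes 16 - 19
  put_le(out, h.height, 4);    // bytes 20 - 23
  out.push_back(h.bpp);        // byte 24
  out.push_back(h.qmethod);    // byte 25
  out.push_back(h.ext);        // byte 26
  return out;
}

// Build a header for 24-bit RGB video with the I.D. string filled in.
FjvHeader make_header(std::uint16_t fps, std::uint32_t nframes,
                      std::uint32_t width, std::uint32_t height,
                      std::uint8_t ext) {
  FjvHeader h = {};                               // zero every field
  std::strncpy(h.id, "FORJUNEV", sizeof(h.id));   // pads the rest with '\0'
  h.fps = fps;
  h.nframes = nframes;
  h.width = width;
  h.height = height;
  h.bpp = 24;
  h.ext = ext;
  return h;
}
```

    The resulting byte vector can be written with one fwrite() call and has the same layout regardless of how the compiler pads the struct.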
      

    In summary, at this point the encoding process includes the following steps.

    1. Read a 24-bit RGB image frame from an uncompressed AVI file.
    2. Decompose the RGB frame into 16x16 macroblocks.
    3. Transform and down-sample each 16x16 RGB macroblock to six 8x8 YCbCr sample blocks using YCbCr 4:2:0 format.
    4. Apply Discrete Cosine Transform ( DCT ) to each 8x8 sample block to obtain an 8x8 block of integer DCT coefficients.
    5. Forward-quantize the DCT block.
    6. Reorder each quantized 8x8 DCT block in a zigzag manner.
    7. Run-level encode each quantized reordered DCT block to obtain 3D ( run, level, last ) tuples.
    8. Use pre-calculated Huffman codewords along with sign bits to encode the 3D tuples.
    9. Save the output of the Huffman coder in a file.
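
    As an illustration of step 6, the following sketch generates the conventional zigzag scan order used by JPEG and MPEG and applies it to an 8x8 block stored row by row. It is assumed, not confirmed here, that reorder() and reverse_reorder() use this same order. The function names zigzag_order, zigzag_scan and zigzag_unscan are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// zz[k] is the row-major index (row * 8 + col) of the k-th coefficient
// visited by the zigzag scan.
std::array<int, 64> zigzag_order() {
  std::array<int, 64> zz;
  int k = 0;
  for (int s = 0; s < 15; ++s) {            // s = row + col: one anti-diagonal
    const int lo = std::max(0, s - 7);      // smallest valid row on this diagonal
    const int hi = std::min(s, 7);          // largest valid row on this diagonal
    if (s % 2 == 0) {                       // even diagonals run upward
      for (int row = hi; row >= lo; --row) zz[k++] = row * 8 + (s - row);
    } else {                                // odd diagonals run downward
      for (int row = lo; row <= hi; ++row) zz[k++] = row * 8 + (s - row);
    }
  }
  return zz;
}

// Reorder a row-major 8x8 block into zigzag order.
std::array<int, 64> zigzag_scan(const std::array<int, 64> &block) {
  const std::array<int, 64> zz = zigzag_order();
  std::array<int, 64> out;
  for (int k = 0; k < 64; ++k) out[k] = block[zz[k]];
  return out;
}

// Undo zigzag_scan(): put the k-th scanned value back at its row-major index.
std::array<int, 64> zigzag_unscan(const std::array<int, 64> &scanned) {
  const std::array<int, 64> zz = zigzag_order();
  std::array<int, 64> out;
  for (int k = 0; k < 64; ++k) out[zz[k]] = scanned[k];
  return out;
}
```

    The scan groups the low-frequency coefficients at the front of the array, so that after quantization the trailing entries are mostly zero and run-level coding becomes effective.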

    On the other hand, the decoding process includes the following steps.

    1. Construct a Huffman tree from pre-calculated Huffman codewords.
    2. Read a bit stream from the encoded file and traverse the Huffman tree to recover 3D run-level tuples to obtain 8x8 DCT blocks.
    3. Reverse-reorder and inverse-quantize each DCT block.
    4. Apply Inverse DCT ( IDCT ) to the resulting DCT blocks to obtain 8x8 YCbCr sample blocks.
    5. Use six 8x8 YCbCr sample blocks to obtain a 16x16 RGB macroblock.
    6. Combine the RGB macroblocks to form an image frame.

    We have discussed all the above steps and their implementations in previous sections, so only very minor modifications are needed to accomplish the task. Our video codec consists mainly of a core of six files, namely, vcodec.cpp, encode.cpp, decode.cpp, dct_video.cpp, rgb_ybr.cpp, and runhuf.cpp. Most of the functions in these files have been covered previously and are listed again at the end of this section. We also need the utility program fbitios.cpp to read or write bit streams from or to a file. Moreover, we need the avilib.o library from MPEG4IP to process AVI files; this library, along with its header and a sample AVI file, is provided at the end of this section. We discuss each of the core files briefly below.

    vcodec.cpp

    This file consists of the entry point main() function and four other functions, player(), encoder(), decoder(), and up_down_flip().

    The main() function takes an AVI file as input to encode. If the switch "-d" or "-s" is provided, it instead tries to decode the input file, which should be in the ".fjv" format described above. If the .fjv file is uncompressed ( i.e. the value of "ext" in its header is 0 ), the program simply plays the video; otherwise it decodes the data and plays the video. If the switch is "-s" and the .fjv file is compressed, it also saves the decoded, uncompressed data in an output file in the .fjv format. If no argument is provided, it prints a short usage message as shown below.

      Usage: vcodec [-d|-s] infile [outfile]
        Default is encoding, encoded data saved.
        -d : Decoding, decoded data not saved
        -s : Decoding, decoded data saved
      Examples:
              vcodec sample.avi       ;output in sample.fjv
              vcodec -d sample.fjv    ;output not saved
              vcodec -s sample.fjv    ;output in sample_d.fjv
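
    The switch handling and the default output names in the examples can be sketched as follows. This is only an illustration of the naming convention shown in the usage message; the names Mode, parse_mode, strip_extension and default_output are hypothetical and do not come from vcodec.cpp.

```cpp
#include <cassert>
#include <string>

enum class Mode { Encode, DecodeOnly, DecodeAndSave };

// Map the optional switch to a mode; anything else means "no switch".
Mode parse_mode(const std::string &arg) {
  if (arg == "-d") return Mode::DecodeOnly;
  if (arg == "-s") return Mode::DecodeAndSave;
  return Mode::Encode;
}

// Remove the file extension, if there is one in the last path component.
std::string strip_extension(const std::string &name) {
  const std::string::size_type dot = name.rfind('.');
  const std::string::size_type slash = name.rfind('/');
  if (dot == std::string::npos) return name;
  if (slash != std::string::npos && dot < slash) return name;  // dot is in a directory name
  return name.substr(0, dot);
}

// Default output file name; empty when nothing is saved ( -d ).
std::string default_output(Mode mode, const std::string &infile) {
  if (mode == Mode::Encode) return strip_extension(infile) + ".fjv";
  if (mode == Mode::DecodeAndSave) return strip_extension(infile) + "_d.fjv";
  return std::string();
}
```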
      
    As discussed before, our implementation uses the producer-consumer paradigm to separate the video display from the encoding or decoding process: a consumer thread plays the video and a producer thread supplies and processes the data. The function player() is always the consumer thread. It simply points the screen display pointer to the location in the data buffer buf that the head pointer refers to, and then advances head. If head catches up with tail, where data are inserted, player() sleeps for 10 ms and checks the head and tail pointers again. ( It may be more efficient to use two semaphores to synchronize the producer and consumer: if the consumer goes to sleep, the producer is responsible for waking it up, and vice versa. )
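
    The head and tail bookkeeping of such a buffer can be sketched as below. The class FrameRing and its member names are hypothetical, and the sketch deliberately leaves out all thread synchronization: in a real player each call would have to be protected by a mutex or by semaphores, and a caller that receives a null pointer would sleep and retry as described above. A slot counter is used here to tell a full buffer from an empty one, which is an assumption about the design and not necessarily how vcodec.cpp does it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical circular buffer of frame slots. The consumer reads at head,
// the producer writes at tail. NOT thread-safe as written.
class FrameRing {
 public:
  FrameRing(std::size_t nslots, std::size_t frame_bytes)
      : slots_(nslots, std::vector<unsigned char>(frame_bytes)),
        head_(0), tail_(0), count_(0) {}

  bool empty() const { return count_ == 0; }
  bool full() const { return count_ == slots_.size(); }

  // Producer: slot to fill, or nullptr if the buffer is full.
  unsigned char *write_slot() { return full() ? nullptr : slots_[tail_].data(); }
  // Producer: publish the slot just filled and advance tail.
  void commit_write() {
    assert(!full());
    tail_ = (tail_ + 1) % slots_.size();
    ++count_;
  }

  // Consumer: slot to display, or nullptr if head has caught up with tail.
  const unsigned char *read_slot() const {
    return empty() ? nullptr : slots_[head_].data();
  }
  // Consumer: release the slot just displayed and advance head.
  void commit_read() {
    assert(!empty());
    head_ = (head_ + 1) % slots_.size();
    --count_;
  }

 private:
  std::vector<std::vector<unsigned char> > slots_;
  std::size_t head_, tail_, count_;
};
```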

    The function up_down_flip() flips the image in the up-down direction so that the SDL display orientation will be consistent with that of AVI.
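
    A vertical flip of this kind amounts to swapping the first row with the last, the second with the second last, and so on. The sketch below shows one way to do it in place; the name flip_rows is hypothetical, and the code assumes that the rows are tightly packed, i.e. that each row occupies exactly width times 3 bytes for 24-bit RGB with no padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Reverse the order of the rows of a packed image in place.
// row_bytes is the number of bytes in one row (width * 3 for 24-bit RGB).
void flip_rows(unsigned char *image, std::size_t row_bytes, std::size_t height) {
  if (height < 2) return;                 // nothing to swap
  std::size_t top = 0;
  std::size_t bottom = height - 1;
  while (top < bottom) {
    // Swap row 'top' with row 'bottom', byte by byte.
    std::swap_ranges(image + top * row_bytes,
                     image + (top + 1) * row_bytes,
                     image + bottom * row_bytes);
    ++top;
    --bottom;
  }
}
```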

    The function encoder() is a producer thread and is the entry point for the encoding process. It first builds the Huffman table htable, which is passed to the function encode_one_frame() ( defined in file encode.cpp ) to encode an image frame. As a producer, encoder() also puts data in a slot of the buffer buf; it obtains the image data from the input file via the MPEG4IP AVI function AVI_read_frame(). If the buffer buf is full, it goes to sleep for 10 ms.

    The function decoder() is also a producer thread and is the entry point for the decoding process. It may call the function decode_ybrFrame() ( defined in file decode.cpp ) to decode a frame from the input bit stream and put the decoded data in a slot of the buffer buf[]. It may also save the decoded data in an output file.

    encode.cpp

    This file consists of the functions save_dct_block(), get_dctcoefs(), and encode_one_frame().

    As its name implies, encode_one_frame() encodes one image frame; it is called by the thread encoder() ( in "vcodec.cpp" ). It accepts the input parameters image, width, height and htable, where image holds the data of one image frame, width and height are the dimensions of the image, and htable is the Huffman table used to encode the 3D run-level tuples. The function takes one output parameter, outputs, which points to a bitFileIO object associated with a file; the function sends the encoded bit stream to the file via outputs. To accomplish these tasks, encode_one_frame() in turn calls the functions macroblock2ycbcr(), get_dctcoefs(), quantize_block(), reorder(), and run_block().

    The function get_dctcoefs() generates DCT coefficients from a YCbCr macroblock by calling the DCT transformation function dct(). It takes a pointer to a YCbCr_MACRO struct as an input parameter; a YCbCr_MACRO struct consists of four 8x8 Y sample blocks, one 8x8 Cb block and one 8x8 Cr block. It applies the DCT to each of the blocks and saves the results in the array dctcoefs[][] by calling the function save_dct_block(); the saved DCT coefficients are passed back to the calling function as outputs.

    The function save_dct_block(), called by get_dctcoefs(), simply saves an 8x8 DCT block in an array, which is an output argument.

    decode.cpp

    This file contains the functions that decode an input bit stream and convert the data back to the RGB color space: decode_ybrFrame(), get_ybrblocks(), and get_dct_block().

    The main function is decode_ybrFrame(), which gets a YCbCr block using the function get_ybrblocks(), and converts it to an RGB 16x16 macroblock using the function ycbcr2macroblock().

    The function get_ybrblocks() gets a DCT block by calling the function get_dct_block(), performs the inverse DCT using idct() to obtain the YCbCr components, and returns the blocks via the output argument ycbcr_macro.

    The function get_dct_block() obtains a DCT block by performing Huffman decoding, run-level decoding, reverse-reordering, and inverse quantization, calling the functions huff_decode(), run_decode(), reverse_reorder() and inverse_quantize_block() respectively. If successful, it returns 1; otherwise it returns -1.

    runhuf.cpp

    This file is composed of functions quantize_block(), inverse_quantize_block(), reorder(), reverse_reorder(), run_block(), run_decode(), build_htable(), escape_encode(), huff_encode(), build_huff_tree(), huff_decode() and some print functions for debugging use.

    The first four functions, quantize_block(), inverse_quantize_block(), reorder(), and reverse_reorder(), are self-explanatory. The function run_block() takes a quantized, reordered 8x8 block as input and generates 3D run-level tuples for it. The function run_decode() is the inverse of run_block(); that is, it takes an array of 3D run-level tuples and converts it back to an 8x8 block.
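
    Run-level coding and its inverse can be sketched as follows, using the usual meaning of the 3D tuple: run is the number of zeros preceding a non-zero coefficient, level is the value of that coefficient, and last marks the final non-zero coefficient of the block. The struct RunLevel and the function names are hypothetical and do not reproduce run_block() or run_decode(). How the codec represents a block with no non-zero coefficients is not specified here; the sketch simply produces an empty list.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical 3D tuple: ( run, level, last ).
struct RunLevel {
  int  run;    // number of zeros before this coefficient
  int  level;  // the non-zero coefficient value
  bool last;   // true for the final non-zero coefficient in the block
};

// Convert 64 zigzag-ordered coefficients to run-level tuples.
std::vector<RunLevel> run_level_encode(const std::array<int, 64> &coefs) {
  std::vector<RunLevel> out;
  int run = 0;
  for (int c : coefs) {
    if (c == 0) { ++run; continue; }   // extend the current run of zeros
    out.push_back({run, c, false});
    run = 0;
  }
  if (!out.empty()) out.back().last = true;  // trailing zeros are implied
  return out;
}

// Rebuild the 64 coefficients from the tuples; missing entries stay zero.
std::array<int, 64> run_level_decode(const std::vector<RunLevel> &tuples) {
  std::array<int, 64> coefs = {};
  int k = 0;
  for (const RunLevel &t : tuples) {
    k += t.run;                        // skip the zeros
    if (k < 0 || k >= 64) break;       // malformed input: stop quietly
    coefs[k++] = t.level;
    if (t.last) break;
  }
  return coefs;
}
```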

    The function build_htable() collects all the pre-calculated run-level tuples and their Huffman codewords, along with the sign bit, and puts them in the set htable, which is an output argument. Both Huffman encoding and decoding need the table htable. The function huff_encode() uses htable to encode 3D run-level tuples; when a 3D run-level tuple is not in the set htable, escape_encode() is called to output the run-level tuple values directly with fixed lengths.

    In the decoding process, after creating htable, the decoder must also build the Huffman tree. This is done by build_huff_tree(), which takes htable as input and uses it to build the Huffman tree; the tree is saved in a Dtables struct and passed as output to the calling function.

    The function huff_decode() uses the Huffman tree to reconstruct the 3D run-levels from the input bit stream.
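
    The idea of building a tree from a table of codewords and decoding by walking it bit by bit can be sketched as follows. The layout of the Dtables struct is not shown in this section, so the sketch uses a plain pointer-based binary tree instead. The types Node and Code, the function names, and the three codewords used in the usage note are all made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node: leaves carry a symbol, internal nodes carry -1.
struct Node {
  int symbol;
  std::unique_ptr<Node> child[2];
  Node() : symbol(-1) {}
};

// Hypothetical table entry: codeword as a string of '0' and '1', plus symbol.
struct Code {
  std::string bits;
  int symbol;
};

// Insert every codeword into a binary tree, one branch per bit.
std::unique_ptr<Node> build_tree(const std::vector<Code> &table) {
  std::unique_ptr<Node> root(new Node);
  for (const Code &c : table) {
    Node *n = root.get();
    for (char b : c.bits) {
      const int i = (b == '1') ? 1 : 0;
      if (!n->child[i]) n->child[i].reset(new Node);
      n = n->child[i].get();
    }
    n->symbol = c.symbol;   // the end of the codeword is a leaf
  }
  return root;
}

// Walk the tree along the input bits; emit a symbol at each leaf and restart.
std::vector<int> decode(const Node &root, const std::string &bits) {
  std::vector<int> out;
  const Node *n = &root;
  for (char b : bits) {
    n = n->child[(b == '1') ? 1 : 0].get();
    if (!n) break;          // bit pattern is not in the table
    if (n->symbol >= 0) {
      out.push_back(n->symbol);
      n = &root;
    }
  }
  return out;
}
```

    With the made-up table { "0" : 0, "10" : 1, "11" : 2 }, the bit string "010110" decodes to the symbols 0, 1, 2, 0. In the codec the decoded symbol would identify a run-level tuple, and the bits would come from the bit-stream reader rather than from a string.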

    Note that here the Huffman codeword bit pattern has to be read from right to left. For example, if the Huffman codeword is 0x60 and the code length is 7, it represents the code '0', '0', '0', '0', '0', '1', '1'. The sign bit is appended to the right of the Huffman codeword. For example, if the sign bit is '1', the combined Huffman-sign codeword for 0x60 is 0xC1.
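
    The two operations just described, reading a codeword from right to left and appending a sign bit on the right, reduce to simple shifts and masks. The helper names below are hypothetical. How the combined codeword is then written to the bit stream is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the sign bit to the right of a Huffman codeword.
std::uint32_t append_sign(std::uint32_t codeword, unsigned sign) {
  return (codeword << 1) | (sign & 1u);
}

// List 'length' bits of a codeword, starting from the rightmost
// ( least significant ) bit and moving left.
std::vector<int> bits_right_to_left(std::uint32_t codeword, int length) {
  std::vector<int> bits;
  for (int i = 0; i < length; ++i)
    bits.push_back(static_cast<int>((codeword >> i) & 1u));
  return bits;
}
```

    For the codeword 0x60 with length 7, bits_right_to_left() yields 0, 0, 0, 0, 0, 1, 1, and append_sign( 0x60, 1 ) yields 0xC1.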

    dct_video.cpp

    This file contains the two DCT functions, dct() and idct(); dct() transforms an 8x8 sample block to DCT coefficients and idct() reverses the process, applying inverse DCT to an 8x8 block of DCT coefficients to recover the sample block.
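
    For reference, a direct floating-point version of the 8x8 DCT and its inverse is sketched below. It follows the textbook definition and is meant only to show what the two functions compute; it is not the integer-arithmetic implementation used in dct_video.cpp and is much slower. The names dct8x8 and idct8x8 are hypothetical, and blocks are assumed to be stored row by row in arrays of 64 values.

```cpp
#include <cassert>
#include <cmath>

static const double kPi = 3.14159265358979323846;

// Normalization factor: 1/sqrt(2) for the DC term, 1 otherwise.
static double scale(int k) { return k == 0 ? 1.0 / std::sqrt(2.0) : 1.0; }

// Basis value cos( (2x + 1) u pi / 16 ).
static double basis(int x, int u) {
  return std::cos((2 * x + 1) * u * kPi / 16.0);
}

// Forward 8x8 DCT of a row-major block of samples.
void dct8x8(const int *in, double *out) {
  for (int u = 0; u < 8; ++u)
    for (int v = 0; v < 8; ++v) {
      double sum = 0.0;
      for (int x = 0; x < 8; ++x)
        for (int y = 0; y < 8; ++y)
          sum += in[x * 8 + y] * basis(x, u) * basis(y, v);
      out[u * 8 + v] = 0.25 * scale(u) * scale(v) * sum;
    }
}

// Inverse 8x8 DCT, rounded to the nearest integer sample.
void idct8x8(const double *in, int *out) {
  for (int x = 0; x < 8; ++x)
    for (int y = 0; y < 8; ++y) {
      double sum = 0.0;
      for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v)
          sum += scale(u) * scale(v) * in[u * 8 + v] * basis(x, u) * basis(y, v);
      out[x * 8 + y] = static_cast<int>(std::lround(0.25 * sum));
    }
}
```

    A block whose 64 samples all equal 100 transforms to a single non-zero DC coefficient of 800, which shows how the DCT concentrates the energy of a smooth block into a few coefficients.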

    rgb_ybr.cpp

    This file contains functions for converting RGB to YCbCr and vice versa. The functions have been discussed in detail in Section 8.
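
    As a reminder of what such a conversion looks like, the sketch below converts a single pixel in both directions using the full-range coefficients of the JFIF convention. This is an assumption made for illustration: the coefficients and the integer arithmetic of Section 8 and rgb_ybr.cpp may differ, and the struct and function names are hypothetical. The sketch also omits the 4:2:0 down-sampling of the Cb and Cr components.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

struct RGB   { int r, g, b; };
struct YCbCr { int y, cb, cr; };

// Round to the nearest integer and clamp to the range 0 .. 255.
static int clamp255(double v) {
  const long n = std::lround(v);
  if (n < 0) return 0;
  if (n > 255) return 255;
  return static_cast<int>(n);
}

// RGB to YCbCr ( ASSUMPTION: full-range JFIF coefficients ).
YCbCr rgb_to_ycbcr(const RGB &p) {
  YCbCr out = {
    clamp255(0.299 * p.r + 0.587 * p.g + 0.114 * p.b),
    clamp255(128.0 - 0.168736 * p.r - 0.331264 * p.g + 0.5 * p.b),
    clamp255(128.0 + 0.5 * p.r - 0.418688 * p.g - 0.081312 * p.b)
  };
  return out;
}

// YCbCr back to RGB, using the matching inverse coefficients.
RGB ycbcr_to_rgb(const YCbCr &p) {
  const double cb = p.cb - 128.0;
  const double cr = p.cr - 128.0;
  RGB out = {
    clamp255(p.y + 1.402 * cr),
    clamp255(p.y - 0.344136 * cb - 0.714136 * cr),
    clamp255(p.y + 1.772 * cb)
  };
  return out;
}
```

    Because each component is rounded to an integer, a round trip through the two functions may change a colour component by a small amount; this loss is separate from the loss introduced by quantization.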

    We list all the files below. A sample uncompressed AVI file is also provided for testing. You may copy the source files by copy-and-paste.

    codecio.h
    runhuf.h
    encode.h
    decode.h
    dct_video.h
    common.h
    fbitios.h
    avilib.h
    vcodec.cpp
    runhuf.cpp
    encode.cpp
    decode.cpp
    dct_video.cpp
    fbitios.cpp
    Makefile
    sample_video.avi ( not zipped )
    avilib.o ( not zipped )
