Video Compression
In the previous sections, we have discussed the coding and decoding of a video frame; we can refer to such a frame as an intra-frame ( I-frame ) as we have not considered any correlations between any two frames. Including this chapter, the code presented is mainly for explaining and illustrating concepts. We have not discussed or included Motion Estimation ( ME ) and Motion Compensation ( MC ) in the compression process. The steps, in particular the quantization and Huffman encoding, are far from optimized. However, the implementation uses integer arithmetic to speed up the computation and utilizes the C++ STL to simplify the coding and make it more robust.
In this section, we shall summarize what we have discussed and put together a simple codec ( coder-decoder ) that can compress an uncompressed AVI file and play it back. We assume that the data of the AVI file are saved using the 24-bit RGB colour model.
First, we define our own header of our compressed file. The header contains the basic information of the compressed file as listed in the following table.
| Bytes | Information |
|---|---|
| 0 - 9 | contains "FORJUNEV" as I.D. of file |
| 10 - 11 | frame rate ( frames per second ) |
| 12 - 15 | number of frames |
| 16 - 19 | width of an image frame |
| 20 - 23 | height of an image frame |
| 24 | bits per pixel |
| 25 | quantization method |
| 26 | extension, 0 for uncompressed data |
The default extension of such a compressed file is .fjv. This header can be implemented by defining a struct as follows.
typedef struct {
  char id[10];    //I.D. of file, should be "FORJUNEV"
  short fps;      //frames per second
  int nframes;    //number of frames
  short width;    //width of video frame
  short height;   //height of video frame
  char bpp;       //bits per pixel
  char qmethod;   //quantization method
  char ext;       //extension, 0 for uncompressed data
} VHEADER;
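As an illustration of how such a header might be filled in and checked, here is a minimal sketch. The helpers make_header() and has_fjv_id() are hypothetical names introduced for this example and are not part of the codec sources; the value passed for qmethod is only a placeholder.

```cpp
#include <cassert>
#include <cstring>

typedef struct {
    char  id[10];    // I.D. of file, should be "FORJUNEV"
    short fps;       // frames per second
    int   nframes;   // number of frames
    short width;     // width of video frame
    short height;    // height of video frame
    char  bpp;       // bits per pixel
    char  qmethod;   // quantization method
    char  ext;       // extension, 0 for uncompressed data
} VHEADER;

// Hypothetical helper: build a header for a 24-bit video.
VHEADER make_header(short fps, int nframes, short width, short height,
                    char qmethod, char ext) {
    assert(width > 0 && height > 0 && nframes >= 0);
    VHEADER h;
    std::memset(&h, 0, sizeof h);                 // zero padding bytes too
    std::strncpy(h.id, "FORJUNEV", sizeof h.id);  // 8 characters plus NUL fit in 10
    h.fps = fps;
    h.nframes = nframes;
    h.width = width;
    h.height = height;
    h.bpp = 24;
    h.qmethod = qmethod;
    h.ext = ext;
    return h;
}

// Hypothetical helper: check the I.D. string of a header read from a file.
bool has_fjv_id(const VHEADER &h) {
    return std::memcmp(h.id, "FORJUNEV", 8) == 0;
}
```

Note that the in-memory offsets of the members depend on the compiler's alignment rules and on the size of short, so it may be worth comparing offsetof() values against the table before writing the struct to disk in one piece.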
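In summary, at this point the encoding process includes the following steps.
1. Read an image frame from the AVI file.
2. Convert each 16x16 RGB macroblock to a YCbCr macroblock ( four 8x8 Y blocks, one 8x8 Cb block and one 8x8 Cr block ).
3. Apply DCT to each 8x8 block.
4. Quantize the DCT coefficients.
5. Reorder the quantized coefficients.
6. Generate 3D run-level codewords from the reordered coefficients.
7. Huffman-encode the run-level codewords and write the bit stream to the output file.

On the other hand, the decoding process includes the following steps.
1. Huffman-decode the input bit stream to recover the 3D run-levels.
2. Run-level decode to obtain an 8x8 block.
3. Reverse-reorder the coefficients.
4. Inverse-quantize the coefficients.
5. Apply inverse DCT to obtain the YCbCr components.
6. Convert the YCbCr macroblock to a 16x16 RGB macroblock for display.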
We have discussed all the above steps and their implementations in previous sections. We just need to make very minor modifications to accomplish the task. Our video codec mainly consists of a core of six files, namely, vcodec.cpp, encode.cpp, decode.cpp, dct_video.cpp, rgb_ybr.cpp, and runhuf.cpp. Most of the functions in these files have been covered previously and will be listed again at the end of this section. We also need the utility program fbitios.cpp to input or output bit streams from or to a file. Moreover, we need the avilib.o library from MPEG4IP to process AVI files; this library, along with its header and a sample AVI file, is provided at the end of this section. We discuss each of the core files briefly below.
vcodec.cpp
The main() function asks for an AVI file as input to encode. If a switch "-d" or "-s" is provided, it tries to decode the input file, which should be saved in the ".fjv" format mentioned above. If the .fjv file is uncompressed ( i.e. the value of "ext" in its header is 0 ), the program simply plays the video; otherwise it decodes the data and plays the video. If the switch is "-s" and the .fjv file is compressed, it decodes the data and saves the uncompressed data in an output file in the .fjv format. If no argument is provided, it simply presents a short usage message as shown below.
Usage: vcodec [-d|-s] infile [outfile]
Default is encoding, encoded data saved.
-d : Decoding, decoded data not saved
-s : Decoding, decoded data saved
Examples:
vcodec sample.avi ;output in sample.fjv
vcodec -d sample.fjv ;output not saved
vcodec -s sample.fjv ;output in sample_d.fjv
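The default output names in the examples above follow a simple pattern, which might be derived along the following lines. This is only a sketch; default_output_name() is a hypothetical helper and is not taken from vcodec.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive the default output file name.
//   encoding:          sample.avi -> sample.fjv
//   decoding with -s:  sample.fjv -> sample_d.fjv
std::string default_output_name(const std::string &infile, bool decoding) {
    assert(!infile.empty());
    std::string::size_type dot = infile.find_last_of('.');
    std::string::size_type slash = infile.find_last_of("/\\");
    // A dot that belongs to a directory name is not an extension separator.
    if (dot != std::string::npos && slash != std::string::npos && dot < slash)
        dot = std::string::npos;
    const std::string stem =
        (dot == std::string::npos) ? infile : infile.substr(0, dot);
    return stem + (decoding ? "_d.fjv" : ".fjv");
}
```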
As discussed before, our implementation uses the producer-consumer paradigm to separate the video display from the encoding or decoding process by creating a consumer thread to play the video and a producer thread to supply and process the data. The function player() is always the consumer thread. It simply points the screen display pointer to the memory location in the data buffer buf that the head pointer refers to; it then advances head. If head catches up with tail, where data are inserted, player() sleeps for 10 ms and checks the head and tail pointers again. ( Actually, it may be more efficient to use two semaphores to synchronize the producer and consumer. If the consumer goes to sleep, the producer is responsible for waking up the consumer and vice versa. )
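The head and tail bookkeeping described above can be sketched as a small ring buffer. This is an illustrative sketch only: the class FrameRing is hypothetical, it stores integer frame ids instead of image data, and it uses std::atomic rather than the SDL primitives used in the actual program. It assumes exactly one producer thread and one consumer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-producer/single-consumer ring of frame slots.
// One slot is kept unused so that head == tail always means "empty".
class FrameRing {
public:
    explicit FrameRing(std::size_t slots)
        : buf_(slots + 1), head_(0), tail_(0) {
        assert(slots > 0);
    }

    // Producer side: returns false when the ring is full.
    bool try_put(int frame) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire))
            return false;                      // full: caller sleeps, retries
        buf_[t] = frame;
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side: returns false when head has caught up with tail.
    bool try_get(int &frame) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;                      // empty: caller sleeps, retries
        frame = buf_[h];
        head_.store((h + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<int> buf_;
    std::atomic<std::size_t> head_;  // next slot to display
    std::atomic<std::size_t> tail_;  // next slot to fill
};
```

In a polling design like the one described, the consumer would call try_get() in a loop and call SDL_Delay(10) whenever it returns false; the producer would do the same around try_put().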
The function up_down_flip() flips the image in the up-down direction so that the SDL display orientation will be consistent with that of AVI.
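A vertical flip of this kind can be done in place by swapping rows from the top and bottom toward the middle. The sketch below is not the author's up_down_flip(); the name flip_rows() and its parameters are assumptions, and it assumes the rows are tightly packed with no padding bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: flip an image upside down in place.
// Assumes rows are stored contiguously, width * bytes_per_pixel bytes each.
void flip_rows(unsigned char *image, int width, int height,
               int bytes_per_pixel) {
    assert(image != nullptr && width > 0 && height > 0 && bytes_per_pixel > 0);
    const std::size_t stride =
        static_cast<std::size_t>(width) * bytes_per_pixel;
    for (int top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        unsigned char *a = image + top * stride;
        unsigned char *b = image + bottom * stride;
        std::swap_ranges(a, a + stride, b);   // exchange the two rows
    }
}
```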
The function encoder() is a producer thread and is the entry point for the encoding process. It first builds the Huffman table htable which will be passed to the function encode_one_frame() ( defined in file encode.cpp ) to encode an image frame. This function is also a producer and is supposed to put data in a location of the buffer buf. Actually, it obtains the image data from the input file via the MPEG4IP avi function AVI_read_frame(). If the buffer buf is full, it also goes to sleep for 10 ms.
The function decoder() is also a producer thread and is the entry point for the decoding process. It may call the function decode_ybrFrame() ( defined in file decode.cpp ) to decode a frame from the input bit stream and put the decoded data in a slot of the buffer buf[]. It may also save the decoded data in an output file.
encode.cpp
As its name implies, encode_one_frame() encodes one image frame; it is called by the thread encoder() ( in "vcodec.cpp" ). It accepts input parameters image, width, height and htable, where image holds the data of one image frame, width and height are the width and height of the image, and htable is the Huffman table to be used to encode the 3D runs. The function takes one output parameter, outputs, which points to a bitFileIO object associated with a file; the function sends the encoded bit stream to the file via outputs. To accomplish these tasks, encode_one_frame() in turn calls the functions macroblock2ycbcr(), get_dctcoefs(), quantize_block(), reorder(), and run_block().
The function get_dctcoefs() generates DCT coefficients from a YCbCr macroblock by calling the DCT transformation function dct(). It takes a pointer to a struct YCbCr_MACRO as an input parameter; a YCbCr_MACRO struct consists of four 8x8 Y sample blocks, one 8x8 Cb block and one 8x8 Cr block. It applies DCT to each of the blocks and saves the results in the array dctcoefs[][] by calling the function save_dct_block(); the saved DCT coefficients are passed back to the calling function as outputs.
The function save_dct_block(), called by get_dctcoefs(), simply saves an 8x8 DCT block in an array, which is an output argument.
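For reference, the transform applied to each 8x8 block can be written directly from the textbook 2D DCT definition. The sketch below is a slow floating-point reference, not the integer dct() used by the codec; the name dct8x8_reference() is hypothetical. It can be handy for checking the output of a faster implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reference: direct 8x8 forward DCT from the definition
//   F(u,v) = 1/4 C(u) C(v) sum_x sum_y f(x,y)
//            cos((2x+1)u*pi/16) cos((2y+1)v*pi/16),
// with C(0) = 1/sqrt(2) and C(k) = 1 otherwise.
void dct8x8_reference(const double in[8][8], double out[8][8]) {
    assert(static_cast<const void *>(in) != static_cast<const void *>(out));
    const double pi = std::acos(-1.0);
    for (int u = 0; u < 8; ++u) {
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int x = 0; x < 8; ++x)
                for (int y = 0; y < 8; ++y)
                    sum += in[x][y]
                         * std::cos((2 * x + 1) * u * pi / 16.0)
                         * std::cos((2 * y + 1) * v * pi / 16.0);
            const double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            const double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            out[u][v] = 0.25 * cu * cv * sum;
        }
    }
}
```

With this normalization, a block in which every sample has the same value c yields a DC coefficient of 8c and zero AC coefficients.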
decode.cpp
The main function is decode_ybrFrame(), which gets a YCbCr block using the function get_ybrblocks(), and converts it to an RGB 16x16 macroblock using the function ycbcr2macroblock().
The function get_ybrblocks() gets a DCT block by calling the function get_dct_block(), performs inverse DCT using idct() to obtain the YCbCr components, and returns the blocks via the output argument ycbcr_macro.
The function get_dct_block() obtains a DCT block by performing Huffman decoding, run-level decoding, reverse-reordering, and inverse quantization, calling the functions huff_decode(), run_decode(), reverse_reorder() and inverse_quantize_block() respectively. If successful, it returns 1; otherwise it returns -1.
runhuf.cpp
This file is composed of functions quantize_block(), inverse_quantize_block(), reorder(), reverse_reorder(), run_block(), run_decode(), build_htable(), escape_encode(), huff_encode(), build_huff_tree(), huff_decode() and some print functions for debugging use.
The first four functions quantize_block(), inverse_quantize_block(), reorder(), and reverse_reorder() are self-explanatory. The function run_block() takes an 8x8 sample block as input and generates 3D run-level codewords for it. The function run_decode() is the inverse of run_block(). That is, it takes a 3D run-level array and converts it to an 8x8 sample block.
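To make the run-level idea concrete, here is a minimal sketch of a 3D run-level encoder and its inverse. It assumes that each tuple is ( run, level, last ), where run counts the zeros preceding a non-zero coefficient, level is the coefficient value, and last marks the final non-zero coefficient of the block. The struct Run3D and both function names are hypothetical and do not reproduce run_block() or run_decode().

```cpp
#include <cassert>
#include <vector>

// Hypothetical 3D run-level tuple.
struct Run3D {
    int run;    // number of zeros before this coefficient
    int level;  // non-zero coefficient value
    int last;   // 1 if this is the last non-zero coefficient in the block
};

// Encode 64 already-reordered coefficients into run-level tuples.
// An all-zero block yields an empty vector.
std::vector<Run3D> run_level_encode(const int coef[64]) {
    std::vector<Run3D> runs;
    int zeros = 0;
    for (int i = 0; i < 64; ++i) {
        if (coef[i] == 0) {
            ++zeros;
            continue;
        }
        runs.push_back({zeros, coef[i], 0});
        zeros = 0;
    }
    if (!runs.empty())
        runs.back().last = 1;
    return runs;
}

// Inverse operation: rebuild the 64 coefficients from the tuples.
void run_level_decode(const std::vector<Run3D> &runs, int coef[64]) {
    for (int i = 0; i < 64; ++i)
        coef[i] = 0;
    int pos = 0;
    for (const Run3D &r : runs) {
        pos += r.run;              // skip the run of zeros
        assert(pos < 64);
        coef[pos++] = r.level;
        if (r.last)
            break;
    }
}
```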
The function build_htable() collects all pre-calculated run-level and Huffman codewords along with the sign bit and puts them in the set htable, which is an output argument. Both Huffman encoding and decoding need the table htable. The function huff_encode() uses htable to encode 3D run-level tuples; when a 3D run-level tuple is not in the set htable, escape_encode() is called to output the run-level tuple values with fixed lengths directly.
In the decoding process, after creating htable, the program must also build the Huffman tree; this is done by build_huff_tree(), which takes htable as input and uses it to build the Huffman tree, which is saved in a Dtables struct and passed as output to the calling function.
The function huff_decode() uses the Huffman tree to reconstruct the 3D run-levels from the input bit stream.
Note that here the Huffman codeword bit pattern has to be read from right to left. For example, if the Huffman codeword is 0x60 and the code length is 7, it represents the code '0', '0', '0', '0', '0', '1', '1' ( the seven low-order bits of 0x60, starting from the least significant bit ). The sign-bit is appended to the right of the Huffman codeword. For example, if the sign-bit is '1', the combined Huffman-sign codeword for 0x60 is 0xC1.
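The arithmetic in this example can be checked with a few lines of code. The two helpers below are hypothetical and serve only to illustrate the bit order; they are not functions from runhuf.cpp.

```cpp
#include <cassert>
#include <string>

// Append the sign bit to the right of the codeword: (code << 1) | sign.
unsigned append_sign(unsigned code, unsigned sign) {
    assert(sign <= 1u);
    return (code << 1) | sign;
}

// List 'length' bits of 'code' from right to left, i.e. starting at the
// least significant bit.
std::string bits_right_to_left(unsigned code, int length) {
    assert(length >= 0 && length <= 32);
    std::string s;
    for (int i = 0; i < length; ++i)
        s += ((code >> i) & 1u) ? '1' : '0';
    return s;
}
```

Here append_sign(0x60, 1) gives 0xC1, and bits_right_to_left(0x60, 7) gives "0000011".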
dct_video.cpp
rgb_ybr.cpp
We list all the files below. A sample uncompressed AVI file is also provided for you to carry out testing. You may copy the source files by copy-and-paste.