When security cameras are deployed, they are positioned to cover the scenes of most interest. Within a camera's field of view, some areas matter more than others, and some are of no interest at all. This becomes increasingly true as camera resolution rises from the traditional D1 resolution to the 1.3MP and 2MP that are standard today, and on to 5MP and more. These cameras either cover more area with a single camera or provide more detail within the traditional field of view. The increasing resolution, even with improved encoding efficiency, requires more and more bandwidth and storage.
A new technology called Region of Interest (ROI) Encoding has been developed to address this issue. It provides the highest image quality for the areas of a scene or the objects of most interest while reducing the quality level in uninteresting areas, giving the best balance of quality and bandwidth.
Encoding Quantization and Quality Control
Currently, most lossy compression standards, such as JPEG, MPEG-2, MPEG-4, H.264/AVC and the recently unveiled HEVC/H.265, adopt spatial-frequency transform encoding. At the encoding end, the image is transformed and quantized into data, which is then transferred to the decoding device without further data loss. The decoding device applies inverse quantization and the inverse transform to turn the data back into an image. The differences in image information between the encoding end and the decoding end appear as image distortion. These differences are mainly caused by quantization, which is the crucial factor for both encoding quality and bitrate in the compression process.
As a rough approximation, quantization works like division. For example, suppose the quantization coefficient is 8; by convention, the coefficient itself is not compressed. Taking 31 as the original data to encode and dividing by 8 gives 3.875, which rounds to 4.
So the data transferred is 4. At the decoding end, the coefficient 8 is multiplied by 4, giving 32. The distortion is then 1 (32 - 31 = 1). Because the rounding error can be as large as half the coefficient, we can conclude that the higher the value of the coefficient, the larger the distortion can become.
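The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the divide, round and multiply idea. The function names are hypothetical, and rounding halves away from zero is an assumption, since real codecs define their own rounding rules.

```python
def quantize(value: int, step: int) -> int:
    """Divide by the quantization coefficient and round to the nearest integer.

    Assumption: exact halves are rounded away from zero.
    """
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step)


def dequantize(level: int, step: int) -> int:
    """Inverse quantization: multiply the transferred level by the coefficient."""
    return level * step


def distortion(value: int, step: int) -> int:
    """Absolute difference between the original and the restored value."""
    return abs(dequantize(quantize(value, step), step) - value)


def max_distortion(step: int, max_value: int = 255) -> int:
    """Worst-case distortion over all inputs in 0..max_value for a given coefficient."""
    return max(distortion(v, step) for v in range(max_value + 1))


if __name__ == "__main__":
    level = quantize(31, 8)
    print("transferred:", level, "restored:", dequantize(level, 8))
    for step in (2, 8, 32):
        print("coefficient", step, "worst-case distortion", max_distortion(step))
```

For the value 31 and the coefficient 8, the sketch transfers 4 and restores 32, matching the example in the text. The worst-case figures show the distortion growing with the coefficient.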
We take a block as the basic unit for compression, for instance a block of 8*8 pixels. The block goes through the spatial-frequency transform, then quantization and inverse quantization. See the figure below:
After quantization, many small values become 0, so the compression reduces the amount of information transferred. When the image data is restored at the decoding end, it differs in many places from the data at the encoding end. To be clear, the figure above shows only a simplified quantization process. In practice, different compression standards use different quantization methods, and the process is much more complicated.
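The block pipeline described above can be tried out with a small pure-Python sketch. It assumes an orthonormal 8*8 DCT-II as the spatial-frequency transform and a single uniform quantization coefficient for the whole block. Both are simplifications chosen for illustration, not the method of any particular standard, and all function names are hypothetical.

```python
import math

N = 8  # block size assumed for this illustration


def _basis(u: int, x: int) -> float:
    """Entry of the orthonormal DCT-II matrix."""
    scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
    return scale * math.cos((2 * x + 1) * u * math.pi / (2 * N))


C = [[_basis(u, x) for x in range(N)] for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Forward 2D transform: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2D transform: C^T * coeffs * C."""
    return matmul(matmul(transpose(C), coeffs), C)


def quantize_block(coeffs, step):
    # Python's round() sends exact halves to the nearest even integer.
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize_block(levels, step):
    return [[level * step for level in row] for row in levels]


def roundtrip(block, step):
    """Return (number of zero levels, largest pixel error) for one block."""
    levels = quantize_block(dct2(block), step)
    restored = idct2(dequantize_block(levels, step))
    zeros = sum(level == 0 for row in levels for level in row)
    max_err = max(abs(restored[i][j] - block[i][j])
                  for i in range(N) for j in range(N))
    return zeros, max_err


if __name__ == "__main__":
    # A smooth gradient block, made up for the demonstration.
    block = [[100 + 2 * i + 3 * j for j in range(N)] for i in range(N)]
    for step in (2, 8, 32):
        zeros, max_err = roundtrip(block, step)
        print(f"coefficient {step}: {zeros} zero levels, max error {max_err:.2f}")
```

Running the sketch with several coefficients shows the trade-off from the text: a larger coefficient can only turn more levels into 0, which means less data to transfer, at the price of a restored block that deviates more from the original.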
H.264/AVC and ROI Encoding
Most current compression standards divide the image into 16*16 macroblocks, and the macroblock is the basic unit of compression. The H.264/AVC standard allows the quantization coefficient to differ for each macroblock, and this provision leaves room for ROI technology. In theory, H.264/AVC can support any number of ROIs of any shape, at a granularity of one 16*16 macroblock. The range of quality difference between ROI and non-ROI areas that H.264/AVC allows also meets market requirements.
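To make the per-macroblock idea concrete, here is a minimal sketch that turns pixel rectangles into a map with one quantization parameter (QP) per 16*16 macroblock. It relies on the H.264/AVC QP range of 0 to 51, where a lower QP means finer quantization. The function name, the rectangle format and the QP values in the usage example are hypothetical, and how a real encoder accepts such a map varies by implementation and is not shown.

```python
MB_SIZE = 16            # macroblock edge length in pixels
QP_MIN, QP_MAX = 0, 51  # valid QP range in H.264/AVC


def clamp_qp(qp: int) -> int:
    return max(QP_MIN, min(QP_MAX, qp))


def build_qp_map(width, height, rois, roi_qp, background_qp):
    """Return a 2D list with one QP per macroblock.

    rois is a list of (x, y, w, h) rectangles in pixels (hypothetical format).
    Any macroblock that a rectangle touches receives roi_qp; all others
    receive background_qp.
    """
    mb_cols = (width + MB_SIZE - 1) // MB_SIZE
    mb_rows = (height + MB_SIZE - 1) // MB_SIZE
    qp_map = [[clamp_qp(background_qp)] * mb_cols for _ in range(mb_rows)]
    for x, y, w, h in rois:
        first_col = max(0, x // MB_SIZE)
        last_col = min(mb_cols - 1, (x + w - 1) // MB_SIZE)
        first_row = max(0, y // MB_SIZE)
        last_row = min(mb_rows - 1, (y + h - 1) // MB_SIZE)
        for row in range(first_row, last_row + 1):
            for col in range(first_col, last_col + 1):
                qp_map[row][col] = clamp_qp(roi_qp)
    return qp_map


if __name__ == "__main__":
    # Illustrative values only: a 64x48 frame with one ROI rectangle.
    for row in build_qp_map(64, 48, [(16, 16, 20, 10)], roi_qp=24, background_qp=38):
        print(row)
```

Because the map is built macroblock by macroblock, an ROI of arbitrary shape can be approximated by passing several rectangles, which mirrors the "any shape, any number" property described above.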
The Effects and Influence of ROI Encoding
We use an actual surveillance scene as an example to compare the effects of ROI and non-ROI encoding. In the figure below, the ROI is marked with a red rectangular frame, and the blue frame marks the differences between the ROI image and the non-ROI image. The green frame compares the non-ROI area before and after ROI encoding, showing the influence of ROI encoding on the non-ROI area.
Normally, one method of configuring ROI encoding is to set it manually through the user interface. This is not very convenient: it depends on manpower and may be difficult for the user to configure. A more advanced method is to trigger ROI encoding through intelligent analysis of objects in the scene. For instance, an abnormal event can trigger ROI encoding so that a certain area is shown more clearly than usual. However, despite its flexibility, the latter method places higher demands on the performance and capacity of the compressing device.
Take ROI triggered by face recognition as an example. The figure below shows the ROI encoding effects based on face recognition. The manually configured ROI remains in effect, and the intelligent analysis additionally sets the region of the human face as an ROI.
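One way to picture how manual and analysis-triggered ROIs coexist is sketched below. It assumes that a detector, which is outside the scope of this sketch, reports face rectangles for each frame as (x, y, width, height) tuples. The safety margin around each detection and all function names are assumptions made for illustration, not part of the described product.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height in pixels


def expand(rect: Rect, margin: int, frame_w: int, frame_h: int) -> Rect:
    """Grow a rectangle by margin pixels on each side, clipped to the frame."""
    x, y, w, h = rect
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(frame_w, x + w + margin)
    bottom = min(frame_h, y + h + margin)
    return (left, top, right - left, bottom - top)


def active_rois(manual: List[Rect], detections: List[Rect],
                frame_w: int, frame_h: int, margin: int = 8) -> List[Rect]:
    """Manual ROIs always stay active; detected regions are added per frame."""
    return list(manual) + [expand(d, margin, frame_w, frame_h) for d in detections]


if __name__ == "__main__":
    manual = [(50, 50, 10, 10)]     # configured through the user interface
    detections = [(4, 4, 10, 10)]   # hypothetical detector output for one frame
    print(active_rois(manual, detections, frame_w=100, frame_h=100))
```

The returned list can then be handed to whatever mechanism assigns quantization per macroblock, so the encoder treats both kinds of region identically.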
There are many surveillance situations where ROI and background information share the same scene. In this white paper, two solutions are presented: manually enabled ROI and ROI triggered by intelligent analysis of the image. Both solutions are realized with H.264/AVC encoding to allocate encoding quality and bitrate, save bandwidth, and thereby reduce cost.