Understanding Compression Technologies for
HD and Megapixel Surveillance
When the security industry began the transition from using VHS
tapes to hard disks for video surveillance storage, the question of how
to compress and store video became a top consideration for video
surveillance system designers. As the industry moves from analog
cameras and digital video recorders (DVR) to IP cameras and network
video recorders (NVR), how to compress and store video comes into question again. When analog cameras are connected to a DVR, video compression is performed inside the recorder unit at a central location. With IP cameras, by contrast, compression is performed inside the camera and the video is then transmitted to the NVR in compressed format. The centralized compression of DVRs typically meant that all cameras in the surveillance system had to use the same compression technology. IP cameras have instead allowed for the design of hybrid systems that can use multiple compression technologies on the same system. As a result, it is critical for end-users, integrators, and system designers to have a clear understanding of the compression technologies available. Knowing when each should be used will create the best results in a system design.
There is now a wide variety of compression technologies available on the market, but no clear standard has emerged. At the same time, implementations of a particular technology may vary from one vendor to another. Often, installers think only of file and disk size and how they determine the number of days video is stored, neglecting the fact that video compression also affects other parts of a video surveillance system design. For example, video compression technology impacts the choice of hardware for client workstations, what transmission systems can be used, and the speed, success, and efficiency of investigations.
Frame-by-Frame and Temporal Compression Technologies
There are two broad groups of compression technologies currently used in video surveillance: frame-by-frame encoding and temporal encoding. Each technology group incorporates different formats and in turn has its own tradeoffs. Understanding these differences will allow the system designer to choose the right compression technology to best meet the project’s requirements and performance objectives.
Frame-by-Frame Compression
Frame-by-frame, or intra-frame, compression technologies compress video by applying a compression algorithm to each frame captured by a camera. The end result is a series of individually compressed images.
Video that is compressed using a frame-by-frame compression technology presents a number of benefits over the more complicated temporal compression technologies discussed later. First, the result of frame-by-frame compression is a series of individually compressed frames that do not require information from other frames, so they can be compressed and transmitted out of a camera more quickly to reduce latency. Second, because each frame is independently accessible and is not built up from multiple frames, recorded video can be accessed more quickly. This rapid access improves investigation efficiency and can improve the forensic viability of the recorded video. In the most demanding high security situations, providing all recorded video as a series of independent video frames ensures that the video cannot be challenged due to invalid frames generated by the compression process.
Figure 1 - Intra Frame Encoding = Frame by Frame
The two main frame-by-frame compression technologies currently used in video surveillance are discussed in more detail in the following sections: JPEG and JPEG2000.
JPEG
JPEG compression is most widely used for static image compression in digital cameras and on the internet. JPEG compression is named after the Joint Photographic Experts Group and was initially introduced in 1992. Based on a compression technique known as a ‘discrete cosine transform,’ JPEG compression relies on blocks of pixels, typically 8x8 in size, to compress the information in an image and reduce its file size. This block-based transformation typically introduces blocking artifacts like those shown in Figure 2. These block artifacts can sometimes obscure image details when JPEG images are heavily compressed.
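As a rough illustration of the block transform described above, the following sketch computes a two-dimensional discrete cosine transform on a single 8x8 block in plain Python. The function name `dct_8x8` and the sample block are assumptions made for this example; a real JPEG encoder also level-shifts, quantizes, and entropy-codes the coefficients, none of which is shown here.

```python
import math

N = 8  # JPEG typically works on 8x8 pixel blocks


def dct_8x8(block):
    """Return the 2-D DCT-II coefficients of an 8x8 block (list of 8 rows)."""
    def c(k):
        # Normalization factor for the zero-frequency (DC) term
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# Hypothetical block: a perfectly flat patch of brightness 100
flat = [[100] * N for _ in range(N)]
coeffs = dct_8x8(flat)

# All of the energy lands in the single DC coefficient; the other 63 are ~0,
# which is why smooth image regions compress so well.
print(round(coeffs[0][0], 3))
```

Because each block is transformed on its own, heavy quantization of the coefficients can leave visible seams at block boundaries, which is the origin of the blocking artifacts mentioned above.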
Figure 2 - Example image showing JPEG compression artifacts
JPEG2000
Since its introduction in 2000, JPEG2000 has gone through many revisions and updates. JPEG2000 has become a widely used standard in many different industries. For example, JPEG2000 is used in digital cinema, diagnostic medical images, document archiving, and in the capture and transmission of images from satellites and other military applications.
JPEG2000 is designed to preserve as much detail and evidence as possible within the image while greatly reducing file sizes. As a wavelet-based compression technology, JPEG2000 allows for additional compression with fewer artifacts in the image. The JPEG2000 compression process generates images that are 30 percent smaller in file size and bandwidth than a conventional JPEG image of the same visual quality, and adds features for effective streaming and transmission.
Two additional features of JPEG2000 compression are its ability to capture a wide dynamic range and its ability to scale to higher resolutions. Dynamic range is an important topic in surveillance because many cameras are challenged to record bright and dark areas that vary dramatically throughout the day and by season. The ability to capture dynamic range is expressed in bits. Most compression technologies capture 8 bits of dynamic range, which means they can describe 256 different intensities of light within the image. The sensors used in surveillance cameras are often capable of capturing more than 256 intensities of light and more information than even the human eye can see. JPEG2000 was designed to preserve the extra information that the sensors generate and maintain it in the compressed video.
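The relationship between bit depth and the number of intensities can be checked with a one-line calculation. The sketch below is illustrative only; the 10-bit and 12-bit depths are assumed example values and are not figures taken from any particular sensor.

```python
def intensity_levels(bits: int) -> int:
    """Number of distinct intensity values representable with `bits` bits per sample."""
    return 2 ** bits


# 8 bits is the depth discussed in the text; 10 and 12 are assumed examples
for bits in (8, 10, 12):
    print(f"{bits}-bit: {intensity_levels(bits)} levels")
```

Each additional bit doubles the number of intensities that can be described, so a format limited to 8 bits discards any finer gradation a sensor may have captured.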
The second key feature of JPEG2000 is its ability to scale to higher resolutions, unlike technology borrowed from the consumer market.
For example, JPEG2000 can scale to resolutions higher than MPEG-4, which is typically limited to VGA (640 x 480 pixels) or lower resolutions.
JPEG2000 is designed to scale up to extremely high resolution images and make use of its progressive compression to efficiently allow the transmission and display of those images. Information on the JPEG2000 advantage and how Avigilon has combined it with High Definition Stream Management (HDSM) for even greater results is discussed in the ‘Streaming and Network Effects of Compression’ section.
Temporal Compression
Temporal compression technologies rely both on compressing data within a single frame and on analyzing changes between frames. The result is a stream of video that is compressed over multiple frames rather than a series of individual frames. Typically, a temporal compression technology will attempt to store only incremental changes between frames and store whole frames only at periodic intervals. Though this technique can result in bandwidth efficiencies, it can also lead to the loss of information because the whole frame is not retained. The technologies used for temporal encoding are also often referred to as inter-frame or ‘time-based’ encoding because they rely on information spread out over time. The two main temporal compression technologies currently used in video surveillance are discussed in more detail in the following sections: MPEG-4 and H.264.
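The idea of storing whole frames only periodically and incremental changes otherwise can be sketched with a toy encoder. This is a deliberately simplified, lossless illustration: frames are short lists of pixel values, and the names `encode`, `decode`, and `keyframe_interval` are hypothetical. Real temporal codecs use lossy, motion-compensated prediction rather than exact per-pixel differences.

```python
def encode(frames, keyframe_interval):
    """Store a full frame every `keyframe_interval` frames, and only the
    changed pixel positions (index -> new value) for the frames in between."""
    encoded = []
    previous = None
    for n, frame in enumerate(frames):
        if n % keyframe_interval == 0:
            encoded.append(("I", list(frame)))          # whole frame
        else:
            changes = {i: v for i, (v, old) in enumerate(zip(frame, previous))
                       if v != old}
            encoded.append(("P", changes))              # changes only
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame; a change record is useless without the frame before it."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for i, v in payload.items():
                current[i] = v
        frames.append(current)
    return frames


# Hypothetical 6-pixel scene in which a single bright pixel moves to the right
video = [
    [9, 0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0],
    [0, 0, 0, 9, 0, 0],
]
stream = encode(video, keyframe_interval=3)
print(stream)
```

The decoder shows the dependency that distinguishes temporal from frame-by-frame compression: a change record can only be displayed after the frames it builds on have been reconstructed.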
Figure 3 - Inter Frame = Temporal
MPEG-4
MPEG-4 compression is an umbrella term used for many different technologies defined by the Moving Picture Experts Group. Most surveillance systems implement a variant of MPEG-4 Part 2, which was introduced in 1999. However, there are many different MPEG-4 compression technologies available and few are alike.
MPEG-4 compression incorporates the same basic technology as JPEG compression for reducing the file size of a digital image, but encodes different types of frames in a video as a group of pictures (GOP) rather than as independent images.
A GOP is typically composed of three different frame types: I, P, and B frames. Intra-Frames (I-Frames) are complete encoded images similar to the images generated using JPEG or JPEG2000 compression.
Predicted-Frames (P-Frames) are coded with reference to the previous image, which can be either another P-Frame or the previous I-Frame.
Bidirectional-Frames (B-Frames) are sandwiched between I-Frames and P-Frames, and contain information on the changes calculated between the previous and subsequent frames.
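The frame types above are often written as a pattern string such as IBBPBB. The sketch below generates such a pattern in display order. The function name `gop_pattern`, the GOP length of 12, and the choice of two B-Frames between reference frames are assumptions for illustration; actual encoders choose and order frames according to their own configuration.

```python
def gop_pattern(length: int, b_between: int) -> str:
    """Return the display-order frame types for one group of pictures.

    length    -- total number of frames in the GOP (hypothetical parameter)
    b_between -- number of B-Frames between consecutive I/P reference frames
    """
    frames = []
    for i in range(length):
        if i == 0:
            frames.append("I")                 # complete, self-contained image
        elif i % (b_between + 1) == 0:
            frames.append("P")                 # predicted from an earlier frame
        else:
            frames.append("B")                 # predicted from frames on both sides
    return "".join(frames)


print(gop_pattern(12, 2))
print(gop_pattern(6, 0))
```

With `b_between` set to 0 the pattern contains only I-Frames and P-Frames. In this simple sketch the trailing B-Frames of a GOP would have to reference the I-Frame that opens the next GOP.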
Typically, MPEG-4 compression is limited to VGA resolutions and isn’t commonly available for higher resolution surveillance cameras.
Similar to JPEG, most implementations of MPEG-4 compression in surveillance are limited to 8-bits of dynamic range. This results in a loss of information if the camera is capable of capturing a wider dynamic range.
H.264
H.264 is the newest compression technology used in the security industry. H.264 compression is actually a variant of the MPEG-4 standard, commonly referred to as MPEG-4 Part 10 Advanced Video Coding (AVC). It uses the same basic concepts of I, P, and B Frames to encode video, but relies on more advanced coding technologies. One example is motion compensation using motion vectors to compress video to a smaller size. H.264 compression allows frames to be inserted between I-Frames in a GOP to describe the relative movement of information from a reference frame, further reducing the information required to represent video.
Another feature of H.264 that extends beyond standard MPEG-4 is the availability of de-blocking filters. De-blocking filters can smooth artifacts created by large amounts of compression. This allows systems to be configured with a higher level of compression while maintaining more detail in the images. H.264 compression is ready for use with higher resolution surveillance cameras, especially as one and two megapixel H.264 IP surveillance cameras become more widely available on the market.
Stream Size, Frame Rate, Lighting and Activity with Temporal Compression
Temporal compression technologies rely on scene changes as part of their compression methodology, and can introduce variability in the size of the compressed data stream that is generated. This variability depends on whether the compression is configured to use a constant bit rate (CBR) or a variable bit rate (VBR). When a system is configured for a constant bit rate, the amount of compression applied increases as activity in the scene increases. This can add compression artifacts to the image and degrade image quality. When variable bit rate compression is used, the size of the compressed stream is allowed to vary to maintain consistent image quality.
Variability in the size of the compressed stream presents important challenges in system design. Networks and servers should be designed for the worst case bandwidth demands. This ensures that the network is not overwhelmed during periods of higher activity. Storage must also be scaled for the worst case to ensure that the required retention times can be met under all conditions. In contrast, frame-by-frame compression technologies offer a predictable (constant) compressed data stream size and therefore allow for simpler system designs.
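The effect of sizing for the worst case can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation only: the function name `storage_tb`, the 2 Mbit/s average and 4 Mbit/s peak bit rates, the 30-day retention period, and the 16-camera count are all assumed values, and the result ignores container and file system overhead.

```python
SECONDS_PER_DAY = 86_400


def storage_tb(bitrate_mbps: float, retention_days: float, cameras: int = 1) -> float:
    """Storage in decimal terabytes for continuous recording at a given bit rate."""
    total_bits = bitrate_mbps * 1_000_000 * SECONDS_PER_DAY * retention_days * cameras
    return total_bits / 8 / 1e12


# Assumed example figures for one variable bit rate stream
average_mbps = 2.0   # quiet scene
peak_mbps = 4.0      # busy scene (worst case)

print(f"average: {storage_tb(average_mbps, 30, cameras=16):.3f} TB")
print(f"worst case: {storage_tb(peak_mbps, 30, cameras=16):.3f} TB")
```

Under these assumptions, sizing for the average would leave the system with half the storage it needs if the scenes stay busy, which is why the worst case figure is the one to design for.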
Frame rate will also have a dramatic impact on the level of activity perceived in video by the compression technology. For example, a camera running at 30 frames per second may use a single I-Frame every two seconds and rely on changes in the scene to describe the other 59 frames in between. At this rate, the amount of change between individual frames could be very small, and substantial savings in bandwidth could be achieved by only storing scene changes for those frames. However, as the frame rate is decreased, the amount of change between frames can increase substantially. When running below 10 frames per second, there may be so much incremental change between frames that temporal compression has little or no benefit over a frame-by-frame compression technology.
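To see why lower frame rates increase the change between consecutive frames, consider how far a moving object travels from one frame to the next. The sketch below is illustrative only; the function name `displacement_per_frame` and the object speed of 300 pixels per second are assumptions for this example.

```python
def displacement_per_frame(speed_px_per_s: float, fps: float) -> float:
    """Pixels an object moves between two consecutive frames."""
    return speed_px_per_s / fps


# Hypothetical object crossing the scene at 300 pixels per second
for fps in (30, 10, 5):
    shift = displacement_per_frame(300, fps)
    print(f"{fps} fps: {1000 / fps:.1f} ms between frames, {shift:.0f} px of movement")
```

The same motion that shifts an object by a few pixels at 30 frames per second shifts it several times further at low frame rates, leaving less in common between neighboring frames for a temporal encoder to exploit.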
Scene lighting will also impact the ability of temporal compression algorithms to efficiently compress video. Often in low light scenes, noise within the image will be interpreted as a scene change by the compression algorithm, and cause bandwidth to increase. However, when implementing a compression technology, a camera manufacturer can optimize their motion detection algorithm to prevent the algorithm from interpreting noise in low light images as changes in the scene.
Streaming and Network Effects of Compression
By increasing camera resolution, HD and megapixel IP cameras come with their own unique challenges for storage, bandwidth, and efficient video surveillance management. These issues can be addressed by the choice of compression technology and camera resolution. Here, we will compare JPEG2000 and H.264, the most current of the frame-by-frame and temporal compression technologies, and review their respective strengths and weaknesses related to streaming within a network.
JPEG2000 and High Definition Stream Management
When used with high definition and multi-megapixel surveillance video, JPEG2000 can effectively and progressively compress the video and enable advanced functionality for retransmitting and managing the compressed video. Avigilon has designed High Definition Stream Management (HDSM) within the Avigilon Control Center Network Video Management Software (NVMS) to deliver these key features.
Figure 4 – Streaming Situational Awareness with JPEG2000
Figure 5 – Streaming High Resolution Details with JPEG2000
HDSM gives JPEG2000 compressed video a three dimensional cube-like quality, so the video can be accessed as portions within the cube.