MPEG4 Video Compression and Coding Technology
MPEG stands for Moving Picture Experts Group. The group was established in 1988 and is dedicated to standardizing the compression and coding of moving images and their accompanying audio. Originally, it planned to develop four standards, MPEG1, MPEG2, MPEG3 and MPEG4, to meet different bandwidth and digital image quality requirements.
Currently, MPEG1 technology is widely used on VCD, and the MPEG2 standard is used for broadcast television, DVD and so on. MPEG3 was originally a coding and compression standard developed for HDTV, but because MPEG2 proved able to cover HDTV as well, MPEG3 died in the cradle. The protagonist of today's discussion, MPEG4, officially became an international standard at the beginning of 1999. It is a scheme suited to low transmission rate applications. Compared with MPEG1 and MPEG2, MPEG4 pays more attention to the interactivity and flexibility of multimedia systems. Let's enter the colorful world of MPEG4 together.
When MPEG1 and MPEG2 were originally formulated, they were positioned as standards for high-level media representation and structure. With the rapid development of computer software and network technology, however, their weaknesses became apparent: low interactivity and flexibility, and compressed multimedia files too large for real-time transmission over a network. MPEG4, by contrast, encodes the content of moving images. The specific objects it encodes are the audio and video in the images, called "AV objects" in the terminology, and consecutive AV objects can in turn be combined to form an AV scene. The MPEG4 standard is therefore built around the encoding, storage, transmission and combination of AV objects; efficiently encoding, organizing, storing and transmitting AV objects is the basic content of the standard.
In video coding, MPEG4 supports encoding both natural and synthetic visual objects (synthetic visual objects include 2D and 3D animation, human facial expression animation and so on). In audio coding, MPEG4 provides a set of coding tools for natural sound objects such as speech and music, as well as for synthesized sound objects with reverberation and spatial orientation.
Because MPEG4 processes only the elements that differ between image frames and discards the elements that stay the same, it greatly reduces the size of the resulting multimedia files. Audio-visual files produced with MPEG4 therefore combine a high compression rate with a clear picture. Generally speaking, one hour of video can be compressed to around 350 MB, and a high-quality DVD movie can be compressed to fit on two or even one 650 MB CD. For the vast number of ordinary computer users, this means you can enjoy images close to DVD quality without buying a DVD-ROM drive. The hardware requirements of MPEG4 coding technology are also very low: a 300 MHz CPU, 64 MB of memory and an 8 MB video card are enough for smooth playback. The requirements on playback software are equally loose: once you install an MPEG4 codec driver of about 500 KB, Windows's own media player can play the files smoothly (we will talk about this in detail below).
Video Coding and the Performance of the MPEG Standards
About 70% of the information humans obtain comes from vision, so video information plays an important role in multimedia information. At the same time, video data has the greatest redundancy, and the quality of compressed video is the key factor that determines the quality of a multimedia service. Digital video technology is therefore the core technology of multimedia applications, and research on video coding has become a hot topic in the field of information technology.
The main performance measures of video coding are the data compression ratio, the compression/decompression speed, and the availability of fast implementation algorithms. Depending on whether the data obtained after compression and decompression is exactly the same as the original data, data compression can be divided into two types: lossless compression (reversible compression) and lossy compression (irreversible compression).
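The difference between the two types can be illustrated with a minimal Python sketch. It is only an illustration: the standard-library zlib codec stands in for reversible compression, and a uniform quantizer (the hypothetical `lossy_roundtrip` helper, with an assumed step of 16) stands in for irreversible compression. Neither is the method MPEG itself uses.

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Reversible compression: decompression returns the exact original bytes."""
    return zlib.decompress(zlib.compress(data))

def lossy_roundtrip(samples, step=16):
    """Irreversible compression: snap each 8-bit sample to a multiple of `step`.

    The fine detail discarded by the quantizer cannot be recovered afterwards.
    """
    return [min(255, round(s / step) * step) for s in samples]

samples = [12, 13, 15, 200, 201, 203, 90, 91]

restored_exact = list(lossless_roundtrip(bytes(samples)))
restored_approx = lossy_roundtrip(samples)

# Largest per-sample error introduced by the lossy path.
max_error = max(abs(a - b) for a, b in zip(samples, restored_approx))
print(restored_exact == samples, restored_approx, max_error)
```

The lossless path reproduces the samples exactly, while the lossy path returns values that are only close to the originals, with an error of at most half the quantization step.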
Traditional compression coding is based on Shannon's information theory. It uses classical set theory as its tool and describes the source with a probabilistic statistical model. Because its compression idea rests on data statistics, it can only remove data redundancy and belongs to the category of low-level compression coding.
With the rapid development of the disciplines related to video coding and of emerging disciplines, a new generation of data compression technology has been born and is maturing day by day. Its coding idea has changed from being based on pixels and pixel blocks to being content-based. It breaks through the framework of Shannon's information theory, takes full account of the visual characteristics of the human eye and the characteristics of the source, and achieves data compression by removing content redundancy. It can be divided into two types, object-based and semantics-based: the former belongs to middle-level compression coding, the latter to high-level compression coding.
At the same time, the formulation of video coding standards is also improving day by day. Video coding standards are mainly developed by ITU-T and ISO/IEC. The video standards published by ITU-T include H.261, H.262, H.263, H.263+ and H.263++. The MPEG series of standards published by ISO/IEC includes MPEG-1, MPEG-2, MPEG-4 and MPEG-7, and MPEG-21 is planned for publication.
MPEG, the Moving Picture Experts Group, is an international organization that specializes in developing multimedia video and audio compression coding standards. The MPEG series of standards has become the most influential family of multimedia technology standards in the world. Among them, MPEG-1 and MPEG-2 use first-generation data compression coding technologies based on Shannon's information theory, such as predictive coding, transform coding, entropy coding and motion compensation; MPEG-4 (ISO/IEC 14496) is an international standard based on second-generation compression coding technology. It takes audio-visual media objects as its basic unit and uses content-based compression coding to integrate digital video and audio, graphics synthesis applications and interactive multimedia. The MPEG series of standards has had a huge and far-reaching impact on VCD, DVD and other audio-visual consumer electronics, on digital television and high-definition television (DTV & HDTV), and on multimedia communication and other information industries.
Before MPEG-4 was formulated, MPEG-1, MPEG-2, H.261 and H.263 all used first-generation compression coding technology, designing encoders around the statistical characteristics of image signals, which belongs to the category of waveform coding. A first-generation compression coding scheme divides the video sequence into a series of frames by time, and divides each frame into macroblocks for motion compensation and coding. This coding scheme has the following defects:
·Because the image is divided into fixed blocks of the same size, a serious blocking effect, i.e. a mosaic effect, occurs at high compression ratios;
·Operations on image content such as access, editing and playback are not possible;
·The characteristics of the human visual system (HVS, Human Visual System) are underutilized.
MPEG-4 represents the second generation of compression coding technology, which is model-based and object-based. It makes full use of the visual characteristics of the human eye, captures the essence of how image information is conveyed, starts from the ideas of contour and texture, and supports interactive functions based on visual content. This matches the shift of multimedia applications from simple playback to content-based access, retrieval and manipulation.
The AV object (AVO, Audio Visual Object) is an important concept proposed by MPEG-4 to support content-based coding. An object is an entity in a scene that can be accessed and manipulated; objects can be delimited according to their distinctive texture, motion, shape, model and high-level semantics. What MPEG-4 works with is no longer the image frame of MPEG-1 and MPEG-2 but the audio-visual scene (AV scene), and different AV scenes are composed of different AV objects. An AV object is the representation unit of auditory, visual or audio-visual content. Its basic unit is the primitive AV object, which can be a natural or synthetic sound or image. Primitive AV objects can be efficiently encoded, efficiently stored and transmitted, and interactively manipulated, and they can be further combined into compound AV objects. The basic content of the MPEG-4 standard is therefore the efficient encoding, organization, storage and transmission of AV objects. The AV object gives multimedia communication high interactivity and efficient coding capability, and AV object coding is the core coding technology of MPEG-4.
MPEG-4 not only provides a high compression rate but also achieves better interactivity and all-round accessibility of multimedia content. It uses an open coding system to which new coding algorithm modules can be added at any time, and it can configure the decoder on site according to different application requirements, so as to support a variety of multimedia applications.
MPEG-4 adopts a new generation of video coding technology. For the first time in the history of video coding, it extends the coding object from the image frame to video objects of arbitrary shape that carry real meaning. This realizes the transition from traditional pixel-based coding to modern object-based and content-based coding, and leads the development trend of a new generation of intelligent image coding.
MPEG-4 also introduces a number of new key technologies and makes substantial improvements on the basis of first-generation video coding technology. The following highlights some of these key technologies.
A. Video Object Extraction Technology
To realize content-based interaction, MPEG-4 must first divide the video/image into different objects, or separate the moving objects from the background, and then apply a suitable coding method to each object to achieve efficient compression. Video object extraction, that is, video object segmentation, is therefore a key technology of MPEG-4 video coding, and also a research hotspot and difficulty of new-generation video coding.
Video object segmentation involves the analysis and understanding of video content, which is closely related to disciplines such as artificial intelligence, image understanding, pattern recognition and neural networks. At present, artificial intelligence is still immature, and computers do not yet have the ability to observe, recognize and understand images. Research on computer vision also shows that correct image segmentation requires understanding the video content at a higher level. Therefore, although the MPEG-4 framework has been formulated, there is still no universal and effective method that fundamentally solves the problem of video object segmentation. Video object segmentation is considered a challenging problem, and semantics-based segmentation is even more difficult.
At present, the general steps of video object segmentation are as follows. First, the video/image data is simplified to facilitate segmentation, which can be done by low-pass filtering, median filtering or morphological filtering. Next, features are extracted from the video/image data; these can be color, texture, motion, frame difference, displaced frame difference, or even semantics. Then a segmentation decision is made according to some homogeneity criterion, and the video data is classified according to the extracted features. Finally, post-processing is performed to filter out noise and extract the boundaries accurately.
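The four steps above can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the segmentation method of any MPEG-4 encoder: frames are small grayscale images stored as lists of lists, the simplification step is a 3x3 median filter, the extracted feature is the frame difference, the decision is a fixed threshold (an assumed value of 30), and the post-processing removes isolated foreground pixels. All function names are hypothetical.

```python
from statistics import median_low

def median3x3(img):
    """Step 1, simplification: 3x3 median filter (window clipped at the borders)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = median_low(window)
    return out

def frame_difference(prev, curr):
    """Step 2, feature extraction: absolute difference between two frames."""
    return [[abs(c - p) for p, c in zip(rp, rc)] for rp, rc in zip(prev, curr)]

def threshold(diff, t):
    """Step 3, decision: a pixel belongs to the moving object if it changed by more than t."""
    return [[1 if v > t else 0 for v in row] for row in diff]

def remove_isolated(mask):
    """Step 4, post-processing: clear foreground pixels with no 4-connected foreground neighbour."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
                if not any(0 <= j < h and 0 <= i < w and mask[j][i]
                           for j, i in neighbours):
                    out[y][x] = 0
    return out

def segment_moving_object(prev, curr, t=30):
    diff = frame_difference(median3x3(prev), median3x3(curr))
    return remove_isolated(threshold(diff, t))

# Usage: a bright 3x3 object appears in the second frame, plus one noise pixel.
prev = [[0] * 6 for _ in range(6)]
curr = [[0] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        curr[y][x] = 200
curr[5][5] = 255  # isolated noise pixel

mask = segment_moving_object(prev, curr)
for row in mask:
    print(row)
```

In this example the noise pixel is suppressed while the core of the moving object is kept; the median filter also rounds off the corners of the object, which hints at why accurate boundary extraction needs the dedicated post-processing the text mentions.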
The watershed algorithm, which is based on mathematical morphology theory, is widely used in video segmentation. It is also called the waterline algorithm. Its basic process is the successive erosion of binary images, and it consists of four stages: image simplification, marker extraction, decision-making and post-processing. The watershed algorithm is simple to run and performs well; it extracts the contour of a moving object well and obtains its edges accurately. However, the segmentation requires gradient information, is sensitive to noise, and does not use inter-frame information, so it usually over-segments the image.
B. VOP Video Coding Technology
A video object plane (VOP, Video Object Plane) is a sample of a video object (VO) at a given moment; the VOP is the core concept of MPEG-4 video coding. During encoding, MPEG-4 adopts different coding strategies for different VOs: the compression coding of a foreground VO retains as much detail and smoothness as possible, while a background VO is encoded with a high compression ratio, or is not transmitted at all, with another background spliced in at the decoding end. This object-based video coding not only overcomes the blocking effect produced by high compression rates in first-generation video coding but also lets users interact with the scene. It thus both improves the compression ratio and realizes content-based interaction, opening up broad room for the development of video coding.
MPEG-4 supports encoding and decoding of images and video objects of arbitrary shape. For very low bit rate real-time applications, such as videophone and video conferencing, MPEG-4 uses the VLBV (Very Low Bit-rate Video) core for encoding.
The traditional rectangular image is regarded in MPEG-4 as a special case of a VO, which reflects the unification of traditional coding and content-based coding in MPEG-4. The VO concept is more in line with how the human brain processes visual information, and it moves video signal processing from digitization toward intelligence, improving the interactivity and flexibility of video signals and making wider video applications and richer content interaction possible. VOP video coding technology is therefore regarded as an initial exploration of the move from digitized to intelligent video signal processing.
C. Video Coding Scalability Technology
With the huge growth of Internet services, there are more and more requirements and applications for video transmission over IP (Internet Protocol) networks, whose rates fluctuate greatly, and over heterogeneous networks with differing transmission characteristics. In this context, layered video coding has become increasingly important. It has a very wide range of applications and high value for both theoretical research and practical use, so it has attracted great attention.
The scalability of video coding refers to the adjustability of the bit rate: the video data is compressed only once but can be decoded at multiple frame rates, spatial resolutions or video quality levels, which supports the varied application requirements of many types of users.
MPEG-4 implements layered coding through the video object layer (VOL, Video Object Layer) data structure. MPEG-4 provides two basic scalability tools, temporal scalability (Temporal Scalability) and spatial scalability (Spatial Scalability), and it also supports a hybrid of the two. Each scalable coding has at least two VOL layers: the lower layer is called the base layer and the upper layer the enhancement layer. The base layer provides the basic information of the video sequence, and the enhancement layer provides its higher resolution and detail.
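The base layer / enhancement layer idea can be sketched for the spatial case as follows. This is a simplified illustration, not the VOL bitstream syntax: it assumes a grayscale frame with even dimensions stored as a list of lists, a base layer made by 2x2 averaging, and an enhancement layer that is simply the residual between the original and the upsampled base layer. A real encoder would additionally transform and entropy-code both layers. All function names are hypothetical.

```python
def downsample2(img):
    """Base layer: average each 2x2 block (integer arithmetic)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]

def upsample2(img):
    """Nearest-neighbour upsampling back to full resolution."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out

def encode_layers(img):
    """Compress once into a base layer and an enhancement layer (residual)."""
    base = downsample2(img)
    predicted = upsample2(base)
    enhancement = [[o - p for o, p in zip(ro, rp)]
                   for ro, rp in zip(img, predicted)]
    return base, enhancement

def decode(base, enhancement=None):
    """Decode at low resolution (base only) or full resolution (base + enhancement)."""
    if enhancement is None:
        return [row[:] for row in base]
    predicted = upsample2(base)
    return [[p + e for p, e in zip(rp, re)]
            for rp, re in zip(predicted, enhancement)]

frame = [[10, 12, 50, 52],
         [14, 16, 54, 56],
         [90, 92, 130, 132],
         [94, 96, 134, 136]]

base, enhancement = encode_layers(frame)
low_res = decode(base)                 # what a low-bandwidth receiver gets
full_res = decode(base, enhancement)   # what a receiver with both layers gets
print(low_res)
print(full_res == frame)
```

A receiver that only obtains the base layer still gets a usable, lower-resolution picture, while a receiver that also obtains the enhancement layer recovers the full-resolution frame.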
Within the later video streaming application framework, MPEG-4 proposed the FGS (Fine Granularity Scalable) video coding algorithm and the PFGS (Progressive Fine Granularity Scalable) video coding algorithm.
FGS coding is simple to implement, provides flexible adaptation and scalability in bit rate, display resolution, content and decoding complexity, and has strong bandwidth adaptation and error resilience. It still has two shortcomings, however: its coding efficiency is lower than that of non-scalable coding, and the video quality at the receiving end is not optimal.
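The bandwidth adaptation of FGS can be illustrated with a small sketch. The text above does not describe the mechanism, so the following rests on an assumption stated plainly: FGS is generally described as coding the enhancement-layer residual bit-plane by bit-plane, most significant plane first, so that the stream can be cut after any plane. The toy below applies that idea directly to raw sample values (a real codec works on DCT coefficients), with an assumed base-layer quantization step of 16 and hypothetical function names.

```python
STEP = 16    # assumed base-layer quantization step
PLANES = 4   # residuals lie in 0..15, so 4 bit-planes are enough

def encode(samples):
    """Return a coarse base layer plus enhancement bit-planes (MSB first)."""
    base = [s // STEP * STEP for s in samples]
    residual = [s - b for s, b in zip(samples, base)]
    planes = [[(r >> p) & 1 for r in residual]
              for p in range(PLANES - 1, -1, -1)]
    return base, planes

def decode(base, planes_received):
    """Refine the base layer with however many leading bit-planes arrived."""
    out = list(base)
    for k, plane in enumerate(planes_received):
        weight = 1 << (PLANES - 1 - k)
        out = [v + bit * weight for v, bit in zip(out, plane)]
    return out

samples = [37, 200, 15, 129]
base, planes = encode(samples)

# Total absolute error when the enhancement stream is truncated after k planes.
errors = []
for k in range(PLANES + 1):
    decoded = decode(base, planes[:k])
    errors.append(sum(abs(s - d) for s, d in zip(samples, decoded)))
print(errors)
```

Each additional bit-plane that arrives reduces the reconstruction error, and the stream remains decodable wherever it is truncated, which is what lets one encoded stream serve receivers with different bandwidths.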
PFGS is an improvement on FGS. Its basic idea for raising coding efficiency is to use an enhancement layer image reconstructed from the previous frame as a reference for motion compensation when the enhancement layer image is encoded, which makes motion compensation more effective and improves coding efficiency.
D. Motion Estimation and Motion Compensation Techniques
MPEG-4 uses three frame formats, I-VOP, P-VOP and B-VOP, to characterize different types of motion compensation. It adopts the half-pixel search (half pixel searching) technology and the overlapped motion compensation (overlapped motion compensation) technology of H.263, and introduces repetitive padding (repetitive padding) and modified block (polygon) matching (modified block (polygon) matching) technology to support VOP regions of arbitrary shape.
In addition, to improve the accuracy of the motion estimation algorithm, MPEG-4 adopts the MVFAST (Motion Vector Field Adaptive Search Technique) method and its improved version PMVFAST (Predictive MVFAST) for motion estimation. For global motion estimation, it adopts the feature-based fast and robust FFRGMET (Feature-based Fast and Robust Global Motion Estimation Technique) method.
In MPEG-4 video coding, motion estimation is time-consuming and has a great impact on the real-time performance of the encoder, so fast algorithms receive special emphasis here. Motion estimation methods mainly include the pixel-recursive method and the block matching method. The former is very complex and rarely applied in practice; the latter is widely used in H.263 and the MPEG standards. In the block matching method, the focus is on the block matching criterion and the search method. There are currently three commonly used matching criteria:
(1) the sum of absolute differences (SAD, Sum of Absolute Difference) criterion;
(2) the mean square error (MSE, Mean Square Error) criterion;
(3) the normalized cross-correlation function (NCCF, Normalized Cross Correlation Function) criterion.
Of the three criteria above, the SAD criterion has the advantages of requiring no multiplication and being simple and convenient to implement. It should be noted, though, that the choice of matching criterion has little effect on the matching result.
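The three criteria can be written down in a few lines of Python. This is a minimal sketch assuming two equally sized grayscale blocks stored as lists of lists; the NCCF form used here (the sum of products divided by the product of the two block norms) is one common definition and is an assumption, since the text does not give the formula.

```python
from math import sqrt

def _pairs(a, b):
    """Yield corresponding pixel pairs from two equally sized blocks."""
    for row_a, row_b in zip(a, b):
        yield from zip(row_a, row_b)

def sad(a, b):
    """Sum of absolute differences: additions and subtractions only."""
    return sum(abs(x - y) for x, y in _pairs(a, b))

def mse(a, b):
    """Mean square error: needs one multiplication per pixel."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for x, y in _pairs(a, b)) / n

def nccf(a, b):
    """Normalized cross-correlation: 1.0 for a perfect (proportional) match."""
    num = sum(x * y for x, y in _pairs(a, b))
    den = sqrt(sum(x * x for x, _ in _pairs(a, b))) * \
          sqrt(sum(y * y for _, y in _pairs(a, b)))
    return num / den if den else 0.0

block = [[10, 20], [30, 40]]
candidate = [[12, 18], [33, 40]]

print(sad(block, candidate), mse(block, candidate), nccf(block, candidate))
```

SAD and MSE are minimized by the best match, whereas NCCF is maximized. The sketch also makes the cost difference visible: `sad` uses no multiplications at all, which is the advantage the text refers to.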
Once the matching criterion has been selected, the search for the optimal matching point is carried out. The simplest and most reliable method is the full search (FS, Full Search) method, but its computational cost is too high for real-time implementation. Fast search methods therefore came into being, mainly the cross search method, the two-dimensional logarithmic method and the diamond search method. Of these, the diamond search method was adopted by the MPEG-4 Verification Model (VM, Verification Model) and is described in detail below.
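To make the cost of full search concrete, here is a minimal sketch that evaluates the SAD at every displacement inside a search window. It assumes grayscale frames stored as lists of lists, a 4x4 block and a search radius of 3; these values and the function names are illustrative choices only.

```python
def block_sad(ref, cur, bx, by, dx, dy, n):
    """SAD between the current block at (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total

def full_search(ref, cur, bx, by, n=4, radius=3):
    """Try every displacement in the window and return (best_vector, best_cost)."""
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose reference block would leave the frame.
            if not (0 <= by + dy and by + dy + n <= h and
                    0 <= bx + dx and bx + dx + n <= w):
                continue
            cost = block_sad(ref, cur, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

# Synthetic test frames: the current frame shows the reference content
# shifted so that the true motion vector of the block is (2, 1).
SIZE = 12
ref = [[x * x + 2 * y * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[min(y + 1, SIZE - 1)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]

result = full_search(ref, cur, bx=4, by=4)
print(result)
```

With radius r the method evaluates up to (2r + 1)^2 candidate positions per block, each costing n^2 pixel comparisons, which is why it is reliable but too slow for real-time encoding at realistic frame sizes.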
The diamond search (DS, Diamond Search) method is named after the shape of its search template. It is simple, robust and efficient, and is one of the best-performing fast search algorithms. Its basic idea rests on the observation that the shape and size of the search template have an important impact on the speed and accuracy of the motion estimation algorithm. When searching for the optimal matching point, a template that is too small may get trapped in a local optimum, while a template that is too large may fail to find the optimal point. The DS algorithm therefore uses two search templates of different shape and size:
·the large diamond search pattern (LDSP, Large Diamond Search Pattern), containing 9 candidate positions;
·the small diamond search pattern (SDSP, Small Diamond Search Pattern), containing 5 candidate positions.
The DS algorithm proceeds as follows. At the start, the large diamond search template is applied repeatedly until the best matching block falls at the center of the large diamond. Because the LDSP has a large step size, the search range is wide and coarse positioning is achieved, so the search does not fall into a local minimum. Once coarse positioning is complete, the optimal point can be assumed to lie in the diamond region enclosed by the 8 points around the LDSP center. The small diamond search template is then used to pinpoint the best matching block without large fluctuations, which improves the accuracy of motion estimation.
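The procedure can be sketched as follows. This is a simplified illustration rather than the Verification Model code: it assumes grayscale frames stored as lists of lists, SAD as the matching criterion, a 16x16 block, and a tie-breaking rule that keeps the current center when another candidate is only equally good. The names `LDSP`, `SDSP` and `diamond_search` are chosen for this example.

```python
# Offsets (dx, dy) of the two templates; the center comes first so that ties keep it.
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]

def diamond_search(ref, cur, bx, by, n=16, radius=7):
    """Return (motion_vector, sad) for the n x n block of `cur` at (bx, by)."""
    h, w = len(ref), len(ref[0])

    def cost(mv):
        dx, dy = mv
        inside = (abs(dx) <= radius and abs(dy) <= radius and
                  0 <= bx + dx and bx + dx + n <= w and
                  0 <= by + dy and by + dy + n <= h)
        if not inside:
            return float("inf")  # candidates outside the window never win
        return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
                   for j in range(n) for i in range(n))

    center = (0, 0)
    # Coarse stage: repeat the large diamond until its center is the best point.
    while True:
        candidates = [(center[0] + ox, center[1] + oy) for ox, oy in LDSP]
        best = min(candidates, key=cost)
        if best == center:
            break
        center = best
    # Fine stage: one pass of the small diamond around the coarse result.
    candidates = [(center[0] + ox, center[1] + oy) for ox, oy in SDSP]
    best = min(candidates, key=cost)
    return best, cost(best)

def frame_with_square(x0, y0, size=32, side=6, value=100):
    """Black frame containing one bright square whose top-left corner is (x0, y0)."""
    return [[value if x0 <= x < x0 + side and y0 <= y < y0 + side else 0
             for x in range(size)] for y in range(size)]

# The square sits at (14, 13) in the reference frame and at (12, 12) in the
# current frame, so the matching reference block is displaced by (2, 1).
ref = frame_with_square(14, 13)
cur = frame_with_square(12, 12)

result = diamond_search(ref, cur, bx=8, by=8)
print(result)
```

In this example the large diamond moves the center once, from (0, 0) to (2, 0), and the small diamond then refines it to (2, 1), so only a handful of candidate positions are evaluated instead of the 225 positions a full search with the same radius would test.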
In addition, Sprite video coding technology is widely used in MPEG-4 as one of its core technologies. A sprite, also known as a mosaic or background panorama, is an image formed by stitching together all the parts of a video object that appear in a video sequence. The sprite can be used to reconstruct the video object directly or to perform predictive compensation coding on it.
Sprite video coding can be regarded as a more advanced motion estimation and compensation technology, which overcomes the shortcomings of traditional motion estimation and compensation based on fixed blocks. MPEG-4 therefore combines traditional block coding technology with Sprite coding technology.
With its excellent performance, MPEG4 technology has been widely used in multimedia transmission, multimedia storage and other fields. Let's take a look at the fields in which MPEG4 technology currently shines.
1. The wonderful world of video
The wonderful world of video is the application of MPEG4 technology most familiar to readers. At present it mainly appears in two forms: one is the DIVX-MPEG4 DVD (already on the domestic market, mostly as pirated copies), and the other is MPEG4 movies on the Internet.
(1) Let's first look at the DIVX-MPEG4 DVD. DIVX video coding technology is in fact an MPEG4 compression technology: it is derived from Microsoft's MPEG4 V3, uses the MPEG4 compression algorithm, and handles video and audio separately. Its core is using DivX to compress the audio and video of a DVD into an MPEG4 video file (which is also an AVI file).
Tip: The author is often asked by friends: "The MPEG4 movie clip I have is clearly an AVI (extension) file, and Windows's media player is associated with it, but it just will not play." In fact, MPEG4 does not dictate which extension must be used; it is only an encoding method. Using AVI as the extension is simply a habit.
There are two ways to play MPEG4 audio-visual files on a computer: the first is to use DivX Player or other dedicated playback software; the second is to install the MPEG4 (DivX) plug-in and then play the files with Windows's own media player.
(2) With the continuous development of network technology, video streaming on the Internet has also become a hot topic in recent years. At present, the video formats popular on the Internet include Quicktime, RealPlay and Microsoft MediaPlayer. With the adoption of MPEG4 technology, MPEG4 format movies have appeared on the Internet as well, but before watching, the system will prompt you to download the latest MPEG4 decoding software.
Tip: The ASF format files that people often come across online are in fact a compressed format developed by Microsoft for watching video programs directly over the Internet. They also use the MPEG4 compression algorithm, but because they exist as a streaming format for watching movies online, their image quality is relatively poor.
2. Multimedia communication at low bit rates
Currently, MPEG4 technology is widely used in multimedia communication fields such as videophone, video e-mail, mobile communication and electronic news. These applications have low transmission rate requirements, generally between 4.8 and 64 kbit/s, with a resolution of around 176 × 144. MPEG4 technology can therefore make full use of the network bandwidth, compress and transmit the data through frame reconstruction technology, and obtain the best image quality with the least amount of data.
3. Real-time multimedia monitoring
The field of multimedia monitoring has long been dominated by MPEG1 technology, but in recent years it has begun to change hands. MPEG4 compression technology was originally an audio and video processing technology suited to information exchange over low bandwidth. Its characteristic is that it can dynamically detect changes in each region of the image, and by adjusting the compression method on an object basis it achieves a larger compression ratio and a lower compressed bit rate than MPEG1. Therefore, although MPEG4 technology was not originally developed specifically for video surveillance compression, its high-definition video compression gives it greater advantages than MPEG1 in real-time multimedia monitoring, whether in storage capacity, transmission rate or definition.
4. Multimedia systems based on content storage and retrieval
MPEG4 is far superior to MPEG1 in its compression method, and even more so to MJPEG. Tests by experts show that at the same clarity as MPEG1 (500 Kbit/s), MPEG4 saves 2/3 of the hard disk space compared with MPEG1, and it also saves considerable capacity in scenes with normal activity. Therefore, in terms of both content storage and the retrieval speed of multimedia files, MPEG4 technology is the obvious choice for multimedia system applications.
5. Hardware applications
Currently, MPEG4 technology is also gradually being applied in hardware products. Especially in video surveillance and playback, this high-definition, high-compression technology has been favored by many hardware manufacturers, and products using MPEG4 technology have appeared on the market. Below, the author lists some representative products to give readers a sense of how widely MPEG4 technology is used today.
(1) Camera: Sharp Corporation of Japan has launched the VN-EZ1, a digital camera for Internet applications. This webcam uses the MPEG4 format to compress images into ASF (Advanced Streaming Format) files, which users can play directly on a computer using only Microsoft's MediaPlayer playback program.
(2) Player: In August this year Philips launched a DivX DVD player, the DVD737. It supports MPEG4 standards such as DivX 3.11, 4.xx and 5.xx, and new standards can be supported by upgrading the firmware.
(3) Digital camera: Kyocera Corporation of Japan launched the Finecam L30 digital camera in November. It is a 3-megapixel design with 3x optical zoom. The L30 records video in MPEG4 format, which gives it better video recording results than traditional digital cameras.
(4) Mobile phones: In the field of mobile phones, MPEG4 technology is also widely used, and the major manufacturers have launched camera phones that can record MPEG4 video, such as the Siemens ST55, the Sony Ericsson P900/P908 and the LG color-screen G8000.
(5) MPEG4 digital hard disk recorders: At the security exhibition held in Shenzhen this year, manufacturers developing digital video surveillance products launched their latest MPEG4 DVR products, and MPEG4 compression technology became a highlight of the exhibition.
For example, the "Eeye MPEG4 digital video" system launched by Beijing Huaqing Zibo Technology is a high-definition digital monitoring and alarm system for network environments. It has a built-in multi-screen processor and integrates on-site video monitoring, audio monitoring, multi-channel simultaneous digital video recording and playback, and other functions.
In fact, there are many more products based on MPEG4 technology on the market that are not listed here, but I believe that as video compression technology continues to develop, MPEG4 products will appear more and more in our life and work.