CN104159060B - Preprocessor method and equipment - Google Patents
Preprocessor method and equipment
- Publication number
- CN104159060B CN104159060B CN201410438251.8A CN201410438251A CN104159060B CN 104159060 B CN104159060 B CN 104159060B CN 201410438251 A CN201410438251 A CN 201410438251A CN 104159060 B CN104159060 B CN 104159060B
- Authority
- CN
- China
- Prior art keywords
- frame
- video
- digital
- information
- progressive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
(All under H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION)
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N7/0112—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level, one of the standards corresponding to a cinematograph film standard
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/19—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type, using optimisation based on Lagrange multipliers
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
- H04N19/87—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
- H04N5/145—Picture signal circuitry for video frequency region; movement estimation
- H04N5/147—Picture signal circuitry for video frequency region; scene change detection
- H04N7/0117—Conversion of standards involving conversion of the spatial resolution of the incoming video signal
- H04N7/012—Conversion between an interlaced and a progressive signal
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Graphics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Television Systems (AREA)
- Studio Devices (AREA)
- Microscopes, Condensers (AREA)
Abstract
The present invention relates to preprocessor methods and equipment, and more particularly to processing operations performed before or in conjunction with a data compression process. A method of processing multimedia data includes receiving interlaced video frames, obtaining metadata for the interlaced video frames, converting the interlaced video frames into progressive video using at least a portion of the metadata, and providing the progressive video and at least a portion of the metadata to an encoder for encoding the progressive video. The method may also include generating spatial information and bi-directional motion information for the interlaced video frames, and generating progressive video based on the interlaced video frames using the spatial information and the bi-directional motion information.
Description
Information on the Divisional Application
This application is a divisional application of the Chinese invention patent application entitled "Preprocessor method and equipment." The application number of the original application is 200780010753.9; the filing date of the original application is March 13, 2007.
Claim of Priority under 35 U.S.C. § 119
The present application for patent claims priority to Provisional Application No. 60/789,048 filed on April 3, 2006, Provisional Application No. 60/789,266 filed on April 4, 2006, and Provisional Application No. 60/789,377 filed on April 4, 2006, all of which are assigned to the assignee hereof and are hereby incorporated by reference herein.
Technical Field
The present invention relates generally to multimedia data processing, and more particularly to processing operations performed before or in conjunction with a data compression process.
Background Art
None.
Summary of the Invention
Each of the inventive apparatuses and methods described herein has several aspects, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of this invention, its more prominent features will now be discussed briefly. After considering this discussion, and particularly after reading the section entitled "Detailed Description," it will be understood how the features of this invention provide improvements for multimedia data processing apparatuses and methods.
In one aspect, a method of processing multimedia data comprises receiving interlaced video frames, converting the interlaced video frames into progressive video, generating metadata associated with the progressive video, and providing the progressive video and at least a portion of the metadata to an encoder for encoding the progressive video. The method can further comprise encoding the progressive video using the metadata. In some aspects, the interlaced video frames comprise NTSC video. Converting the video frames can include deinterlacing the interlaced video frames.
In some aspects, the metadata can include bandwidth information, bi-directional motion information, a bandwidth ratio, a complexity value (for example, a temporal complexity value or a spatial complexity value or both), and luminance information, and the spatial information can include luminance and/or chrominance information. The method can also include generating spatial information and bi-directional motion information for the interlaced video frames, and generating progressive video based on the interlaced video frames using the spatial information and the bi-directional motion information. In some aspects, converting the interlaced video frames includes inverse telecining 3/2 pulldown video frames, and/or resizing the progressive video. The method can further include partitioning the progressive video to determine group of pictures (GOP) information, where the partitioning can include shot detection of the progressive video. In some aspects, the method also includes filtering the progressive video with a denoising filter.
In another aspect, an apparatus for processing multimedia data can include a receiver configured to receive interlaced video frames, a deinterlacer configured to convert the interlaced video frames into progressive video, and a partitioner configured to generate metadata associated with the progressive video and to provide the progressive video and the metadata to an encoder for encoding the progressive video. In some aspects, the apparatus can further comprise an encoder configured to receive the progressive video from a communications module and to encode the progressive video using the provided metadata. The deinterlacer can be configured to perform spatio-temporal deinterlacing and/or inverse telecining. The partitioner can be configured to perform shot detection and to generate compression information based on the shot detection. In some aspects, the partitioner can be configured to generate bandwidth information. The apparatus can also include a resampler configured to resize a progressive video frame. The metadata can include bandwidth information, bi-directional motion information, a bandwidth ratio, luminance information, a content-related spatial complexity value, and/or a content-related temporal complexity value. In some aspects, the deinterlacer is configured to generate spatial information and bi-directional motion information for the interlaced video frames, and to generate progressive video based on the interlaced video frames using the spatial information and the bi-directional motion information.
Another aspect comprises an apparatus for processing multimedia data, the apparatus including means for receiving interlaced video frames, means for converting the interlaced video frames into progressive video, means for generating metadata associated with the progressive video, and means for providing the progressive video and at least a portion of the metadata to an encoder for encoding the progressive video. In some aspects, the converting means comprises an inverse teleciner and/or a spatio-temporal deinterlacer. In some aspects, the generating means is configured to perform shot detection and to generate compression information based on the shot detection. In some aspects, the generating means is configured to generate bandwidth information. In some aspects, the generating means includes means for resampling to resize a progressive video frame.
Another aspect comprises a machine-readable medium comprising instructions for processing multimedia data, the instructions upon execution causing a machine to receive interlaced video frames, convert the interlaced video frames into progressive video, generate metadata associated with the progressive video, and provide the progressive video and at least a portion of the metadata to an encoder for encoding the progressive video.
Another aspect comprises a processor comprising a configuration to receive interlaced video, convert the interlaced video into progressive video, generate metadata associated with the progressive video, and provide the progressive video and at least a portion of the metadata to an encoder for encoding the progressive video. Converting the interlaced video can include performing spatio-temporal deinterlacing. In some aspects, converting the interlaced video includes performing inverse telecining. In some aspects, generating the metadata includes detecting shot changes and generating compression information based on the detected shot changes. In some aspects, generating the metadata includes determining compression information for the progressive video. In some aspects, the configuration includes a configuration to resample the video to produce a resized progressive frame. In some aspects, the metadata can include bandwidth information, bi-directional motion information, complexity information (for example, content-based temporal or spatial complexity information), and/or compression information.
Brief Description of the Drawings
Fig. 1 is a block diagram of a communications system for delivering streaming multimedia data;
Fig. 2 is a block diagram of a digital transmission facility that includes a preprocessor;
Fig. 3A is a block diagram of an illustrative aspect of a preprocessor;
Fig. 3B is a flowchart illustrating a process for processing multimedia data;
Fig. 3C is a block diagram illustrating an apparatus for processing multimedia data;
Fig. 4 is a block diagram illustrating the operation of an exemplary preprocessor;
Fig. 5 is a diagram of phase decisions made during inverse telecine;
Fig. 6 is a flowchart illustrating a process of inverse telecining video;
Fig. 7 is an illustration of a trellis showing phase transitions;
Fig. 8 is a guide to identifying the respective fields used to create a plurality of metrics;
Fig. 9 is a flowchart illustrating how the metrics of Fig. 8 are created;
Fig. 10 is a flowchart showing the processing of arriving metrics to estimate the phase;
Fig. 11 is a dataflow diagram illustrating a system for generating decision variables;
Fig. 12 is a block diagram describing the variables used to evaluate the branch information;
Figs. 13A, 13B and 13C are flowcharts showing how lower envelopes are computed;
Fig. 14 is a flowchart showing the operation of a consistency detector;
Fig. 15 is a flowchart showing a process for computing an offset to a decision variable, the offset compensating for inconsistencies in the phase decisions;
Fig. 16 shows the operation of inverse telecine after the pulldown phase has been estimated;
Fig. 17 is a block diagram of a deinterlacer device;
Fig. 18 is a block diagram of another deinterlacer device;
Fig. 19 is a diagram of the subsampling pattern of an interlaced picture;
Fig. 20 is a block diagram of a deinterlacer device that uses Wmed filtering and motion estimation to generate a deinterlaced frame;
Fig. 21 illustrates one aspect of an aperture for determining static areas of multimedia data;
Fig. 22 is a diagram illustrating one aspect of an aperture for determining slow-motion areas of multimedia data;
Fig. 23 is a diagram illustrating one aspect of motion estimation;
Fig. 24 illustrates two motion vector maps used in determining motion compensation;
Fig. 25 is a flowchart illustrating a method of deinterlacing multimedia data;
Fig. 26 is a flowchart illustrating a method of generating a deinterlaced frame using spatio-temporal information;
Fig. 27 is a flowchart illustrating a method of performing motion compensation for deinterlacing;
Fig. 28 is a block diagram of a preprocessor according to some aspects, the preprocessor comprising a processor configured for shot detection and other preprocessing operations;
Fig. 29 illustrates the relationship between encoding complexity C and allocated bits B;
Fig. 30 is a flowchart illustrating a process that operates on a group of pictures and can be used in some aspects to encode video based on shot detection in the video frames;
Fig. 31 is a flowchart illustrating a process for shot detection;
Fig. 32 is a flowchart illustrating a process for determining different classifications of shots in video;
Fig. 33 is a flowchart illustrating a process for assigning frame compression schemes to video frames based on shot detection results;
Fig. 34 is a flowchart illustrating a process for determining abrupt scene changes;
Fig. 35 is a flowchart illustrating a process for determining slowly-changing scenes;
Fig. 36 is a flowchart illustrating a process for determining scenes containing camera flashes;
Fig. 37 illustrates motion compensation vectors between a current frame and a previous frame (MV_P) and between the current frame and a next frame (MV_N);
Fig. 38 is a chart illustrating the relationship of the variables used in determining a frame difference metric;
Fig. 39 is a block diagram illustrating encoding data and calculating residuals;
Fig. 40 is a block diagram illustrating determining a frame difference metric;
Fig. 41 is a flowchart illustrating a procedure in which compression types are assigned to frames;
Fig. 42 illustrates an example of 1-D polyphase resampling;
Fig. 43 is a chart illustrating the safe action area and safe title area of a frame of data; and
Fig. 44 is a chart illustrating the safe action area of a frame of data.
Detailed Description
The following description includes details for providing a thorough understanding of the examples. However, one of ordinary skill in the art will understand that the examples may be practiced even if every detail of a process or device in an example or aspect is not described or illustrated herein. For example, electrical components may be shown in block diagrams that do not illustrate every electrical connection or every electrical element of the components, so as not to obscure the examples in unnecessary detail. In other instances, such components, other structures and techniques may be shown in detail to further explain the examples.
Described herein are certain inventive aspects of preprocessors and preprocessor operating methods that can improve the performance of existing preprocessing and encoding systems. The preprocessor can process metadata and video to prepare them for encoding, including performing deinterlacing, inverse telecining, filtering, identifying shot types, processing and generating metadata, and generating bandwidth information. Reference herein to "one aspect," "an aspect," "some aspects" or "certain aspects" means that one or more of the particular features, structures or characteristics described in connection with the aspect can be included in at least one aspect of a preprocessor system. The appearances of such phrases in various places in the specification are not necessarily all referring to the same aspect, nor are separate or alternative aspects necessarily mutually exclusive of other aspects. Moreover, various features are described which may be exhibited by some aspects and not by others. Similarly, various steps are described which may be steps for some aspects but not for other aspects.
" multi-medium data " or " multimedia " is broad terms as used herein, and it includes video data, and (it can
Including voice data), voice data, or video data and both audio.As used herein " video data " or
" video " be broad terms, its refer to containing text, image and/or the image of voice data or one or more series or
The image of sequence, and unless specified otherwise herein, otherwise it can be used for referring to multi-medium data or the term is used interchangeably.
Fig. 1 is a block diagram of a communications system 100 for delivering streaming multimedia. The system can be applied to the transmission of digitally compressed video to a plurality of terminals, as shown in Fig. 1. A digital video source can be, for example, a digital cable or satellite feed, or a digitized analog source. The video source is processed in a transmission facility 120, where it is encoded and modulated onto a carrier for transmission over a network 140 to one or more terminals 160. The terminals 160 decode the received video and typically display at least a portion of the video. The network 140 refers to any type of communications network, wired or wireless, suitable for transmitting encoded data. For example, the network 140 can be a cellular telephone network, a wired or wireless local area network (LAN) or a wide area network (WAN), or the Internet. The terminals 160 can be any type of communications device capable of receiving and displaying data, including, but not limited to, cell phones, PDAs, in-home or commercial video display equipment, computers (portable, laptop, handheld, PCs, and larger server-based computer systems), and personal entertainment devices capable of using multimedia data.
Figs. 2 and 3 illustrate sample aspects of a preprocessor 202. In Fig. 2, the preprocessor 202 is in a digital transmission facility 120. A decoder 201 decodes encoded data from a digital video source and provides metadata 204 and video 205 to the preprocessor 202. The preprocessor 202 is configured to perform certain types of processing on the video 205 and the metadata 204 and to provide processed metadata 206 (for example, base layer reference frames, enhancement layer reference frames, bandwidth information, content information) and video 207 to an encoder 203. Such preprocessing of multimedia data can improve the visual clarity, anti-aliasing, and compression efficiency of the data. Generally, the preprocessor 202 receives a video sequence provided by the decoder 201 and converts it into a progressive video sequence for further processing (for example, encoding) by an encoder. In some aspects, the preprocessor 202 can be configured for numerous operations, including inverse telecining, deinterlacing, filtering (for example, artifact removal, de-ringing, de-blocking and de-noising), resizing (for example, spatial resolution downsampling from standard definition to Quarter Video Graphics Array (QVGA)), and GOP structure generation (for example, calculating complexity map generation, scene change detection, and fade/flash detection).
Fig. 3A illustrates the preprocessor 202, which is configured with modules or components (collectively referred to herein as "modules") to perform its preprocessing operations on the received metadata 204 and video 205 and then to provide processed metadata 206 and progressive video 207 for further processing (for example, to an encoder). The modules can be implemented in hardware, software, firmware, or a combination thereof. The preprocessor 202 can include various modules, including one or more of the illustrated modules, which include inverse telecine 301, deinterlacer 302, denoiser 303, alias suppressor 304, resampler 305, deblocker/de-ringer 306, and GOP partitioner 307, all of which are described further below. The preprocessor 202 can also include other appropriate modules that can be used to process the video and metadata, including memory 308 and a communications module 309. A software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor. The processor and the storage medium can reside in an ASIC. The ASIC can reside in a user terminal. In the alternative, the processor and the storage medium can reside as discrete components in a user terminal.
Fig. 3B is a flowchart illustrating a process 300 for processing multimedia data. Process 300 starts and proceeds to block 320, where interlaced video is received. The preprocessor 202 illustrated in Figs. 2 and 3 can perform this step. In some aspects, a decoder (for example, decoder 201 of Fig. 2) can receive the interlaced data and provide it to the preprocessor 202. In some aspects, a data receiving module 330 shown in Fig. 3C, which is a part of the preprocessor 202, can perform this step. Process 300 proceeds to block 322, where the interlaced video is converted into progressive video. The preprocessor 202 of Figs. 2 and 3A and module 332 of Fig. 3C can perform this step. If the interlaced video was telecined, the processing of block 322 can include performing inverse telecine to generate progressive video. Process 300 proceeds to block 324 to generate metadata associated with the progressive video. The GOP partitioner 307 of Fig. 3A and module 334 of Fig. 3C can perform this processing. Process 300 then proceeds to block 326, where the progressive video and at least a portion of the metadata are provided to an encoder for encoding (for example, compression). The preprocessor 202 shown in Figs. 2 and 3A and module 336 of Fig. 3C can perform this step. After the progressive video and the associated metadata are provided to another component for encoding, process 300 can end.
Fig. 3C is a block diagram illustrating an apparatus for processing multimedia data. The apparatus is shown here incorporated in the preprocessor 202. The preprocessor 202 includes means for receiving video, such as module 330. The preprocessor 202 also includes means for converting interlaced data into progressive video, such as module 332. Such means can include, for example, a spatio-temporal deinterlacer and/or an inverse teleciner. The preprocessor 202 also includes means for generating metadata associated with the progressive video, such as module 334. Such means can include a GOP partitioner 307 (Fig. 3A) that can generate various types of metadata as described herein. The preprocessor 202 can also include means for providing the progressive video and the metadata to an encoder for encoding, as illustrated by module 336. In some aspects, such means can include the communications module 309 illustrated in Fig. 3A. As one of skill in the art will appreciate, such means can be implemented in many standard ways.
The preprocessor 202 can use obtained metadata (for example, obtained from the decoder 201 or from another source) for one or more of the preprocessing operations. Metadata can include information describing or classifying the content of the multimedia data ("content information"). In particular, the metadata can include a content classification. In some aspects, the metadata does not include the content information required for the encoding operations. In such cases, the preprocessor 202 can be configured to determine content information and to use the content information for preprocessing operations and/or to provide the content information to other components, such as the encoder 203. In some aspects, the preprocessor 202 can use such content information to influence GOP partitioning, to determine appropriate types of filtering, and/or to determine encoding parameters that are communicated to an encoder.
Fig. 4 shows an illustrative example of process blocks that can be included in the preprocessor, and illustrates processing that can be performed by the preprocessor 202. In this example, the preprocessor 202 receives metadata and video 204, 205 and provides output data 206, 207, comprising (processed) metadata and video, to an encoder 228. There are generally three types of video that can be received by the preprocessor. First, the received video can be progressive video, in which case no deinterlacing needs to be performed. Second, the video data can be telecined video, that is, interlaced video converted from 24 fps movie sequences. Third, the video can be non-telecined interlaced video. The preprocessor 226 can process these types of video as described below.
At block 401, the preprocessor 202 determines whether the received video 204, 205 is progressive video. In some cases, this can be determined from the metadata, if the metadata contains such information, or by processing of the video itself. For example, the inverse telecine process described below can determine whether the received video 205 is progressive video. If it is, the process proceeds to block 407, where a filtering operation is performed on the video to reduce noise, such as white Gaussian noise. If the video is not progressive video, the process proceeds from block 401 to block 404, a phase detector.
The phase detector 404 distinguishes between video that originated in a telecine and video that began in a standard broadcast format. If the decision is that the video was telecined (the YES decision path exiting the phase detector 404), the telecined video is returned to its original format in inverse telecine 406. Redundant fields are identified and eliminated, and fields derived from the same video frame are rewoven into a complete image. Because the sequence of reconstructed film images was photographically recorded at regular intervals of 1/24 second, the motion estimation process performed in the GOP partitioner 412 or a decoder is more accurate using the inverse telecined images rather than the telecined data, which has an irregular time base.
In one aspect, the phase detector 404 makes certain decisions after receipt of a video frame. These decisions include: (i) whether the present video is from a telecine output and the 3:2 pulldown phase is one of the five phases P0, P1, P2, P3 and P4 shown in Fig. 5; and (ii) whether the video was generated as conventional NTSC. The latter decision is denoted as phase P5. These decisions appear as outputs of the phase detector 404 shown in Fig. 4. The path from the phase detector 404 labeled "YES" actuates the inverse telecine 406, indicating that it has been provided with the correct pulldown phase so that it can sort out the fields that were formed from the same photographic image and combine them. The path from the phase detector 404 labeled "NO" similarly actuates the deinterlacer 405 to separate an apparent NTSC frame into fields for optimal processing. Inverse telecine is further described in the co-pending U.S. patent application entitled "Inverse Telecine Algorithm Based on State Machine" [attorney docket QFDM.021A (050943)], which is owned by the assignee hereof and is incorporated by reference herein in its entirety.
The phase detector 404 can continuously analyze video frames, because different types of video can be received at any time. As an example, video conforming to the NTSC standard can be inserted into the video as a commercial. After inverse telecine, the resulting progressive video is sent to a denoiser (filter) 407, which can be used to reduce white Gaussian noise.
When conventional NTSC video is recognized (the NO path from the phase detector 401), it is transmitted to the deinterlacer 405 for compression. The deinterlacer 405 transforms the interlaced fields into progressive video, and denoising operations can then be performed on the progressive video.
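To make the routing just described concrete, the following is a minimal sketch in Python. The stage functions and their names are hypothetical stand-ins for blocks 401 through 407 of Fig. 4, not the patent's implementation; real implementations of each stage are described in this document.

```python
# Illustrative sketch of the Fig. 4 front-end dispatch (blocks 401-407).
# The stage callables are hypothetical stand-ins for the numbered blocks.
def front_end(frame, metadata, stages):
    if metadata.get("progressive", False):                # block 401
        video = frame
    else:
        phase = stages["detect_phase"](frame)             # block 404
        if phase != "P5":                                 # telecined source
            video = stages["inverse_telecine"](frame, phase)  # block 406
        else:                                             # conventional NTSC
            video = stages["deinterlace"](frame)          # block 405
    return stages["denoise"](video)                       # block 407

# Example wiring with pass-through stages:
stages = {"detect_phase": lambda f: "P5",
          "inverse_telecine": lambda f, p: f,
          "deinterlace": lambda f: f,
          "denoise": lambda f: f}
out = front_end({"pixels": None}, {"progressive": False}, stages)
```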
After the appropriate inverse telecine or deinterlacing processing, at block 408 the progressive video is processed for alias suppression and resampling (for example, resizing).
After resampling, the progressive video proceeds to block 410, where deblocking and de-ringing operations are performed. Two types of artifacts, "blocking" and "ringing," commonly occur in video compression applications. Blocking artifacts occur because compression algorithms divide each frame into blocks (for example, 8×8 blocks). Each reconstructed block has some small errors, and the errors at the edges of a block often contrast with the errors at the edges of neighboring blocks, making the block boundaries visible. In contrast, ringing artifacts appear as distortions around the edges of image features. Ringing artifacts occur because the encoder discards too much information when quantizing the high-frequency DCT coefficients. In some illustrative examples, both deblocking and de-ringing can use low-pass FIR (finite impulse response) filters to hide these visible artifacts.
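As a concrete illustration of such low-pass FIR filtering, the sketch below smooths luminance rows on either side of each horizontal 8×8 block boundary with a short symmetric kernel. The kernel taps and the one-row boundary width are assumptions chosen for illustration; the text above does not specify them.

```python
import numpy as np

# Minimal deblocking sketch: a 3-tap low-pass FIR ([1, 2, 1] / 4) applied
# only to the row of pixels on each side of every horizontal 8x8 block
# boundary. Kernel and boundary width are illustrative assumptions.
def deblock_rows(luma: np.ndarray, block: int = 8) -> np.ndarray:
    out = luma.astype(np.float32).copy()
    h = luma.shape[0]
    for b in range(block, h, block):        # each block boundary row index
        for r in (b - 1, b):                # one row on each side of it
            upper = out[max(r - 1, 0)]
            lower = out[min(r + 1, h - 1)]
            out[r] = (upper + 2.0 * out[r] + lower) / 4.0
    return out.astype(luma.dtype)

frame = np.random.randint(0, 256, (48, 64), dtype=np.uint8)
smoothed = deblock_rows(frame)
```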
After deblocking and de-ringing, the progressive video is processed by the GOP partitioner 412. GOP partitioning can include detecting shot changes, generating complexity maps (for example, temporal and spatial bandwidth maps), and adaptive GOP partitioning. Shot detection involves determining when a frame in a group of pictures (GOP) exhibits data indicating that a scene change has occurred. Scene change detection can be used by a video encoder to determine an appropriate GOP length and to insert I-frames based on the GOP length, rather than inserting I-frames at fixed intervals. The preprocessor 202 can also be configured to generate a bandwidth map that can be used for encoding the multimedia data. In some aspects, a content classification module located external to the preprocessor generates the bandwidth map instead. Adaptive GOP partitioning can adaptively change the composition of the group of pictures coded together. Illustrative examples of the operations shown in Fig. 4 are described below.
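As a simple illustration of how scene change detection can drive GOP length, the sketch below flags GOP boundaries (candidate I-frame positions) by thresholding a mean absolute luma difference between successive frames. The metric and the threshold value are assumptions for illustration only; the shot detection actually described later in this document uses richer, motion-aware metrics.

```python
import numpy as np

# Illustrative GOP-boundary flagger: start a new GOP (I-frame) when the
# mean absolute difference between successive luma frames exceeds a
# threshold. The threshold is an assumed value, not from the patent.
def gop_boundaries(frames, threshold: float = 30.0):
    boundaries = [0]                         # the first frame starts a GOP
    for k in range(1, len(frames)):
        mad = np.mean(np.abs(frames[k].astype(np.int16)
                             - frames[k - 1].astype(np.int16)))
        if mad > threshold:                  # likely shot change
            boundaries.append(k)
    return boundaries
```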
Inverse Telecine
An inverse telecine process is described below, and an illustrative example of inverse telecine is provided with reference to Figs. 4 through 16. Video compression gives best results when the properties of the source are known and used to select the ideally matching form of processing. Off-the-air video, for example, can originate in several ways. Broadcast video that is conventionally generated, in video cameras, broadcast studios and the like, conforms in the United States to the NTSC standard. According to the standard, each frame is made up of two fields. One field consists of the odd lines, the other of the even lines. This may be referred to as an "interlaced" format. While the frames are generated at approximately 30 frames/sec, the fields are records of the television camera's image that are 1/60 second apart. Film, on the other hand, is shot at 24 frames/sec, each frame consisting of a complete image. This may be referred to as a "progressive" format. For transmission in NTSC equipment, "progressive" video is converted into the "interlaced" video format via a telecine process. As discussed further below, in one aspect the system advantageously determines when video has been telecined and performs the appropriate transformations to regenerate the original progressive frames.
Fig. 5 shows the effect of the telecine process in which progressive frames have been converted into interlaced video. F1, F2, F3 and F4 are progressive images that are the input to the teleciner. The numerals "1" and "2" below the respective frames indicate either the odd or the even fields. Note that some of the fields are repeated, in view of the disparity between the frame rates. Fig. 5 also shows the pulldown phases P0, P1, P2, P3 and P4. The phase P0 is marked by the first of the two NTSC-compatible frames that have identical first fields; the four subsequent frames correspond to phases P1, P2, P3 and P4. Note that the frames marked by P2 and P3 have identical second fields. Because film frame F1 is scanned three times, two identical successive NTSC-compatible first fields are formed. All of the NTSC fields derived from film frame F1 are taken from the same film image and are therefore taken at the same instant of time. Other NTSC frames may have fields derived from film images that are 1/24 second apart.
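The field pattern just described can be written out explicitly in code. The sketch below telecines four film frames into five interlaced frames in a P0-aligned pattern consistent with the description above; the tuple representation of fields is an assumption for illustration.

```python
# Illustrative 3:2 pulldown consistent with the description above: film
# frames F1..F4 become five NTSC frames, each a (field one, field two)
# pair. "Fk1"/"Fk2" name film frame k's odd/even field.
def telecine_3_2(f1, f2, f3, f4):
    return [
        (f1 + "1", f1 + "2"),   # phase P0
        (f1 + "1", f2 + "2"),   # phase P1: field one repeated from P0
        (f2 + "1", f3 + "2"),   # phase P2
        (f3 + "1", f3 + "2"),   # phase P3: field two same as P2's
        (f4 + "1", f4 + "2"),   # phase P4
    ]

frames = telecine_3_2("F1", "F2", "F3", "F4")
# F1 and F3 are each scanned three times, giving 10 fields from 8.
```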
The phase detector 404 illustrated in Fig. 4 makes certain decisions after receipt of a video frame. These decisions include: (i) whether the present video is from a telecine output and the 3:2 pulldown phase is one of the five phases P0, P1, P2, P3 and P4 shown in definition 512 of Fig. 5; and (ii) whether the video was generated as conventional NTSC, a decision denoted as phase P5.
These decisions appear as outputs of the phase detector 404 shown in Fig. 4. The path from the phase detector 404 labeled "YES" actuates the inverse telecine 406, indicating that it has been provided with the correct pulldown phase so that it can sort out the fields formed from the same photographic image and combine them. The path from the phase detector 404 labeled "NO" similarly actuates the deinterlacer block 405 to separate an apparent NTSC frame into fields for optimal processing.
Fig. 6 is a flowchart illustrating a process 600 of inverse telecining a video stream. In one aspect, process 600 is performed by the inverse telecine 301 of Fig. 3. Starting at step 651, the inverse teleciner 301 determines a plurality of metrics based on the received video. In this aspect, four metrics are formed, which are sums of differences between fields taken from the same frame or from adjacent frames. The four metrics are further assembled into a Euclidean measure of the distance between the four metrics derived from the received data and the most likely values of those metrics for each of six hypothesized phases. The Euclidean sums are referred to as branch information; there are six such quantities for each received frame. Each hypothesized phase has a successor phase, which in the case of the possible pulldown phases changes with each received frame.
The possible transition paths are shown in Fig. 7, where they are denoted 767. There are six paths. The decision process maintains six measures, equivalent to the sums of the Euclidean distances along each path of hypothesized phases. To keep the procedure responsive to changed conditions, each Euclidean distance is reduced as it ages. The phase track whose sum of Euclidean distances is smallest is deemed to be the operative one. The current phase of this track is referred to as the "applicable phase." Inverse telecine based on the chosen phase can now take place, provided the phase is not P5. If P5 is selected, the current frame is deinterlaced by the deinterlacer at block 405 (Fig. 4). In summary, the applicable phase is used either as the current pulldown phase or as an indicator that commands the deinterlacing of a frame estimated to have a valid NTSC format.
For every frame received from the video input, a new value of each of the four metrics is computed. These are defined as:

SAD_FS = Σ |C1(i,j) − P1(i,j)| (1)
SAD_SS = Σ |C2(i,j) − P2(i,j)| (2)
SAD_PO = Σ |C1(i,j) − P2(i,j)| (3)
SAD_CO = Σ |C1(i,j) − C2(i,j)| (4)

The term SAD is an abbreviation of "summed absolute differences." The fields that are differenced to form the metrics are illustrated in Fig. 8. The subscript refers to the field number; the letter denotes the previous (= P) or the current (= C) frame. The brackets in Fig. 8 denote pairwise differences of fields. SAD_FS refers to the difference between field one of the current frame, labeled C1, and field one of the previous frame, labeled P1; in the definitions given in Fig. 8, the bracket labeled FS spans these fields. SAD_SS refers to the difference between field two of the current frame, labeled C2, and field two of the previous frame, labeled P2; the bracket labeled SS spans these two fields. SAD_CO refers to the difference between field two of the current frame, labeled C2, and field one of the current frame, labeled C1; the bracket labeled CO spans these fields. SAD_PO refers to the difference between field one of the current frame and field two of the previous frame; the bracket labeled PO spans these two fields.
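In code, equations (1) through (4) reduce to four sums of absolute luminance differences. A minimal sketch, assuming the fields are available as NumPy arrays of luma values:

```python
import numpy as np

# Sketch of equations (1)-(4): sums of absolute luminance differences
# between fields. c1/c2 are the current frame's fields one/two; p1/p2
# are the previous frame's fields (all same-shape luma arrays).
def field_metrics(c1, c2, p1, p2):
    sad = lambda a, b: float(np.abs(a.astype(np.int32)
                                    - b.astype(np.int32)).sum())
    return {
        "SAD_FS": sad(c1, p1),   # eq. (1): current vs previous field one
        "SAD_SS": sad(c2, p2),   # eq. (2): current vs previous field two
        "SAD_PO": sad(c1, p2),   # eq. (3): current field one vs previous field two
        "SAD_CO": sad(c1, c2),   # eq. (4): both fields of the current frame
    }
```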
The computational load of evaluating each SAD can be assessed as follows. There are about 480 active horizontal lines in conventional NTSC. For the resolution in the horizontal direction to be the same, with a 4:3 aspect ratio, there should be 480 × 4/3 = 640 equivalent vertical lines, or degrees of freedom. The video format of 640 × 480 pixels is one of the formats accepted by the Advanced Television Systems Committee. Thus every 1/30 second, the duration of a frame, 640 × 480 = 307,200 new pixels are generated. New data is generated at a rate of 9.2 × 10^6 pixels/second, implying that hardware or software running this system processes data at a rate of about 10 MB per second or more. This is one of the high-speed portions of the system. It can be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. The SAD calculator may be a standalone component, may be incorporated as hardware, firmware, or middleware in a component of another device, or may be implemented in microcode or software executed on a processor, or a combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments that perform the calculation can be stored in a machine-readable medium such as a storage medium. A code segment can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment can be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents.
The flowchart 900 in Fig. 9 makes the relationships in Fig. 8 explicit; Fig. 9 is a graphical representation of equations 1 through 4. Fig. 9 shows storage locations 941, 942, 943 and 944, which hold the most recent values of SAD_FS, SAD_CO, SAD_SS and SAD_PO respectively. Each value is produced by one of four absolute-difference calculators 940, which process the luminance values of the previous first field data 931, the current first field data 932, the current second field data 933, and the previous second field data 934. In the sums defining the metrics, a term such as C1(i,j) means the luminance value at position (i,j), and each sum is taken over all valid pixels, though summation over a meaningful subset of the valid pixels is not excluded.
The flowchart 1000 in Fig. 10 is a detailed flowchart illustrating a process for detecting telecined video and inverting it to recover the originally scanned film images. In step 1030, the metrics defined in Fig. 9 are evaluated. Proceeding to step 1083, the lower envelope values of the four metrics are found. The lower envelope of a SAD metric is a dynamically determined quantity; it is the highest numerical floor below which the SAD does not penetrate. Proceeding to step 1085, the branch information quantities defined below in equations 5 through 10 are determined, using the previously determined metrics, the lower envelope values, and an experimentally determined constant A. Because successive phase values may be inconsistent, a quantity Δ is determined in step 1087 to reduce this apparent instability. The phases are deemed consistent when the sequence of phase decisions is consistent with the model of the problem shown in Fig. 7. After that step, the process proceeds to step 1089 to compute the decision variables using the current value of Δ. The decision variable calculator 1089 evaluates the decision variables using all the information generated in block 1080. Steps 1030, 1083, 1085, 1087 and 1089 are an expansion of the metrics determination 651 of Fig. 6. From these variables the applicable phase is found by the phase selector 1090. As illustrated, decision step 1091 uses the applicable phase either to invert the telecined video or to deinterlace it. This is a more explicit description of the operation of the phase detector 404 of Fig. 4. In one aspect, the processing of Fig. 10 is performed by the phase detector 404 of Fig. 4. Starting at step 1030, the detector 404 determines the plurality of metrics by the process described above with reference to Fig. 8, and continues through steps 1083, 1085, 1087, 1089, 1090 and 1091.
Flowchart 1000 illustrates the process of estimating the current phase. Using the metrics and the lower envelope values determined at step 1083, the branch information is calculated. The branch information can be recognized as the Euclidean distances discussed earlier. Exemplary equations that can be used to generate the branch information are equations 5 through 10 below. The branch information quantities are computed in block 1209 of Fig. 12.
The processed video data can be stored in a storage medium, which can include, for example, a chip-configured storage medium (for example, ROM, RAM) or a disc-type storage medium (for example, magnetic or optical) connected to a processor. In some aspects, the inverse telecine 406 and the deinterlacer 405 can each contain part or all of such a storage medium. The branch information quantities are defined by the following equations:
Branch Info(0) = (SAD_FS − H_S)² + (SAD_SS − H_S)² + (SAD_PO − H_P)² + (SAD_CO − L_C)² (5)
Branch Info(1) = (SAD_FS − L_S)² + (SAD_SS − H_S)² + (SAD_PO − L_P)² + (SAD_CO − H_C)² (6)
Branch Info(2) = (SAD_FS − H_S)² + (SAD_SS − H_S)² + (SAD_PO − L_P)² + (SAD_CO − H_C)² (7)
Branch Info(3) = (SAD_FS − H_S)² + (SAD_SS − L_S)² + (SAD_PO − L_P)² + (SAD_CO − L_C)² (8)
Branch Info(4) = (SAD_FS − H_S)² + (SAD_SS − H_S)² + (SAD_PO − H_P)² + (SAD_CO − L_C)² (9)
Branch Info(5) = (SAD_FS − L_S)² + (SAD_SS − L_S)² + (SAD_PO − L_P)² + (SAD_CO − L_C)² (10)
The finer details of the branch computation are shown in the branch information calculator 1209 in Fig. 12. As shown in calculator 1209, the branch information is derived using the quantities L_S (the lower envelope value of SAD_FS and SAD_SS), L_P (the lower envelope value of SAD_PO), and L_C (the lower envelope value of SAD_CO). The lower envelopes are used in the branch information computation as distance offsets, either alone or together with a predetermined constant A to produce H_S, H_P and H_C. The values of the lower envelopes are kept current in the lower envelope trackers discussed below. The H offsets are defined as:

H_S = L_S + A (11)
H_P = L_P + A (12)
H_C = L_C + A (13)
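Equations (5) through (13) translate directly into code. The sketch below assumes the metrics dictionary of the earlier sketch; A is the experimentally determined constant mentioned above:

```python
# Sketch of equations (5)-(13). m holds SAD_FS/SAD_SS/SAD_PO/SAD_CO;
# ls, lp, lc are the tracked lower envelopes; A is the experimentally
# determined constant. Rows follow equations (5)-(10) in order.
def branch_info(m, ls, lp, lc, A):
    hs, hp, hc = ls + A, lp + A, lc + A          # eqs. (11)-(13)
    fs, ss, po, co = m["SAD_FS"], m["SAD_SS"], m["SAD_PO"], m["SAD_CO"]
    targets = [                                  # (FS, SS, PO, CO) targets
        (hs, hs, hp, lc),   # phase 0
        (ls, hs, lp, hc),   # phase 1
        (hs, hs, lp, hc),   # phase 2
        (hs, ls, lp, lc),   # phase 3
        (hs, hs, hp, lc),   # phase 4
        (ls, ls, lp, lc),   # phase 5 (conventional NTSC)
    ]
    return [(fs - a)**2 + (ss - b)**2 + (po - c)**2 + (co - d)**2
            for a, b, c, d in targets]
```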
The processes for tracking the values of L_S, L_P and L_C are presented in Figs. 13A, 13B and 13C. Consider, for example, the tracking algorithm 1300 for L_P shown at the top of Fig. 13A. In comparator 1305, the metric SAD_PO is compared with the current value of L_P plus a threshold T_P. If SAD_PO is greater than L_P plus the threshold T_P, the current value of L_P is unchanged, as shown in block 1315. If SAD_PO is not greater than L_P plus the threshold T_P, the new value of L_P becomes a linear combination of SAD_PO and L_P, as seen in block 1313. In another aspect, for step 1315, the new value of L_P is L_P + T_P.
The quantities L_S and L_C in Figs. 13B and 13C are computed similarly. Processing blocks in Figs. 13A, 13B and 13C that have the same function are numbered identically, but are given primes (' or ") to show that they operate on different sets of variables. For example, the formation of the linear combination of SAD_CO and L_C is shown in block 1313'. For the L_C case, in another aspect, L_C + T_C is used in 1315' in place of L_C.
In the case of L_S, however, the algorithm in Fig. 13B processes SAD_FS and SAD_SS alternately, each in turn labeled X, because this lower envelope applies to two variables. The alternation of SAD_FS values and SAD_SS values occurs when the current value of SAD_FS in block 1308 is read into the X position in block 1303, and then the current value of SAD_SS in block 1307 is read into the X position in block 1302. For the L_S case, in another aspect, L_S + T_S is used in 1315" in place of L_S. The quantity A used in testing against the current lower envelope values and the thresholds are determined by experiment.
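The envelope updates of Figs. 13A through 13C can be sketched as a small update rule. The blend weight w of the linear combination is an assumed value, since the text specifies only that a linear combination is formed; the thresholds are the experimentally determined quantities T_P, T_S and T_C:

```python
# Sketch of the Fig. 13 lower-envelope update for one metric. If the new
# SAD sits within threshold t of the envelope, pull the envelope toward
# it with a linear combination (weight w is an assumed value); otherwise
# leave the envelope unchanged (block 1315).
def update_envelope(envelope, sad, t, w=0.25):
    if sad > envelope + t:
        return envelope                        # SAD is well above the floor
    return (1.0 - w) * envelope + w * sad      # block 1313

# L_S tracks both SAD_FS and SAD_SS, fed alternately (Fig. 13B):
def update_ls(ls, sad_fs, sad_ss, t_s):
    ls = update_envelope(ls, sad_fs, t_s)
    return update_envelope(ls, sad_ss, t_s)
```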
Fig. 11 is a flowchart illustrating an exemplary process for performing step 1089 of Fig. 10. Fig. 11 essentially shows the process for updating the decision variables. In Fig. 11, the six decision variables, corresponding to the six possible decisions, are updated with new information derived from the metrics. The decision variables are obtained as follows:
D0 = αD4 + Branch Info(0) (14)
D1 = αD0 + Branch Info(1) (15)
D2 = αD1 + Branch Info(2) (16)
D3 = αD2 + Branch Info(3) (17)
D4 = αD3 + Branch Info(4) (18)
D5 = αD5 + Branch Info(5) (19)
The quantity α is less than one and limits the dependence of a decision variable on its past values; the use of α is equivalent to reducing each Euclidean distance as its data ages. In flowchart 1162, the decision variables to be updated are listed on the left as available on lines 1101, 1102, 1103, 1104, 1105 and 1106. Each decision variable on one of the phase transition paths is then multiplied by α, a number less than one, in one of the blocks 1100; the attenuated old decision variable is then added to the current value of the branch information variable indexed by the next phase on the phase transition path on which the decayed decision variable lies. This takes place in the blocks 1110. The variable D5 is offset by an amount Δ in block 1193; Δ is computed in block 1112. As described below, this quantity is selected to reduce inconsistencies in the sequence of phases determined by this system. The smallest decision variable is found in block 1120.
In summary, the fresh information specific to each decision is added to the previous value of the appropriate decision variable, which has been multiplied by α, to obtain the current value of that decision variable. A new decision can be made whenever new metrics arrive; this technique can therefore make a new decision upon receiving field 1 and field 2 of each frame. The decision variables are the sums of Euclidean distances mentioned earlier. The applicable phase chosen is the phase whose subscript has the minimum decision variable. The decision based on the decision variables is made explicitly in block 1090 of Figure 10. The decision space allows several decisions. As described in block 1091, the decisions are: (i) if the applicable phase is not P_5, then inverse-telecine the video; and (ii) if the applicable phase is P_5, then deinterlace the video.
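Equations 14 to 19 together with the block 1090 decision amount to a recurrence over the six phase hypotheses. The sketch below is illustrative only; the decay constant, the variable names, and the packaging as one function are assumptions.

```python
ALPHA = 0.95  # decay factor less than one; the actual value is chosen by experiment

# Predecessor phase for each decision variable: D0 follows D4, D1 follows D0,
# and so on; D5 (the "no pulldown" hypothesis) follows itself (equations 14-19).
PREDECESSOR = [4, 0, 1, 2, 3, 5]

def update_decisions(d, branch_info, delta):
    """One update of the six decision variables when new metrics arrive.

    `d` holds the old D0..D5, `branch_info` the six fresh branch-information
    values, and `delta` the offset added to D5 (block 1193).
    """
    new_d = [ALPHA * d[PREDECESSOR[i]] + branch_info[i] for i in range(6)]
    new_d[5] += delta                              # bias the P5 hypothesis
    best = min(range(6), key=lambda i: new_d[i])   # block 1120: smallest decision variable
    return new_d, best                             # P_best is the applicable phase
```

A returned phase index of 5 selects deinterlacing per block 1091; any other index selects inverse telecine with that pulldown phase.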
Because the metrics are drawn from inherently variable video, occasional errors may appear in an otherwise coherent string of decisions. This technique detects phase sequences that are inconsistent with Figure 7. Its operation is summarized in Figure 14. Algorithm 1400 stores the subscript of the current phase decision (= x) in block 1405 and the subscript of the previous phase decision (= y) in block 1406. Block 1410 tests whether x = y = 5; block 1411 tests the following values:
whether
x = 1, y = 0; or
x = 2, y = 1; or
x = 3, y = 2; or
x = 4, y = 3; or
x = 0, y = 4.
If either of the two tests is affirmative, the decision is declared consistent in block 1420. If neither test is affirmative, the offset shown in block 1193 of Figure 11 is computed in Figure 15 and added to the decision variable D_5 associated with P_5.
The modification of D_5 also appears in Figure 15 as part of process 1500; the modification provides corrective action for inconsistency in the phase sequence. Suppose the consistency test in block 1510 of flow chart 1500 has failed. Proceeding along the "No" branch from block 1510, the next test, in block 1514, is whether D_5 > D_i for all i < 5, or whether at least one of the variables D_i, i < 5, exceeds D_5. If the first condition holds, the parameter δ, whose initial value is δ_0, is changed to 3δ_0 in block 1516. If the second condition holds, δ is changed to 4δ_0 in block 1517. In block 152B, the value of Δ is updated to Δ_B, where
Δ_B = max(Δ − δ, −40δ_0) (20)
Returning to block 1510, suppose now that the string of decisions has been determined to be consistent. In block 1521, the parameter δ is changed to the value δ_+ defined by
δ_+ = max(2δ, 16δ_0) (21)
The new value of δ is inserted into the update relation for Δ in block 152A. This is
Δ_A = max(Δ + δ, 40δ_0) (22)
The updated value of Δ is then added to the decision variable D_5 in block 1593.
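The corrective action of Figure 15 can be summarized as below. The function packaging and the value of δ_0 are assumptions; the branch constants follow equations 20 to 22.

```python
DELTA0 = 1.0  # delta_0: base step; the actual value is chosen by experiment

def update_offset(consistent, d, delta, small_delta):
    """One pass of the Figure 15 corrective update.

    `d` holds D0..D5, `delta` is the current offset Delta, and `small_delta`
    is the adaptive step delta with initial value DELTA0.
    """
    if not consistent:
        if all(d[5] > d[i] for i in range(5)):
            small_delta = 3 * DELTA0                           # block 1516
        else:
            small_delta = 4 * DELTA0                           # block 1517
        delta = max(delta - small_delta, -40 * DELTA0)         # equation 20
    else:
        small_delta = max(2 * small_delta, 16 * DELTA0)        # equation 21
        delta = max(delta + small_delta, 40 * DELTA0)          # equation 22
    return delta, small_delta
```

The returned Δ is the amount added to D_5 in block 1593, i.e., the `delta` argument of the earlier decision-update sketch.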
Figure 16 shows how the inverse-telecine process proceeds once the pulldown phase has been determined. Using this information, fields 1605 and 1605' are identified as representing the same field of video. The two fields are averaged together and combined with field 1606 to reconstruct frame 1620; the reconstructed frame is 1620'. A similar procedure reconstructs frame 1622. The fields derived from frames 1621 and 1623 are not duplicated; those frames are reconstructed by weaving their first and second fields back together.
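A sketch of the weaving step follows, assuming NumPy arrays; which argument carries the top lines, and the frame numbers in the comment, are illustrative conventions keyed to Figure 16.

```python
import numpy as np

def weave(top_field, bottom_field):
    """Re-interleave two fields into one progressive frame."""
    rows, cols = top_field.shape
    frame = np.empty((rows * 2, cols), dtype=top_field.dtype)
    frame[0::2] = top_field      # even lines come from the top field
    frame[1::2] = bottom_field   # odd lines come from the bottom field
    return frame

# The duplicated fields (1605 and 1605' in Figure 16) are averaged first:
# frame_1620 = weave((field_1605 + field_1605_dup) / 2.0, field_1606)
```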
In the aspects described above, whenever a new frame is received, four new values of the metrics are computed, and the sixfold hypothesis group is tested using the freshly computed decision variables. Other processing structures could be adapted to compute the decision variables. A Viterbi decoder adds together the metrics of the branches that constitute a path to form a path metric. The decision variables defined here are formed by a similar rule: each is a "leaky" sum of the fresh-information variables. (In a leaky summation, the previous value of a decision variable is multiplied by a number less than one before the new information data is added to it.) A Viterbi decoder structure could be modified to support the operation of this procedure.
Although this aspect has been described in terms of processing conventional video (in which a new frame appears every 1/30 second), it should be noted that this process is applicable to frames that are recorded and processed backward in time. The decision space remains the same, apart from small changes that reflect the time reversal of the sequence of input frames. For example, a string of coherent telecine-processing decisions from the time-reversed mode, presented here as
P_4 P_3 P_2 P_1 P_0
would likewise be reversed in time.
Applying this change to the first aspect would allow two attempts at making a successful decision: one attempt going forward in time and the other going backward. The two attempts are not independent, but they differ because the metrics are processed in a different order in each attempt.
This idea can be applied together with a buffer maintained to store future video frames that may additionally be needed. If a video segment is found to give unacceptably inconsistent results when processed in the forward direction, the procedure would take future frames from the buffer and attempt to overcome the stretch of difficult video by processing the frames in the opposite direction.
The video processing described in this patent is also applicable to video in the PAL format.
Deinterlacer
As used herein, "deinterlacer" is a broad term that describes a deinterlacing system, device, or process (including, for example, software, firmware, or hardware configured to perform a process) that wholly or largely processes interlaced multimedia data to form progressive multimedia data.
Broadcast video produced by conventional means, in video cameras, broadcast studios, and the like, meets the NTSC standard in the United States. A common way of compressing video is to interlace it. In interlaced data, each frame is made up of one of two fields. One field consists of the odd lines of the frame, the other of the even lines. Although frames are produced at approximately 30 frames/second, the fields are records of the television camera's image taken 1/60 second apart. Each frame of an interlaced video signal shows every other horizontal line of the image. As the frames are projected on the screen, the video signal alternates between showing the even lines and the odd lines. When this alternation is performed fast enough (e.g., around 60 fields per second), the video image appears smooth to the human eye.
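For illustration only, the field structure just described can be sketched in a few lines of Python (NumPy assumed; names are illustrative):

```python
import numpy as np

def split_fields(frame):
    """Split a progressive frame into its even field (lines 0, 2, 4, ...) and
    its odd field (lines 1, 3, 5, ...); in NTSC the two field records are
    captured 1/60 second apart."""
    return frame[0::2], frame[1::2]
```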
Interlacing has been used for decades in analog television broadcasts based on the NTSC (U.S.) and PAL (Europe) formats. Because only half the image is sent with each frame, interlaced video uses roughly half the bandwidth that transmitting the whole image would require. The eventual display format of the video at terminal 16 need not be NTSC-compatible and may not readily display interlaced data. Modern pixel-based displays (e.g., LCD, DLP, LCOS, plasma) are instead progressively scanned and display progressively scanned video sources (whereas many older video devices use the older interlaced-scan technology). Examples of some commonly used deinterlacing algorithms are described in "Scan rate up-conversion using adaptive weighted median filtering" by P. Haavisto, J. Juhola, and Y. Neuvo (Signal Processing of HDTV II, pp. 703-710, 1990), and in "Deinterlacing of HDTV Images for Multimedia Applications" by R. Simonetti, S. Carrato, G. Ramponi, and A. Polo Filisan (Signal Processing of HDTV IV, pp. 765-772, 1993).
Described below are aspects of deinterlacing systems and methods that may be used alone or in combination to improve deinterlacing performance, and that can be used in the deinterlacer 405 (Fig. 4). These aspects may include deinterlacing a selected frame using spatio-temporal filtering to determine a first provisional deinterlaced frame, determining a second provisional deinterlaced frame from the selected frame using bidirectional motion estimation and motion compensation, and then combining the first and second provisional frames to form a final progressive frame. The spatio-temporal filtering can use a weighted median filter ("Wmed"), which can include a horizontal edge detector that prevents horizontal or near-horizontal edges from being blurred. Spatio-temporal filtering of previous and subsequent neighboring fields relative to the "current" field produces an intensity motion-level map that classifies portions of the selected frame into different motion levels, for example, static, slow motion, and fast motion.
In some aspects, Wmed filtering produces the intensity map using a filtering aperture that includes pixels from five neighboring fields (the two previous fields, the current field, and the two following fields). Wmed filtering can determine forward, backward, and bidirectional static-area detection that effectively handles scene changes and the appearance and disappearance of objects. In various aspects, a Wmed filter can be utilized across one or more fields of the same parity in an inter-field filtering mode, and can be switched to an intra-field filtering mode by adjusting threshold criteria. In some aspects, motion estimation and compensation use luma (the intensity or brightness of a pixel) and chroma data (the color information of a pixel) to improve deinterlacing of regions of the selected frame where the brightness level is almost uniform but the color differs. A denoising filter can be used to increase the accuracy of motion estimation. The denoising filter can be applied to Wmed provisional deinterlaced frames to remove aliasing artifacts produced by the Wmed filtering. The deinterlacing methods and systems discussed below produce good deinterlacing results with relatively low computational complexity, permitting fast-running deinterlacing implementations suitable for a wide variety of deinterlacing applications, including systems that use displays to provide data to cell phones, computers, and other types of electronic or communication devices.
Aspects of a deinterlacer and of deinterlacing methods are described herein with reference to various components, modules, and/or steps used to deinterlace multimedia data.
Figure 17 is a block diagram illustrating one aspect of a deinterlacer 1700 that can be used as the deinterlacer 405 in Fig. 4. The deinterlacer 1700 includes a spatial filter 1730 that filters at least a portion of the interlaced data spatially and temporally ("spatio-temporally") and produces spatio-temporal information. For example, the spatial filter 1730 can use a Wmed filter. In some aspects, the deinterlacer 1700 also includes a denoising filter (not shown), for example, a Wiener filter or a wavelet shrinkage filter. The deinterlacer 1700 also includes a motion estimator 1732 that provides motion estimation and compensation for a selected frame of the interlaced data and produces motion information. A combiner 1734 receives and combines the spatio-temporal information and the motion information to form a progressive frame.
Figure 18 is another block diagram of the deinterlacer 1700. A processor 1836 in the deinterlacer 1700 includes a spatial filter module 1838, a motion estimation module 1840, and a combiner module 1842. Interlaced multimedia data from an external source 48 can be provided to a communication module 44 in the deinterlacer 1700. The deinterlacer and its components or steps can be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. For example, the deinterlacer can be a standalone component, be incorporated as hardware, firmware, or middleware in a component of another device, or be implemented in microcode or software executed on a processor, or a combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that perform the deinterlacer tasks can be stored in a machine-readable medium such as a storage medium. A code segment can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment can be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents.
Received interlaced data can be stored in a storage medium 1846 in the deinterlacer 1700, which can include, for example, a chip-based storage medium (e.g., ROM, RAM) or a disc-type storage medium (e.g., magnetic or optical) connected to the processor 1836. In some aspects, the processor 1836 can contain some or all of the storage medium. The processor 1836 is configured to process the interlaced multimedia data to form a progressive frame that is then supplied to another device or process.
Conventional analog video devices like televisions render video in an interlaced manner, i.e., such devices transmit the even-numbered scan lines (the even field) and the odd-numbered scan lines (the odd field). From a signal-sampling point of view, this is equivalent to a spatio-temporal subsampling carried out in the pattern
F(x, y, n) = Θ(x, y, n) if y mod 2 = n mod 2, with the pixel dropped otherwise, (23)
where Θ represents the original frame image, F represents the interlaced field, and (x, y, n) represent the horizontal, vertical, and temporal positions of a pixel, respectively.
Without loss of generality, it can be assumed throughout this disclosure that n = 0 is always an even field, so that equation 23 above simplifies to
F(y, n) = Θ(y, n) if y mod 2 = n mod 2, with the pixel dropped otherwise. (24)
Since no decimation is performed in the horizontal dimension, the subsampling pattern can from here on be described in the (n, y) coordinates.
In Figure 19, the circles and asterisks represent positions at which the original full-frame image has sampled pixels. The interlacing process decimates the asterisk pixels while leaving the circle pixels intact. Note that vertical positions are indexed starting from zero; hence the even field is the top field and the odd field is the bottom field.
The goal of a deinterlacer is to transform interlaced video (a sequence of fields) into non-interlaced progressive frames (a sequence of frames). In other words, the even and odd fields are interpolated to "recover" or generate the full-frame images. This can be represented by equation 25:
F_o(x, y, n) = F(x, y, n) if y mod 2 = n mod 2, and F_i(x, y, n) otherwise, (25)
where F_i represents the deinterlacing result for the missing pixels.
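Equation 25 translates directly into code. In the sketch below, `field` is assumed to be a full-height array whose valid rows are those of matching parity, and `interp` holds the interpolation results F_i; both conventions are assumptions for illustration.

```python
import numpy as np

def deinterlace(field, n, interp):
    """Equation 25 as code: rows whose parity matches the field index n keep
    the transmitted samples; all other rows take the interpolated values."""
    out = np.empty_like(interp)
    for y in range(interp.shape[0]):
        out[y] = field[y] if y % 2 == n % 2 else interp[y]
    return out
```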
Figure 20 is a block diagram illustrating certain aspects of one aspect of a deinterlacer that uses Wmed filtering and motion estimation to produce a progressive frame from interlaced multimedia data. The upper part of Figure 20 shows that a motion intensity map 2052 can be produced using information from the current field, the two previous fields (PP and P), and the two following fields (Next and Next-Next). The motion intensity map 2052 classifies or partitions the current frame into two or more different motion levels, and can be produced by the spatio-temporal filtering described in further detail below. In some aspects, the motion intensity map 2052 is produced to identify static areas, slow-motion areas, and fast-motion areas, as described below with reference to equations 26 to 30. A spatio-temporal filter (e.g., Wmed filter 2054) filters the interlaced multimedia data using criteria based on the motion intensity map and produces a spatio-temporal provisional deinterlaced frame. In some aspects, the Wmed filtering involves a horizontal neighborhood of [-1, 1], a vertical neighborhood of [-3, 3], and a temporal neighborhood of five adjacent fields, represented by the five fields illustrated in Figure 20 (PP, P, Current, Next, Next-Next), where Z^-1 denotes a delay of one field. Relative to the current field, Next and P are fields of opposite parity while PP and Next-Next are fields of the same parity. The "neighborhood" used for spatio-temporal filtering refers to the spatial and temporal locations of the fields and pixels actually used during the filtering operation, and can be illustrated as an "aperture" as shown, for example, in Figure 21 and Figure 22.
The deinterlacer may also include a denoiser (denoising filter) 2056. The denoiser 2056 is configured to filter the spatio-temporal provisional deinterlaced frame produced by the Wmed filter 2054. Denoising the spatio-temporal provisional deinterlaced frame makes the subsequent motion search process more accurate, especially when the source interlaced multimedia data sequence is contaminated by white noise. It can also at least partially remove aliasing between even and odd lines in the Wmed image. The denoiser 2056 can be implemented as a variety of filters, including denoisers based on wavelet shrinkage and on wavelet Wiener filters, both further described below.
The lower part of Figure 20 illustrates an aspect for determining motion information (e.g., motion-vector candidates, motion estimation, motion compensation) for the interlaced multimedia data. Specifically, Figure 20 illustrates a motion estimation and motion compensation scheme used to produce a motion-compensated provisional progressive frame of the selected frame, which is then combined with the Wmed provisional frame to form the resulting "final" progressive frame, shown as the deinterlaced current frame 2064. In some aspects, motion-vector ("MV") candidates (or estimates) for the interlaced multimedia data are provided to the deinterlacer from an external motion estimator and used to give the bidirectional motion estimator and motion compensator ("ME/MC") 2068 a starting point. In some aspects, an MV candidate selector 2072 uses, for the MV candidates of the block being processed, MVs previously determined for neighboring blocks, for example the MVs of previously processed blocks (e.g., blocks in the previous deinterlaced frame 2070). Motion compensation can be done bidirectionally, based on the previous deinterlaced frame 2070 and the next (e.g., future) Wmed frame 2058. A combiner 2062 merges or combines the current Wmed frame 2060 and the motion-compensated ("MC") current frame 2066. The resulting deinterlaced current frame 2064, now a progressive frame, is provided back to the ME/MC 2068 to serve as the previous deinterlaced frame 2070, and is also communicated outside the deinterlacer for further processing (e.g., compression and transmission to a display terminal). The various aspects of Figure 20 are shown in more detail below.
Figure 25 illustrates a process 2500 for processing multimedia data to produce a sequence of progressive frames from a sequence of interlaced frames. In one aspect, the progressive frames are produced by the deinterlacer 405 illustrated in Fig. 4. At block 2502, process 2500 (process "A") produces spatio-temporal information for a selected frame. The spatio-temporal information can include information used to classify the motion levels of the multimedia data and to produce a motion intensity map, and includes the Wmed provisional deinterlaced frame and the information used to produce it (e.g., the information in equations 26 to 33). This process can be performed by the Wmed filter 2054 and its associated processing, illustrated in the upper part of Figure 20 and detailed further below. In process A, illustrated in Figure 26, regions are classified into fields of different motion levels at block 2602, as further described below.
Next, at block 2504 (process "B"), process 2500 produces motion-compensation information for the selected frame. In one aspect, the bidirectional motion estimator/motion compensator 2068 illustrated in the lower part of Figure 20 can perform this process. Process 2500 then proceeds to block 2506, where it deinterlaces the fields of the selected frame based on the spatio-temporal information and the motion-compensation information to form a progressive frame associated with the selected frame. This can be performed by the combiner 2062 illustrated in the lower part of Figure 20.
Motion intensity map
For each frame, a motion intensity map 2052 identifying areas of different "motion" can be determined by processing the pixels in the current field. Illustrative aspects of determining a three-category motion intensity map are described below with reference to Figures 21 to 24. The motion intensity map designates areas of each frame as static, slow-motion, or fast-motion based on comparisons of pixels in fields of the same parity and of opposite parity.
Static areas
Determining the static areas of the motion map can comprise processing the pixels in a neighborhood of adjacent fields to determine whether luminance differences of certain pixels meet certain criteria. In some aspects, determining the static areas of the motion map comprises processing pixels in a neighborhood of five adjacent fields (the current field (C), two fields temporally before the current field, and two fields temporally after the current field) to determine whether luminance differences of certain pixels meet certain thresholds. These five fields are illustrated in Figure 20, with Z^-1 representing a delay of one field. In other words, the five adjacent fields would typically be displayed as a sequence with Z^-1 time delays between them.
Figure 21 illustrates an aperture identifying certain pixels of each of the five fields according to some aspects; the aperture can be used for the spatio-temporal filtering. The aperture includes, from left to right, 3 × 3 pixel groups of the previous-previous field (PP), the previous field (P), the current field (C), the next field (N), and the next-next field (NN). In some aspects, an area of the current field is considered static in the motion map if it meets the criteria described in equations 26 to 28, with the pixel positions and corresponding fields illustrated in Figure 21:
|L_P − L_N| < T1, (26)
and
(|L_B − L_BPP| < T1 and |L_E − L_EPP| < T1), (27)
or
(|L_B − L_BNN| < T1 and |L_E − L_ENN| < T1), (28)
where T1 is a threshold,
L_P is the luminance of pixel P in field P,
L_N is the luminance of pixel N in field N,
L_B is the luminance of pixel B in the current field,
L_E is the luminance of pixel E in the current field,
L_BPP is the luminance of pixel B_PP in field PP,
L_EPP is the luminance of pixel E_PP in field PP,
L_BNN is the luminance of pixel B_NN in field NN, and
L_ENN is the luminance of pixel E_NN in field NN.
The threshold T1 can be predetermined and set to a particular value, determined by a process other than the deinterlacing and provided (for example, as metadata of the video being deinterlaced), or determined dynamically during deinterlacing.
The static-area criteria described above in equations 26, 27, and 28 use more fields than conventional deinterlacing techniques, for at least two reasons. First, comparisons between fields of the same parity exhibit lower aliasing and phase mismatch than comparisons between fields of opposite parity. However, the minimum time difference (and therefore the strongest correlation) between the field being processed and its nearest same-parity neighbor is two fields, larger than that between the field being processed and its nearest opposite-parity neighbor. Combining the more strongly correlated opposite-parity fields with the lower-aliasing same-parity fields can improve the accuracy of static-area detection.
In addition, the five fields can be distributed symmetrically in the past and the future relative to the pixel X in the current frame C, as shown in Figure 21. The static area can be subdivided into three categories: forward static (static relative to the previous frame), backward static (static relative to the next frame), or bidirectional static (if both the forward and backward criteria are met). This finer classification of static areas improves performance, especially at scene changes and when objects appear or disappear.
Slow-motion areas
An area of the motion map can be considered a slow-motion area if the luminance values of certain pixels do not meet the criteria for a static area but do meet the criteria for designation as a slow-motion area. Equation 29 below defines criteria that can be used to determine slow-motion areas. Referring to Figure 22, the positions of the pixels Ia, Ic, Ja, Jc, Ka, Kc, La, Lc, P, and N identified in equation 29 are shown in an aperture centered on pixel X. The aperture includes a 3 × 7 pixel neighborhood of the current field (C) and 3 × 5 neighborhoods of the next field (N) and the previous field (P). Pixel X is considered part of a slow-motion area if it does not meet the static-area criteria listed above and if the pixels in the aperture meet the following criterion shown in equation 29:
(|L_Ia − L_Ic| + |L_Ja − L_Jc| + |L_Ka − L_Kc| + |L_La − L_Lc| + |L_P − L_N|)/5 < T2 (29)
where T2 is a threshold, and
L_Ia, L_Ic, L_Ja, L_Jc, L_Ka, L_Kc, L_La, L_Lc, L_P, and L_N are the luminance values of pixels Ia, Ic, Ja, Jc, Ka, Kc, La, Lc, P, and N, respectively.
The threshold T2 can likewise be predetermined and set to a particular value, determined by a process other than the deinterlacing and provided (for example, as metadata of the video being deinterlaced), or determined dynamically during deinterlacing.
Note that, because of the angle of the filter's edge-detection capability, the filter can blur horizontal edges (for example, edges more than 45° from vertical alignment). For example, the edge-detection capability of the aperture (filter) illustrated in Figure 22 is bounded by the angle formed by pixels "A" and "F", or by "C" and "D". Any edge more horizontal than this angle will not be interpolated optimally, and staircase artifacts may therefore appear at such edges. In some aspects, to account for this edge-detection effect, the slow-motion category can be divided into two subcategories, "horizontal edge" and "otherwise". A slow-motion pixel can be classified as a horizontal edge if it meets the criterion shown in equation 30 below, and classified into the so-called "otherwise" category if it does not:
|(L_A + L_B + L_C) − (L_D + L_E + L_F)| < T3 (30)
where T3 is a threshold, and L_A, L_B, L_C, L_D, L_E, and L_F are the luminance values of pixels A, B, C, D, E, and F, respectively.
Different interpolation methods can be used for horizontal edges and for the other edges.
Fast-motion areas
A pixel can be considered to be in a fast-motion area if it meets neither the criteria for a static area nor the criteria for a slow-motion area.
After the pixels in the selected frame have been classified, process A (Figure 26) proceeds to block 2604 and produces a provisional deinterlaced frame based on the motion intensity map. In this aspect, the Wmed filter 2054 (Figure 20) filters the selected field and the necessary adjacent field(s) to provide a candidate full-frame image F_0, defined in equation 31, in which α_i (i = 0, 1, 2, 3) are integer weights. The Wmed-filtered provisional deinterlaced frame is provided for further processing together with the motion estimation and motion compensation processing, as illustrated in the lower part of Figure 20.
As described above and as shown in equation 31, static interpolation comprises inter-field interpolation, while slow-motion and fast-motion interpolation comprise intra-field interpolation. In aspects where temporal (e.g., inter-field) interpolation of same-parity fields is not needed, temporal interpolation can be "disabled" by setting the threshold T1 (equations 26 to 28) to zero (T1 = 0). With temporal interpolation disabled, processing of the current field causes no region to be mapped to the static motion level, and the Wmed filter 2054 (Figure 20) operates using the three fields illustrated in the aperture in Figure 22: a current field and its two opposite-parity neighbors.
Denoising
In some aspects, a denoiser can be used to remove noise from the candidate Wmed frame before it is further processed using the motion-compensation information. The denoiser can remove noise present in the Wmed frame while retaining the signal, regardless of the signal's frequency content. Various types of denoising filters can be used, including wavelet filters. Wavelets are a class of functions used to localize a given signal in both the spatial and the scaling domains. The underlying idea of wavelets is to analyze the signal at different scales or resolutions so that small changes in the wavelet representation produce correspondingly small changes in the original signal.
In some aspects, the denoising filter is based on an aspect of a (4, 2) biorthogonal cubic B-spline wavelet filter. Such a filter can be defined by a forward transform and an inverse transform. Applying a denoising filter can increase the accuracy of motion compensation in a noisy environment. Noise in the video sequence is assumed to be additive white Gaussian. The estimated noise variance is denoted by σ̂²; it can be estimated as the median absolute deviation of the highest-frequency subband coefficients divided by 0.6745. Implementations of such filters are further described in "Ideal spatial adaptation by wavelet shrinkage" by D.L. Donoho and I.M. Johnstone (Biometrika, vol. 81, pp. 425-455, 1994), which is incorporated herein by reference in its entirety.
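The noise estimate mentioned above is the standard robust estimator; a one-line sketch follows (NumPy assumed, and the zero-median approximation of the median absolute deviation is an assumption):

```python
import numpy as np

def estimate_noise_sigma(highest_subband):
    """Robust noise estimate for wavelet denoising:
    sigma ~= median(|coefficients|) / 0.6745 over the highest-frequency subband."""
    return np.median(np.abs(highest_subband)) / 0.6745
```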
Wavelet shrinkage or a wavelet Wiener filter can also be used as the denoiser. Wavelet shrinkage denoising involves shrinking in the wavelet transform domain and typically comprises three steps: a linear forward wavelet transform, nonlinear shrinkage denoising, and a linear inverse wavelet transform. The Wiener filter is the MSE-optimal linear filter and can be used to improve images degraded by additive noise and blurring. Such filters are generally known in the art and are described, for example, in "Ideal spatial adaptation by wavelet shrinkage" referenced above, and in "Improvement Wavelet denoising via empirical Wiener filtering" by S.P. Ghael, A.M. Sayeed, and R.G. Baraniuk (Proceedings of SPIE, vol. 3169, pp. 389-399, San Diego, July 1997).
Motion compensation
Referring to Figure 27, at block 2702, process B performs bidirectional motion estimation, and then at block 2704 uses the motion estimates to perform motion compensation, which is illustrated in Figure 20 and described below in an illustrative aspect. There is a one-field "lag" between the Wmed filter and the motion-compensation-based deinterlacer. Motion-compensation information for the "missing" data (the non-original rows of pixel data) of the current field "C" is predicted from information in both the previous frame "P" and the next frame "N", as shown in Figure 23. In the current field (Figure 23), solid lines represent rows in which original pixel data exist, and dotted lines represent rows whose pixel data have been Wmed-interpolated. In some aspects, motion compensation is performed in a pixel neighborhood of 4 rows by 8 columns. However, this pixel neighborhood is an example for purposes of illustration, and it will be apparent to those skilled in the art that motion compensation can be performed in other aspects on pixel neighborhoods comprising different numbers of rows and columns; the choice of neighborhood can be based on many factors including, for example, computational speed, available processing power, or the characteristics of the multimedia data being deinterlaced. Because the current field has only half of the rows, the four rows to be matched actually correspond to an area of 8 pixels × 8 pixels.
Referring to Figure 20, the bidirectional ME/MC 2068 can use sums of squared errors (SSE) to measure the similarity between a prediction block of the Wmed current frame 2060 and prediction blocks drawn from the Wmed next frame 2058 and the deinterlaced previous frame 2070. Generation of the motion-compensated current frame 2066 then uses the pixel information from the most similar matching blocks to fill in the missing data between the original pixel rows. In some aspects, the bidirectional ME/MC 2068 biases toward, i.e., gives more weight to, the pixel information from the deinterlaced previous frame 2070, because that pixel information was produced from motion-compensation information as well as Wmed information, whereas the Wmed next frame 2058 is deinterlaced only by spatio-temporal filtering.
In some aspects, to improve matching performance in field regions with similar luminance but different chrominance, a metric can be used whose pixel values comprise the contributions of one or more luma groups of pixels (e.g., one 4-row × 8-column luma block) and one or more chroma groups of pixels (e.g., two 2-row × 4-column chroma blocks U and V). This approach effectively reduces mismatches in color-sensitive regions.
Motion vectors (MVs) have a granularity of 1/2 pixel in the vertical dimension and of 1/2 or 1/4 pixel in the horizontal dimension. Interpolation filters can be used to obtain fractional-pixel samples. For example, some filters that can be used to obtain half-pixel samples include the bilinear filter (1, 1), the interpolation filter recommended by H.263/AVC, (1, −5, 20, 20, −5, 1), and a six-tap Hamming-windowed sinc filter, (3, −21, 147, 147, −21, 3). Quarter-pixel samples can be produced from the full- and half-pixel samples by applying a bilinear filter.
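As an illustration of half-pixel interpolation with the six-tap filter quoted above, the following sketch assumes normalization by the tap sum and pixel replication at the borders; both choices are assumptions.

```python
def half_pel(row, taps=(1, -5, 20, 20, -5, 1)):
    """Interpolate the half-pixel sample between row[i] and row[i+1] for each i,
    using a 6-tap filter (the H.263/AVC taps by default; tap sum is 32)."""
    norm = float(sum(taps))
    out = []
    for i in range(len(row) - 1):
        acc = 0.0
        for k, t in enumerate(taps):
            j = min(max(i + k - 2, 0), len(row) - 1)  # clamp indices at the borders
            acc += t * row[j]
        out.append(acc / norm)
    return out
```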
In some aspects, motion compensation can use any of a variety of search processes to match data at a certain position in the current frame (e.g., data depicting an object) with data at a corresponding position in another frame (e.g., the next frame or the previous frame); the difference between the positions in the respective frames indicates the motion of the object. For example, the search process can use a full motion search, which can cover a larger search area, or a fast motion search, which uses fewer pixels; and/or the selected pixels used in the search pattern can have a particular shape, e.g., a diamond. For fast motion searches, the search area can be centered on a motion estimate or motion candidate, which serves as the starting point for searching the adjacent frames. In some aspects, MV candidates can be produced by an external motion estimator and provided to the deinterlacer. The motion vector of a macroblock in a corresponding neighborhood of a previously motion-compensated adjacent frame can also be used as a motion estimate. In some aspects, MV candidates can be produced by searching a neighborhood of macroblocks (e.g., 3 macroblocks × 3 macroblocks) of the corresponding previous and next frames.
Figure 24 illustrates an example of two MV maps, MV_P and MV_N, that can be produced during motion estimation/compensation by searching neighborhoods of the previous frame and the next frame, as shown in Figure 23. In both MV_P and MV_N, the block to be processed to determine motion information is the center block denoted by "X". In both maps, there are nine MV candidates that can be used during motion estimation of the current block X being processed. In this example, four of the MV candidates come from the same field, having been determined by previously performed motion searches, and are depicted by the lighter-colored blocks in MV_P and MV_N (Figure 24). The five other MV candidates, depicted by the darker-colored blocks, are copied from the motion information (or map) of the previously processed frame.
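The nine-candidate scheme of Figure 24 can be sketched as follows; the raster processing order and the dictionary representation of the motion maps are assumptions for illustration.

```python
def mv_candidates(current_mvs, previous_mvs, bx, by):
    """Gather the nine MV candidates for block (bx, by).

    `current_mvs` and `previous_mvs` are dicts keyed by (block_x, block_y).
    Four candidates come from neighbors already processed in the current map
    (raster order assumed); the remaining five positions, including the
    center, are copied from the previous frame's motion map.
    """
    done = [(bx - 1, by), (bx - 1, by - 1), (bx, by - 1), (bx + 1, by - 1)]
    todo = [(bx + 1, by), (bx - 1, by + 1), (bx, by + 1), (bx + 1, by + 1), (bx, by)]
    cands = [current_mvs.get(p) for p in done] + [previous_mvs.get(p) for p in todo]
    return [mv for mv in cands if mv is not None]
```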
After motion estimation/compensation is complete, two interpolation results are available for the missing rows (denoted by the dotted lines in Figure 23): one interpolation result produced by the Wmed filter (the Wmed current frame 2060 in Figure 20) and one produced by the motion-estimation process of the motion compensator (the MC current frame 2066). The combiner 2062 typically merges the Wmed current frame 2060 and the MC current frame 2066 by using at least portions of each to produce the current deinterlaced frame 2064. Under certain conditions, however, the combiner 2062 can produce the current deinterlaced frame using only one of the Wmed current frame 2060 or the MC current frame 2066. In one example, the combiner 2062 merges the Wmed current frame 2060 and the MC current frame 2066 to produce a deinterlaced output signal as shown in equation 36:
where the combined luminance value at position x = (x, y)^t in the field is formed from the two interpolation results, the superscript t denoting transposition. The clip function used is defined as
clip(0, 1, a) = 0 if a < 0; 1 if a > 1; and a otherwise. (37)
k_1 can then be calculated as shown in equation 38, where C1 is a robustness parameter and Diff is the luminance difference between the pixel of the predicting frame and the available pixel in the predicted frame (taken from the existing field). With a suitable choice of C1, the relative importance of the mean squared error can be tuned. k_2 can be calculated as shown in equation 39, in which the argument is the motion vector and δ is a small constant that prevents division by zero. Deinterlacing using a clip function to combine filtered results is further described in "De-interlacing of video data" by G. de Haan and E.B. Bellers (IEEE Transactions on Consumer Electronics, vol. 43, no. 3, pp. 819-825, 1997), which is incorporated herein by reference in its entirety.
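Equation 37 and the merge step can be sketched as below. Because the exact forms of equations 36 and 38 are not reproduced here, the convex-combination form and the treatment of k_1 as a precomputed weight are assumptions in the general style of the de Haan and Bellers reference, not the patent's exact formula.

```python
def clip(lo, hi, a):
    """Equation 37: clamp a to the interval [lo, hi]."""
    return lo if a < lo else hi if a > hi else a

def combine_pixel(wmed, mc, k1):
    """One output pixel of the combiner: a clipped-weight blend of the Wmed
    and motion-compensated interpolation results (assumed form of eq. 36)."""
    k = clip(0.0, 1.0, k1)
    return (1.0 - k) * wmed + k * mc
```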
In some aspects, the combiner 2062 can be configured to try to maintain the following equation to achieve a high PSNR and robust results:
It is possible to decouple the deinterlacing prediction schemes comprising inter-field interpolation and intra-field interpolation with the Wmed + MC deinterlacing scheme. In other words, the spatio-temporal Wmed filtering can be used mainly for intra-field interpolation purposes, while inter-field interpolation can be performed during motion compensation. This reduces the peak signal-to-noise ratio of the Wmed result, but the visual quality after motion compensation is applied is more pleasing, because bad pixels arising from inaccurate inter-field prediction-mode decisions will be removed from the Wmed filtering.
Chroma processing can be consistent with the collocated luma processing. In terms of motion-map generation, the motion level of a chroma pixel is obtained by observing the motion levels of its four collocated luma pixels. The operation can be based on voting (the chroma motion level borrows the dominant luma motion level). However, the following conservative approach is proposed. If any of the four luma pixels has a fast motion level, the chroma motion level shall be fast motion; otherwise, if any of the four luma pixels has a slow motion level, the chroma motion level shall be slow motion; otherwise the chroma motion level is static. The conservative approach may not achieve the highest PSNR, but it avoids the risk of using INTER prediction wherever there is ambiguity in the chroma motion level.
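The conservative chroma rule reduces to a short function; the level names are assumed labels.

```python
def chroma_motion_level(luma_levels):
    """Conservative rule: a chroma pixel inherits the most cautious of the
    motion levels of its four collocated luma pixels."""
    if 'fast' in luma_levels:
        return 'fast'
    if 'slow' in luma_levels:
        return 'slow'
    return 'static'
```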
A multimedia data sequence was deinterlaced using the Wmed algorithm alone, described herein, and using the combined Wmed and motion-compensation algorithm described herein. The identical sequence was also deinterlaced using a pixel-blending (or averaging) algorithm and a "no-deinterlacing" case in which the fields were merely combined without any interpolation or blending. The resulting frames were analyzed to determine the PSNR, shown in the following table:
Even though adding MC to the Wmed deinterlacing yields only a marginal PSNR improvement, the visual quality of the deinterlaced images produced by combining the Wmed and MC interpolation results is more pleasing, because, for the reasons set forth above, the combination of Wmed results and MC results suppresses aliasing and noise between the even and odd fields.
In some resampling aspects, a polyphase resampler is implemented for image-size resizing. In one example of downsampling, the ratio between the original image and the resized image can be p/q, where p and q are relatively prime integers. The total number of phases is p. For resizing factors of around 0.5, the cutoff frequency of the polyphase filter is, in some aspects, 0.6. The cutoff frequency does not exactly match the resizing ratio, which boosts the high-frequency response of the resized sequence. This unavoidably permits some aliasing. However, it is well known that, compared with blurry and alias-free images, human eyes prefer images that are sharp but contain some aliasing.
Figure 42 illustrates an example of polyphase resampling, showing the phases when the resizing ratio is 3/4. The cutoff frequency illustrated in Figure 42 is also 3/4. The original pixels are illustrated with vertical axes in Figure 42, and a sinc function centered on those axes is also drawn to represent the filter shape. Because the cutoff frequency is chosen identical to the resampling ratio, the zeros of the sinc function overlap the pixel positions after resizing, as illustrated by the crosses in Figure 42. To obtain a pixel value after resizing, the contributions from the original pixels are summed as shown in the equation below, where f_c is the cutoff frequency. This 1-D polyphase filter can be applied to both the horizontal dimension and the vertical dimension.
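A minimal sketch of 1-D polyphase (sinc-weighted) resampling for a p/q resize follows; the finite filter window, the normalization by the tap sum, and the output-length convention are assumptions.

```python
import math

def sinc(t):
    """Normalized sinc function."""
    return 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)

def resample_1d(pixels, p, q, fc):
    """Resize a 1-D pixel row by the ratio p/q (e.g., p=3, q=4) using a
    sinc-shaped filter with cutoff frequency fc."""
    out = []
    for i in range(len(pixels) * p // q):
        x = i * q / p                    # output sample position in input coordinates
        lo, hi = max(0, int(x) - 4), min(len(pixels), int(x) + 5)
        taps = [fc * sinc(fc * (x - j)) for j in range(lo, hi)]
        norm = sum(taps) or 1.0
        out.append(sum(t * pixels[j] for t, j in zip(taps, range(lo, hi))) / norm)
    return out
```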
Another aspect takes account of resampling (resizing) and overscan. In an NTSC television signal, an image has 486 scan lines, and in digital video each scan line can have 720 pixels. However, because of mismatches between this size and the screen format, not all of the complete image is visible on the television. The invisible part of the image is called the overscan.
To help broadcasters place useful information in the area of the television visible to as many receivers as possible, the Society of Motion Picture and Television Engineers (SMPTE) defined specific sizes of the action frame, called the safe action area and the safe title area. See SMPTE Recommended Practice RP 27.3-1989 on Specifications for Safe Action and Safe Title Areas Test Pattern for Television Systems. SMPTE defines the safe action area as the area in which "all significant action must take place" and the safe title area as the area in which "all useful information can be confined to ensure visibility on the majority of home television receivers." For example, as illustrated in Figure 43, the safe action area 4310 occupies the center 90% of the screen, leaving a 5% border all around, while the safe title area 4305 occupies the center 80% of the screen, leaving a 10% border.
Referring now to Figure 44, because the safe title area is so small, some broadcasts include text in the safe action area in order to add more content to the image; this text lies inside the white rectangular window 4415. Black borders are usually visible in the overscan. For example, in Figure 44, black borders appear at the upper side 4420 and the lower side 4425 of the image. These black borders can be removed in the overscan, because H.264 video uses border extension in motion estimation, and extended black borders can increase the residual. The border can conservatively be cut by 2% before resizing, and the filters for resizing can be generated accordingly. Truncation to remove the overscan is therefore performed before polyphase downsampling.
Deblocking/deringing
In one example of deblocking processing, a deblocking filter can be applied to all the 4 × 4 block edges of a frame, except edges at the boundary of the frame and any edges for which the deblocking filter process is disabled. This filtering process is performed on a macroblock basis after completion of the frame-construction process, with all macroblocks in a frame processed in order of increasing macroblock address. For each macroblock, vertical edges are filtered first, from left to right, and then horizontal edges are filtered from top to bottom. For both the horizontal and the vertical direction, the luma deblocking filter process is performed on four 16-sample edges, and the deblocking filter process for each chroma component is performed on two 8-sample edges, as shown in Figure 39. Sample values above and to the left of the current macroblock that may already have been modified by the deblocking process on previous macroblocks are used as input to the deblocking filter process on the current macroblock and may be further modified during the filtering of the current macroblock. Sample values modified during the filtering of vertical edges can be used as input for the filtering of the horizontal edges of the same macroblock. A deblocking process can be invoked separately for the luma and chroma components.
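The edge-visiting order described above can be sketched as follows; `filter_edge` is a hypothetical callback standing in for the actual edge filter, and only the four-edge luma case is shown.

```python
def deblock_frame(frame, mb_width, mb_height, filter_edge):
    """Visit macroblocks in order of increasing address; per macroblock,
    filter vertical edges left to right, then horizontal edges top to bottom."""
    for addr in range(mb_width * mb_height):
        mbx, mby = addr % mb_width, addr // mb_width
        for e in range(4):                            # four 16-sample luma edges
            filter_edge(frame, mbx, mby, 'vertical', e)
        for e in range(4):
            filter_edge(frame, mbx, mby, 'horizontal', e)
```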
In one example of deringing processing, a 2-D filter can be applied adaptively to smooth the areas near edges. Edge pixels undergo little or no filtering in order to avoid blurring.
GOP partitioner
Illustrative examples of the processing that can be included in the GOP partitioner, including bandwidth map generation, shot detection, and adaptive GOP partitioning, are described below.
Bandwidth map generation
Human visual quality V can be a function of both encoding complexity C and allocated bits B (also called bandwidth). Figure 29 is a graph illustrating this relationship. Note that, from the viewpoint of human vision, the encoding-complexity metric C considers spatial and temporal frequencies: for distortions to which human eyes are more sensitive, the complexity value is correspondingly higher. It can generally be assumed that V is monotonically decreasing in C and monotonically increasing in B.
To achieve constant visual quality, a bandwidth B_i is assigned to the i-th object (frame or macroblock) to be encoded that satisfies the criteria expressed in the two equations immediately below:
B_i = B(C_i, V) (42)
B = Σ_i B_i (43)
In the two equations above, C_i is the encoding complexity of the i-th object, B is the total available bandwidth, and V is the visual quality achieved for an object.
Human visual quality is difficult to express as an equation, so the equation set above is not precisely defined. However, if it is assumed that the 3-D model is continuous in all its variables, the bandwidth ratio (B_i/B) can be treated as unchanged within a neighborhood of (C, V). The bandwidth ratio β_i is defined in the equation shown below:
β_i = B_i/B (44)
The bit allocation can then be defined as expressed in the following equation:
β_i = β(C_i), where (C_i, V) ∈ δ(C_0, V_0), (45)
where δ indicates the "neighborhood".
The encoding complexity is affected by human visual sensitivity, both spatial and temporal. Girod's human vision model is an example of a model that can be used to define the spatial complexity. This model considers the local spatial frequency and the ambient lighting. The resulting metric is called D_csat. At the preprocessing point in the process, it is not yet known whether a picture will be intra-coded or inter-coded, so bandwidth ratios for both are generated. Bits are allocated according to the ratios between the β_INTRA values of the different video objects. For intra-coded pictures, the bandwidth ratio is expressed in the following equation:
β_INTRA = β_0INTRA · log10(1 + α_INTRA · Y² · D_csat) (46)
In the equation above, Y is the average luminance component of a macroblock, α_INTRA is a weighting factor for the luminance-squared term and the D_csat term that follows it, and β_0INTRA is a normalization factor that guarantees Σ_i β_i = 1. For example, a value of α_INTRA = 4 achieves good visual quality. Content information (e.g., a content classification) can be used to set α_INTRA to a value that corresponds to a desired visual quality level for the particular content of the video. In one example, if the video content comprises a "talking head" news broadcast, the visual quality level may be set lower, because the video frames, or the displayable portion thereof, may be deemed less important than the audio portion, and fewer bits can be allocated to encode the data. In another example, if the video content comprises a sporting event, the content information can be used to set α_INTRA to a value corresponding to a higher visual quality level, because the displayed images may be more important to the viewer, and accordingly more bits can be allocated to encode the data.
To understand this relationship, note that bandwidth is allocated logarithmically with encoding complexity. The luminance-squared term Y² reflects the fact that coefficients with larger magnitudes use more bits to encode. To prevent the logarithm from taking negative values, a one is added to the term inside the parentheses. Logarithms with other bases could also be used.
The temporal complexity is determined by a measure of a frame-difference metric, which measures the difference between two consecutive frames, taking into account the amount of motion (e.g., motion vectors) together with a frame-difference measure such as the sum of absolute differences (SAD).
Bit allocation for inter-coded pictures can consider spatial as well as temporal complexity. This is expressed below:
β_INTER = β_0INTER · log10(1 + α_INTER · SSD · D_csat · exp(−γ‖MV_P + MV_N‖²)) (47)
In the equation above, MV_P and MV_N are the forward and backward motion vectors for the current MB. Note that Y² in the intra-coded bandwidth formula is replaced by the sum of squared differences (SSD). To understand the role of ‖MV_P + MV_N‖² in the equation above, note the following characteristics of the human visual system: areas undergoing smooth, predictable motion (small ‖MV_P + MV_N‖²) attract attention and can be tracked by the eye, and typically cannot tolerate any more distortion than stationary regions can, whereas areas undergoing fast or unpredictable motion (large ‖MV_P + MV_N‖²) cannot be tracked and can tolerate significant quantization. Experiments show that α_INTER = 1 and γ = 0.001 achieve good visual quality.
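Equations 46 and 47 compute directly; in the sketch below, the normalization factors β_0INTRA and β_0INTER default to one, which is an assumption for illustration.

```python
import math

def beta_intra(y_mean, d_csat, alpha_intra=4.0, beta0=1.0):
    """Equation 46: intra-coded bandwidth ratio (beta0 stands in for the
    normalization factor)."""
    return beta0 * math.log10(1.0 + alpha_intra * y_mean ** 2 * d_csat)

def beta_inter(ssd, d_csat, mvp, mvn, alpha_inter=1.0, gamma=0.001, beta0=1.0):
    """Equation 47: inter-coded bandwidth ratio; mvp and mvn are the (x, y)
    forward and backward motion vectors of the current macroblock."""
    sx, sy = mvp[0] + mvn[0], mvp[1] + mvn[1]
    return beta0 * math.log10(
        1.0 + alpha_inter * ssd * d_csat * math.exp(-gamma * (sx * sx + sy * sy)))
```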
Shot detection
An illustrative example of shot detection is described below. The components and processes can be included in the GOP partitioner 412 (Fig. 4).
The motion compensator can be configured to determine bidirectional motion information about frames in the video. The motion compensator can also be configured to determine one or more difference metrics, for example, the sum of absolute differences (SAD) or the sum of squared differences (SSD), and to calculate other information, including luminance information for one or more frames (e.g., macroblock (MB) luminance averages or differences), a luminance histogram difference, and a frame-difference metric, examples of which are described with reference to equations 1 to 3. The shot classifier can be configured to use the information determined by the motion compensator to classify frames in the video into two or more categories of "shots". The encoder is configured to adaptively encode the plurality of frames based on the shot classifications. The motion compensator, shot classifier, and encoder are described below with reference to equations 1 to 10.
Figure 28 is a block diagram of a preprocessor 202 according to some aspects, the preprocessor 202 comprising a processor 2831 configured for shot detection and other preprocessing operations. A digital video source can be provided by a source external to the preprocessor 202, as shown in Figure 4, and communicated to a communication module 2836 in the preprocessor 202. The preprocessor 202 contains a storage medium 2825 that communicates with the processor 2831, both of which communicate with the communication module 2836. The processor 2831 includes a motion compensator 2832 operable to generate motion information, a shot classifier 2833 for classifying shots in frames of the video data, and other modules 2834 for performing other preprocessing tasks, as described herein. The motion compensator, shot classifier, and other modules can contain processes similar to the corresponding modules in Fig. 4, and can process video to determine the information described below. Specifically, the processor 2831 can have a configuration to: obtain metrics indicative of differences between adjacent frames of a plurality of video frames, the metrics comprising bidirectional motion information and luminance information; determine shot changes in the plurality of video frames based on the metrics; and adaptively encode the plurality of frames based on the shot changes. In some aspects, the metrics can be calculated by a device or process external to the processor 2831, which may also be external to the preprocessor 202, and communicated to the processor 2831 directly or indirectly via another device or memory. The metrics can also be calculated by the processor 2831, for example by the motion compensator 2832.
The preprocessor 202 provides video and metadata for further processing, encoding, and transmission to other devices, for example, terminal 6 (Fig. 1). In some aspects, the encoded video can be scalable multi-layer encoded video comprising a base layer and an enhancement layer. Scalable layer encoding is further described in the co-pending U.S. patent application entitled "Scalable Video Coding With Two Layer Encoding And Single Layer Decoding" [attorney docket 050078], which is owned by the assignee of the present invention and incorporated herein by reference in its entirety.
The various illustrative logical blocks, components, modules, and circuits described in connection with Figure 28 and the other examples and figures disclosed herein may, in some aspects, be implemented or performed with: a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor, such as the processor shown in Figure 28, may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
Video encoding usually operates on a structured group of pictures (GOP). A GOP normally begins with an intra-coded frame (I frame), followed by a series of P (predicted) or B (bi-directional) frames. Typically, an I frame can store all the data needed to display the frame, a B frame depends on data in the preceding and following frames (e.g., it contains only data that has changed from the preceding frame or that differs from data in the next frame), and a P frame contains data that has changed from the preceding frame.
In common use, I frames are interspersed with P frames and B frames in encoded video. In terms of size (e.g., the number of bits used to encode the frame), I frames are typically much larger than P frames, which in turn are larger than B frames. For efficient encoding, transmission, and decoding, the GOP length should be long enough to reduce the effective loss from large I frames, and short enough to prevent mismatch between the encoder and the decoder, or channel impairment. In addition, macroblocks (MBs) in P frames can be intra-coded for the same reasons.
Scene change detection can be used by a video encoder to determine a proper GOP length and to insert I frames based on that GOP length, rather than inserting I frames at fixed intervals. In a practical streaming video system, the communication channel is usually impaired by bit errors or packet losses. Where to place I frames or I MBs can significantly affect decoded video quality and the viewing experience. One encoding scheme is to use intra-coded frames for pictures or portions of pictures that have significant change from collocated previous pictures or picture portions. Normally these regions cannot be predicted effectively and efficiently with motion estimation, and encoding can be done more efficiently if such regions are exempted from inter-frame coding techniques (e.g., encoding using B frames and P frames). In the context of channel impairment, those regions are likely to suffer from error propagation, which can be reduced or eliminated (or nearly so) by intra-frame encoding.
Portions of the GOP video can be classified into two or more classes, where each region can have different intra-frame encoding criteria that may depend on the particular implementation. As an example, the video can be classified into three classes: abrupt scene changes, cross-fades and other slow scene changes, and camera flashlights. Abrupt scene changes include frames that are significantly different from the previous frame, usually caused by a camera operation. Since the content of such frames is different from the content of the previous frame, abrupt scene change frames should be encoded as I frames. Cross-fades and other slow scene changes include slow switching of scenes, usually caused by computer processing of camera shots. The gradual blending of two different scenes may look more pleasing to the human eye, but it poses a challenge to video coding. Motion compensation cannot reduce the bit rate of those frames effectively, and more intra MBs can be updated for those frames.
Camera flashlights, or camera flash events, occur when the content of a frame includes camera flashes. Such flashes are relatively short in duration (for example, one frame) and extremely bright, so that the pixels in a frame depicting a flash exhibit unusually high luminance relative to the corresponding areas of adjacent frames. Camera flashlights shift the luminance of a picture suddenly and swiftly. The duration of a camera flashlight is usually shorter than the temporal masking duration of the human visual system (HVS), which is typically defined as 44 ms. Human eyes are not sensitive to the quality of these short bursts of brightness, and therefore the frames can be encoded coarsely. Because flashlight frames cannot be handled effectively with motion compensation and are poor prediction candidates for future frames, coarse encoding of these frames does not reduce the encoding efficiency of future frames. Scenes classified as flashlights should not be used to predict other frames because of the "artificial" high luminance, and other frames cannot effectively be used to predict these frames for the same reason. Once identified, these frames can be taken out, because they can require a relatively large amount of processing. One option is to remove the camera flashlight frames and encode a DC coefficient in their place; such a solution is simple, computationally fast, and saves many bits.
When any of the above types of frames is detected, a shot event is declared. Shot detection not only helps improve encoding quality, it can also aid in video content search and indexing. One aspect of a scene detection process is described below.
Figure 30 illustrates a process 3000 that operates on a GOP and can be used in some aspects to encode video based on shot detection in video frames, where portions (or subprocesses) of process 3000 are described and illustrated with reference to Figures 30 to 40. The processor 2831 can be configured to incorporate process 3000. After process 3000 starts, it proceeds to block 3042, where metrics (information) are obtained for the video frames, the metrics including information indicative of the difference between adjacent frames. The metrics include bi-directional motion information and luminance-based information that are subsequently used to determine changes that occurred between adjacent frames, and that can be used for shot classification. The metrics can be obtained from another device or process, or they can be calculated by, for example, the processor 2831. An illustrative example of metrics generation is described with reference to process A in Figure 31.
Process 3000 proceeds to block 3044, where shot changes in the video are determined based on the metrics. A video frame can be classified into two or more classes of the type of shot contained in the frame, for example, an abrupt scene change, a slowly changing scene, or a scene containing high luminance values (camera flashes). Certain implementations of encoding may necessitate other classes. An illustrative example of shot classification is described with reference to process B in Figure 32, and in more detail with reference to processes D, E, and F in Figures 34 to 36.
Once a frame is classified, process 3000 proceeds to block 3046, where the frame can be encoded, or designated for encoding, using the shot classification results. Such results can influence whether to encode the frame as an intra-coded frame or as a predictive frame (e.g., a P frame or a B frame). Process C in Figure 33 shows an example of an encoding scheme that uses the shot results.
Figure 31 illustrates an example of a process for obtaining metrics of the video, and illustrates certain steps that occur in block 3042 of Figure 30. Still referring to Figure 31, in block 3152, process A obtains or determines bi-directional motion estimation and compensation information of the video. The motion compensator 2832 of Figure 28 can be configured to perform bi-directional motion estimation on the frames and determine motion compensation information that can be used for subsequent shot classification. Process A proceeds to block 3154, where it generates luminance information, including a luminance difference histogram, for a current or selected frame and one or more adjacent frames. Finally, process A continues to block 3156, where a metric indicative of the shot contained in the frame is calculated. One such metric is a frame difference metric, two examples of which are shown in equations 4 and 10. Illustrative examples of determining the motion information, luminance information, and frame difference metric are described below.
Motion compensation
To perform bi-directional motion estimation/compensation, a video sequence can be preprocessed with a bi-directional motion compensator that matches every 8×8 block of the current frame with blocks in the two most adjacent neighboring frames (one in the past and one in the future). The motion compensator produces motion vectors and difference metrics for every block. Figure 37 illustrates this concept, showing an example of matching pixels of a current frame C with pixels of a past frame P and a future (or next) frame N, and depicting the motion vectors of the matched pixels (past motion vector MV_P and future motion vector MV_N). A brief description of an illustrative aspect of bi-directional motion vector generation and related encoding follows.
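As an illustration of this step, the following Python sketch performs an exhaustive block-matching search against both the past and the future frame. The 8×8 block size follows the text above, while the search radius and function names are assumptions made for the example, not taken from the patent.

```python
import numpy as np

def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized blocks.
    return int(np.abs(block_a.astype(int) - block_b.astype(int)).sum())

def best_match(block, ref, top, left, radius=8):
    # Exhaustive search of `ref` around (top, left); returns (SAD, motion vector).
    h, w = block.shape
    best = (float("inf"), (0, 0))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= ref.shape[0] - h and 0 <= x <= ref.shape[1] - w:
                cand = sad(block, ref[y:y + h, x:x + w])
                best = min(best, (cand, (dy, dx)))
    return best

def bidirectional_motion(current, past, future, block=8):
    # For every 8x8 block of the current frame, find SAD_P/MV_P against
    # the past frame and SAD_N/MV_N against the future frame.
    results = []
    for top in range(0, current.shape[0] - block + 1, block):
        for left in range(0, current.shape[1] - block + 1, block):
            blk = current[top:top + block, left:left + block]
            sad_p, mv_p = best_match(blk, past, top, left)
            sad_n, mv_n = best_match(blk, future, top, left)
            results.append((sad_p, mv_p, sad_n, mv_n))
    return results
```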
Figure 40 illustrates an example of motion vector determination and predictive frame encoding in, for example, MPEG-4. The process depicted in Figure 40 is a more detailed illustration of an example process that can take place in block 3152 of Figure 31. In Figure 40, current picture 4034 is made up of 5×5 macroblocks, where the number of macroblocks in this example is arbitrary. A macroblock is made up of 16×16 pixels. A pixel can be defined by an 8-bit luminance value (Y) and two 8-bit chrominance values (Cr and Cb).
In MPEG, the Y, Cr, and Cb components can be stored in a 4:2:0 format, where the Cr and Cb components are downsampled by a factor of 2 in the X and Y directions. Hence, each macroblock would consist of 256 Y components, 64 Cr components, and 64 Cb components. Macroblock 4036 of current picture 4034 is predicted from reference picture 4032 at a different time point than that of current picture 4034. A search is made in reference picture 4032 to locate best matching macroblock 4038, which is closest in terms of Y, Cr, and Cb values to current macroblock 4036 being encoded. The location of best matching macroblock 4038 in reference picture 4032 is encoded in motion vector 4040. Reference picture 4032 can be an I frame or P frame that a decoder will have reconstructed prior to the construction of current picture 4034. Best matching macroblock 4038 is subtracted from current macroblock 4036 (a difference for each of the Y, Cr, and Cb components is calculated), resulting in residual error 4042. Residual error 4042 is encoded with a two-dimensional (2D) discrete cosine transform (DCT) 4044 and then quantized 4046. Quantization 4046 can be performed to provide spatial compression by, for example, allotting fewer bits to the high-frequency coefficients and more bits to the low-frequency coefficients. The quantized coefficients of residual error 4042, together with motion vector 4040 and identifying information for reference picture 4032, are encoded information representing current macroblock 4036. The encoded information can be stored in memory for future use, operated on for purposes of, for example, error correction or image enhancement, or transmitted over network 140.
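The following sketch condenses this macroblock pipeline: motion search, residual formation, 2-D DCT, and quantization. It reuses `best_match` from the previous sketch; the uniform quantization step and the use of SciPy's DCT are assumptions made for the example, as a real MPEG-4 encoder uses standardized quantization matrices and entropy coding.

```python
import numpy as np
from scipy.fftpack import dct

def encode_macroblock(current_y, ref_y, top, left, qstep=16):
    # Locate the best matching 16x16 block in the reference picture
    # (reusing best_match from the previous sketch), form the residual,
    # transform it with a 2-D DCT, and quantize the coefficients.
    block = current_y[top:top + 16, left:left + 16]
    _, (dy, dx) = best_match(block, ref_y, top, left)
    matched = ref_y[top + dy:top + dy + 16, left + dx:left + dx + 16]
    residual = block.astype(int) - matched.astype(int)
    coeffs = dct(dct(residual.astype(float), axis=0, norm="ortho"),
                 axis=1, norm="ortho")
    quantized = np.round(coeffs / qstep).astype(int)  # coarse uniform quantizer
    return quantized, (dy, dx)
```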
The encoded quantized coefficients of residual error 4042, along with encoded motion vector 4040, can be used to reconstruct current macroblock 4036 in the encoder, for use as part of a reference frame for subsequent motion estimation and compensation. The encoder can emulate the procedures of a decoder for this P-frame reconstruction. Emulating the decoder results in both the encoder and the decoder working with the same reference picture. The reconstruction process, whether done in an encoder for further inter-coding or in a decoder, is presented here. Reconstruction of a P frame can be started after the reference frame (or the picture or frame portion being referenced) is reconstructed. The encoded quantized coefficients are dequantized 4050 and then a 2-D inverse discrete cosine transform (IDCT) 4052 is performed, resulting in a decoded or reconstructed residual error 4054. Encoded motion vector 4040 is decoded and used to locate the already reconstructed best matching macroblock 4056 in the already reconstructed reference picture 4032. Reconstructed residual error 4054 is then added to reconstructed best matching macroblock 4056 to form reconstructed macroblock 4058. Reconstructed macroblock 4058 can be stored in memory, displayed independently or in a picture with other reconstructed macroblocks, or processed further for image enhancement.
Encoding using B frames (or any section coded with bi-directional prediction) can exploit the temporal redundancy between a region in a current picture and a best matching prediction region in a previous picture and a best matching prediction region in a subsequent picture. The subsequent best matching prediction region and the previous best matching prediction region are combined to form a combined bi-directional predicted region. The difference between the current picture region and the best matching combined bi-directional prediction region is a residual error (or prediction error). The location of the best matching prediction region in the subsequent reference picture and the location of the best matching prediction region in the previous reference picture can be encoded in two motion vectors.
Luminance Histogram Difference
The motion compensator can produce a difference metric for every block. The difference metric can be a sum of squared differences (SSD) or a sum of absolute differences (SAD). Without loss of generality, SAD is used here as an example.
For every frame, a SAD ratio is calculated as follows:

    γ = (ε + SAD_P) / (ε + SAD_N)

where SAD_P and SAD_N are the sums of absolute differences of the forward and backward difference metrics, respectively. Note that the denominator contains a small positive number ε to prevent a "divide-by-zero" error. The numerator also contains an ε to balance the effect of the one in the denominator. For example, if the previous frame, the current frame, and the next frame are identical, motion search should yield SAD_P = SAD_N = 0. In this case, the above calculation yields γ = 1 rather than 0 or infinity.
A luminance histogram can be calculated for every frame. Multimedia images typically have a luminance depth (e.g., number of "bins") of eight bits. According to some aspects, the luminance depth used for calculating the luminance histogram can be set to 16 to obtain the histogram. In other aspects, the luminance depth can be set to an appropriate number, which may depend upon the type of data being processed, the computational power available, or other predetermined criteria. In some aspects, the luminance depth can be set dynamically based on a calculated or received metric (e.g., the content of the data).
Equation 49 illustrates one example of calculating the luminance histogram difference (lambda):

    λ = (Σ_{i=1}^{16} |N_Pi − N_Ci|) / N    (49)

where N_Pi is the number of blocks in the i-th bin for the previous frame, N_Ci is the number of blocks in the i-th bin for the current frame, and N is the total number of blocks in a frame. If the luminance histograms of the previous frame and the current frame are completely different (or disjoint), then λ = 2.
The frame difference metric D, discussed in reference to block 56 of Fig. 5, can be calculated as shown in equation 50:

    D = γ_C / γ_P + A·λ·(2λ + 1)    (50)

where A is a constant chosen by application, γ_C is the SAD ratio of the current frame, and γ_P is the SAD ratio of the previous frame.
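Putting the pieces together, the sketch below computes γ, λ, and the frame difference metric D from per-frame SAD totals and 16-bin block histograms. The function names and the default choice of A are illustrative assumptions.

```python
EPSILON = 1.0  # small positive constant from the SAD-ratio definition

def sad_ratio(sad_p, sad_n, eps=EPSILON):
    # SAD ratio defined above: gamma = (eps + SAD_P) / (eps + SAD_N).
    return (eps + sad_p) / (eps + sad_n)

def histogram_difference(prev_hist, cur_hist):
    # Equation 49: lambda = sum_i |N_Pi - N_Ci| / N over the 16 bins.
    total_blocks = sum(cur_hist)
    return sum(abs(p - c) for p, c in zip(prev_hist, cur_hist)) / total_blocks

def frame_difference(gamma_c, gamma_p, lam, a=1.0):
    # Equation 50: D = gamma_C / gamma_P + A * lambda * (2 * lambda + 1).
    return gamma_c / gamma_p + a * lam * (2 * lam + 1)
```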
Figure 32 illustrates an example of a process B that determines three categories of shot (or scene) changes using metrics obtained or determined for the video, and illustrates certain steps occurring in one aspect of block 3044 of Figure 30. Referring again to Figure 32, in block 3262, process B first determines whether the frame meets criteria to be designated an abrupt scene change. Process D in Figure 34 illustrates an example of this determination. Process B proceeds to block 3264, where it determines whether the frame is part of a slowly changing scene. Process E in Figure 35 illustrates an example of determining a slowly changing scene. Finally, at block 3266, process B determines whether the frame contains camera flashes, in other words, large luminance values differing from the previous frame. Process F in Figure 36 illustrates an example of determining a frame containing camera flashes. An illustrative example of these processes is described below.
Abrupt scene changes
Figure 34 is a flow chart illustrating a process of determining abrupt scene changes, and further elaborates certain steps that can occur in some aspects of block 3262 of Figure 32. At block 3482, it is checked whether the frame difference metric D meets the criterion shown in equation 51:

    D = γ_C / γ_P + A·λ·(2λ + 1) ≥ T1    (51)

where A is a constant chosen by application and T1 is a threshold. If the criterion is met, at block 3484 process D designates the frame as an abrupt scene change and, in this example, no further shot classification is needed.
In one example, simulations show that setting A = 1 and T1 = 5 achieves good detection performance. If the current frame is an abrupt scene change frame, γ_C should be large and γ_P should be small. The ratio γ_C/γ_P can be used instead of γ_C alone so that the metric is normalized to the activity level of the context.
Note that the above criterion uses the luminance histogram difference (λ) in a non-linear way. Figure 39 illustrates that λ·(2λ + 1) is a convex function. When λ is small (e.g., close to zero), there is little pre-emphasis. The larger λ becomes, the more emphasis the function applies. With this pre-emphasis, an abrupt scene change is detected for any λ larger than 1.4 if the threshold T1 is set to 5.
Cross-fades and slow scene changes
Figure 35 further illustrates details of certain aspects that can occur in block 3264 of Figure 32. Referring to Figure 35, at block 3592, process E determines whether the frame is part of a series of frames depicting a slow scene change. Process E determines that the current frame is a cross-fade or other slow scene change if the frame difference metric D is less than the first threshold T1 and greater than or equal to a second threshold T2, as illustrated in equation 52:

    T2 ≤ D < T1    (52)

for a certain number of continuous frames, where T1 is the same threshold used above and T2 is another threshold. Because of possible differences between implementations, the exact values of T1 and T2 are normally determined by experimentation. If the criterion is met, at block 3594 process E classifies the frame as part of a slowly changing scene, and shot classification of the selected frame ends.
Camera flashlight events
Process F, shown in Figure 36, is an example of a process that can determine whether the current frame comprises camera flashlights. In this illustrative aspect, luminance histogram statistics are used to determine whether the current frame comprises camera flashlights. As shown at block 3602, process F first determines whether a camera flash event is in the selected frame by determining whether the luminance of the current frame is greater than the luminance of the previous frame and the luminance of the next frame. If not, the frame is not a camera flash event; but if so, the frame may be one. At block 3604, process F determines whether the forward difference metric and the backward difference metric are greater than a threshold T4; if both conditions are satisfied, at block 3606 process F classifies the current frame as having camera flashlights. In one example, at block 3602, process F determines whether the average luminance of the current frame minus the average luminance of the previous frame equals or exceeds a threshold T3, and whether the average luminance of the current frame minus the average luminance of the next frame is greater than or equal to the threshold T3, as shown in equations 53 and 54:

    Ȳ_C − Ȳ_P ≥ T3    (53)
    Ȳ_C − Ȳ_N ≥ T3    (54)

If the criteria are not satisfied, the current frame is not classified as comprising camera flashlights and process F returns. If the criteria are satisfied, process F proceeds to block 3604, where it determines whether the forward difference metric SAD_P and the backward difference metric SAD_N are greater than a certain threshold T4, as illustrated in equations 55 and 56:

    SAD_P ≥ T4    (55)
    SAD_N ≥ T4    (56)

where Ȳ_C is the average luminance of the current frame, Ȳ_P is the average luminance of the previous frame, Ȳ_N is the average luminance of the next frame, and SAD_P and SAD_N are the forward and backward difference metrics associated with the current frame. If the criteria are not satisfied, process F returns.
Because implementations of the described processes can result in different operating parameters, including the thresholds, the value of T3 is normally determined by experimentation. SAD values are included in the determination because a camera flash typically lasts only one frame and, due to the luminance difference, that frame cannot be predicted well using motion compensation from either the forward or the backward direction.
In some aspects, one or more of the thresholds T1, T2, T3, and T4 are predetermined, and such values are incorporated into the shot classifier in the encoding device. Typically, such thresholds are selected through testing of a particular implementation of shot detection. In some aspects, one or more of the thresholds T1, T2, T3, and T4 can be set (e.g., dynamically) during processing, based on using information (e.g., metadata) supplied to the shot classifier or based on information calculated by the shot classifier itself.
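The three tests above can be summarized in the following sketch, which classifies a frame given its metrics. The threshold values and the `FrameMetrics` container are assumptions for illustration, since the patent leaves T1 through T4 to experimentation.

```python
from dataclasses import dataclass

@dataclass
class FrameMetrics:
    d: float        # frame difference metric D for the current frame
    y_c: float      # average luminance of the current frame
    y_p: float      # average luminance of the previous frame
    y_n: float      # average luminance of the next frame
    sad_p: float    # forward difference metric
    sad_n: float    # backward difference metric

def classify_shot(m, t1=5.0, t2=1.0, t3=10.0, t4=100.0):
    # Equation 51: abrupt scene change.
    if m.d >= t1:
        return "abrupt"
    # Equation 52: cross-fade / slow scene change (the patent additionally
    # requires this condition to hold over a run of consecutive frames).
    if t2 <= m.d < t1:
        return "slow"
    # Equations 53-56: camera flashlight.
    if (m.y_c - m.y_p >= t3 and m.y_c - m.y_n >= t3
            and m.sad_p >= t4 and m.sad_n >= t4):
        return "flash"
    return "ordinary"
```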
Referring now to Figure 33, Figure 33 shows a process C that determines encoding parameters for the video, or for encoding the video, based on the shot classification of a selected frame. At block 3370, process C determines whether the selected frame was classified as an abrupt scene change. If so, at block 3371 the current frame is classified as an abrupt scene change, the frame can be encoded as an I frame, and a GOP boundary can be determined. If not, process C proceeds to block 3372; if the current frame is classified as a portion of a slowly changing scene, then at block 3373 the current frame, and the other frames in the slowly changing scene, can be encoded as predictive frames (e.g., P frames or B frames). Process C then proceeds to block 3374, where it checks whether the current frame was classified as a flashlight scene comprising camera flashes. If so, at block 3375 the frame can be identified for special processing, for example, removal, replication of the previous frame, or encoding of a particular coefficient for the frame. If not, no classification of the current frame was made, and the selected frame can be encoded in accordance with other criteria, encoded as an I frame, or dropped. Process C can be implemented in an encoder.
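A minimal sketch of this decision logic follows, mapping the classification from the previous sketch onto encoding actions. The action names are placeholders, not terms from the patent.

```python
def encoding_decision(classification):
    # Map shot classification onto an encoding action (process C style).
    if classification == "abrupt":
        return "encode_I_frame_and_start_new_GOP"
    if classification == "slow":
        return "encode_predictive_frame"   # P or B frame
    if classification == "flash":
        return "special_processing"        # e.g. drop or replicate previous frame
    return "encode_by_other_criteria"
```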
In the aspects described above, the amount of difference between the frame to be compressed and its two adjacent frames is indicated by the frame difference metric D. If a significant amount of one-way luminance change is detected, it signifies a cross-fade effect in the frame. The more pronounced the cross-fade, the more gain can be achieved by using B frames. In some aspects, a modified frame difference metric is used, as shown in equation 57:

    D1 = (1 − α + α·|d_P − d_N| / (d_P + d_N)) · D,  if Y_P − Δ ≥ Y_C ≥ Y_N + Δ or Y_P + Δ ≤ Y_C ≤ Y_N − Δ
    D1 = D,  otherwise    (57)

where d_P = |Y_C − Y_P| and d_N = |Y_C − Y_N| are, respectively, the luminance difference between the current frame and the previous frame and the luminance difference between the current frame and the next frame, Δ represents a constant that can be determined by normal experimentation (as it can depend on the implementation), and α is a weighting variable having a value between 0 and 1.
The modified frame difference metric D1 differs from the original frame difference metric D only if a consistent trend of luminance change is observed and the change strength is large enough. D1 is equal to or less than D. If the change of luminance is steady (d_P = d_N), the modified frame difference metric D1 is lower than the original frame difference metric D, with a lowest ratio of (1 − α).
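A sketch of the modified metric follows, under the reconstruction of equation 57 given above (the equation image is missing from this text, so the exact form is inferred from the stated properties). The default values Δ = 5.5 and α = 0.4 are taken from the simulation described below.

```python
def modified_frame_difference(d, y_c, y_p, y_n, delta=5.5, alpha=0.4):
    # Equation 57 (reconstructed): damp D when a consistent one-way
    # luminance trend of at least `delta` is observed, signalling a cross-fade.
    d_p = abs(y_c - y_p)
    d_n = abs(y_c - y_n)
    trending = (y_p - delta >= y_c >= y_n + delta) or \
               (y_p + delta <= y_c <= y_n - delta)
    if trending and (d_p + d_n) > 0:
        return (1 - alpha + alpha * abs(d_p - d_n) / (d_p + d_n)) * d
    return d
```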
Table 1 below shows the performance improvement obtained by adding abrupt scene change detection. The total numbers of I frames in the no-scene-change (NSC) and scene-change (SC) cases are approximately the same. In the NSC case, I frames are distributed uniformly throughout the whole sequence, while in the SC case, I frames are assigned only to abrupt scene change frames.
It can be seen that an improvement of 0.2~0.3 dB in PSNR can typically be achieved. Simulation results show that the shot detector is very accurate in determining the shot events mentioned above. Simulations on five clips with normal cross-fade effects show that, with Δ = 5.5 and α = 0.4, a PSNR gain of 0.226031 dB is achieved at the same bit rate.
Sequence | Bit rate (kbps) | Average QP | PSNR (dB)
---|---|---|---
Animation NSC | 226.2403 | 31.1696 | 35.6426
Animation SC | 232.8023 | 29.8171 | 36.4513
Music NSC | 246.6394 | 32.8524 | 35.9337
Music SC | 250.0994 | 32.3209 | 36.1202
Headline News NSC | 216.9493 | 29.8304 | 38.9804
Headline News SC | 220.2512 | 28.9011 | 39.3151
Basketball NSC | 256.8726 | 33.1429 | 33.5262
Basketball SC | 254.9242 | 32.4341 | 33.8635

Table 1: Simulation results of abrupt scene change detection
Adaptive GOP structure
An illustrative example of adaptive GOP structure operation is described below. The operation may be included in the GOP partitioner 412 of Figure 4. MPEG2, an older video compression standard, does not require that the GOP have a regular structure, though one can be imposed. An MPEG2 sequence always begins with an I frame, i.e., one that has been encoded without reference to previous pictures. The MPEG2 GOP format is usually prearranged at the encoder by fixing the spacing in the GOP of the P, or predictive, pictures that follow the I frame. A P frame is a picture that has been predicted in part from the preceding I or P picture. The frames between the starting I frame and the succeeding P frames are encoded as B frames. A "B" frame (B stands for bi-directional) can use the previous and next I or P pictures, either individually or simultaneously, as reference. The number of bits needed to encode an I frame on average exceeds the number of bits needed to encode a P frame; likewise, the number of bits needed to encode a P frame on average exceeds that required for a B frame. A skipped frame, if it is used, requires no bits for its representation.
One benefit of using P frames and B frames and, in more recent compression algorithms, the skipping of frames, is that it is possible to reduce the size of the transmitted video. When temporal redundancy is high (e.g., when there is little change from picture to picture), the use of P pictures, B pictures, or skipped pictures efficiently represents the video stream, because I or P pictures decoded earlier serve later as references for decoding other P or B pictures.
The group-of-pictures partitioner adaptively encodes frames to minimize temporal redundancy. Differences between frames are quantified, and a decision to represent the picture by an I frame, a P frame, a B frame, or a skipped frame is automatically made after suitable tests are performed on the quantified differences. The processing in the GOP partitioner is aided by other operations of the preprocessor 202, which provide filtering for noise removal.
The adaptive encoding process has advantages unavailable in a "fixed" encoding process. A fixed process ignores the possibility that little change in content has occurred; an adaptive procedure, however, allows far more B frames to be inserted between each I and P frame, or between two P frames, thereby reducing the number of bits needed to adequately represent the sequence of frames. Conversely, in a fixed encoding process, for example, when the change in video content is more significant, the efficiency of P frames is greatly reduced because the difference between the predicted frame and the reference frame is too large. Under these conditions, matching objects may fall out of the motion search regions, or the similarity between matching objects is reduced due to distortion caused by changes in camera angle. The adaptive encoding process can beneficially be used to optionally determine when P frames should be encoded.
In the system disclosed herein, the types of conditions described above are automatically sensed. The adaptive encoding process described herein is flexible and is made to adapt to these changes in content. The adaptive encoding process evaluates a frame difference metric, which can be thought of as a measure of distance between frames with the same additive properties of distance. In concept, given frames F1, F2, and F3 having inter-frame distances d12 and d23, the distance between F1 and F3 is taken to be at least d12 + d23. Frame assignments are made on the basis of this distance-like metric and other measures.
The GOP partitioner 412 operates by assigning picture types to frames as they are received. The picture type indicates the method of prediction that may be used to encode each block:
I pictures are encoded without reference to other pictures. Since they stand alone, they provide access points in the data stream where decoding can begin. An I encoding type is assigned to a frame if the "distance" to its predecessor frame exceeds a scene change threshold.
P pictures can use the previous I or P picture for motion-compensated prediction. They use blocks in the previous fields or frames, which may be displaced from the block being predicted, as a basis for encoding. After the reference block is subtracted from the block being considered, the residual block is typically encoded using the discrete cosine transform to eliminate spatial redundancy. A P encoding type is assigned to a frame if the "distance" between it and the last frame assigned to be a P frame exceeds a second threshold, which is typically less than the first.
B-frame pictures can use the previous and next P or I pictures for motion compensation, as described above. A block in a B picture can be forward, backward, or bi-directionally predicted; or it can be intra-coded without reference to other frames. In H.264, a reference block can be a linear combination of up to 32 blocks from as many frames. If the frame cannot be assigned to be an I or P type, it is assigned to be a B type if the "distance" from it to its immediately preceding frame exceeds a third threshold, which is typically less than the second threshold. If the frame cannot be assigned to become a B frame, it is assigned "skip frame" status. This frame can be skipped because it is virtually a copy of the previous frame.
Evaluating a metric that quantifies the difference between adjacent frames in display order is the first part of this processing, and it takes place in the GOP partitioner 412. This metric is the distance referred to above; with it, every frame is evaluated for its proper type. Thus, the spacing between an I frame and an adjacent P frame, or between two successive P frames, can be variable. Computing the metric begins by processing the video frames with a block-based motion compensator, a block being the basic unit of video compression, generally composed of 16×16 pixels, though other block sizes such as 8×8, 4×4, and 8×16 are possible. For frames consisting of two deinterlaced fields that are presented for output, the motion compensation is done on a field basis, with the search for reference blocks taking place in fields rather than frames. For a block in the first field of the current frame, a forward reference block is found in the fields of the frame that follows it; likewise, a backward reference block is found in the fields of the frame that immediately precedes the current field. The current blocks are assembled into a compensated field. The process continues with the second field of the frame. The two compensated fields are combined to form the forward and backward compensated frames.
For frames created by the inverse telecine process 406, the search for reference blocks can be on a frame basis only, since only reconstructed film frames are generated. Two reference blocks and two differences, forward and backward, are found, likewise yielding a forward and a backward compensated frame. In summary, the motion compensator produces motion vectors and difference metrics for every block. Note that the differences of the metric are evaluated between a block in the field or frame being considered and the block that best matches it, either in a preceding field or frame or in a field or frame that immediately follows it, depending on whether a forward difference or a backward difference is being evaluated. Only luminance values enter into this calculation.
The motion compensation step thus generates two sets of differences. These are between blocks of current luminance values and blocks of luminance values in reference blocks taken from the frame immediately before the current frame and the frame immediately after it in time. The absolute value of each forward difference and each backward difference is determined for each pixel in a block, and each is separately summed over the entire frame. Both fields are included in the two summations when the deinterlaced NTSC fields that comprise a frame are processed. In this way, SAD_P and SAD_N, the summed absolute values of the forward and backward differences, are obtained.
For every frame, a SAD ratio is calculated using the relationship

    γ = (ε + SAD_P) / (ε + SAD_N)    (58)

where SAD_P and SAD_N are the summed absolute values of the forward and backward differences, respectively. A small positive number ε is added to the numerator to prevent the "divide-by-zero" error. A similar ε term is added to the denominator, further reducing the sensitivity of γ when either SAD_P or SAD_N is close to zero.
In an alternative aspect, the difference can be the SSD (the sum of squared differences), the SAD (the sum of absolute differences), or the SATD, in which the blocks of pixel values are transformed by applying a two-dimensional discrete cosine transform to them before differences in block elements are taken. The sums are evaluated over the area of active video, though a smaller area may be used in other aspects.
The luminance histogram of every frame as received (non-motion-compensated) is also calculated. The histogram operates on the DC coefficient, i.e., the (0,0) coefficient, in the 16×16 array of coefficients that results from applying a two-dimensional discrete cosine transform to the block of luminance values, if it is available. Equivalently, the average value of the 256 luminance values in the 16×16 block may be used in the histogram. For images whose luminance depth is eight bits, the number of bins is set at 16. The next metric evaluates the histogram difference:

    λ = (Σ_{i=1}^{16} |N_Pi − N_Ci|) / N    (59)

where N_Pi is the number of blocks from the previous frame in the i-th bin, N_Ci is the number of blocks from the current frame that belong in the i-th bin, and N is the total number of blocks in a frame.
These intermediate results are assembled to form the current frame difference metric as

    M = γ_C / γ_P + λ·(2λ + 1)    (60)

where γ_C is the SAD ratio based on the current frame and γ_P is the SAD ratio based on the previous frame. If a scene has smooth motion and its luma histogram barely changes, then M ≈ 1. If the current frame displays an abrupt scene change, γ_C will be large and γ_P should be small. The ratio γ_C/γ_P is used instead of γ_C alone so that the metric is normalized to the activity level of the context.
Data flow 4100 in Figure 41 illustrates certain components that may be used to calculate the frame difference metric. Preprocessor 4125 delivers frames of interlaced fields (in the case of video having an NTSC source) and film images (when the video source is the result of inverse telecine) to bi-directional motion compensator 4133. Bi-directional motion compensator 4133 operates on a field (or on a frame in the case of a film video source) by breaking it into blocks of 16×16 pixels and comparing each block with all 16×16 blocks in a defined area of a field of the adjacent frame. The block that provides the best match is selected and subtracted from the current block. The absolute values of the differences are taken, and the result is summed over the 256 pixels that comprise the current block. When this has been done for all current blocks of the field, and then for both fields, the output SAD_N, the backward difference metric, has been computed by backward difference module 4137. A similar procedure may be performed by forward difference module 4136, which uses the frame that immediately precedes the current frame in time as a source of reference blocks to develop SAD_P, the forward difference metric. The same estimation process, albeit one using recovered film frames, also takes place when the input frames are formed in the inverse telecine. The histogram that can be used to complete the calculation of the frame difference metric may be formed in histogram difference module 4141. Each 16×16 block is assigned to a bin based on the average value of its luminance. This information is formed by adding all 256 pixel luminance values in a block together, normalizing the sum by 256 if necessary, and incrementing the count of the bin into which the average value falls. The calculation is done once for each frame in advance of its motion compensation; when a new current frame arrives, the histogram for the current frame becomes the histogram for the previous frame. The two histograms are differenced and normalized by the number of blocks in histogram difference module 4141 to form λ, defined by equation 59. These results are combined in frame difference combiner 4143, which uses the intermediate results found in histogram difference module 4141 and forward and backward difference modules 4136 and 4137 to evaluate the current frame difference defined in equation 60.
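A sketch of the block-average binning step described here follows. The 16-bin layout for 8-bit luminance follows the text, while the array shapes and function names are assumptions.

```python
import numpy as np

def block_luminance_histogram(luma, block=16, bins=16):
    # Assign each 16x16 block to one of 16 bins based on its average
    # luminance (8-bit range), as in histogram difference module 4141.
    hist = np.zeros(bins, dtype=int)
    for top in range(0, luma.shape[0] - block + 1, block):
        for left in range(0, luma.shape[1] - block + 1, block):
            avg = luma[top:top + block, left:left + block].mean()
            hist[min(int(avg) * bins // 256, bins - 1)] += 1
    return hist

def lambda_metric(prev_hist, cur_hist):
    # Equation 59: histogram difference normalized by the block count N.
    return np.abs(prev_hist - cur_hist).sum() / cur_hist.sum()
```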
The system of flowchart 4100 and its components or steps can be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. Each functional component of flowchart 4100, including the preprocessor 4125, the bi-directional motion compensator 4133, the forward and backward difference metric modules 4136 and 4137, the histogram difference module 4141, and the frame difference metric combiner 4143, can be realized as a standalone component, incorporated as hardware, firmware, or middleware in a component of another device, implemented in microcode or software executed on a processor, or a combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that perform the tasks can be stored in a machine-readable medium such as a storage medium. A code segment can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment can be coupled to another code segment or to a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents.
The received and processed data can be stored in a storage medium, which can include, for example, a chip-configured storage medium (e.g., ROM, RAM) or a disk-type storage medium (e.g., magnetic or optical) connected to a processor. In some aspects, the combiner 4143 can contain some or all of the storage medium. Flowchart 4200 in Figure 42 illustrates a process of assigning compression types to frames. In one aspect, M, the current frame difference defined in equation 60, is the basis for all decisions made with respect to frame assignments. As decision block 4253 indicates, if a frame under consideration is the first in a sequence, the decision path marked YES is followed to block 4255, thereby declaring the frame to be an I frame. The accumulated frame difference is set to zero in block 4257, and the process returns (in block 4258) to the start block 4253. If the frame being considered is not the first frame in a sequence, the path marked NO is followed from block 4253, and in test block 4259 the current frame difference is tested against the scene change threshold. If the current frame difference is larger than that threshold, the decision path marked YES is followed to block 4255, again leading to the assignment of an I frame. If the current frame difference is less than the scene change threshold, the NO path is followed to block 4261, where the current frame difference is added to the accumulated frame difference.
Continuing through the flowchart, at decision block 4263 the accumulated frame difference is compared with a threshold t, which is in general smaller than the scene change threshold. If the accumulated frame difference is larger than t, control transfers to block 4265, and the frame is assigned to be a P frame; the accumulated frame difference is then reset to zero in block 4267. If the accumulated frame difference is smaller than t, control transfers from block 4263 to block 4269. There the current frame difference is compared with τ, which is smaller than t. If the current frame difference is smaller than τ, the frame is assigned to be skipped in block 4273; if the current frame difference is larger than τ, the frame is assigned to be a B frame.
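The flowchart logic can be sketched as follows. The threshold values are placeholders, since the patent leaves the scene change threshold, t, and τ to the implementation.

```python
def assign_frame_types(frame_diffs, scene_threshold=5.0, t=2.0, tau=0.2):
    # Assign I/P/B/skip types from a sequence of current frame differences M
    # (flowchart 4200 style). The first frame is always an I frame.
    types, accumulated = [], 0.0
    for i, m in enumerate(frame_diffs):
        if i == 0 or m > scene_threshold:
            types.append("I")
            accumulated = 0.0
        else:
            accumulated += m
            if accumulated > t:
                types.append("P")
                accumulated = 0.0
            elif m < tau:
                types.append("skip")
            else:
                types.append("B")
    return types
```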
In an alternative aspect, another frame encoding complexity indicator M* is defined as

    M* = M × min(1, α·max(0, SAD_P − s) × max(0, MV_P − m))    (61)

where α is a scale factor, SAD_P is the SAD with forward motion compensation, MV_P is the sum of the lengths, measured in pixels, of the motion vectors from the forward motion compensation, and s and m are two threshold numbers that render the frame encoding complexity indicator zero if SAD_P is lower than s or MV_P is lower than m. M* is used in place of the current frame difference in flowchart 4200 of Figure 42. As can be seen, M* differs from M only if the forward motion compensation shows a low level of movement; in this case, M* is smaller than M.
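A one-line sketch of equation 61 follows; the default values of α, s, and m are illustrative placeholders.

```python
def complexity_indicator(m, sad_p, mv_p, alpha=0.01, s=100.0, thresh_m=10.0):
    # Equation 61: suppress M when forward motion compensation shows little
    # movement (SAD_P < s or MV_P < thresh_m drives the product to zero).
    return m * min(1.0, alpha * max(0.0, sad_p - s) * max(0.0, mv_p - thresh_m))
```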
It should be noted that the shot detection and encoding aspects described herein may be described as a process, which can be depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process is typically terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
Those skilled in the art will further appreciate that one or more elements of the devices disclosed herein may be rearranged without affecting the operation of the device. Similarly, one or more elements of the devices disclosed herein may be combined without affecting the operation of the device. Those of ordinary skill in the art will understand that information and multimedia data may be represented using any of a variety of different technologies and techniques. Those of ordinary skill will further appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, firmware, computer software, middleware, microcode, or combinations thereof. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.
For example, the steps of a method or algorithm described in connection with the shot detection and encoding examples and figures disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The methods and algorithms are particularly applicable to communication technology, including the wireless transmission of video to mobile phones, computers, laptop computers, PDAs, and all types of personal and business communication devices. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a modem. In the alternative, the processor and the storage medium may reside as discrete components in a modem.
In addition, the various illustrative logical blocks, components, modules, and circuits described in connection with the examples disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The previous description of the disclosed examples is provided to enable any person of ordinary skill in the art to make or use the disclosed methods and apparatus. Various modifications to these examples will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other examples, and additional elements may be added, without departing from the spirit or scope of the disclosed methods and apparatus. The description of the aspects is intended to be illustrative, not to limit the scope of the claims.
Claims (21)
1. A method of processing multimedia data, comprising:
receiving digital interlaced video frames;
converting the digital interlaced video frames to digital progressive video frames by deinterlacing the digital interlaced video frames, wherein the deinterlacing comprises:
generating spatial information for the digital interlaced video frames and motion information for at least one frame of the digital interlaced video frames; and
generating the digital progressive video frames using the spatial information and the motion information;
wherein pixel information corresponding to a first frame, generated using both the spatial information and the motion information, is given more weight than pixel information corresponding to a second frame immediately following the first frame, which is generated without using the motion information.
2. The method of claim 1, wherein the deinterlacing further comprises:
generating bi-directional motion information for the digital interlaced video frames; and
generating the digital progressive video frames based on the digital interlaced video frames using the bi-directional motion information.
3. The method of claim 1, wherein converting the digital interlaced video frames comprises inverse telecining 3:2 pulldown video frames.
4. The method of claim 1, further comprising resizing the digital progressive video frames.
5. The method of claim 1, further comprising filtering the digital progressive video frames with a denoising filter.
6. The method of claim 1, further comprising: generating metadata based on the converted digital progressive video frames; determining encoding parameters based on the metadata; and encoding the digital progressive video frames according to the encoding parameters.
7. An apparatus for processing multimedia data, comprising:
a receiver configured to receive digital interlaced video frames;
a deinterlacer configured to convert the digital interlaced video frames to digital progressive video frames by deinterlacing the digital interlaced video frames, wherein the deinterlacing comprises:
generating spatial information for the digital interlaced video frames and motion information for at least one frame of the digital interlaced video frames; and
generating the digital progressive video frames using the spatial information and the motion information;
wherein pixel information corresponding to a first frame, generated using both the spatial information and the motion information, is given more weight than pixel information corresponding to a second frame immediately following the first frame, which is generated without using the motion information.
8. The apparatus of claim 7, further comprising an encoder configured to receive the digital progressive video frames and to encode the digital progressive video frames using compression information generated by a partitioner, the partitioner being configured to generate metadata associated with the digital progressive video frames.
9. The apparatus of claim 7, further comprising a denoising filter for denoising the digital progressive video frames.
10. The apparatus of claim 7, wherein the deinterlacer comprises an inverse teleciner.
11. The apparatus of claim 7, further comprising a resampler configured to resize the digital progressive video frames.
12. The apparatus of claim 7, wherein the deinterlacer is configured to:
generate bi-directional motion information for the digital interlaced video frames; and
generate the digital progressive video frames based on the digital interlaced video frames using the bi-directional motion information.
13. The apparatus of claim 7, further comprising a partitioner configured to: generate metadata associated with the digital progressive video frames, and provide the digital progressive video frames and the metadata to an encoder for encoding of the digital progressive video frames, wherein the metadata comprises compression information.
14. An apparatus for processing multimedia data, comprising:
means for receiving digital interlaced video frames;
means for converting the digital interlaced video frames to digital progressive video frames by deinterlacing the digital interlaced video frames, wherein the deinterlacing comprises:
generating spatial information for the digital interlaced video frames and motion information for at least one frame of the digital interlaced video frames; and
generating the digital progressive video frames using the spatial information and the motion information;
wherein pixel information corresponding to a first frame, generated using both the spatial information and the motion information, is given more weight than pixel information corresponding to a second frame immediately following the first frame, which is generated without using the motion information.
15. The apparatus of claim 14, wherein the converting means comprises an inverse teleciner.
16. The apparatus of claim 14, further comprising means for resampling, the means for resampling being used to resize the progressive frames.
17. The apparatus of claim 14, further comprising means for encoding the digital progressive video frames using provided metadata associated with the digital progressive video frames.
18. The apparatus of claim 14, further comprising means for denoising the digital progressive video frames.
19. The apparatus of claim 14, wherein the converting means is configured to:
generate bi-directional motion information for the digital interlaced video frames; and
generate the digital progressive video frames based on the interlaced video frames using the bi-directional motion information.
20. The apparatus of claim 14, further comprising: means for generating metadata associated with the digital progressive video frames; and means for providing the digital progressive video frames and at least part of the metadata to an encoder for encoding of the digital progressive video frames, wherein encoding parameters are determined based on the at least part of the metadata.
21. A processor configured to:
receive digital interlaced video frames;
convert the digital interlaced video frames to digital progressive video frames by deinterlacing the digital interlaced video frames, wherein the deinterlacing comprises:
generating spatial information for the digital interlaced video frames and motion information for at least one frame of the digital interlaced video frames; and
generating the digital progressive video frames using the spatial information and the motion information;
wherein pixel information corresponding to a first frame, generated using both the spatial information and the motion information, is given more weight than pixel information corresponding to a second frame immediately following the first frame, which is generated without using the motion information.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US78904806P | 2006-04-03 | 2006-04-03 | |
US60/789,048 | 2006-04-03 | ||
US78926606P | 2006-04-04 | 2006-04-04 | |
US78937706P | 2006-04-04 | 2006-04-04 | |
US60/789,377 | 2006-04-04 | ||
US60/789,266 | 2006-04-04 | ||
CNA2007800107539A CN101411183A (en) | 2006-04-03 | 2007-03-13 | Preprocessor method and apparatus |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2007800107539A Division CN101411183A (en) | 2006-04-03 | 2007-03-13 | Preprocessor method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104159060A CN104159060A (en) | 2014-11-19 |
CN104159060B true CN104159060B (en) | 2017-10-24 |
Family
ID=38121947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410438251.8A Expired - Fee Related CN104159060B (en) | 2006-04-03 | 2007-03-13 | Preprocessor method and equipment |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP2002650A1 (en) |
JP (3) | JP2009532741A (en) |
KR (5) | KR101019010B1 (en) |
CN (1) | CN104159060B (en) |
AR (1) | AR060254A1 (en) |
TW (1) | TW200803504A (en) |
WO (1) | WO2007114995A1 (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI396975B (en) * | 2008-08-06 | 2013-05-21 | Realtek Semiconductor Corp | Adaptable buffer device and method thereof |
TWI392335B (en) * | 2009-08-14 | 2013-04-01 | Sunplus Technology Co Ltd | De-ring system and method for reducing the overshooting and undershooting of a video signal in a scaler |
CN102648490B (en) | 2009-11-30 | 2016-08-17 | 株式会社半导体能源研究所 | Liquid crystal display, method for driving the liquid crystal display, and electronic device including the liquid crystal display |
EP2666289A1 (en) * | 2011-01-21 | 2013-11-27 | Thomson Licensing | System and method for enhanced remote transcoding using content profiling |
EP2761597A4 (en) * | 2011-10-01 | 2015-07-01 | Intel Corp | Systems, methods and computer program products for integrated post-processing and pre-processing in video transcoding |
KR101906946B1 (en) | 2011-12-02 | 2018-10-12 | 삼성전자주식회사 | High density semiconductor memory device |
JP2014225718A (en) * | 2013-05-15 | 2014-12-04 | ソニー株式会社 | Image processing apparatus and image processing method |
US10136147B2 (en) | 2014-06-11 | 2018-11-20 | Dolby Laboratories Licensing Corporation | Efficient transcoding for backward-compatible wide dynamic range codec |
JP6883218B2 (en) * | 2016-03-07 | 2021-06-09 | ソニーグループ株式会社 | Coding device and coding method |
EP3735606B1 (en) * | 2018-01-02 | 2023-03-22 | King's College London | Method and system for localisation microscopy |
CN111310744B (en) * | 2020-05-11 | 2020-08-11 | 腾讯科技(深圳)有限公司 | Image recognition method, video playing method, related device and medium |
CN112949449B (en) * | 2021-02-25 | 2024-04-19 | 北京达佳互联信息技术有限公司 | Method and device for training an interlace judgment model, and method and device for determining an interlaced image |
CN114363638B (en) * | 2021-12-08 | 2022-08-19 | 慧之安信息技术股份有限公司 | Video encryption method based on H.265 entropy coding binarization |
CN114125346B (en) * | 2021-12-24 | 2023-08-29 | 成都索贝数码科技股份有限公司 | Video conversion method and device |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2700090B1 (en) | 1992-12-30 | 1995-01-27 | Thomson Csf | Method for deinterlacing frames of a sequence of moving images. |
JP4256471B2 (en) * | 1994-04-05 | 2009-04-22 | エヌエックスピー ビー ヴィ | Interlace-to-sequential scan conversion method and apparatus |
JP2832927B2 (en) * | 1994-10-31 | 1998-12-09 | 日本ビクター株式会社 | Scanning line interpolation apparatus and motion vector detection apparatus for scanning line interpolation |
JPH09284770A (en) * | 1996-04-13 | 1997-10-31 | Sony Corp | Image coding device and method |
JP3649370B2 (en) * | 1998-02-25 | 2005-05-18 | 日本ビクター株式会社 | Motion compensation coding apparatus and motion compensation coding method |
JP3588564B2 (en) * | 1999-03-31 | 2004-11-10 | 株式会社東芝 | Video data recording device |
JP2001204026A (en) * | 2000-01-21 | 2001-07-27 | Sony Corp | Image information converter and method |
US6970513B1 (en) * | 2001-06-05 | 2005-11-29 | At&T Corp. | System for content adaptive video decoding |
KR100393066B1 (en) | 2001-06-11 | 2003-07-31 | 삼성전자주식회사 | Apparatus and method for adaptive motion-compensated de-interlacing of video data using adaptive compensated interpolation |
US6784942B2 (en) * | 2001-10-05 | 2004-08-31 | Genesis Microchip, Inc. | Motion adaptive de-interlacing method and apparatus |
JP4016646B2 (en) * | 2001-11-30 | 2007-12-05 | 日本ビクター株式会社 | Progressive scan conversion apparatus and progressive scan conversion method |
KR100446083B1 (en) * | 2002-01-02 | 2004-08-30 | 삼성전자주식회사 | Apparatus for motion estimation and mode decision and method thereof |
KR100850706B1 (en) * | 2002-05-22 | 2008-08-06 | 삼성전자주식회사 | Method for adaptive encoding and decoding motion image and apparatus thereof |
KR20060011281A (en) * | 2004-07-30 | 2006-02-03 | 한종기 | Apparatus for converting resolution of image applied to transcoder and method of the same |
JP2006074684A (en) * | 2004-09-06 | 2006-03-16 | Matsushita Electric Ind Co Ltd | Image processing method and apparatus |
- 2007
- 2007-03-13 KR KR1020087026885A patent/KR101019010B1/en not_active IP Right Cessation
- 2007-03-13 KR KR1020107022928A patent/KR101127432B1/en not_active IP Right Cessation
- 2007-03-13 KR KR1020117026505A patent/KR101373896B1/en not_active IP Right Cessation
- 2007-03-13 CN CN201410438251.8A patent/CN104159060B/en not_active Expired - Fee Related
- 2007-03-13 EP EP07758479A patent/EP2002650A1/en not_active Withdrawn
- 2007-03-13 KR KR1020137034600A patent/KR20140010190A/en not_active Application Discontinuation
- 2007-03-13 WO PCT/US2007/063929 patent/WO2007114995A1/en active Application Filing
- 2007-03-13 KR KR1020127017181A patent/KR101377370B1/en not_active IP Right Cessation
- 2007-03-13 JP JP2009504372A patent/JP2009532741A/en not_active Withdrawn
- 2007-03-26 TW TW096110382A patent/TW200803504A/en unknown
- 2007-03-30 AR ARP070101371A patent/AR060254A1/en unknown
- 2012
- 2012-07-23 JP JP2012162714A patent/JP5897419B2/en not_active Expired - Fee Related
- 2014
- 2014-12-25 JP JP2014263408A patent/JP6352173B2/en not_active Expired - Fee Related
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5864369A (en) * | 1997-06-16 | 1999-01-26 | Ati International Srl | Method and apparatus for providing interlaced video on a progressive display |
EP1005227A2 (en) * | 1998-11-25 | 2000-05-31 | Sharp Kabushiki Kaisha | Low-delay interlace to progressive conversion with field averaging of a film sourced video signal |
CN1372769A (en) * | 2000-03-13 | 2002-10-02 | 索尼公司 | Method and apparatus for generating compact transcoding hints metadata |
EP1164792A3 (en) * | 2000-06-13 | 2003-08-13 | Samsung Electronics Co., Ltd. | Format converter using bidirectional motion vector and method thereof |
Also Published As
Publication number | Publication date |
---|---|
JP2015109662A (en) | 2015-06-11 |
JP2013031171A (en) | 2013-02-07 |
EP2002650A1 (en) | 2008-12-17 |
JP2009532741A (en) | 2009-09-10 |
KR20090006159A (en) | 2009-01-14 |
KR101019010B1 (en) | 2011-03-04 |
JP5897419B2 (en) | 2016-03-30 |
KR101377370B1 (en) | 2014-03-26 |
TW200803504A (en) | 2008-01-01 |
KR101373896B1 (en) | 2014-03-12 |
AR060254A1 (en) | 2008-06-04 |
KR20120091423A (en) | 2012-08-17 |
KR20100126506A (en) | 2010-12-01 |
CN104159060A (en) | 2014-11-19 |
WO2007114995A1 (en) | 2007-10-11 |
KR20110128366A (en) | 2011-11-29 |
KR20140010190A (en) | 2014-01-23 |
JP6352173B2 (en) | 2018-07-04 |
KR101127432B1 (en) | 2012-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104159060B (en) | Preprocessor method and equipment | |
US9131164B2 (en) | Preprocessor method and apparatus | |
US7860167B2 (en) | Apparatus and method for adaptive 3D artifact reducing for encoded image signal | |
US8750372B2 (en) | Treating video information | |
US6862372B2 (en) | System for and method of sharpness enhancement using coding information and local spatial features | |
RU2378790C1 (en) | Scalability techniques based on content information | |
Lee et al. | Loop filtering and post-filtering for low-bit-rates moving picture coding | |
US20100303150A1 (en) | System and method for cartoon compression | |
TW200803517A (en) | Redundant data encoding methods and device | |
JP2009532741A6 (en) | Preprocessor method and apparatus | |
Yuen | Coding artifacts and visual distortions | |
US7031388B2 (en) | System for and method of sharpness enhancement for coded digital video | |
US6873657B2 (en) | Method of and system for improving temporal consistency in sharpness enhancement for a video signal | |
Chen et al. | Design a deblocking filter with three separate modes in DCT-based coding | |
CN101411183A (en) | Preprocessor method and apparatus | |
WO1999059342A1 (en) | Method and system for mpeg-2 encoding with frame partitioning | |
Cao et al. | An Effective Error Concealment Method Based on the Scene Change |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171024 | |