US9241167B2 - Metadata assisted video decoding - Google Patents

Metadata assisted video decoding

Info

Publication number
US9241167B2
Authority
US
United States
Prior art keywords
decoder
decoding
metadata
video sequence
decoder engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US13/399,769
Other versions
US20130215978A1 (en
Inventor
Yongjun Wu
Shyam Sadhwani
Naveen Thumpudi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US13/399,769 (US9241167B2)
Application filed by Microsoft Technology Licensing LLC
Assigned to MICROSOFT CORPORATION: assignment of assignors interest (see document for details). Assignors: THUMPUDI, NAVEEN; SADHWANI, SHYAM; WU, YONGJUN
Priority to CN201380009666.7A (CN104106264B)
Priority to PCT/US2013/026004 (WO2013123100A1)
Priority to EP13749713.7A (EP2815574B1)
Priority to JP2014557749A (JP2015513386A)
Priority to KR1020147022904A (KR102006044B1)
Publication of US20130215978A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC: assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Priority to US14/979,277 (US9807409B2)
Publication of US9241167B2
Application granted
Priority to JP2017199533A (JP6423061B2)
Legal status: Active (current)
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
        • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
        • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
            • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
                • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
            • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
                • H04N19/103 Selection of coding mode or of prediction mode
                • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
                • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
            • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
                    • H04N19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
                    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
        • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
        • H04N19/46 Embedding additional information in the video signal during the compression process

Definitions

  • the present application relates generally to decoding and, particularly, to optimizing decoding using metadata from an encoded video sequence with a file container.
  • Engineers use compression (also called source coding or source encoding) to reduce the bitrate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bitrate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form.
  • a “codec” is an encoder/decoder system.
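  • Over the last two decades, various video codec standards have been adopted, including the H.261, H.262 (MPEG-2) and H.263 standards and the MPEG-1 and MPEG-4 standards. More recently, the H.264 standard (sometimes referred to as AVC or 14496-10) and the VC-1 standard have been adopted.
  • The next-generation HEVC standard is in development. For additional details, see representative versions of the respective standards.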
  • a video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters that are in the bitstream for a video sequence when particular features are used in encoding and decoding.
  • a video codec standard also provides details about the decoding operations a decoder can perform to achieve correct results in decoding.
  • GPU graphics processing unit
  • a GPU is a specialized electronic circuit designed to rapidly manipulate and alter memory in such a way so as to accelerate the building of images in a frame buffer intended for output to a display.
  • GPUs are used in embedded systems, mobile phones, personal computers, workstations, game consoles, etc.
  • Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for algorithms where processing of large blocks of data is done in parallel.
  • a video decoder uses metadata of an encoded video sequence in order to make optimization decisions. For example, in one embodiment, metadata can be used to choose which decoder engine can receive a video sequence. Multiple decoder engines can be available in a decoder, such as one that can handle baseline profile (e.g., a CPU) and one that cannot (e.g., a GPU) but that is generally more efficient. By using the metadata to choose the most efficient decoder engine, an optimized decoder is realized.
  • the optimization decisions can be based on length and location metadata information associated with a video sequence. Using such metadata information, a decoder engine can skip start-code scanning to make the decoding process more efficient.
  • an emulation prevention byte can be removed dynamically while a bitstream parser decodes slice headers, a sequence parameter set (SPS), a picture parameter set (PPS) and supplemental enhancement information (SEI).
  • SPS sequence parameter set
  • PPS picture parameter set
  • SEI supplemental enhancement information
  • FIG. 1 is a diagram of an example computing environment in which some described embodiments can be implemented.
  • FIG. 2 is a flowchart of a method for making decoding optimization decisions based on metadata in an encoded video sequence.
  • FIG. 3 is a flowchart of using the metadata to determine which available decoder engine can be used to decode the video sequence.
  • FIG. 4 is a more detailed flowchart that can be used to expand on the flowchart of FIG. 3 and is a particular example for an MPEG-4 video sequence.
  • FIG. 5 is an example architectural structure of a decoder including multiple decoder engines.
  • FIG. 6 is an example decoder engine that can be used.
  • FIG. 7 is a flowchart of a method for passing length and location to a decoder engine.
  • FIG. 8 is a flowchart of a method for removing an emulation prevention byte.
  • FIG. 1 illustrates a generalized example of a suitable computing environment ( 100 ) in which several of the described techniques and tools may be implemented.
  • the computing environment ( 100 ) is not intended to suggest any limitation as to scope of use or functionality, as the techniques and tools may be implemented in diverse general-purpose or special-purpose computing environments.
  • the computing environment ( 100 ) includes one or more processing units ( 110 , 115 ) and memory ( 120 , 125 ) that can be used in implementing a computing device.
  • this most basic configuration ( 130 ) is included within a dashed line.
  • the processing units ( 110 , 115 ) execute computer-executable instructions.
  • a processing unit can be a general-purpose central processing unit (CPU), processor in an application-specific integrated circuit (ASIC) or any other type of processor.
  • ASIC application-specific integrated circuit
  • FIG. 1 shows a central processing unit ( 110 ) as well as a graphics processing unit or co-processing unit ( 115 ).
  • the memory ( 120 , 125 ) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the processing unit(s).
  • the memory ( 120 , 125 ) stores software ( 180 ) implementing one or more innovations for decoder optimization, in the form of computer-executable instructions suitable for execution by the processing unit(s).
  • a computing environment may have additional features.
  • the computing environment ( 100 ) includes storage ( 140 ), one or more input devices ( 150 ), one or more output devices ( 160 ), and one or more communication connections ( 170 ).
  • An interconnection mechanism such as a bus, controller, or network interconnects the components of the computing environment ( 100 ).
  • operating system software provides an operating environment for other software executing in the computing environment ( 100 ), and coordinates activities of the components of the computing environment ( 100 ).
  • the tangible storage ( 140 ) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium which can be used to store information in a non-transitory way and which can be accessed within the computing environment ( 100 ).
  • the storage ( 140 ) can store instructions for the software ( 180 ) implementing one or more innovations for decoder optimization.
  • the input device(s) ( 150 ) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing environment ( 100 ).
  • the input device(s) ( 150 ) may be a video card, TV tuner card, or similar device that accepts video input in analog or digital form, or a CD-ROM or CD-RW that reads video samples into the computing environment ( 100 ).
  • the output device(s) ( 160 ) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing environment ( 100 ).
  • the communication connection(s) ( 170 ) enable communication over a communication medium to another computing entity.
  • the communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal.
  • a modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media include wired or wireless techniques implemented with an electrical, optical, RF, or other carrier.
  • Computer-readable media are any available tangible media that can be accessed within a computing environment.
  • Computer-readable media include memory ( 120 ), storage ( 140 ), and combinations of any of the above.
  • program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the functionality of the program modules may be combined or split between program modules as desired in various embodiments.
  • Computer-executable instructions for program modules may be executed within a local or distributed computing environment.
  • The terms “system” and “device” are used interchangeably herein. Unless the context clearly indicates otherwise, neither term implies any limitation on a type of computing system or computing device. In general, a computing system or computing device can be local or distributed, and can include any combination of special-purpose hardware and/or general-purpose hardware with software implementing the functionality described herein.
  • FIG. 2 is a flowchart of a method for using metadata to optimize decoding.
  • an encoded video sequence is received.
  • the encoded video sequence can be a movie clip or other video sequence and can use any of a variety of container formats including MPEG-X, AVI, MKV, ASF, etc. Additionally, any variety of standards for video compression can be used, such as H.264/AVC, VC-1, MPEG-4 Pt2, etc.
  • metadata in the encoded video sequence can be analyzed in order to make decoding decisions.
  • the metadata can be any that is used to assist in describing or classifying the raw image data of the video sequence.
  • a Sample Description Box can be used in MPEG-4 to obtain metadata regarding parameter sets, such as the sequence parameter set (SPS) and picture parameter set (PPS).
  • the parameter sets can contain additional information regarding arbitrary slice ordering (ASO) and flexible macroblock ordering (FMO) that can be used to make decoder optimization decisions (process block 204 ).
  • ASO arbitrary slice ordering
  • FMO flexible macroblock ordering
  • metadata can be used to optimize bitstream parsing, as further described below.
  • Other metadata can be used, and the embodiments described herein are not limited to the particular metadata used and the particular decoder optimization decisions.
  • FIG. 3 is an example embodiment showing additional details or process blocks used in conjunction with FIG. 2 .
  • metadata can be analyzed to determine a most efficient decoder engine.
  • An efficient decoder engine can be one that performs the decoding in the least amount of time, for example. Other efficiency parameters can also be analyzed, such as memory usage, accuracy, etc.
  • the decoder can include a plurality of (e.g., two or more) decoder engines and, as a result of the analysis, the encoded data can be forwarded to the appropriate decoder engine.
  • a first decoder engine type can be a decoder engine capable of decoding main profile or bitstreams at higher profiles.
  • GPUs fall into this group, as do other processors that perform a portion of the decoding in specialized hardware designed for decoding (generally called hardware acceleration).
  • a second decoder engine type can be capable of decoding baseline, main and higher profiles.
  • a central processing unit (CPU) that decodes using software is an exemplary second decoder engine type.
  • CPUs have generalized hardware that can be used for purposes other than decoding and, as a result, can be less efficient.
  • the chosen decoder engine decodes the encoded data. The decoded data can then be used for displaying the decoded video, such as on a user display.
  • FIG. 4 is a flowchart of a method expanding on process block 300 using a specific example of MPEG-4.
  • parameter sets such as SPS and PPS in the encoded bitstream metadata can be parsed.
  • In decision block 404 , a determination can be made whether ASO or FMO is used by analyzing the SPS and PPS metadata. If so, then a decoder engine that is capable of decoding baseline, main and higher profiles is chosen, such as a CPU that decodes the bitstream using software (process block 406 ). Otherwise, in process block 408 , a decoder engine is chosen that is capable of decoding main profile and higher, such as a GPU or other hardware accelerated decoding engine.
  • FIG. 5 shows a high-level diagram of a decoder 500 including multiple decoder engines 502 , 504 that can be used.
  • Decoder engine 502 is capable of decoding main profile and higher encoded data, and is generally not capable of or is inefficient at decoding baseline profiles. Thus, the decoder engine 502 is less desirable to use when the data is encoded using certain types of algorithms, such as ASO and FMO.
  • the decoder engine 504 is a decoder engine capable of decoding baseline profile, main and higher. Thus, the decoder engine 504 can decode a wider variety of encoded data types than the decoder engine 502 .
  • decoder decision logic selects which decoder engine 502 , 504 can decode the received encoded data based on metadata in the bitstream using algorithms described herein.
  • FIG. 6 is a schematic diagram of a general video decoder engine 600 that can perform decoding using any of the embodiments described herein.
  • Alternative decoder engines can be used, and FIG. 6 is merely an example.
  • the decoder engine 600 receives information 695 for a compressed sequence of video frames (e.g., via a compressed video bitstream) and produces output including a reconstructed block 605 .
  • Particular embodiments of video decoders can use a variation or supplemented version of the generalized decoder 600 .
  • the decoder engine 600 decompresses blocks coded using inter-prediction and/or intra-prediction.
  • FIG. 6 shows a path for intra-coded blocks through the decoder engine 600 (shown as the intra block path) and a path for inter-coded blocks (shown as the inter block path).
  • Many of the components of the decoder engine 600 are used for decompressing both inter-coded and intra-coded blocks. The exact operations performed by those components can vary depending on the type of information being decompressed.
  • a buffer 690 receives the information 695 for the compressed video sequence and makes the received information available to the entropy decoder 680 .
  • the buffer 690 typically receives the information at a rate that is fairly constant over time.
  • the buffer 690 can include a playback buffer and other buffers as well. Alternatively, the buffer 690 receives information at a varying rate.
  • the entropy decoder 680 entropy decodes entropy-coded quantized data as well as entropy-coded side information (e.g., motion information 615 , flags, modes, syntax elements, and other side information), typically applying the inverse of the entropy encoding performed in the encoder.
  • the entropy decoder 680 can use any of the disclosed counter-based variable length coding techniques described below to perform decoding (e.g., decoding of syntax elements).
  • An inverse quantizer 670 inverse quantizes entropy-decoded data.
  • An inverse frequency transformer 660 converts the quantized, frequency domain data into spatial domain video information by applying an inverse transform such as an inverse frequency transform.
  • a motion compensator 630 applies motion information 615 (e.g., predicted motion information) to a reference frame 625 to form a prediction 635 of the block 605 being reconstructed.
  • a buffer (store) 620 stores previous reconstructed frames for use as reference frames.
  • Alternatively, a motion compensator applies other types of motion compensation. The prediction by the motion compensator is rarely perfect, so the decoder 600 also reconstructs a prediction residual 645 to be added to the prediction 635 to reconstruct block 605 .
  • the store 620 buffers the reconstructed frame for use in predicting a subsequent frame.
  • the frame is predicted on a block-by-block basis (as illustrated) and respective blocks of the frame can be predicted.
  • One or more of the predicted blocks can be predicted using motion information from blocks in the same frame or one or more blocks of a different frame.
  • an intra-predictor 655 forms a prediction 665 of the block 610 being reconstructed.
  • the buffer (store) 620 stores previous reconstructed blocks and frames.
  • the prediction by the motion compensator is rarely perfect, so the decoder 600 can also reconstruct a prediction residual 675 to be added to the prediction 665 to reconstruct block 610 .
  • Although a particular decoder engine is described, a wide variety of decoder structures can be used, as the type of decoder engine is a matter of design choice, and depends on the particular application.
  • FIG. 7 is a flowchart of a method for implementing decoder optimization using metadata.
  • An H.264/AVC decoder accepts a bitstream with the start code 0x00 00 01 at the beginning of each network abstraction layer unit (NALU).
  • The MPEG-4 file format indicates the length of each NALU and sends one picture per sample to the H.264/AVC decoder.
  • When NALU length information is available, hardware accelerated decoding can skip start code scanning completely and send NALUs one by one directly to hardware.
  • decoder decision logic retrieves length and location information associated with the NALU. Such information is metadata and can be found in the file container associated with the incoming bitstream.
  • the length and location information is passed to the appropriate decoder engine that was selected, as previously described. Alternatively, the method can be applied to decoders having a single decode engine.
  • the decoder engine can use the length and location metadata information to decode the bitstream without scanning for start codes.
  • the location information describes the position in the bitstream wherein the data starts.
  • the length information indicates where the data ends, relative to the start. Substantial savings in CPU cycles are achieved by eliminating the need for start-code scanning, because the location and length information is already provided.
  • FIG. 8 shows an embodiment of an additional method wherein an emulation prevention byte is analyzed.
  • Software decoding of H.264/AVC videos can remove emulation prevention byte 0x03 in the compressed bitstream to achieve efficient entropy decoding. That is, it can be more efficient for software CABAC/CAVLC decoding not to detect an emulation prevention byte in the process of entropy decoding.
  • the bitstream parser in software decoding is designed to perform start code parsing and emulation prevention byte removal at substantially the same time.
  • hardware accelerated decoding sometimes does not need to remove the emulation prevention byte 0x03 from a compressed bitstream.
  • a different bitstream parser can be designed to scan start code only, detect and remove emulation prevention byte 0x03 in flight (dynamically) while the bitstream parser decodes slice headers, sequence parameter set (SPS), picture parameter set (PPS), and supplemental enhancement information (SEI).
  • In decision block 800 , a determination is made whether length and location information are available in metadata. If yes, then in process block 802 , the length and location of the start code are passed to the appropriate decoder engine so that the decoder can avoid start-code scanning. If not, then the start codes are searched so that length and location information can be determined (process block 804 ). The searching can be performed by scanning the bitstream and comparing each byte to a start code to find the location. The length can be determined by searching for an end code and counting the bytes between the start and end codes. In decision block 806 , a determination is made whether the decoder engine is a software decoder engine (e.g., a CPU) or another decoder engine that needs optimization by removal of the emulation prevention byte.
  • If so, the emulation prevention byte is removed from the bitstream.
  • the encoded data with the emulation prevention byte removed is sent to the decoder engine, which is capable of decoding baseline profile, main and higher.
  • the length and location information can also be sent to the decoder.
  • If decision block 806 is answered in the negative, then in process block 816 , the encoded data, still containing the emulation prevention byte, is sent together with the length and location information to the decoder capable of decoding main profile and higher.
  • Any of the disclosed methods can be implemented as computer-executable instructions stored on one or more computer-readable storage media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (such as DRAM or SRAM), or nonvolatile memory components (such as hard drives)) and executed on a computer (e.g., any commercially available computer, including smart phones or other mobile devices that include computing hardware).
  • Any of the computer-executable instructions for implementing the disclosed techniques as well as any data created and used during implementation of the disclosed embodiments can be stored on one or more computer-readable media (e.g., non-transitory computer-readable media).
  • the computer-executable instructions can be part of, for example, a dedicated software application or a software application that is accessed or downloaded via a web browser or other software application (such as a remote computing application).
  • Such software can be executed, for example, on a single local computer (e.g., any suitable commercially available computer) or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a client-server network (such as a cloud computing network), or other such network) using one or more network computers.
  • any of the software-based embodiments can be uploaded, downloaded, or remotely accessed through a suitable communication means.
  • suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
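
The bitstream handling described above for FIG. 7 and FIG. 8 can be illustrated with a short Python sketch. It recovers the location and length of each NALU either from container length fields (no scanning) or, as a fallback, by scanning for start codes, and it strips emulation prevention bytes only when a software engine is the target. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the 4-byte big-endian length prefix, and the `software_engine` switch are hypothetical choices made for illustration.

```python
from typing import List, Tuple

START_CODE = b"\x00\x00\x01"  # three-byte start code prefix


def nalus_from_lengths(sample: bytes, length_size: int = 4) -> List[Tuple[int, int]]:
    """Return (location, length) pairs read from length fields; no scanning.

    Assumes each NALU is preceded by a big-endian length field of
    `length_size` bytes (an illustrative default).
    """
    spans = []
    pos = 0
    while pos + length_size <= len(sample):
        length = int.from_bytes(sample[pos:pos + length_size], "big")
        start = pos + length_size
        if start + length > len(sample):
            raise ValueError("length field runs past the end of the sample")
        spans.append((start, length))
        pos = start + length
    return spans


def nalus_from_start_codes(stream: bytes) -> List[Tuple[int, int]]:
    """Fallback: scan for start codes to recover (location, length) pairs."""
    spans = []
    idx = stream.find(START_CODE)
    while idx != -1:
        start = idx + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes just before the next start code are padding, not payload.
        while end > start and stream[end - 1] == 0:
            end -= 1
        spans.append((start, end - start))
        idx = nxt
    return spans


def remove_emulation_prevention(nalu: bytes) -> bytes:
    """Drop each 0x03 that follows two consecutive zero bytes."""
    out = bytearray()
    zeros = 0
    for b in nalu:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue  # skip the emulation prevention byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def prepare(data: bytes, has_length_metadata: bool, software_engine: bool) -> List[bytes]:
    """Split data into NALUs and strip 0x03 only for a software engine."""
    if has_length_metadata:
        spans = nalus_from_lengths(data)
    else:
        spans = nalus_from_start_codes(data)
    nalus = [data[loc:loc + n] for loc, n in spans]
    if software_engine:
        nalus = [remove_emulation_prevention(n) for n in nalus]
    return nalus
```

When length metadata is present, `prepare(sample, True, False)` hands the NALUs to a hardware path untouched and without any byte-by-byte search, which is where the saving in CPU cycles comes from.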

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video decoder is disclosed that uses metadata in order to make optimization decisions. In one embodiment, metadata is used to choose which of multiple available decoder engines should receive a video sequence. In another embodiment, the optimization decisions can be based on length and location metadata information associated with a video sequence. Using such metadata information, a decoder engine can skip start-code scanning to make the decoding process more efficient. Also, based on the choice of decoder engine, the decoder can decide whether emulation prevention byte removal happens together with start code scanning.

Description

FIELD
The present application relates generally to decoding and, particularly, to optimizing decoding using metadata from an encoded video sequence with a file container.
BACKGROUND
Engineers use compression (also called source coding or source encoding) to reduce the bitrate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bitrate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A “codec” is an encoder/decoder system.
Over the last two decades, various video codec standards have been adopted, including the H.261, H.262 (MPEG-2) and H.263 standards and the MPEG-1 and MPEG-4 standards. More recently, the H.264 standard (sometimes referred to as AVC or 14496-10) and the VC-1 standard have been adopted. The next-generation standard, HEVC, is in development. For additional details, see representative versions of the respective standards. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters that are in the bitstream for a video sequence when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a decoder can perform to achieve correct results in decoding.
For modern decoding, a graphics processing unit (GPU) can be used. A GPU is a specialized electronic circuit designed to rapidly manipulate and alter memory in such a way so as to accelerate the building of images in a frame buffer intended for output to a display. GPUs are used in embedded systems, mobile phones, personal computers, workstations, game consoles, etc. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for algorithms where processing of large blocks of data is done in parallel.
Although fast, most GPUs are not designed to handle videos encoded with Arbitrary Slice Order (ASO) and/or Flexible Macro-block Order (FMO). Video encoded using such algorithms is typically processed using a decoder designed to handle baseline profiles, like a CPU. Instead, GPUs are generally designed to handle a video sequence having a main profile and higher profiles. Unfortunately, many H.264/AVC encoders produce baseline bitstreams, which are actually conformant to main profile, but have a constraint flag incorrectly set. This incorrectly set flag makes H.264/AVC decoders treat those clips as pure baseline including ASO or FMO, even though such algorithms may not have been used.
SUMMARY
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
A video decoder is disclosed that uses metadata of an encoded video sequence in order to make optimization decisions. For example, in one embodiment, metadata can be used to choose which decoder engine can receive a video sequence. Multiple decoder engines can be available in a decoder, such as one that can handle baseline profile (e.g., a CPU) and one that cannot (e.g., a GPU) but that is generally more efficient. By using the metadata to choose the most efficient decoder engine, an optimized decoder is realized.
In another embodiment, the optimization decisions can be based on length and location metadata information associated with a video sequence. Using such metadata information, a decoder engine can skip start-code scanning to make the decoding process more efficient.
In yet another embodiment, an emulation prevention byte can be removed dynamically while a bitstream parser decodes slice headers, a sequence parameter set (SPS), a picture parameter set (PPS) and supplemental enhancement information (SEI). When the network abstraction layer unit (NALU) length information is available, hardware accelerated decoding can completely skip start code scanning, and send NALU's one by one directly to hardware.
The foregoing and other objects, features, and advantages of the invention will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of an example computing environment in which some described embodiments can be implemented.
FIG. 2 is a flowchart of a method for making decoding optimization decisions based on metadata in an encoded video sequence.
FIG. 3 is a flowchart of using the metadata to determine which available decoder engine can be used to decode the video sequence.
FIG. 4 is a more detailed flowchart that can be used to expand on the flowchart of FIG. 3 and is a particular example for an MPEG-4 video sequence.
FIG. 5 is an example architectural structure of a decoder including multiple decoder engines.
FIG. 6 is an example decoder engine that can be used.
FIG. 7 is a flowchart of a method for passing length and location to a decoder engine.
FIG. 8 is a flowchart of a method for removing an emulation prevention byte.
DETAILED DESCRIPTION
I. Example Computing Environment.
FIG. 1 illustrates a generalized example of a suitable computing environment (100) in which several of the described techniques and tools may be implemented. The computing environment (100) is not intended to suggest any limitation as to scope of use or functionality, as the techniques and tools may be implemented in diverse general-purpose or special-purpose computing environments.
With reference to FIG. 1, the computing environment (100) includes one or more processing units (110, 115) and memory (120, 125) that can be used in implementing a computing device. In FIG. 1, this most basic configuration (130) is included within a dashed line. The processing units (110, 115) execute computer-executable instructions. A processing unit can be a general-purpose central processing unit (CPU), processor in an application-specific integrated circuit (ASIC) or any other type of processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. For example, FIG. 1 shows a central processing unit (110) as well as a graphics processing unit or co-processing unit (115). The memory (120, 125) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the processing unit(s). The memory (120, 125) stores software (180) implementing one or more innovations for decoder optimization, in the form of computer-executable instructions suitable for execution by the processing unit(s).
A computing environment may have additional features. For example, the computing environment (100) includes storage (140), one or more input devices (150), one or more output devices (160), and one or more communication connections (170). An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment (100). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment (100), and coordinates activities of the components of the computing environment (100).
The tangible storage (140) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium which can be used to store information in a non-transitory way and which can be accessed within the computing environment (100). The storage (140) can store instructions for the software (180) implementing one or more innovations for decoder optimization.
The input device(s) (150) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing environment (100). For video decoding, the input device(s) (150) may be a video card, TV tuner card, or similar device that accepts video input in analog or digital form, or a CD-ROM or CD-RW that reads video samples into the computing environment (100). The output device(s) (160) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing environment (100).
The communication connection(s) (170) enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, or other carrier.
The techniques and tools can be described in the general context of computer-readable media. Computer-readable media are any available tangible media that can be accessed within a computing environment. By way of example, and not limitation, with the computing environment (100), computer-readable media include memory (120), storage (140), and combinations of any of the above.
The techniques and tools can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing environment.
The terms “system” and “device” are used interchangeably herein. Unless the context clearly indicates otherwise, neither term implies any limitation on a type of computing system or computing device. In general, a computing system or computing device can be local or distributed, and can include any combination of special-purpose hardware and/or general-purpose hardware with software implementing the functionality described herein.
For the sake of presentation, the detailed description uses terms like “determine” and “select” to describe computer operations in a computing environment. These terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human being. The actual computer operations corresponding to these terms vary depending on implementation.
II. Overview of Metadata Assisted Decoding Optimization
FIG. 2 is a flowchart of a method for using metadata to optimize decoding. In process block 200, an encoded video sequence is received. The encoded video sequence can be a movie clip or other video sequence and can use any of a variety of container formats including MPEG-X, AVI, MKV, ASF, etc. Additionally, any variety of standards for video compression can be used, such as H.264/AVC, VC-1, MPEG-4 Pt2, etc. In process block 202, metadata in the encoded video sequence can be analyzed in order to make decoding decisions. The metadata can be any that is used to assist in describing or classifying the raw image data of the video sequence. For example, a Sample Description Box (STSD) can be used in MPEG-4 to obtain metadata regarding parameter sets, such as the sequence parameter set (SPS) and picture parameter set (PPS). The parameter sets can contain additional information regarding arbitrary slice ordering (ASO) and flexible macroblock ordering (FMO) that can be used to make decoder optimization decisions (process block 204). For example, using the metadata, a choice can be made to forward the encoded data to the most efficient decoder engine. In other examples, metadata can be used to optimize bitstream parsing, as further described below. Other metadata can be used, and the embodiments described herein are not limited to the particular metadata used and the particular decoder optimization decisions.
III. Decoder Engine Selection Using Metadata
FIG. 3 is an example embodiment showing additional details or process blocks used in conjunction with FIG. 2. In process block 300, metadata can be analyzed to determine a most efficient decoder engine. An efficient decoder engine can be one that performs the decoding in the least amount of time, for example. Other efficiency parameters can also be analyzed, such as memory usage, accuracy, etc. In process block 302, the decoder can include a plurality (e.g., two or more) decoder engines and, as a result of the analysis, the encoded data can be forwarded to the appropriate decoder engine. A first decoder engine type can be a decoder engine capable of decoding main profile or bitstreams at higher profiles. Typically, decoder engines such as GPUs fall into this group, or other processors that perform a portion of the decoding in specialized hardware designed for decoding (generally called hardware acceleration). A second decoder engine type can be capable of decoding baseline, main and higher profiles. A central processing unit (CPU) that decodes using software is an exemplary second decoder engine type. In general, CPUs have generalized hardware that can be used for purposes other than decoding and, as a result, can be less efficient. In process block 304, the chosen decoder engine decodes the encoded data. The decoded data can then be used for displaying the decoded video, such as on a user display.
FIG. 4 is a flowchart of a method expanding on process block 300 using a specific example of MPEG-4. In process block 400, parameter sets, such as the SPS and PPS, in the encoded bitstream metadata can be parsed. In process block 402 and decision block 404, a determination can be made whether ASO or FMO is used by analyzing the SPS and PPS metadata. If so, then a decoder engine that is capable of decoding baseline, main and higher profiles is chosen, such as a CPU that decodes the bitstream using software (process block 406). Otherwise, in process block 408, a decoder engine is chosen that is capable of decoding main profile and higher, such as a GPU or other hardware accelerated decoding engine.
FIG. 5 shows a high-level diagram of a decoder 500 including multiple decoder engines 502, 504 that can be used. Decoder engine 502 is capable of decoding main profile and higher encoded data, and is generally not capable of or is inefficient at decoding baseline profiles. Thus, the decoder engine 502 is less desirable to use when the data is encoded using certain types of algorithms, such as ASO and FMO. The decoder engine 504 is a decoder engine capable of decoding baseline profile, main and higher. Thus, the decoder engine 504 can decode a wider variety of encoded data types than the decoder engine 502. At 506, decoder decision logic selects which decoder engine 502, 504 can decode the received encoded data based on metadata in the bitstream using algorithms described herein.
Thus, if metadata from the MPEG-4 file format is available, hardware accelerated video decoding can still be done for clips, even if flags are improperly set in the encoded bitstream. For example, if there is a single ‘stsd’ sample entry from the MPEG-4 file format in the clip, which is often true, all the SPS's and PPS's can be sent at the very beginning of bitstream decoding. Even after an H.264/AVC decoder parses the SPS's and finds that the bitstream is in baseline profile and not compatible with main profile, the decoder can parse all the PPS's and find out whether the bitstream really uses FMO and ASO. If not, which is often true, the bitstream can still be sent to the hardware accelerator for video decoding. In short, using the additional metadata from the MPEG-4 file format, the decoder can go one step further, look into all the PPS's, and then decide whether to use hardware accelerated video decoding for a given clip.
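The selection logic of FIGS. 3-5 can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation. It assumes the parameter sets have already been parsed into the hypothetical `Sps` and `Pps` records shown, that FMO is detected from the H.264 PPS field `num_slice_groups_minus1` being greater than zero, and that ASO usage is supplied by the caller as a hypothetical `uses_aso` flag, because the text does not detail how ASO is detected. The names `choose_engine`, `"hardware"` and `"software"` are likewise illustrative.

```python
from dataclasses import dataclass
from typing import Iterable

BASELINE_PROFILE_IDC = 66  # H.264 profile_idc value for the Baseline profile


@dataclass
class Sps:
    """Hypothetical record holding the SPS fields used by this sketch."""
    profile_idc: int
    constraint_set1_flag: bool  # True when the stream claims Main-profile conformance


@dataclass
class Pps:
    """Hypothetical record holding the PPS field used by this sketch."""
    num_slice_groups_minus1: int  # greater than 0 means multiple slice groups (FMO)


def choose_engine(sps_list: Iterable[Sps], pps_list: Iterable[Pps],
                  uses_aso: bool = False) -> str:
    """Return "hardware" or "software" for a clip, based on its parameter sets."""
    # Step 1: does any SPS claim a pure baseline stream (not main-compatible)?
    claims_pure_baseline = any(
        s.profile_idc == BASELINE_PROFILE_IDC and not s.constraint_set1_flag
        for s in sps_list
    )
    if not claims_pure_baseline:
        return "hardware"

    # Step 2: the flag may be improperly set, so look at every PPS
    # to see whether baseline-only tools are really in use.
    uses_fmo = any(p.num_slice_groups_minus1 > 0 for p in pps_list)
    if uses_fmo or uses_aso:
        return "software"  # engine that decodes baseline, main and higher

    # Baseline flag set, but no FMO or ASO: hardware decoding is still possible.
    return "hardware"
```

For example, a clip whose SPS reports baseline profile but whose PPS's all have a single slice group would be routed to the hardware engine under these assumptions.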
IV. Exemplary Video Decoder Engine
FIG. 6 is a schematic diagram of a general video decoder engine 600 that can perform decoding using any of the embodiments described herein. Alternative decoder engines can be used, and FIG. 6 is merely an example. The decoder engine 600 receives information 695 for a compressed sequence of video frames (e.g., via a compressed video bitstream) and produces output including a reconstructed block 605. Particular embodiments of video decoders can use a variation or supplemented version of the generalized decoder 600.
The decoder engine 600 decompresses blocks coded using inter-prediction and/or intra-prediction. For the sake of presentation, FIG. 6 shows a path for intra-coded blocks through the decoder engine 600 (shown as the intra block path) and a path for inter-coded blocks (shown as the inter block path). Many of the components of the decoder engine 600 are used for decompressing both inter-coded and intra-coded blocks. The exact operations performed by those components can vary depending on the type of information being decompressed.
A buffer 690 receives the information 695 for the compressed video sequence and makes the received information available to the entropy decoder 680. The buffer 690 typically receives the information at a rate that is fairly constant over time. The buffer 690 can include a playback buffer and other buffers as well. Alternatively, the buffer 690 receives information at a varying rate.
The entropy decoder 680 entropy decodes entropy-coded quantized data as well as entropy-coded side information (e.g., motion information 615, flags, modes, syntax elements, and other side information), typically applying the inverse of the entropy encoding performed in the encoder. For example, the entropy decoder 680 can use any of the disclosed counter-based variable length coding techniques described below to perform decoding (e.g., decoding of syntax elements). An inverse quantizer 670 inverse quantizes entropy-decoded data. An inverse frequency transformer 660 converts the quantized, frequency domain data into spatial domain video information by applying an inverse transform such as an inverse frequency transform.
If the block 605 to be reconstructed is an inter-coded block using forward-prediction, a motion compensator 630 applies motion information 615 (e.g., predicted motion information) to a reference frame 625 to form a prediction 635 of the block 605 being reconstructed. A buffer (store) 620 stores previous reconstructed frames for use as reference frames. Alternatively, a motion compensator applies other types of motion compensation. The prediction by the motion compensator is rarely perfect, so the decoder 600 also reconstructs a prediction residual 645 to be added to the prediction 635 to reconstruct block 605.
When the decoder needs a reconstructed frame for subsequent motion compensation, the store 620 buffers the reconstructed frame for use in predicting a subsequent frame. In some implementations of predicting a frame, the frame is predicted on a block-by-block basis (as illustrated) and respective blocks of the frame can be predicted. One or more of the predicted blocks can be predicted using motion information from blocks in the same frame or one or more blocks of a different frame.
If the block 605 to be reconstructed is an intra-coded block, an intra-predictor 655 forms a prediction 665 of the block 610 being reconstructed. The buffer (store) 620 stores previous reconstructed blocks and frames. The prediction by the intra-predictor is rarely perfect, so the decoder 600 can also reconstruct a prediction residual 675 to be added to the prediction 665 to reconstruct block 610.
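The final reconstruction step on both the inter and intra paths, adding the decoded residual to the prediction, can be illustrated with a small sketch. The function name `reconstruct_block`, the list-of-rows block representation, and the clipping of results to the sample range of an assumed 8-bit depth are assumptions made for illustration and are not taken from the description above.

```python
from typing import List

Block = List[List[int]]  # rows of sample values


def reconstruct_block(prediction: Block, residual: Block, bit_depth: int = 8) -> Block:
    """Add the prediction residual to the prediction, sample by sample.

    Results are clipped to the valid sample range for the assumed bit depth.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

With a prediction of `[[100, 250], [5, 0]]` and a residual of `[[-3, 10], [-9, 4]]`, the sketch yields `[[97, 255], [0, 4]]`, with the second and third samples clipped to the 8-bit range.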
Although a particular decoder engine is described, a wide variety of decoder structures can be used, as the type of decoder engine is a matter of design choice, and depends on the particular application.
V. Optimized Bitstream Parsers Using Metadata
FIG. 7 is a flowchart of a method for implementing decoder optimization using metadata. For purposes of illustration, the flowchart of FIG. 7 is described in relation to an H.264/AVC decoder. However, it is generally understood that the method can apply to other decoder types. In general, an H.264/AVC decoder accepts a bitstream with start code 0x00 00 01 at the beginning of each network abstraction layer unit (NALU). The MPEG-4 file format indicates the length of each NALU and sends one picture per sample to the H.264/AVC decoder. When the NALU length information is available, hardware accelerated decoding can completely skip start code scanning, and send NALU's one by one directly to hardware.
In process block 700, decoder decision logic (such as shown at 506, FIG. 5), retrieves length and location information associated with the NALU. Such information is metadata and can be found in the file container associated with the incoming bitstream. In process block 702, the length and location information is passed to the appropriate decoder engine that was selected, as previously described. Alternatively, the method can be applied to decoders having a single decode engine. In process block 704, the decoder engine can use the length and location metadata information to decode the bitstream without scanning for start codes. The location information describes the position in the bitstream wherein the data starts. The length information provides where the data ends, relative to the start. Substantial savings in CPU cycles is achieved by eliminating the need for start-code scanning because the location and length information is already provided.
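As a rough illustration of how length and location metadata removes the need for start-code scanning, the sketch below walks a sample in which each NALU is preceded by a big-endian length field, as is common for H.264 carried in MPEG-4 files. The function name `iter_nalus` and the default 4-byte length field are assumptions; in practice the field size comes from the container's decoder configuration.

```python
from typing import Iterator, Tuple


def iter_nalus(sample: bytes, length_size: int = 4) -> Iterator[Tuple[int, int]]:
    """Yield (location, length) for each NALU in a length-prefixed sample.

    No byte-by-byte start-code search is performed: each length field
    tells the parser exactly where the next NALU begins and ends.
    """
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NALU length field")
        nalu_len = int.from_bytes(sample[pos:pos + length_size], "big")
        location = pos + length_size
        if location + nalu_len > len(sample):
            raise ValueError("NALU length exceeds sample size")
        yield location, nalu_len
        pos = location + nalu_len  # jump straight to the next length field
```

Each yielded pair can be handed to the selected decoder engine, which reads `sample[location:location + length]` directly.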
FIG. 8 shows an embodiment of an additional method wherein an emulation prevention byte is analyzed. Software decoding of H.264/AVC videos can remove emulation prevention byte 0x03 in the compressed bitstream to achieve efficient entropy decoding. That is, it can be more efficient for software CABAC/CAVLC decoding not to detect an emulation prevention byte in the process of entropy decoding. The bitstream parser in software decoding is designed to perform start code parsing and emulation prevention byte removal at substantially the same time. On the other hand, hardware accelerator decoding sometimes does not need to remove emulation prevention byte 0x03 from a compressed bitstream. A different bitstream parser can be designed to scan for start codes only, and to detect and remove emulation prevention byte 0x03 in flight (dynamically) while the bitstream parser decodes slice headers, sequence parameter set (SPS), picture parameter set (PPS), and supplemental enhancement information (SEI).
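Emulation prevention byte removal itself can be sketched in a few lines. In H.264, a 0x03 byte that follows two consecutive 0x00 bytes inside a NALU is an emulation prevention byte and is dropped to recover the original payload. The function name `remove_emulation_prevention` is illustrative, and the sketch processes a single NALU that has already been located.

```python
def remove_emulation_prevention(nalu: bytes) -> bytes:
    """Drop each 0x03 byte that immediately follows two 0x00 bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes seen just before this byte
    for b in nalu:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # the emulation prevention byte is skipped, not copied
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)
```

A parser that removes these bytes only while decoding headers and parameter sets could apply the same loop to just those NALU types and pass slice data through untouched.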
In decision block 800, a determination is made whether length and location information are available in metadata. If yes, then in process block 802, the length and location of the start code are passed to the appropriate decoder engine so that the decoder can avoid start-code scanning. If not, then the start codes are searched so that length and location information can be determined (process block 804). The searching can be performed by scanning the bit stream and comparing each byte to a start code to find the location. The length can be determined by searching for an end code and counting the bytes between the start and end codes. In decision block 806, a determination is made if a software decoder engine (e.g., CPU) is used or a decoder engine which needs optimization by removal of the emulation prevention byte. If either of these decoder engines is used, then in process block 810, the emulation prevention byte is removed from the bitstream. In process block 812, the encoded data with the emulation prevention byte removed is sent to the decoder engine, which is capable of decoding baseline profile, main and higher. The length and location information can also be sent to the decoder. If decision block 806 is answered in the negative, then in process block 816, the encoded data is sent to the decoder capable of decoding main profile and higher with the emulation prevention byte together with the length and location information. Thus, when the NALU length information is available, hardware accelerated decoding can completely skip start code scanning, and send NALU's one by one directly to hardware. The optimized bitstream parser achieves a substantial gain in CPU usage, especially for hardware accelerated video decoding on low-end machines.
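When length and location metadata are not available (process block 804), the fallback is to scan for start codes. The sketch below is one way this might look. It assumes a start-code-delimited stream in which each NALU ends where the next start code begins or at the end of the stream, and it strips trailing zero bytes on the assumption that a NALU never ends in 0x00, so such bytes belong to a following four-byte start code or to padding. The function name `find_nalus` is illustrative.

```python
from typing import List, Tuple

START_CODE = b"\x00\x00\x01"


def find_nalus(stream: bytes) -> List[Tuple[int, int]]:
    """Return (location, length) of each NALU found by start-code scanning."""
    # Locate every start code in the stream.
    starts = []
    i = stream.find(START_CODE)
    while i != -1:
        starts.append(i)
        i = stream.find(START_CODE, i + len(START_CODE))

    nalus = []
    for n, sc in enumerate(starts):
        begin = sc + len(START_CODE)
        end = starts[n + 1] if n + 1 < len(starts) else len(stream)
        # Trailing zeros are part of the next start code or padding (assumption).
        while end > begin and stream[end - 1] == 0x00:
            end -= 1
        nalus.append((begin, end - begin))
    return nalus
```

The resulting pairs have the same shape as the metadata-derived length and location information, so either source can feed the same downstream decoder engine.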
Any of the disclosed methods can be implemented as computer-executable instructions stored on one or more computer-readable storage media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (such as DRAM or SRAM), or nonvolatile memory components (such as hard drives)) and executed on a computer (e.g., any commercially available computer, including smart phones or other mobile devices that include computing hardware). Any of the computer-executable instructions for implementing the disclosed techniques as well as any data created and used during implementation of the disclosed embodiments can be stored on one or more computer-readable media (e.g., non-transitory computer-readable media). The computer-executable instructions can be part of, for example, a dedicated software application or a software application that is accessed or downloaded via a web browser or other software application (such as a remote computing application). Such software can be executed, for example, on a single local computer (e.g., any suitable commercially available computer) or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a client-server network (such as a cloud computing network), or other such network) using one or more network computers.
For clarity, only certain selected aspects of the software-based implementations are described. Other details that are well known in the art are omitted. For example, it should be understood that the disclosed technology is not limited to any specific computer language or program. For instance, the disclosed technology can be implemented by software written in C++, Java, Perl, JavaScript, Adobe Flash, or any other suitable programming language. Likewise, the disclosed technology is not limited to any particular computer or type of hardware. Certain details of suitable computers and hardware are well known and need not be set forth in detail in this disclosure.
Furthermore, any of the software-based embodiments (comprising, for example, computer-executable instructions for causing a computer to perform any of the disclosed methods) can be uploaded, downloaded, or remotely accessed through a suitable communication means. Such suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only preferred examples of the invention and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope of these claims.

Claims (20)

We claim:
1. In a computing device that implements a video decoder, a method comprising:
receiving an encoded video sequence with a file container;
by the video decoder, detecting a flag within the video sequence indicating that a decoder engine is required that is capable of decoding a baseline profile;
with the computing device that implements the video decoder, analyzing metadata associated with the encoded video sequence in the file container;
based on the metadata indicating that the decoder engine that is capable of decoding a baseline profile is not required, determining that the flag is improperly set; and
using the metadata to make decoder optimization decisions in the video decoder, the decoder optimization decisions including selecting a decoder that is not capable of decoding a baseline profile in accordance with the metadata and ignoring that the flag is set.
2. The method of claim 1, wherein the decoder optimization decisions include choosing a decoder engine, based on the metadata, to perform the decoding from a plurality of decoder engines.
3. The method of claim 2, wherein the plurality of decoder engines are chosen from a list including of one of the following: a decoder engine capable of decoding a video sequence of main profile and higher profiles and a decoder engine capable of decoding baseline, main and higher profiles.
4. The method of claim 3, wherein the decoder engine capable of decoding a video sequence of main profile and higher profiles includes a graphics processing unit for hardware acceleration and the decoder engine capable of decoding baseline, main and higher profiles includes a central processing unit.
5. The method of claim 1, further including searching the metadata for a type of algorithm used in the encoding and choosing a decoding engine based on the type of algorithm.
6. The method of claim 5, wherein the types of algorithms include arbitrary slice ordering (ASO) and flexible macroblock ordering (FMO).
7. The method of claim 5, wherein searching the metadata includes parsing the one or more parameter sets in the encoded video sequence.
8. The method of claim 1, further including retrieving length information and location information associated with the encoded video sequence and passing the length and location information to a decoder engine as the metadata.
9. The method of claim 1, wherein analyzing the metadata further includes scanning the encoded video sequence for at least one start code and end code and calculating length information based on the at least one start and end codes.
10. The method of claim 9, further including removing an emulation prevention byte from the metadata before passing the encoded video sequence to the decoder engine.
11. The method of claim 1, wherein the decoder optimization decision includes suppressing start-code scanning based on the metadata.
12. A computer-readable nonvolatile storage device having encoded therein computer-executable instructions for causing a computing device programmed thereby to perform a method comprising:
receiving an encoded video sequence in a decoder, the encoded video sequence including a flag indicating use of a decoder engine capable of decoding a baseline profile;
analyzing metadata in the encoded video sequence to determine which of at least two decoder engines would be more efficient, the metadata indicating use of a decoder engine that is not capable of decoding a baseline profile, which indication is opposite to that which the flag indicated; and
forwarding the encoded video sequence to the determined decoder engine based on the metadata, wherein the determined decoder engine is not capable of decoding a baseline profile.
13. The computer-readable nonvolatile storage device of claim 12, wherein the plurality of decoder engines are one of the following: a decoder engine capable of decoding a video sequence of main profile and higher profiles and a decoder engine capable of decoding baseline, main and higher profiles.
14. The computer-readable nonvolatile storage device of claim 13, wherein the decoder engine capable of decoding a video sequence of main profile and higher profiles includes a graphics processing unit for hardware acceleration and the decoder engine capable of decoding baseline, main and higher profiles includes a central processing unit.
15. The computer-readable nonvolatile storage device of claim 12, further including searching the metadata for a type of algorithm used in the encoding and choosing a decoding engine based on the type of algorithm.
16. The computer-readable nonvolatile storage device of claim 15, wherein the types of algorithms include arbitrary slice ordering (ASO) and flexible macroblock ordering (FMO) and if either of these algorithms are used then choosing the decoder engine that is capable of decoding baseline, main and higher profiles.
17. The computer-readable nonvolatile storage device of claim 15, wherein searching the metadata includes parsing the one or more parameter sets in the encoded video sequence.
18. A computing device that implements a video decoder, the computing device comprising one or more processing units and being adapted to perform a method comprising:
receiving encoded data in a bitstream for a video sequence, wherein the bitstream includes one or more parameter sets included in metadata, the one or more parameter sets including information indicating that a decoder engine is required capable of decoding a video sequence with a baseline profile;
parsing the one or more parameter sets to determine whether arbitrary slice ordering or flexible macroblock ordering are used;
if arbitrary slice ordering or flexible macroblock ordering are used, forwarding the encoded data to a first decoder engine capable of decoding a video sequence of baseline, main and higher profiles;
if the arbitrary slice ordering or flexible macroblock ordering are not used, forwarding the encoded data to a second decoder engine, different than the first decoder engine, capable of decoding main profile and higher profiles, but not a baseline profile, irrespective of the one or more parameter sets including information indicating that a decoder engine is required that is capable of decoding a video sequence with a baseline profile.
19. The computing device of claim 18, further including suppressing start-code scanning based on the metadata.
20. The computing device of claim 18, further including removing an emulation prevention byte from the metadata.
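The steps recited in the claims above (start-code scanning with length calculation, emulation prevention byte removal, and engine selection based on ASO/FMO use) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The names `find_units`, `strip_emulation_prevention`, `StreamMetadata`, and `choose_engine` are hypothetical. The three-byte start code 0x000001 and the 0x03 emulation prevention byte are assumed from the H.264 byte-stream convention. The ASO and FMO flags are assumed to have been extracted from the parameter sets already, so no parameter-set parsing is shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

START_CODE = b"\x00\x00\x01"  # assumed H.264 byte-stream start code prefix


def find_units(stream: bytes) -> List[Tuple[int, int]]:
    """Scan for start codes and return (payload_offset, length) per unit."""
    starts = []
    pos = stream.find(START_CODE)
    while pos != -1:
        starts.append(pos)
        pos = stream.find(START_CODE, pos + len(START_CODE))
    units = []
    for i, start in enumerate(starts):
        begin = start + len(START_CODE)
        end = starts[i + 1] if i + 1 < len(starts) else len(stream)
        # Drop zero padding before the next start code (e.g. the leading
        # zero of a four-byte start code); assumes payloads do not end in 0x00.
        while end > begin and stream[end - 1] == 0:
            end -= 1
        units.append((begin, end - begin))
    return units


def strip_emulation_prevention(payload: bytes) -> bytes:
    """Remove each 0x03 byte that follows two consecutive 0x00 bytes."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte == 3:
            zeros = 0
            continue  # skip the emulation prevention byte
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


@dataclass
class StreamMetadata:
    """Hypothetical summary of what the parameter sets indicate."""
    signals_baseline: bool  # the stream claims to need a baseline decoder
    uses_aso: bool          # arbitrary slice ordering is used
    uses_fmo: bool          # flexible macroblock ordering is used


def choose_engine(meta: StreamMetadata) -> str:
    """Pick an engine from actual tool use, not from the signalled profile."""
    if meta.uses_aso or meta.uses_fmo:
        # First engine: decodes baseline, main and higher profiles (CPU).
        return "cpu_engine"
    # Second engine: main and higher profiles only (GPU-accelerated),
    # chosen irrespective of meta.signals_baseline.
    return "gpu_engine"
```

For example, a stream that signals a baseline profile but uses neither ASO nor FMO would be routed to the hypothetical `gpu_engine`, which mirrors the "irrespective of" limitation in claim 18.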
US13/399,769 2012-02-17 2012-02-17 Metadata assisted video decoding Active 2033-04-26 US9241167B2 (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
US13/399,769 US9241167B2 (en) 2012-02-17 2012-02-17 Metadata assisted video decoding
CN201380009666.7A CN104106264B (en) 2012-02-17 2013-02-13 Metadata assisted video decoding
PCT/US2013/026004 WO2013123100A1 (en) 2012-02-17 2013-02-13 Metadata assisted video decoding
EP13749713.7A EP2815574B1 (en) 2012-02-17 2013-02-13 Metadata assisted video decoding
JP2014557749A JP2015513386A (en) 2012-02-17 2013-02-13 Metadata assisted video decoding
KR1020147022904A KR102006044B1 (en) 2012-02-17 2013-02-13 Metadata assisted video decoding
US14/979,277 US9807409B2 (en) 2012-02-17 2015-12-22 Metadata assisted video decoding
JP2017199533A JP6423061B2 (en) 2012-02-17 2017-10-13 Computing device and method for implementing video decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/399,769 US9241167B2 (en) 2012-02-17 2012-02-17 Metadata assisted video decoding

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/979,277 Continuation US9807409B2 (en) 2012-02-17 2015-12-22 Metadata assisted video decoding

Publications (2)

Publication Number Publication Date
US20130215978A1 (en) 2013-08-22
US9241167B2 (en) 2016-01-19

Family

ID=48982247

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/399,769 Active 2033-04-26 US9241167B2 (en) 2012-02-17 2012-02-17 Metadata assisted video decoding
US14/979,277 Active US9807409B2 (en) 2012-02-17 2015-12-22 Metadata assisted video decoding

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/979,277 Active US9807409B2 (en) 2012-02-17 2015-12-22 Metadata assisted video decoding

Country Status (6)

Country Link
US (2) US9241167B2 (en)
EP (1) EP2815574B1 (en)
JP (2) JP2015513386A (en)
KR (1) KR102006044B1 (en)
CN (1) CN104106264B (en)
WO (1) WO2013123100A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2885911B1 (en) * 2013-03-28 2021-03-10 Irdeto B.V. Processing digital content
US9911460B2 (en) 2014-03-24 2018-03-06 Microsoft Technology Licensing, Llc Fast and smart video trimming at frame accuracy on generic platform
US9516147B2 (en) 2014-10-30 2016-12-06 Microsoft Technology Licensing, Llc Single pass/single copy network abstraction layer unit parser
US9854261B2 (en) * 2015-01-06 2017-12-26 Microsoft Technology Licensing, Llc. Detecting markers in an encoded video signal
US9979983B2 (en) * 2015-03-16 2018-05-22 Microsoft Technology Licensing, Llc Application- or context-guided video decoding performance enhancements
US10129566B2 (en) * 2015-03-16 2018-11-13 Microsoft Technology Licensing, Llc Standard-guided video decoding performance enhancements
US11265580B2 (en) * 2019-03-22 2022-03-01 Tencent America LLC Supplemental enhancement information messages for neural network based video post processing
US11356706B2 (en) 2020-01-08 2022-06-07 Qualcomm Incorporated Storage and delivery of video data for video coding
US11831920B2 (en) * 2021-01-08 2023-11-28 Tencent America LLC Method and apparatus for video coding
CN115866254A (en) * 2022-11-24 2023-03-28 亮风台(上海)信息科技有限公司 Method and equipment for transmitting video frame and camera shooting parameter information

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7613727B2 (en) * 2002-02-25 2009-11-03 Sony Corporation Method and apparatus for supporting advanced coding formats in media files
US7865581B2 (en) * 2005-04-15 2011-01-04 Thomson Licensing Remote management method of a distant device, and corresponding video device
US20070022215A1 (en) * 2005-07-19 2007-01-25 Singer David W Method and apparatus for media data transmission
US8436889B2 (en) * 2005-12-22 2013-05-07 Vidyo, Inc. System and method for videoconferencing using scalable video coding and compositing scalable video conferencing servers
US8619865B2 (en) * 2006-02-16 2013-12-31 Vidyo, Inc. System and method for thinning of scalable video coding bit-streams
US8428125B2 (en) * 2006-12-22 2013-04-23 Qualcomm Incorporated Techniques for content adaptive video frame slicing and non-uniform access unit coding
US8411734B2 (en) * 2007-02-06 2013-04-02 Microsoft Corporation Scalable multi-thread video decoding
JP4691062B2 (en) 2007-03-30 2011-06-01 株式会社東芝 Information processing device
US9848209B2 (en) * 2008-04-02 2017-12-19 Microsoft Technology Licensing, Llc Adaptive error detection for MPEG-2 error concealment
US20090290645A1 (en) * 2008-05-21 2009-11-26 Broadcast International, Inc. System and Method for Using Coded Data From a Video Source to Compress a Media Signal
US8867605B2 (en) 2008-10-14 2014-10-21 Nvidia Corporation Second deblocker in a decoding pipeline
US20100226444A1 (en) * 2009-03-09 2010-09-09 Telephoto Technologies Inc. System and method for facilitating video quality of live broadcast information over a shared packet based network
JP5470405B2 (en) * 2009-12-28 2014-04-16 パナソニック株式会社 Image coding apparatus and method
US20120079054A1 (en) * 2010-03-24 2012-03-29 General Instrument Corporation Automatic Memory Management for a Home Transcoding Device
JP2012085211A (en) * 2010-10-14 2012-04-26 Sony Corp Image processing device and method, and program
US9338465B2 (en) * 2011-06-30 2016-05-10 Sharp Kabushiki Kaisha Context initialization based on decoder picture buffer
US8891630B2 (en) * 2011-10-24 2014-11-18 Blackberry Limited Significance map encoding and decoding using partition set based context assignment
US11240515B2 (en) * 2012-09-10 2022-02-01 Apple Inc. Video display preference filtering

Patent Citations (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6311204B1 (en) * 1996-10-11 2001-10-30 C-Cube Semiconductor Ii Inc. Processing system with register-based process sharing
US20030063676A1 (en) * 2000-04-17 2003-04-03 Pulsent Corporation Decoder for decoding segment-based encoding of video data using segmentation performed at a decoder
US20040196902A1 (en) * 2001-08-30 2004-10-07 Faroudja Yves C. Multi-layer video compression system with synthetic high frequencies
US20050123274A1 (en) * 2003-09-07 2005-06-09 Microsoft Corporation Signaling coding and display options in entry point headers
US20050152448A1 (en) * 2003-09-07 2005-07-14 Microsoft Corporation Signaling for entry point frames with predicted first field
US20050259729A1 (en) * 2004-05-21 2005-11-24 Shijun Sun Video coding with quality scalability
US20060013305A1 (en) * 2004-07-14 2006-01-19 Sharp Laboratories Of America, Inc. Temporal scalable coding using AVC coding tools
US20070183508A1 (en) * 2004-08-05 2007-08-09 Shintaro Kudo Image decoding device and image encoding device
US20060126726A1 (en) 2004-12-10 2006-06-15 Lin Teng C Digital signal processing structure for decoding multiple video standards
US20060215754A1 (en) 2005-03-24 2006-09-28 Intel Corporation Method and apparatus for performing video decoding in a multi-thread environment
US20070121728A1 (en) * 2005-05-12 2007-05-31 Kylintv, Inc. Codec for IPTV
US20130308707A1 (en) * 2005-09-27 2013-11-21 Qualcomm Incorporated Methods and device for data alignment with time domain boundary
US7439883B1 (en) 2006-02-10 2008-10-21 Nvidia Corporation Bitstream generation for VLC encoded data
CN101115195A (en) 2006-07-24 2008-01-30 同济大学 Macroblock grade coupled decoding and loop filtering method and apparatus for video code stream
US20130121669A1 (en) * 2006-09-07 2013-05-16 Opentv, Inc. Systems and methods to position and play content
US20080189660A1 (en) * 2006-12-08 2008-08-07 Masao Nakagawa Information Processing Apparatus, Information Processing Method, and Information Processing Program
US20130166580A1 (en) * 2006-12-13 2013-06-27 Quickplay Media Inc. Media Processor
US20110225417A1 (en) * 2006-12-13 2011-09-15 Kavi Maharajh Digital rights management in a mobile environment
US20090002379A1 (en) 2007-06-30 2009-01-01 Microsoft Corporation Video decoding implementations for a graphics processing unit
US20090003446A1 (en) 2007-06-30 2009-01-01 Microsoft Corporation Computing collocated macroblock information for direct mode macroblocks
US20090003447A1 (en) 2007-06-30 2009-01-01 Microsoft Corporation Innovations in video decoder implementations
US20090196352A1 (en) * 2008-01-31 2009-08-06 Yosef Stein Video decoder system and method with video enhancement using direct contrast enhancement in the spatial domain
US20090228601A1 (en) * 2008-03-06 2009-09-10 Yi-Chen Tseng Method and apparatus for processing audio/video bit-stream
US20090305680A1 (en) * 2008-04-03 2009-12-10 Swift Roderick D Methods and apparatus to monitor mobile devices
US20090323813A1 (en) 2008-06-02 2009-12-31 Maciel De Faria Sergio Manuel Method to transcode h.264/avc video frames into mpeg-2 and device
KR20100000066A (en) 2008-06-24 2010-01-06 (주)휴맥스 홀딩스 Reconfigurable avc adaptive video decoder and method thereof
US20100027974A1 (en) * 2008-07-31 2010-02-04 Level 3 Communications, Inc. Self Configuring Media Player Control
US20100046631A1 (en) 2008-08-19 2010-02-25 Qualcomm Incorporated Power and computational load management techniques in video processing
US20100135383A1 (en) 2008-11-28 2010-06-03 Microsoft Corporation Encoder with multiple re-entry and exit points
US20100189182A1 (en) 2009-01-28 2010-07-29 Nokia Corporation Method and apparatus for video coding and decoding
US20100195721A1 (en) 2009-02-02 2010-08-05 Microsoft Corporation Local picture identifier and computation of co-located information
KR20100109333A (en) 2009-03-31 2010-10-08 삼성전자주식회사 Method and apparatus for transmitting compressed data using digital data interface, and method and apparatus for receiving the same
CN101577110A (en) 2009-05-31 2009-11-11 腾讯科技(深圳)有限公司 Method for playing videos and video player
US20120147141A1 (en) * 2009-07-10 2012-06-14 Taiji Sasaki Recording medium, playback device, and integrated circuit
US20110058792A1 (en) 2009-09-10 2011-03-10 Paul Towner Video Format for Digital Video Recorder
US20110080425A1 (en) * 2009-10-05 2011-04-07 Electronics And Telecommunications Research Institute System for providing multi-angle broadcasting service
US20120317305A1 (en) * 2010-02-19 2012-12-13 Telefonaktiebolaget Lm Ericsson (Publ) Method and Arrangement for Representation Switching in HTTP Streaming
US20110261885A1 (en) * 2010-04-27 2011-10-27 De Rivaz Peter Francis Chevalley Method and system for bandwidth reduction through integration of motion estimation and macroblock encoding
US20120044153A1 (en) * 2010-08-19 2012-02-23 Nokia Corporation Method and apparatus for browsing content files
US20120066589A1 (en) * 2010-09-13 2012-03-15 Santos Jair F Teixeira Dos Content placement
CN101986708A (en) 2010-10-29 2011-03-16 北京中星微电子有限公司 Video decoding method and decoder
US20120236939A1 (en) * 2011-03-15 2012-09-20 Broadcom Corporation Sub-band video coding architecture for packet based transmission
US20130007827A1 (en) * 2011-06-30 2013-01-03 Samsung Electronics Co., Ltd. Receiving a broadcast stream
US20130235152A1 (en) * 2011-08-31 2013-09-12 Nokia Corporation Video Coding and Decoding
US20130128947A1 (en) * 2011-11-18 2013-05-23 At&T Intellectual Property I, L.P. System and method for automatically selecting encoding/decoding for streaming media
US20130148947A1 (en) * 2011-12-13 2013-06-13 Ati Technologies Ulc Video player with multiple grpahics processors
US20130156101A1 (en) * 2011-12-16 2013-06-20 Microsoft Corporation Hardware-accelerated decoding of scalable video bitstreams
US20130188685A1 (en) * 2012-01-19 2013-07-25 Panasonic Corporation Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US20130195183A1 (en) * 2012-01-31 2013-08-01 Apple Inc. Video coding efficiency with camera metadata

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
"First Office Action and Search Report issued in Chinese Patent Application No. 201380009666.7," Mailed Jun. 2, 2015, 14 pages.
Chen et al., "Implementation of H.264 Encoder and Decoder on Personal Computers," Journal of Visual Communication and Image Representation, 2006, 19 pages.
Extended European Search Report for EP 13 74 9713, dated Sep. 4, 2015, 8 pages.
International Search Report and Written Opinion for PCT/US2013/026004, dated Jun. 2, 2013, 9 pages.
Qiu et al., "An Architecture for Programmable Multi-core IP Accelerated Platform with an Advanced Application of H.264 Codec Implementation," Journal of Signal Processing Systems, vol. 57, No. 2, Nov. 2009, pp. 123-137.
Sullivan, "DirectX Video Acceleration Specification for H.264/AVC Decoding," Dec. 2010, 66 pages.
Sullivan, "DirectX Video Acceleration Specification for Windows Media Video v8, v9 and vA Decoding (Including SMPTE 421M 'VC-1')," Aug. 2010, 102 pages.
Wiegand, Thomas, and Gary J. Sullivan. "The H.264/AVC video coding standard." IEEE Signal Processing Magazine 24.2 (2007), pp. 148-153. *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10271069B2 (en) 2016-08-31 2019-04-23 Microsoft Technology Licensing, Llc Selective use of start code emulation prevention
US10362241B2 (en) 2016-12-30 2019-07-23 Microsoft Technology Licensing, Llc Video stream delimiter for combined frame
US20200014906A1 (en) * 2018-07-06 2020-01-09 Mediatek Singapore Pte. Ltd. Methods and apparatus for immersive media content overlays
WO2020010298A1 (en) * 2018-07-06 2020-01-09 Mediatek Singapore Pte. Ltd. Methods and apparatus for immersive media content overlays
US10931930B2 (en) * 2018-07-06 2021-02-23 Mediatek Singapore Pte. Ltd. Methods and apparatus for immersive media content overlays
EP4398488A3 (en) * 2018-10-23 2024-07-31 Tencent America LLC Techniques for multiple conformance points in media coding

Also Published As

Publication number Publication date
WO2013123100A1 (en) 2013-08-22
KR20140123957A (en) 2014-10-23
JP2015513386A (en) 2015-05-11
US20160219288A1 (en) 2016-07-28
US20130215978A1 (en) 2013-08-22
US9807409B2 (en) 2017-10-31
CN104106264A (en) 2014-10-15
CN104106264B (en) 2017-05-17
JP6423061B2 (en) 2018-11-14
EP2815574A1 (en) 2014-12-24
EP2815574B1 (en) 2019-07-24
KR102006044B1 (en) 2019-07-31
JP2018026871A (en) 2018-02-15
EP2815574A4 (en) 2015-10-07

Similar Documents

Publication Publication Date Title
US9807409B2 (en) Metadata assisted video decoding
US12010338B2 (en) Representative motion information for temporal motion prediction in video encoding and decoding
TWI603609B (en) Constraints and unit types to simplify video random access
US10489426B2 (en) Category-prefixed data batching of coded media data in multiple categories
US9271013B2 (en) Merge mode for motion information prediction
US20150146794A1 (en) Decoding For High Efficiency Video Transcoding
US10298931B2 (en) Coupling sample metadata with media samples
TW201444351A (en) Conditional signaling of picture order count timing information for video timing in video coding
US8121189B2 (en) Video decoding using created reference pictures
US12041252B2 (en) Multi-threaded CABAC decoding

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WU, YONGJUN;SADHWANI, SHYAM;THUMPUDI, NAVEEN;SIGNING DATES FROM 20120229 TO 20120301;REEL/FRAME:027876/0418

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0541

Effective date: 20141014

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8