US20130117418A1 - Hybrid platform for content delivery and transcoding - Google Patents
- Publication number
- US20130117418A1 (application US13/667,267)
- Authority
- US
- United States
- Prior art keywords
- file
- content
- transcoding
- server
- proxy servers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234309—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/756—Media network packet handling adapting media to device capabilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/177—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
Definitions
- This disclosure relates generally to computer systems for processing of media files, and other content, using distributed computing techniques.
- Content providers such as large-scale broadcasters, film distributors, and the like desire to distribute their content online in a manner that complements traditional mediums such as broadcast TV (including high definition or “HD” television) and DVD. It is important to them to have the ability to distribute content to a wide variety of third-party client application/device formats, and to offer a quality viewing experience regardless of network conditions, using modern technologies like adaptive bitrate streaming. Notably, since Internet-based content delivery is no longer limited to fixed line environments such as the desktop, and more and more end users now use mobile devices to receive and view content in wireless environments, the ability to support new client device formats and new streaming technologies is particularly important.
- a media file may be single-media content (e.g., audio-only media) or the media file may comprise multiple media types, i.e., a multimedia file with audio/video data.
- a given multimedia file is built on data in several different formats.
- the audio and video data are each encoded using appropriate codecs, which are algorithms that encode and compress that data.
- Example codecs include H.264, VP6, AAC, MP3, etc.
- A container or package format functions as a wrapper and describes the data elements and metadata of the multimedia file, so that a client application knows how to play it.
- Example container formats include Flash, Silverlight, MP4, PIFF, and MPEG-TS.
- The bit rate at which to encode the audio and video data must also be selected.
- An encoding with a lower bitrate and smaller frame size (among other factors) generally will be easier to stream reliably, since the amount of data will be smaller, but the quality of the experience will suffer.
- Conversely, an encoding at a higher bitrate and a larger frame size will provide a higher-quality experience, but is more likely to lead to interrupted and/or poor-quality streams due to network delivery issues.
- Current adaptive bitrate streaming technologies require multiple streams each encoded at a different bitrate, allowing the client and/or server to switch between streams in order to compensate for network congestion.
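The stream-switching idea can be illustrated with a minimal sketch. The function name, the safety margin, and the bitrate ladder below are assumptions made for illustration; the source does not specify a selection rule.

```python
def pick_rendition(bitrates_kbps, measured_kbps, safety=0.8):
    """Pick the highest encoded bitrate that fits the measured throughput.

    `safety` is an assumed headroom factor; the source does not define one.
    Falls back to the lowest bitrate when nothing fits.
    """
    budget = measured_kbps * safety
    usable = [b for b in sorted(bitrates_kbps) if b <= budget]
    return usable[-1] if usable else min(bitrates_kbps)


ladder = [500, 1500, 3000]  # hypothetical set of streams, one per bitrate
choice = pick_rendition(ladder, measured_kbps=2000)
```

A client or server would re-run a selection like this as throughput measurements change, which is why every bitrate in the ladder has to exist as a separately encoded stream.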
- content providers typically must create many different versions of their content. For example, they often will create multiple copies of a given movie title at different screen sizes, bit rates, quality levels and client player formats. Furthermore, over time they may want to change formats, for example by updating the encoding (e.g., to take advantage of newer codecs that compress content more efficiently). They may also need to change the container format to accommodate new client environments, a process often referred to as transmuxing. Failing to provide certain bit rates or poor encoding practices will likely reduce the quality of the stream. But generating so many different versions of content, as well as converting from one to another and storing them, is a time-consuming and costly process that is difficult to manage.
- a “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure.
- content delivery means the storage, caching, or transmission of content, streaming media and applications on behalf of content providers, including ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence.
- a content delivery network such as that just described typically supports different content formats, and offers many advantages for accelerating the delivery of content, once created.
- the content provider still faces the problem of creating and managing the creation of all of the various versions of content that it desires and/or that are necessary.
- the subject matter herein generally relates to transcoding content, typically audio/video files though not limited to such.
- the transcoding is performed in preparation for online streaming or other delivery to end users.
- Such transcoding may involve converting from one format to another (e.g., converting codecs or container formats), or creating multiple versions of an original source file in different bitrates, resolutions, or otherwise, to support distribution to a wide array of devices and to utilize performance-enhancing technologies like adaptive bitrate streaming.
- This disclosure describes a transcoding platform that, in certain embodiments, leverages distributed computing techniques to transcode content in parallel across a platform of machines that are preferably idle or low-utilization resources of a content delivery network.
- the transcoding system also utilizes, in certain embodiments, improved techniques for breaking up the original source file so that different segments of the file can be sent to different machines for transcoding in parallel.
- a transcoding platform is made up of distributed transcoding resources, typically servers with available processing power and programmed to assist in the transcoding function.
- These transcoding resources may be dedicated machines as well as machines that are shared with other functions.
- the machines can be idle or low-utilization HTTP proxy servers (relative to other such proxy servers) in a content delivery network. While these machines may spend much of their time receiving and responding to client requests for content, and otherwise facilitating delivery of online content to requesting end users, at certain times (in the middle of the night in their local time zone, for example) they may be relatively lightly loaded, and hence available to perform certain transcoding tasks.
- the transcoding platform may also include a set of machine(s) that manage and coordinate the transcoding process.
- These machines may receive requests to perform a particular transcoding job, e.g., to convert a particular file from a first version to a second version.
- the request may come from a user interface (through which a content provider user of the platform uploads their content to be transcoded, for instance), from a network storage system, or from components in the content delivery network that are streaming content (e.g., that need to be able to deliver a particular format to a requesting end-user client), including one of the proxy servers.
- the transcoding job may be designated with a priority level, which may correspond semantically to a “live”, “real-time” or “batch” mode conversion.
- proxy servers are only used if the priority level is below a certain threshold because the proxy servers are considered to be unreliable for transcoding tasks.
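As a rough sketch of this gating, the priority level can decide which resource pools a job may use. The numeric ordering of the modes, the threshold value, and the pool names are assumptions; the source names the modes but does not assign values.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical ordering; the source only names the three modes.
    BATCH = 0
    REAL_TIME = 1
    LIVE = 2


# Assumed threshold: only jobs strictly below it may use shared proxy servers.
PROXY_PRIORITY_THRESHOLD = Priority.REAL_TIME


def eligible_pools(priority):
    """Return the resource pools a job of the given priority may use."""
    pools = ["dedicated"]
    if priority < PROXY_PRIORITY_THRESHOLD:
        pools.append("proxy")
    return pools
```

With these assumed values, only batch jobs would be sent to proxy servers, while live and real-time jobs stay on dedicated machines.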
- proxy servers may operate such that content delivery processes (e.g., responding to client requests) take priority over transcoding tasks when allocating processing time within the proxy server.
- a machine(s) managing the transcoding process obtains a list of candidate servers for performing transcoding tasks.
- This list may include the results of a lookup into the content delivery network's monitoring and mapping system to determine which proxy servers within the network are currently experiencing a relatively light load for content delivery services, as measured by such metrics as processor (CPU), memory, or disk utilization, and/or client request rate, etc.
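One way to picture the candidate lookup is as a filter over per-server load records. The record fields mirror the metrics named above, but the `ServerLoad` type, the limits, and the sort order are assumptions for illustration and not the monitoring system's actual interface.

```python
from dataclasses import dataclass


@dataclass
class ServerLoad:
    name: str
    cpu: float               # utilization as a fraction, 0.0 to 1.0
    memory: float
    disk: float
    requests_per_sec: float  # client request rate


def lightly_loaded(servers, max_util=0.3, max_rps=50.0):
    """Return names of servers under the assumed limits, least busy first."""
    candidates = [
        s for s in servers
        if max(s.cpu, s.memory, s.disk) <= max_util
        and s.requests_per_sec <= max_rps
    ]
    candidates.sort(key=lambda s: (s.cpu, s.requests_per_sec))
    return [s.name for s in candidates]
```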
- the management machine retrieves the file to be transcoded and breaks it up into segments suitable to be independently converted. These segments are then sent to the various transcoding resources (e.g., the proxy servers or the dedicated machines) distributed across the platform, which given the nature of the content delivery network may be global in nature. Also sent along are instructions with parameters about the desired transcode operation and/or target format.
- Each transcoding resource performs its task independently, e.g., decoding the chunk that it is given and re-encoding with the appropriate parameters. It then returns the result to the management machine(s), which reassembles the new segments into the new file.
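The split, distribute, and reassemble flow can be sketched as follows. Local threads stand in for the remote transcoding resources, and `transcode_segment` is a placeholder (it merely upper-cases bytes) rather than a real decode and re-encode step; both are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode_segment(segment, params):
    # Placeholder for the decode/re-encode work a transcoding resource would do.
    return segment.upper()


def transcode_file(segments, params, workers=4):
    """Transcode segments in parallel and reassemble them in source order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the converted
        # segments can be concatenated directly.
        results = list(pool.map(lambda seg: transcode_segment(seg, params), segments))
    return b"".join(results)
```

Keeping results in source order is what lets the managing machine rebuild the new file even when segments finish at different times.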
- the proxy servers can continue to service client requests for content (the proxy process) while performing the transcode process with residual resources.
- Because proxy servers are responsible for servicing client requests, that process typically takes priority over the transcoding process.
- the proxy server may determine that it cannot complete the transcode request and may send a message back to the management machine with an error or otherwise indicating it will not complete the transcode. Typically this would occur if the proxy server's load began to increase or to exceed a particular threshold.
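A minimal sketch of this behavior has the proxy check its load between units of transcoding work and give up once the load crosses a limit. The exception name, the threshold value, and the step-wise structure are assumptions; the source says only that the server may report that it will not complete the transcode.

```python
class TranscodeDeclined(Exception):
    """Raised so the caller can notify the management machine (hypothetical name)."""


def run_transcode(steps, current_load, load_threshold=0.7):
    """Run transcode steps in order, declining if load exceeds the threshold."""
    results = []
    for step in steps:
        # Content delivery takes priority, so check load before each step.
        if current_load() > load_threshold:
            raise TranscodeDeclined("load exceeded threshold")
        results.append(step())
    return results
```

The management machine could then hand the unfinished segment to another transcoding resource.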
- the transcoding process may involve changing any of a variety of characteristics of the file, for example and without limitation, changing a codec used to encode data in the file, changing a container format of the file, and/or changing one or more encoding parameters or container format parameters.
- the transcoding process may involve changing a bit rate of encoded data in the file, an image resolution for data in the file, a frame size for data in the file, an aspect ratio for data in the file, a compression setting used to encode data in the file, or other settings such as GoP settings, color spaces, stereo/audio settings, etc.
- the transcoding process may also involve changing other characteristics, such as an interlacing characteristic for data in the file.
- the system may be used to change or add security features to the file, e.g., by applying encryption, embedding a watermark or a fingerprint in the content, or inserting data to apply a digital rights management scheme to the file.
- the platform uses a pseudo-chunking approach for breaking up the video file to create the transcoding segments.
- the management machine(s) can be configured to be frame-aware, such that it can include “additional” frames in a given segment to enhance the ability for a given transcoding resource to transcode that segment independently of other frame information in the file. This is advantageous and sometimes necessary because the transcoding resource usually will not receive the entire original source file.
- Such pseudo-chunking techniques are useful when the transcode involves modifying the size of GoPs, the rate of keyframes in the source file is relatively high, or the source file contains so-called open GoPs, among other scenarios.
- a frame-aware segmentation process can receive a video file that is to be converted from a first version to a second version.
- the video file is typically made up of a plurality of frames organized into a plurality of groups-of-pictures (GoPs).
- the segmenter examines frames in the file to identify a given GoP and to determine the type of frames in the given GoP, and creates a segment that includes frames beyond those in the given GoP. This segment is then sent off to be independently transcoded as described above.
- the inclusion of the additional frames may occur because the segmenter determines that the given GoP cannot be divided into a whole number of target GoPs (the target GoPs representing desired GoPs for the second version and having a smaller number of frames), in which case the segmenter can create the segment from the file to include at least some frames in the given GoP and at least one frame from a GoP immediately following the given GoP.
- Alternatively, the segmenter may determine that the target GoP is larger than the given GoP and is not a whole-number multiple of the size of the given GoP, in which case the segmenter can create the segment to include the given GoP and at least enough frames from GoPs immediately following the given GoP such that the segment reaches the size of the target GoP.
- In another case, the segmenter identifies the given GoP as an open GoP, and therefore creates the segment to include all of the frames from the given GoP and frames (e.g., up to and including a keyframe) from a GoP immediately following the given GoP.
- the segmenter determines that the given GoP contains a number of frames that is less than a predetermined minimum number of frames, and so creates the segment to include the given GoP and at least enough additional frames so as to reach that predetermined minimum number of frames.
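The four segmenting cases above can be combined into a small sketch that computes how many frames a segment should contain, counted from the start of the given GoP. The function name and parameters are hypothetical. Rounding up to a whole number of target GoPs is one possible reading of "at least one frame" from the following GoP, and the open-GoP case assumes the following GoP begins with its keyframe.

```python
import math


def segment_length(gop_len, target_gop, is_open=False, min_frames=0):
    """Frames to include in a segment that starts at the given GoP."""
    length = gop_len
    if target_gop < gop_len and gop_len % target_gop != 0:
        # Smaller target GoPs that do not divide evenly: borrow frames from
        # the next GoP to complete the last target GoP (assumed rounding).
        length = math.ceil(gop_len / target_gop) * target_gop
    elif target_gop > gop_len and target_gop % gop_len != 0:
        # Larger target GoP that is not a whole multiple: extend the segment
        # until it reaches the target GoP size.
        length = target_gop
    if is_open:
        # Open GoP: include at least the next GoP's leading keyframe.
        length = max(length, gop_len + 1)
    # Very short GoPs are padded up to the predetermined minimum.
    return max(length, min_frames)
```

For example, a 10-frame GoP with a 4-frame target yields a 12-frame segment, two frames of which come from the following GoP. A real segmenter would also have to clamp the result to the frames remaining in the file, which this sketch does not model.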
- FIG. 1 is a diagram illustrating one embodiment of a known distributed computer system configured as a content delivery network;
- FIG. 2 is a diagram illustrating one embodiment of a machine on which a CDN server in the system of FIG. 1 may be implemented;
- FIG. 3 is a diagram illustrating one embodiment of an architecture for live streaming delivery as described in U.S. application Ser. No. 12/858,177;
- FIG. 4 is a diagram illustrating one embodiment of an architecture and request flow of a video-on-demand approach as described in U.S. application Ser. No. 12/858,177;
- FIG. 5 is a schematic view of one embodiment of an architecture for live streaming, as described in U.S. application Ser. No. 13/329,057;
- FIG. 6 is a schematic view of one embodiment of an architecture for on-demand streaming as described in U.S. application Ser. No. 13/329,057;
- FIG. 7 is a schematic view illustrating the live streaming architecture of FIG. 5 in more detail as described in U.S. application Ser. No. 13/329,057;
- FIG. 8 illustrates an example of a first live streaming workflow used when a stream is published from an encoder to an entrypoint (EP) as described in U.S. application Ser. No. 13/329,057;
- FIG. 9 illustrates an example of a second live streaming workflow used when an end-user makes a live request for content as described in U.S. application Ser. No. 13/329,057;
- FIG. 10 illustrates an example of a process by which live streams can be announced in the exemplary architectures shown in FIGS. 5, 7, 8 and 9, as described in U.S. application Ser. No. 13/329,057;
- FIG. 11 illustrates an example of a technique for replicating live streams as described in U.S. application Ser. No. 13/329,057;
- FIG. 12 illustrates an example of an on-demand streaming workflow used when an end-user makes a request for content as described in U.S. application Ser. No. 13/329,057;
- FIG. 13 illustrates an example of the TransformLib component in more detail as described in U.S. application Ser. No. 13/329,057;
- FIG. 14 illustrates an example of a workflow supporting ingestion and output of a content stream in a given format as described in U.S. application Ser. No. 13/329,057;
- FIG. 15 illustrates an example of a workflow for supporting ingestion and output of a content stream in another given format as described in U.S. application Ser. No. 13/329,057;
- FIG. 16 illustrates an example of a workflow using binary-side-includes (BSI) to facilitate streaming as described in U.S. application Ser. No. 13/329,081;
- FIG. 17 is a block diagram of one embodiment of a transcoding platform that includes a transcoding region with certain machines, as well as an existing content delivery network with machines that are leveraged to provide transcoding resources;
- FIG. 18 illustrates an example of a workflow for video-on-demand batch transcoding in accordance with the teachings hereof;
- FIG. 19 illustrates an example of a workflow for live transcoding in accordance with the teachings hereof;
- FIG. 20 illustrates an example of a workflow for live transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 21 illustrates an example of a workflow for batch video-on-demand transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 22 illustrates an example of a workflow for real-time video-on-demand transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 23 is a diagram illustrating examples of certain transcoding processes executing in a server functioning as a transcoding resource, in accordance with the teachings hereof;
- FIG. 24 is a diagram illustrating modification of group-of-picture (GoP) size as part of a transcoding job;
- FIG. 25 is a diagram illustrating an example of a pseudo-chunking approach for transcoding, in accordance with the teachings hereof;
- FIG. 26 is a diagram that illustrates hardware in a computer system that may be used to implement the teachings hereof.
- the subject matter hereof provides improved ways to convert audio/video content (or other content) from one codec format to another, or from one container format to another, and/or to generate multiple versions of a file that have different encoding/formatting settings.
- the conversions may involve changing the bitrate (e.g., 10 Mbps to 500 kbps), frame size, aspect ratio, compression settings (other than bitrate), and/or other characteristics such as GoP settings, color spaces, stereo/audio choices, sample rates, etc.
- the process may also involve changing other characteristics, such as whether interlacing is used.
- the teachings hereof may be used to change or add security features, such as encryption or watermarking, as will be described in more detail below.
- transcoding is used herein to refer to performing any or all of such transformations on a given piece of content; however it is not limited to such transformations, which are merely examples provided for illustrative purposes.
- the transcoding techniques disclosed herein preferably are implemented in a distributed computing platform such as a content delivery network (CDN), and preferably one that can not only perform transcoding services but also deliver the transcoded content.
- a content delivery network platform is now described.
- FIG. 1 illustrates a known distributed computer system 100 that is configured as a CDN and is assumed to have a set of machines 102 distributed around the Internet. Typically, most of the machines are servers located near the edge of the Internet, i.e., at or adjacent to end user access networks.
- a network operations command center (NOCC) 104 manages operations of the various machines in the system.
- Third party sites, such as web site 106, offload delivery of content (e.g., HTML, embedded web page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the CDN's content servers 102 (sometimes referred to as “edge” servers in light of their location near the “edges” of the Internet, or as proxy servers if running an HTTP proxy or other proxy process, as is typical and as is described further below in connection with FIG. 2).
- content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service.
- the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the edge servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems 110, 112, 114 and 116 to facilitate monitoring, logging, alerts, billing, management and other operational and administrative functions.
- Distributed network agents 118 monitor the network as well as the server loads and provide network, traffic and load data (e.g., from the CDN's content servers 102) to a DNS query handling mechanism 115, which is authoritative for content domains being managed by the CDN and which responds to DNS queries from end users by handing out, e.g., addresses for one or more of the content servers in the CDN.
- a distributed data transport mechanism 120 may be used to distribute control information (e.g., metadata to manage content, to facilitate load balancing, and the like) to the servers.
- a given machine 200 comprises commodity hardware (e.g., an Intel Pentium processor) 202 running an operating system kernel (such as Linux or variant) 204 that supports one or more applications 206a-n.
- given machines typically run a set of applications, such as an HTTP web proxy 207 (sometimes referred to as a “global host” or “ghost” process), a name server 208 , a local monitoring process 210 , a distributed data collection process 212 , and the like.
- the machine running the proxy 207 typically provides caching functionality for content passing therethrough, although it need not.
- the machine typically includes one or more media servers, such as a Windows Media Server (WMS) or Flash server, as required by the supported media formats.
- a given content server is configured to provide one or more extended content delivery features, preferably on a domain-specific, customer-specific basis, preferably using configuration files that are distributed to the edge servers using a configuration system.
- a given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features.
- the configuration file may be delivered to the content server via the data transport mechanism.
- U.S. Pat. Nos. 7,240,100 and 7,111,057 illustrate useful infrastructures for delivering and managing edge server content control information, and this and other edge server control information can be provisioned by the CDN service provider itself, or (via an extranet or the like) the content provider customer who operates the origin server.
- the CDN may provide secure content delivery among a client browser, edge server and customer origin server in the manner described in U.S. Publication No. 20040093419. Secure content delivery as described therein enforces SSL-based links between the client and the content server, on the one hand, and between the content server process and an origin server process, on the other hand. This enables an SSL-protected web page and/or components thereof to be delivered via the content server.
- the CDN may include a network storage subsystem (sometimes referred to as “NetStorage”), such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference.
- the CDN described above may be designed to provide a variety of streaming services.
- the CDN may include a delivery subsystem, such as described in U.S. Pat. No. 7,296,082, the disclosure of which is incorporated herein by reference.
- the CDN may be extended to provide an integrated HTTP-based delivery platform that provides for the delivery online of HD-video quality content to the most popular runtime environments and to the latest devices in both fixed line and wireless environments.
- An example of such a platform is set forth in U.S. Ser. No. 12/858,177, filed Aug. 17, 2010 (now published as US Patent Publication 2011/0173345, incorporated herein by reference). The platform described there supports delivery of both “live” and “on-demand” content. It should be noted that while some of the description below and otherwise in application Ser. No. 12/858,177 uses the context of the Adobe Flash runtime environment for illustrative purposes, this is not a limitation, as a similar type of solution may also be implemented for other runtime environments both fixed line and mobile (including, without limitation, Microsoft Silverlight, Apple iPhone, and others).
- FIG. 3 illustrates an overview of an exemplary architecture for live streaming delivery as described in U.S. application Ser. No. 12/858,177, filed Aug. 17, 2010.
- the system generally is divided into two independent tiers: a stream recording tier 300 , and a stream player tier 302 .
- the recording process (provided by the stream recording tier 300 ) is initiated from the Encoder 304 forward.
- streams are recorded even if there are currently no viewers (because there may be DVR requests later).
- the playback process (provided by the stream player tier 302 ) plays a given stream starting at a given time.
- a “live stream,” in effect, is equivalent to a “DVR stream” with a start time of “now.”
- the live streaming process begins with a stream delivered from an Encoder 304 to an Entry Point 306 .
- a Puller component 308 (e.g., running on a Linux-based machine) in an EP Region (not shown) is instructed to subscribe to the stream on the EP 306 and to push the resulting data to one or more Archiver 310 processes, preferably running on other machines.
- one of the Archivers 310 may operate as the “leader” as a result of executing a leader election protocol across the archiving processes.
- the Archivers 310 act as origin servers for a content server's HTTP proxy processes (an example of which is shown at 312 ) for live or near-live requests.
- the HTTP proxy 312 provides HTTP delivery to requesting end user clients, one of which is the Client 314 .
- a representative Client 314 is a computer that includes a browser, typically with native or plug-in support for media players, codecs, and the like. If DVR is enabled, content preferably is also uploaded to the Storage subsystem 316 , so that the Storage subsystem serves as the origin for DVR requests.
- a request for content (e.g., from an end user Client 314 ) is directed to the HTTP proxy 312 , preferably using techniques such as those described in U.S. Pat. Nos. 6,108,703, 7,240,100, 7,293,093 and others.
- When the HTTP proxy 312 receives an HTTP request for a given stream, it makes various requests, preferably driven by HTTP proxy metadata (as described in U.S. Pat. Nos. 7,240,100, 7,111,057 and others), possibly via a cache hierarchy 318 (see, e.g., U.S. Pat. No. 7,376,716 and others), to locate, learn about, and download a stream to serve to the Client 314 .
- the streaming-specific knowledge is handled by the HTTP proxy 312 that is directly connected to a Client 314 .
- Any go-forward (cache miss) requests (issued from the HTTP proxy) preferably are standard HTTP requests.
- the HTTP proxy 312 starts the streaming process by retrieving a “Stream Manifest” that preferably contains attributes of the stream and information needed by the HTTP proxy 312 to track down the actual stream content.
- For “live” requests, the HTTP proxy 312 starts requesting content relative to “now,” which, in general, is approximately equal to the time at the content server's HTTP proxy process. Given a seek time, the HTTP proxy downloads a “Fragment Index” whose name preferably is computed based on information in the indexInfo range and an epoch seek time. Preferably, a Fragment Index covers a given time period (e.g., every few minutes). By consulting the Fragment Index, an “Intermediate Format (IF) Fragment” number and an offset into that IF fragment are obtained.
- the HTTP proxy 312 can then begin downloading the fragment (e.g., via the cache hierarchy 318 , or from elsewhere within the CDN infrastructure), skipping data before the specified offset, and then begin serving (to the requesting Client 314 ) from there. In general, and unless the Stream Manifest indicates otherwise, for live streaming the HTTP proxy then continues serving data from consecutively-numbered IF Fragments.
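- The seek lookup described above (seek time in, IF Fragment number and offset out) can be sketched as follows. This is a minimal illustration, not the platform's implementation: the index layout (a time-sorted list of seek points) and the names `IndexEntry` and `locate` are assumptions, since the text does not specify how a Fragment Index is encoded.

```python
from bisect import bisect_right
from typing import NamedTuple, Sequence, Tuple


class IndexEntry(NamedTuple):
    """One seek point in a hypothetical, time-sorted Fragment Index."""
    start_time: float      # epoch seconds at which this seek point begins
    fragment_number: int   # IF Fragment that holds the seek point
    byte_offset: int       # offset of the seek point inside that fragment


def locate(index: Sequence[IndexEntry], seek_time: float) -> Tuple[int, int]:
    """Return (fragment_number, byte_offset) for the last seek point
    that starts at or before seek_time."""
    times = [entry.start_time for entry in index]
    pos = bisect_right(times, seek_time) - 1
    if pos < 0:
        raise LookupError("seek time precedes the period covered by this index")
    entry = index[pos]
    return entry.fragment_number, entry.byte_offset


index = [
    IndexEntry(1000.0, 7, 0),
    IndexEntry(1002.0, 7, 4096),
    IndexEntry(1004.0, 8, 0),
]
print(locate(index, 1003.5))  # (7, 4096)
```

After the lookup, the proxy would download fragment 7, skip the first 4096 bytes, and continue with consecutively numbered fragments.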
- In the context of live HTTP-based delivery, the Intermediate Format (IF) describes an internal representation of a stream used to get data from the Puller through to the HTTP proxy.
- a “source” format (SF) is a format in which the Entry Point 306 provides content and a “target” format (TF) is a format in which HTTP proxy 312 delivers data to the Client 314 .
- SF may differ from TF, i.e., a stream may be acquired in FLV format and served in a dynamic or adaptive (variable bit rate) format.
- the format is the container used to convey the stream; typically, the actual raw audio and video chunks are considered opaque data, although transcoding between different codecs may be implemented as well.
- the above-described architecture is useful for live streaming.
- the platform can also be used to support video on demand (VOD).
- the solution can provide VOD streaming from customer and Storage subsystem-based origins.
- the stream recorder tier 300 (of FIG. 3 ) is replaced, preferably with a translation tier.
- As described in Ser. No. 12/858,177, filed Aug. 17, 2010, typically VOD content is off-loaded to the CDN for HTTP delivery.
- a conversion tool (a script) is used to convert source content (such as FLV) to IF, with the resulting IF files then uploaded to the Storage subsystem.
- the HTTP proxy 312 then gets the content and the Stream Manifest from the Storage subsystem.
- Exemplary translation tier approaches are described in more detail in Ser. No. 12/858,177, filed Aug. 17, 2010.
- a translation tier 400 is located between an origin 402 (e.g., customer origin server, or the Storage subsystem, or other source of content) and the stream player tier 404 .
- FIG. 5 is a high-level component diagram illustrating one embodiment of an architecture for streaming live content, as set forth in U.S. patent application Ser. No. 13/329,057.
- the Entry Point (EP) 502 ingests the stream to be delivered from an encoder 500 , demuxes the stream from its native format to an IF format, such as a fragmented format like f-MP4, and archives the stream to Storage 504 (typically a network storage subsystem).
- the EP 502 serves “current” live stream fragments to a Streaming Mid-Tier (SMT) process 506 , which is typically running on a separate SMT machine.
- the SMT 506 retrieves “current” live stream fragments from EP 502 , and it generates a muxed output in the desired native format.
- the SMT 506 generates muxing instructions for use by a content server running an HTTP proxy process 508 (again, sometimes referred to as “global host” or simply “ghost”) in the CDN.
- the instructions are returned to the content server 508 , along with the IF fragments if needed, although the IF fragments may have been previously cached by the content server 508 or retrieved by the content server from Storage 504 instead.
- the muxing instructions may be realized as binary-side-includes, or BSI, which is described in detail in U.S.
- the content server 508 forwards end-user requests to SMT 506 , caches the response from SMT 506 , which response either is a native output object for the stream or a BSI fragment, and, when BSI is used, the content server 508 also creates an output object from the BSI and IF fragment.
- the content server 508 also delivers the native output object to the end-user client, typically a client player application. It does not need to understand any container format(s).
- the Storage 504 stores an archive for DVR or VOD playback, and it also stores live stream session metadata.
- FIG. 6 is a high-level component diagram illustrating one embodiment of an architecture for streaming on-demand content.
- the SMT 606 requests and receives the native on-demand file from either a customer origin 600 or Storage 604 (again, typically a network storage subsystem).
- the SMT 606 parses a native source file index and creates an intermediate MetaIndex. It then either generates a muxed output object itself or generates muxing instructions (BSI or equivalent functionality) for use by the content server 608 to create the native object.
- the content server 608 forwards end-user requests to SMT 606 , caches the response from SMT, which response either is a native output object or a BSI fragment, and, when BSI is used, the content server 608 also creates an output object from the BSI and IF fragment.
- Storage 604 typically stores on-demand files in native format.
- FIG. 7 illustrates further details regarding the EP and SMT components and their respective functions.
- the EP 700 comprises two services: an ingest server 706 and an entry point stream manager (ESM) 701 .
- the ingest server 706 is composed of a format-specific ingest server 706 and a library of functions 708 , called TransformLib.
- the library 708 is a shared library that is linked into the ingest server 706 .
- the library contains format-specific logic for muxing and demuxing.
- the ingest server 706 receives a stream from an encoder 702 , authenticates the encoder 702 , passes the received data to the library 708 for demuxing, and sends the demuxed stream to the ESM 701 .
- the library demuxes from a native format (e.g., MP3, MPEG2-TS, or otherwise) to the IF, such as f-MP4.
- the ESM 701 is a format-independent component that preferably resides on the EP 700 .
- the role of ESM 701 preferably is the same across different streaming formats. It receives the demuxed stream from the ingest server 706 , manages ESM publishing points, archives the stream to Storage 705 , serves “current” live requests from SMT, and announces active streams to all SMTs.
- An EP machine may be a Windows-based server, or a Linux-based server, or otherwise.
- the ESM code is cross-platform compatible.
- the SMT machine comprises two primary services: SMT 712 and local ghost process 714 .
- the local HTTP proxy (ghost) process 714 handles incoming HTTP requests from a content server ghost process 715 .
- the local ghost process 714 makes a forward request to the local SMT component 712 .
- SMT component 712 passes the incoming request to TransformLib 716 for processing, and that processing is based on the container format.
- TransformLib 716 first rewrites the container-specific incoming URL to an IF (e.g., f-MP4) forward URL.
- TransformLib 716 uses the IF fragment to create instructions (BSI), and to serve back any IF requests to the content server ghost 715 .
- TransformLib 716 creates the output object in native format if the instruction set (BSI) approach is disabled.
- the local ghost process 714 makes the forward requests (to SMT component 712 ), and it caches the forward response on local disk.
- An intermediary caching process may be used between the SMT 712 and local ghost process 714 .
- ghost-to-ghost communications between the content server and the SMT may be used (and optimized).
- FIG. 8 illustrates an embodiment of a first live streaming workflow that is used when a CDN customer publishes a stream from its encoder to a CDN entry-point (EP).
- FIG. 9 illustrates an embodiment of a second live streaming workflow that is used when an end-user makes a live request to a content server.
- the encoder publishes a live stream to the EP.
- the ingest server authenticates the encoder connection, preferably using a streamID to lookup the appropriate stream configuration (Step 1 ).
- Ingest server demuxes the input and pushes the stream to ESM (Step 2 ).
- ESM auto-creates a publishing point, preferably uploading to Storage three (3) XML-based files: LiveSession, LSM, and ACF.
- LiveSession file includes live stream information, such as entrypoint IP, sessionID, and streamState.
- the LSM includes session-specific metadata like bitrates, etc.
- ACF includes information for use in configuring an archive copy of the live stream.
- As ESM receives fragments from the ingest server, it aggregates the fragments into segments on the local disk. When the segment size reaches the accumulation threshold, it uploads the segment to Storage. With each segment uploaded to Storage, ESM also uploads an FDX file (Step 4 ).
- the FDX (Fragment Index) file is a binary encoded file that provides an index of the fragments that have been uploaded to Storage. This index tells SMT what fragments are in Storage and where to locate them. For fragments that are not in the FDX file, the fragment either is on the EP (because it has not been uploaded to Storage yet) or the fragment does not actually exist.
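- The three-way outcome described above (fragment in Storage, fragment still on the EP, or fragment nonexistent) can be sketched as a simple lookup. The dictionary-based index, the EP manifest set, and the names `Location` and `find_fragment` are assumptions for illustration; the actual FDX file is a binary encoding whose layout is not given here.

```python
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class Location(Enum):
    STORAGE = "storage"
    ENTRY_POINT = "entry_point"
    MISSING = "missing"


def find_fragment(
    fragment_id: int,
    fdx_index: Dict[int, Tuple[str, int]],  # fragment id -> (segment name, byte offset); assumed layout
    ep_manifest: Set[int],                  # fragment ids currently listed by the EP
) -> Tuple[Location, Optional[Tuple[str, int]]]:
    """Decide where a fragment should be fetched from."""
    if fragment_id in fdx_index:
        # Already uploaded: the index says which segment and where in it.
        return Location.STORAGE, fdx_index[fragment_id]
    if fragment_id in ep_manifest:
        # Not yet uploaded to Storage, so it is still held by the EP.
        return Location.ENTRY_POINT, None
    return Location.MISSING, None


fdx = {1: ("segment_0", 0), 2: ("segment_0", 65536)}
on_ep = {3, 4}
print(find_fragment(2, fdx, on_ep))
print(find_fragment(3, fdx, on_ep))
print(find_fragment(9, fdx, on_ep))
```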
- the LSM and livesession.xml file are updated to change the “streamState” property from “started” to “stopped.”
- FIG. 9 illustrates an exemplary embodiment of a workflow when an end-user client makes a live streaming request to a ghost process on a content server.
- the client (e.g., a client media player application) makes a stream request to the content server ghost process (Step 1 ).
- This process then makes a forward request to SMT (Step 2 ).
- SMT constructs and caches information about the live stream.
- SMT pulls information from Storage for the past DVR fragments and pulls information from the EP for the current fragments.
- SMT makes a request to Storage to get the livesession.xml and LSM file.
- the LSM file will give information about the live stream and what FDX files to lookup for a particular fragment index range (Step 3 ).
- the SMT makes a Manifest request to the EP and the Manifest will list the current set of fragment indexes that reside on the EP (Step 4 ).
- After SMT finds and obtains the requested fragment, it muxes the fragment to the output format.
- SMT does not create the actual output object but, instead, SMT creates a BSI instruction response containing the appropriate container format headers and IF fragment request (Step 7 ).
- the content server makes a request for the IF fragment, and preferably this request is only for the “mdat” data, which is the video/audio data (Step 8 ).
- the content server ghost process then uses the instructions in the response and the IF fragment to construct the output object. It sends the resulting output object back to the end-user as a response to the original request (Step 9 ).
- For SMT to know what fragments are in Storage, preferably it continuously polls Storage for a latest version of the FDX file (Step 10 ). The polling interval for the FDX file typically is a given, potentially configurable time period.
- SMT polls the EP for a latest Manifest file (Step 11 ).
- the client player URLs have the following format:
- Live and Archive URLs preferably have a prefix that denotes the streaming container format and the type of request (e.g., live, archive).
- the client-player URLs have the following format:
- the sessionID part of the URL differentiates archives from different live stream sessions.
- An archive URL gives the location of the archive directory in Storage.
- the archive URL “format” is simply the path to the default Storage location to which the archive is uploaded. If desired, the archive can be moved to a different Storage directory, in which case the archive path URL is changed to the new Storage directory location.
- the archive URL is immediately available for playback even if the live event is not over yet.
- the archive URL represents the content that has been archived to Storage so far. For example, if the live stream event has been running for 60 minutes and 58 minutes of the event has been archived to Storage, the archive URL represents a VOD file that is 58 minutes long. As more content is archived to Storage, the archive URL represents a longer and longer VOD file.
- An IF URL is constructed by taking the “base URL” of the client request and appending Fragment(<params>) to the end.
- the “base URL” typically is the portion of the URL that is up to and including the file name.
- the IF URL parameters are name/value pairs separated by commas and specify bitrate and response types:
- Illustrative parameter tag names include:
- SMT will return a BSI fragment response. (Note that for implementations that involve instruction sets other than BSI, the parameter might be “instr_set_name”.) If “frg” is specified, SMT will return the f-MP4 fragment. If “hdr” is specified, SMT will only return f-MP4 headers. If “dat” is specified, SMT will return the mdat box of the f-MP4 fragment. The mdat box is the MP4 box containing the audio/video samples.
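- The IF URL construction described above can be sketched as follows. The sketch rests on several assumptions that the text does not confirm: that name/value pairs are written as `name=value`, that `Fragment(<params>)` is joined to the base URL with a slash, that the parameter names are `bitrate` and `type`, and that the tag selecting a BSI response is `bsi`. Only the `frg`, `hdr` and `dat` tags are taken from the description.

```python
from typing import Dict

# Response-type tags and what SMT returns for each.
RESPONSE_TYPES = {
    "bsi": "BSI fragment response",           # tag name assumed; not stated in the text
    "frg": "f-MP4 fragment",
    "hdr": "f-MP4 headers only",
    "dat": "mdat box of the f-MP4 fragment",
}


def build_if_url(base_url: str, params: Dict[str, object]) -> str:
    """Append Fragment(<params>) to the base URL (separator syntax assumed)."""
    pairs = ",".join(f"{name}={value}" for name, value in params.items())
    return f"{base_url}/Fragment({pairs})"


def parse_if_params(url: str) -> Dict[str, str]:
    """Recover the comma-separated name/value pairs from an IF URL."""
    start = url.rindex("Fragment(") + len("Fragment(")
    end = url.rindex(")")
    body = url[start:end]
    return dict(pair.split("=", 1) for pair in body.split(",") if pair)


# Hypothetical host, path and parameter names.
url = build_if_url("http://example.com/live/video.mp4",
                   {"bitrate": 800000, "type": "dat"})
print(url)
print(RESPONSE_TYPES[parse_if_params(url)["type"]])
```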
- In operation, as ESM receives the live stream fragments from the ingest server, ESM writes the data to local disk.
- ESM has a configurable option to either coalesce all bitrates into a single file or have a different file per bitrate.
- the advantage of coalescing into a single file is that the number of file uploads to Storage is reduced.
- the disadvantage of a single file is that it is not possible to only retrieve fragments for a single bitrate without also retrieving fragments for other bitrates, thereby making caching less efficient on SMT when a single bitrate is being requested by the end-user. In either case, though, all of the fragments usually are in a single file (be it for one bitrate or many).
- An ESM trailing window parameter configures how much ESM will save on local disk. Once a segment is outside the trailing window, ESM will delete it from local disk.
- ESM will archive the stream to Storage for DVR or later VOD playback.
- ESM stores the last “n” minutes of a live stream.
- if a customer wants a 4 hour DVR window for their live stream, the customer enables “Archive To Storage” so that fragments older than n minutes are saved in Storage and available for DVR.
- the customer can disable “Archive To Storage” and the live stream is not uploaded to Storage. In such case, live stream fragment requests are served from the EP.
- Some customers have 24×7 streams and want, say, one (1) day DVR functionality. In that case, the customer enables “Archive To Storage” and enables a 1 day “Archive Trailing Window”.
- the “Archive Trailing Window” setting can limit the size of the archive that is stored in Storage. For example, if the “Archive Trailing Window” is set to 1 day, ESM will automatically delete from Storage fragments that are older than 1 day. This is beneficial for the customer because they can have a long DVR window but do not need to worry about cleaning up Storage for their long running live streams.
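- The “Archive Trailing Window” cleanup described above amounts to a cutoff comparison. A minimal sketch, assuming each archived fragment carries an upload or capture timestamp in epoch seconds; the function name `expired_fragments` and the dictionary input are illustrative, not part of the described system.

```python
from typing import Dict, List

ONE_DAY_SECONDS = 24 * 60 * 60


def expired_fragments(fragment_times: Dict[int, float],
                      now: float,
                      window_seconds: float = ONE_DAY_SECONDS) -> List[int]:
    """Return ids of fragments older than the trailing window, oldest id first."""
    cutoff = now - window_seconds
    return sorted(fid for fid, ts in fragment_times.items() if ts < cutoff)


# Fragment id -> timestamp (epoch seconds); values chosen for illustration.
archive = {1: 0.0, 2: 50_000.0, 3: 90_000.0}
print(expired_fragments(archive, now=100_000.0))  # [1]
```

With a 1 day window and `now` at 100,000 seconds, the cutoff is 13,600 seconds, so only fragment 1 falls outside the window and would be deleted from Storage.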
- live stream announcements between SMT and ESM are done using HTTP GET requests from SMT to ESM.
- each ESM in an EP region (e.g., EP region 1 or 2, as shown) makes an HTTP request to other EPs in the same region and asks for all live streams on the EP.
- ESM aggregates together all active live streams from the other EPs in the same region.
- SMT only needs to make an HTTP GET request to a single EP machine in an EP region (that is, a set of EP machines) to get information about all active live streams in a region.
- the request is made via the SMT local ghost process with a given (e.g., 5 second) time-to-live (TTL).
- the algorithm to choose the EP machine to query preferably is deterministic and repeatable across all SMTs so that all SMTs will make the forward request to the same EP in the EP region.
- polling from SMT to EP is done every few seconds and is configured through a global server setting. Having a short polling interval minimizes the amount of time between a customer publishing a stream and the SMT knowing the stream exists on the EP.
- the request logic from SMT to EP handles situations where an EP is down for maintenance or temporarily inaccessible.
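- One way to obtain a selection that is deterministic and repeatable across all SMTs, while still tolerating an EP that is down, is to rank the EPs in a region by a stable hash and take the first reachable one. This is a sketch of that general technique under stated assumptions, not the algorithm the platform uses: the names `choose_ep` and `is_reachable`, and the choice of SHA-256, are illustrative.

```python
import hashlib
from typing import Callable, Iterable


def choose_ep(region: str,
              ep_addresses: Iterable[str],
              is_reachable: Callable[[str], bool] = lambda ep: True) -> str:
    """Pick the same EP on every SMT, regardless of the order in which
    the EP list is supplied, skipping EPs that are not reachable."""
    def score(ep: str) -> str:
        # A cryptographic digest is stable across processes and machines,
        # unlike Python's built-in hash() for strings.
        return hashlib.sha256(f"{region}|{ep}".encode("utf-8")).hexdigest()

    for ep in sorted(ep_addresses, key=score):
        if is_reachable(ep):
            return ep
    raise RuntimeError("no reachable EP in region")


eps = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical addresses
primary = choose_ep("region-1", eps)
fallback = choose_ep("region-1", eps, is_reachable=lambda ep: ep != primary)
print(primary, fallback)
```

Because every SMT computes the same ranking, they all fall back to the same second choice when the first EP is unavailable, which keeps the forward requests concentrated on one EP.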
- the live stream archive is stored on Storage for later VOD playback. Any metadata for the live stream session is also stored on the Storage system, preferably in the same location as the live stream archive. If “Archive To Storage” is not enabled, nothing is stored on Storage.
- ingested fragments are demuxed into the IF format (Intermediate Format).
- the muxer can convert from the IF format to any supported streaming container format. This simplifies conversion from any input (source) format to any output (target) format.
- the PIFF (Protected Interoperable File Format) container format, available from Microsoft, may be used as the basis for the IF container format.
- PIFF enhances the MPEG-4 Part 12 specification by providing guidelines and UUID extensions for fragmented multi-bitrate HTTP streaming.
- other choices for container formats are Adobe's HTTP Streaming For Flash (Zeri), Apple's MPEG2-TS, or a proprietary format.
- For EPs to support DEEM, whenever an encoder pushes a stream to the EP, the EP must determine if the stream is a brand new stream or a DEEM failover from a previous live stream session. The EP determines the state of the stream by getting the corresponding livesession.xml from Storage. The livesession.xml contains the “streamState”. If the stream is a DEEM failover, the “streamState” will have a “started” value. The EP also does consistency checks, such as querying the old EP to determine if the stream actually existed. Consistency checks ensure that the new EP does not unintentionally consider the stream to be a DEEM failover stream when it is not. For the case when a stream is not archived to Storage, the EP simply ingests the live stream without retrieving the livesession.xml from Storage. The SMT does the work of stitching the live stream from different EPs into a single live stream.
- the livesession.xml contains the following attributes for DEEM support:
- the “discontinuityThreshold” is set to a given time period, e.g., 30 minutes. This means if an EP goes down and the encoder does not push the stream to the new EP within 30 minutes, the live stream session will not be resumed.
- the EP checks if the threshold has been exceeded by subtracting the “lastRefreshTime” from the current time. If this time difference is more than 30 minutes, the EP will not resume the previous live stream session.
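- The resume decision can be sketched as follows, using the “streamState”, “lastRefreshTime” and “discontinuityThreshold” attributes named above. The function name `should_resume` and the use of epoch seconds are assumptions, and the consistency checks against the old EP are omitted.

```python
THIRTY_MINUTES = 30 * 60  # example threshold from the text, in seconds


def should_resume(stream_state: str,
                  last_refresh_time: float,
                  now: float,
                  discontinuity_threshold: float = THIRTY_MINUTES) -> bool:
    """Return True if an incoming stream should resume the previous session."""
    if stream_state != "started":
        # A gracefully stopped session is not a failover candidate.
        return False
    elapsed = now - last_refresh_time
    # Resume only if the gap does not exceed the threshold.
    return elapsed <= discontinuity_threshold


print(should_resume("started", last_refresh_time=1000.0, now=1000.0 + 10 * 60))  # True
print(should_resume("started", last_refresh_time=1000.0, now=1000.0 + 31 * 60))  # False
print(should_resume("stopped", last_refresh_time=1000.0, now=1000.0 + 60))       # False
```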
- For SMTs to support DEEM, SMT tracks stream states via stream announcements. When the encoder is stopped, a live stream is transitioned to the “stopped” state on the EP. If the EP goes down, the stream does not gracefully transition to the “stopped” state. The SMT tracks ungraceful stream state transitions, and it stitches together live stream sessions if needed. SMT combines DVR fragments from a previous live session and the currently resumed live stream session. From the end-user point of view, the merged live stream sessions appear as a single live stream session.
- ESM on the ingest entry point has an option to replicate the stream.
- the replicated stream is sent either to the backup EP or another EP altogether.
- the target stream preferably uses a different stream ID than the source stream.
- an SMT component handles on-demand requests from a content server.
- the same SMT machine can handle both live and on-demand requests.
- the client player makes a stream request to the content server (Step 1 ).
- the content server ghost process makes a forward request to SMT machine (Step 2 ). If this is the first request to the SMT machine for this on-demand stream, SMT needs to construct and cache information about the on-demand stream. To get this information, SMT first passes the request URL to TransformLib, and TransformLib constructs the appropriate forward requests for the native format file. SMT makes these forward requests to Storage/customer origin via SMT's local ghost process (Step 3 ). TransformLib takes the forward responses and constructs the response (e.g., BSI) for the requested output format (Step 4 ).
- SMT returns the response back to the content server (Step 5 ).
- the BSI response contains the container-specific format headers and the request URLs for the IF fragments.
- the content server ghost process makes IF requests to construct the output object (Step 6 ).
- the output object is returned to the end-user in the native format (Step 7 ).
- BSI is optional but can be used to reduce the cache footprint on the content server ghost process. If BSI is not enabled, SMT can return the native output object (i.e., in the target format) to the content server ghost process.
- the native output object can be cached by the content server just like any HTTP object from an origin server.
- the client-player URLs may have the following format:
- SMT returns a BSI fragment that consists of the container headers and the IF URLs for the mdat data.
- the IF URLs look like the following for audio and video:
- the Fragment(<params>) portion is appended to the “base URL” of the client request (e.g., video.mp4 in the example above).
- the “base URL” is typically the portion of the URL up to and including the file name but can vary depending on the streaming format.
- TransformLib on the SMT contains the logic to demux the native input file and mux into the requested output object.
- TransformLib first parses the native input file to generate a MetaIndex.
- the MetaIndex is a generic index that contains information such as composition time, decoding time, IF fragment boundaries, and byte range offsets into the native source file for each IF fragment.
- the output muxers use the MetaIndex to extract the appropriate bytes from the native source file and use the other information such as composition time to construct the appropriate container headers.
- the MetaIndex provides a generic interface into the native source files. This interface is an abstraction layer on top of the native source file so that the output muxers do not need to be aware of the underlying container format.
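- A minimal sketch of a MetaIndex-like structure, using the fields listed above (composition time, decoding time, IF fragment boundaries, and byte ranges into the native source file). The class and function names, the field types, and the in-memory `bytes` source are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetaIndexEntry:
    fragment_number: int    # IF fragment this entry describes
    decoding_time: int      # in timescale units (assumed)
    composition_time: int   # in timescale units (assumed)
    byte_offset: int        # where the fragment's samples start in the source file
    byte_length: int        # how many bytes belong to the fragment


def fragment_bytes(source: bytes,
                   meta_index: List[MetaIndexEntry],
                   fragment_number: int) -> bytes:
    """Extract the raw bytes for one IF fragment without parsing the
    source container: the muxer only needs the byte range."""
    for entry in meta_index:
        if entry.fragment_number == fragment_number:
            end = entry.byte_offset + entry.byte_length
            return source[entry.byte_offset:end]
    raise KeyError(fragment_number)


source_file = b"HEADERaaaabbbbbbTRAILER"
meta_index = [
    MetaIndexEntry(0, decoding_time=0, composition_time=0, byte_offset=6, byte_length=4),
    MetaIndexEntry(1, decoding_time=2000, composition_time=2000, byte_offset=10, byte_length=6),
]
print(fragment_bytes(source_file, meta_index, 1))  # b'bbbbbb'
```

Because the output muxer sees only entries and byte ranges, the same muxer code can work against any source container for which an index has been built.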
- the MetaIndex may be cached within SMT's local ghost process cache for later reuse or for use by an ICP (Inter-Cache Protocol) peer. Creating the MetaIndex can take time, and caching on the local ghost process decreases the response time for the first VOD fragment request.
- SMT makes a local host request via ghost for “/metaIndex”. The loopback request is handled by the local SMT, and its response is cached by the ghost process. Other SMTs in the region also get the benefit of using this MetaIndex because it is available via ICP.
- the above-described architectures (for live or on-demand) are extensible to support any streaming format.
- the following section describes how to support a new streaming container format.
- FIG. 14 illustrates one exemplary embodiment of a technique for supporting ingestion of iPhone content and output of iPhone content.
- an iPhone EP 1400 ingests an Apple-Segmented MPEG2-TS stream, and TransformLib 1408 supports MPEG2TS for demuxing and muxing MPEG2-TS.
- TransformLib 1408 parses iPhone URLs and rewrites them to the forward path.
- the iPhone ingest server 1406 handles HTTP POST/PUT requests from the encoder 1402 .
- the iPhone ingest server passes the TS segments to TransformLib 1408 for demuxing into IF (e.g., f-MP4) format.
- the iPhone ingest server then sends the IF fragments to the local ESM 1401 .
- the ESM archives the stream to Storage and announces the live stream to the SMTs, as described above.
- the TransformLib 1416 processes iPhone request URLs for m3u8 and MPEG2-TS.
- TransformLib 1416 constructs the BSI response and returns it to the content server 1415 .
- For MPEG2-TS segments, data packets are interleaved with container headers every 188 bytes. This means that for every 188 bytes of audio/video, there will be some container headers.
- the BSI syntax supports loop constructs to reduce the complexity of the BSI response and still generate the appropriate MPEG2-TS segment. Using BSI to mux the object on the content server is optional.
- SMT 1412 can also return native MPEG2-TS segments back to the content server 1415 if BSI is disabled.
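- The repetitive structure that makes a loop construct attractive can be seen in a simplified packetizer. An MPEG2-TS packet is 188 bytes long and begins with a 4-byte header whose first byte is the sync byte 0x47, leaving 184 bytes of payload. The sketch below is not BSI and not a complete muxer: it omits PES framing, PAT/PMT tables and timing, and it pads the last packet with 0xFF bytes where a real muxer would use adaptation-field stuffing. The function name `packetize` is an assumption.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes


def packetize(payload: bytes, pid: int) -> bytes:
    """Wrap payload bytes in fixed-size TS packets, one header per 184 bytes."""
    out = bytearray()
    for i, start in enumerate(range(0, len(payload), TS_PAYLOAD_SIZE)):
        chunk = payload[start:start + TS_PAYLOAD_SIZE]
        pusi = 1 if i == 0 else 0            # payload_unit_start_indicator
        header = bytes([
            0x47,                            # sync byte
            (pusi << 6) | ((pid >> 8) & 0x1F),
            pid & 0xFF,
            0x10 | (i & 0x0F),               # payload only + 4-bit continuity counter
        ])
        # Simplification: pad a short final chunk instead of using stuffing.
        out += header + chunk.ljust(TS_PAYLOAD_SIZE, b"\xff")
    return bytes(out)


segment = packetize(b"\x00" * 400, pid=0x100)
print(len(segment), len(segment) // TS_PACKET_SIZE)  # 564 3
```

Since every packet follows the same header-then-payload pattern, an instruction language with a loop can describe the whole segment compactly instead of listing each packet.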
- FIG. 15 illustrates an embodiment for supporting the Shoutcast format.
- Shoutcast is a protocol that is primarily used for audio live streaming over HTTP-like connections.
- the client makes an HTTP request and the HTTP response body is a continuous audio stream (i.e., unbounded response body).
- the audio stream is a mix of MP3 data (or AAC/OGG) and Shoutcast metadata.
- Shoutcast metadata typically contains song titles or artist info. While the Shoutcast protocol is similar to HTTP, it is not true HTTP because the protocol includes some non-standard HTTP request and response headers.
- this embodiment comprises a Shoutcast EP 1500 to ingest Shoutcast-encoded streams.
- a TransformLib 1508 library for Shoutcast is provided to demux and mux MP3/AAC/OGG.
- TransformLib 1508 also parses Shoutcast URLs, rewrites them to the forward path, and generates BSI instructions. Because the client-player downloads a continuous unbounded HTTP response, the content server ghost process 1415 must turn fragmented forward origin requests into a single continuous client download. BSI instructs the ghost process on how to construct the client response from fragmented responses to forward requests.
- the network architecture for Shoutcast support is similar to the iPhone support as provided in FIG. 14 .
- the Shoutcast EP 1500 ingests the stream.
- the ingest server demuxes the stream using TransformLib 1508 .
- TransformLib 1515 on SMT 1512 parses Shoutcast URLs, creates BSI responses for Shoutcast, and muxes into Shoutcast output format.
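- By way of non-limiting illustration, demuxing audio from Shoutcast metadata can be sketched in Python as follows. The wire layout is not set out in the present description; the sketch assumes the commonly used ICY convention in which the server announces a metadata interval (the `icy-metaint` response header), and each interval of audio bytes is followed by one length byte, counted in 16-byte units, and then the metadata block. The function name `demux_icy` and the sample data are hypothetical.

```python
def demux_icy(stream: bytes, metaint: int):
    """Split an ICY-style byte stream into audio bytes and metadata strings."""
    audio = bytearray()
    metadata = []
    pos = 0
    while pos < len(stream):
        # `metaint` bytes of audio precede every metadata block.
        audio += stream[pos:pos + metaint]
        pos += metaint
        if pos >= len(stream):
            break
        # One length byte, counted in 16-byte units (0 means no update).
        meta_len = stream[pos] * 16
        pos += 1
        if meta_len:
            block = stream[pos:pos + meta_len]
            metadata.append(block.rstrip(b"\x00").decode("utf-8", errors="replace"))
            pos += meta_len
    return bytes(audio), metadata


title = b"StreamTitle='Song A';"
block = title.ljust(32, b"\x00")   # metadata is NUL-padded to a multiple of 16 bytes
sample = b"A" * 8 + bytes([2]) + block + b"B" * 8 + bytes([0])
audio, metadata = demux_icy(sample, metaint=8)
print(audio, metadata)
```

Muxing into the Shoutcast output format is the reverse operation: metadata blocks are re-inserted into the audio at the announced interval.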
- BSI is a name for functionality executable in a content server to generate output objects given an input object and certain instructions, typically instructions from another component such as the SMT component described above.
- the instructions typically define manipulations or actions to be performed on the input data.
- Such functionality is intended to enable modification of payloads as they are served to a requesting client, allowing a content server to easily provide, among other things, custom or semi-custom content given a generic object.
- this functionality can be built into the HTTP proxy (ghost) application on the content server, although in alternative embodiments it can be implemented external to ghost.
- a mechanism is defined for representing the difference (or “diff”) between the source(s) and output content, allowing a generic feature in the content server to handle an increasing number of streaming formats in an efficient way.
- components other than the content server are made responsible for defining or generating transforming logic and for providing instructions—along with binary “diff” information—that can be understood by the content server.
- the client-facing content server may handle an increasing number of requests efficiently.
- the inputs (e.g., the generic source object, instructions, etc.) may be cached.
- the output of the process also may be cached in some cases.
- this function is called BSI, for Binary-edge-Side Includes, or Binary Server Integration.
- the BSI language, with proposed syntax described below, defines different sources: incoming pieces of data that help construct the final output. Instructions (like ‘combine’ and others) define the byte ranges and order of how to merge these inputs, as well as controlling output headers.
- the BSI fragment and source object both can be cached (e.g., at the content server), placing far less load on the BSI generation tier than the content server would have handling them directly.
- the BSI may be generated once, and a BSI fragment cached (e.g., either on the content server, or on network storage or other dedicated storage subsystem such as is shown in FIGS. 5-6 ).
- the BSI approach is ideally very fast.
- the syntax is XML-based, and the number of instructions typically is kept very low, allowing fast parsing.
- the execution of BSI instructs the content server what order, and from which source, to fill an output buffer that is served to the client.
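- By way of non-limiting illustration, filling an output buffer from named sources in a prescribed order can be sketched in Python as follows. The XML-based BSI syntax itself is not reproduced; the tuple form `(source, offset, length)`, the function name `apply_instructions`, and the sample sources are hypothetical stand-ins for the instructions and byte ranges described above.

```python
def apply_instructions(sources: dict, instructions: list) -> bytes:
    """Fill an output buffer from named sources in the order the instructions give."""
    out = bytearray()
    for source_name, offset, length in instructions:
        data = sources[source_name]
        if offset + length > len(data):
            raise ValueError(f"range out of bounds for source {source_name!r}")
        out += data[offset:offset + length]
    return bytes(out)


sources = {
    "if_fragment": b"AAAABBBBCCCC",  # generic input object (e.g., a cached IF fragment)
    "diff": b"h1h2h3",               # extra bytes supplied alongside the instructions
}
instructions = [
    ("diff", 0, 2), ("if_fragment", 0, 4),
    ("diff", 2, 2), ("if_fragment", 4, 4),
    ("diff", 4, 2), ("if_fragment", 8, 4),
]
output = apply_instructions(sources, instructions)
print(output)
```

Because the executor only copies byte ranges, the same routine can serve any container format; only the instructions and the diff bytes change.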
- BSI functionality can be used between the SMT and content server to streamline the creation of an output object (e.g., an output object representing the stream in a native format for iPhone or other client device) from an input source (in the above cases, the IF fragments).
- the SMT receives IF fragments and performs muxing steps. Instead of muxed content as output, the SMT creates a dynamic BSI fragment that can be served to the content server, along with a binary object that contains the additional bits that the content server needs to combine with the IF fragment it normally receives.
- the content server uses this information to create the muxed output object in the native format, representing all or some portion of the stream.
- the content server ghost process 1600 receives a request from a client player 1601 for particular content (step 1 ) in a certain target format.
- the content server makes a request to a muxing tier (the SMT 1602 ) for the BSI instructions required (step 2 ).
- the request includes parameters via query string, to specify the type of request (manifest, content, key file, etc), the bitrate requested, a time determination (fragment no, time offset, etc.), and other parameters related to muxing (segment duration, A/V types, etc.).
- the SMT 1602 obtains the relevant IF fragments from the EP 1604 (step 3 ) or Storage 1603 (step 3 a ), builds an appropriate output object from the IF fragments as if it were to serve the content, creates a buffer of the bytes needed beyond what was contained in the IF fragments, along with instructions about how to ‘interleave’ or combine the binary diff with the IF. In some implementations, it should be understood, any necessary diff data may be embedded directly in the instructions themselves.
- the SMT 1602 then sends the BSI response to the content server. The response may also include a reference to the IF fragments that are needed.
- the content server gets the IF fragments in any of a variety of ways, including from the SMT (that is, in addition to the BSI), from its own cache, or from Storage 1603 , which is typically a network storage subsystem that was previously described in connection with the streaming platform.
- step 5 in FIG. 16 shows the IF fragments arriving from Storage and being cached.
- the BSI response with its binary diff typically might be around a few percent of the overall size of the object to be served.
- the content server ghost 1600 applies the BSI, generating and serving a muxed output object to the client (step 6 ).
- the BSI response can be cached by the content server ghost 1600 for some period of time.
- the parameters supplied in the request to the SMT (step 2 ) are used in the cache key so that only subsequent requests for content with the same parameters utilize the cached BSI response.
- the output of the BSI operation need not be cached.
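- By way of non-limiting illustration, keying a cached BSI response on the request parameters can be sketched in Python as follows. The parameter names (`type`, `bitrate`), the path, and the helper names `bsi_cache_key` and `get_bsi` are hypothetical, and expiry after "some period of time" is omitted for brevity.

```python
from urllib.parse import parse_qsl, urlencode

bsi_cache = {}


def bsi_cache_key(path: str, query: str) -> str:
    """Derive a cache key from the request path plus its muxing parameters."""
    # Sorting makes the key independent of parameter order in the query string.
    params = sorted(parse_qsl(query, keep_blank_values=True))
    return f"{path}?{urlencode(params)}"


def get_bsi(path: str, query: str, fetch_from_smt):
    """Return the cached BSI response, asking the SMT only on a miss."""
    key = bsi_cache_key(path, query)
    if key not in bsi_cache:
        bsi_cache[key] = fetch_from_smt(path, query)
    return bsi_cache[key]


smt_calls = []


def fake_smt(path, query):
    # Stand-in for the request to the muxing tier (step 2).
    smt_calls.append(query)
    return b"<bsi/>"


get_bsi("/stream/1", "type=content&bitrate=800", fake_smt)
get_bsi("/stream/1", "bitrate=800&type=content", fake_smt)   # same parameters: cache hit
get_bsi("/stream/1", "type=content&bitrate=1200", fake_smt)  # different bitrate: miss
print(len(smt_calls))
```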
- BSI provides a way for the process to support any streaming container format without needing associated code changes at the content server ghost process.
- BSI instructions can change, but the content server ghost process logic remains the same. This eliminates any cross-component dependency with the content server or its ghost process when developing or implementing new streaming features.
- BSI can reduce the ghost cache footprint size because the ghost process caches the IF fragments but muxes the IF into different native formats.
- the muxed output is not cached; rather, only the IF fragment is cached.
- the system can be used to stream Adobe Zeri (HTTP Streaming for Flash) to Android devices running Flash 10.1 and to stream MPEG2-TS to iPhone devices.
- For the live stream, only the IF fragment is cached; the content server muxes into Zeri for Android devices and into MPEG2-TS for iPhone devices.
- BSI functionality can be used for progressive-download-style formats and, in particular, to mux fragment responses from the origin (e.g., a content provider origin or CDN storage subsystem) into a continuous HTTP download stream for the client.
- BSI can also be used to implement progressive-download-specific features, like jump-to-live-on-drift and delayed metadata injection based on user-agent. Specific progressive-download-style requirements thus can be inherently supported through BSI without requiring any changes in the content server.
- Fragmented streaming formats may also use BSI functionality.
- the SMT can send the content server content in a native format or a BSI fragment that the content server ghost process muxes into the native format. If a CDN content provider customer is only doing streaming for a single container format, there is no need to cache IF fragments and mux on the content server ghost process via BSI. In such case, it is more efficient for SMT to return the native object, which the content server ghost process caches.
- Enabling or disabling using BSI is configurable, preferably on a content provider by content provider basis, and, for a given content provider, on a site by site basis, or even a file by file basis.
- the content delivery network (CDN) described above provides an advantageous and feature-rich platform for streaming and object delivery.
- the CDN platform may be enhanced yet further by integrating into it a distributed, scalable transcoding system that provides the ability to transform content such as audio, video and other files, which may then be delivered to end-users over the platform.
- Typical transcoding tasks include the conversion of media from one bitrate/resolution to another for the purposes of adding bitrates to a multi-bitrate stream, converting from one container format to another or one encoding format to another in order to allow clients utilizing such formats to play the content. These tasks may be part of prepping media for ingestion into the streaming platform described above.
- the distributed transcoding system described herein leverages the resources of the aforementioned content delivery architecture to perform certain processing tasks within the CDN, as real-time or background (batch mode) processes.
- the CDN may prepare and transcode certain content in preparation for delivery, even while other content (from the same or other content provider users of the system) is being delivered.
- the machines described above that provide content delivery services may be leveraged, in accordance with the teachings hereof, to perform transcoding tasks.
- the transcoding system may be implemented not only with a set of purpose-built hardware, specific to the transcoding task, but also supplemented with the available idle or low-usage resources of the content delivery network that was previously described, to achieve a highly scalable and flexible solution.
- the resources of the various distributed CDN content servers, including in particular the HTTP proxy servers (also known as ghost servers) described above, among others, may be leveraged in this way. Exemplary implementation details will be set forth in more detail below.
- The teachings hereof are not limited to a transcoding system implemented in conjunction with a CDN, although that is one useful implementation.
- distributed transcoding techniques described herein may be implemented in a standalone system with dedicated machines, entirely separate from other content delivery services or machines.
- the transcoding system can process files either in batch or real-time modes. Both kinds of jobs may be running within the platform at any given point of time. Preferably every transcode that runs in the system is happening as fast as possible given its priority and the available resources.
- the transcoding system itself is generally agnostic to the type of job it is processing; it simply processes requests with a given priority. In this way the system can be used for both batch and real-time transcoding of on-demand or live content.
- a transcoding system includes several components, some of which are in a dedicated transcoding region and others of which are from the network of CDN servers.
- a region in this sense typically refers to a machine or set of machines in a particular network location, which may or may not be co-located with a region in the content delivery network.
- the transcoder region typically includes fluxer machines running a Fluxer (a fluxer process), a transcoding resource access server application (TRAS), and a coordination server (C-server), as well as a set of managed transcoding resources (MTRs), e.g., a managed transcoder machine running a transcoding process.
- FIG. 17 shows the fluxer machines and MTRs in a single region, but the actual network location/topology of the transcoding region components is flexible and this example should not be viewed as limiting.
- one implementation may include many transcoding regions, each with one or more fluxer machines and one or more MTRs, which may be distributed throughout various networks and even co-located in the content delivery regions with the content servers shown in FIG. 17 .
- the CDN content servers represent shared transcoding resources (STRs) to the transcoding system, as they are shared with the delivery and other CDN functions (e.g., security, content adaptation, authentication/authorization processes, reporting functions and so on). More broadly, the STRs are idle or low-utilization resources across the CDN that have transcoding capabilities and can be called upon to serve the transcoding system with their raw processing capabilities. Since these are typically idle or low-utilization servers, their main value is their processor (CPU). They are not expected to contain specialized hardware, nor can they be expected to be as reliable or available as MTRs, although they may exist in greater numbers.
- Prime examples of potential STRs are the HTTP proxy servers (e.g., also known as ghost servers or edge servers) described previously in conjunction with FIGS. 1-16 . However, any of the machines shown in FIGS. 1-16 are candidates for use as STRs provided they can be modified in accordance with the teachings below to become part of the transcoding system.
- the Fluxer is responsible for breaking apart media files into transcodable segments and sending those segments off to transcoding resources to be transcoded in parallel. Preferably the segments are coded so that the amount of data sent around the network is reduced.
- the transcoding resources can then decode and re-encode to accomplish the requested transcode.
- the Fluxer uses the TRAS to get lists of available transcoding resources and reports its status to the C-server.
- the transcoding resources (TRs, which may be either MTRs or STRs) are responsible for transcoding individual media segments and sending the derivatives back to the Fluxer to be remuxed back into a transcoded media file. MTRs, which are dedicated resources, report their status to C-Server.
- the TRAS can be implemented as a library that is responsible for encapsulating TR selection to an interface for consumption by the Fluxer.
- the TRAS uses a combination of awareness of local transcoders from C-server as well as requests to a Mapper (e.g. the map-maker and DNS system shown in FIG. 1 ) to identify idle HTTP proxy servers or other CDN servers.
- the C-server tracks liveness from local TRs and Fluxers and acts as a local messaging platform for all transcoding servers in a region.
- FIGS. 18 and 19 illustrate the general function of and communication amongst components for particular embodiments of video-on-demand (VOD) transcoding and live transcoding, respectively.
- the Fluxer receives files to transcode or responds to transcode-initiation requests for VOD and live streams.
- a variety of components are potential sources for requesting batch or live transcoding jobs. Examples of such components include, for example, a storage system (as shown, for example, in FIGS.
- a content provider user interface (e.g., a web-based portal providing a customer with a user interface to the CDN for configuring, uploading content to transcode, setting transcoding parameters, and monitoring the operation)
- an Entry Point or Puller or other component in the streaming architecture (as shown, for example, in FIGS. 3 , 5 - 7 )
- a CDN server 102 that has received a request from an end-user client.
- the Map-Maker and DNS system shown in connection with FIG. 1 can be leveraged to find the closest and best available Fluxer, as the map-maker monitoring agents and the data collection system 108 are already monitoring network conditions and machine usage for the content delivery network.
- the requesting component makes a DNS request to a Fluxer domain and receives back the IP address of a particular Fluxer machine available for connection.
- the requestor can use a shared secret to authenticate to the Fluxer.
- the Fluxer contacts the TRAS to request a list of servers to use for transcoding, and preferably provides the TRAS with as many specifics about the job as possible, including the approximate size of the input source, and whether the job is classified as real-time or batch or otherwise, which effectively classifies the priority of the job, and potentially specifics about the input/output formats, desired bitrates, etc.
- the TRAS uses this information to approximate how many transcoding resources it will need, and what mix of MTRs and STRs will be the most appropriate.
- MTRs are dedicated transcoding resources that are managed by the transcoding system
- STRs are transcoding resources which are shared with content delivery resources (or shared with some other business function in the platform).
- the TRAS can use a resource management service referred to here as the coordination server (C-server).
- the TRAS uses the C-server to reserve local MTRs, while it asks the map-maker system ( FIG. 1 ) for any needed STRs.
- the Mapper will identify an approximate number of CDN servers from a pool that are running with a low utilization (e.g., with CPU or memory or request rate or other hardware metrics below some predetermined threshold, which ideally ensures that content delivery is not compromised) and return a list to TRAS.
- the TRAS merges the lists, preferring MTRs for real-time jobs and STRs for batch jobs, and returns the final list to the Fluxer.
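- By way of non-limiting illustration, the merge step described above can be sketched in Python as follows. The `Resource` type, the function name `merge_resources`, the priority labels, and the sample addresses (taken from documentation-only IP ranges) are hypothetical; the sketch only shows the ordering preference, with MTRs first for real-time jobs and STRs first for batch jobs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    ip: str
    kind: str  # "MTR" (managed) or "STR" (shared)


def merge_resources(mtrs, strs, priority, bucket_size):
    """Merge both lists, putting the preferred resource type first."""
    if priority == "real-time":
        preferred, fallback = mtrs, strs
    else:  # batch and other low-priority work
        preferred, fallback = strs, mtrs
    return (list(preferred) + list(fallback))[:bucket_size]


mtrs = [Resource("192.0.2.1", "MTR"), Resource("192.0.2.2", "MTR")]
strs = [Resource("198.51.100.1", "STR"), Resource("198.51.100.2", "STR"),
        Resource("198.51.100.3", "STR")]

live = merge_resources(mtrs, strs, "real-time", bucket_size=3)
batch = merge_resources(mtrs, strs, "batch", bucket_size=4)
print([r.kind for r in live], [r.kind for r in batch])
```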
- the Fluxer begins splitting the input source file into a plurality of segments.
- the input file is not raw, uncompressed data but a somewhat compressed file arriving from a customer that is too big to serve to requesting clients, but is suitable for transcoding (for example, a 50 MB/s video may be suitable, depending on the nature of the content and the encoding used).
- the input file may also be a previously encoded/compressed file that is now being transcoded to another format or bitrate.
- the Fluxer splits the file into segments for transcoding purposes.
- the transcoding segments may correspond to group-of-picture (GoP) boundaries, in which case they are referred to herein as chunks.
- the transcoding segments are split along other boundaries into pseudo-chunks, as will be described in more detail below.
- a transcoding segment refers to the actual bits being transcoded, i.e., the bits involved in the input and output, and does not necessarily correspond to a single chunk or pseudo-chunk, as it may contain multiple chunks or pseudo-chunks. Pseudo-chunks may overlap in time, i.e., they do not necessarily represent contiguous portions of the overall input file.
- the process of determining how to split the file into transcoding segments can involve many determinations and is explained later in more detail in the section titled “Creating Transcoding Segments From an Input”.
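- By way of non-limiting illustration, splitting along group-of-picture boundaries can be sketched in Python as follows. The sketch assumes, for simplicity, that every I-frame starts a new closed GoP and represents the input as a list of frame types; the function names `split_into_chunks` and `group_into_segments` are hypothetical, and pseudo-chunks and overlapping boundaries are not modeled.

```python
def split_into_chunks(frame_types):
    """Group frame indices into chunks, starting a new chunk at every I-frame."""
    chunks = []
    for index, frame_type in enumerate(frame_types):
        if frame_type == "I" or not chunks:
            chunks.append([])
        chunks[-1].append(index)
    return chunks


def group_into_segments(chunks, chunks_per_segment):
    """Build transcoding segments that may each contain several chunks."""
    segments = []
    for i in range(0, len(chunks), chunks_per_segment):
        group = chunks[i:i + chunks_per_segment]
        segments.append([frame for chunk in group for frame in chunk])
    return segments


frames = list("IPBPIPBIPP")
chunks = split_into_chunks(frames)
segments = group_into_segments(chunks, chunks_per_segment=2)
print(chunks)
print(segments)
```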
- the Fluxer sends the transcoding segments to selected transcoding resources along with a list of ways in which that segment should be transcoded. Note that this means that the list may specify more than one output—for example, “transcode the segment into a derivative segment in format/bitrate 1 , and another derivative segment in format/bitrate 2 .” As each transcoding resource transcodes its given segment, it replies over the open HTTP connection with the derivative segments produced from the input source. If a transcoding resource cannot complete the transcode due to some unforeseen circumstance, it simply tears down the connection and goes away, leaving the Fluxer to source another transcoding resource for that segment. Once all of the segments have been transcoded, the Fluxer re-assembles them into a single file and sends the file to the destination specified by the initial request.
- the destination of the file may be, for example, a network storage system, a streaming mid-tier machine (e.g., as shown in the architectures of FIGS. 5-7 for example), proxy server, or other component in the CDN. Unless the target format produced by the transcoding system was intermediate format (IF), the destination component may then convert the file to IF for use with the streaming platform described previously, for shipping the data within the streaming architecture.
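- By way of non-limiting illustration, the dispatch, retry, and re-assembly flow described above can be sketched in Python as follows. The function `transcode_all`, the stub `fake_transcode`, the resource names, and the retry limit are hypothetical; a dropped connection is modeled as a `ConnectionError`, and each segment yields a single derivative rather than a list of outputs.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode_all(segments, resources, transcode, max_attempts=3):
    """Transcode segments in parallel, sourcing another resource when one fails."""
    def run(indexed):
        index, segment = indexed
        for attempt in range(max_attempts):
            resource = resources[(index + attempt) % len(resources)]
            try:
                return transcode(resource, segment)
            except ConnectionError:
                continue  # the resource tore down the connection; try another
        raise RuntimeError(f"segment {index} could not be transcoded")

    with ThreadPoolExecutor(max_workers=len(resources)) as pool:
        # map() preserves input order, so results can be re-assembled directly.
        return list(pool.map(run, enumerate(segments)))


def fake_transcode(resource, segment):
    # Stand-in for posting a segment to a transcoding resource over HTTP.
    if resource == "tr-b":
        raise ConnectionError("resource went away")
    return segment.upper()


derivatives = transcode_all([b"seg0", b"seg1", b"seg2"],
                            ["tr-a", "tr-b", "tr-c"], fake_transcode)
output_file = b"".join(derivatives)
print(output_file)
```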
- For live streams, transcoding is initiated by the Puller component in response to the presence of a set of transcoding profiles in the Stream Manifest Manager (SMM) for that live stream.
- SMM already carries the concept of an Archiver set, and here includes the concept of a Fluxer Set.
- the Puller contacts one of the Fluxer Machines in the Fluxer Set with the parameters of the live event and the Fluxer set begins an election process to decide who is the most appropriate Fluxer Machine to act as the Mother (the remaining Fluxers will be designated as Children).
- the Mother begins transcoding by pulling the stream from the source Archiver, transcoding using transcoding resources as described above, and pushing it to the target Archiver. Children are responsible for monitoring the Mother and electing a new Mother in the event of a failure. (For simplicity of illustration, in FIG. 19 only the Fluxer that is acting as the Mother is shown.)
- FIG. 19 illustrates and the foregoing describes operation of the transcoding system with the streaming architecture shown in FIG. 3 .
- the transcoding system works in conjunction with the streaming architecture illustrated in FIGS. 5-15 .
- the Fluxer can receive a request to transcode and source content from an entry-point (EP) stream manager process and send transcoded output to an SMT machine, rather than a Target Archiver.
- the transcoding system is not limited to use with any particular streaming architecture, or with a streaming architecture at all (i.e., it can be a standalone transcoding service).
- the C-server is a coordination engine for a given transcoding region that provides a service for maintaining configuration information, naming, providing synchronization and group services to distributed applications.
- C-server functionality may be built on top of existing, known platforms such as Zookeeper (Apache Hadoop) for example, although this should not be viewed as limiting or required.
- the C-server provides a job-queue and tracks which resources are working on those jobs, and also maintains resiliency when those servers fail.
- the C-server is region specific and runs on all Fluxers in a region using an internal election algorithm to determine the leader for write coordination to the C-server system.
- the C-server can report its region and status to a supervisory query function so that alerts can be triggered for a low number of C-servers running in a region, mitigating availability issues.
- the TRAS provides an application programming interface (API) for obtaining a set of possible transcoders that can be called directly by the Fluxer to perform transcoding of segments. Since there are multiple types of transcoding resources available (MTR/STR) and since the method of accessing them may differ, TRAS provides an abstraction for access to both of these resources through a common interface. TRAS can be implemented as a built-in library to be consumed by the Fluxer. This means that it is run as part of the Fluxer process.
- TRAS allows for distinct types of transcoder requests, for example: high-priority (typically real-time needs for live transcodes, which may necessitate using only MTRs) and low-priority (typically batch needs, which may involve a mix of MTRs and STRs).
- TRAS returns a list of possible resources for use as transcoders to Fluxer. Both high-priority and low-priority requests typically specify a bucket-size, which TRAS will attempt to fill.
- the response to Fluxer is a data structure that includes the transcoding resource's IP address and type.
- the transcoding resources themselves are considered volatile and TRAS provides no guarantees that the resources will accept a transcoding request.
- STR availability is delegated to Mapper in this embodiment.
- CDN server utilizations are reported back to Mapper as part of monitoring agents and the data collection system 108 in FIG. 1 .
- Mapper identifies a pool of available CDN servers which are mostly idle (e.g., as defined by some metric such as CPU utilization in the recent past, cache utilization, or geographic location relative to expected load, that is, servers located in regions where demand for delivery services is low due to time of day or some other reason), pseudo-randomizes the selection, and returns the maximum number of available IP addresses that can fit in a response packet.
- TRAS may perform this request more than once to fill the internal bucket requested by the Fluxer.
- When TRAS receives a request that uses at least some MTRs (for example, a live-event transcode), it will use C-server's coordination capabilities to “reserve” a number of MTRs as requested by the Fluxer.
- TRAS provides its service through a combined, parallel query to both Mapper and C-server. As noted, it gathers enough resources to fill a bucket, the size of which depends on the priority of the request, then returns that bucket of resources to the Fluxer. In this approach, TRAS is gathering a group of resources that are likely available but may not be. In the end, it is a combination of pseudo-randomization of the large pool of STRs and usage of local MTRs that achieves distribution of load among all transcoding resources.
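- By way of non-limiting illustration, filling a bucket with STRs through repeated, pseudo-randomized Mapper queries can be sketched in Python as follows. The function names `mapper_query` and `fill_bucket`, the CPU threshold, the per-response limit, and the query cap are all hypothetical values chosen for the example; MTR reservation through the C-server is not shown.

```python
import random


def mapper_query(servers, cpu_threshold, max_per_response, rng):
    """Return a pseudo-random subset of mostly idle servers, capped per response."""
    idle = [s["ip"] for s in servers if s["cpu_percent"] < cpu_threshold]
    rng.shuffle(idle)
    return idle[:max_per_response]


def fill_bucket(servers, bucket_size, cpu_threshold=20, max_per_response=4,
                max_queries=5, seed=0):
    """Query the Mapper stand-in repeatedly until the bucket is full or queries run out."""
    rng = random.Random(seed)
    bucket = []
    for _ in range(max_queries):
        for ip in mapper_query(servers, cpu_threshold, max_per_response, rng):
            if ip not in bucket:
                bucket.append(ip)
        if len(bucket) >= bucket_size:
            break
    # The returned resources are only likely to be available, not guaranteed.
    return bucket[:bucket_size]


servers = [{"ip": f"198.51.100.{i}", "cpu_percent": i * 5} for i in range(10)]
bucket = fill_bucket(servers, bucket_size=3)
print(bucket)
```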
- TRAS monitors the regional load of the MTRs it is managing.
- An MTR regularly updates the C-server with its queue load.
- TRAS periodically calculates the percentage of MTRs available, weighting them by their remaining capacity. An average is then calculated and used as a Regional Load Factor. For example if there are 10 MTRs each with a load of 10%, 20%, 30%, . . . 100%, then the algorithm would be as follows:
- This Regional Load Factor may be reported to any system attempting to determine the availability of work units for a given regional transcoding installation.
- the foregoing load-factor algorithm should not be viewed as limiting, as other algorithms may be used in other embodiments.
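- By way of non-limiting illustration, one plausible reading of the load-factor calculation described above can be sketched in Python as follows. The sketch assumes that each MTR's remaining capacity is 100% minus its reported load and that the Regional Load Factor is the plain average of those remaining capacities; both the interpretation and the function name `regional_load_factor` are assumptions, not the calculation of record.

```python
def regional_load_factor(loads_percent):
    """Average remaining capacity (in percent) across the MTRs in a region."""
    remaining = [100 - load for load in loads_percent]
    return sum(remaining) / len(remaining)


# Ten MTRs with loads of 10%, 20%, ..., 100%.
loads = list(range(10, 101, 10))
factor = regional_load_factor(loads)
print(factor)
```

Under this reading, the ten MTRs in the example have remaining capacities of 90%, 80%, and so on down to 0%, which sum to 450 and average to 45%.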
- the Fluxers are the primary interface of the transcoding system to the outside world and the most common component for external clients to interact with.
- the purpose of the Fluxer is to break-up a video into segments, send those segments to one or more transcoders and reassemble those segments into the target container file. There are a number of low-level details involved in this function.
- Fluxers provide several interfaces to support Live (real-time), VOD (batch) and VOD (real-time) use cases.
- Fluxer live interfaces allow the Fluxer to transcode a live event by pulling a bitrate/format from an Archiver or Entry-Point, producing one or more transcoded bitrates/formats, and publishing all configured bitrates/formats to an Archiver or Streaming Mid-Tier.
- This activity is initiated by an HTTP Request to the Fluxer's live interface, containing the source Archiver set or Entry-Point, the target stream-id and the configuration for all derivative bitrates/formats.
- the initiating HTTP request causes the Fluxer to begin transcoding until the stream is torn-down.
- Fluxer VOD interfaces are primarily implemented in the current embodiments as a pull-based HTTP interface, with the primary difference being how much of the file is transcoded at a given time. Regardless of whether the request arrives over the live or VOD interface, Fluxers generally wait to acknowledge jobs until they have obtained an initial set of resources from TRAS. If initial resource allocation fails, then the Fluxer can communicate that failure immediately, regardless of whether the job is synchronous or asynchronous.
- Fluxer's live interface is a URL that triggers Fluxer activity but does not require that the initiator remain connected to the HTTP Socket, as the activity is ongoing and no feedback is required for the initiator.
- This allows a resource to ask a Fluxer to initiate transcoding of a live stream and to contact some number of additional Fluxers, asking them to monitor the primary.
- the initiation of this request typically contains the following information:
- FIG. 20 illustrates one embodiment of the operation of the Fluxer (and other system components) when transcoding a live stream.
- the Puller contacts the streaming manifest manager and gets an Archiver set or Fluxer set.
- the Puller contacts the source Archiver and initiates a stream.
- the Puller contacts the first Fluxer from the Fluxer Set and passes transcoding information. The contacted Fluxer then contacts the remaining Fluxers in the set and they decide who will be the Mother and who will be Children. Transcoding parameters are communicated here. Fluxer Children begin monitoring the Mother.
- the Mother Fluxer contacts SMM to get the Archiver set.
- Fluxer contacts TRAS to get transcoding resources.
- In step 6, the Fluxer initiates a pull from the Source Archiver.
- In step 7, the Mother Fluxer begins the parallel transcode of the stream being pulled from the Source Archiver, utilizing the transcoding resources (TRs).
- In step 8, the Mother Fluxer re-assembles the transcoded segments and sends the transcoded stream to the target Archiver set assigned by SMM for each bitrate.
- an Entry-Point locates a Fluxer and requests a transcode.
- the Entry-Point itself sources the stream to be transcoded, or points the Fluxer to a Storage source stream using the metadata files described in connection with FIG. 8 .
- the transcoded stream is sent to a streaming mid-tier SMT machine or to the Storage system, rather than an Archiver.
- FIG. 21 illustrates the operation of the Fluxer (and other system components) when transcoding a VOD stream in batch mode.
- the Job Queue contacts Fluxer.
- the Job Queue can exist as part of a storage system process, portal, or other component accessing the transcoding system.
- Fluxer contacts TRAS to get transcoding resources.
- Fluxer pulls media from the source.
- the Fluxer orchestrates the transcoding of the content using transcoder resources from TRAS.
- the Fluxer posts transcoded content to a destination.
- Job Queue removes the job.
- the Job Source can pick a Fluxer at its own discretion; preferably, however, it chooses a Fluxer that is both idle and near the job source.
- the Mapping system can be used to determine the best Fluxer by sending a DNS request to a fluxer domain and receiving back from the Mapping system the IP address of a suitable Fluxer. Batch VOD Fluxer requests, although not prohibited from using MTRs, can be weighted to prefer using idle or low-usage STR transcoders.
- FIG. 22 illustrates the operation of the Fluxer (and other system components) when transcoding a VOD stream in real-time mode.
- a request comes in to Fluxer from a CDN's content server (e.g., an HTTP proxy server as shown and described in connection with FIG. 1 ) that has received a user request for a file, or from a cache hierarchy region that has been asked for the content by the server (e.g., using a cache hierarchy technique as described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference), or from a SMT machine (see, e.g., FIG.
- In step 2, assume the Fluxer checks its transcoding region cache for requested segments of the content (which may correspond to, e.g., one or more IF fragments). Assume it receives a cache miss.
- In step 3, the Fluxer contacts TRAS to identify transcoding resources.
- In step 4, the Fluxer requests and receives the segments from the source (e.g., from storage or origin).
- In step 5, the Fluxer transcodes them using transcoding resources.
- In step 6, the Fluxer returns transcoded segments to the requesting component following re-assembly into a file or portion thereof.
- In step 7, the Fluxer begins workahead transcoding.
- If the Fluxer determines that there is a region cache hit in step 2, then the Fluxer retrieves the transcoded segment from the region cache, looking for a segment that is at least N seconds ahead of the requested segment (where N is determined by a configuration parameter). The Fluxer either begins workahead or not, depending on whether it can find a sufficient number of segments in cache to meet the workahead criteria.
- a content provider's configuration for real-time VOD transcoding contains a parameter which defines the number of segments to transcode ahead of the most current request, e.g., by indicating a number of seconds to work ahead.
- When a real-time VOD request comes to a Fluxer, it can check to see if the required segments have already been transcoded and, if so, begin delivering immediately while it performs the workahead of N segments based on the position of the request being served.
- Caching proxy server functionality is employed locally on a Fluxer to maintain a cache-layer for the work performed in real-time. Once a request has been transcoded the derivative is cached locally within the transcoding region.
- the Fluxer leverages this feature by performing a lookahead request of N segments ahead of the current segment request. If a non-200 response code is returned by the local cache server for any of the N segments, Fluxer will respond by posting the required segment to a TR through its local cache server, resulting in caching of the transcoded response within the cache server layer.
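The lookahead described above can be sketched as follows. This is a minimal illustration in Python, not the platform's implementation: the function name `segments_needing_transcode`, the dictionary of cache response codes, and the choice to treat an unknown segment as a 404 miss are all assumptions made for the example.

```python
def segments_needing_transcode(current_segment, n_ahead, cache_status):
    """Return the workahead segments that the local cache could not serve.

    cache_status maps a segment number to the HTTP status code that the
    local cache server returned for it (hypothetical representation).
    """
    missing = []
    # Look at the N segments that follow the segment currently requested.
    for segment in range(current_segment + 1, current_segment + 1 + n_ahead):
        # Any non-200 response means the segment must be posted to a
        # transcoding resource through the cache layer.
        if cache_status.get(segment, 404) != 200:
            missing.append(segment)
    return missing
```

For a request at segment 10 with a workahead of 3, where only segment 11 is already cached, the sketch reports segments 12 and 13 as needing a transcode.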
- Pre-processing the media by transcoding the first few segments of a video means that the system can begin streaming immediately while the transcoder builds up a workahead buffer.
- Pre-processing typically includes the following actions:
- a Mapper load-feedback property can be used to find appropriate Fluxers for real-time VOD transcoding.
- real-time Fluxer requests use local MTR (dedicated) transcoder resources.
- Load-feedback from the Fluxer to the Mapper can include both the local Fluxer load and the regional transcoding resource load as well.
- Regional transcoder load estimation can be obtained from the Fluxer by making a call to TRAS to perform the “Regional Load Estimation”, as described above in connection with the TRAS component, and thereby return a “Regional Load Factor” to the Fluxer.
- the role of the transcoding resource is primarily to transcode segments of audio/video, or other content that needs to be transcoded.
- a transcoding resource uses an HTTP-based API for receiving and transmitting segments.
- all transcoding resources are considered unreliable—and particularly STRs.
- a shared transcoding resource may terminate the transcode for any reason, although if it terminates the transcode due to an error in the source media, it preferably indicates that fact to the Fluxer, e.g., using an HTTP 415 Unsupported Media Type error. If a Fluxer receives an unexpected disconnect from a transcoding resource (particularly an STR), it preferably ceases using that transcoding resource for at least a given time period, to prevent impacting STRs that are delivering content in the CDN.
- Load is a concern for STRs, since they are typically the HTTP proxy servers running in the CDN and delivering content to end users in FIGS. 1-16, and the integrity of the delivery network is preferably protected.
- the process managing the transcoding on the STR is configured to avoid impact to the STR.
- STRs monitor their local environment and terminate jobs if the environment becomes constrained. In the STR environment, the HTTP proxy server (ghost) process is considered more important than the transcoding process.
- STRs run a process “manager” which in turn runs and monitors the actual transcoding server as a child process. This “manager” may take any of several steps to “lock-down” the transcoding process such as using LD_PRELOAD to block dangerous system calls, chrooting the process and monitoring the process for excessive runtime and/or CPU consumption.
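One part of the "manager" behavior, monitoring the child for excessive runtime, can be sketched with the Python standard library. This is an illustration under stated assumptions: the function name `run_supervised` and the returned labels are hypothetical, and the LD_PRELOAD and chroot lock-down steps mentioned above are deliberately left out.

```python
import subprocess
import sys


def run_supervised(command, max_runtime_seconds):
    """Run a child process and stop it if it exceeds its runtime budget."""
    try:
        # subprocess.run kills and reaps the child when the timeout expires,
        # then raises TimeoutExpired.
        completed = subprocess.run(command, timeout=max_runtime_seconds)
    except subprocess.TimeoutExpired:
        return "terminated"
    return "completed" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # A trivial child stands in for the transcoding server process.
    print(run_supervised([sys.executable, "-c", "pass"], 30))
```

A production manager would presumably also watch CPU consumption and local load, so that the HTTP proxy process keeps priority over the transcoding process.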
- FIG. 23 provides an overview of processes executing on a transcoding resource (excluding HTTP proxy processes for content delivery).
- a client (e.g., a Fluxer) can communicate with transcoding resources using an HTTP 100 Expect/Continue workflow. This is preferable because a transcoding resource may not be able to handle any work, and it is wasteful to send an entire segment only to be denied.
- a transcoding resource may block for a period of time before sending a 100 Continue response to a requesting client but also preferably responds immediately if unable to handle the request.
- transcoding resources accept transcoding segments that are chunks or pseudo-chunks for transcoding.
- transcoders are generally considered unreliable by the Fluxers.
- a Fluxer receives a list of transcoding resources so that it may begin to send segments to them. Without a large, global, fine-grained, resource allocation system, it would be impossible to have a high degree of certainty that a given transcoding resource will accept a segment to transcode.
- transcoding resources run on commodity hardware, so failure of a transcoding resource during the transcoding process is not only a possibility but may even be likely at some point across the transcoding system. For this reason, it is simpler to adopt an unreliable view of transcoding resources. This view also simplifies the transcoding resource implementation.
- the Fluxer preferably sends the segment to an alternate transcoding resource. However, if the transcoding resource returns an actual error about the source bits (e.g. some fatal error with the original encode) then the Fluxer may send the segment to another transcoding resource or it may give up on the segment altogether, failing the transcode.
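The failover behavior can be sketched as a loop over candidate resources. This is only a sketch: `transcode_with_failover` and the `send` callback are hypothetical names, and of the two options the text allows for a source-media error (try another resource or give up), the example picks giving up.

```python
def transcode_with_failover(segment, resources, send):
    """Try candidate transcoding resources in order.

    send(resource, segment) is a caller-supplied function (hypothetical)
    that returns an HTTP status code or raises ConnectionError on an
    unexpected disconnect. Returns (resource_used, last_status).
    """
    last_status = None
    for resource in resources:
        try:
            last_status = send(resource, segment)
        except ConnectionError:
            # Unreliable resource: move on to an alternate.
            continue
        if last_status == 200:
            return resource, last_status
        if last_status == 415:
            # The source bits themselves are bad; retrying elsewhere is
            # unlikely to help, so this sketch fails the segment.
            return None, last_status
        # Any other status (e.g., 503 when the resource is full): try next.
    return None, last_status
```

Treating every resource as unreliable keeps the caller simple: it never needs to know why a resource refused or dropped a segment unless the error concerns the source media.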
- Possible transcoders are identified from a pool of available transcoding resources in one of a few ways.
- Mapper is used to provide a map that can return a list of possible resources which appear to be under a given load threshold, as mentioned above. This is provided over a DNS interface with the parameters encoded into the requesting hostname.
- This DNS request may return a large number of possible hosts—more than that associated with a typical DNS lookup in the delivery network.
- STRs returned are considered volatile and may accept or reject the request based on their own local load.
- Transcoding resources can have a fixed number of “slots” which is made up of two counters and indicates the number of individual transcode-segment requests that may be accepted by that transcoding resource at any given period of time.
- One counter is the “available-process” counter and is some sub-percentage of the number of available cores on the system.
- the other counter is the “queue” counter and is some configurable number of additional tasks that are allowed to be waiting but not actively being worked on. Both of these factors scale with the hardware the transcoding resource is installed on, and both are configurable. For example, a transcoding resource might use an available-process factor of 0.5 (or 50% of system cores) and a queue factor of 0.10 (or 10% of cores). Taken together, these two counters make up the total number of available “slots” for a given transcoding resource.
- As a transcoding resource is accepting work, it continues to accept requests to transcode segments so long as it has available processes and/or slots. Should the transcoding resource be completely full, it denies the request with an HTTP 503 Service Unavailable error. A 100 Expect/Continue method is otherwise used to ensure that the request is valid and that the transcoding resource has an available process to perform the requested action. If the processes are all allocated and an inbound Fluxer request lands on a queue slot, then the transcoding resource should block its “CONTINUE” response until the queue slot becomes assigned to a process.
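The two-counter slot accounting can be sketched as below. The class name `SlotTable`, its method names, and the truncating conversion from a fraction of cores to a slot count are assumptions for illustration. The "queued" result stands for a request whose 100 Continue response is withheld until a process frees up.

```python
from dataclasses import dataclass


@dataclass
class SlotTable:
    cores: int
    process_factor: float = 0.5   # fraction of cores usable as processes
    queue_factor: float = 0.10    # fraction of cores allowed to wait
    active: int = 0
    queued: int = 0

    @property
    def process_slots(self):
        return int(self.cores * self.process_factor)

    @property
    def queue_slots(self):
        return int(self.cores * self.queue_factor)

    def admit(self):
        """Decide how to answer an inbound transcode-segment request."""
        if self.active < self.process_slots:
            self.active += 1
            return "100 Continue"
        if self.queued < self.queue_slots:
            self.queued += 1
            return "queued"  # CONTINUE is blocked until a process is free
        return "503 Service Unavailable"

    def release(self):
        """A process finished; hand it to a queued request if one waits."""
        if self.queued > 0:
            self.queued -= 1  # the waiter takes over the freed process
        elif self.active > 0:
            self.active -= 1
```

On a 20-core machine with the example factors, the first 10 requests are accepted, the next 2 wait in queue slots, and a 13th is denied.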
- a queuing system exists to request files be transcoded at the earliest possible convenience.
- This queue contains a list of jobs that define a source, a transcode profile and a destination and will be executed on as soon as possible given the resources available.
- the queue itself is quite simple, can be distributed into many sub-queues, and will mostly be used by some user interface to provide batch-transcoding services for bitrates that a content provider wishes to create and have stored for later delivery.
- Upon waking up, the local queue manager will simply take the top N jobs off the stack and make required batch requests to the Fluxers, allowing the transcoding system to work to complete the transcoding job.
- Multiple queues may be running within a given transcoding region, typically running on the same hardware that is running the Fluxer or TRAS code.
- Examples of jobs which the transcoding system is configured to support may include the following (which are non-limiting examples):
- the transcoding system also preferably supports the application of filters and scalers (i.e. deinterlacing and frame-scaling).
- the system described herein is not limited to such.
- the teachings above may be extended so as to provide a distributed platform for applying security or rights management schemes to content.
- the system above may be modified by having the Fluxer receive requests (by way of illustration) to apply a given encryption algorithm to a file.
- the Fluxer can break up the file into segments that are each to be encrypted, and delegate the tasks of doing so to distributed MTRs and STRs, as described above.
- the nature of the assigned task may change but the system still operates similarly.
- Other tasks might include embedding a watermark in the content, or inserting data to apply a digital rights management scheme to the file.
- system can receive an end-user client request for content, discern information about the end-user client (client IP address, user-agent, user-id, other identifier, etc.) and incorporate that data into a fingerprint that is inserted into the content in real-time, leveraging the real-time transcoding flow described above (e.g., FIG. 22 ) to convert the file on the fly.
- the content can be marked with information related to the end-user (or client machine) to whom it was delivered.
- the following presents examples of how the Fluxer can break apart incoming files into transcoding segments, and more particularly how it can break apart incoming video files.
- segmented parallel encoding typically makes the tradeoff of inflexible keyframe intervals for the speed of encoding videos using a large number of encoders operating in parallel. If keyframe intervals are not altered then the boundary of a keyframe may be considered a chunk or segment and treated independently of other chunks.
- the transcode can be parallelized, increasing its speed relative to the number of encoders and reducing the encoding time to a minimum of (demuxing_time+slowest_segment_encode_time+re-muxing_time).
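The lower bound on encoding time can be made concrete with a small calculation. The sketch assumes there are at least as many encoders as segments, so that every segment encodes at once. The function names and the sample timings are invented for illustration.

```python
def parallel_transcode_time(demux_time, segment_encode_times, remux_time):
    """Best case: all segments encode simultaneously on separate encoders."""
    return demux_time + max(segment_encode_times) + remux_time


def sequential_transcode_time(demux_time, segment_encode_times, remux_time):
    """Single encoder: segments are encoded one after another."""
    return demux_time + sum(segment_encode_times) + remux_time
```

With a 2-second demux, segment encodes of 5, 7 and 6 seconds, and a 3-second re-mux, the sequential time is 23 seconds while the parallel lower bound is 12 seconds, set by the slowest segment.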
- Codecs enable the compression of video by taking advantage of the similarity between frames.
- I-frames (also known as keyframes) are frames that contain the complete information needed to construct a complete frame on their own.
- P-frames reference essentially what has changed between itself and the previous frame while B-frames can refer to frames ahead of them or behind them.
- the group of frames that starts with an I-frame and ends with the last frame before the next I-frame is often referred to as a Group Of Pictures or “GoP”.
- a video that is encoded as a Closed-GoP video means that each GoP can be treated independently from the others.
- a container generally refers to a file that wraps the raw encoded bits of media (e.g., audio, video) and may provide indexing, seekability and metadata.
- a container divides the raw bits into “packets” which may contain one or more frames.
- a frame typically has a frame-type of audio, video or a number of less-frequent possibilities such as subtitles and sprites. For video, these frames each correspond to the type of frames mentioned above, I-Frame, B-Frame, P-Frame, etc.
- GoP size modification becomes complicated when parallelizing transcodes across multiple processors. For example, a typical encode may have a GoP size of 250 frames (8.34 seconds of NTSC video). This can be an issue for formats that call for high keyframe rates, which may be present, e.g., in HD video formats. If an HD or other video format is desired to run 2-3 seconds between keyframes (approximately 60-90 frames in the GoP), neither 60 nor 90 frames divides evenly into the 250-frame source GoP. Solving this problem involves maintaining some kind of alignment over how many frames will be required to decode the frames necessary to produce a keyframe at an unusual time. Consider re-encoding to a new GoP size of 90 frames:
- NEWGoP 1 will be frames 1 - 90 , and needs frames 1 - 90 to be able to be re-encoded
- NEWGoP 2 will be from frame 91 - 180 and needs frames 1 - 180 to be able to be re-encoded
- NEWGoP 3 will be from frames 181 - 270 and will therefore need frames 1 - 270 to be able to be re-encoded. Notice, we've crossed into a new GoP now. NEWGoP 3 will have to start with the first GoP and need several frames from the second GoP in order to be encoded.
- NEWGoP 4 doesn't have this problem; it will be made up of frames 271 - 360 and therefore only needs frames 251 - 360 in order to start from a keyframe and encode its bits.
- FIG. 24 illustrates this scenario.
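The frame ranges in this example follow a simple rule: each new GoP needs every source frame from the most recent source keyframe at or before its first frame, through its own last frame. A minimal sketch of that rule, assuming 1-based frame numbers, a fixed 250-frame source GoP, and no forward references across source GoP boundaries (the function name `frames_needed` is hypothetical):

```python
def frames_needed(new_gop_number, source_gop_size=250, target_gop_size=90):
    """Return ((first, last) of the new GoP, (first, last) source frames needed)."""
    first = (new_gop_number - 1) * target_gop_size + 1
    last = new_gop_number * target_gop_size
    # Decoding must start at the source keyframe at or before `first`.
    source_keyframe = ((first - 1) // source_gop_size) * source_gop_size + 1
    return (first, last), (source_keyframe, last)


for n in range(1, 5):
    print(n, frames_needed(n))
# 1 ((1, 90), (1, 90))
# 2 ((91, 180), (1, 180))
# 3 ((181, 270), (1, 270))
# 4 ((271, 360), (251, 360))
```

The third new GoP is the expensive one: it straddles the source keyframe at frame 251, so it needs all of the first source GoP plus the start of the second.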
- a pseudo-chunking approach can address this issue by, in one embodiment, allowing for segments that are not aligned to keyframes or GoPs.
- a pseudo-chunk may be larger or smaller than a GoP.
- the Fluxer can create a pseudo-chunk that extends past the Current GoP to reach the end of NewGoP 3 .
- Pseudo-chunking also applies to scene change detection, and more particularly, to situations where there are frequent scene changes in a video file.
- a scene change refers to an interruption in the regular sequence of keyframes. It typically exists because enough has changed from one frame to the next that a P or B frame becomes impractical, i.e., there is enough difference between frames for the encoder to place an additional keyframe in-line for quality's sake. Most modern encoders contain some threshold for inserting extra keyframes on scene changes in order to optimize the encoding experience. Scene changes can present a problem if too simplistic an algorithm is used when segmenting, such as simply splitting on keyframes.
- a pseudo-chunking approach, in which pseudo-chunks may span more than one keyframe in appropriate circumstances, can address this issue (e.g., by including some predetermined minimum number of frames/time in the pseudo-chunk segment, regardless of keyframe intervals).
- Pseudo-chunking addresses open GoP encoding as well.
- typically, a GoP ends with a P-frame (which references a previous frame). This is a closed GoP.
- a GoP may instead end with a B-frame, which could refer to the next frame in the next GoP (the starting I-frame). When this occurs it is referred to as an open GoP.
- An open-GoP presents a problem over a closed-GoP when parallelizing encodes because some amount of the next GoP is required to complete the encode.
- a device managing the transcode (such as the Fluxer in the transcoding system previously described) is configured to be aware of what frames it needs to use, as a subset of those received, to produce a new transcode.
- the Fluxer will look at a frame to determine what kind of frame it is (B-frame, P-frame, keyframe, etc.), recognize conditions such as a closed-GoP situation, and understand what GoP size it needs to target. It is frame-aware.
- the Fluxer has intelligence to create pseudo-chunks, rather than blindly segmenting on keyframes. It can then include the appropriate coded frames in a pseudo chunk, so that the transcoding resource has all the data it needs to decode, convert the data, and re-encode as required.
- a pseudo chunk may be either a partial or super-GoP.
- a pseudo chunk is used as a unit of data that is transferred from a Fluxer to a transcoder and may not include the entire GoP if the entire GoP is not required for transcoding the target number of frames.
- a pseudo chunk may also contain more frames than a given GoP in the case of an Open GoP condition or if the target keyframe interval is sufficiently different from the source keyframe interval. So a pseudo-chunk is not necessarily aligned with a GoP, and may extend past the original GoP boundary or not reach that far.
- FIG. 25 illustrates an example of pseudo-chunking to change the GoP size in a given video file.
- the pseudo-chunk starts at a keyframe boundary and continues past the Current GoP (the original GoP) until enough frames are included to construct the New GoP that bridges the boundary between Current GoP 1 and Current GoP 2 .
- Given a video that is 1 frame per second and has a 10 second GoP we have a GoP every 10 frames ( 1 - 10 , 11 - 20 , 21 - 30 , etc. . . . ).
- Current GoPs 1 and 2 are such GoPs with 10 frames each.
- the Fluxer preferably ensures that the last frame of the pseudo-chunk is not a B-frame referring to a frame ahead of it. If it is, then another frame(s) may need to be included in Pseudo Chunk 1 .
- pseudo-chunking involves including both the starting and ending keyframes to deal with open GOP situations.
- In sequential encoding, one would only need the frames that are desired to be encoded, and the keyframe of the next GOP is unnecessary. In the parallel transcoding case, however, and with a “frame-aware” Fluxer, one can and should send the extra frame.
- the Fluxer ensures that our pseudo-chunks always start on a keyframe and continue past the frame-number of the last needed frame to the point that there are either no further forward-looking B-frames or it encounters the next keyframe.
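The boundary rule just stated can be sketched over a list of frame types. The model is a simplification made for the example: frames are listed in display order as "I", "P" or "B", and every B-frame is assumed to reference the next non-B frame. The function name `pseudo_chunk_bounds` is hypothetical.

```python
def pseudo_chunk_bounds(frame_types, first_needed, last_needed):
    """Return (start, end) indices of a pseudo-chunk covering the needed frames."""
    # Back up to the nearest keyframe so the chunk can be decoded on its own.
    start = first_needed
    while start > 0 and frame_types[start] != "I":
        start -= 1
    # Extend forward while the chunk would end on a forward-looking B-frame;
    # stop once it ends on a P-frame or on the next keyframe.
    end = last_needed
    while frame_types[end] == "B" and end + 1 < len(frame_types):
        end += 1
    return start, end


frames = list("IBBPBBPBBIBBP")  # open GoP: frames 7-8 refer to keyframe 9
print(pseudo_chunk_bounds(frames, 3, 8))   # (0, 9)
print(pseudo_chunk_bounds(frames, 9, 12))  # (9, 12)
```

In the first call the chunk grows in both directions: back to keyframe 0 so decoding can start, and forward to keyframe 9 because the trailing B-frames depend on it.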
- a pseudo-chunking Fluxer can mitigate the effects of frequent scene changes, which can produce transcoding segments that are too small, by applying certain thresholds (minimum number of frames for a segment) in the pseudo-chunking process.
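A minimum-size threshold of this kind can be sketched as follows. The example assumes 0-based frame indices and a known list of keyframe positions. The function name `merge_short_segments` and the greedy left-to-right merging strategy are illustrative choices, not taken from the text.

```python
def merge_short_segments(keyframe_positions, total_frames, min_frames):
    """Split on keyframes, but skip any split that would leave a short segment.

    Returns a list of (first_frame, last_frame) tuples.
    """
    segments = []
    start = keyframe_positions[0]
    for keyframe in keyframe_positions[1:]:
        # Only cut here if the segment so far has reached the threshold.
        if keyframe - start >= min_frames:
            segments.append((start, keyframe - 1))
            start = keyframe
    # The final segment takes whatever remains and may be shorter.
    segments.append((start, total_frames - 1))
    return segments


print(merge_short_segments([0, 10, 12, 15, 30], 40, 10))
# [(0, 9), (10, 29), (30, 39)]
```

The keyframes at 12 and 15, such as a burst of scene changes might produce, are absorbed into the segment that starts at frame 10 instead of producing two tiny segments.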
- Fluxer can produce an index file describing the breakup of all pseudo chunks produced, for the input audio and video tracks, called a “Chunk Index Header”. This file can be used for accelerating real-time transcodes by identifying the individual pseudo chunks for the particular input and what byte-offsets they occupy in the file, making retrieval of discrete units easier.
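The text does not specify a layout for the Chunk Index Header, so the following is only a guess at the kind of information it might carry. The `ChunkEntry` fields and the `read_chunk` helper are hypothetical. Each entry stores an explicit offset and length, which allows neighboring pseudo-chunks to overlap in the source file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkEntry:
    track: str    # e.g., "video" or "audio"
    number: int   # position of the pseudo-chunk within its track
    offset: int   # byte offset of the pseudo-chunk in the source file
    length: int   # size of the pseudo-chunk in bytes


def read_chunk(source: bytes, entry: ChunkEntry) -> bytes:
    """Fetch one pseudo-chunk without scanning the rest of the file."""
    return source[entry.offset:entry.offset + entry.length]


index = [
    ChunkEntry("video", 0, offset=0, length=6),
    ChunkEntry("video", 1, offset=4, length=6),  # overlaps chunk 0 by 2 bytes
]
```

With an index of this kind, a real-time request for a given segment can be served by reading one byte range from storage or origin.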
- pseudo-chunking is not limited to the applications described above, nor is it limited to use by a Fluxer described herein. Any module charged with segmenting a file for encoding may employ pseudo-chunking. Further, other forms of media, particularly those that utilize atomic data that references other data in a stream (as do B-frames, P-frames, etc.)
- the clients, servers, and other devices described herein may be implemented with conventional computer systems, as modified by the teachings hereof, with the functional characteristics described above realized in special-purpose hardware, general-purpose hardware configured by software stored therein for special purposes, or a combination thereof.
- Software may include one or several discrete programs. Any given function may comprise part of any given module, process, execution thread, or other such programming construct. Generalizing, each function described above may be implemented as computer code, namely, as a set of computer instructions, executable in one or more processors to provide a special purpose machine. The code may be executed using conventional apparatus—such as a processor in a computer, digital data processing device, or other computing apparatus—as modified by the teachings hereof. In one embodiment, such software may be implemented in a programming language that runs in conjunction with a proxy on a standard Intel hardware platform running an operating system such as Linux. The functionality may be built into the proxy code, or it may be executed as an adjunct to that code.
- FIG. 26 is a block diagram that illustrates hardware in a computer system 2600 upon which such software may run in order to implement embodiments of the invention.
- the computer system 2600 may be embodied in a client device, server, personal computer, workstation, tablet computer, wireless device, mobile device, network device, router, hub, gateway, or other device.
- Representative machines on which the subject matter herein is provided may be Intel Pentium-based computers running a Linux or Linux-variant operating system and one or more applications to carry out the described functionality.
- Computer system 2600 includes a processor 2604 coupled to bus 2601 . In some systems, multiple processors and/or processor cores may be employed. Computer system 2600 further includes a main memory 2610 , such as a random access memory (RAM) or other storage device, coupled to the bus 2601 for storing information and instructions to be executed by processor 2604 . A read only memory (ROM) 2608 is coupled to the bus 2601 for storing information and instructions for processor 2604 . A non-volatile storage device 2606 , such as a magnetic disk, solid state memory (e.g., flash memory), or optical disk, is provided and coupled to bus 2601 for storing information and instructions. Other application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or circuitry may be included in the computer system 2600 to perform functions described herein.
- the system 2600 may have a peripheral interface 2612 that communicatively couples computer system 2600 to a user display 2614 that displays the output of software executing on the computer system, and an input device 2615 (e.g., a keyboard, mouse, trackpad, touchscreen) that communicates user input and instructions to the computer system 2600 .
- the peripheral interface 2612 may include interface circuitry, control and/or level-shifting logic for local buses such as RS-485, Universal Serial Bus (USB), IEEE 1394, or other communication links.
- Computer system 2600 is coupled to a communication interface 2616 that provides a link (e.g., at a physical layer, data link layer, or otherwise) between the system bus 2601 and an external communication link.
- the communication interface 2616 provides a network link 2618 .
- the communication interface 2616 may represent an Ethernet or other network interface card (NIC), a wireless interface, modem, an optical interface, or other kind of input/output interface.
- Network link 2618 provides data communication through one or more networks to other devices. Such devices include other computer systems that are part of a local area network (LAN) 2626 . Furthermore, the network link 2618 provides a link, via an internet service provider (ISP) 2620 , to the Internet 2622 . In turn, the Internet 2622 may provide a link to other computing systems such as a remote server 2630 and/or a remote client 2631 . Network link 2618 and such networks may transmit data using packet-switched, circuit-switched, or other data-transmission approaches.
- the computer system 2600 may implement the functionality described herein as a result of the processor executing code.
- code may be read from or stored on a non-transitory computer-readable medium, such as memory 2610 , ROM 2608 , or storage device 2606 .
- Other forms of non-transitory computer-readable media include disks, tapes, magnetic media, CD-ROMs, optical media, RAM, PROM, EPROM, and EEPROM. Any other non-transitory computer-readable medium may be employed.
- Executing code may also be read from network link 2618 (e.g., following storage in an interface buffer, local memory, or other circuitry).
Abstract
The subject matter herein generally relates to transcoding content, typically audio/video files though not limited to such, from one version to another in preparation for online streaming or other delivery to end users. Such transcoding may involve converting from one format to another (e.g., changing codecs or container formats), or creating multiple versions of an original source file in different bitrates, frame-sizes, or otherwise, to support distribution to a wide array of devices and to utilize performance-enhancing technologies like adaptive bitrate streaming. A transcoding platform is described herein that, in certain embodiments, leverages distributed computing techniques to transcode content in parallel across a platform of machines that are preferably idle or low-utilization resources of a content delivery network. The transcoding system also utilizes, in certain embodiments, improved techniques for segmenting the original source file so as to enable different segments to be sent to different machines for parallel transcodes.
Description
- This application claims the benefit of priority of U.S. Provisional Application No. 61/556,236, filed Nov. 6, 2011, and of U.S. Provisional Application No. 61/556,237, filed Nov. 6, 2011, the teachings of both of which are hereby incorporated by reference in their entirety.
- 1. Technical Field
- This disclosure relates generally to computer systems for processing of media files, and other content, using distributed computing techniques.
- 2. Brief Description of the Related Art
- Content providers (such as large-scale broadcasters, film distributors, and the like) desire to distribute their content online in a manner that complements traditional mediums such as broadcast TV (including high definition or “HD” television) and DVD. It is important to them to have the ability to distribute content to a wide variety of third-party client application/device formats, and to offer a quality viewing experience regardless of network conditions, using modern technologies like adaptive bitrate streaming. Notably, since Internet-based content delivery is no longer limited to fixed line environments such as the desktop, and more and more end users now use mobile devices to receive and view content in wireless environments, the ability to support new client device formats and new streaming technologies is particularly important.
- Media files are one common kind of content that content providers distribute. A media file may be single-media content (e.g., audio-only media) or the media file may comprise multiple media types, i.e., a multimedia file with audio/video data. Generally speaking, a given multimedia file is built on data in several different formats. For example, the audio and video data are each encoded using appropriate codecs, which are algorithms that encode and compress that data. Example codecs include H.264, VP6, AAC, MP3, etc. A container or package format functions as a wrapper and describes the data elements and metadata of the multimedia file, so that a client application knows how to play it. Example container formats include Flash, Silverlight, MP4, PIFF, and MPEG-TS.
- The bit rate at which to encode the audio and video data must be selected. An encoding with a lower bitrate and smaller frame size (among other factors) generally will be easier to stream reliably, since the amount of data will be smaller, but the quality of the experience will suffer. Likewise, an encoding at a higher-bitrate and a larger frame will be a higher quality experience, but is more likely to lead to interrupted and/or poor quality streams due to network delivery issues. Current adaptive bitrate streaming technologies require multiple streams each encoded at a different bitrate, allowing the client and/or server to switch between streams in order to compensate for network congestion.
- While other kinds of media files (like an audio-only file) may be somewhat less complex than the multimedia file described above, they nevertheless present similar issues in terms of encoding and formatting, stream quality tradeoffs, and player compatibility.
- Hence, to support the distribution of content to a wide variety of devices, content providers typically must create many different versions of their content. For example, they often will create multiple copies of a given movie title at different screen sizes, bit rates, quality levels and client player formats. Furthermore, over time they may want to change formats, for example by updating the encoding (e.g., to take advantage of newer codecs that compress content more efficiently). They may also need to change the container format to accommodate new client environments, a process often referred to as transmuxing. Failing to provide certain bit rates or poor encoding practices will likely reduce the quality of the stream. But generating so many different versions of content, as well as converting from one to another and storing them, is a time-consuming and costly process that is difficult to manage.
- For online delivery (e.g., streaming, download) of these various versions of content, content providers often use distributed computing systems to deliver their content. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. Typically, “content delivery” means the storage, caching, or transmission of content, streaming media and applications on behalf of content providers, including ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence.
- A content delivery network such as that just described typically supports different content formats, and offers many advantages for accelerating the delivery of content, once created. However, the content provider still faces the problem of creating and managing the creation of all of the various versions of content that it desires and/or that are necessary.
- Thus, there is a need to provide methods and systems for generating, preparing and transforming streaming content in an efficient and scalable way. There is also a need to provide such functionality in a way that is compatible with delivery solutions so as to provide an overall end-to-end solution for content providers. The teachings herein address these needs and offer other features and advantages that will become apparent in view of this disclosure.
- The subject matter herein generally relates to transcoding content, typically audio/video files though not limited to such. Typically the transcoding is performed in preparation for online streaming or other delivery to end users. Such transcoding may involve converting from one format to another (e.g., converting codecs or container formats), or creating multiple versions of an original source file in different bitrates, resolutions, or otherwise, to support distribution to a wide array of devices and to utilize performance-enhancing technologies like adaptive bitrate streaming. This disclosure describes a transcoding platform that, in certain embodiments, leverages distributed computing techniques to transcode content in parallel across a platform of machines that are preferably idle or low-utilization resources of a content delivery network. The transcoding system also utilizes, in certain embodiments, improved techniques for breaking up the original source file that are performed so that different segments of the file can be sent to different machines for transcoding in parallel.
- In one embodiment, a transcoding platform is made up of distributed transcoding resources, typically servers with available processing power and programmed to assist in the transcoding function. These transcoding resources may be dedicated machines as well as machines that are shared with other functions. In particular, the machines can be idle or low-utilization HTTP proxy servers (relative to other such proxy servers) in a content delivery network. While these machines may spend much of their time receiving and responding to client requests for content, and otherwise facilitating delivery of online content to requesting end-users, at certain times (in the middle of the night in their local time zone, for example) they may be relatively lightly-loaded, and hence available to perform certain transcoding tasks. The transcoding platform may also include a set of machine(s) that manage and coordinate the transcoding process. These machines may receive requests to perform a particular transcoding job, e.g., to convert a particular file from a first version to a second version. The request may come from a user interface (through which a content provider user of the platform uploads their content to be transcoded, for instance), from a network storage system, or from components in the content delivery network that are streaming content (e.g., that need to be able to deliver a particular format to a requesting end-user client), including one of the proxy servers. As appropriate, depending on the foregoing circumstances, the transcoding job may be designated with a priority level, which may correspond semantically to a “live”, “real-time” or “batch” mode conversion. In some cases, the proxy servers are only used if the priority level is below a certain threshold because the proxy servers are considered to be unreliable for transcoding tasks.
Indeed, proxy servers may operate such that content delivery processes (e.g., responding to client requests) take priority over transcoding tasks when allocating processing time within the proxy server.
- Continuing with the current example, a machine(s) managing the transcoding process obtains a list of candidate servers for performing transcoding tasks. This list may include the results of a lookup into the content delivery network's monitoring and mapping system to determine which proxy servers within the network are currently experiencing a relatively light load for content delivery services, as measured by such metrics as processor (CPU), memory, or disk utilization, and/or client request rate, etc. The management machine retrieves the file to be transcoded and breaks it up into segments suitable to be independently converted. These segments are then sent to the various transcoding resources (e.g., the proxy servers or the dedicated machines) distributed across the platform, which given the nature of the content delivery network may be global in nature. Also sent along are instructions with parameters about the desired transcode operation and/or target format. Each transcoding resource performs its task independently, e.g., decoding the chunk that it is given and re-encoding with the appropriate parameters. It then returns the result to the management machine(s), which reassembles the new segments into the new file. Thus, for example, the proxy servers can continue to service client requests for content (the proxy process) while performing the transcode process with residual resources.
- Because proxy servers are responsible for servicing client requests, that process typically takes priority over the transcoding process. In some cases, the proxy server may determine that it cannot complete the transcode request and may send a message back to the management machine with an error or otherwise indicating it will not complete the transcode. Typically this would occur if the proxy server's load began to increase or to exceed a particular threshold.
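- The selection and fallback behavior described above can be illustrated with a minimal sketch. It is not the platform's actual implementation: the `Server` record, the numeric priority values, the load thresholds, and the round-robin assignment are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass

# Hypothetical numeric priorities for the "batch", "real-time" and "live" modes.
PRIORITY = {"batch": 0, "real-time": 1, "live": 2}


@dataclass
class Server:
    name: str
    dedicated: bool   # True for a dedicated transcoding machine
    cpu_load: float   # 0.0-1.0, assumed to come from a monitoring system


def candidate_servers(servers, job_mode, proxy_priority_limit=1, max_load=0.5):
    """Return servers eligible for a job, least loaded first.

    Proxy servers are considered only when the job priority is below
    proxy_priority_limit and their current load is under max_load.
    """
    allow_proxies = PRIORITY[job_mode] < proxy_priority_limit
    eligible = [s for s in servers
                if s.dedicated or (allow_proxies and s.cpu_load < max_load)]
    return sorted(eligible, key=lambda s: s.cpu_load)


def proxy_accepts(server, decline_above=0.7):
    """A proxy declines a task once its delivery load exceeds a threshold."""
    return server.dedicated or server.cpu_load <= decline_above


def dispatch(segments, servers, job_mode):
    """Assign segments round-robin; declined segments are returned for retry."""
    pool = candidate_servers(servers, job_mode)
    assigned, retry = {}, []
    for i, seg in enumerate(segments):
        target = pool[i % len(pool)] if pool else None
        if target is not None and proxy_accepts(target):
            assigned[seg] = target.name
        else:
            retry.append(seg)
    return assigned, retry
```

In this sketch a "live" job is routed only to dedicated machines, while a "batch" job may also use lightly loaded proxies; any segment a proxy declines is handed back so the management machine can reassign it.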
- The transcoding process may involve changing any of a variety of characteristics of the file, for example and without limitation, changing a codec used to encode data in the file, changing a container format of the file, and/or changing one or more encoding parameters or container format parameters. Thus the transcoding process may involve changing a bit-rate of encoded data in the file, an image resolution for data in the file, a frame size for data in the file, an aspect ratio for data in the file, a compression setting used to encode data in the file, other settings such as GoP settings, color spaces, stereo/audio settings, etc. The transcoding process may also involve changing other characteristics, such as an interlacing characteristic for data in the file. In addition, the system may be used to change or add security features to the file, e.g., by applying encryption, embedding a watermark or a fingerprint in the content, or inserting data to apply a digital rights management scheme to the file.
- In some cases, when the source file is a video, the platform uses a pseudo-chunking approach for breaking up the video file to create the transcoding segments. For example, the management machine(s) can be configured to be frame-aware, such that it can include “additional” frames in a given segment to enhance the ability for a given transcoding resource to transcode that segment independently of other frame information in the file. This is advantageous and sometimes necessary because the transcoding resource usually will not receive the entire original source file. Such pseudo-chunking techniques are useful when the transcode involves modifying the size of GoPs, the rate of keyframes in the source file is relatively high, or the source file contains so-called open GoPs, among other scenarios.
- More specifically, in some embodiments, a frame-aware segmentation process (e.g. in the management server) can receive a video file that is to be converted from a first version to a second version. The video file is typically made up of a plurality of frames organized into a plurality of groups-of-pictures (GoPs). The segmenter examines frames in the file to identify a given GoP and to determine the type of frames in the given GoP, and creates a segment that includes frames beyond those in the given GoP. This segment is then sent off to be independently transcoded as described above.
- The inclusion of the additional frames may occur because the segmenter determines that the given GoP cannot be divided into a whole number of target GoPs (the target GoPs representing desired GoPs for the second version and having a smaller number of frames), in which case the segmenter can create the segment from the file to include at least some frames in the given GoP and at least one frame from a GoP immediately following the given GoP.
- Another possibility is that the target GoP is larger than the given GoP, and that it is not a whole-number-multiple of the size of the given GoP, in which case the segmenter can create the segment to include the given GoP and at least enough frames from GoPs immediately following the given GoP such that the segment reaches the size of the target GoP.
- Another possibility is that the segmenter identifies the given GoP as an open-GoP, and therefore creates the segment to include all of the frames from the given GoP and frames (e.g., up to and including a keyframe) from a GoP immediately following the given GoP.
- Yet another possibility is that the segmenter determines that the given GoP contains a number of frames that is less than a predetermined minimum number of frames, and so creates the segment to include the given GoP and at least enough additional frames so as to reach that predetermined minimum number of frames.
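- The four cases above can be summarized in a short sketch of a frame-aware segmenter. This is an illustration under stated assumptions, not the disclosed implementation: frames are modeled as the strings "I", "P" and "B", the function names are hypothetical, and taking the largest requirement when several cases apply is a design choice made here for simplicity.

```python
def extra_frames_needed(gop_len, target_len, is_open, following, min_frames=0):
    """Number of frames to borrow from the GoPs that follow the given GoP.

    gop_len    -- frames in the given (non-empty) source GoP
    target_len -- frames per GoP desired in the output version
    is_open    -- True if the given GoP is an open GoP
    following  -- frame types after the given GoP, e.g. ["I", "P", ...]
    min_frames -- assumed minimum segment size, in frames
    """
    extra = 0
    if target_len < gop_len and gop_len % target_len != 0:
        # Source GoP does not split into a whole number of target GoPs:
        # borrow enough frames to complete the last target GoP.
        extra = max(extra, target_len - gop_len % target_len)
    elif target_len > gop_len and target_len % gop_len != 0:
        # Target GoP is larger and not a whole multiple: grow to target size.
        # (Whole multiples are assumed to be handled by grouping whole GoPs.)
        extra = max(extra, target_len - gop_len)
    if is_open:
        # Open GoP: include following frames up to and including a keyframe.
        to_key = following.index("I") + 1 if "I" in following else len(following)
        extra = max(extra, to_key)
    if gop_len < min_frames:
        # Short GoP: pad up to the minimum segment size.
        extra = max(extra, min_frames - gop_len)
    return min(extra, len(following))


def build_segment(gop, following, target_len, is_open=False, min_frames=0):
    """Return the given GoP plus any additional frames the rules call for."""
    extra = extra_frames_needed(len(gop), target_len, is_open, following, min_frames)
    return gop + following[:extra]
```

For example, with a 10-frame source GoP and a 4-frame target GoP, the sketch borrows 2 frames from the next GoP so that the segment holds three complete target GoPs.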
- As those skilled in the art will recognize, the foregoing merely refers to non-limiting embodiments of the subject matter disclosed herein. The teachings hereof may be realized in a variety of systems, methods, apparatus, and non-transitory computer-readable media. It is also noted that the allocation of functions to different machines is not limiting, as the functions recited herein may be combined or split amongst different machines in a variety of ways.
- The subject matter herein will be more fully understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
- FIG. 1 is a diagram illustrating one embodiment of a known distributed computer system configured as a content delivery network;
- FIG. 2 is a diagram illustrating one embodiment of a machine on which a CDN server in the system of FIG. 1 may be implemented;
- FIG. 3 is a diagram illustrating one embodiment of an architecture for live streaming delivery as described in U.S. application Ser. No. 12/858,177;
- FIG. 4 is a diagram illustrating one embodiment of an architecture and request flow of a video-on-demand approach as described in U.S. application Ser. No. 12/858,177;
- FIG. 5 is a schematic view of one embodiment of an architecture for live streaming, as described in U.S. application Ser. No. 13/329,057;
- FIG. 6 is a schematic view of one embodiment of an architecture for on-demand streaming as described in U.S. application Ser. No. 13/329,057;
- FIG. 7 is a schematic view illustrating the live streaming architecture of FIG. 5 in more detail as described in U.S. application Ser. No. 13/329,057;
- FIG. 8 illustrates an example of a first live streaming workflow used when a stream is published from an encoder to an entrypoint (EP) as described in U.S. application Ser. No. 13/329,057;
- FIG. 9 illustrates an example of a second live streaming workflow used when an end-user makes a live request for content as described in U.S. application Ser. No. 13/329,057;
- FIG. 10 illustrates an example of a process by which live streams can be announced in the exemplary architectures shown in FIGS. 5, 7, 8 and 9, as described in U.S. application Ser. No. 13/329,057;
- FIG. 11 illustrates an example of a technique for replicating live streams as described in U.S. application Ser. No. 13/329,057;
- FIG. 12 illustrates an example of an on-demand streaming workflow used when an end-user makes a request for content as described in U.S. application Ser. No. 13/329,057;
- FIG. 13 illustrates an example of the TransformLib component in more detail as described in U.S. application Ser. No. 13/329,057;
- FIG. 14 illustrates an example of a workflow supporting ingestion and output of a content stream in a given format as described in U.S. application Ser. No. 13/329,057;
- FIG. 15 illustrates an example of a workflow for supporting ingestion and output of a content stream in another given format as described in U.S. application Ser. No. 13/329,057;
- FIG. 16 illustrates an example of a workflow using binary-side-includes (BSI) to facilitate streaming as described in U.S. application Ser. No. 13/329,081;
- FIG. 17 is a block diagram of one embodiment of a transcoding platform that includes a transcoding region with certain machines, as well as an existing content delivery network with machines that are leveraged to provide transcoding resources;
- FIG. 18 illustrates an example of a workflow for video-on-demand batch transcoding in accordance with the teachings hereof;
- FIG. 19 illustrates an example of a workflow for live transcoding in accordance with the teachings hereof;
- FIG. 20 illustrates an example of a workflow for live transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 21 illustrates an example of a workflow for batch video-on-demand transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 22 illustrates an example of a workflow for real-time video-on-demand transcoding from the point of view of the Fluxer component, in accordance with the teachings hereof;
- FIG. 23 is a diagram illustrating examples of certain transcoding processes executing in a server functioning as a transcoding resource, in accordance with the teachings hereof;
- FIG. 24 is a diagram illustrating modification of group-of-picture (GoP) size as part of a transcoding job;
- FIG. 25 is a diagram illustrating an example of a pseudo-chunking approach for transcoding, in accordance with the teachings hereof; and,
- FIG. 26 is a diagram that illustrates hardware in a computer system that may be used to implement the teachings hereof.
- The following description sets forth non-limiting embodiments to provide an overall understanding of the principles of the structure, function, manufacture, and use of the methods, systems, and apparatus disclosed herein. The methods, systems, and apparatus described herein and illustrated in the accompanying drawings are non-limiting examples; the scope of the present invention is defined solely by the claims. The features described or illustrated in connection with one exemplary embodiment may be combined with the features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention. All patents, publications and references cited herein are expressly incorporated herein by reference in their entirety.
- The subject matter hereof provides improved ways to convert audio/video content (or other content) from one codec format to another, or from one container format to another, and/or that have different encoding/formatting settings, to generate multiple versions of a file. For example, the conversions may involve changing the bitrate (e.g., 10 Mbps to 500 kbps), frame size, aspect ratio, or changing compression settings (other than bitrate), and/or other characteristics such as GoP settings, color spaces, stereo/audio choices, sample rates, etc. The process may also involve changing other characteristics, such as whether interlacing is used. In addition, in some applications the teachings hereof may be used to change or add security features, such as encryption or watermarking, as will be described in more detail below. The term transcoding is used herein to refer to performing any or all of such transformations on a given piece of content; however it is not limited to such transformations, which are merely examples provided for illustrative purposes.
- In many embodiments, the transcoding techniques disclosed herein preferably are implemented in a distributed computing platform such as a content delivery network (CDN), and preferably one that can not only perform transcoding services but also deliver the transcoded content. An example of a content delivery network platform is now described.
- Content Delivery Network
- FIG. 1 illustrates a known distributed computer system 100 that is configured as a CDN and is assumed to have a set of machines 102 distributed around the Internet. Typically, most of the machines are servers located near the edge of the Internet, i.e., at or adjacent to end user access networks. A network operations command center (NOCC) 104 manages operations of the various machines in the system. Third party sites, such as web site 106, offload delivery of content (e.g., HTML, embedded web page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the CDN's content servers 102 (sometimes referred to as “edge” servers in light of their location near the “edges” of the Internet, or as proxy servers if running an HTTP proxy or other proxy process, as is typical and as is described further below in connection with FIG. 2). Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service. End users that desire the content are directed to the distributed computer system to obtain that content more reliably and efficiently. Although not shown in detail, the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the edge servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems. Network agents 118 monitor the network as well as the server loads and provide network, traffic and load data (e.g., from the CDN's content servers 102) to a DNS query handling mechanism 115, which is authoritative for content domains being managed by the CDN and which responds to DNS queries from end users by handing out, e.g., addresses for one or more of the content servers in the CDN. A distributed data transport mechanism 120 may be used to distribute control information (e.g., metadata to manage content, to facilitate load balancing, and the like) to the servers.
- More detail about CDN operation can be found in U.S. Pat. Nos. 7,293,093 and 7,693,959, the disclosures of which are incorporated by reference.
- As illustrated in FIG. 2, a given machine 200 comprises commodity hardware (e.g., an Intel Pentium processor) 202 running an operating system kernel (such as Linux or variant) 204 that supports one or more applications 206 a-n. To facilitate content delivery services, for example, given machines typically run a set of applications, such as an HTTP web proxy 207 (sometimes referred to as a “global host” or “ghost” process), a name server 208, a local monitoring process 210, a distributed data collection process 212, and the like. The machine running the proxy 207 typically provides caching functionality for content passing therethrough, although it need not. For streaming media, the machine typically includes one or more media servers, such as a Windows Media Server (WMS) or Flash server, as required by the supported media formats.
- A given content server is configured to provide one or more extended content delivery features, preferably on a domain-specific, customer-specific basis, preferably using configuration files that are distributed to the edge servers using a configuration system. A given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features. The configuration file may be delivered to the content server via the data transport mechanism. U.S. Pat. Nos. 7,240,100 and 7,111,057 (the disclosures of which are hereby incorporated by reference) illustrate useful infrastructures for delivering and managing edge server content control information, and this and other edge server control information can be provisioned by the CDN service provider itself, or (via an extranet or the like) the content provider customer who operates the origin server. The CDN may provide secure content delivery among a client browser, edge server and customer origin server in the manner described in U.S. Publication No. 20040093419. Secure content delivery as described therein enforces SSL-based links between the client and the content server, on the one hand, and between the content server process and an origin server process, on the other hand. This enables an SSL-protected web page and/or components thereof to be delivered via the content server.
- The CDN may include a network storage subsystem (sometimes referred to as “NetStorage”), such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference.
- Streaming Using a Content Delivery Network
- The CDN described above may be designed to provide a variety of streaming services. For example, for fault tolerant streaming delivery, the CDN may include a delivery subsystem, such as described in U.S. Pat. No. 7,296,082, the disclosure of which is incorporated herein by reference.
- In other streaming implementations, the CDN may be extended to provide an integrated HTTP-based delivery platform that provides for the delivery online of HD-video quality content to the most popular runtime environments and to the latest devices in both fixed line and wireless environments. An example of such a platform is set forth in U.S. Ser. No. 12/858,177, filed Aug. 17, 2010 (now published as US Patent Publication 2011/0173345, incorporated herein by reference). The platform described there supports delivery of both “live” and “on-demand” content. It should be noted that while some of the description below and otherwise in application Ser. No. 12/858,177 uses the context of the Adobe Flash runtime environment for illustrative purposes, this is not a limitation, as a similar type of solution may also be implemented for other runtime environments both fixed line and mobile (including, without limitation, Microsoft Silverlight, Apple iPhone, and others).
- FIG. 3 illustrates an overview of an exemplary architecture for live streaming delivery as described in U.S. application Ser. No. 12/858,177, filed Aug. 17, 2010. As seen in the embodiment shown in FIG. 3, the system generally is divided into two independent tiers: a stream recording tier 300, and a stream player tier 302. The recording process (provided by the stream recording tier 300) is initiated from the Encoder 304 forward. Preferably, streams are recorded even if there are currently no viewers (because there may be DVR requests later). The playback process (provided by the stream player tier 302) plays a given stream starting at a given time. Thus, a “live stream,” in effect, is equivalent to a “DVR stream” with a start time of “now.”
- Referring to FIG. 3, the live streaming process begins with a stream delivered from an Encoder 304 to an Entry Point 306. A Puller component 308 (e.g., running on a Linux-based machine) in an EP Region (not shown) is instructed to subscribe to the stream on the EP 306 and to push the resulting data to one or more Archiver 310 processes, preferably running on other machines. In this embodiment, one of the Archivers 310 may operate as the “leader” as a result of executing a leader election protocol across the archiving processes. Preferably, the Archivers 310 act as origin servers for a content server's HTTP proxy processes (an example of which is shown at 312) for live or near-live requests. The HTTP proxy 312 provides HTTP delivery to requesting end user clients, one of which is the Client 314. A representative Client 314 is a computer that includes a browser, typically with native or plug-in support for media players, codecs, and the like. If DVR is enabled, content preferably is also uploaded to the Storage subsystem 316, so that the Storage subsystem serves as the origin for DVR requests.
- In operation, a request for content (e.g., from an end user Client 314) is directed to the HTTP proxy 312, preferably using techniques such as those described in U.S. Pat. Nos. 6,108,703, 7,240,100, 7,293,093 and others. When the HTTP proxy 312 receives an HTTP request for a given stream, it makes various requests, preferably driven by HTTP proxy metadata (as described in U.S. Pat. Nos. 7,240,100, 7,111,057 and others), possibly via a cache hierarchy 318 (see, e.g., U.S. Pat. No. 7,376,716 and others), to locate, learn about, and download a stream to serve to the Client 314. Preferably, the streaming-specific knowledge is handled by the HTTP proxy 312 that is directly connected to a Client 314. Any go-forward (cache miss) requests (issued from the HTTP proxy) preferably are standard HTTP requests. For example, when a Client 314 requests a particular stream, the HTTP proxy 312 starts the streaming process by retrieving a “Stream Manifest” that preferably contains attributes of the stream and information needed by the HTTP proxy 312 to track down the actual stream content.
- For “live” requests, the HTTP proxy 312 starts requesting content relative to “now,” which, in general, is approximately equal to the time at the content server's HTTP proxy process. Given a seek time, the HTTP proxy downloads a “Fragment Index” whose name preferably is computed based on information in the indexInfo range and an epoch seek time. Preferably, a Fragment Index covers a given time period (e.g., every few minutes). By consulting the Fragment Index, an “Intermediate Format (IF) Fragment” number and an offset into that IF fragment are obtained. The HTTP proxy 312 can then begin downloading the fragment (e.g., via the cache hierarchy 318, or from elsewhere within the CDN infrastructure), skipping data before the specified offset, and then begin serving (to the requesting Client 314) from there. In general, and unless the Stream Manifest indicates otherwise, for live streaming the HTTP proxy then continues serving data from consecutively-numbered IF Fragments.
- In the context of live HTTP-based delivery, the Intermediate Format (IF) describes an internal representation of a stream used to get data from the Puller through to the HTTP proxy. A “source” format (SF) is a format in which the Entry Point 306 provides content and a “target” format (TF) is a format in which HTTP proxy 312 delivers data to the Client 314. These formats need not be the same. Thus, SF may differ from TF, i.e., a stream may be acquired in FLV format and served in a dynamic or adaptive (variable bit rate) format. The format is the container used to convey the stream; typically, the actual raw audio and video chunks are considered opaque data, although transcoding between different codecs may be implemented as well. By passing the formats through the HTTP proxy 312 (and delivering to the Client 314 via conventional HTTP), the container used to deliver the content can be changed as long as the underlying codecs can be managed appropriately.
- The above-described architecture is useful for live streaming. The platform can also be used to support video on demand (VOD). In particular, the solution can provide VOD streaming from customer and Storage subsystem-based origins.
- For VOD delivery, the stream recorder tier 300 (of FIG. 3) is replaced, preferably with a translation tier. As described in Ser. No. 12/858,177, filed Aug. 17, 2010, typically VOD content is off-loaded to the CDN for HTTP delivery. In one embodiment, a conversion tool (a script) is used to convert source content (such as FLV) to IF, with the resulting IF files then uploaded to the Storage subsystem. The HTTP proxy 312 then gets the content and the Stream Manifest from the Storage subsystem. Exemplary translation tier approaches are described in more detail in Ser. No. 12/858,177, filed Aug. 17, 2010.
- An architecture and request flow of a VOD approach is shown in FIG. 4. In this embodiment, a translation tier 400 is located between an origin 402 (e.g., customer origin server, or the Storage subsystem, or other source of content) and the stream player tier 404.
- More detail about the above streaming architectures can be found in aforementioned U.S. application Ser. No. 12/858,177.
- It is known that the above-described streaming architecture can be enhanced in a variety of ways, for example as set forth in U.S. patent application Ser. No. 13/329,057, filed Dec. 16, 2011, (now published as US Publication No. US 2012/0265853 and as WIPO Publication No. WO/2012/083298) the contents of which are hereby incorporated by reference.
- Live Streaming Components
- FIG. 5 is a high-level component diagram illustrating one embodiment of an architecture for streaming live content, as set forth in U.S. patent application Ser. No. 13/329,057. In this embodiment, the Entry Point (EP) 502 ingests the stream to be delivered from an encoder 500, demuxes the stream from its native format to an IF format, such as a fragmented format like f-MP4, and archives the stream to Storage 504 (typically a network storage subsystem). The EP 502 serves “current” live stream fragments to a Streaming Mid-Tier (SMT) process 506, which is typically running on a separate SMT machine. The SMT 506 retrieves “current” live stream fragments from EP 502, and it generates a muxed output in the desired native format. In an alternative embodiment, the SMT 506 generates muxing instructions for use by a content server running an HTTP proxy process 508 (again, sometimes referred to as “global host” or simply “ghost”) in the CDN. The instructions are returned to the content server 508, along with the IF fragments if needed, although the IF fragments may have been previously cached by the content server 508 or retrieved by the content server from Storage 504 instead. The muxing instructions may be realized as binary-side-includes, or BSI, which is described in detail in U.S. patent application Ser. No. 13/329,057 and will be summarized below. The content server 508 forwards end-user requests to SMT 506, caches the response from SMT 506, which response either is a native output object for the stream or a BSI fragment, and, when BSI is used, the content server 508 also creates an output object from the BSI and IF fragment. The content server 508 also delivers the native output object to the end-user client, typically a client player application. It does not need to understand any container format(s). The Storage 504 stores an archive for DVR or VOD playback, and it also stores live stream session metadata.
- On Demand Streaming Components
- FIG. 6 is a high-level component diagram illustrating one embodiment of an architecture for streaming on-demand content. In this embodiment, the SMT 606 requests and receives the native on-demand file from either a customer origin 600 or Storage 604 (again, typically a network storage subsystem). The SMT 606 parses a native source file index and creates an intermediate MetaIndex. It also generates a muxed output object, or it generates muxing instructions (BSI or equivalent functionality) for use by the content server 608 to create the native object. The content server 608 forwards end-user requests to SMT 606, caches the response from SMT, which response either is a native output object or a BSI fragment, and, when BSI is used, the content server 608 also creates an output object from the BSI and IF fragment. Storage 604 typically stores on-demand files in native format.
- Live Streaming Operation
-
FIG. 7 illustrates further details regarding the EP and SMT components and their respective functions. - In this embodiment, the
EP 700 comprises two services: an ingest server 706 and an entry point stream manager (ESM) 701. The ingest server 706 is composed of a format-specific ingest server 706 and a library of functions 708, called TransformLib. The library 708 is a shared library that is linked into the ingest server 706. The library contains format-specific logic for muxing and demuxing. In operation, the ingest server 706 receives a stream from an encoder 702, authenticates the encoder 702, passes the received data to the library 708 for demuxing, and sends the demuxed stream to the ESM 701. The library, as noted above, demuxes from a native format (e.g., MP3, MPEG2-TS, or otherwise) to the IF, such as f-MP4. The ESM 701 is a format-independent component that preferably resides on the EP 700. The role of ESM 701 preferably is the same across different streaming formats. It receives the demuxed stream from the ingest server 706, manages ESM publishing points, archives the stream to Storage 705, serves “current” live requests from SMT, and announces active streams to all SMTs. An EP machine may be a Windows-based server, or a Linux-based server, or otherwise. Preferably, the ESM code is cross-platform compatible. - The SMT machine comprises two primary services:
SMT 712 and local ghost process 714. The local HTTP proxy (ghost) process 714 handles incoming HTTP requests from a content server ghost process 715. In response, the local ghost process 714 makes a forward request to the local SMT component 712. SMT component 712 passes the incoming request to TransformLib 716 for processing, and that processing is based on the container format. Preferably, TransformLib 716 first rewrites the container-specific incoming URL to an IF (e.g., f-MP4) forward URL. SMT 712 then retrieves the IF fragment on behalf of TransformLib 716. Finally, TransformLib 716 uses the IF fragment to create instructions (BSI), and to serve back any IF requests to the content server ghost 715. TransformLib 716 creates the output object in native format if the instruction set (BSI) approach is disabled. As noted, the local ghost process 714 makes the forward requests (to SMT component 712), and it caches the forward response on local disk. An intermediary caching process may be used between the SMT 712 and local ghost process 714. By using local ghost process 714 in the SMT machine, ghost-to-ghost communications between the content server and the SMT may be used (and optimized). -
FIG. 8 illustrates an embodiment of a first live streaming workflow that is used when a CDN customer publishes a stream from its encoder to a CDN entry-point (EP). -
FIG. 9 illustrates an embodiment of a second live streaming workflow that is used when an end-user makes a live request to a content server. - Referring now to
FIG. 8, the encoder publishes a live stream to the EP. The ingest server authenticates the encoder connection, preferably using a streamID to look up the appropriate stream configuration (Step 1). The ingest server then demuxes the input and pushes the stream to ESM (Step 2). ESM auto-creates a publishing point, preferably uploading to Storage three (3) XML-based files: LiveSession, LSM, and ACF. These per-session metadata files are created at the start of each live stream session (Step 3). The LiveSession file includes live stream information, such as entrypoint IP, sessionID, and streamState. The LSM includes session-specific metadata like bitrates, etc. ACF includes information for use in configuring an archive copy of the live stream. As ESM receives fragments from the ingest server, it aggregates the fragments into segments on the local disk. When the segment size reaches the accumulation threshold, it uploads the segment to Storage. With each segment uploaded to Storage, ESM also uploads an FDX file (Step 4). The FDX (Fragment Index) file is a binary encoded file that provides an index of the fragments that have been uploaded to Storage. This index tells SMT what fragments are in Storage and where to locate them. For fragments that are not in the FDX file, the fragment either is on the EP (because it has not been uploaded to Storage yet) or the fragment does not actually exist. Once the stream is stopped, the LSM and livesession.xml files are updated to change the “streamState” property from “started” to “stopped.” -
FIG. 9 illustrates an exemplary embodiment of a workflow when an end-user client makes a live streaming request to a ghost process on a content server. The client (e.g., a client media player application) makes a stream request to the content server ghost process (Step 1). This process then makes a forward request to SMT (Step 2). If this is the first request for this live stream to the SMT machine, SMT constructs and caches information about the live stream. To get this information about the live stream, SMT pulls information from Storage for the past DVR fragments and pulls information from the EP for the current fragments. SMT makes a request to Storage to get the livesession.xml and LSM file. The LSM file will give information about the live stream and what FDX files to look up for a particular fragment index range (Step 3). To know what fragments are on the EP, the SMT makes a Manifest request to the EP and the Manifest will list the current set of fragment indexes that reside on the EP (Step 4). Once SMT finds and obtains the requested fragment, it muxes the fragment to the output format. When BSI instructions are used, SMT does not create the actual output object but, instead, SMT creates a BSI instruction response containing the appropriate container format headers and IF fragment request (Step 7). The content server makes a request for the IF fragment, and preferably this request is only for the “mdat” data, which is the video/audio data (Step 8). The content server ghost process then uses the instructions in the response and the IF fragment to construct the output object. It sends the resulting output object back to the end-user as a response to the original request (Step 9). For SMT to know what fragments are in Storage, preferably it continuously polls Storage for a latest version of the FDX file (Step 10). Polling interval for the FDX file typically is a given, potentially configurable time period (Step 10). 
For SMT to know what fragments are available on the EP, preferably SMT polls the EP for a latest Manifest file (Step 11). - The following section describes preferred URL formats for live, archive and IF requests from a client-player→content server→SMT.
- In one embodiment, for live stream requests, the client player URLs have the following format:
- https://<domain>/<formatPrefix>/<streamID>/<streamName>/<additionalParams>
- Live and Archive URLs preferably have a prefix that denotes that streaming container format and the type of request (e.g., live, archive).
- In one embodiment, for archive stream requests, the client-player URLs have the following format:
- https://<domain>/<formatPrefix>/<streamID>/<streamName>/<sessionID>/<streamName>/<additionalParams>
- The sessionID part of the URL differentiates archives from different live stream sessions. An archive URL gives the location of the archive directory in Storage. The archive URL “format” is simply the path to the default Storage location to which the archive is uploaded. If desired, the archive can be moved to a different Storage directory, in which case the archive path URL is changed to the new Storage directory location. Preferably, the archive URL is immediately available for playback even if the live event is not over yet. The archive URL represents the content that has been archived to Storage so far. For example, if the live stream event has been running for 60 minutes and 58 minutes of the event has been archived to Storage, the archive URL represents a VOD file that is 58 minutes long. As more content is archived to Storage, the archive URL represents a longer and longer VOD file.
- An IF URL is constructed by taking the “base URL” of the client request and appending Fragment(<params>) to the end. The “base URL” typically is the portion of the URL that is up to and including the file name. The IF URL parameters are name/value pairs separated by commas and specify bitrate and response types:
- https://<domain>/<formatPrefix>/<streamID>/<streamName>.<fileExtension>/
Fragment(brt=<bitrate>,idx=<fragmentIndex>,trk=<trackName>,typ=<fragmentType>) - Illustrative parameter tag names include:
- brt—Bitrate
- idx—Fragment index
- trk—Track name (usually audio or video)
- typ—Type of response fragment, possible values are: bsi, frg, hdr, dat
- For the “typ” parameter, if “bsi” is specified, SMT will return a BSI fragment response. (Note that for implementations that involve instruction sets other than BSI, the parameter might be “instr_set_name”.) If “frg” is specified, SMT will return the f-MP4 fragment. If “hdr” is specified, SMT will only return f-MP4 headers. If “dat” is specified, SMT will return the mdat box of the f-MP4 fragment. The mdat box is the MP4 box containing the audio/video samples.
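- By way of illustration only, the IF URL construction described above might be sketched as follows. The function names `build_if_url` and `parse_if_params` are hypothetical and are not part of the described system; the sketch assumes only the `Fragment(<params>)` suffix, the comma-separated name/value pairs, and the four “typ” values listed above.

```python
# Hypothetical helpers; only the URL layout and tag names come from the text.
ALLOWED_TYPES = {"bsi", "frg", "hdr", "dat"}


def build_if_url(base_url, bitrate, fragment_index, track, fragment_type):
    """Append Fragment(<params>) to the base URL of the client request."""
    if fragment_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported typ value: {fragment_type!r}")
    params = (
        f"brt={bitrate},idx={fragment_index},"
        f"trk={track},typ={fragment_type}"
    )
    return f"{base_url.rstrip('/')}/Fragment({params})"


def parse_if_params(url):
    """Return the name/value pairs found inside Fragment(...)."""
    start = url.rindex("Fragment(") + len("Fragment(")
    end = url.rindex(")")
    return dict(pair.split("=", 1) for pair in url[start:end].split(","))


url = build_if_url(
    "https://example.com/iosvod/path/video.mp4", 512000, 5000, "video", "dat"
)
print(url)
print(parse_if_params(url))
```

In this sketch the parameter values are returned as strings; a real component would presumably validate and convert them before use.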
- In operation, as ESM receives the live stream fragments from the ingest server, ESM writes the data to local disk. For multi-bitrate streams, ESM has a configurable option to either coalesce all bitrates into a single file or have a different file per bitrate. The advantage of coalescing into a single file is that the number of file uploads to Storage is reduced. The disadvantage of a single file is that it is not possible to only retrieve fragments for a single bitrate without also retrieving fragments for other bitrates, thereby making caching less efficient on SMT when a single bitrate is being requested by the end-user. In either case, though, all of the fragments usually are in a single file (be it for one bitrate or many). An ESM trailing window parameter configures how much ESM will save on local disk. Once a segment is outside the trailing window, ESM will delete it from local disk.
- If an “Archive to Storage” parameter is enabled, ESM will archive the stream to Storage for DVR or later VOD playback. Typically, ESM stores the last “n” minutes of a live stream. If a customer wants a 4 hour DVR window for their live stream, the customer enables “Archive To Storage” so that fragments older than n minutes are saved in Storage and available for DVR. For certain streams, the customer can disable “Archive To Storage” and the live stream is not uploaded to Storage. In such case, live stream fragment requests are served from the EP. Some customers have 24×7 streams and want say, one (1) day DVR functionality. In that case, the customer enables “Archive To Storage” and enables a 1 day “Archive Trailing Window”. By archiving to Storage, DVR requests older than “n” minutes are available from Storage. The “Archive Trailing Window” setting can limit the size of the archive that is stored in Storage. For example, if the “Archive Trailing Window” is set to 1 day, ESM will automatically delete from Storage fragments that are older than 1 day. This is beneficial for the customer because they can have a long DVR window but do not need to worry about cleaning up Storage for their long running live streams.
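- Purely as a non-limiting sketch, the interaction between the EP's last “n” minutes, the “Archive To Storage” option, and the “Archive Trailing Window” can be expressed as a small decision function. The function name `locate_fragment` and its parameters are hypothetical, and the sketch simplifies by assuming that a fragment is looked for on the EP first and in Storage only once it is older than the EP window.

```python
from datetime import timedelta
from typing import Optional


def locate_fragment(
    age: timedelta,
    ep_window: timedelta,
    archive_enabled: bool,
    archive_window: Optional[timedelta] = None,
) -> Optional[str]:
    """Return "ep", "storage", or None for a fragment of the given age.

    ep_window      -- the last "n" minutes that ESM keeps on the EP (assumed)
    archive_window -- the "Archive Trailing Window"; None means unlimited
    """
    if age < timedelta(0):
        raise ValueError("fragment age cannot be negative")
    if age <= ep_window:
        return "ep"  # still inside the EP trailing window
    if not archive_enabled:
        return None  # nothing was uploaded to Storage
    if archive_window is not None and age > archive_window:
        return None  # already deleted from Storage
    return "storage"


n = timedelta(minutes=10)
one_day = timedelta(days=1)
print(locate_fragment(timedelta(minutes=2), n, True, one_day))
print(locate_fragment(timedelta(hours=3), n, True, one_day))
print(locate_fragment(timedelta(days=2), n, True, one_day))
```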
- SMT can determine all the active live streams through stream “announcements” from ESM. A preferred technique is illustrated in
FIG. 10. In this particular implementation, the SMT must know the state of all live streams because the content server ghost process can make a live stream request to any SMT, and SMT needs to know which EP to get the fragments from. If the live stream state is inactive, on the other hand, SMT would know to retrieve the fragments only from Storage (assuming “Archive To Storage” option was enabled). - In the embodiment illustrated in
FIG. 10, live stream announcements between SMT and ESM are done using HTTP GET requests from SMT to ESM. To reduce the number of HTTP requests from SMT to EP, preferably each ESM in an EP region (e.g., EP region - Because the forward request to an EP explicitly would contain the EP IP address, all SMTs in a region should be making an HTTP request to the same EP machine in the EP region to utilize ICP. If the request is not made to the same EP machine, the cache key will be different and ICP cannot be used. Therefore, the algorithm to choose the EP machine to query preferably is deterministic and repeatable across all SMTs so that all SMTs will make the forward request to the same EP in the EP region. Preferably, polling from SMT to EP is done every few seconds and is configured through a global server setting. Having a short polling interval minimizes the amount of time between a customer publishing a stream and the SMT knowing the stream exists on the EP. The request logic from SMT to EP handles situations where an EP is down for maintenance or temporarily inaccessible.
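- The selection algorithm itself is not specified here. As one hypothetical way to obtain a choice that is deterministic and repeatable across all SMTs, each SMT could sort the EP addresses of a region and rotate the list by a hash of the region name, then try the EPs in that order. The function name `ep_query_order` and the use of SHA-256 are assumptions made only for this sketch.

```python
import hashlib


def ep_query_order(region, ep_ips):
    """Return the EPs of a region in the order every SMT should try them.

    The result depends only on the region name and the set of EP
    addresses, not on the order in which an SMT happened to learn them.
    """
    ordered = sorted(set(ep_ips))
    if not ordered:
        return []
    digest = hashlib.sha256(region.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    # Rotate so that different regions do not all start at the first IP.
    return ordered[start:] + ordered[:start]


smt_a = ep_query_order("ep-region-1", ["10.0.0.3", "10.0.0.1", "10.0.0.2"])
smt_b = ep_query_order("ep-region-1", ["10.0.0.2", "10.0.0.3", "10.0.0.1"])
print(smt_a == smt_b)
```

If the first EP in the list is down for maintenance, an SMT in this sketch would simply move to the next entry, so that all SMTs observing the same outage fall back to the same machine.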
- As noted above, the live stream archive is stored on Storage for later VOD playback. Any metadata for the live stream session is also stored on the Storage system, preferably in the same location as the live stream archive. If “Archive To Storage” is not enabled, nothing is stored on Storage.
- To simplify output muxing to any container format, as noted above, ingested fragments are demuxed into the IF format (Intermediate Format). Once an ingest stream is converted to IF, the muxer can convert from the IF format to any supported streaming container format. This simplifies conversion from any input (source) format to any output (target) format. The PIFF (Protected Interoperable File Format) container format, available from Microsoft, may be used as the basis for the IF container format. PIFF enhances the MPEG-4 Part 12 specification by providing guidelines and UUID extensions for fragmented multi-bitrate HTTP streaming. Besides PIFF, other choices for container formats are Adobe's HTTP Streaming For Flash (Zeri), Apple's MPEG2-TS, or a proprietary format.
- Fault Tolerance, Redundancy, and Replication
- For stream redundancy and failover, customers may publish a stream to a primary and one or more backup Entry Points. EPs also may support DEEM (Dynamic Entry Point to Encoder Mapping) to provide optimal DNS mapping from encoder to entry point. If an EP were to go down, DEEM can minimize stream downtime by quickly remapping an entry point alias (e.g., via a DNS CNAME) to an EP that is up and running. DEEM functionality includes the ability to resume a live stream session when the EP alias switches from one EP to another EP. When an encoder is pushing a stream to one EP and that EP goes down, DEEM remaps the alias, the encoder then starts pushing to the new EP, and the EP “appends” fragments to the previous live stream session. This means the live stream DVR from the previous session is retained and the archive in Storage is uninterrupted.
- For EPs to support DEEM, whenever an encoder pushes a stream to the EP, the EP must determine if the stream is a brand new stream or a DEEM failover from a previous live stream session. The EP determines the state of the stream by getting the corresponding livesession.xml from Storage. The livesession.xml contains the “streamState”. If the stream is a DEEM failover, the “streamState” will have a “started” value. The EP also does consistency checks, such as querying the old EP to determine if the stream actually existed. Consistency checks ensure that the new EP does not unintentionally consider the stream to be a DEEM failover stream when it is not. For the case when a stream is not archived to Storage, the EP simply ingests the live stream without retrieving the livesession.xml from Storage. The SMT does the work of stitching the live stream from different EPs into a single live stream.
- The livesession.xml contains the following attributes for DEEM support:
- streamState—holds state of the stream
- lastRefreshTime—time when the EP last updated the livesession.xml with the current state
- discontinuityThreshold—time threshold at which the EP will not resume a previous live stream
- By default, the “discontinuityThreshold” is set to a given time period, e.g., 30 minutes. This means if an EP goes down and the encoder does not push the stream to the new EP within 30 minutes, the live stream session will not be resumed. The EP checks whether the threshold has been exceeded by subtracting the “lastRefreshTime” from the current time. If this time difference is more than 30 minutes, the EP will not resume the previous live stream session.
- For SMTs to support DEEM, SMT tracks stream states via stream announcements. When the encoder is stopped, a live stream is transitioned to the “stopped” state on the EP. If the EP goes down, the stream does not gracefully transition to the “stopped” state. The SMT tracks ungraceful stream state transitions, and it stitches together live stream sessions if needed. SMT combines DVR fragments from a previous live session and the currently resumed live stream session. From the end-user point of view, the merged live stream sessions appear as a single live stream session.
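- As a minimal sketch of the threshold check described above, and assuming the livesession.xml attributes have already been parsed into Python values, the resume decision could look like the following. The function name `should_resume_session` is hypothetical, and the consistency checks against the old EP are omitted.

```python
from datetime import datetime, timedelta, timezone


def should_resume_session(
    stream_state,
    last_refresh_time,
    now,
    discontinuity_threshold=timedelta(minutes=30),
):
    """Decide whether a newly pushed stream resumes the previous session.

    A DEEM failover is only considered when the stored streamState is
    still "started"; the session is not resumed once the time since
    lastRefreshTime is more than the discontinuity threshold.
    """
    if stream_state != "started":
        return False
    return (now - last_refresh_time) <= discontinuity_threshold


last_refresh = datetime(2012, 11, 2, 12, 0, tzinfo=timezone.utc)
print(should_resume_session("started", last_refresh,
                            last_refresh + timedelta(minutes=10)))
print(should_resume_session("started", last_refresh,
                            last_refresh + timedelta(minutes=45)))
```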
- In certain circumstances, it may be desirable to replicate a single ingest stream to another EP. One possible use case facilitates live stream archive redundancy, which can be used for providing a hot backup of the live stream archive on the backup EP. In this approach, if the primary EP were to go down, the encoder can start pushing the stream to the backup and past DVR is still available because it was auto replicated. Another use case for such replication is live stream redistribution, in which the live stream may be replicated to an EP that is far away (e.g., ingest in United States and replicate to Europe). With the stream replicated to another EP farther away, the content server, SMT, EP, and Storage serving that far away region can be located closer together (all in Europe, for example), reducing the network distance between them.
FIG. 11 illustrates one example of a technique. In this embodiment, preferably ESM on the ingest entry point has an option to replicate the stream. The replicated stream is sent either to the backup EP or another EP altogether. Where stream replication is used, the target stream preferably uses a different stream ID than the source stream. - On-Demand Streaming Operation
- Similar to live streaming, and as shown in
FIG. 12, in an on-demand embodiment, an SMT component handles on-demand requests from a content server. The same SMT machine can handle both live and on-demand requests. - As shown in
FIG. 12, the SMT machine preferably has two primary services: SMT, and local ghost. The SMT service uses TransformLib to process the request URL, and TransformLib constructs the appropriate forward requests to Storage or customer origin. These forward requests are made via the SMT local ghost process and use a cache process as an intermediary between SMT and local ghost. Preferably, the same TransformLib component is used for on-demand and live streaming. - The following details the workflow when an end-user makes an on-demand stream request to the content server. The client player makes a stream request to the content server (Step 1). The content server ghost process makes a forward request to the SMT machine (Step 2). If this is the first request to the SMT machine for this on-demand stream, SMT needs to construct and cache information about the on-demand stream. To get this information, SMT first passes the request URL to TransformLib, and TransformLib constructs the appropriate forward requests for the native format file. SMT makes these forward requests to Storage/customer origin via SMT's local ghost process (Step 3). TransformLib takes the forward responses and constructs the response (e.g., BSI) for the requested output format (Step 4). SMT returns the response back to the content server (Step 5). The BSI response contains the container-specific format headers and the request URLs for the IF fragments. Based on the BSI instructions, the content server ghost process makes IF requests to construct the output object (Step 6). The output object is returned to the end-user in the native format (Step 7). As noted above, BSI is optional but can be used to reduce the cache footprint on the content server ghost process. If BSI is not enabled, SMT can return the native output object (i.e., in the target format) to the content server ghost process. The native output object can be cached by the content server just like any HTTP object from an origin server.
- For on-demand requests, the client-player URLs may have the following format:
- https://<domain>/<formatPrefix>/<forwardpath>/<streamName>
- Similar to live and archive URLs, on-demand URLs have a prefix that denotes the streaming container format and type of request (i.e., on-demand).
- If BSI functionality is enabled, SMT returns a BSI fragment that consists of the container headers and the IF URLs for the mdat data. For iPhone, e.g., the IF URLs look like the following for audio and video:
- https://example.com/iosvod/path/video.mp4/Fragment(brt=512000,idx=5000,trk=video,typ=dat)
https://example.com/iosvod/path/video.mp4/Fragment(brt=64000,idx=5026,trk=audio,typ=dat) - The Fragment(<params>) portion is appended to the “base URL” of the client request (e.g., video.mp4 in the example above). The “base URL” is typically the portion of the URL up to and including the file name but can vary depending on the streaming format.
- For muxing into the desired output format, TransformLib on the SMT contains the logic to demux the native input file and mux into the requested output object. For the request processing workflow, TransformLib first parses the native input file to generate a MetaIndex. The MetaIndex is a generic index that contains information such as composition time, decoding time, IF fragment boundaries, and byte range offsets into the native source file for each IF fragment. The output muxers use the MetaIndex to extract the appropriate bytes from the native source file and use the other information such as composition time to construct the appropriate container headers. The MetaIndex provides a generic interface into the native source files. This interface is an abstraction layer on top of the native source file so that the output muxers do not need to be aware of the underlying container format. A benefit of this design is that if it is desired to support a new input container format, a new native source file parser/demuxer is implemented, but the output muxers remain the same. Similarly, if it is desired to support a new output container format, a new muxer is implemented but input demuxers remain the same.
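- The abstraction described above can be illustrated, without limitation, by a small sketch in which an output muxer reads fragment bytes through a generic index and never inspects the source container itself. The class and field names (`FragmentEntry`, `MetaIndex`, `extract_fragment_bytes`) are hypothetical; only the kinds of information held (composition time, decoding time, fragment boundaries, byte range offsets) are taken from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FragmentEntry:
    index: int             # IF fragment index
    decode_time: int       # decoding time, in timescale units
    composition_time: int  # composition time, in timescale units
    byte_offset: int       # offset into the native source file
    byte_length: int       # number of bytes belonging to the fragment


@dataclass
class MetaIndex:
    entries: List[FragmentEntry]

    def lookup(self, index: int) -> FragmentEntry:
        for entry in self.entries:
            if entry.index == index:
                return entry
        raise KeyError(f"no fragment with index {index}")


def extract_fragment_bytes(source: bytes, meta: MetaIndex, index: int) -> bytes:
    """What an output muxer would call; it needs no container knowledge."""
    entry = meta.lookup(index)
    end = entry.byte_offset + entry.byte_length
    if end > len(source):
        raise ValueError("index points past the end of the source file")
    return source[entry.byte_offset:end]


# A format-specific parser/demuxer would normally build this index.
meta = MetaIndex([
    FragmentEntry(0, 0, 0, byte_offset=0, byte_length=4),
    FragmentEntry(1, 2000, 2000, byte_offset=4, byte_length=6),
])
print(extract_fragment_bytes(b"AAAABBBBBBCC", meta, 1))
```

Under this arrangement, supporting a new input container would mean writing a new parser that produces a `MetaIndex`, while `extract_fragment_bytes` and the muxers built on it stay unchanged.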
FIG. 13 illustrates this abstraction layer. If desired, the MetaIndex may be cached within SMT's local ghost process cache for later reuse or for use by an ICP peer. Creating the MetaIndex can take time, and caching on the local ghost process decreases the response time for the first VOD fragment request. To support local ghost process caching, SMT makes a local host request via ghost for “/metaIndex”. The loopback request is handled by the local SMT, and its response is cached by the ghost process. Other SMTs in the region also get the benefit of using this MetaIndex because it is available via ICP. - The above-described architectures (for live or on-demand) are extensible to support any streaming format. The following section describes how to support a new streaming container format.
-
FIG. 14 illustrates one exemplary embodiment of a technique for supporting ingestion of iPhone content and output of iPhone content. In this embodiment, an iPhone EP 1400 ingests an Apple-Segmented MPEG2-TS stream, and TransformLib 1408 supports demuxing and muxing MPEG2-TS. TransformLib 1408 parses iPhone URLs and rewrites them to the forward path. On the EP 1400, the iPhone ingest server 1406 handles HTTP POST/PUT requests from the encoder 1402. The iPhone ingest server passes the TS segments to TransformLib 1408 for demuxing into IF (e.g., f-MP4) format. The iPhone ingest server then sends the IF fragments to the local ESM 1401. The ESM archives the stream to Storage and announces the live stream to the SMTs, as described above. On the SMT 1412, the TransformLib 1416 processes iPhone request URLs for m3u8 and MPEG2-TS. TransformLib 1416 constructs the BSI response and returns it to the content server 1415. For MPEG2-TS segments, data packets are interleaved with container headers every 188 bytes. This means that for every 188 bytes of audio/video, there will be some container headers. Preferably, the BSI syntax supports loop constructs to reduce the complexity of the BSI response and still generate the appropriate MPEG2-TS segment. Using BSI to mux the object on the content server is optional. SMT 1412 can also return native MPEG2-TS segments back to the content server 1415 if BSI is disabled. -
FIG. 15 illustrates an embodiment for supporting the Shoutcast format. Shoutcast is a protocol that is primarily used for audio live streaming over HTTP-like connections. To play a Shoutcast stream, the client makes an HTTP request and the HTTP response body is a continuous audio stream (i.e., unbounded response body). The audio stream is a mix of MP3 data (or AAC/OGG) and Shoutcast metadata. Shoutcast metadata typically contains song titles or artist info. While the Shoutcast protocol is similar to HTTP, it is not true HTTP because the protocol includes some non-standard HTTP request and response headers. As illustrated in FIG. 15, this embodiment comprises a Shoutcast EP 1500 to ingest Shoutcast-encoded streams. A TransformLib 1508 library for Shoutcast is provided to demux and mux MP3/AAC/OGG. TransformLib 1508 also parses Shoutcast URLs, rewrites them to the forward path, and generates BSI instructions. Because the client-player downloads a continuous unbounded HTTP response, the content server ghost process 1415 must turn fragmented forward origin requests into a single continuous client download. BSI instructs the ghost process on how to construct the client response from fragmented responses to forward requests. As shown in FIG. 15, the network architecture for Shoutcast support is similar to the iPhone support as provided in FIG. 14. The Shoutcast EP 1500 ingests the stream. The ingest server demuxes the stream using TransformLib 1508. It then sends the stream to ESM 1501. The ESM and SMT components remain the same. TransformLib 1515 on SMT 1512 parses Shoutcast URLs, creates BSI responses for Shoutcast, and muxes into Shoutcast output format. - Further details on live and on-demand streaming architectures may be found in aforementioned U.S. patent application Ser. No. 13/329,057, the teachings of which are hereby incorporated by reference.
- Binary Side Includes (BSI)
- As described in U.S. patent application Ser. No. 13/329,081, filed Dec. 16, 2011 (now published as U.S. Patent Publication No. 2012/0259942 and as WIPO Publication No. WO/2012/083296), the teachings of which are hereby incorporated by reference, BSI is a name for functionality executable in a content server to generate output objects given an input object and certain instructions, typically instructions from another component such as the SMT component described above. The instructions typically define manipulations or actions to be performed on the input data. Such functionality is intended to enable modification of payloads as they are served to a requesting client, allowing a content server to easily provide, among other things, custom or semi-custom content given a generic object. In a typical but non-limiting embodiment, this functionality can be built into the HTTP proxy (ghost) application on the content server, although in alternative embodiments it can be implemented external to ghost.
- Typically, many modifications made by the content server result in a minimal overall change to content, meaning that the resulting data served to the requesting client differs from the input by, for example, only a few percent. In one embodiment, a mechanism is defined for representing the difference (or “diff”) between the source(s) and output content, allowing a generic feature in the content server to handle an increasing number of streaming formats in an efficient way.
- In general, with BSI, components other than the content server are made responsible for defining or generating transforming logic and for providing instructions—along with binary “diff” information—that can be understood by the content server. By providing a mechanism for representing the difference (or “diff”) between the source(s) and output content, and providing the content server with a way to use these to modify a generic source object, the client-facing content server may handle an increasing number of requests efficiently. Furthermore, depending on the circumstances, the inputs (e.g., the generic source object, instructions, etc.) may be cached. The output of the process also may be cached in some cases.
- As noted previously, for convenience of illustration, in this disclosure this function is called BSI, for Binary-edge-Side Includes, or Binary Server Integration. The BSI language, with proposed syntax described below, defines different sources—incoming pieces of data that help construct the final output. Instructions (like ‘combine’ and others) define the byte ranges and order of how to merge these inputs, as well as controlling output headers. When generated in real-time, the BSI fragment and source object both can be cached (e.g., at the content server), placing far less load on the BSI generation tier than the content server would have handling them directly. For fixed/on-demand applications, the BSI may be generated once, and a BSI fragment cached (e.g., either on the content server, or on network storage or other dedicated storage subsystem such as is shown in
FIGS. 5-6). - The BSI approach is ideally very fast. Preferably, the syntax is XML-based, and the number of instructions typically is kept very low, allowing fast parsing. The execution of BSI instructs the content server in what order, and from which source, to fill an output buffer that is served to the client.
- In the context of the previously-described streaming platforms, BSI functionality can be used between the SMT and content server to streamline the creation of an output object (e.g., an output object representing the stream in a native format for iPhone or other client device) from an input source (in the above cases, the IF fragments). The SMT receives IF fragments and performs muxing steps. Instead of muxed content as output, the SMT creates a dynamic BSI fragment that can be served to the content server, along with a binary object that contains the additional bits that the content server needs to combine with the IF fragment it normally receives. The content server uses this information to create the muxed output object in the native format, representing all or some portion of the stream.
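- The actual BSI syntax is not reproduced in this section, so the following is only a hypothetical sketch of the general idea: a short list of instructions tells the content server which byte ranges to copy, in which order, from a cached IF fragment and from the binary diff supplied by the SMT. The tuple-based instruction form, the source names “if” and “diff”, and the function name `apply_instructions` are all assumptions made for illustration and do not represent the preferred XML-based syntax.

```python
def apply_instructions(sources, instructions):
    """Fill an output buffer from named sources.

    sources      -- mapping of source name to bytes (e.g., IF fragment, diff)
    instructions -- iterable of (source_name, offset, length) tuples,
                    applied in order
    """
    out = bytearray()
    for name, offset, length in instructions:
        data = sources[name]
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError(f"byte range out of bounds for source {name!r}")
        out += data[offset:offset + length]
    return bytes(out)


sources = {
    "diff": b"HDR1HDR2",   # container headers produced by the muxing tier
    "if": b"videoaudio",   # generic payload cached at the content server
}
instructions = [
    ("diff", 0, 4), ("if", 0, 5),   # header 1, then video payload
    ("diff", 4, 4), ("if", 5, 5),   # header 2, then audio payload
]
print(apply_instructions(sources, instructions))
```

Because the payload stays in the generic “if” source, a different output format in this sketch would only require a different “diff” and instruction list, which mirrors the caching benefit described for BSI.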
- Examples of using BSI for streaming are illustrated in previous FIGS., but
FIG. 16 shows an embodiment of a workflow with additional detail. In this illustrative embodiment, the content server ghost process 1600 receives a request from a client player 1601 for particular content (step 1) in a certain target format. The content server makes a request to a muxing tier (the SMT 1602) for the BSI instructions required (step 2). Typically, the request includes parameters via query string, to specify the type of request (manifest, content, key file, etc.), the bitrate requested, a time determination (fragment no, time offset, etc.), and other parameters related to muxing (segment duration, A/V types, etc.). The SMT 1602 obtains the relevant IF fragments from the EP 1604 (step 3) or Storage 1603 (step 3 a), builds an appropriate output object from the IF fragments as if it were to serve the content, creates a buffer of the bytes needed beyond what was contained in the IF fragments, along with instructions about how to ‘interleave’ or combine the binary diff with the IF. In some implementations, it should be understood, any necessary diff data may be embedded directly in the instructions themselves. In step 4, the SMT 1602 then sends the BSI response to the content server. The response may also include a reference to the IF fragments that are needed. The content server gets the IF fragments in any of a variety of ways, including from the SMT (that is, in addition to the BSI), from its own cache, or from Storage 1603, which is typically a network storage subsystem that was previously described in connection with the streaming platform. Purely by way of example, step 5 in FIG. 16 shows the IF fragments arriving from Storage and being cached. - As the vast bulk of the data, which is represented by the IF fragment, is cached at the content server, the BSI response with its binary diff typically might be around a few percent of the overall size of the object to be served. The
content server ghost 1600 applies the BSI, generating and serving a muxed output object to the client (step 6). The BSI response, including both the instructions and the diff data, can be cached by the content server ghost 1600 for some period of time. Preferably, the parameters supplied in the request to the SMT (step 2) are used in the cache key so that only subsequent requests for content with the same parameters utilize the cached BSI response. The output of the BSI operation need not be cached. - The foregoing approach can provide a variety of advantages. Because the BSI instructions can be used to tell the content server ghost process how to mux or otherwise create the output object, BSI provides a way for the process to support any streaming container format without needing associated code changes at the content server ghost process. To handle new container formats or bug fixes to support existing container formats, BSI instructions can change, but the content server ghost process logic remains the same. This eliminates any cross-component dependency with the content server or its ghost process when developing or implementing new streaming features.
- Further, for streaming to client devices using different container formats, BSI can reduce the ghost cache footprint size because the ghost process caches the IF fragments but muxes the IF into different native formats. Preferably, the muxed output is not cached; rather, only the IF fragment is cached. For example, the system can be used to stream Adobe Zeri (HTTP Streaming for Flash) to Android devices running Flash 10.1 and stream MPEG2-TS to iPhone devices. For the live stream, only the IF fragment is cached and the content server muxes into Zeri for Android devices and muxes into MPEG2-TS for iPhone devices. These are just representative examples.
- For streaming of progressive-download-style formats (like Shoutcast), data is streamed to the client as a long-running unbounded HTTP download. From the end user client perspective, it is downloading a file that never ends. BSI functionality can be used for progressive-download-style formats and, in particular, to mux fragment responses from the origin (e.g., a content provider origin or CDN storage subsystem) into a continuous HTTP download stream for the client. Using metadata applied by the content server ghost process (configurable by content provider) and progressive-download-style BSI from the SMT, BSI can also be used to implement progressive-download-specific features, like jump-to-live-on-drift and delayed metadata injection based on user-agent. Specific progressive-download-style requirements thus can be inherently supported through BSI without requiring any changes in the content server.
- Fragmented streaming formats (like Zeri, iPhone, and Silverlight) may also use BSI functionality. For example, the SMT can send the content server content in a native format or a BSI fragment that the content server ghost process muxes into the native format. If a CDN content provider customer is only doing streaming for a single container format, there is no need to cache IF fragments and mux on the content server ghost process via BSI. In such case, it is more efficient for SMT to return the native object, which the content server ghost process caches. Enabling or disabling using BSI is configurable, preferably on a content provider by content provider basis, and, for a given content provider, on a site by site basis, or even a file by file basis.
- More details and examples of BSI can be found in aforementioned U.S. patent application Ser. No. 13/329,057.
- Transcoding System
- The content delivery network (CDN) described above provides an advantageous and feature-rich platform for streaming and object delivery. However, the CDN platform may be enhanced yet further by integrating into it a distributed, scalable transcoding system that provides the ability to transform content such as audio, video and other files, which may then be delivered to end-users over the platform. Typical transcoding tasks include the conversion of media from one bitrate/resolution to another for the purposes of adding bitrates to a multi-bitrate stream, converting from one container format to another or one encoding format to another in order to allow clients utilizing such formats to play the content. These tasks may be part of prepping media for ingestion into the streaming platform described above.
- In one embodiment, the distributed transcoding system described herein leverages the resources of the aforementioned content delivery architecture to perform certain processing tasks within the CDN, as real-time or background (batch mode) processes. Thus, for example, the CDN may prepare and transcode certain content in preparation for delivery, even while other content (from the same or other content provider users of the system) is being delivered. In other words, the machines described above that provide content delivery services (streaming, object delivery, or otherwise) may be leveraged, in accordance with the teachings hereof, to perform transcoding tasks. More particularly, the transcoding system may be implemented not only with a set of purpose-built hardware, specific to the transcoding task, but also supplemented with the available idle or low-usage resources of the content delivery network that was previously described, to achieve a highly scalable and flexible solution. For example, the resources of the various distributed CDN content servers (including in particular the HTTP proxy servers, aka ghost servers, described above), among others, may be leveraged in this way. Exemplary implementation details will be set forth in more detail below.
- It should be noted that the subject matter herein is not limited to a transcoding system implemented in conjunction with a CDN, although that is one useful implementation. For example, the distributed transcoding techniques described herein may be implemented in a standalone system with dedicated machines, entirely separate from other content delivery services or machines.
- As mentioned previously, in one embodiment, the transcoding system can process files either in batch or real-time modes. Both kinds of jobs may be running within the platform at any given point in time. Preferably every transcode that runs in the system happens as fast as possible given its priority and the available resources. The transcoding system itself is generally agnostic to the type of job it is processing; it simply processes requests with a given priority. In this way the system can be used for both batch and real-time transcoding of on-demand or live content.
- For convenience of illustration, the exemplary transcoding system described herein makes use of the following concepts:
- Fluxer. Generally speaking, in this embodiment, the Fluxer is the primary interface of the transcoding system. It is responsible for breaking up files, managing the transcoding process across many individual sub-transcoders, putting the file back together and sending it to the destination.
- Transcoding job. A job refers to a request to transcode an entire file (e.g., a particular audio, video, multimedia file, or otherwise) as opposed to an individual “task” which refers to the transcode of a single segment of the file. A “job” is also called a “Fluxer Job” and is made up of many transcoding “tasks”.
- I-frame/keyframe. I-frame refers to a video frame that contains enough data to reconstruct the frame on its own (also known as a keyframe.)
- P-frame. A P-frame refers to a video frame that contains information relative to a frame in the past of the data stream.
- B-frame. A B-frame refers to a video frame that may contain information relative to a frame that exists either in the past or in the future of the data stream.
- GoP. GoP stands for Group of Pictures and refers to a keyframe (I-frame) and all subsequent P and B frames which reference that keyframe until the next keyframe.
- Closed GoP. When no P or B frames within a GoP reference frames from any other GoP, the GoP is said to be a Closed GoP.
- Open GoP. Since a B-frame may reference frames both before and after itself, it is possible for a B-frame to reference the keyframe of the next GoP. When frames from another GoP are referenced, the GoP is said to be an Open GoP. Therefore, at least a portion of the next GoP is generally needed in order to fully decode an Open GoP.
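- The GoP definitions above can be illustrated with a minimal Python sketch. The `Frame` structure, its `refs` field, and the function names are hypothetical and introduced only for illustration; frames are listed in a simplified order that ignores the difference between decode order and presentation order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    index: int                                      # position in the stream
    kind: str                                       # "I", "P", or "B"
    refs: List[int] = field(default_factory=list)   # indices of referenced frames


def split_into_gops(frames: List[Frame]) -> List[List[Frame]]:
    """Start a new GoP at every I-frame (keyframe)."""
    gops: List[List[Frame]] = []
    for frame in frames:
        if frame.kind == "I" or not gops:
            gops.append([])
        gops[-1].append(frame)
    return gops


def is_closed(gop: List[Frame]) -> bool:
    """A GoP is closed when no frame references a frame outside the GoP."""
    members = {f.index for f in gop}
    return all(ref in members for f in gop for ref in f.refs)


frames = [
    Frame(0, "I"),
    Frame(1, "P", [0]),
    Frame(2, "B", [1, 3]),   # references the keyframe of the next GoP
    Frame(3, "I"),
    Frame(4, "P", [3]),
]
gops = split_into_gops(frames)
status = ["closed" if is_closed(g) else "open" for g in gops]
```

- In this example the first GoP is open, because its B-frame depends on the following keyframe, so a segment boundary placed between the two GoPs would leave that B-frame undecodable without part of the next GoP.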
- Referring to FIG. 17, in one embodiment, a transcoding system includes several components, some of which are in a dedicated transcoding region and others of which are from the network of CDN servers. A region in this sense typically refers to a machine or set of machines in a particular network location, which may or may not be co-located with a region in the content delivery network. The transcoder region typically includes fluxer machines running a Fluxer (a fluxer process), a transcoding resource access server application (TRAS), and a coordination server (C-server), as well as a set of managed transcoding resources (MTRs), e.g., a managed transcoder machine running a transcoding process. FIG. 17 shows the fluxer machines and MTRs in a single region, but the actual network location/topology of the transcoding region components is flexible and this example should not be viewed as limiting. For example, one implementation may include many transcoding regions, with one or more fluxer machines and one or more MTRs distributed throughout various networks, and even co-located in the content delivery regions with content servers shown in FIG. 17.
- The CDN content servers represent shared transcoding resources (STRs) to the transcoding system, as they are shared with the delivery and other CDN functions (e.g., security, content adaptation, authentication/authorization processes, reporting functions and so on). More broadly, the STRs are idle or low-utilization resources across the CDN that have transcoding capabilities and can be called upon to serve the transcoding system with their raw processing capabilities. Since these are typically idle or low-utilization servers, their main value is their processor (CPU). They are not expected to contain specialized hardware, nor can they be expected to be as reliable or available as MTRs, although they may exist in greater numbers.
Prime examples of potential STRs are the HTTP proxy servers (also known as ghost servers or edge servers) described previously in conjunction with FIGS. 1-16. However, any of the machines shown in FIGS. 1-16 are candidates for use as STRs provided they can be modified in accordance with the teachings below to become part of the transcoding system.
- Turning to the operation of the transcoding system, in general, the Fluxer is responsible for breaking apart media files into transcodable segments and sending those segments off to transcoding resources to be transcoded in parallel. Preferably the segments are coded so that the amount of data sent around the network is reduced. The transcoding resources can then decode and re-encode to accomplish the requested transcode. The Fluxer uses the TRAS to get lists of available transcoding resources and reports its status to the C-server. The transcoding resources (TRs, which may be either MTRs or STRs) are responsible for transcoding individual media segments and sending the derivatives back to the Fluxer to be remuxed back into a transcoded media file. MTRs, which are dedicated resources, report their status to the C-server. The TRAS can be implemented as a library that is responsible for encapsulating TR selection to an interface for consumption by the Fluxer. The TRAS uses a combination of awareness of local transcoders from the C-server as well as requests to a Mapper (e.g. the map-maker and DNS system shown in
FIG. 1 ) to identify idle HTTP proxy servers or other CDN servers. The C-server tracks liveness from local TRs and Fluxers and acts as a local messaging platform for all transcoding servers in a region. -
FIGS. 18 and 19 illustrate the general function of and communication amongst components for particular embodiments of video-on-demand (VOD) transcoding and live transcoding, respectively. The Fluxer receives files to transcode or responds to transcode-initiation requests for VOD and live streams. A variety of components are potential sources for requesting batch or live transcoding jobs. Examples of such components include a storage system (as shown, for example, in FIGS. 3 and 5-7, and including network-based storage), a content provider user interface (e.g., a web-based portal providing a customer with a user interface to the CDN for configuring, uploading content to transcode, setting transcoding parameters, and monitoring the operation), an Entry Point or Puller or other component in the streaming architecture (as shown, for example, in FIGS. 3 and 5-7), or a CDN server 102 that has received a request from an end-user client.
- In one implementation, the Map-Maker and DNS system shown in connection with
FIG. 1 (the “Mapper”) can be leveraged to find the closest and best available Fluxer, as the map-maker monitoring agents and the data collection system 108 are already monitoring network conditions and machine usage for the content delivery network. The requesting component makes a DNS request to a Fluxer domain and receives back the IP address of a particular Fluxer machine available for connection. The requestor can use a shared secret to authenticate to the Fluxer. Once a job begins, the Fluxer contacts the TRAS to request a list of servers to use for transcoding, and preferably provides the TRAS with as many specifics about the job as possible, including the approximate size of the input source, whether the job is classified as real-time or batch or otherwise (which effectively classifies the priority of the job), and potentially specifics about the input/output formats, desired bitrates, etc. The TRAS uses this information to approximate how many transcoding resources it will need, and what mix of MTRs and STRs will be the most appropriate. As noted above, MTRs are dedicated transcoding resources that are managed by the transcoding system, while STRs are transcoding resources which are shared with content delivery resources (or shared with some other business function in the platform). To select MTRs, the TRAS can use a resource management service referred to here as the coordination server (C-server). The TRAS uses the C-server to reserve local MTRs, while it asks the map-maker system (FIG. 1) for any needed STRs. The Mapper will identify an approximate number of CDN servers from a pool that are running with a low utilization (e.g., with CPU or memory or request rate or other hardware metrics below some predetermined threshold, which ideally ensures that content delivery is not compromised) and return a list to the TRAS. The TRAS merges the lists, preferring MTRs for real-time jobs and STRs for batch jobs, and returns the final list to the Fluxer.
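- The list-merging step described above can be sketched as follows. This is only one possible reading of "preferring MTRs for real-time jobs and STRs for batch jobs": the preferred type is placed first and the result is cut to a requested size. The function name, the `priority` strings, and the `bucket_size` parameter are assumptions made for illustration.

```python
from typing import List, Tuple

Resource = Tuple[str, str]  # (ip_address, resource_type), type is "MTR" or "STR"


def merge_resources(mtrs: List[str], strs: List[str],
                    priority: str, bucket_size: int) -> List[Resource]:
    """Order the preferred resource type first, then truncate to bucket_size."""
    mtr_list = [(ip, "MTR") for ip in mtrs]
    str_list = [(ip, "STR") for ip in strs]
    if priority == "real-time":
        ordered = mtr_list + str_list   # dedicated resources first
    else:
        ordered = str_list + mtr_list   # shared (idle CDN) resources first
    return ordered[:bucket_size]


batch_list = merge_resources(["10.0.0.1", "10.0.0.2"],
                             ["192.0.2.1", "192.0.2.2"], "batch", 3)
live_list = merge_resources(["10.0.0.1", "10.0.0.2"],
                            ["192.0.2.1", "192.0.2.2"], "real-time", 3)
```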
- Once the Fluxer has obtained a list of available transcoding resources it begins splitting the input source file into a plurality of segments. Although not limiting, in many cases the input file is not raw, uncompressed data but a somewhat compressed file arriving from a customer that is too big to serve to requesting clients, but is suitable for transcoding (for example, a 50 MB/s video may be suitable, depending on the nature of the content and the encoding used). The input file may also be a previously encoded/compressed file that is now being transcoded to another format or bitrate.
- The Fluxer splits the file into segments for transcoding purposes. The transcoding segments may correspond to group-of-picture (GoP) boundaries, in which case they are referred to herein as chunks. Alternatively, the transcoding segments are split along other boundaries into pseudo-chunks, as will be described in more detail below. A transcoding segment refers to the actual bits being transcoded, i.e., the bits involved in the input and output, and does not necessarily correspond to a single chunk or pseudo-chunk, as it may contain multiple chunks or pseudo-chunks. Pseudo-chunks may overlap in time, i.e., they do not necessarily represent contiguous portions of the overall input file. The process of determining how to split the file into transcoding segments can involve many determinations and is explained later in more detail in the section titled “Creating Transcoding Segments From an Input”.
- The Fluxer sends the transcoding segments to selected transcoding resources along with a list of ways in which that segment should be transcoded. Note that this means that the list may specify more than one output—for example, “transcode the segment into a derivative segment in format/
bitrate 1, and another derivative segment in format/bitrate 2.” As each transcoding resource transcodes its given segment, it replies over the open HTTP connection with the derivative segments produced from the input source. If a transcoding resource cannot complete the transcode due to some unforeseen circumstance, it simply tears down the connection and goes away, leaving the Fluxer to source another transcoding resource for that segment. Once all of the segments have been transcoded, the Fluxer re-assembles them into a single file and sends the file to the destination specified by the initial request. - The destination of the file may be, for example, a network storage system, a streaming mid-tier machine (e.g., as shown in the architectures of
FIGS. 5-7 for example), proxy server, or other component in the CDN. Unless the target format produced by the transcoding system was intermediate format (IF), the destination component may then convert the file to IF for use with the streaming platform described previously, for shipping the data within the streaming architecture. - With reference to
FIG. 19, when transcoding a live stream, there are some variations over the VOD batch workflow described above. First, in this embodiment, when transcoding is initiated, it is initiated by the Puller component in response to the presence of a set of transcoding profiles in the Stream Manifest Manager (SMM) for that live stream. SMM already carries the concept of an Archiver set, and here includes the concept of a Fluxer Set. The Puller contacts one of the Fluxer Machines in the Fluxer Set with the parameters of the live event and the Fluxer Set begins an election process to decide which Fluxer Machine is the most appropriate to act as the Mother (the remaining Fluxers will be designated as Children). The Mother begins transcoding by pulling the stream from the source Archiver, transcoding using transcoding resources as described above, and pushing it to the target Archiver. Children are responsible for monitoring the Mother and electing a new Mother in the event of a failure. (For simplicity of illustration, in FIG. 19 only the Fluxer that is acting as the Mother is shown.)
- It is important to note that
FIG. 19 illustrates and the foregoing describes operation of the transcoding system with the streaming architecture shown in FIG. 3. However, in an alternate embodiment, the transcoding system works in conjunction with the streaming architecture illustrated in FIGS. 5-15. This means that the Fluxer can receive a request to transcode and source content from an entry-point (EP) stream manager process and send transcoded output to an SMT machine, rather than a Target Archiver. Indeed, as mentioned above, the transcoding system is not limited to use with any particular streaming architecture, or with a streaming architecture at all (i.e., it can be a standalone transcoding service).
- The following sections provide more detail about each of the individual components that make up the transcoding system.
- Coordination Server (C-Server)
- In the above-described embodiment, the C-server is a coordination engine for a given transcoding region that provides a service for maintaining configuration information, naming, providing synchronization and group services to distributed applications. C-server functionality may be built on top of existing, known platforms such as Zookeeper (Apache Hadoop) for example, although this should not be viewed as limiting or required. Preferably, the C-server provides a job-queue and tracks which resources are working on those jobs, and also maintains resiliency when those servers fail. In the above-described embodiment, the C-server is region specific and runs on all Fluxers in a region using an internal election algorithm to determine the leader for write coordination to the C-server system. The C-server can report its region and status to a supervisory query function so that alerts can be triggered for a low number of C-servers running in a region, mitigating availability issues.
- Transcoding Resource Access Server (TRAS)
- The TRAS provides an application programming interface (API) for obtaining a set of possible transcoders that can be called directly by the Fluxer to perform transcoding of segments. Since there are multiple types of transcoding resources available (MTR/STR) and since the method of accessing them may differ, TRAS provides an abstraction for access to both of these resources through a common interface. TRAS can be implemented as a built-in library to be consumed by the Fluxer. This means that it is run as part of the Fluxer process. TRAS allows for distinct types of transcoder requests, for example: high-priority (typically real-time needs for live transcodes, which may necessitate using only MTRs) and low-priority (typically batch needs, which may involve a mix of MTRs and STRs). TRAS returns a list of possible resources for use as transcoders to Fluxer. Both high-priority and low-priority requests typically specify a bucket-size, which TRAS will attempt to fill. The response to Fluxer is a data structure that includes the transcoding resource's IP address and type. The transcoding resources themselves are considered volatile and TRAS provides no guarantees that the resources will accept a transcoding request.
- Determination of STR availability is delegated to Mapper in this embodiment. During normal CDN operation, CDN server utilizations are reported back to Mapper as part of monitoring agents and the
data collection system 108 in FIG. 1. When STR resources are requested, a DNS request will be sent to Mapper to retrieve a set of STRs. Mapper identifies a pool of available CDN servers which are mostly idle (e.g., as defined by some metric such as CPU utilization in the recent past, cache utilization, or geographic location relative to expected load; in other words, servers that are located in regions where demand for delivery services is low due to time of day or some other reason), pseudo-randomizes the selection, and returns the maximum number of available IP addresses that can fit in a response packet. TRAS may perform this request more than once to fill the internal bucket requested by the Fluxer.
- In this implementation, it is up to the TRAS to de-duplicate the IP addresses retrieved from Mapper if it performs the DNS request more than once. Mapper is not required to maintain state of IP addresses returned. If the Fluxer requests additional resources from TRAS, then the Fluxer is required to de-duplicate the IP addresses retrieved from TRAS, as TRAS is not required to maintain state of IP addresses returned to Fluxer.
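- A minimal sketch of the bucket-filling and de-duplication behavior described above follows. The `query_mapper` callable stands in for the DNS request to Mapper and is purely hypothetical, as are the `max_queries` limit and the example addresses; no actual DNS lookup is performed.

```python
from typing import Callable, List


def fill_bucket(query_mapper: Callable[[], List[str]],
                bucket_size: int, max_queries: int = 5) -> List[str]:
    """Repeat the Mapper query until the bucket is full, dropping duplicate IPs."""
    seen = set()
    bucket: List[str] = []
    for _ in range(max_queries):
        for ip in query_mapper():
            if ip in seen:
                continue            # Mapper keeps no state, so repeats are expected
            seen.add(ip)
            bucket.append(ip)
            if len(bucket) == bucket_size:
                return bucket
    return bucket                   # may be short if the pool is small


# Stand-in for successive DNS responses; the second one repeats an address.
responses = iter([["192.0.2.1", "192.0.2.2"],
                  ["192.0.2.2", "192.0.2.3"],
                  ["192.0.2.4"]])


def fake_mapper() -> List[str]:
    return next(responses, [])


result = fill_bucket(fake_mapper, 3)
```

- The same pattern would apply one level up, where the Fluxer de-duplicates addresses across repeated requests to TRAS.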
- When TRAS receives a request that uses at least some MTRs (for example, a live-event transcode), it will use C-server's coordination capabilities to “reserve” a number of MTRs as requested by the Fluxer. TRAS provides its service through a combined, parallel query to both Mapper and C-server. As noted, it gathers enough resources to fill a bucket, the size of which depends on the priority of the request, then returns that bucket of resources to the Fluxer. In this approach, TRAS is gathering a group of resources that are likely available but may not be. In the end, it is a combination of pseudo-randomization of the large pool of STRs and usage of local MTRs that achieves distribution of load among all transcoding resources.
- In this embodiment, TRAS monitors the regional load of the MTRs it is managing. An MTR regularly updates the C-server with its queue load. TRAS periodically calculates the percentage of MTRs available, weighting them by their remaining capacity. An average is then calculated and used as a Regional Load Factor. For example if there are 10 MTRs each with a load of 10%, 20%, 30%, . . . 100%, then the algorithm would be as follows:
- S1=1−0.1, S2=1−0.2, S3=1−0.3, . . . S10=1−1, (S1+S2+S3+ . . . +S10)/10=0.45 (or 45% available; 55% current load)
- This Regional Load Factor may be reported to any system attempting to determine the availability of work units for a given regional transcoding installation. The foregoing load-factor algorithm should not be viewed as limiting, as other algorithms may be used in other embodiments.
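- The example calculation above, for ten MTRs with loads of 10% through 100%, can be reproduced with a short sketch. The function name and the handling of an empty region are assumptions; loads are expressed as fractions between 0 and 1.

```python
from typing import List


def regional_load_factor(loads: List[float]) -> float:
    """Average remaining capacity (1 - load) across the MTRs in a region."""
    if not loads:
        return 0.0                  # assumption: no MTRs means no capacity
    return sum(1.0 - load for load in loads) / len(loads)


loads = [i / 10 for i in range(1, 11)]   # 0.1, 0.2, ..., 1.0
factor = regional_load_factor(loads)     # approximately 0.45
```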
- Fluxer
- In the present embodiment, the Fluxers are the primary interface of the transcoding system to the outside world and the most common component for external clients to interact with. At a high-level, the purpose of the Fluxer is to break-up a video into segments, send those segments to one or more transcoders and reassemble those segments into the target container file. There are a number of low-level details involved in this function.
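- The split, parallel transcode, and reassemble flow described above can be sketched as follows, including the fallback to another transcoding resource when one drops the connection. The `transcode` callable, the resource names, the worker count, and the use of `ConnectionError` to model a torn-down connection are all assumptions for illustration; real segments would be sent over HTTP.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def transcode_segment(segment: bytes, resources: List[str],
                      transcode: Callable[[str, bytes], bytes]) -> bytes:
    """Try each resource in turn; a resource that disconnects is skipped."""
    for resource in resources:
        try:
            return transcode(resource, segment)
        except ConnectionError:
            continue                # source another resource for this segment
    raise RuntimeError("no transcoding resource completed the segment")


def run_job(segments: List[bytes], resources: List[str],
            transcode: Callable[[str, bytes], bytes]) -> bytes:
    """Transcode segments in parallel and reassemble them in original order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() yields results in input order regardless of completion order.
        derivatives = list(pool.map(
            lambda seg: transcode_segment(seg, resources, transcode), segments))
    return b"".join(derivatives)


def fake_transcode(resource: str, segment: bytes) -> bytes:
    if resource == "str-1":
        raise ConnectionError("resource went away")
    return segment.upper()          # placeholder for a real transcode


output = run_job([b"aa", b"bb", b"cc"], ["str-1", "mtr-1"], fake_transcode)
```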
- Fluxers provide several interfaces to support Live (real-time), VOD (batch) and VOD (real-time) use cases.
- For Live, Fluxer live interfaces allow the Fluxer to transcode a live event by pulling a bitrate/format from an Archiver or Entry-Point, producing one or more transcoded bitrates/formats, and publishing all configured bitrates/formats to an Archiver or Streaming Mid-Tier. This activity is initiated by an HTTP Request to the Fluxer's live interface, containing the source Archiver set or Entry-Point, the target stream-id and the configuration for all derivative bitrates/formats. The initiating HTTP request causes the Fluxer to begin transcoding until the stream is torn-down.
- Fluxer VOD interfaces, whether real-time or batch, are primarily implemented in the current embodiments as a pull-based HTTP interface, with the primary difference being how much of the file is transcoded at a given time. Regardless of whether the request arrives over the live or VOD interface, Fluxers generally wait to acknowledge jobs until they have obtained an initial set of resources from TRAS. If initial resource allocation fails, then the Fluxer can communicate that failure immediately, regardless of whether the job is synchronous or asynchronous.
- Fluxer Live Interface
- In this embodiment, Fluxer's live interface is a URL that triggers Fluxer activity but does not require that the initiator remain connected to the HTTP Socket, as the activity is ongoing and no feedback is required for the initiator. This allows a resource to ask a Fluxer to initiate transcoding of a live stream and to contact some number of additional Fluxers, asking them to monitor the primary. The initiation of this request typically contains the following information:
- The source stream
- The bitrate, height/width and transcoding configuration for each transcode of the live stream.
- The list of additional Fluxers that together with the target make up the Fluxer Group
-
FIG. 20 illustrates one embodiment of the operation of the Fluxer (and other system components) when transcoding a live stream. In step 1, the Puller contacts the streaming manifest manager and gets an Archiver set or Fluxer set. In step 2, the Puller contacts the source Archiver and initiates a stream. In step 3, the Puller contacts the first Fluxer from the Fluxer Set and passes transcoding information. The contacted Fluxer then contacts the remaining Fluxers in the set and they decide who will be the Mother and who will be Children. Transcoding parameters are communicated here. Fluxer Children begin monitoring the Mother. In step 4, the Mother Fluxer contacts SMM to get the Archiver set. In step 5, the Fluxer contacts TRAS to get transcoding resources. In step 6, the Fluxer initiates a pull from the Source Archiver. In step 7, the Mother Fluxer begins the parallel transcode of the stream being pulled from the Source Archiver, utilizing the transcoding resources (TRs). In step 8, the Mother Fluxer re-assembles the transcoded segments and sends the transcoded stream to the target Archiver set assigned by SMM for each bitrate.
- Alternatively, the above operation can be performed with the live streaming components depicted in
FIG. 5. In such a case, an Entry-Point locates a Fluxer and requests a transcode. The Entry-Point itself sources the stream to be transcoded, or points the Fluxer to a Storage source stream using the metadata files described in connection with FIG. 8. The transcoded stream is sent to a streaming mid-tier (SMT) machine or to the Storage system, rather than an Archiver.
- Should a Mother Fluxer fail, the Fluxer Children will begin an election to decide which Fluxer will assume the role of Mother. The election should prefer the Fluxer that is closer to the source of the stream. The new Mother will query the target Archiver to confirm that the old Mother is no longer sending data and to retrieve the position of the last data received. The new Mother then assumes the Mother role and begins transcoding where the last Mother left off.
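- The election preference described above can be sketched very simply. The distance metric (here an arbitrary number such as measured latency to the stream source), the tie-break on name, and the Fluxer names are assumptions; the text does not specify how closeness is measured or how ties are resolved.

```python
from typing import Dict, List


def elect_mother(children: List[str], distance_to_source: Dict[str, float]) -> str:
    """Pick the surviving Fluxer closest to the stream source (ties broken by name)."""
    return min(children, key=lambda name: (distance_to_source[name], name))


# Hypothetical distances for the surviving Children after the Mother fails.
distances = {"fluxer-b": 40.0, "fluxer-c": 12.0, "fluxer-d": 12.0}
new_mother = elect_mother(["fluxer-b", "fluxer-c", "fluxer-d"], distances)
```

- A deterministic tie-break matters in practice because every Child must arrive at the same answer independently; the newly elected Mother would then query the target Archiver for the last received position before resuming.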
-
FIG. 21 illustrates the operation of the Fluxer (and other system components) when transcoding a VOD stream in batch mode. In step 1, the Job Queue contacts the Fluxer. (The Job Queue can exist as part of a storage system process, portal, or other component accessing the transcoding system.) In step 2, the Fluxer contacts TRAS to get transcoding resources. In step 3, the Fluxer pulls media from the source. In step 4, the Fluxer orchestrates the transcoding of the content using transcoder resources from TRAS. In step 5, the Fluxer posts transcoded content to a destination. In step 6, the Job Queue removes the job.
- In this implementation, the Job Source can pick a Fluxer at its own discretion; preferably, however, it chooses a Fluxer that is both idle and near the job source. In other implementations, the Mapping system can be used to determine the best Fluxer by sending a DNS request to a fluxer domain and receiving back from the Mapping system the IP address of a suitable Fluxer. Batch VOD Fluxer requests, although not prohibited from using MTRs, can be weighted to prefer using idle or low-usage STR transcoders.
-
FIG. 22 illustrates the operation of the Fluxer (and other system components) when transcoding a VOD stream in real-time mode. In step 1, a request comes in to the Fluxer from a CDN's content server (e.g., an HTTP proxy server as shown and described in connection with FIG. 1) that has received a user request for a file, or from a cache hierarchy region that has been asked for the content by the server (e.g., using a cache hierarchy technique as described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference), or from an SMT machine (see, e.g., FIG. 12, where the content server ghost has asked the SMT machine in step (2) thereof for VOD content to satisfy request step (1) thereof). In step 2, the Fluxer checks its transcoding region cache for requested segments of the content (which may correspond to, e.g., one or more IF fragments). Assume it receives a cache miss. In step 3, the Fluxer contacts TRAS to identify transcoding resources. In step 4, the Fluxer requests and receives the segments from the source (e.g., from storage or origin). In step 5, the Fluxer transcodes them using transcoding resources. In step 6, the Fluxer returns transcoded segments to the requesting component following re-assembly into a file or portion thereof. In step 7, the Fluxer begins workahead transcoding.
- If the Fluxer determines that there is a region cache hit in
step 2, then the Fluxer retrieves the transcoded segment from region cache, looking for a segment that is at least N seconds ahead of the requested segment (where N is determined by a configuration parameter). The Fluxer either begins workahead or not depending on whether it can find a sufficient number of segments in cache to meet the workahead criteria.
- Thus, in the VOD real-time case, the Fluxer works ahead of the anticipated requests in order to maintain a smooth experience for end users. Preferably, a content provider's configuration for real-time VOD transcoding contains a parameter which defines the number of segments to transcode ahead of the most current request, e.g., by indicating a number of seconds to work ahead. When a real-time VOD request comes to a Fluxer, it can check to see if the required segments have already been transcoded and, if so, will begin delivering immediately while it performs the workahead of N segments based on the position of the request being served.
- The following provides more detail about caching at a transcoding region. Caching proxy server functionality is employed locally on a Fluxer to maintain a cache-layer for the work performed in real-time. Once a request has been transcoded the derivative is cached locally within the transcoding region. The Fluxer leverages this feature by performing a lookahead request of N segments ahead of the current segment request. If a non-200 response code is returned by the local cache server for any of the N segments, Fluxer will respond by posting the required segment to a TR through its local cache server, resulting in caching of the transcoded response within the cache server layer.
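- The lookahead check described above can be sketched as follows. The `cache_status` callable stands in for a request to the local cache server and returns an HTTP status code; it, the function name, and the segment numbering are assumptions for illustration.

```python
from typing import Callable, List


def segments_needing_transcode(current: int, n: int,
                               cache_status: Callable[[int], int]) -> List[int]:
    """Return the segments among the next n whose cache lookup is not a 200."""
    return [seg for seg in range(current + 1, current + n + 1)
            if cache_status(seg) != 200]


# Stand-in for the local cache server: segments 11 and 13 are already cached.
cached = {11, 13}


def fake_cache_status(segment: int) -> int:
    return 200 if segment in cached else 404


missing = segments_needing_transcode(10, 4, fake_cache_status)
```

- Each segment in the resulting list would then be posted to a transcoding resource through the local cache server, so that the transcoded response is cached as a side effect.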
- The following describes optional pre-processing of media for VOD real-time case. Before allowing real-time transcoding of a VOD asset, some amount of work can be done to ensure that the media is prepped such that there is a standard starting point from which to begin transcoding. Pre-processing the media by transcoding the first few segments of a video means that the system can begin streaming immediately while the transcoder builds up a workahead buffer. Pre-processing typically includes the following actions:
-
- Create an optimized version of the inbound file (optimized keyframe rate and bitrate)
- Create an index of segment locations to byte-ranges
- Produce the first N segments for each target bitrate
- The following describes an example of a process for identifying a Fluxer for VOD real-time workflows. A Mapper load-feedback property can be used to find appropriate Fluxers for real-time VOD transcoding. Preferably, real-time Fluxer requests use local MTR (dedicated) transcoder resources. Load-feedback from the Fluxer to the Mapper can include both the local Fluxer load and the regional transcoding resource load as well. Regional transcoder load estimation can be obtained from the Fluxer by making a call to TRAS to perform the “Regional Load Estimation”, as described above in connection with the TRAS component, and thereby return a “Regional Load Factor” to the Fluxer.
- Transcoding Resources (MTRs, STRs)
- In the current example, the role of the transcoding resource (sometimes referred to herein as a “transcoder”) is primarily to transcode segments of audio/video, or other content that needs to be transcoded. In one embodiment, a transcoding resource uses an HTTP-based API for receiving and transmitting segments. Typically, all transcoding resources are considered unreliable—and particularly STRs. A shared transcoding resource may terminate the transcode for any reason although if it terminates the transcode due to an error in the source media it preferably indicates that fact to the Fluxer, e.g., using an HTTP 415 Unsupported Media Type error, for example. If a Fluxer receives an unexpected disconnect from a transcoding resource (particularly an STR) it preferably ceases using that transcoding resource for at least a given time period, to prevent impacting STRs that are delivering content in the CDN.
- Put another way, load is a concern for STRs because they are typically the HTTP proxy servers running in the CDN and delivering content to end users, as described in connection with
FIGS. 1-16, and the integrity of the delivery network is preferably protected. The process managing the transcoding on the STR is configured to avoid impact to the STR. STRs monitor their local environment and terminate jobs if the environment becomes constrained. In the STR environment, the HTTP proxy server (ghost) process is considered more important than the transcoding process. STRs run a process “manager” which in turn runs and monitors the actual transcoding server as a child process. This “manager” may take any of several steps to “lock-down” the transcoding process, such as using LD_PRELOAD to block dangerous system calls, chrooting the process, and monitoring the process for excessive runtime and/or CPU consumption. -
FIG. 23 provides an overview of processes executing on a transcoding resource (excluding HTTP proxy processes for content delivery). - In one embodiment, a client (e.g., a Fluxer) can communicate with transcoding resources using an
HTTP 100 Expect/Continue workflow. This is preferable because a transcoding resource may not be able to handle any work and it is useless and wasteful to send an entire segment only to be denied. A transcoding resource may block for a period of time before sending a 100 Continue response to a requesting client but also preferably responds immediately if unable to handle the request. - In the current implementation, transcoding resources accept transcoding segments that are chunks or pseudo-chunks for transcoding.
- Regardless of a transcoding resource's role as either a MTR or a STR, in the current embodiment, transcoders are generally considered unreliable by the Fluxers. As noted previously, a Fluxer receives a list of transcoding resources so that it may begin to send segments to them. Without a large, global, fine-grained, resource allocation system, it would be impossible to have a high degree of certainty that a given transcoding resource will accept a segment to transcode. Moreover, transcoding resources run on commodity hardware, so failure of a transcoding resource during the transcoding process is not only a possibility but may even be likely at some point across the transcoding system. For this reason, it is simpler to adopt an unreliable view of transcoding resources. This view also simplifies the transcoding resource implementation. If the transcoding resource is overloaded, it is sufficient and acceptable for that transcoding resource to simply deny any inbound transcoding requests until the load drops below a threshold. Should a transcoding resource process be leveraging idle CPU on a machine with a more important role, such as an STR, it is sufficient to simply “go away” if the resources being consumed by the transcoding resource become needed. In response to a deny or an unexpected socket close, the Fluxer preferably sends the segment to an alternate transcoding resource. However, if the transcoding resource returns an actual error about the source bits (e.g. some fatal error with the original encode) then the Fluxer may send the segment to another transcoding resource or it may give up on the segment altogether, failing the transcode.
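The failover behavior described above can be sketched as a small dispatch loop. The `Outcome` values, the `send` callback, and the host names are hypothetical and introduced only for illustration; a real Fluxer would derive the outcome from the HTTP exchange with the transcoding resource.

```python
from enum import Enum
from typing import Callable, Iterable, Optional


class Outcome(Enum):
    DONE = "done"              # transcoder returned the converted segment
    DENIED = "denied"          # transcoder refused the work (e.g., overloaded)
    DISCONNECT = "disconnect"  # unexpected socket close
    BAD_SOURCE = "bad_source"  # error in the source media (e.g., HTTP 415)


def dispatch_segment(segment: bytes,
                     transcoders: Iterable[str],
                     send: Callable[[str, bytes], Outcome],
                     retry_bad_source: bool = False) -> Optional[str]:
    """Try transcoders in order; return the host that finished, or None."""
    for host in transcoders:
        outcome = send(host, segment)
        if outcome is Outcome.DONE:
            return host
        if outcome is Outcome.BAD_SOURCE and not retry_bad_source:
            return None  # give up on the segment, failing the transcode
        # DENIED, DISCONNECT, or a retried BAD_SOURCE: try the next resource.
    return None


# Example with canned replies standing in for real network calls.
replies = {"str-1": Outcome.DENIED, "str-2": Outcome.DISCONNECT,
           "mtr-1": Outcome.DONE}
winner = dispatch_segment(b"segment-bytes", ["str-1", "str-2", "mtr-1"],
                          lambda host, seg: replies[host])
print(winner)
```

Treating every transcoding resource as unreliable keeps this loop simple: the only state the caller needs is the ordered list of candidates.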
- Identification of possible transcoding resources to use for a particular job is now described. Possible transcoders are identified from a pool of available transcoding resources in one of a few ways. For STRs that represent HTTP proxy servers somewhere in the delivery network, Mapper is used to provide a map that can return a list of possible resources which appear to be under a given load threshold, as mentioned above. This is provided over a DNS interface with the parameters encoded into the requesting hostname. This DNS request may return a large number of possible hosts—more than that associated with a typical DNS lookup in the delivery network. As noted, STRs returned are considered volatile and may accept or reject the request based on their own local load.
- A non-limiting, exemplary approach for an internal queue of a transcoding resource is described as follows. A transcoding resource can have a fixed number of “slots”, made up of two counters, which indicates the number of individual transcode-segment requests that may be accepted by that transcoding resource at any given time. One counter is the “available-process” counter and is some sub-percentage of the number of available cores on the system. The other counter is the “queue” counter and is some configurable number of additional tasks that are allowed to be waiting but not actively being worked on. Both of these factors are reactive to the hardware the transcoding resource is installed on and both are configurable. For example, a transcoding resource might use an available-process factor of 0.5 (or 50% of system cores) and a queue factor of 0.10 (or 10% of cores). Taken together, these two counters make up the total number of available “slots” for a given transcoding resource.
- As a transcoding resource is accepting work, it continues to accept requests to transcode segments so long as it has available processes and/or slots. Should the transcoding resource be completely full, it denies the request with an HTTP 503 Service Unavailable error. A 100 Expect/Continue method is otherwise used to ensure that the request is valid and that the transcoding resource has an available process to perform the requested action. If the processes are all allocated and an inbound Fluxer request lands on a queue slot, then the transcoding resource should block its “CONTINUE” response until the queue slot becomes assigned to a process.
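The slot accounting described above can be sketched as a small admission table. The class name, the method names, the rounding up of fractional counts, and the floor of one slot per counter are assumptions of this sketch rather than details from the description.

```python
import math


class SlotTable:
    """Admission control for one transcoding resource (illustrative only)."""

    def __init__(self, cores: int, process_factor: float = 0.5,
                 queue_factor: float = 0.10) -> None:
        # Both counters scale with the hardware. Rounding up and the minimum
        # of one are assumptions made for this sketch.
        self.max_processes = max(1, math.ceil(cores * process_factor))
        self.max_queued = max(1, math.ceil(cores * queue_factor))
        self.active = 0
        self.queued = 0

    def admit(self) -> str:
        """Classify an inbound request as 'continue', 'queued', or '503'."""
        if self.active < self.max_processes:
            self.active += 1
            return "continue"  # send 100 Continue immediately
        if self.queued < self.max_queued:
            self.queued += 1
            return "queued"    # hold the 100 Continue until a process frees up
        return "503"           # completely full: 503 Service Unavailable

    def finish(self) -> bool:
        """A process finished; return True if a queued request was promoted."""
        if self.queued:
            self.queued -= 1   # the queued request takes over the freed process
            return True
        self.active -= 1
        return False


# Example: 8 cores give 4 process slots and 1 queue slot.
table = SlotTable(cores=8)
decisions = [table.admit() for _ in range(6)]
print(decisions)
```

When `finish()` returns True, the resource would send the delayed 100 Continue response to the request that had been waiting on the queue slot.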
- Batch VOD Queuing
- The queuing of VOD batch requests is now described. A queuing system exists to request that files be transcoded at the earliest possible convenience. This queue contains a list of jobs, each defining a source, a transcode profile and a destination, which will be executed as soon as possible given the resources available. The queue itself is quite simple, can be distributed into many sub-queues and will mostly be used by some user interface to provide batch-transcoding services for bitrates that a content provider wishes to create and have stored for later delivery. Upon waking up, the local queue manager will simply take the top N jobs off the stack and make the required batch requests to the Fluxers, allowing the transcoding system to work to complete the transcoding job. Multiple queues may be running within a given transcoding region, typically running on the same hardware that is running the Fluxer or TRAS code.
- Examples of jobs which the transcoding system is configured to support may include the following (which are non-limiting examples):
-
- Conversion to the following video codecs: h.264, theora, vp8
- Conversion to the following audio codecs: mp3, aac, vorbis
- Conversion to the following containers: mp4-standard, mp4-fragmented, flv, IF (intermediate format as described previously)
- Conversion from the following video codecs: h.264, mpeg1, mpeg2, VC1, theora, VP3/6/8, DV
- Conversion from the following audio codecs: aac, mp3, mpa, pcm, vorbis
- Conversion from the following containers: mpeg2ts, mpeg2ps, mpeg1, avi, mp4, wmv/asf, mp3, WEBM/Matroska
- The transcoding system also preferably supports the application of filters and scalers (i.e. deinterlacing and frame-scaling).
- While some of the foregoing examples have focused on converting media formats, codecs, and the like, the system described herein is not limited to such. The teachings above may be extended so as to provide a distributed platform for applying security or rights management schemes to content. For example, the system above may be modified by having the Fluxer receive requests (by way of illustration) to apply a given encryption algorithm to a file. The Fluxer can break up the file into segments that are each to be encrypted, and delegate the tasks of doing so to distributed MTRs and STRs, as described above. In sum, the nature of the assigned task may change but the system still operates similarly. Other tasks might include embedding a watermark in the content, or inserting data to apply a digital rights management scheme to the file. In other embodiments, the system can receive an end-user client request for content, discern information about the end-user client (client IP address, user-agent, user-id, other identifier, etc.) and incorporate that data into a fingerprint that is inserted into the content in real-time, leveraging the real-time transcoding flow described above (e.g.,
FIG. 22 ) to convert the file on the fly. Hence, the content can be marked with information related to the end-user (or client machine) to whom it was delivered. In some use cases, it may be preferable not to break the original file apart but rather assign the entire file transcoding job to a particular MTR or STR, perhaps with low priority, so that the assigned machine has all the data in the file to work with in performing its task. - Creating Transcoding Segments from an Input (Pseudo-Chunking)
- The following presents examples of how the Fluxer can break apart incoming files into transcoding segments, and more particularly how it can break apart incoming video files.
- The embodiments described above provide a transcoding system that implements segmented parallel encoding for video and other content. For video, segmented parallel encoding typically trades flexible keyframe intervals for the speed of encoding videos with a large number of encoders operating in parallel. If keyframe intervals are not altered, then the boundary of a keyframe may be considered a chunk or segment and treated independently of other chunks. By breaking up a video into segments and submitting those segments in parallel to multiple transcoding resources, the transcode can be parallelized, increasing its speed in proportion to the number of encoders and reducing the encoding time to a minimum of (demuxing_time+slowest_segment_encode_time+re-muxing_time).
- Codecs enable the compression of video by taking advantage of the similarity between frames. Generally speaking, there are 3 types of frames that are used to varying degrees in different codecs: I-frames (aka, keyframes), P-frames and B-frames. In general, and as mentioned previously, an I-frame can be thought of as a stand-alone frame that contains the complete information to construct a complete frame on its own. A P-frame essentially references what has changed between itself and the previous frame, while a B-frame can refer to frames ahead of it or behind it. The group of frames that starts with an I-frame and ends with the last frame before the next I-frame is often referred to as a Group Of Pictures or “GoP”. Hence, when a video is encoded as a Closed-GoP video, each GoP can be treated independently from the others.
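As a worked illustration of the lower bound given above, the following sketch compares it with a purely sequential encode. It assumes, hypothetically, that there are at least as many transcoding resources as segments and that transfer and queuing overheads are negligible; the function names and the timings are invented for the example.

```python
from typing import Sequence


def parallel_transcode_time(demux_s: float, segment_encode_s: Sequence[float],
                            remux_s: float) -> float:
    """Lower bound when every segment is encoded at the same time."""
    return demux_s + max(segment_encode_s) + remux_s


def sequential_transcode_time(demux_s: float, segment_encode_s: Sequence[float],
                              remux_s: float) -> float:
    """Comparison case: one encoder handles the segments one after another."""
    return demux_s + sum(segment_encode_s) + remux_s


# Hypothetical timings in seconds for a four-segment file.
encode_times = [10.0, 12.5, 9.0, 11.0]
parallel = parallel_transcode_time(2.0, encode_times, 3.0)
sequential = sequential_transcode_time(2.0, encode_times, 3.0)
print(parallel, sequential)
```

With these numbers the slowest segment (12.5 seconds) dominates the parallel case, which is why very uneven segment sizes limit the benefit of adding more encoders.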
- A container generally refers to a file that wraps the raw encoded bits of media (e.g., audio, video) and may provide indexing, seekability and metadata. Typically, a container divides the raw bits into “packets” which may contain one or more frames. A frame typically has a frame-type of audio, video or a number of less-frequent possibilities such as subtitles and sprites. For video, these frames each correspond to the type of frames mentioned above, I-Frame, B-Frame, P-Frame, etc. There are a large number of different containers and each may have a little different way of getting at the raw media data.
- In sequential encoding (as opposed to parallelized encoding), all frames can be considered in a sequence (or with some parallelism resulting from a multi-threaded computer architecture) and an approach derived across a large number of frames. When encoding in this manner, it is relatively trivial to do things such as modify the GoP size because there is always enough information available to create an I-frame (since the entire stream is available). When parallelizing encodes in a cloud (where multiple servers are involved, as can occur with the transcoding system presented herein), making modifications to the GoP size can become more problematic. If, for example, the request is to reduce the GoP size to a non-factor of the original GoP size then the I-frames will no longer be aligned.
- The following describes some examples of kinds of complications when parallelizing encodes and a pseudo-chunking approach to solve them.
- GoP Size Modification. GoP size modification becomes complicated when parallelizing transcodes across multiple processors. For example, a typical encode may have a GoP size of 250 frames (8.34 seconds of NTSC video), which can be an issue for high keyframe rates, which may be present, e.g., in HD video formats. If an HD or other video format is desired to run 2-3 seconds between keyframes (approximately 60-90 frames in the GoP), neither 60 nor 90 frames divides evenly into the 250-frame source keyframe interval. Solving this problem involves keeping track of how many source frames must be decoded in order to produce a keyframe at a position that does not line up with a source keyframe.
- For example, and with reference to
FIG. 24, assume a current GoP size of 250 frames and a target GoP size of 90 frames. As a result, NEWGoP1 will be frames 1-90, and needs frames 1-90 to be able to be re-encoded. NEWGoP2 will be frames 91-180 and needs frames 1-180 to be able to be re-encoded. NEWGoP3 will be frames 181-270 and will therefore need frames 1-270 to be able to be re-encoded. Notice that we have now crossed into a new GoP: NEWGoP3 has to start with the first GoP and needs several frames from the second GoP in order to be encoded. NEWGoP4 doesn't have this problem; it will be made up of frames 271-360 and therefore only needs frames 251-360 in order to start from a keyframe and encode its bits. FIG. 24 illustrates this scenario. - A pseudo-chunking approach can address this issue by, in one embodiment, allowing for segments that are not aligned to keyframes or GoPs. A pseudo-chunk may be larger or smaller than a GoP. In the above example, the segmenter (e.g., the Fluxer) can create a pseudo-chunk that extends past the Current GoP to reach the end of NEWGoP3.
- Note that when dealing with GoP modification, it's often preferable to allow the encoder to produce multiple GoPs from a single source GoP. One usually wouldn't want to transfer one GoP three times just to get three new GoPs, when you could transfer one GoP+a few frames of the second (the entire pseudo-chunk) and receive back three GoPs.
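The frame arithmetic in the 250-frame to 90-frame example can be sketched as follows. The sketch assumes a constant source GoP size, closed GoPs, and 1-based frame numbers; the function name is hypothetical.

```python
from typing import Tuple


def source_range_for_new_gop(index: int, source_gop: int,
                             target_gop: int) -> Tuple[int, int]:
    """Return the (first, last) source frames needed to encode a new GoP.

    index is the 1-based number of the new GoP. Decoding has to start at the
    keyframe of the source GoP that contains the new GoP's first frame.
    """
    new_first = (index - 1) * target_gop + 1
    new_last = index * target_gop
    key_first = ((new_first - 1) // source_gop) * source_gop + 1
    return key_first, new_last


# Reproduce the example: current GoP size 250, target GoP size 90.
ranges = [source_range_for_new_gop(i, 250, 90) for i in range(1, 5)]
for number, (first, last) in enumerate(ranges, start=1):
    print(f"NEWGoP{number} needs source frames {first}-{last}")
```

Because the first three new GoPs all start decoding at frame 1, a single pseudo-chunk covering frames 1-270 lets one transcoding resource return all three, which is the saving described in the note above.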
- Pseudo-chunking also applies to scene change detection, and more particularly, to situations where there are frequent scene changes in a video file. A scene change refers to an interruption in the regular sequence of keyframes. It typically exists because enough has changed from one frame to the next that a P or B frame becomes impractical, i.e., there is enough difference between frames for the encoder to place an additional keyframe in-line for quality sake. Most modern encoders contain some threshold for inserting extra keyframes on scene changes in order to optimize the encoding experience. Scene-changes can present a problem if too simplistic of an algorithm is used when segmenting, such as simply splitting on keyframes. When many scene-change keyframes are present it could cause too-small a fragment to be used for the encoders and could actually slow down parallel transcodes. A pseudo-chunking approach, in which pseudo-chunks may span more than one keyframe in appropriate circumstances, can address this issue (e.g., by including some predetermined minimum number of frames/time in the pseudo-chunk segment, regardless of keyframe intervals).
- Pseudo-chunking addresses open GoP encoding as well. Typically, a GoP ends with a P-frame (which references a previous frame). This is a closed GoP. However, it's possible to end a GoP with a B-frame, which could refer to the next frame in the next GoP (the starting I-Frame). When this occurs it is referred to as an open-GoP. An open-GoP presents a problem over a closed-GoP when parallelizing encodes because some amount of the next GoP is required to complete the encode.
- Details on Pseudo-Chunking Approach
- In one embodiment, a device managing the transcode (such as the Fluxer in the transcoding system previously described) is configured to be aware of what frames it needs to use, as a subset of those received, to produce a new transcode. For example, the Fluxer will look at a frame to determine what kind of frame it is (B-frame, P-frame, keyframe, etc.), whether it is in a Closed-GoP situation, and what GoP size it needs to target. It is frame-aware. Hence, the Fluxer has intelligence to create pseudo-chunks, rather than blindly segmenting on keyframes. It can then include the appropriate coded frames in a pseudo chunk, so that the transcoding resource has all the data it needs to decode, convert the data, and re-encode as required.
- As explained above, a pseudo chunk may be either a partial or super-GoP. A pseudo chunk is used as a unit of data that is transferred from a Fluxer to a transcoder and may not include the entire GoP if the entire GoP is not required for transcoding the target number of frames. A pseudo chunk may also contain more frames than a given GoP in the case of an Open GoP condition or if the target keyframe interval is sufficiently different from the source keyframe interval. So a pseudo-chunk is not necessarily aligned with a GoP, and may extend past the original GoP boundary or not reach that far.
-
FIG. 25 illustrates an example of pseudo-chunking to change the GoP size in a given video file. In this example, the pseudo-chunk starts at a keyframe boundary and continues past the Current GoP (the original GoP) until enough frames are included to construct the New GoP that bridges the boundary between Current GoP 1 and Current GoP 2. Given a video that is 1 frame per second and has a 10 second GoP, we have a GoP every 10 frames (1-10, 11-20, 21-30, etc.). For illustrative purposes, assume Current GoPs 1 and 2 are to be re-encoded with a new GoP size of 3 frames (1-3, 4-6, 7-9, 10-12, etc.). The fourth new GoP (New GoP4 in FIG. 24) is a problem because frame 10 belongs to Current GoP 1 while frame 11 belongs to Current GoP 2. We need to send a chunk of data to the transcoding resource that includes the entire Current GoP 1 and two frames of Current GoP 2 in order to have enough frame data at the transcoding resource to encode the New GoP4 at frames 10-12. This chunk of data is represented by Pseudo Chunk 1 in FIG. 24. Also note that the Fluxer preferably ensures that the last frame of the pseudo-chunk is not a B-frame referring to a frame ahead of it. If it is, then another frame(s) may need to be included in Pseudo Chunk 1. - Another aspect of pseudo-chunking involves including both the starting and ending keyframes to deal with open GOP situations. Typically, with sequential encoding, one would only need the frames that are desired to be encoded, and the keyframe of the next GOP is unnecessary, but in the parallel transcoding case, and with a “frame-aware” Fluxer, one can and should send the extra frame. To do this, the Fluxer ensures that pseudo-chunks always start on a keyframe and continue past the frame-number of the last needed frame to the point that there are either no further forward-looking B-frames or it encounters the next keyframe.
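The boundary rule just described (start on a keyframe, then continue past the last needed frame until no forward-looking B-frame remains or the next keyframe is reached) can be sketched as follows. As a simplification that is not in the description, the sketch represents the stream as a string of frame types and treats every trailing B-frame as forward-looking; the function name and the sample stream are hypothetical.

```python
from typing import Tuple


def pseudo_chunk_bounds(frame_types: str, first_needed: int,
                        last_needed: int) -> Tuple[int, int]:
    """Return inclusive (start, end) indices of a pseudo-chunk.

    frame_types holds one character per frame: 'I', 'P' or 'B'.
    Indices are 0-based.
    """
    # Start on the nearest keyframe at or before the first needed frame.
    start = frame_types.rfind("I", 0, first_needed + 1)
    if start < 0:
        raise ValueError("no keyframe at or before the first needed frame")
    end = last_needed
    # A trailing B-frame may reference the frame after it, so keep extending.
    while frame_types[end] == "B" and end + 1 < len(frame_types):
        end += 1
        if frame_types[end] == "I":
            break  # the next keyframe is included and ends the chunk
    return start, end


# Hypothetical stream: an open GoP (ending in B-frames) followed by another.
stream = "IPBBPBBIPBB"
bounds_open = pseudo_chunk_bounds(stream, 2, 5)    # last needed frame is a B
bounds_closed = pseudo_chunk_bounds(stream, 2, 4)  # last needed frame is a P
print(bounds_open, bounds_closed)
```

In the first call the chunk grows from index 5 to the keyframe at index 7, which corresponds to sending the extra ending keyframe in the open-GoP case.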
- Finally, a pseudo-chunking Fluxer can mitigate the effects of frequent scene changes, which can produce transcoding segments that are too small, by applying certain thresholds (minimum number of frames for a segment) in the pseudo-chunking process.
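One way to apply such a minimum-size threshold is sketched below. It assumes that the keyframe positions of the source are already known and that a single parameter (here called `min_frames`, a hypothetical name) gives the minimum number of frames per segment.

```python
from typing import List


def segment_starts(keyframes: List[int], min_frames: int) -> List[int]:
    """Choose segment start frames from the source keyframe positions.

    A keyframe opens a new segment only if the current segment already
    spans at least min_frames frames, so closely spaced scene-change
    keyframes are absorbed into the segment that precedes them.
    """
    starts: List[int] = []
    for position in sorted(keyframes):
        if not starts or position - starts[-1] >= min_frames:
            starts.append(position)
    return starts


# Regular keyframes every 250 frames plus extra scene-change keyframes.
keyframes = [0, 250, 262, 270, 500, 505, 750]
starts = segment_starts(keyframes, min_frames=100)
print(starts)
```

The final segment can still be shorter than the threshold if the file ends soon after its starting keyframe; a fuller implementation might merge it into the previous segment.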
- In one implementation, for every batch transcode, Fluxer can produce an index file describing the breakup of all pseudo chunks produced, for the input audio and video tracks, called a “Chunk Index Header”. This file can be used for accelerating real-time transcodes by identifying the individual pseudo chunks for the particular input and what byte-offsets they occupy in the file, making retrieval of discrete units easier.
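The description does not specify the layout of the Chunk Index Header, so the following sketch uses an invented in-memory form purely to show how such an index could speed up retrieval. The field names, the lookup function, and the use of an HTTP Range header value are all assumptions.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkEntry:
    track: str        # "video" or "audio"
    first_frame: int  # first frame carried by the pseudo chunk
    byte_offset: int  # where the pseudo chunk starts in the input file
    byte_length: int  # size of the pseudo chunk in bytes


def find_chunk(index: List[ChunkEntry], track: str, frame: int) -> ChunkEntry:
    """Return the last pseudo chunk of a track that starts at or before frame."""
    entries = sorted((e for e in index if e.track == track),
                     key=lambda e: e.first_frame)
    pos = bisect_right([e.first_frame for e in entries], frame) - 1
    if pos < 0:
        raise KeyError(f"no {track} chunk covers frame {frame}")
    return entries[pos]


def byte_range_header(entry: ChunkEntry) -> str:
    """Format an inclusive HTTP Range header value for the chunk's bytes."""
    last = entry.byte_offset + entry.byte_length - 1
    return f"bytes={entry.byte_offset}-{last}"


# Hypothetical index for a file with two video pseudo chunks.
index = [ChunkEntry("video", 1, 0, 1000), ChunkEntry("video", 271, 1000, 800)]
entry = find_chunk(index, "video", 300)
print(byte_range_header(entry))
```

A real-time request for a given position could then fetch only the bytes of the pseudo chunk it needs instead of re-parsing the whole input file.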
- It should be understood that pseudo-chunking is not limited to the applications described above, nor is it limited to use by a Fluxer described herein. Any module charged with segmenting a file for encoding may employ pseudo-chunking. Further, other forms of media, particularly those that utilize atomic data that references other data in a stream (as do B-frames, P-frames, etc.), may also be segmented using pseudo-chunking.
- Computer-Based Implementation
- The clients, servers, and other devices described herein may be implemented with conventional computer systems, as modified by the teachings hereof, with the functional characteristics described above realized in special-purpose hardware, general-purpose hardware configured by software stored therein for special purposes, or a combination thereof.
- Software may include one or several discrete programs. Any given function may comprise part of any given module, process, execution thread, or other such programming construct. Generalizing, each function described above may be implemented as computer code, namely, as a set of computer instructions, executable in one or more processors to provide a special purpose machine. The code may be executed using conventional apparatus—such as a processor in a computer, digital data processing device, or other computing apparatus—as modified by the teachings hereof. In one embodiment, such software may be implemented in a programming language that runs in conjunction with a proxy on a standard Intel hardware platform running an operating system such as Linux. The functionality may be built into the proxy code, or it may be executed as an adjunct to that code.
- While in some cases above a particular order of operations performed by certain embodiments is set forth, it should be understood that such order is exemplary and that they may be performed in a different order, combined, or the like. Moreover, some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.
-
FIG. 26 is a block diagram that illustrates hardware in a computer system 2600 upon which such software may run in order to implement embodiments of the invention. The computer system 2600 may be embodied in a client device, server, personal computer, workstation, tablet computer, wireless device, mobile device, network device, router, hub, gateway, or other device. Representative machines on which the subject matter herein is provided may be Intel Pentium-based computers running a Linux or Linux-variant operating system and one or more applications to carry out the described functionality. -
Computer system 2600 includes a processor 2604 coupled to bus 2601. In some systems, multiple processors and/or processor cores may be employed. Computer system 2600 further includes a main memory 2610, such as a random access memory (RAM) or other storage device, coupled to the bus 2601 for storing information and instructions to be executed by processor 2604. A read only memory (ROM) 2608 is coupled to the bus 2601 for storing information and instructions for processor 2604. A non-volatile storage device 2606, such as a magnetic disk, solid state memory (e.g., flash memory), or optical disk, is provided and coupled to bus 2601 for storing information and instructions. Other application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or circuitry may be included in the computer system 2600 to perform functions described herein. - Although the
computer system 2600 is often managed remotely via a communication interface 2616, for local administration purposes the system 2600 may have a peripheral interface 2612 that communicatively couples computer system 2600 to a user display 2614 that displays the output of software executing on the computer system, and an input device 2615 (e.g., a keyboard, mouse, trackpad, touchscreen) that communicates user input and instructions to the computer system 2600. The peripheral interface 2612 may include interface circuitry, control and/or level-shifting logic for local buses such as RS-485, Universal Serial Bus (USB), IEEE 1394, or other communication links. -
Computer system 2600 is coupled to a communication interface 2616 that provides a link (e.g., at a physical layer, data link layer, or otherwise) between the system bus 2601 and an external communication link. The communication interface 2616 provides a network link 2618. The communication interface 2616 may represent an Ethernet or other network interface card (NIC), a wireless interface, modem, an optical interface, or other kind of input/output interface. -
Network link 2618 provides data communication through one or more networks to other devices. Such devices include other computer systems that are part of a local area network (LAN) 2626. Furthermore, the network link 2618 provides a link, via an internet service provider (ISP) 2620, to the Internet 2622. In turn, the Internet 2622 may provide a link to other computing systems such as a remote server 2630 and/or a remote client 2631. Network link 2618 and such networks may transmit data using packet-switched, circuit-switched, or other data-transmission approaches. - In operation, the
computer system 2600 may implement the functionality described herein as a result of the processor executing code. Such code may be read from or stored on a non-transitory computer-readable medium, such as memory 2610, ROM 2608, or storage device 2606. Other forms of non-transitory computer-readable media include disks, tapes, magnetic media, CD-ROMs, optical media, RAM, PROM, EPROM, and EEPROM. Any other non-transitory computer-readable medium may be employed. Executing code may also be read from network link 2618 (e.g., following storage in an interface buffer, local memory, or other circuitry).
Claims (25)
1. A system, comprising:
a plurality of proxy servers connected to a global computer network that operate to receive requests for content from clients and respond to the requests for content by sending the clients the content they requested;
a management server operable to receive a request to convert a file from a first version to a second version;
the management server operable to create at least first and second segments, each of the segments corresponding to a portion of the file, and send the first segment to a first one of the plurality of proxy servers and the second segment to a second one of the plurality of proxy servers, each of the first and second segments being sent with information about the requested conversion, so that the first and second segments are converted independently by the first and second proxy servers while the first and second proxy servers continue to respond to client requests for content;
wherein the plurality of proxy servers and the management server each comprise circuitry forming at least one processor and memory storing computer-readable instructions that when executed on the at least one processor will cause operation as specified above.
2. The system of claim 1 , wherein the conversion involves changing at least one of:
(a) a codec used to encode data in the file,
(b) a container format of the file,
(c) one or more codec settings used to encode data in the file,
(d) one or more container format settings for the file,
(e) a frame size for data in the file,
(f) an aspect ratio for data in the file,
(g) a bit-rate of encoded data in the file,
(h) an interlacing characteristic for data in the file,
(i) a frame rate for data in the file, and
(j) a picture resolution for data in the file.
3. The system of claim 1 , wherein the conversion involves at least one of:
(a) changing one or more security characteristics of the file,
(b) applying a DRM scheme,
(c) applying encryption,
(d) applying a watermark, and
(e) applying a fingerprint.
4. The system of claim 1 , wherein the first and second proxy servers were selected to participate in performing the requested conversion at least in part because their resource utilization related to servicing client requests for content was lower than that of other proxy servers.
5. The system of claim 1 , wherein each of the plurality of proxy servers is operable to execute a first process providing a proxy function that services client requests for content, and a second process that performs conversions on files sent from the management server, the first process having priority over the second process.
6. The system of claim 1 , wherein at least one of the plurality of proxy servers is operable to send the management server a message indicating that it will not perform the requested conversion, after that proxy server determines that its resource utilization related to servicing client requests exceeds a threshold.
7. The system of claim 1 , wherein the management server is operable to identify proxy servers to use to perform the requested conversion by obtaining a list of one or more candidate proxy servers from a monitoring system associated with the plurality of proxy servers.
8. The system of claim 1 , wherein the request to convert the file is associated with a priority, and the management server decides whether to use the plurality of proxy servers for performing the requested conversion based on the priority of the request.
9. The system of claim 1 , wherein each of the first and second proxy servers operates to perform the requested conversion and return the results to the management server, which re-assembles the results into at least part of the second version of the file.
10. The system of claim 1 , wherein the plurality of proxy servers are HTTP proxy servers and the content for which they receive client requests comprises any of HTML files, web page objects, and streaming media.
11. The system of claim 1 , wherein the file includes one or more of (i) audio data and (ii) video data.
12. The system of claim 1 , further comprising a machine that makes the request to the management server to convert the file, the machine comprising any of: (a) a network storage system, (b) a server providing a user interface to content provider users of the system, and (c) one of the plurality of proxy servers.
13. A method performed by one or more programmed computer machines that comprise circuitry forming one or more processors that execute computer program instructions, and that manage the conversion of content, the method comprising:
receiving a request to convert a file from a first version to a second version;
selecting first and second proxy servers from a plurality of proxy servers that are interconnected via a global computer network, and that are receiving requests for content from clients and responding to the requests for content by sending the clients the content they requested;
creating at least first and second segments, each of the segments corresponding to a portion of the file, and sending the first segment to the first proxy server and the second segment to the second proxy server, each of the first and second segments being sent with information about the requested conversion,
receiving a converted first segment from the first proxy server;
receiving a converted second segment from the second proxy server; and
combining the converted first and second segments to form at least part of the second version of the file.
14. The method of claim 13 , wherein the conversion involves changing at least one of:
(a) a codec used to encode data in the file,
(b) a container format of the file,
(c) one or more codec settings used to encode data in the file,
(d) one or more container format settings for the file,
(e) a frame size for data in the file,
(f) an aspect ratio for data in the file,
(g) a bit-rate of encoded data in the file,
(h) an interlacing characteristic for data in the file,
(i) a frame rate for data in the file, and
(j) a picture resolution for data in the file.
15. The method of claim 13 , wherein the conversion involves at least one of:
(a) changing one or more security characteristics of the file,
(b) applying a DRM scheme,
(c) applying encryption,
(d) applying a watermark, and
(e) applying a fingerprint.
16. The method of claim 13 , wherein the first and second proxy servers are selected at least in part because their resource utilization related to servicing client requests for content is lower than that of other proxy servers.
17. The method of claim 13 , further comprising: receiving from one of the plurality of proxy servers a message indicating that it will not convert a particular segment because its resource utilization related to servicing client requests exceeds a threshold.
18. The method of claim 13 , further comprising: identifying proxy servers to use to perform the requested conversion by obtaining a list of one or more candidate proxy servers from a monitoring system associated with the plurality of proxy servers.
19. The method of claim 13 , wherein the request to convert the file is associated with a priority, and further comprising deciding whether to use the plurality of proxy servers for performing the requested conversion based on the priority of the request.
20. The method of claim 13 , wherein the plurality of proxy servers are HTTP proxy servers and the content for which they receive client requests comprises any of HTML files, web page objects, and streaming media.
21. The method of claim 13 , wherein the file includes one or more of (i) audio data and (ii) video data.
22. The method of claim 13 , further comprising receiving the request to convert the file from any of: (a) a network storage system, (b) a server providing a user interface to content provider users of the system, and (c) one of the plurality of proxy servers.
23. A method performed by programmed computer machines that comprise circuitry forming one or more processors that execute computer program instructions, comprising:
with a plurality of proxy servers that are connected to a global computer network, receiving requests for content from clients and responding to the requests for content by sending the clients the content they requested;
at a first proxy server selected from the plurality of proxy servers, receiving a request to convert a first segment of a file from a first version to a second version, and instructions about the conversion to be performed;
at a second proxy server selected from the plurality of proxy servers, receiving a request to convert a second segment of the file from a first version to a second version and instructions about the conversion to be performed;
the first proxy server converting the first segment from the first version to the second version while continuing to respond to client requests for content, as long as the load on the first proxy server due to the client requests for content does not exceed a threshold;
the first proxy server sending the second version of the first segment to at least one server managing the conversion;
the second proxy server converting the second segment from the first version to the second version while continuing to respond to client requests for content, as long as the load on the second proxy server due to the client requests for content does not exceed a threshold; and
the second proxy server sending the second version of the second segment to the at least one server managing the conversion.
24. The method of claim 23 , wherein the conversion involves changing at least one of:
(a) a codec used to encode data,
(b) a container format,
(c) one or more codec settings,
(d) one or more container format settings,
(e) a frame size,
(f) an aspect ratio,
(g) a bit-rate of encoded data,
(h) an interlacing characteristic,
(i) a frame rate, and
(j) a picture resolution.
25. The method of claim 23 , wherein the conversion involves at least one of:
(a) changing one or more security characteristics,
(b) applying a DRM scheme,
(c) applying encryption,
(d) applying a watermark, and
(e) applying a fingerprint.
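The flow recited in claims 13, 16 and 17 (segment the file, prefer proxy servers with low client-serving load, let a busy proxy server decline, then recombine the converted segments) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the application: the `ProxyServer` class, the `client_load` and `load_threshold` fields, and the functions `split_into_segments` and `convert_file`. Plain byte slicing and an in-process `transform` callable stand in for real segmentation and transcoding, which would have to respect frame boundaries and run on remote machines.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProxyServer:
    """Hypothetical stand-in for a proxy server that also serves clients."""
    name: str
    client_load: float            # assumed: fraction of capacity used by client requests
    load_threshold: float = 0.8   # assumed: above this, conversion work is declined

    def convert(self, segment: bytes,
                transform: Callable[[bytes], bytes]) -> Optional[bytes]:
        # Decline (return None) when client-serving load exceeds the threshold.
        if self.client_load > self.load_threshold:
            return None
        return transform(segment)


def split_into_segments(data: bytes, count: int) -> List[bytes]:
    """Split data into at most `count` contiguous, roughly equal segments."""
    size = max(1, -(-len(data) // count))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def convert_file(data: bytes, proxies: List[ProxyServer],
                 transform: Callable[[bytes], bytes],
                 segment_count: int = 2) -> bytes:
    """Dispatch segments to the least-loaded proxies and recombine the results."""
    segments = split_into_segments(data, segment_count)
    # Prefer proxies whose client-serving load is lowest.
    candidates = sorted(proxies, key=lambda p: p.client_load)
    converted: List[bytes] = []
    for index, segment in enumerate(segments):
        result = None
        # Rotate the starting candidate so segments spread across proxies;
        # fall through to the next candidate when one declines.
        for offset in range(len(candidates)):
            proxy = candidates[(index + offset) % len(candidates)]
            result = proxy.convert(segment, transform)
            if result is not None:
                break
        if result is None:
            raise RuntimeError("no proxy server accepted the segment")
        converted.append(result)
    # Segments are appended in their original order, so joining restores it.
    return b"".join(converted)
```

For example, with proxies loaded at 0.2, 0.9 and 0.5 and `bytes.upper` as the stand-in conversion, `convert_file(b"hello world", proxies, bytes.upper)` routes the two segments to the 0.2 and 0.5 proxies and skips the one above its threshold.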
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/667,267 US20130117418A1 (en) | 2011-11-06 | 2012-11-02 | Hybrid platform for content delivery and transcoding |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161556237P | 2011-11-06 | 2011-11-06 | |
US201161556236P | 2011-11-06 | 2011-11-06 | |
US13/667,267 US20130117418A1 (en) | 2011-11-06 | 2012-11-02 | Hybrid platform for content delivery and transcoding |
Publications (1)
Publication Number | Publication Date |
---|---|
US20130117418A1 (en) | 2013-05-09 |
Family
ID=48223704
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/667,272 Active 2032-12-28 US9432704B2 (en) | 2011-11-06 | 2012-11-02 | Segmented parallel encoding with frame-aware, variable-size chunking |
US13/667,267 Abandoned US20130117418A1 (en) | 2011-11-06 | 2012-11-02 | Hybrid platform for content delivery and transcoding |
US15/219,064 Active US10027997B2 (en) | 2011-11-06 | 2016-07-25 | Techniques for transcoding content in parallel on a plurality of machines |
US15/969,563 Active US10595059B2 (en) | 2011-11-06 | 2018-05-02 | Segmented parallel encoding with frame-aware, variable-size chunking |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/667,272 Active 2032-12-28 US9432704B2 (en) | 2011-11-06 | 2012-11-02 | Segmented parallel encoding with frame-aware, variable-size chunking |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/219,064 Active US10027997B2 (en) | 2011-11-06 | 2016-07-25 | Techniques for transcoding content in parallel on a plurality of machines |
US15/969,563 Active US10595059B2 (en) | 2011-11-06 | 2018-05-02 | Segmented parallel encoding with frame-aware, variable-size chunking |
Country Status (1)
Country | Link |
---|---|
US (4) | US9432704B2 (en) |
Cited By (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140101118A1 (en) * | 2012-10-04 | 2014-04-10 | Telefonaktiebolaget L M Ericsson (Publ) | System and method for creating multiple versions of a descriptor file |
US8826332B2 (en) | 2012-12-21 | 2014-09-02 | Ustudio, Inc. | Media distribution and management platform |
US20140280744A1 (en) * | 2013-03-14 | 2014-09-18 | Charter Communications Operating, Llc | System and method for adapting content delivery |
US20140344398A1 (en) * | 2012-10-15 | 2014-11-20 | Limelight Networks, Inc. | Control systems and methods for cloud resource management |
US20140359166A1 (en) * | 2013-05-31 | 2014-12-04 | Broadcom Corporation | Providing multiple abr streams using a single transcoder |
WO2015077289A1 (en) * | 2013-11-21 | 2015-05-28 | Google Inc. | Transcoding media streams using subchunking |
US20150189222A1 (en) * | 2013-12-30 | 2015-07-02 | Google Inc. | Content-adaptive chunking for distributed transcoding |
US20150244757A1 (en) * | 2012-11-27 | 2015-08-27 | Tencent Technology (Shenzhen) Company Limited | Transcoding Method and System, and Distributed File Apparatus |
US20160065635A1 (en) * | 2014-08-29 | 2016-03-03 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US9288510B1 (en) * | 2014-05-22 | 2016-03-15 | Google Inc. | Adaptive video transcoding based on parallel chunked log analysis |
US20160078241A1 (en) * | 2012-12-21 | 2016-03-17 | Emc Corporation | Generation and use of a modified protected file |
US20160219328A1 (en) * | 2013-03-14 | 2016-07-28 | Comcast Cable Communications, Llc | Allocation of Clamping Functionality |
US9432704B2 (en) | 2011-11-06 | 2016-08-30 | Akamai Technologies Inc. | Segmented parallel encoding with frame-aware, variable-size chunking |
US9467461B2 (en) | 2013-12-21 | 2016-10-11 | Akamai Technologies Inc. | Countering security threats with the domain name system |
US9485456B2 (en) | 2013-12-30 | 2016-11-01 | Akamai Technologies, Inc. | Frame-rate conversion in a distributed computing system |
US20160323351A1 (en) * | 2015-04-29 | 2016-11-03 | Box, Inc. | Low latency and low defect media file transcoding using optimized storage, retrieval, partitioning, and delivery techniques |
US9537955B1 (en) * | 2014-06-10 | 2017-01-03 | EMC IP Holding Company LLC | Sending web content via asynchronous background processes |
WO2017024990A1 (en) * | 2015-08-07 | 2017-02-16 | Mediatek Inc. | Method and apparatus of bitstream random access and synchronization for multimedia applications |
WO2017214510A1 (en) * | 2016-06-10 | 2017-12-14 | Affirmed Networks, Inc. | Transcoding using time stamps |
WO2018075909A1 (en) | 2016-10-21 | 2018-04-26 | Affirmed Networks, Inc. | Adaptive content optimization |
US9986269B1 (en) | 2017-03-03 | 2018-05-29 | Akamai Technologies, Inc. | Maintaining stream continuity counter in a stateless multiplexing system |
US10164989B2 (en) | 2013-03-15 | 2018-12-25 | Nominum, Inc. | Distinguishing human-driven DNS queries from machine-to-machine DNS queries |
US20190028527A1 (en) * | 2017-07-20 | 2019-01-24 | Disney Enterprises, Inc. | Frame-accurate video seeking via web browsers |
US10511864B2 (en) * | 2016-08-31 | 2019-12-17 | Living As One, Llc | System and method for transcoding media stream |
US10531134B2 (en) | 2017-11-10 | 2020-01-07 | Akamai Technologies, Inc. | Determining a time budget for transcoding of video |
US20200034332A1 (en) * | 2014-08-05 | 2020-01-30 | Time Warner Cable Enterprises Llc | Apparatus and methods for lightweight transcoding |
US10764347B1 (en) | 2017-11-22 | 2020-09-01 | Amazon Technologies, Inc. | Framework for time-associated data stream storage, processing, and replication |
CN111918092A (en) * | 2020-08-12 | 2020-11-10 | 广州繁星互娱信息科技有限公司 | Video stream processing method, device, server and storage medium |
US10878028B1 (en) | 2017-11-22 | 2020-12-29 | Amazon Technologies, Inc. | Replicating and indexing fragments of time-associated data streams |
US10944804B1 (en) | 2017-11-22 | 2021-03-09 | Amazon Technologies, Inc. | Fragmentation of time-associated data streams |
US11025691B1 (en) * | 2017-11-22 | 2021-06-01 | Amazon Technologies, Inc. | Consuming fragments of time-associated data streams |
US11032392B1 (en) * | 2019-03-21 | 2021-06-08 | Amazon Technologies, Inc. | Including prior request performance information in requests to schedule subsequent request performance |
US11100051B1 (en) * | 2013-03-15 | 2021-08-24 | Comcast Cable Communications, Llc | Management of content |
US11316909B2 (en) * | 2019-09-26 | 2022-04-26 | Tencent Technology (Shenzhen) Company Limited | Data transmission method and apparatus, and computer storage medium |
US20220141501A1 (en) * | 2019-07-16 | 2022-05-05 | Zhejiang Dahua Technology Co., Ltd. | Systems and methods for live streaming |
CN114845141A (en) * | 2022-04-18 | 2022-08-02 | 上海哔哩哔哩科技有限公司 | Edge transcoding method and device |
US11412272B2 (en) | 2016-08-31 | 2022-08-09 | Resi Media Llc | System and method for converting adaptive stream to downloadable media |
US11470131B2 (en) | 2017-07-07 | 2022-10-11 | Box, Inc. | User device processing of information from a network-accessible collaboration system |
US11765418B1 (en) | 2021-06-29 | 2023-09-19 | Twitch Interactive, Inc. | Seamless transcode server switching |
US11882324B1 (en) * | 2021-09-02 | 2024-01-23 | Amazon Technologies, Inc. | Reconciliation for parallel transcoding |
US12047618B1 (en) | 2022-06-30 | 2024-07-23 | Amazon Technologies, Inc. | Seamless audience-aware encoding profile switching |
US12052447B1 (en) * | 2022-06-27 | 2024-07-30 | Amazon Technologies, Inc. | Dynamically moving transcoding of content between servers |
Families Citing this family (67)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130223509A1 (en) * | 2012-02-28 | 2013-08-29 | Azuki Systems, Inc. | Content network optimization utilizing source media characteristics |
CN103369355A (en) * | 2012-04-10 | 2013-10-23 | 华为技术有限公司 | Online media data conversion method, video playing method and corresponding device |
KR20130123156A (en) * | 2012-05-02 | 2013-11-12 | 삼성전자주식회사 | Apparatas and method for distributed transcoding using for mass servers |
US8806558B1 (en) | 2013-09-20 | 2014-08-12 | Limelight Networks, Inc. | Unique watermarking of content objects according to end user identity |
US9648320B2 (en) * | 2013-02-22 | 2017-05-09 | Comcast Cable Communications, Llc | Methods and systems for processing content |
JP6084080B2 (en) * | 2013-03-15 | 2017-02-22 | キヤノン株式会社 | Imaging device |
CN104365061B (en) * | 2013-05-30 | 2017-12-15 | 华为技术有限公司 | A kind of dispatching method, apparatus and system |
CN104219540B (en) * | 2013-05-30 | 2017-12-15 | 中山市云创知识产权服务有限公司 | Distributing coding/decoding system and method |
ITMI20131710A1 (en) * | 2013-10-15 | 2015-04-16 | Sky Italia S R L | "ENCODING CLOUD SYSTEM" |
CN103731678B (en) * | 2013-12-30 | 2017-02-08 | 世纪龙信息网络有限责任公司 | Video file parallel transcoding method and system |
US9229674B2 (en) | 2014-01-31 | 2016-01-05 | Ebay Inc. | 3D printing: marketplace with federated access to printers |
CN103856559A (en) * | 2014-02-13 | 2014-06-11 | 北京东方通科技股份有限公司 | Working method and system for web services with various versions coexisting |
WO2015120912A1 (en) * | 2014-02-17 | 2015-08-20 | Telefonaktiebolaget L M Ericsson (Publ) | A method and apparatus for allocating physical resources to a summarized resource |
US10423481B2 (en) * | 2014-03-14 | 2019-09-24 | Cisco Technology, Inc. | Reconciling redundant copies of media content |
US10523957B2 (en) * | 2014-10-08 | 2019-12-31 | Vid Scale, Inc. | Optimization using multi-threaded parallel processing framework |
US9595037B2 (en) | 2014-12-16 | 2017-03-14 | Ebay Inc. | Digital rights and integrity management in three-dimensional (3D) printing |
US20160167307A1 (en) * | 2014-12-16 | 2016-06-16 | Ebay Inc. | Systems and methods for 3d digital printing |
EP3241354A4 (en) * | 2014-12-31 | 2018-10-10 | Imagine Communications Corp. | Fragmented video transcoding systems and methods |
US11582202B2 (en) * | 2015-02-16 | 2023-02-14 | Arebus, LLC | System, method and application for transcoding data into media files |
US9769234B2 (en) * | 2015-02-20 | 2017-09-19 | Disney Enterprises, Inc. | Algorithmic transcoding |
EP3262523B1 (en) * | 2015-02-27 | 2019-12-04 | DivX, LLC | System and method for frame duplication and frame extension in live video encoding and streaming |
US10599609B1 (en) * | 2015-03-31 | 2020-03-24 | EMC IP Holding Company LLC | Method and system for elastic, distributed transcoding |
CN104834722B (en) * | 2015-05-12 | 2018-03-02 | 网宿科技股份有限公司 | Content Management System based on CDN |
US10951914B2 (en) * | 2015-08-27 | 2021-03-16 | Intel Corporation | Reliable large group of pictures (GOP) file streaming to wireless displays |
US10506235B2 (en) | 2015-09-11 | 2019-12-10 | Facebook, Inc. | Distributed control of video encoding speeds |
US10602157B2 (en) | 2015-09-11 | 2020-03-24 | Facebook, Inc. | Variable bitrate control for distributed video encoding |
US10499070B2 (en) * | 2015-09-11 | 2019-12-03 | Facebook, Inc. | Key frame placement for distributed video encoding |
US10602153B2 (en) | 2015-09-11 | 2020-03-24 | Facebook, Inc. | Ultra-high video compression |
US10375156B2 (en) | 2015-09-11 | 2019-08-06 | Facebook, Inc. | Using worker nodes in a distributed video encoding system |
US10063872B2 (en) | 2015-09-11 | 2018-08-28 | Facebook, Inc. | Segment based encoding of video |
US10341561B2 (en) | 2015-09-11 | 2019-07-02 | Facebook, Inc. | Distributed image stabilization |
US10447751B2 (en) | 2015-09-28 | 2019-10-15 | Sony Corporation | Parallel transcoding directly from file identifier |
EP3160145A1 (en) * | 2015-10-20 | 2017-04-26 | Harmonic Inc. | Edge server for the distribution of video content available in multiple representations with enhanced open-gop transcoding |
US9954816B2 (en) * | 2015-11-02 | 2018-04-24 | Nominum, Inc. | Delegation of content delivery to a local service |
CN106686406B (en) * | 2015-11-05 | 2019-05-17 | 中国电信股份有限公司 | For realizing the pretreated method and apparatus of video real-time transcoding |
US11449365B2 (en) * | 2016-01-04 | 2022-09-20 | Trilio Data Inc. | Ubiquitous and elastic workload orchestration architecture of hybrid applications/services on hybrid cloud |
EP3400708B1 (en) * | 2016-01-04 | 2021-06-30 | Telefonaktiebolaget LM Ericsson (publ) | Improved network recording apparatus |
US10944806B2 (en) * | 2016-06-22 | 2021-03-09 | The Directv Group, Inc. | Method to insert program boundaries in linear video for adaptive bitrate streaming |
GB2552944B (en) | 2016-08-09 | 2022-07-27 | V Nova Int Ltd | Adaptive content delivery network |
CN106101710A (en) * | 2016-08-26 | 2016-11-09 | 珠海迈科智能科技股份有限公司 | A kind of distributed video transcoding method and device |
US10209892B2 (en) | 2016-11-28 | 2019-02-19 | Hewlett Packard Enterprise Development Lp | Storage of format-aware filter format tracking states |
CN108604146B (en) * | 2017-01-05 | 2021-07-16 | 深圳市汇顶科技股份有限公司 | Touch device and method for determining capacitance induction quantity of touch device |
KR20180093441A (en) | 2017-02-13 | 2018-08-22 | 주식회사 마크애니 | Watermark embedding apparatus and method through image structure conversion |
JP6472478B2 (en) * | 2017-04-07 | 2019-02-20 | キヤノン株式会社 | Video distribution apparatus, video distribution method, and program |
US10565168B2 (en) * | 2017-05-02 | 2020-02-18 | Oxygen Cloud, Inc. | Independent synchronization with state transformation |
US10349108B2 (en) * | 2017-08-24 | 2019-07-09 | Mobitv, Inc. | System and method for storing multimedia files using an archive file format |
US10877798B2 (en) * | 2017-08-31 | 2020-12-29 | Netflix, Inc. | Scalable techniques for executing custom algorithms on media items |
US10764391B2 (en) | 2017-09-14 | 2020-09-01 | Akamai Technologies, Inc. | Origin and cache server cooperation for compute-intensive content delivery |
US10623787B1 (en) * | 2017-11-01 | 2020-04-14 | Amazon Technologies, Inc. | Optimizing adaptive bit rate streaming for content delivery |
US10848538B2 (en) | 2017-11-28 | 2020-11-24 | Cisco Technology, Inc. | Synchronized source selection for adaptive bitrate (ABR) encoders |
US10659512B1 (en) | 2017-12-05 | 2020-05-19 | Amazon Technologies, Inc. | Optimizing adaptive bit rate streaming at edge locations |
US10581948B2 (en) | 2017-12-07 | 2020-03-03 | Akamai Technologies, Inc. | Client side cache visibility with TLS session tickets |
US10439925B2 (en) | 2017-12-21 | 2019-10-08 | Akamai Technologies, Inc. | Sandbox environment for testing integration between a content provider origin and a content delivery network |
US20190197186A1 (en) * | 2017-12-21 | 2019-06-27 | Mastercard International Incorporated | Computer-implemented methods, systems comprising computer-readable media, and electronic devices for automated transcode lifecycle buffering |
US10911792B2 (en) * | 2017-12-22 | 2021-02-02 | Facebook, Inc. | Systems and methods for determining processing completeness within a distributed media item processing environment |
US11019368B2 (en) * | 2018-04-26 | 2021-05-25 | Phenix Real Time Solutions, Inc. | Adaptive bit-rate methods for live broadcasting |
US10630748B1 (en) | 2018-05-01 | 2020-04-21 | Amazon Technologies, Inc. | Video-based encoder alignment |
US10958987B1 (en) | 2018-05-01 | 2021-03-23 | Amazon Technologies, Inc. | Matching based on video data |
US10630990B1 (en) | 2018-05-01 | 2020-04-21 | Amazon Technologies, Inc. | Encoder output responsive to quality metric information |
CN110213598B (en) * | 2018-05-31 | 2021-10-15 | 腾讯科技(深圳)有限公司 | Video transcoding system and method and related product |
US10820066B2 (en) | 2018-06-20 | 2020-10-27 | Cisco Technology, Inc. | Reconciling ABR segments across redundant sites |
US10798393B2 (en) * | 2018-07-09 | 2020-10-06 | Hulu, LLC | Two pass chunk parallel transcoding process |
US10880354B2 (en) | 2018-11-28 | 2020-12-29 | Netflix, Inc. | Techniques for encoding a media title while constraining quality variations |
US10841356B2 (en) | 2018-11-28 | 2020-11-17 | Netflix, Inc. | Techniques for encoding a media title while constraining bitrate variations |
GB2576798B (en) * | 2019-01-04 | 2022-08-10 | Ava Video Security Ltd | Video stream batching |
FI131130B1 (en) * | 2021-08-25 | 2024-10-21 | Lempea Oy | Transfer system for transferring streaming data, and method for transferring streaming data |
US20230108435A1 (en) * | 2021-10-06 | 2023-04-06 | Tencent America LLC | Method and apparatus for parallel transcoding of media content on cloud |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5935207A (en) * | 1996-06-03 | 1999-08-10 | Webtv Networks, Inc. | Method and apparatus for providing remote site administrators with user hits on mirrored web sites |
US20030158913A1 (en) * | 2002-02-15 | 2003-08-21 | Agnoli Giovanni M. | System, method, and computer program product for media publishing request processing |
US20040237097A1 (en) * | 2003-05-19 | 2004-11-25 | Michele Covell | Method for adapting service location placement based on recent data received from service nodes and actions of the service location manager |
US20100131674A1 (en) * | 2008-11-17 | 2010-05-27 | Clearleap, Inc. | Network transcoding system |
US20110107185A1 (en) * | 2009-10-30 | 2011-05-05 | Cleversafe, Inc. | Media content distribution in a social network utilizing dispersed storage |
US20120331089A1 (en) * | 2011-06-21 | 2012-12-27 | Net Power And Light, Inc. | Just-in-time transcoding of application content |
Family Cites Families (95)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5842033A (en) * | 1992-06-30 | 1998-11-24 | Discovision Associates | Padding apparatus for passing an arbitrary number of bits through a buffer in a pipeline system |
US5748786A (en) * | 1994-09-21 | 1998-05-05 | Ricoh Company, Ltd. | Apparatus for compression using reversible embedded wavelets |
JP2795223B2 (en) * | 1995-07-18 | 1998-09-10 | 日本電気株式会社 | Image signal encoding method |
US6101276A (en) * | 1996-06-21 | 2000-08-08 | Compaq Computer Corporation | Method and apparatus for performing two pass quality video compression through pipelining and buffer management |
GB9703470D0 (en) * | 1997-02-19 | 1997-04-09 | Thomson Consumer Electronics | Trick play reproduction of MPEG encoded signals |
US6011868A (en) * | 1997-04-04 | 2000-01-04 | Hewlett-Packard Company | Bitstream quality analyzer |
US6108703A (en) | 1998-07-14 | 2000-08-22 | Massachusetts Institute Of Technology | Global hosting system |
US6483543B1 (en) * | 1998-07-27 | 2002-11-19 | Cisco Technology, Inc. | System and method for transcoding multiple channels of compressed video streams using a self-contained data unit |
US6466624B1 (en) * | 1998-10-28 | 2002-10-15 | Pixonics, Llc | Video decoder with bit stream based enhancements |
US6570926B1 (en) * | 1999-02-25 | 2003-05-27 | Telcordia Technologies, Inc. | Active techniques for video transmission and playback |
US7590739B2 (en) | 1999-11-22 | 2009-09-15 | Akamai Technologies, Inc. | Distributed on-demand computing system |
US6665726B1 (en) | 2000-01-06 | 2003-12-16 | Akamai Technologies, Inc. | Method and system for fault tolerant media streaming over the internet |
US20010047517A1 (en) | 2000-02-10 | 2001-11-29 | Charilaos Christopoulos | Method and apparatus for intelligent transcoding of multimedia data |
US6813387B1 (en) * | 2000-02-29 | 2004-11-02 | Ricoh Co., Ltd. | Tile boundary artifact removal for arbitrary wavelet filters |
AU780811B2 (en) | 2000-03-13 | 2005-04-21 | Sony Corporation | Method and apparatus for generating compact transcoding hints metadata |
US6662329B1 (en) * | 2000-03-23 | 2003-12-09 | International Business Machines Corporation | Processing errors in MPEG data as it is sent to a fixed storage device |
US7240100B1 (en) | 2000-04-14 | 2007-07-03 | Akamai Technologies, Inc. | Content delivery network (CDN) content server request handling mechanism with metadata framework support |
US6996616B1 (en) | 2000-04-17 | 2006-02-07 | Akamai Technologies, Inc. | HTML delivery from edge-of-network servers in a content delivery network (CDN) |
GB2362533A (en) * | 2000-05-15 | 2001-11-21 | Nokia Mobile Phones Ltd | Encoding a video signal with an indicator of the type of error concealment used |
US7111057B1 (en) | 2000-10-31 | 2006-09-19 | Akamai Technologies, Inc. | Method and system for purging content from a content delivery network |
KR100386583B1 (en) * | 2000-11-30 | 2003-06-02 | 엘지전자 주식회사 | Apparatus and method for transcoding video |
US20020143798A1 (en) | 2001-04-02 | 2002-10-03 | Akamai Technologies, Inc. | Highly available distributed storage system for internet content with storage site redirection |
US6671322B2 (en) * | 2001-05-11 | 2003-12-30 | Mitsubishi Electric Research Laboratories, Inc. | Video transcoder with spatial resolution reduction |
US20060236221A1 (en) | 2001-06-27 | 2006-10-19 | Mci, Llc. | Method and system for providing digital media management using templates and profiles |
US8990214B2 (en) | 2001-06-27 | 2015-03-24 | Verizon Patent And Licensing Inc. | Method and system for providing distributed editing and storage of digital media over a network |
US7035332B2 (en) * | 2001-07-31 | 2006-04-25 | Wis Technologies, Inc. | DCT/IDCT with minimum multiplication |
US7693220B2 (en) * | 2002-01-03 | 2010-04-06 | Nokia Corporation | Transmission of video information |
JP2006502465A (en) | 2002-01-11 | 2006-01-19 | アカマイ テクノロジーズ インコーポレイテッド | Java application framework for use in content distribution network (CDN) |
FR2834852B1 (en) * | 2002-01-16 | 2004-06-18 | Canon Kk | METHOD AND DEVICE FOR TIME SEGMENTATION OF A VIDEO SEQUENCE |
MXPA04008889A (en) * | 2002-03-15 | 2004-11-26 | Nokia Corp | Method for coding motion in a video sequence. |
US7236521B2 (en) * | 2002-03-27 | 2007-06-26 | Scientific-Atlanta, Inc. | Digital stream transcoder |
US7133905B2 (en) | 2002-04-09 | 2006-11-07 | Akamai Technologies, Inc. | Method and system for tiered distribution in a content delivery network |
US7787539B2 (en) * | 2002-07-17 | 2010-08-31 | Broadcom Corporation | Decoding and presentation time stamps for MPEG-4 advanced video coding |
US20040093419A1 (en) | 2002-10-23 | 2004-05-13 | Weihl William E. | Method and system for secure content delivery |
US7603689B2 (en) * | 2003-06-13 | 2009-10-13 | Microsoft Corporation | Fast start-up for digital video streams |
US7010044B2 (en) * | 2003-07-18 | 2006-03-07 | Lsi Logic Corporation | Intra 4×4 modes 3, 7 and 8 availability determination intra estimation and compensation |
US7233622B2 (en) * | 2003-08-12 | 2007-06-19 | Lsi Corporation | Reduced complexity efficient binarization method and/or circuit for motion vector residuals |
TWI236605B (en) * | 2003-10-02 | 2005-07-21 | Pixart Imaging Inc | Data flow conversion method and its buffer device |
US7602849B2 (en) * | 2003-11-17 | 2009-10-13 | Lsi Corporation | Adaptive reference picture selection based on inter-picture motion measurement |
KR100526189B1 (en) * | 2004-02-14 | 2005-11-03 | 삼성전자주식회사 | Transcoding system and method for keeping timing parameters constant after transcoding |
US20050232497A1 (en) * | 2004-04-15 | 2005-10-20 | Microsoft Corporation | High-fidelity transcoding |
US7818444B2 (en) | 2004-04-30 | 2010-10-19 | Move Networks, Inc. | Apparatus, system, and method for multi-bitrate content streaming |
CA2600750A1 (en) * | 2005-03-10 | 2006-09-21 | Qualcomm Incorporated | Content adaptive multimedia processing |
US20060256860A1 (en) * | 2005-05-16 | 2006-11-16 | Gordon Stephen E | Transcoding with look-ahead |
US20070058730A1 (en) * | 2005-09-09 | 2007-03-15 | Microsoft Corporation | Media stream error correction |
US8447121B2 (en) * | 2005-09-14 | 2013-05-21 | Microsoft Corporation | Efficient integrated digital video transcoding |
US8879856B2 (en) * | 2005-09-27 | 2014-11-04 | Qualcomm Incorporated | Content driven transcoder that orchestrates multimedia transcoding using content information |
JP4534971B2 (en) * | 2005-11-28 | 2010-09-01 | ソニー株式会社 | Moving picture recording apparatus, moving picture recording method, moving picture transmission method, moving picture recording method program, and recording medium recording the moving picture recording method program |
US8665943B2 (en) * | 2005-12-07 | 2014-03-04 | Sony Corporation | Encoding device, encoding method, encoding program, decoding device, decoding method, and decoding program |
US8566887B2 (en) * | 2005-12-09 | 2013-10-22 | Time Warner Cable Enterprises Llc | Caption data delivery apparatus and methods |
US20070160137A1 (en) * | 2006-01-09 | 2007-07-12 | Nokia Corporation | Error resilient mode decision in scalable video coding |
US7865898B2 (en) | 2006-01-27 | 2011-01-04 | Oracle America, Inc. | Repartitioning parallel SVM computations using dynamic timeout |
US8320450B2 (en) | 2006-03-29 | 2012-11-27 | Vidyo, Inc. | System and method for transcoding between scalable and non-scalable video codecs |
US8582663B2 (en) * | 2006-08-08 | 2013-11-12 | Core Wireless Licensing S.A.R.L. | Method, device, and system for multiplexing of video streams |
US8180920B2 (en) | 2006-10-13 | 2012-05-15 | Rgb Networks, Inc. | System and method for processing content |
US20130166580A1 (en) * | 2006-12-13 | 2013-06-27 | Quickplay Media Inc. | Media Processor |
WO2008114393A1 (en) * | 2007-03-19 | 2008-09-25 | Fujitsu Limited | Bit stream converting method, bit stream converting device, bit stream coupling device, bit stream dividing program, bit stream converting program and bit stream coupling program |
US20090083811A1 (en) | 2007-09-26 | 2009-03-26 | Verivue, Inc. | Unicast Delivery of Multimedia Content |
US20090161766A1 (en) * | 2007-12-21 | 2009-06-25 | Novafora, Inc. | System and Method for Processing Video Content Having Redundant Pixel Values |
US8542748B2 (en) | 2008-03-28 | 2013-09-24 | Sharp Laboratories Of America, Inc. | Methods and systems for parallel video encoding and decoding |
US8908763B2 (en) * | 2008-06-25 | 2014-12-09 | Qualcomm Incorporated | Fragmented reference in temporal compression for video coding |
CN101662622B (en) * | 2008-08-28 | 2012-05-23 | 鸿富锦精密工业(深圳)有限公司 | Electronic photo frame with picture-in-picture display function and method |
US9060187B2 (en) | 2008-12-22 | 2015-06-16 | Netflix, Inc. | Bit rate stream switching |
US9906757B2 (en) | 2009-02-26 | 2018-02-27 | Akamai Technologies, Inc. | Deterministically skewing synchronized events for content streams |
US9565397B2 (en) | 2009-02-26 | 2017-02-07 | Akamai Technologies, Inc. | Deterministically skewing transmission of content streams |
CN101848383A (en) * | 2009-03-24 | 2010-09-29 | 虹软(上海)科技有限公司 | Downsampling decoding method for MPEG2-format video |
US8687685B2 (en) * | 2009-04-14 | 2014-04-01 | Qualcomm Incorporated | Efficient transcoding of B-frames to P-frames |
US20120076203A1 (en) * | 2009-05-29 | 2012-03-29 | Mitsubishi Electric Corporation | Video encoding device, video decoding device, video encoding method, and video decoding method |
US20110010690A1 (en) * | 2009-07-07 | 2011-01-13 | Howard Robert S | System and Method of Automatically Transforming Serial Streaming Programs Into Parallel Streaming Programs |
CN102792291B (en) | 2009-08-17 | 2015-11-25 | 阿卡麦科技公司 | Method and system for HTTP-based stream delivery |
US8879623B2 (en) * | 2009-09-02 | 2014-11-04 | Sony Computer Entertainment Inc. | Picture-level rate control for video encoding a scene-change I picture |
US20110069756A1 (en) * | 2009-09-23 | 2011-03-24 | Alcatel-Lucent Usa Inc. | Predictive encoding/decoding method and apparatus |
US20110296048A1 (en) | 2009-12-28 | 2011-12-01 | Akamai Technologies, Inc. | Method and system for stream handling using an intermediate format |
US8542737B2 (en) * | 2010-03-21 | 2013-09-24 | Human Monitoring Ltd. | Intra video image compression and decompression |
US20110280311A1 (en) * | 2010-05-13 | 2011-11-17 | Qualcomm Incorporated | One-stream coding for asymmetric stereo video |
CN102939719B (en) * | 2010-05-21 | 2016-08-03 | 黑莓有限公司 | Methods and devices for reducing sources in binary entropy coding and decoding |
US20120014433A1 (en) * | 2010-07-15 | 2012-01-19 | Qualcomm Incorporated | Entropy coding of bins across bin groups using variable length codewords |
GB2483294B (en) * | 2010-09-03 | 2013-01-02 | Canon Kk | Method and device for motion estimation of video data coded according to a scalable coding structure |
US20120075436A1 (en) * | 2010-09-24 | 2012-03-29 | Qualcomm Incorporated | Coding stereo video data |
US20120265853A1 (en) | 2010-12-17 | 2012-10-18 | Akamai Technologies, Inc. | Format-agnostic streaming architecture using an http network for streaming |
US8880633B2 (en) | 2010-12-17 | 2014-11-04 | Akamai Technologies, Inc. | Proxy server with byte-based include interpreter |
US8583818B2 (en) | 2011-01-31 | 2013-11-12 | Cbs Interactive Inc. | System and method for custom segmentation for streaming video |
US20120236940A1 (en) * | 2011-03-16 | 2012-09-20 | Texas Instruments Incorporated | Method for Efficient Parallel Processing for Real-Time Video Coding |
US20120254280A1 (en) * | 2011-04-04 | 2012-10-04 | Parker Ii Lansing Arthur | Method and system for distributed computing using mobile devices |
US8788683B2 (en) * | 2011-08-17 | 2014-07-22 | The Nasdaq Omx Group, Inc. | Scalable transcoding for streaming audio |
US9432704B2 (en) | 2011-11-06 | 2016-08-30 | Akamai Technologies Inc. | Segmented parallel encoding with frame-aware, variable-size chunking |
US8325821B1 (en) | 2012-02-08 | 2012-12-04 | Vyumix, Inc. | Video transcoder stream multiplexing systems and methods |
US9246741B2 (en) | 2012-04-11 | 2016-01-26 | Google Inc. | Scalable, live transcoding with support for adaptive streaming and failover |
US9386331B2 (en) | 2012-07-26 | 2016-07-05 | Mobitv, Inc. | Optimizing video clarity |
US9721036B2 (en) * | 2012-08-14 | 2017-08-01 | Microsoft Technology Licensing, Llc | Cooperative web browsing using multiple devices |
US9264478B2 (en) * | 2012-10-30 | 2016-02-16 | Microsoft Technology Licensing, Llc | Home cloud with virtualized input and output roaming over network |
US20140140417A1 (en) | 2012-11-16 | 2014-05-22 | Gary K. Shaffer | System and method for providing alignment of multiple transcoders for adaptive bitrate streaming in a network environment |
US9924164B2 (en) | 2013-01-03 | 2018-03-20 | Disney Enterprises, Inc. | Efficient re-transcoding of key-frame-aligned unencrypted assets |
US9294530B2 (en) | 2013-05-24 | 2016-03-22 | Cisco Technology, Inc. | Producing equivalent content across encapsulators in a network environment |
US20150063435A1 (en) * | 2013-08-30 | 2015-03-05 | Barry Benight | Techniques for reference based transcoding |
- 2012
  - 2012-11-02 US13/667,272 (US9432704B2): Active
  - 2012-11-02 US13/667,267 (US20130117418A1): Abandoned
- 2016
  - 2016-07-25 US15/219,064 (US10027997B2): Active
- 2018
  - 2018-05-02 US15/969,563 (US10595059B2): Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5935207A (en) * | 1996-06-03 | 1999-08-10 | Webtv Networks, Inc. | Method and apparatus for providing remote site administrators with user hits on mirrored web sites |
US20030158913A1 (en) * | 2002-02-15 | 2003-08-21 | Agnoli Giovanni M. | System, method, and computer program product for media publishing request processing |
US20040237097A1 (en) * | 2003-05-19 | 2004-11-25 | Michele Covell | Method for adapting service location placement based on recent data received from service nodes and actions of the service location manager |
US20100131674A1 (en) * | 2008-11-17 | 2010-05-27 | Clearleap, Inc. | Network transcoding system |
US20110107185A1 (en) * | 2009-10-30 | 2011-05-05 | Cleversafe, Inc. | Media content distribution in a social network utilizing dispersed storage |
US20120331089A1 (en) * | 2011-06-21 | 2012-12-27 | Net Power And Light, Inc. | Just-in-time transcoding of application content |
Cited By (89)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9432704B2 (en) | 2011-11-06 | 2016-08-30 | Akamai Technologies Inc. | Segmented parallel encoding with frame-aware, variable-size chunking |
US20140101118A1 (en) * | 2012-10-04 | 2014-04-10 | Telefonaktiebolaget L M Ericsson (Publ) | System and method for creating multiple versions of a descriptor file |
US8949206B2 (en) * | 2012-10-04 | 2015-02-03 | Ericsson Television Inc. | System and method for creating multiple versions of a descriptor file |
US20140344398A1 (en) * | 2012-10-15 | 2014-11-20 | Limelight Networks, Inc. | Control systems and methods for cloud resource management |
US20150244757A1 (en) * | 2012-11-27 | 2015-08-27 | Tencent Technology (Shenzhen) Company Limited | Transcoding Method and System, and Distributed File Apparatus |
US10291673B2 (en) * | 2012-11-27 | 2019-05-14 | Tencent Technology (Shenzhen) Company Limited | Transcoding method and system, and distributed file apparatus |
US11570491B2 (en) | 2012-12-21 | 2023-01-31 | Ustudio, Inc. | Media distribution and management platform |
US8826332B2 (en) | 2012-12-21 | 2014-09-02 | Ustudio, Inc. | Media distribution and management platform |
US11303941B2 (en) | 2012-12-21 | 2022-04-12 | Ustudio, Inc. | Media distribution and management platform |
US9501212B2 (en) | 2012-12-21 | 2016-11-22 | Ustudio, Inc | Media distribution and management platform |
US10771825B2 (en) | 2012-12-21 | 2020-09-08 | Ustudio, Inc. | Media distribution and management platform |
US9811675B2 (en) * | 2012-12-21 | 2017-11-07 | EMC IP Holding Company LLC | Generation and use of a modified protected file |
US20160078241A1 (en) * | 2012-12-21 | 2016-03-17 | Emc Corporation | Generation and use of a modified protected file |
US11792469B2 (en) * | 2013-03-14 | 2023-10-17 | Comcast Cable Communications, Llc | Allocation of video recording functionality |
US20180098113A1 (en) * | 2013-03-14 | 2018-04-05 | Comcast Cable Communications, Llc | Allocation of Video Recording Functionality |
US20160219328A1 (en) * | 2013-03-14 | 2016-07-28 | Comcast Cable Communications, Llc | Allocation of Clamping Functionality |
US10051024B2 (en) * | 2013-03-14 | 2018-08-14 | Charter Communications Operating, Llc | System and method for adapting content delivery |
US9729914B2 (en) * | 2013-03-14 | 2017-08-08 | Comcast Cable Communications, Llc | Allocation of video recording functionality |
US20140280744A1 (en) * | 2013-03-14 | 2014-09-18 | Charter Communications Operating, Llc | System and method for adapting content delivery |
US11100051B1 (en) * | 2013-03-15 | 2021-08-24 | Comcast Cable Communications, Llc | Management of content |
US10164989B2 (en) | 2013-03-15 | 2018-12-25 | Nominum, Inc. | Distinguishing human-driven DNS queries from machine-to-machine DNS queries |
US20140359166A1 (en) * | 2013-05-31 | 2014-12-04 | Broadcom Corporation | Providing multiple abr streams using a single transcoder |
US9544665B2 (en) * | 2013-05-31 | 2017-01-10 | Broadcom Corporation | Providing multiple ABR streams using a single transcoder |
JP2016541178A (en) * | 2013-11-21 | 2016-12-28 | グーグル インコーポレイテッド | Transcoding media streams using subchunking |
CN106063278A (en) * | 2013-11-21 | 2016-10-26 | 谷歌公司 | Transcoding media streams using subchunking |
US9179183B2 (en) | 2013-11-21 | 2015-11-03 | Google Inc. | Transcoding media streams using subchunking |
WO2015077289A1 (en) * | 2013-11-21 | 2015-05-28 | Google Inc. | Transcoding media streams using subchunking |
US9467461B2 (en) | 2013-12-21 | 2016-10-11 | Akamai Technologies Inc. | Countering security threats with the domain name system |
JP2017507533A (en) * | 2013-12-30 | 2017-03-16 | グーグル インコーポレイテッド | Content adaptive chunking for distributed transcoding |
US9485456B2 (en) | 2013-12-30 | 2016-11-01 | Akamai Technologies, Inc. | Frame-rate conversion in a distributed computing system |
US20150189222A1 (en) * | 2013-12-30 | 2015-07-02 | Google Inc. | Content-adaptive chunking for distributed transcoding |
US9510028B2 (en) * | 2014-05-22 | 2016-11-29 | Google Inc. | Adaptive video transcoding based on parallel chunked log analysis |
US9288510B1 (en) * | 2014-05-22 | 2016-03-15 | Google Inc. | Adaptive video transcoding based on parallel chunked log analysis |
US9537955B1 (en) * | 2014-06-10 | 2017-01-03 | EMC IP Holding Company LLC | Sending web content via asynchronous background processes |
US20200034332A1 (en) * | 2014-08-05 | 2020-01-30 | Time Warner Cable Enterprises Llc | Apparatus and methods for lightweight transcoding |
US11863606B2 (en) | 2014-08-29 | 2024-01-02 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US9923942B2 (en) * | 2014-08-29 | 2018-03-20 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US10855735B2 (en) * | 2014-08-29 | 2020-12-01 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US20160065635A1 (en) * | 2014-08-29 | 2016-03-03 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US11218528B2 (en) * | 2014-08-29 | 2022-01-04 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US10341401B2 (en) * | 2014-08-29 | 2019-07-02 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US11522932B2 (en) | 2014-08-29 | 2022-12-06 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US20190327282A1 (en) * | 2014-08-29 | 2019-10-24 | The Nielsen Company (Us), Llc | Using messaging associated with adaptive bitrate streaming to perform media monitoring for mobile platforms |
US10409781B2 (en) | 2015-04-29 | 2019-09-10 | Box, Inc. | Multi-regime caching in a virtual file system for cloud-based shared content |
US10402376B2 (en) | 2015-04-29 | 2019-09-03 | Box, Inc. | Secure cloud-based shared content |
US11663168B2 (en) | 2015-04-29 | 2023-05-30 | Box, Inc. | Virtual file system for cloud-based shared content |
US20160323351A1 (en) * | 2015-04-29 | 2016-11-03 | Box, Inc. | Low latency and low defect media file transcoding using optimized storage, retrieval, partitioning, and delivery techniques |
US10929353B2 (en) | 2015-04-29 | 2021-02-23 | Box, Inc. | File tree streaming in a virtual file system for cloud-based shared content |
US10180947B2 (en) | 2015-04-29 | 2019-01-15 | Box, Inc. | File-agnostic data downloading in a virtual file system for cloud-based shared content |
US10866932B2 (en) | 2015-04-29 | 2020-12-15 | Box, Inc. | Operation mapping in a virtual file system for cloud-based shared content |
US10942899B2 (en) | 2015-04-29 | 2021-03-09 | Box, Inc. | Virtual file system for cloud-based shared content |
WO2017024990A1 (en) * | 2015-08-07 | 2017-02-16 | Mediatek Inc. | Method and apparatus of bitstream random access and synchronization for multimedia applications |
US10924820B2 (en) | 2015-08-07 | 2021-02-16 | Mediatek Inc. | Method and apparatus of bitstream random access and synchronization for multimedia applications |
US10165310B2 (en) | 2016-06-10 | 2018-12-25 | Affirmed Networks, Inc. | Transcoding using time stamps |
WO2017214510A1 (en) * | 2016-06-10 | 2017-12-14 | Affirmed Networks, Inc. | Transcoding using time stamps |
US10511864B2 (en) * | 2016-08-31 | 2019-12-17 | Living As One, Llc | System and method for transcoding media stream |
US11758200B2 (en) | 2016-08-31 | 2023-09-12 | Resi Media Llc | System and method for converting adaptive stream to downloadable media |
US12096045B2 (en) | 2016-08-31 | 2024-09-17 | Resi Media Llc | System and method for transcoding media stream |
US10951925B2 (en) | 2016-08-31 | 2021-03-16 | Resi Media Llc | System and method for transcoding media stream |
US12088859B2 (en) | 2016-08-31 | 2024-09-10 | Resi Media Llc | System and method for converting adaptive stream to downloadable media |
US11412272B2 (en) | 2016-08-31 | 2022-08-09 | Resi Media Llc | System and method for converting adaptive stream to downloadable media |
US11936923B1 (en) | 2016-08-31 | 2024-03-19 | Resi Media Llc | System and method for transcoding media stream |
US11736739B2 (en) | 2016-08-31 | 2023-08-22 | Resi Media Llc | System and method for transcoding media stream |
US11405665B1 (en) | 2016-08-31 | 2022-08-02 | Resi Media Llc | System and method for asynchronous uploading of live digital multimedia with resumable connections |
US11405661B2 (en) | 2016-08-31 | 2022-08-02 | Resi Media Llc | System and method for transcoding media stream |
EP3529990A4 (en) * | 2016-10-21 | 2020-05-06 | Affirmed Networks, Inc. | Adaptive content optimization |
US10129355B2 (en) | 2016-10-21 | 2018-11-13 | Affirmed Networks, Inc. | Adaptive content optimization |
WO2018075909A1 (en) | 2016-10-21 | 2018-04-26 | Affirmed Networks, Inc. | Adaptive content optimization |
US9986269B1 (en) | 2017-03-03 | 2018-05-29 | Akamai Technologies, Inc. | Maintaining stream continuity counter in a stateless multiplexing system |
US11962627B2 (en) | 2017-07-07 | 2024-04-16 | Box, Inc. | User device processing of information from a network-accessible collaboration system |
US11470131B2 (en) | 2017-07-07 | 2022-10-11 | Box, Inc. | User device processing of information from a network-accessible collaboration system |
US11146608B2 (en) * | 2017-07-20 | 2021-10-12 | Disney Enterprises, Inc. | Frame-accurate video seeking via web browsers |
US11722542B2 (en) | 2017-07-20 | 2023-08-08 | Disney Enterprises, Inc. | Frame-accurate video seeking via web browsers |
US20190028527A1 (en) * | 2017-07-20 | 2019-01-24 | Disney Enterprises, Inc. | Frame-accurate video seeking via web browsers |
US10531134B2 (en) | 2017-11-10 | 2020-01-07 | Akamai Technologies, Inc. | Determining a time budget for transcoding of video |
US10764347B1 (en) | 2017-11-22 | 2020-09-01 | Amazon Technologies, Inc. | Framework for time-associated data stream storage, processing, and replication |
US10878028B1 (en) | 2017-11-22 | 2020-12-29 | Amazon Technologies, Inc. | Replicating and indexing fragments of time-associated data streams |
US11025691B1 (en) * | 2017-11-22 | 2021-06-01 | Amazon Technologies, Inc. | Consuming fragments of time-associated data streams |
US10944804B1 (en) | 2017-11-22 | 2021-03-09 | Amazon Technologies, Inc. | Fragmentation of time-associated data streams |
US11032392B1 (en) * | 2019-03-21 | 2021-06-08 | Amazon Technologies, Inc. | Including prior request performance information in requests to schedule subsequent request performance |
US11849157B2 (en) * | 2019-07-16 | 2023-12-19 | Zhejiang Dahua Technology Co., Ltd. | Systems and methods for live streaming |
US20220141501A1 (en) * | 2019-07-16 | 2022-05-05 | Zhejiang Dahua Technology Co., Ltd. | Systems and methods for live streaming |
US11316909B2 (en) * | 2019-09-26 | 2022-04-26 | Tencent Technology (Shenzhen) Company Limited | Data transmission method and apparatus, and computer storage medium |
CN111918092A (en) * | 2020-08-12 | 2020-11-10 | 广州繁星互娱信息科技有限公司 | Video stream processing method, device, server and storage medium |
US11765418B1 (en) | 2021-06-29 | 2023-09-19 | Twitch Interactive, Inc. | Seamless transcode server switching |
US11882324B1 (en) * | 2021-09-02 | 2024-01-23 | Amazon Technologies, Inc. | Reconciliation for parallel transcoding |
CN114845141A (en) * | 2022-04-18 | 2022-08-02 | 上海哔哩哔哩科技有限公司 | Edge transcoding method and device |
US12052447B1 (en) * | 2022-06-27 | 2024-07-30 | Amazon Technologies, Inc. | Dynamically moving transcoding of content between servers |
US12047618B1 (en) | 2022-06-30 | 2024-07-23 | Amazon Technologies, Inc. | Seamless audience-aware encoding profile switching |
Also Published As
Publication number | Publication date |
---|---|
US10027997B2 (en) | 2018-07-17 |
US20130114744A1 (en) | 2013-05-09 |
US9432704B2 (en) | 2016-08-30 |
US20160337675A1 (en) | 2016-11-17 |
US10595059B2 (en) | 2020-03-17 |
US20180332324A1 (en) | 2018-11-15 |
Similar Documents
Publication | Title |
---|---|
US10595059B2 (en) | Segmented parallel encoding with frame-aware, variable-size chunking |
US12052450B2 (en) | Fragment server directed device fragment caching |
US9787747B2 (en) | Optimizing video clarity |
Ma et al. | Mobile video delivery with HTTP |
KR101983432B1 (en) | Devices, systems, and methods for converting or translating dynamic adaptive streaming over HTTP (DASH) to HTTP live streaming (HLS) |
KR101701182B1 (en) | A method for recovering content streamed into chunk |
US8812621B2 (en) | Reducing fetching load on cache servers in adaptive streaming |
US9197900B2 (en) | Localized redundancy for fragment processing |
US20150007237A1 (en) | On the fly transcoding of video on demand content for adaptive streaming |
US20190182518A1 (en) | Adaptive content delivery network |
US8719440B2 (en) | Intelligent device media stream caching |
US20190045230A1 (en) | Distributed scalable encoder resources for live streams |
US20130064286A1 (en) | Weighted encoder fragment scheduling |
WO2012083298A2 (en) | Format-agnostic streaming architecture using an http network for streaming |
US10531134B2 (en) | Determining a time budget for transcoding of video |
US9338482B2 (en) | Enhanced group of pictures (GOP) alignment in media stream variants |
US20130064287A1 (en) | Management of resources for live stream variant processing |
US20130135525A1 (en) | Fragment boundary independent closed captioning |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: AKAMAI TECHNOLOGIES, INC., MASSACHUSETTS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MUTTON, JAMES A.; LYNCH, RYAN F.; SIGNING DATES FROM 20121031 TO 20121101; REEL/FRAME: 029329/0273 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |