CN101668132A - Method and system for matching and processing captions - Google Patents
- Publication number
- CN101668132A CN200810141753A
- Authority
- CN
- China
- Prior art keywords
- captions
- transcoding
- caption area
- user terminal
- bit diagram
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H04N5/278 - Subtitling
- H04N21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/23614 - Multiplexing of additional data and video streams
- H04N21/25825 - Management of client data involving client display capabilities, e.g. screen resolution of a mobile phone
- H04N21/25833 - Management of client data involving client hardware characteristics, e.g. manufacturer, processing or storage capabilities
- H04N21/2662 - Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
- H04N21/4348 - Demultiplexing of additional data and video streams
- H04N21/4516 - Management of client data or end-user data involving client characteristics, e.g. Set-Top-Box type, software version or amount of memory available
- H04N21/47 - End-user applications
- H04N21/4884 - Data services, e.g. news ticker for displaying subtitles
- H04N21/6582 - Data stored in the client, e.g. viewing habits, hardware capabilities, credit card number
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Computer Graphics (AREA)
- Computer Security & Cryptography (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
An embodiment of the invention discloses a method and a system for caption matching processing. The method comprises: receiving a play request from a user terminal; acquiring device capability information of the user terminal according to the play request; and providing a transcoded caption bitmap group to the user terminal according to the device capability information. Embodiments of the invention also disclose a method for producing the caption bitmap group, a caption production device, a caption matching processing device, and a caption matching processing system. According to the display capability of each terminal, the method and system can send a caption bitmap group adapted to that terminal's screen resolution, so that users can clearly read the captions when watching video.
Description
Technical field
The present invention relates to the field of video communication, and in particular to a method and system for caption matching processing.
Background
With the development of mobile communication technology, and particularly the arrival of the 3G era, mobile video services led by mobile TV are gradually becoming commercial. Supported by high-speed mobile communication technology, users can enjoy high-quality video services on mobile terminals such as mobile phones. However, because the video is played on the terminal's own display and the terminal is limited in processing capability and screen size, most video content currently offered by mobile video services is obtained by taking conventional-resolution video files in formats such as MPEG-1, MPEG-2 and MPEG-4 and applying ordinary video transcoding with a reduction in resolution. Most of these video files have captions corresponding to their content embedded in them to help users understand the video.
In the course of the invention, the inventor found that, because video content delivered to a terminal has undergone transcoding and resolution reduction, the captions embedded in it shrink considerably and become blurred. Users of current mobile terminals can hardly read the captions when watching video, which prevents them from fully viewing and understanding the content and degrades the user experience.
Summary of the invention
To overcome the defect in the prior art that captions in video become blurred, embodiments of the invention provide a method and system for caption matching processing, together with corresponding devices.
A caption matching processing method provided by an embodiment of the invention comprises: receiving a play request from a user terminal; acquiring device capability information of the user terminal according to the play request; and providing a transcoded caption bitmap group to the user terminal according to the device capability information.
An embodiment of the invention also provides a method for producing a caption bitmap group, comprising: obtaining a caption area from the video images, together with the time information and position information corresponding to the caption area; generating an original caption bitmap group from the obtained caption area and its time and position information; and transcoding the original caption bitmap group according to resolution to form a transcoded caption bitmap group.
An embodiment of the invention also provides a caption production device, comprising: a caption area acquisition module, configured to obtain the caption area in the video images and the position information and time information of the caption area, and to generate an original caption bitmap group from the caption area and its position and time information; and a transcoding module, configured to transcode the original caption bitmap group according to resolution to form a transcoded caption bitmap group.
An embodiment of the invention also provides a caption matching processing system, comprising: a caption production device, configured to produce transcoded caption bitmap groups according to resolution; and a caption matching processing device, configured to receive a play request from a user terminal, the play request carrying device capability information of the user terminal, and to provide a transcoded caption bitmap group to the terminal according to that device capability information.
An embodiment of the invention also provides a user terminal, comprising: a play request module, configured to generate a play request that contains either the device capability information of the user terminal or the address information of a user agent profile database; a video playback module, configured to play the video and the transcoded caption bitmap group; and a communication module, configured to send the play request and to receive the video and the transcoded caption bitmap group.
As the above embodiments show, the invention can send each terminal a caption bitmap group adapted to its screen resolution according to the display capability of that terminal, so that the user can clearly read the caption text when watching video.
Description of the drawings
Figure 1 is a structural diagram of the caption matching processing system provided by an embodiment of the invention;
Figure 2 is a structural diagram of the caption matching processing device provided by an embodiment of the invention;
Figure 3 is a structural diagram of the caption production device provided by an embodiment of the invention;
Figure 4 is a structural diagram of the user terminal provided by an embodiment of the invention;
Figure 5 is a flowchart of the caption bitmap production method provided by an embodiment of the invention;
Figure 6 is a flowchart of the caption matching processing method provided by an embodiment of the invention;
Figure 7 is a schematic diagram of one way of sending the video and the transcoded caption bitmap group provided by an embodiment of the invention;
Figure 8 is a flowchart of the caption matching processing method provided by an embodiment of the invention.
Detailed description of the embodiments
To help those skilled in the art better understand the invention, the invention is described in further detail below with reference to the accompanying drawings.
Figure 1 shows a caption matching processing system provided by the invention. The system comprises a caption matching processing device 101, a caption production device 102, and a user agent profile database 103, where:
Caption matching processing device 101: configured to receive a play request from a user terminal, the play request carrying device capability information of the user terminal, and to provide a transcoded caption bitmap group to the terminal according to that device capability information. The device is also configured to receive a play request that carries user agent profile database address information, to obtain the device capability information of the user terminal according to that address information, and then to provide a transcoded caption bitmap group to the terminal according to the device capability information.
In another embodiment of the invention, the caption matching processing device may further comprise a receiver module 1011 and a sending module 1012, as shown in Figure 2, where:
Receiver module 1011: configured to receive the play request sent by the user terminal. It can parse the user terminal's own device capability information from the play request, such as the model and category of the device, the screen size, and the screen resolution. In one embodiment of the invention, this module can also parse from the play request the address information of the user agent profile database that stores the user terminal's device information, for example the URL of the database, which may be expressed as an IP address or a domain name, and then obtain the device capability information of the user terminal from that database. The address information carries information such as the terminal model of the mobile terminal.
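The two request forms described for the receiver module can be sketched as follows. This is only an illustration under stated assumptions: the request is modeled as a dictionary, the field names `capability` and `profile_url`, the `PROFILE_DB` contents, and the example URL are all hypothetical and are not taken from the patent.

```python
from typing import Optional

# Hypothetical stand-in for the user agent profile database, keyed by profile URL.
PROFILE_DB = {
    "http://profiles.example.com/phone-a": {"model": "phone-a", "screen": (320, 240)},
}

def resolve_capability(request: dict) -> Optional[dict]:
    """Return device capability info carried in, or referenced by, a play request."""
    # Case 1: the terminal reports its own capability directly.
    if "capability" in request:
        return request["capability"]
    # Case 2: the request only carries the address of a stored profile.
    url = request.get("profile_url")
    if url is not None:
        return PROFILE_DB.get(url)
    return None

direct = resolve_capability({"video": "news-01", "capability": {"screen": (176, 144)}})
looked_up = resolve_capability(
    {"video": "news-01", "profile_url": "http://profiles.example.com/phone-a"}
)
```

A real deployment would fetch the profile over the network; the in-memory dictionary only keeps the sketch self-contained.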
Sending module 1012: configured to send to the user terminal, according to the device capability information obtained by the receiver module, the transcoded video adapted to its device and the transcoded caption bitmap group adapted to its device. The transcoded video and caption bitmap group may be sent as files, as media streams, or in another form.
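The patent does not state how the variant "adapted to its device" is chosen. One plausible policy, shown here purely as an assumption, is to pick the largest pre-transcoded resolution that fits the reported screen and to fall back to the smallest variant otherwise. The function name `pick_variant` and the resolution list are hypothetical.

```python
def pick_variant(screen, available):
    """Pick the largest pre-transcoded resolution that fits the screen, else the smallest."""
    width, height = screen
    fitting = [r for r in available if r[0] <= width and r[1] <= height]
    if fitting:
        return max(fitting, key=lambda r: r[0] * r[1])
    # Nothing fits: assume the smallest variant is the least bad choice.
    return min(available, key=lambda r: r[0] * r[1])

# Hypothetical set of resolutions prepared in advance by the production device.
VARIANTS = [(176, 144), (320, 240), (640, 480)]
chosen = pick_variant((400, 300), VARIANTS)
```

The same selection would apply to both the transcoded video and the transcoded caption bitmap group, so that the two stay matched.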
Caption production device 102: configured to produce the caption bitmap groups according to resolution. By processing the original video, this device obtains a series of transcoded caption bitmap groups at different resolutions corresponding to the video content. To provide suitable caption bitmap groups to user terminals of different types, display devices, and screen sizes, the caption production device first determines the rough position of the caption area in the video image; this position may be expressed as coordinates in the video image, pixel counts, and so on. It then segments the caption area out of the video image sequence and at the same time obtains the time information and position information corresponding to the caption area. The time information may be, for example, the start and end time points at which the caption area is shown, or the start and end frame numbers of the video images containing it. The position information describes where the caption area sits on the screen, for example the screen coordinates of its center point. From the obtained caption area and its time and position information, the device generates the original caption bitmap group. Finally, it transcodes the original caption bitmap group according to the physical screen sizes and resolutions of common types of user terminal, obtaining a series of transcoded caption bitmap groups at different resolutions. The caption production device can also transcode the original video at different resolutions to obtain a series of transcoded videos at different resolutions.
In one embodiment of the invention, caption production device 102 may further comprise the following modules, as shown in Figure 3: a caption area acquisition module 1021 and a transcoding module 1022, where:
Caption area acquisition module 1021: configured to detect the caption area in the video images together with its position information and time information, and to generate a caption bitmap group from the caption area and its position and time information. In another embodiment of the invention, the caption area acquisition module may further comprise the following units, as shown in Figure 3: a caption area rough detection unit 1021a, a caption area confirmation unit 1021b, a caption area positioning unit 1021c, a caption area tracking unit 1021d, and a caption bitmap group forming unit 1021e, where:
Caption area rough detection unit 1021a: configured to perform rough detection of the caption area in the video image and determine its approximate position and extent on the screen. Several detection methods are possible; preferably, visual texture information is used as the basis for detection.
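Texture can be scored in many ways. As one illustration, in the spirit of the DCT-coefficient detection named in Step 502, the sketch below scores a pixel block by the summed magnitude of its AC DCT coefficients; text-like blocks with sharp strokes tend to score higher than flat background. The functions `dct2` and `ac_energy` and the test blocks are assumptions for illustration. A compressed-domain detector would read the coefficients from the bitstream rather than compute them from pixels.

```python
import math

def dct2(block):
    """Direct 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    return [
        [
            alpha(u) * alpha(v) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for y in range(n) for x in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]

def ac_energy(block):
    """Sum of absolute AC coefficients: a simple texture-strength score."""
    coeffs = dct2(block)
    return sum(abs(c) for row in coeffs for c in row) - abs(coeffs[0][0])

flat = [[128] * 8 for _ in range(8)]                                   # uniform background
stripes = [[255 if x % 2 else 0 for x in range(8)] for _ in range(8)]  # stroke-like pattern
is_candidate = ac_energy(stripes) > ac_energy(flat)
```

Blocks whose score exceeds a chosen threshold could be marked as candidate caption blocks; the threshold itself would need tuning on real video.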
Caption area confirmation unit 1021b: configured to confirm the candidate caption areas found by rough detection; a caption area that passes confirmation is one that can be extracted. Several confirmation methods are possible; preferably, a method based on region texture constraints is used.
Caption area positioning unit 1021c: configured to locate the exact position of a confirmed caption area. Positioning can be implemented in several ways, for example by obtaining location information in the pixel domain from the projection profile of edge-point density or pixel gray value in the horizontal and vertical directions, or by projecting block texture strength in the compressed domain. After positioning, the system has the size and position of the caption area relative to the original video image, for example the length and width of the caption area and the coordinates of its center.
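The pixel-domain projection approach can be sketched as follows, assuming a binary edge mask has already been computed for the frame. The helper names, the threshold value, and the returned dictionary layout are hypothetical; the patent only names the technique.

```python
def projection_bounds(profile, threshold):
    """Return (first, last) indices whose projected value reaches the threshold, or None."""
    hits = [i for i, v in enumerate(profile) if v >= threshold]
    return (hits[0], hits[-1]) if hits else None

def locate_caption(edge_mask, threshold=2):
    """Locate a caption box from a binary edge mask (list of rows of 0/1)."""
    row_profile = [sum(row) for row in edge_mask]        # edge count per row
    col_profile = [sum(col) for col in zip(*edge_mask)]  # edge count per column
    rows = projection_bounds(row_profile, threshold)
    cols = projection_bounds(col_profile, threshold)
    if rows is None or cols is None:
        return None
    top, bottom = rows
    left, right = cols
    width, height = right - left + 1, bottom - top + 1
    center = (left + width / 2, top + height / 2)
    return {"left": left, "top": top, "width": width, "height": height, "center": center}

mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
box = locate_caption(mask)
```

The result carries the width, height, and center of the caption area, which matches the size and position information the positioning unit is described as producing.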
Caption area tracking unit 1021d: configured to obtain the time information of the caption area. This may be the start time and end time at which the caption area is shown, the start time and duration, the start and end frame numbers of the caption area, or the start frame number and the number of frames it lasts. Several methods can implement this function, such as caption tracking based on projection profiles, or caption tracking using motion vectors in the compressed domain.
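The alternative time representations listed above are interchangeable once a frame rate is known. The sketch below assumes some tracker has already produced a per-frame flag saying whether the same caption is present (how that flag is derived is outside this sketch), and converts runs of flags into frame numbers, start times, and durations. The function name and the tuple layout are hypothetical.

```python
def caption_runs(present, fps):
    """Turn per-frame presence flags into (start_frame, end_frame, start_s, duration_s) runs."""
    runs = []
    start = None
    for i, flag in enumerate(list(present) + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            end = i - 1
            runs.append((start, end, start / fps, (end - start + 1) / fps))
            start = None
    return runs

flags = [False, True, True, True, False, False, True, True]
runs = caption_runs(flags, fps=25)
```

Each tuple holds both the frame-based and the time-based form, so either can be stored with the caption bitmap.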
Caption bitmap group forming unit 1021e: configured to perform caption area segmentation on the start frame or end frame obtained from positioning and tracking, or on the corresponding background frames, so as to separate the caption area from the original video image. Segmentation may use a method based on the multi-frame Max or Min, or other methods such as histogram-based caption segmentation. Once the corresponding frames of the original video have been segmented, the original caption bitmap group is formed.
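A minimal sketch of the multi-frame Min idea, assuming bright captions that stay fixed while the background changes: a pixel-wise minimum over the frames in which the caption is shown keeps the static bright text and suppresses background that is dark in at least one frame. For dark captions the Max would be used instead. The function names, the threshold, and the sample pixel values are illustrative assumptions.

```python
def min_over_frames(frames):
    """Pixel-wise minimum over equally sized grayscale frames (lists of rows)."""
    return [[min(pixels) for pixels in zip(*rows)] for rows in zip(*frames)]

def to_bitmap(image, threshold):
    """Binarize: 1 where the pixel is at or above the threshold, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

# Three tiny 2x3 frames of the same caption region; only two pixels stay bright throughout.
frames = [
    [[250, 30, 250], [200, 10, 90]],
    [[252, 180, 249], [20, 15, 220]],
    [[251, 60, 250], [130, 240, 40]],
]
bitmap = to_bitmap(min_over_frames(frames), threshold=200)
```

Pixels that are bright in only some frames, such as moving background highlights, drop out of the resulting bitmap.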
Transcoding module 1022: configured to transcode the original caption bitmap group formed by the caption area acquisition module at different resolutions, forming transcoded caption bitmap groups at different resolutions. It can also incorporate existing video transcoding functions to generate a series of videos at different resolutions from the original video. In another embodiment of the invention, transcoding module 1022 may further comprise a caption bitmap transcoding unit 1022a and a video transcoding unit 1022b, as shown in Figure 3, where:
Caption bitmap transcoding unit 1022a: configured to transcode the original caption bitmap group to each target resolution according to the display devices of common types of user terminal, for example the physical screen size and the screen resolution, obtaining a series of transcoded caption bitmap groups at different resolutions for use by user terminals with those display devices.
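As a rough illustration of resolution transcoding for one caption entry, the sketch below resizes the bitmap with nearest-neighbour sampling and rescales the stored center position, leaving the time information untouched. The dictionary layout and function names are hypothetical, and proportional scaling is only one possible policy; a real system might choose the caption scale separately from the video scale to keep text legible.

```python
def scale_bitmap(bitmap, new_w, new_h):
    """Nearest-neighbour resize of a bitmap given as a list of rows."""
    old_h, old_w = len(bitmap), len(bitmap[0])
    return [
        [bitmap[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

def transcode_caption(caption, src_res, dst_res):
    """Rescale one caption entry (bitmap plus position) from src_res to dst_res."""
    sx, sy = dst_res[0] / src_res[0], dst_res[1] / src_res[1]
    h, w = len(caption["bitmap"]), len(caption["bitmap"][0])
    new_w, new_h = max(1, round(w * sx)), max(1, round(h * sy))
    return {
        "start": caption["start"],  # time information is unchanged
        "end": caption["end"],
        "center": (caption["center"][0] * sx, caption["center"][1] * sy),
        "bitmap": scale_bitmap(caption["bitmap"], new_w, new_h),
    }

original = {"start": 1.0, "end": 3.5, "center": (320, 440),
            "bitmap": [[1, 0, 1, 0], [0, 1, 0, 1]]}
small = transcode_caption(original, src_res=(640, 480), dst_res=(320, 240))
```

Running the same function once per target resolution would yield the series of transcoded caption bitmap groups the unit is described as producing.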
Video transcoding unit 1022b: configured to transcode the original video to each target resolution according to the display devices of common types of user terminal, for example the physical screen size and the screen resolution, obtaining a series of transcoded videos at different resolutions for use by user terminals with those display devices.
In another embodiment of the invention, the caption matching processing system may also include a user agent profile database 103, which stores the device information of various user terminals. The device information may include the device name and type, the screen size, the screen resolution, and so on. In a system with a user agent profile database, the play request that the user terminal sends to the video service module can carry the address information of the user agent profile database, for example the URL of the database, which may be expressed as an IP address or a domain name. The video service module obtains the device information of the user terminal from the user agent profile database according to that address information, and then sends the user terminal the transcoded video adapted to its device and the transcoded caption bitmap group adapted to its device.
In practical applications, the caption matching processing system described in the above embodiments, and the caption production device and caption matching processing device within it, can be integrated into an existing streaming media server, or can each exist independently to serve user terminals. The user terminal is configured to send a play request to the video service module. The play request carries the terminal's device capability information, which may be the terminal's screen size, or the terminal's model, category, and so on. In another embodiment of the invention, the play request sent by the user terminal may carry, instead of its own device capability information, the address information of the user agent profile database that stores its device information, for example the URL of the user agent profile database, which may be expressed as an IP address or a domain name. The user terminal is also configured to receive and play the transcoded video and transcoded caption bitmap group sent by the video service module. The user terminal may be a traditional display terminal, such as a television set or PC display, or a mobile terminal with a display device, such as a mobile TV or a mobile phone with video playback capability. In another embodiment of the invention, the user terminal may further comprise the following modules, as shown in Figure 4: a play request module 201, a video playback module 202, and a communication module 203.
Play request module 201: generates the play request according to the user's demand, such as the video content the user wants to watch, which may be identified by the title or number of the video, and according to the capability of the device itself, which may be the size of the display screen, the model of the device, the processing capability of the processor, and so on.
Video playback module 202: plays the transcoded video and transcoded caption bitmap group sent by the video service module. During playback, each caption bitmap in the group appears on the screen in turn according to its time information and is shown together with the corresponding video content. The position of a caption bitmap on the screen is determined by the position information in the caption bitmap, which may be coordinate information, center point information, and so on.
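The playback rule described above can be sketched as a lookup by playback time plus a conversion from the stored center point to a drawing origin. The caption dictionary layout (`start`, `end`, `center`, `bitmap`) and both function names are assumptions for illustration, and the half-open interval is a design choice so that consecutive captions do not overlap at the boundary.

```python
def active_captions(group, t):
    """Return the captions whose display interval contains playback time t (seconds)."""
    return [c for c in group if c["start"] <= t < c["end"]]

def top_left(caption):
    """Convert a stored center position into the top-left corner used for drawing."""
    h, w = len(caption["bitmap"]), len(caption["bitmap"][0])
    cx, cy = caption["center"]
    return (cx - w // 2, cy - h // 2)

group = [
    {"start": 0.0, "end": 2.0, "center": (160, 220), "bitmap": [[1, 1, 1, 1], [1, 0, 0, 1]]},
    {"start": 2.0, "end": 4.5, "center": (160, 220), "bitmap": [[1, 1], [1, 1]]},
]
showing = active_captions(group, 2.0)
```

A player would call `active_captions` for each rendered frame and overlay the returned bitmaps at the positions given by `top_left`.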
Communication module 203: sends the playing request, and receives the transcoding video and the transcoding captions bit diagram group that are adapted to the device.
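The playback behavior of video playback module 202 can be illustrated with a minimal sketch. It assumes that the temporal information is a pair of frame numbers and that the end frame is inclusive; the class name `CaptionUnit` and the function name `active_captions` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaptionUnit:
    """One captions bit diagram with its temporal and positional information."""
    start_frame: int   # first video frame on which the caption is shown
    end_frame: int     # last video frame on which the caption is shown (assumed inclusive)
    center_x: float    # horizontal coordinate of the caption center
    center_y: float    # vertical coordinate of the caption center
    bitmap: bytes      # captions bit diagram data


def active_captions(group: List[CaptionUnit], frame_no: int) -> List[CaptionUnit]:
    """Return the caption units that should be on screen for the given frame."""
    return [u for u in group if u.start_frame <= frame_no <= u.end_frame]
```

A player would call `active_captions` once per decoded frame and draw each returned bitmap centered at (`center_x`, `center_y`).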
In one embodiment of the invention, the caption area is split from the original video, and a series of transcoding captions bit diagram groups of different resolutions is produced. The concrete steps, shown in Figure 5, comprise:
Step 501: first determine the video that needs to be processed;
Step 502: detect the scope of the caption area to obtain its rough region in the video image. The preferred detection method is caption area detection based on DCT coefficients; other methods that can detect the scope of the caption area may also be used;
Step 503: confirm the detected caption area to obtain a caption area that can be extracted. In this step, morphological filtering can be used to bridge the gaps between characters and to eliminate noise. The preferred confirmation method is one based on region texture constraints; other methods that can confirm the caption area may also be used;
Step 504: locate the confirmed caption area to obtain its positional information, that is, the size and position of the caption area relative to the original video image. The preferred locating method is block texture intensity projection in the compressed domain; other methods that can locate the caption area may also be used;
Step 505: track the located caption area to obtain its temporal information, for example the start time and end time at which the captions are played relative to the video, or the initial frame number and end frame number of the video at which the captions are played. The preferred tracking method is motion vector tracking in the compressed domain; other methods that can track the caption area and determine its reproduction time information may also be used;
Step 506: separate the caption area obtained in step 503 from the original video image. The preferred segmentation method is one that merges foreground and background; other methods that can segment the caption area may also be used;
Step 507: each caption area split from the original video image is first combined with its associated temporal information and positional information to generate a single captions unit, and all the captions units together constitute the original captions bit diagram group of this video. The information in a single captions unit can comprise: the captions bit diagram data; the positional information of the captions bit diagram, for example the coordinates of its center; and the reproduction time information of the captions bit diagram, for example its playback start time and end time, or its initial frame number and end frame number. The present embodiment suggests one storage format for the captions unit, shown in Table one below. Note that this storage format is not the only possible one; other formats that can store the described information may also be used.
Table one
Field | Format |
---|---|
Locating byte | Fixed at 0xFF (1 byte) |
Initial frame number | Unsigned int (4 bytes) |
End frame number | Unsigned int (4 bytes) |
Captions center X coordinate | Floating point (4 bytes) |
Captions center Y coordinate | Floating point (4 bytes) |
Captions bit diagram length | Unsigned int (4 bytes) |
Captions bit diagram data | (captions bit diagram length, in bytes) |
Step 508: transcode the original video according to different resolutions to form a group of transcoding videos of different resolutions.
Step 509: transcode the generated original captions bit diagram group according to different resolutions to form a series of transcoding captions bit diagram groups of different resolutions. Steps 508 and 509 can be performed in either order.
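The storage format suggested in Table one can be sketched with Python's `struct` module. The table fixes the field order and sizes but not the byte order, so big-endian is an assumption here; the function names `pack_caption_unit` and `unpack_caption_unit` are hypothetical.

```python
import struct

# Fixed-size part of one captions unit, in Table one order:
# locating byte, initial frame, end frame, center X, center Y, bitmap length.
# ">" selects big-endian with no padding (byte order is an assumption).
HEADER = struct.Struct(">BIIffI")
LOCATING_BYTE = 0xFF


def pack_caption_unit(start_frame, end_frame, center_x, center_y, bitmap):
    """Serialize one captions unit: 21-byte header followed by the bitmap data."""
    header = HEADER.pack(LOCATING_BYTE, start_frame, end_frame,
                         center_x, center_y, len(bitmap))
    return header + bitmap


def unpack_caption_unit(buf, offset=0):
    """Parse one captions unit at `offset`; return (fields, offset of next unit)."""
    marker, start, end, cx, cy, length = HEADER.unpack_from(buf, offset)
    if marker != LOCATING_BYTE:
        raise ValueError("locating byte 0xFF not found at offset %d" % offset)
    data_start = offset + HEADER.size
    bitmap = buf[data_start:data_start + length]
    if len(bitmap) != length:
        raise ValueError("captions bit diagram data is truncated")
    return (start, end, cx, cy, bitmap), data_start + length
```

Because every unit starts with the locating byte and carries its own length, a whole captions bit diagram group can be read by calling `unpack_caption_unit` repeatedly with the returned offset.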
The group of transcoding videos of different resolutions formed in step 508, and the group of transcoding captions bit diagram groups of different resolutions generated in step 509, are provided for use by user terminals with different hardware device capabilities.
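The patent does not state how the positional information is adjusted when a captions bit diagram group is transcoded to another resolution. One plausible approach, shown here only as a sketch, is to scale the caption center and bitmap size linearly with the resolution ratio. It assumes that coordinates are pixel coordinates in the source video; `scale_caption_geometry` is a hypothetical name.

```python
def scale_caption_geometry(center, size, src_res, dst_res):
    """Scale a caption's center and bitmap size from src_res to dst_res.

    center, size, src_res and dst_res are (horizontal, vertical) pairs.
    Linear scaling is an assumption, not a rule taken from the patent.
    """
    sx = dst_res[0] / src_res[0]
    sy = dst_res[1] / src_res[1]
    (cx, cy), (w, h) = center, size
    new_center = (cx * sx, cy * sy)
    # Keep at least one pixel in each dimension after rounding.
    new_size = (max(1, round(w * sx)), max(1, round(h * sy)))
    return new_center, new_size
```

The bitmap itself would then be resampled to `new_size` with whatever image scaler the transcoder already uses.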
The embodiment of the invention provides a captions matching treatment method which, as shown in Figure 7, comprises the following steps:
Step 601: a portable terminal, acting as the user terminal, initiates a request to the streaming media server to play a streaming media video. The request comprises the video name or video number and the device capability information of the portable terminal, such as the physical size of the screen and the supported resolution. The request can be sent as an HTTP Request message, whose concrete form is as follows:
GET /pub/mobile/discovery.ts HTTP/1.1
Host: stream.ifeng.com
Accept: */*
Profile-dev: "physical size=2.8, resolution=240_320"
In the above content, Profile-dev: "physical size=2.8, resolution=240_320" is one concrete form of device capability information: it indicates that the physical size of the terminal screen is 2.8 and that the supported resolution is 240 × 320.
Step 602: after receiving the playing request from the portable terminal, the streaming media server selects, from the transcoding video file group and the captions bit diagram group respectively, the transcoding video file and the transcoding captions bit diagram group best suited to the screen of this portable terminal, according to the physical screen size and supported resolution given in the playing request, and sends them to the portable terminal (in the present embodiment, it should send the transcoding video and captions bit diagram group for a physical screen size of 2.8 and a resolution of 240 × 320). The transmission can use a TS stream, an RTP stream, and so on. When the streaming media server issues the transcoding video and the transcoding captions bit diagram group data directly as a TS stream, two preferred methods are available. Method one, as shown in Figure 7: the transcoding video data, the audio data and the transcoding captions bit diagram group are each packetized to generate three packetized elementary streams, VPES, APES and CPES; a PID (packet identifier) is assigned to each of these three PES streams in the PMT (Program Map Table); and VPES, APES and CPES are multiplexed into a TS stream, which the streaming media server issues to the portable terminal. Method two: because the data of the transcoding captions bit diagram group is small compared with the video data and audio data, this data is placed in the adaptation field of the TS packets of the video data, and the adaptation field control bits are set to "11".
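The selection in step 602 can be sketched as follows. The patent only says that the most suitable variant is chosen; the rule used here (the largest variant that fits within the supported resolution, otherwise the smallest variant available) is an assumption, and `select_variant` is a hypothetical name.

```python
def select_variant(available, supported):
    """Pick the resolution variant to send to a terminal.

    available: list of (width, height) pairs that have been transcoded.
    supported: (width, height) reported by the terminal.
    """
    width, height = supported
    fitting = [r for r in available if r[0] <= width and r[1] <= height]
    if fitting:
        # Largest variant that does not exceed the terminal's resolution.
        return max(fitting, key=lambda r: r[0] * r[1])
    # Nothing fits: fall back to the smallest variant (assumed policy).
    return min(available, key=lambda r: r[0] * r[1])
```

The same selected resolution would be used to pick both the transcoding video and the transcoding captions bit diagram group, so that the captions match the video the terminal receives.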
When the streaming media server issues the transcoding video and the transcoding captions bit diagram group as an RTP stream, it first packs them into a TS stream according to the preceding method, then packs the TS stream into RTP packets as specified in RFC 2038, "RTP Payload Format for MPEG1/MPEG2 Video", and issues them to the portable terminal via the RTP/RTSP protocols.
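Method two of step 602 can be illustrated with a sketch that builds a single 188-byte TS packet whose adaptation field control bits are "11" (adaptation field followed by payload). The patent says only that the caption data goes into the adaptation field; carrying it in the transport private data bytes is an assumption made here, and `build_ts_packet` is a hypothetical name.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PRIVATE_DATA_FLAG = 0x02  # transport_private_data_flag in the adaptation field flags


def build_ts_packet(pid, continuity_counter, caption_bytes, video_payload):
    """Build one TS packet with caption data in the adaptation field.

    Layout: 4-byte header, adaptation_field_length, flags, private data
    length, private data, 0xFF stuffing, then the video payload.
    """
    stuffing = TS_PACKET_SIZE - 4 - 1 - 2 - len(caption_bytes) - len(video_payload)
    if not video_payload or stuffing < 0:
        raise ValueError("caption data and payload do not fit in one TS packet")
    body = (bytes([PRIVATE_DATA_FLAG, len(caption_bytes)])
            + caption_bytes + b"\xff" * stuffing)
    header = bytes([
        SYNC_BYTE,
        (pid >> 8) & 0x1F,                   # error/start/priority bits left at 0
        pid & 0xFF,
        0x30 | (continuity_counter & 0x0F),  # adaptation_field_control = '11'
    ])
    return header + bytes([len(body)]) + body + video_payload
```

A real multiplexer would also set the payload unit start indicator and split PES packets across TS packets; those details are omitted to keep the focus on the adaptation field.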
Another captions matching treatment method provided by the embodiment of the invention comprises the following steps, as shown in Figure 8:
Step 801: the portable terminal initiates a request to the video stream media server to play a streaming media video. The request comprises the address of the customer agent file database that stores the screen device capability information of this portable terminal, as well as the video name or video number wanted by the user. The screen device capability information can be the physical size of the screen, the supported resolution, and so on. In the present embodiment the request is sent as an HTTP Request message, whose concrete form is as follows:
GET /pub/mobile/discovery.ts HTTP/1.1
Host: stream.ifeng.com
Accept: */*
Profile-rep: https://profilerepository.oma.org/Nokia/N61
In the foregoing, Profile-rep: https://profilerepository.oma.org/Nokia/N61 is the address information of the customer agent file database; the device capability information of the portable terminal (model Nokia-N61) is stored at this address.
Step 802: the video stream media server obtains the device capability information of the user terminal from the customer agent file database according to the customer agent file database address in the request. That is, it sends an HTTP-based Request message to the customer agent file database to query the screen device capability of this portable terminal. The message format is as follows:
POST /Nokia/N61 HTTP/1.1
Host: profilerepository.oma.org
QueryType: "Screen_Capacity"
Step 803: after receiving the request from the video stream media server, the customer agent file database feeds the device capability information of the portable terminal back to the video stream media server in an HTTP Response message, whose format is as follows:
HTTP/1.1 200 OK
QueryResult: "Screen_Capacity:physical size=2.8, resolution=240_320"
Here, QueryResult: "Screen_Capacity:physical size=2.8, resolution=240_320" returns the device capability information of the portable terminal: its physical screen size is 2.8 and its supported resolution is 240 × 320.
Step 804: according to the information fed back by the customer agent file database, the video stream media server selects, from the series of transcoding videos and transcoding captions bit diagram groups, the transcoding video and the transcoding captions bit diagram group suited to playback on the user terminal. For the capability information given above, the present embodiment should select the transcoding video of 240 × 320 resolution and the transcoding captions bit diagram group corresponding to 2.8 inches and 240 × 320 resolution for transmission. The video stream media server then sends the selected video and captions bit diagram group to the user terminal, using the same transmission means as in step 602 above.
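The two ways of obtaining the device capability information (carried directly in the Profile-dev header, or looked up through the Profile-rep address) can be sketched as follows. The parsing relies only on the string format shown in the example messages. The function names and the `query_repository` callback, which stands in for the HTTP exchange of steps 802 and 803, are hypothetical.

```python
import re


def parse_capability(text):
    """Extract (physical size, (width, height)) from a capability string."""
    size = re.search(r"physical\s*size\s*=\s*([0-9.]+)", text)
    res = re.search(r"resolution\s*=\s*(\d+)_(\d+)", text)
    if not (size and res):
        raise ValueError("unrecognized capability string: %r" % text)
    return float(size.group(1)), (int(res.group(1)), int(res.group(2)))


def resolve_capability(headers, query_repository):
    """Return the terminal capability for a playing request.

    headers: dict of request header names to values.
    query_repository: callable taking the repository address and returning
    the QueryResult string (assumed to wrap the HTTP query and response).
    """
    if "Profile-dev" in headers:
        return parse_capability(headers["Profile-dev"])
    if "Profile-rep" in headers:
        return parse_capability(query_repository(headers["Profile-rep"]))
    raise ValueError("playing request carries no capability information")
```

Passing the repository query in as a callable keeps the sketch independent of any particular HTTP client and makes the lookup path easy to test.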
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical scheme of the embodiment of the invention, or the part of it that contributes to the prior art, can be embodied in the form of a software product. This software product is stored in a storage medium and comprises instructions that cause a device (which can be a mobile phone, a personal computer, a media player, etc.) to execute the methods described in the embodiments of the present invention. The storage medium referred to here can be, for example, ROM/RAM, a magnetic disk, or an optical disc.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
Claims (14)
1. A captions matching treatment method, characterized in that the method comprises:
receiving a playing request from a user terminal;
obtaining device capability information of the user terminal according to the playing request; and
providing a transcoding captions bit diagram group for the user terminal according to the device capability information.
2. The method according to claim 1, characterized in that: the playing request carries the device capability information of the user terminal, and obtaining the device capability information of the user terminal according to the playing request specifically comprises: obtaining the device capability information directly from the playing request from the user terminal; or
the playing request carries customer agent file database address information, and obtaining the device capability information of the user terminal according to the playing request specifically comprises: obtaining the device capability information of the user terminal from the customer agent file database according to the customer agent file database address information.
3. The method according to claim 1 or 2, characterized in that the method further comprises: providing a transcoding video for the user terminal according to the device capability information.
4. The method according to claim 3, characterized in that:
providing the transcoding video for the user terminal and providing the transcoding captions bit diagram group for the user terminal specifically comprise:
packetizing video data, audio data and the captions bit diagram group respectively to generate three elementary streams, and assigning a packet identifier to each of the three elementary streams in a Program Map Table; and
multiplexing the three elementary streams into a TS stream and issuing the TS stream to the user terminal.
5. A method for producing a captions bit diagram group, characterized in that the method comprises:
obtaining a caption area from a video image,
and obtaining temporal information and positional information corresponding to the caption area;
generating an original captions bit diagram group according to the obtained caption area and the temporal information and positional information; and
transcoding the original captions bit diagram group according to resolution to form a transcoding captions bit diagram group.
6. The method according to claim 5, characterized in that
obtaining the caption area from the video image, and the temporal information corresponding to the caption area, specifically comprises:
detecting the scope of the caption area, and confirming the detected caption area;
locating the confirmed caption area to obtain the positional information corresponding to the caption area; and
tracking the located caption area to obtain the temporal information corresponding to the caption area.
7. A captions producing device, characterized in that the device comprises:
a caption area acquisition module, used for obtaining a caption area in a video image and the positional information and temporal information of the caption area, and for generating an original captions bit diagram group according to the caption area and the positional information and temporal information of the caption area; and
a transcoding module, used for transcoding the original captions bit diagram group according to resolution to form a transcoding captions bit diagram group.
8. The captions producing device according to claim 7, characterized in that:
the caption area acquisition module further comprises:
a caption area rough detection unit, used for determining the scope of the caption area;
a caption area confirmation unit, used for confirming the above caption area;
a caption area positioning unit, used for locating the confirmed caption area to obtain the positional information of the caption area;
a caption area tracking unit, used for obtaining the temporal information of the caption area; and
a captions bit diagram group forming unit, used for generating the captions bit diagram group according to the obtained caption area and the temporal information and positional information of the caption area;
and the transcoding module further comprises:
a video transcoding unit, used for transcoding the original video according to resolution to form a transcoding video; and
a captions bit diagram transcoding unit, used for transcoding the original captions bit diagram group according to resolution to form the transcoding captions bit diagram group.
9. A captions matching treatment device, characterized by comprising:
a receiver module, used for receiving a playing request from a user terminal, the playing request carrying the device capability information of the user terminal or customer agent file database address information; and
a sending module, used for providing a transcoding captions bit diagram group for the terminal according to the device capability information of the user terminal.
10. The device according to claim 9, characterized in that:
the sending module is also used for providing a transcoding video for the terminal according to the device capability information of the user terminal.
11. A captions matching treatment system, characterized by comprising:
a captions producing device, used for producing transcoding captions bit diagram groups according to resolution; and
a captions matching treatment device, used for receiving a playing request from a user terminal, the playing request carrying the device capability information of the user terminal, and for providing a transcoding captions bit diagram group for the terminal according to the device capability information of the terminal.
12. The system according to claim 11, characterized by further comprising:
a customer agent file database, used for storing the device capability information of the user terminal;
wherein the captions matching treatment device is also used for receiving, from the user terminal, a playing request carrying customer agent file database address information, obtaining the device capability information of the user terminal according to the customer agent file database address information, and then providing a transcoding captions bit diagram group for the terminal according to the device capability information of the terminal.
13. The system according to claim 11 or 12, characterized in that
the captions producing device is also used for producing transcoding videos according to resolution; and
the captions matching treatment device is also used for providing a transcoding video for the terminal according to the device capability information of the terminal.
14. A user terminal, characterized by comprising:
a playing request module, used for generating a playing request, the playing request comprising the device capability information of the user terminal or the address information of a customer agent file database;
a video playback module, used for playing a video and a transcoding captions bit diagram group; and
a communication module, used for sending the playing request and for receiving the video and the transcoding captions bit diagram group.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810141753A CN101668132A (en) | 2008-09-02 | 2008-09-02 | Method and system for matching and processing captions |
PCT/CN2009/073240 WO2010025646A1 (en) | 2008-09-02 | 2009-08-13 | Method, system and device of the subtitle matching process |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810141753A CN101668132A (en) | 2008-09-02 | 2008-09-02 | Method and system for matching and processing captions |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101668132A true CN101668132A (en) | 2010-03-10 |
Family
ID=41796737
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200810141753A Pending CN101668132A (en) | 2008-09-02 | 2008-09-02 | Method and system for matching and processing captions |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101668132A (en) |
WO (1) | WO2010025646A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5805153A (en) * | 1995-11-28 | 1998-09-08 | Sun Microsystems, Inc. | Method and system for resizing the subtitles of a video |
WO2001095617A2 (en) * | 2000-06-02 | 2001-12-13 | Thomson Licensing S.A. | Auxiliary information processing system with a bitmapped on-screen display using limited computing resources |
KR100850999B1 (en) * | 2001-06-21 | 2008-08-12 | 엘지전자 주식회사 | Processing apparatus for closed caption in set-top box |
US20040213542A1 (en) * | 2003-04-22 | 2004-10-28 | Hiroshi Hamasaka | Apparatus and method to reproduce multimedia content for a multitude of resolution displays |
CN100414976C (en) * | 2004-01-12 | 2008-08-27 | 松下电器产业株式会社 | Caption treating device |
- 2008-09-02 CN CN200810141753A patent/CN101668132A/en active Pending
- 2009-08-13 WO PCT/CN2009/073240 patent/WO2010025646A1/en active Application Filing
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102594805A (en) * | 2012-01-30 | 2012-07-18 | 中兴通讯股份有限公司 | Method and system for providing multiple media services through single node |
CN103294683B (en) * | 2012-02-24 | 2016-09-14 | 腾讯科技(深圳)有限公司 | A kind of video file captions automatic patching system and method |
CN102625052A (en) * | 2012-03-28 | 2012-08-01 | 广东威创视讯科技股份有限公司 | Method, device and system for processing caption data |
CN102625052B (en) * | 2012-03-28 | 2014-08-20 | 广东威创视讯科技股份有限公司 | Method, device and system for processing caption data |
CN103379363A (en) * | 2012-04-19 | 2013-10-30 | 腾讯科技(深圳)有限公司 | Video processing method and apparatus, mobile terminal and system |
CN103379363B (en) * | 2012-04-19 | 2018-09-11 | 腾讯科技(深圳)有限公司 | Method for processing video frequency and device, mobile terminal and system |
CN102663988A (en) * | 2012-04-28 | 2012-09-12 | 广东威创视讯科技股份有限公司 | Method, device and system for broadcasting subtitles |
CN102984546A (en) * | 2012-11-01 | 2013-03-20 | 上海文广互动电视有限公司 | Transcoding service system for distributed video transcoding |
CN105191294A (en) * | 2012-11-02 | 2015-12-23 | 欧朋伊克斯辰吉有限公司 | Methods and apparatus for video communications |
CN105791367A (en) * | 2014-12-25 | 2016-07-20 | 中国移动通信集团公司 | Method, system and related equipment for sharing auxiliary media information in screen sharing |
CN105957014A (en) * | 2016-06-13 | 2016-09-21 | 天脉聚源(北京)传媒科技有限公司 | Picture adaptive display method and apparatus |
CN108156480A (en) * | 2017-12-27 | 2018-06-12 | 腾讯科技(深圳)有限公司 | A kind of method, relevant apparatus and the system of video caption generation |
CN111131351A (en) * | 2018-10-31 | 2020-05-08 | 中国移动通信集团广东有限公司 | Method and device for confirming model of Internet of things equipment |
CN111131351B (en) * | 2018-10-31 | 2022-09-27 | 中国移动通信集团广东有限公司 | Method and device for confirming model of Internet of things equipment |
CN110177295A (en) * | 2019-06-06 | 2019-08-27 | 北京字节跳动网络技术有限公司 | Processing method, device and the electronic equipment that subtitle crosses the border |
CN110177295B (en) * | 2019-06-06 | 2021-06-22 | 北京字节跳动网络技术有限公司 | Subtitle out-of-range processing method and device and electronic equipment |
US11924520B2 (en) | 2019-06-06 | 2024-03-05 | Beijing Bytedance Network Technology Co., Ltd. | Subtitle border-crossing processing method and apparatus, and electronic device |
Also Published As
Publication number | Publication date |
---|---|
WO2010025646A1 (en) | 2010-03-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20100310 |