CN113573058B - Interframe image coding method based on space-time significance fusion - Google Patents
Interframe image coding method based on space-time significance fusion
- Publication number
- CN113573058B, CN113573058A, CN202111112916A (application CN202111112916.2A)
- Authority
- CN
- China
- Prior art keywords
- saliency map
- time
- node
- space
- saliency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
      - H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
        - H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
          - H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
            - H04N19/124—Quantisation
          - H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
            - H04N19/146—Data rate or code amount at the encoder output
          - H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
            - H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
Abstract
The invention discloses an inter-frame image coding method based on spatio-temporal saliency fusion, which relates to the technical field of image processing and mainly comprises the following steps: acquiring a temporal saliency map from the temporal motion vector of each pixel point; extracting a single-layer graph with superpixel characteristics from the inter-frame image according to the mean features of all pixel points; obtaining a transition matrix from the single-layer graph and the mean features of the pixel points corresponding to its nodes; acquiring a spatial saliency map from the transition matrix based on Markov-chain theory; acquiring a spatio-temporal saliency map from the weighted combination of the temporal saliency map and the spatial saliency map; acquiring a saliency map from the spatio-temporal saliency map; and dynamically adjusting the quantization parameter according to the mean feature of the corresponding pixel point in the saliency map and encoding the inter-frame image with that quantization parameter. By combining the motion characteristics of the image in the temporal and spatial domains into a spatio-temporal saliency map used for coding, the encoded data retain more image information and the fidelity of the decoded data is improved.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to an interframe image coding method based on space-time significance fusion.
Background
Currently, with the increasingly widespread application of H.265/HEVC and its extended coding standards, the known perceptual computing models are mainly classified into four categories: the region-of-interest (ROI) computation model, the visual attention model, the visual sensitivity model, and the cross-modal attention model. Perceptual coding methods can in turn be classified into three categories: preprocessing methods, non-scalable coding methods, and scalable coding methods. Preprocessing methods perform visual optimization on the inter-frame images of the original video before encoding and do not require any change to the encoder. Non-scalable coding methods require changes to the codec to perform the visual optimization. Scalable coding methods only need to change the encoder to perform the visual optimization. The criterion for evaluating perceptual coding performance is the improvement in coding efficiency or visual quality; for some real-time applications, the computational complexity of the perceptual model also needs to be verified.
Although research on inter-frame image coding based on visual perception has advanced greatly in recent years, shortcomings remain. (1) Saliency computation for inter-frame images: there is currently a lack of efficient computational models suited to inter-frame image coding applications. Although international research on the saliency of static images has progressed considerably, research on saliency detection for dynamic inter-frame images is still in its infancy and has not yet formed a system. (2) The existing perceptual coding framework is not sufficiently complete: the information interaction between the inter-frame image perception module and the encoder is limited to salient-object detection, which does not favour information sharing between the two (such as foreground/background partition information, motion-type information, and the like).
With the development of the mass-media industry, the requirements on the timeliness and fidelity of video transmission are ever higher. Accordingly, research on computational models for inter-frame image coding still has considerable room for development.
Disclosure of Invention
In order to overcome the shortcomings of the prior art in coding inter-frame images, the invention provides an inter-frame image coding method based on spatio-temporal saliency fusion, which comprises the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer graph with super-pixel characteristics from the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: and dynamically adjusting the quantization parameter according to the mean value characteristic of the corresponding pixel point in the saliency map, and encoding the interframe image according to the quantization parameter.
Further, the time saliency map is composed of the time saliency value of each pixel point, where the acquisition of the time saliency value can be expressed as

$$MV(x,y)=\sqrt{MV_x(x,y)^{2}+MV_y(x,y)^{2}},\qquad S_t(i)=N\big(\widetilde{MV}(x,y)\big)$$

where (x, y) is the pixel coordinate of pixel point i, MV(x, y) is the magnitude of the temporal motion vector, $MV_x(x, y)$ is the horizontal component of the temporal motion vector and $MV_y(x, y)$ the vertical component; $\widetilde{MV}(x, y)$ is the enhanced magnitude, obtained from MV(x, y) with two constant parameters $\lambda$ and $\gamma$; $N(\cdot)$ normalizes the enhanced magnitude into the preset tone-scale range, and $S_t(i)$ is the time saliency value corresponding to pixel point i.
Further, the nodes of the single-layer graph include transient nodes and absorbing nodes, where each node is connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes, and the step S4 further includes the step of:
acquiring the weight of the edge between adjacent nodes according to the mean features of the pixel points corresponding to the nodes, and renumbering the nodes.
Further, the weight of the edge between adjacent nodes can be expressed as

$$w_{mn}=e^{-\frac{\lVert x_m-x_n\rVert}{\sigma^{2}}}$$

where m and n are two adjacent nodes in the single-layer graph, $w_{mn}$ is the weight of the edge between node m and node n, $x_m$ and $x_n$ are the mean features of the pixel points corresponding to node m and node n respectively, $\sigma$ is a constant, and e is the Euler number.
Further, the transition matrix can be represented by the following formulas:

$$a_{\tilde m\tilde n}=\begin{cases}w_{mn}, & n\in N(m)\\ 1, & \tilde m=\tilde n\\ 0, & \text{otherwise}\end{cases},\qquad d_{\tilde m\tilde m}=\sum_{\tilde n}a_{\tilde m\tilde n},\qquad P=D^{-1}A$$

where $\tilde m$ is the number of node m after renumbering, $\tilde n$ is the number of node n after renumbering, A is the adjacency matrix with entries $a_{\tilde m\tilde n}$, N(m) denotes the set of nodes connected to node m, D is the degree matrix, P is the transition matrix, and t is the number of transient nodes (the first t renumbered nodes are transient).
Further, the spatial saliency map is composed of the spatial saliency value of each pixel point, where the acquisition of the spatial saliency value is expressed as

$$P=\begin{pmatrix}Q & R\\ \mathbf{0} & I_r\end{pmatrix},\qquad y=(I_t-Q)^{-1}c,\qquad S_s(i)=\bar y(\tilde i)$$

where Q contains the transition probabilities between any two transient states after the transition matrix P is expressed as a Markov absorbing chain, $I_r$ is an r×r identity matrix, $I_t$ a t×t identity matrix, r is the number of absorbing nodes, c is a t-dimensional column vector with all elements equal to 1, and y is the absorption time of the corresponding transient node; $\tilde i$ is the renumbered index of the transient node corresponding to pixel point i, $\bar y$ is the normalized absorption-time vector of the transient nodes, and $S_s(i)$ is the spatial saliency value.
Further, the spatio-temporal saliency map consists of the spatio-temporal saliency value of each pixel point, and the acquisition of the spatio-temporal saliency value can be expressed as

$$S_{st}(i)=\omega_s S_s(i)+\omega_t S_t(i)$$

where $\omega_s$ is the weight of the spatial saliency map, $\omega_t$ is the weight of the temporal saliency map, $S_s(i)$ is the spatial saliency value of pixel point i, $S_t(i)$ is the temporal saliency value of pixel point i, and $S_{st}(i)$ is the spatio-temporal saliency value of pixel point i.
Further, the adjustment of the quantization parameter in the step S8 can be expressed as

$$QP(x,y)=QP_0+\Delta QP$$

where u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; $q_1$ and $q_2$ are the thresholds controlling the quantization parameter, each in a proportional relation to the mean feature of the corresponding pixel point i(x, y) in the saliency map; $QP_0$ is the initial value of the quantization parameter, $\Delta QP$ is the correction value of the quantization parameter QP(x, y), computed from u(x, y) and the thresholds $q_1$ and $q_2$, and Int is the rounding operation.
Compared with the prior art, the invention has at least the following effects:
(1) The inter-frame image coding method based on spatio-temporal saliency fusion of the invention combines the temporal and spatial domains of the image, taking into account the motion characteristics of the image in both domains, to obtain a spatio-temporal saliency map, and codes the video inter-frame images according to the normalized result, so that the information interaction between the perception module and the encoder is no longer limited to salient-object detection, and more foreground/background partition information arising during the change of the video inter-frame images, as well as the motion-type information of the inter-frame images, can be obtained;
(2) through the interaction of more information between the perception module and the encoder, the encoded compressed data retain more relevant information, so that video images of higher definition and fidelity are obtained when the compressed data are decoded, and the method can be applied to the compression coding of high-definition video;
(3) while the information interaction is improved, the coding quantization parameter is dynamically adjusted by means of the spatio-temporal saliency value, which reduces the coding bit rate and increases the coding speed.
Drawings
FIG. 1 is a diagram of method steps for an inter-frame image coding method based on spatio-temporal saliency fusion;
FIG. 2 is a schematic diagram of spatio-temporal saliency fusion.
Detailed Description
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
Embodiment 1
The invention aims to solve the problem in the prior art that inter-frame image coding in video involves insufficient information interaction, so that video distortion is easily introduced when the data are decoded. As shown in FIG. 1, the invention provides an inter-frame image coding method based on spatio-temporal saliency fusion, which comprises the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer graph with super-pixel characteristics from the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: and dynamically adjusting the quantization parameter according to the mean value characteristic of the corresponding pixel point in the saliency map, and encoding the interframe image according to the quantization parameter.
Considering that the human visual perception system is more sensitive to the colour-difference computing space (CIELab), and in order to make the inter-frame images decoded from the encoded video data better conform to human visual perception habits, the invention preprocesses each inter-frame image of the video after it is obtained: the colour of the inter-frame image is converted from the RGB space to the colour-difference computing space, and the mean value of each pixel point in the colour-difference computing space is computed as the feature of that pixel point.
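A minimal sketch of this preprocessing step is given below. It assumes OpenCV's Lab conversion and takes the per-pixel mean of the L, a and b channels as the mean feature; the exact definition of the mean feature is not spelled out in this description, so that choice is an assumption.

```python
import cv2
import numpy as np

def lab_mean_features(frame_bgr):
    """Convert a frame to the CIELab colour-difference space and return, for
    every pixel, the mean of its L, a, b values as the per-pixel mean feature.
    Averaging the three channels is an assumed reading of 'mean feature'."""
    lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    return lab.mean(axis=2)  # one scalar feature per pixel
```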
Since the method is based on spatio-temporal saliency fusion, it must include both temporal saliency and spatial saliency. For the temporal saliency, the invention acquires the temporal saliency map of the inter-frame image with an optical flow algorithm (Lucas-Kanade). Specifically, the horizontal component $MV_x(x, y)$ and the vertical component $MV_y(x, y)$ of the temporal motion vector of each pixel point are obtained by the optical flow algorithm, and the magnitude MV(x, y) of the temporal motion vector is then obtained from the two components:

$$MV(x,y)=\sqrt{MV_x(x,y)^{2}+MV_y(x,y)^{2}}$$

Further, the magnitude of the temporal motion vector is enhanced using two constant parameters $\lambda$ and $\gamma$; in this embodiment the value of $\lambda$ is selected as 10 and the value of $\gamma$ as 2. Finally, the enhanced magnitude $\widetilde{MV}(x, y)$ is normalized into the preset tone-scale range (selected as [0, 255] in this embodiment), and the temporal saliency value of pixel point i(x, y) is the normalized result:

$$S_t(i)=N_{[0,255]}\big(\widetilde{MV}(x,y)\big)$$

According to the obtained temporal saliency value of each pixel point and the coordinates of the pixel point in the inter-frame image, the temporal saliency values are mapped to the corresponding coordinates, thereby obtaining the temporal saliency map.
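A sketch of the temporal-saliency computation follows. Dense Farneback flow is used here as a stand-in for the per-pixel Lucas-Kanade flow named above, and the power-law form lam * MV**gamma is only an assumed shape for the unspecified enhancement with constants 10 and 2.

```python
import cv2
import numpy as np

def temporal_saliency(prev_gray, curr_gray, lam=10.0, gamma=2.0):
    """Temporal saliency map from the per-pixel motion-vector magnitude."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mv = np.sqrt(flow[..., 0] ** 2 + flow[..., 1] ** 2)  # magnitude MV(x, y)
    mv_enh = lam * mv ** gamma                            # assumed enhancement form
    # Normalise the enhanced magnitude into the preset tone-scale range [0, 255].
    return cv2.normalize(mv_enh, None, 0, 255, cv2.NORM_MINMAX)
```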
For the spatial saliency, the invention acquires a superpixel-level spatial saliency map with a Markov-chain saliency detection method. First, a single-layer graph G(V, E) with superpixel characteristics is extracted from the inter-frame image according to the mean features of all pixel points, where V and E respectively denote the nodes and edges of the single-layer graph G. On the single-layer graph G, each node V is connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes. On this basis, the weight of the edge E between two adjacent nodes m (the current node) and n (a transient node connected to the current node) is defined as

$$w_{mn}=e^{-\frac{\lVert x_m-x_n\rVert}{\sigma^{2}}}$$

where $x_m$ and $x_n$ are the mean features of the pixel points corresponding to node m and node n respectively, $\sigma$ is a constant, and e is the Euler number. The nodes can then be renumbered according to the weights, so that the first t numbered nodes are the transient nodes and the last r numbered nodes are the absorbing nodes, where t is the number of transient nodes and r is the number of absorbing nodes.
On this basis, it should further be understood that the transition matrix P on the single-layer graph can be calculated from the adjacency matrix A and the degree matrix D as $P=D^{-1}A$. Therefore, to calculate the transition matrix P, the adjacency matrix A and the degree matrix D must first be determined.
According to the weights of the edges between adjacent nodes, the adjacency matrix A can be expressed as

$$a_{\tilde m\tilde n}=\begin{cases}w_{mn}, & n\in N(m)\\ 1, & \tilde m=\tilde n\\ 0, & \text{otherwise}\end{cases}$$

and, from the adjacency matrix, the degree matrix D is the diagonal matrix with entries

$$d_{\tilde m\tilde m}=\sum_{\tilde n}a_{\tilde m\tilde n}$$

where $\tilde m$ is the number of node m after renumbering, $\tilde n$ is the number of node n after renumbering, $a_{\tilde m\tilde n}$ is the corresponding entry of the adjacency matrix, and N(m) denotes the set of nodes connected to node m; the case $\tilde m=\tilde n$ corresponds to the diagonal entries of the adjacency matrix.
Then, the absorption time of each transient node is calculated from the obtained transition matrix P based on Markov-chain theory. With the renumbered nodes ordered so that the t transient states come first and the r absorbing states last, the transition matrix can be written in the canonical form

$$P=\begin{pmatrix}Q & R\\ \mathbf{0} & I_r\end{pmatrix}$$

where Q contains the transition probabilities between any two transient states after the transition matrix P is expressed as a Markov absorbing chain, R contains the probabilities of moving from any transient state to any absorbing state, $I_r$ is an r×r identity matrix, r is the number of absorbing nodes, and c is a t-dimensional column vector with all elements equal to 1. For the absorbing chain, its basic property can be derived: the fundamental matrix

$$K=(I_t-Q)^{-1}$$

whose entry $k_{mn}$ can be understood as the expected number of times the chain visits transient node n before absorption, assuming that the chain starts in transient node m. Summing these expected counts before absorption gives the absorption time of the corresponding transient node, i.e.

$$y=Kc=(I_t-Q)^{-1}c$$

The basic idea of the Markov-chain model is to detect saliency using the temporal properties of the absorbing Markov chain. The virtual boundary nodes are taken as prior, boundary-based absorbing nodes, and saliency is calculated as the absorption time of the transient nodes. On the basis of this Markov-chain saliency model, the spatial saliency value can be expressed as

$$S_s(i)=\bar y(\tilde i)$$

where $\tilde i$ is the renumbered index of the transient node corresponding to pixel point i, $\bar y$ is the normalized absorption-time vector of the transient nodes, and $S_s(i)$ is the spatial saliency value. Then, according to the obtained spatial saliency value of each pixel point and the coordinates of the pixel point in the inter-frame image, the spatial saliency values are mapped to the corresponding coordinates, thereby obtaining the spatial saliency map.
The temporal saliency map and the spatial saliency map are obtained through the above analysis and calculation. The temporal saliency map reflects the dynamic characteristics of the inter-frame images in the video, while the spatial saliency map reflects their static characteristics; linearly fusing the two makes them complement each other. With the weight of the temporal saliency map set to $\omega_t$ and the weight of the spatial saliency map set to $\omega_s$, the spatio-temporal saliency value after linear fusion is

$$S_{st}(i)=\omega_s S_s(i)+\omega_t S_t(i)$$

where $S_{st}(i)$ is the spatio-temporal saliency value of pixel point i; $\omega_s,\omega_t\in[0,1]$ and $\omega_s+\omega_t=1$; the fusion weight is a constant that generally takes a value in the range 0.3 to 0.5.
Then, according to the obtained spatio-temporal saliency value of each pixel point and the coordinates of the pixel point in the inter-frame image, the spatio-temporal saliency values are mapped to the corresponding coordinates, thereby obtaining the spatio-temporal saliency map. The pixel-level saliency $S_{st}$ is then normalized within the preset tone-scale range ([0, 255]), and the resulting saliency value is assigned to all the pixel points it covers, giving a saliency map $S_{map}$ containing every pixel point.
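A sketch of the linear fusion and the final normalisation follows; w_t = 0.4 is one value picked from the 0.3 to 0.5 range given above (which of the two weights the constant applies to is an assumption), and both input maps are brought to a common scale before fusion.

```python
import numpy as np

def fuse_and_normalize(s_spatial_nodes, labels, s_temporal, w_t=0.4):
    """S_st = w_s*S_s + w_t*S_t per pixel, then normalised to [0, 255]."""
    w_s = 1.0 - w_t
    s_s = s_spatial_nodes[labels]                        # node saliency -> pixels
    s_t = s_temporal / max(float(s_temporal.max()), 1e-12)
    s_st = w_s * s_s + w_t * s_t
    smap = 255.0 * (s_st - s_st.min()) / max(float(np.ptp(s_st)), 1e-12)
    return smap                                          # saliency map S_map
```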
In HEVC, a video inter-frame image is divided into a number of coding block units, and the bit rate of a coding block is closely related to the quantization parameter QP and the quantization step $QP_{step}$; the value range of QP is [0, 51]. In general, the larger the quantization parameter QP, the higher the distortion of the image. Meanwhile, the foreground region of the video needs a larger allocation of data resources (bits), while the background region should save data resources (bits). Therefore, the invention further dynamically adjusts the quantization parameter based on the mean feature u(x, y) of each pixel point of the coding block in the saliency map $S_{map}$, so that the foreground region is coded with a lower QP and the background region with a higher QP; the adjusted quantization parameter is expressed as

$$QP(x,y)=QP_0+\Delta QP$$

where u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; $q_1$ and $q_2$ are the thresholds controlling the quantization parameter, each in a proportional relation to the mean feature of the corresponding pixel point i(x, y) in the saliency map (in this embodiment $q_1=0.5\times 2x$ and $q_2=0.8\times 2x$, where x is the mean feature of the corresponding pixel point); $QP_0$ is the initial value of the quantization parameter, $\Delta QP$ is the correction value of the quantization parameter QP(x, y), computed from u(x, y) and the thresholds $q_1$ and $q_2$, and Int is the rounding operation.
FIG. 2.a shows an inter-frame image from a certain video. Extracting its spatial saliency yields the spatial saliency map shown in FIG. 2.b, from which the foreground region and the background region can be clearly separated. Extracting the temporal saliency of the inter-frame image yields the temporal saliency map shown in FIG. 2.c, which expresses the moving object and the motion type. Combining the two according to the method of the invention yields the spatio-temporal saliency map shown in FIG. 2.d, in which the information of both maps is effectively combined, so that more relevant information can be retained when encoding according to the saliency map.
In summary, the inter-frame image coding method based on spatio-temporal saliency fusion provided by the invention combines the motion characteristics of the image in the temporal and spatial domains to obtain a spatio-temporal saliency map, and codes the video inter-frame images according to the normalized result, so that the information interaction between the perception module and the encoder is no longer limited to salient-object detection, and more foreground/background partition information and motion-type information of the inter-frame images can be obtained during the change of the video inter-frame images.
Through the interaction of more information between the perception module and the encoder, the encoded compressed data retain more relevant information, so that video images of higher definition and fidelity are obtained when the compressed data are decoded; the method can therefore be applied to the compression coding of high-definition video. While the information interaction is improved, the coding quantization parameter is dynamically adjusted by means of the spatio-temporal saliency value, which reduces the coding bit rate and increases the coding speed.
It should be noted that all the directional indicators (such as up, down, left, right, front, and rear … …) in the embodiment of the present invention are only used to explain the relative position relationship between the components, the movement situation, etc. in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indicator is changed accordingly.
Moreover, descriptions in the present invention relating to "first", "second", and the like are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two or three, unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "connected," "secured," and the like are to be construed broadly, and for example, "secured" may be a fixed connection, a removable connection, or an integral part; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but it must be based on the realization of those skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination of technical solutions should not be considered to exist, and is not within the protection scope of the present invention.
Claims (3)
1. An interframe image coding method based on space-time significance fusion is characterized by comprising the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer graph with super-pixel characteristics from the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: dynamically adjusting quantization parameters according to the mean value characteristics of corresponding pixel points in the saliency map, and encoding the interframe images according to the quantization parameters;
in the step S4, the nodes of the single-layer graph include transient nodes and absorbing nodes, where each node is connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes, and the step S4 further includes the step of:
acquiring the weight of edges between adjacent nodes according to the mean value characteristics of the pixels corresponding to the nodes, and renumbering the nodes;
the weight of the edge between the adjacent nodes can be expressed as:
wherein m and n are two adjacent nodes in the single-layer graph, and wmnIs the weight of the edge between node m and node n, xm、xnThe corresponding pixel points of the node m and the node n are respectivelyThe characteristics of the values are such that,is a constant number of times, and is,is the Euler number;
the transition matrix can be represented by the following formulas:

$$a_{\tilde m\tilde n}=\begin{cases}w_{mn}, & n\in N(m)\\ 1, & \tilde m=\tilde n\\ 0, & \text{otherwise}\end{cases},\qquad d_{\tilde m\tilde m}=\sum_{\tilde n}a_{\tilde m\tilde n},\qquad P=D^{-1}A$$

where $\tilde m$ is the number of node m after renumbering, $\tilde n$ is the number of node n after renumbering, A is the adjacency matrix with entries $a_{\tilde m\tilde n}$, N(m) denotes the set of nodes connected to node m, D is the degree matrix, P is the transition matrix, and t is the number of transient nodes;
in the step S5, the spatial saliency map is composed of the spatial saliency value of each pixel point, where the acquisition of the spatial saliency value is expressed as

$$P=\begin{pmatrix}Q & R\\ \mathbf{0} & I_r\end{pmatrix},\qquad y=(I_t-Q)^{-1}c,\qquad S_s(i)=\bar y(\tilde i)$$

where Q contains the transition probabilities between any two transient states after the transition matrix P is expressed as a Markov absorbing chain, $I_r$ is an r×r identity matrix, $I_t$ a t×t identity matrix, r is the number of absorbing nodes, c is a t-dimensional column vector with all elements equal to 1, and y is the absorption time of the corresponding transient node; $\tilde i$ is the renumbered index of the transient node corresponding to pixel point i, $\bar y$ is the normalized absorption-time vector of the transient nodes, and $S_s(i)$ is the spatial saliency value;
the adjustment of the quantization parameter in the step S8 can be expressed as

$$QP(x,y)=QP_0+\Delta QP$$

where u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; $q_1$ and $q_2$ are the thresholds controlling the quantization parameter, each in a proportional relation to the mean feature of the corresponding pixel point i(x, y) in the saliency map; $QP_0$ is the initial value of the quantization parameter, $\Delta QP$ is the correction value of the quantization parameter QP(x, y), computed from u(x, y) and the thresholds $q_1$ and $q_2$, and Int is the rounding operation.
2. The method as claimed in claim 1, wherein the temporal saliency map is formed by the temporal saliency value of each pixel point, and the temporal saliency value is obtained as

$$MV(x,y)=\sqrt{MV_x(x,y)^{2}+MV_y(x,y)^{2}},\qquad S_t(i)=N\big(\widetilde{MV}(x,y)\big)$$

where (x, y) is the pixel coordinate of pixel point i, MV(x, y) is the magnitude of the temporal motion vector, $MV_x(x, y)$ is the horizontal component of the temporal motion vector and $MV_y(x, y)$ the vertical component; $\widetilde{MV}(x, y)$ is the enhanced magnitude, obtained from MV(x, y) with two constant parameters; $N(\cdot)$ normalizes the enhanced magnitude into the preset tone-scale range, and $S_t(i)$ is the temporal saliency value corresponding to pixel point i.
3. The method as claimed in claim 1, wherein the spatio-temporal saliency map is composed of the spatio-temporal saliency value of each pixel point, and the spatio-temporal saliency value is obtained by the formula

$$S_{st}(i)=\omega_s S_s(i)+\omega_t S_t(i)$$

where $\omega_s$ is the weight of the spatial saliency map, $\omega_t$ is the weight of the temporal saliency map, $S_s(i)$ and $S_t(i)$ are the spatial and temporal saliency values of pixel point i, and $S_{st}(i)$ is the spatio-temporal saliency value of pixel point i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111112916.2A CN113573058B (en) | 2021-09-23 | 2021-09-23 | Interframe image coding method based on space-time significance fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111112916.2A CN113573058B (en) | 2021-09-23 | 2021-09-23 | Interframe image coding method based on space-time significance fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113573058A CN113573058A (en) | 2021-10-29 |
CN113573058B true CN113573058B (en) | 2021-11-30 |
Family
ID=78174214
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111112916.2A Active CN113573058B (en) | 2021-09-23 | 2021-09-23 | Interframe image coding method based on space-time significance fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113573058B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113965753B (en) * | 2021-12-20 | 2022-05-17 | 康达洲际医疗器械有限公司 | Inter-frame image motion estimation method and system based on code rate control |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103458238A (en) * | 2012-11-14 | 2013-12-18 | 深圳信息职业技术学院 | Scalable video code rate controlling method and device combined with visual perception |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4705959B2 (en) * | 2005-01-10 | 2011-06-22 | トムソン ライセンシング | Apparatus and method for creating image saliency map |
CN106611427B (en) * | 2015-10-21 | 2019-11-15 | 中国人民解放军理工大学 | Saliency detection method based on candidate region fusion |
CN109076200B (en) * | 2016-01-12 | 2021-04-23 | 上海科技大学 | Method and device for calibrating panoramic stereo video system |
CN106295542A (en) * | 2016-08-03 | 2017-01-04 | 江苏大学 | A kind of road target extracting method of based on significance in night vision infrared image |
WO2018206551A1 (en) * | 2017-05-09 | 2018-11-15 | Koninklijke Kpn N.V. | Coding spherical video data |
CN107749066A (en) * | 2017-11-10 | 2018-03-02 | 深圳市唯特视科技有限公司 | A kind of multiple dimensioned space-time vision significance detection method based on region |
US11122314B2 (en) * | 2017-12-12 | 2021-09-14 | Google Llc | Bitrate optimizations for immersive multimedia streaming |
CN108734173A (en) * | 2018-04-20 | 2018-11-02 | 河海大学 | Infrared video time and space significance detection method based on Gestalt optimizations |
CN109547803B (en) * | 2018-11-21 | 2020-06-09 | 北京航空航天大学 | Time-space domain significance detection and fusion method |
CN111310768B (en) * | 2020-01-20 | 2023-04-18 | 安徽大学 | Saliency target detection method based on robustness background prior and global information |
CN113259664B (en) * | 2021-07-15 | 2021-11-16 | 康达洲际医疗器械有限公司 | Video compression method based on image binary identification |
- 2021-09-23: CN application CN202111112916.2A filed; granted as patent CN113573058B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN113573058A (en) | 2021-10-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |