CN106375764A - Directional intra prediction and block copy prediction combined video intra coding method

Info

Publication number: CN106375764A
Application number: CN201610795564.8A
Authority: CN
Inventors: 刘东; 李跃; 吴枫; 李厚强
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2016-08-31
Filing date: 2016-08-31
Publication date: 2017-02-01
Anticipated expiration: 2036-08-31
Also published as: CN106375764B

Abstract

The invention discloses a directional intra prediction and block copy prediction combined video intra coding method. According to the method, a coding block is segmented into two sub-blocks in a flexible block segmentation mode, so prediction content in the same block can contain both local information and non-local information; and prediction of each block with least cost is decided through RDO (Rate Distortion Optimization). Moreover, rapid RDO and complete RDO are combined, and on the premise of not influencing the coding performance, the coding complexity is reduced. The general effect is that the complexity of a decoding side is basically not increased, and the higher compression efficiency is obtained.

Description

Video intra-frame coding method combining direction prediction and block copy prediction

Technical Field

The invention relates to the technical field of video coding, in particular to a video intra-frame coding method combining direction prediction and block copy prediction.

Background

In recent years, with the rapid development of the internet, demand for video applications on the internet has kept growing. Because video data volumes are very large, transmitting video over bandwidth-limited networks first requires solving the problem of video compression coding.

The established video coding standards all belong to the hybrid video coding framework. Hybrid video coding generally consists of the following parts: prediction, transform, quantization and entropy coding. Prediction is generally classified into intra prediction and inter prediction. Video frames that can only use intra prediction modes are called I-frames, and video frames that can use either intra or inter prediction modes are called P-frames or B-frames. Intra prediction removes spatial redundancy by using, as a reference, already reconstructed pixels in the same frame as the current coding block. Inter prediction removes temporal redundancy by using the reconstructed pixels of other frames as references. Inter prediction is generally more accurate, but the first frame of a video, and any frame inserted as a random access point, must be an I-frame and can only use intra prediction. Further improving the accuracy of intra prediction, and thereby the compression efficiency of I-frames, is therefore an urgent need in video coding.

Currently, the latest standard for general video coding is High Efficiency Video Coding (HEVC). Intra prediction in HEVC has 35 modes in total: mode 0 indicates prediction using the Planar method, mode 1 indicates prediction using the DC method, and modes 2 to 34 indicate directional extrapolation. A common feature of these modes is that they extrapolate from reconstructed pixels in the immediate vicinity of the current coding block; they are referred to collectively below as Directional Intra Prediction (DIP).
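As a small illustration of the mode numbering just described, the following sketch classifies an HEVC intra mode index. The function name `hevc_intra_mode_kind` is made up for this example and is not part of any codec API.

```python
def hevc_intra_mode_kind(mode: int) -> str:
    """Classify an HEVC intra prediction mode index (0..34)."""
    if mode == 0:
        return "Planar"
    if mode == 1:
        return "DC"
    if 2 <= mode <= 34:
        return "Directional"
    raise ValueError(f"HEVC defines intra modes 0..34, got {mode}")


# Classify every mode once; 33 of the 35 modes are directional.
kinds = [hevc_intra_mode_kind(m) for m in range(35)]
print(kinds.count("Directional"))
```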

In addition, some works propose other intra prediction methods, such as:

Combining extrapolation predictions in two directions (Y. Ye and M. Karczewicz, "Improved H.264 intra coding based on bi-directional intra prediction, directional transform, and adaptive coefficient scanning," Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on, San Diego, CA, 2008, pp. 2116-2119). However, directional prediction in HEVC, and weighted prediction combining two directions, consider only local correlation in video frames and not non-local correlation.

Edge-based prediction (D. Liu, X. Sun, F. Wu, and Y.-Q. Zhang, "Edge-oriented uniform intra prediction," IEEE Transactions on Image Processing, 17(10), 2008, pp. 1827-1836) and prediction based on template matching (T. K. Tan, C. S. Boon, and Y. Suzuki, "Intra prediction by template matching," in Image Processing, 2006 IEEE International Conference on, Oct. 2006, pp. 1693-1696). However, edge-based prediction and template-matching-based prediction both have high complexity at the decoding end.

Prediction based on intra block copy (Siu-Leong Yu and Christos Chrysafis, "Intra-prediction using intra-macroblock motion compensation," U.S. Pat. No. 7,120,196, Oct. 2006). Intra block copy prediction searches the reconstructed area of the current frame for the reference block most similar to the current coding block, uses it as the prediction, and encodes the spatial motion vector between that reference block and the current block; it is well suited to frames that contain repetitive texture patterns. This technique was adopted in a later extension of HEVC, Screen Content Coding (SCC), because repetitive patterns of regular shape, such as characters, often appear in such videos. Block copy prediction works well for screen content but not for natural video. This is mainly because natural video has complex capture angles and heavy noise; even where repetitive textures exist, they are mostly irregular in shape and rarely aligned with square coding units, so similar reference blocks cannot be found.
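The block-matching idea behind intra block copy can be sketched as below. This is a simplified illustration, not the search used in HEVC SCC or in this method: it compares blocks inside a single pixel array (a real encoder compares original pixels with reconstructed ones), and `candidate_mvs` is a hypothetical list that the caller is assumed to have restricted to the already reconstructed area.

```python
def sad(frame, x0, y0, x1, y1, size):
    """Sum of absolute differences between two size x size blocks of one frame."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(frame[y0 + r][x0 + c] - frame[y1 + r][x1 + c])
    return total


def ibc_search(frame, x, y, size, candidate_mvs):
    """Return (best_mv, best_sad) over the candidate motion vectors (dx, dy)."""
    best_mv, best_sad = None, None
    for dx, dy in candidate_mvs:
        cost = sad(frame, x, y, x + dx, y + dy, size)
        if best_sad is None or cost < best_sad:
            best_mv, best_sad = (dx, dy), cost
    return best_mv, best_sad


# A 4x8 toy frame whose right half repeats the left half exactly.
frame = [[r * 4 + c % 4 for c in range(8)] for r in range(4)]
best_mv, best_sad = ibc_search(frame, 4, 0, 4, [(-4, 0), (-3, 0), (-2, 0)])
print(best_mv, best_sad)
```

Only the winning motion vector needs to be signalled, which is why the decoder-side cost of this kind of prediction stays low.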

Disclosure of Invention

The invention aims to provide a video intra-frame coding method combining direction prediction and block copy prediction, which can consider local correlation and non-local correlation in a video frame and ensure that the complexity of a decoding end is lower.

The purpose of the invention is realized by the following technical scheme:

a method of video intra-coding combining directional prediction and block copy prediction, comprising:

dividing a current coding block by utilizing a predefined division template containing a plurality of division types, wherein each division type divides the current coding block into two sub-blocks;

performing intra block copy (IBC) prediction on all sub-blocks: determining a search range for IBC prediction, performing IBC prediction on all sub-blocks using each reference block in the search range, determining the optimal motion vector of each sub-block, and recording the prediction error of each sub-block obtained with its optimal motion vector;

performing DIP prediction on all sub-blocks to obtain the optimal prediction direction of each sub-block, and recording the prediction error of each sub-block obtained with its optimal prediction direction;

after the prediction errors of all sub-blocks under IBC prediction and DIP prediction are obtained, determining, among all division types, the optimal division and the corresponding prediction combination through fast Rate Distortion Optimization (RDO);

and comparing, through complete RDO, the coding cost of IBC prediction on the current coding block, the coding cost of DIP prediction on the current coding block, and the coding cost of the optimal division with its corresponding prediction combination, and selecting the prediction mode with the minimum coding cost as the final prediction mode of the current coding block.

Further, the plurality of partition types include:

28 division types covering the horizontal direction, the vertical direction, the diagonal direction from top right to bottom left, and the diagonal direction from top left to bottom right, with 7 division types in each direction, wherein the division types of the horizontal and vertical directions are collectively called rectangular division, and the division types of the two diagonal directions are collectively called triangular division;

or 7 division types with an L-shape at the upper left and 7 division types with an L-shape at the upper right, each of which divides the current coding block into two sub-blocks: one sub-block is a square block, and the other is an L-shaped block formed by two adjacent rectangular blocks.
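To make the rectangular division concrete, the sketch below builds per-pixel masks for the 7 horizontal or vertical division types of an N x N block. The text does not state where the 7 cuts lie, so the positions used here (multiples of N/8) are an assumption made only for illustration, and `rect_masks` is a hypothetical helper name.

```python
def rect_masks(n, direction):
    """Return 7 masks for an n x n block.

    mask[r][c] is 1 for the first sub-block (upper or left) and 2 for the
    second sub-block (lower or right). direction is "H" or "V".
    """
    masks = []
    for k in range(1, 8):
        cut = k * n // 8  # assumed cut position, not taken from the source
        if direction == "H":
            mask = [[1 if r < cut else 2 for _ in range(n)] for r in range(n)]
        else:
            mask = [[1 if c < cut else 2 for c in range(n)] for _ in range(n)]
        masks.append(mask)
    return masks


h_masks = rect_masks(8, "H")
v_masks = rect_masks(8, "V")
print(h_masks[0][0], h_masks[0][1])  # first row is sub-block 1, second row sub-block 2
```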

Further, if the plurality of partition types are 28 partition types including a horizontal direction, a vertical direction, a diagonal direction from top right to bottom left, and a diagonal direction from top left to bottom right, determining an optimal motion vector of each sub-block when performing intra block copy IBC prediction on all sub-blocks, and recording a prediction error of each sub-block, which is predicted by the optimal motion vector, includes:

marking 7 divisions in the horizontal direction as H1-H7, and 7 divisions in the vertical direction as V1-V7; 7 types of divisions in the diagonal direction from the upper right to the lower left are marked as T1-T7, and 7 types of divisions in the diagonal direction from the upper left to the lower right are marked as D1-D7;

after the search of each reference block in the search range is completed, the minimum prediction error and the corresponding optimal motion vector of each sub-block in each partition are obtained, and the following marks are carried out:

the optimal motion vectors (MVs) of the upper and lower sub-blocks of the 1st partition type H1 in the horizontal direction are denoted MV_IBC_H1_1 and MV_IBC_H1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_H1_1 and SAD_IBC_H1_2; by analogy, the optimal motion vectors of the upper and lower sub-blocks of the 7th partition type H7 are denoted MV_IBC_H7_1 and MV_IBC_H7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_H7_1 and SAD_IBC_H7_2;

the optimal motion vectors of the left and right sub-blocks of the 1st partition type V1 in the vertical direction are denoted MV_IBC_V1_1 and MV_IBC_V1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_V1_1 and SAD_IBC_V1_2; by analogy, the optimal motion vectors of the left and right sub-blocks of the 7th partition type V7 are denoted MV_IBC_V7_1 and MV_IBC_V7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_V7_1 and SAD_IBC_V7_2;

the optimal motion vectors of the upper left and lower right sub-blocks of the 1st partition type T1 in the diagonal direction from top right to bottom left are denoted MV_IBC_T1_1 and MV_IBC_T1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_T1_1 and SAD_IBC_T1_2; by analogy, the optimal motion vectors of the upper left and lower right sub-blocks of the 7th partition type T7 are denoted MV_IBC_T7_1 and MV_IBC_T7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_T7_1 and SAD_IBC_T7_2;

the optimal motion vectors of the upper right and lower left sub-blocks of the 1st partition type D1 in the diagonal direction from top left to bottom right are denoted MV_IBC_D1_1 and MV_IBC_D1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_D1_1 and SAD_IBC_D1_2; by analogy, the optimal motion vectors of the upper right and lower left sub-blocks of the 7th partition type D7 are denoted MV_IBC_D7_1 and MV_IBC_D7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_D7_1 and SAD_IBC_D7_2.

Further, the performing DIP prediction on all sub-blocks to obtain an optimal prediction direction of each sub-block, and recording a prediction error of each sub-block, which is obtained by predicting the optimal prediction direction, includes:

DIP prediction determines the optimal prediction direction among the 35 intra prediction modes of HEVC: after the predicted values for each mode are generated, the prediction error of each sub-block between the predicted block and the current coding block is calculated; after all 35 prediction modes have been traversed, the minimum prediction error and the corresponding intra prediction direction of each sub-block in each partition are obtained and labeled as follows:

the optimal prediction directions of the upper and lower sub-blocks of the 1st partition type H1 in the horizontal direction are denoted Mode_DIP_H1_1 and Mode_DIP_H1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_H1_1 and SAD_DIP_H1_2; by analogy, the optimal prediction directions of the upper and lower sub-blocks of the 7th partition type H7 are denoted Mode_DIP_H7_1 and Mode_DIP_H7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_H7_1 and SAD_DIP_H7_2;

the optimal prediction directions of the left and right sub-blocks of the 1st partition type V1 in the vertical direction are denoted Mode_DIP_V1_1 and Mode_DIP_V1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_V1_1 and SAD_DIP_V1_2; by analogy, the optimal prediction directions of the left and right sub-blocks of the 7th partition type V7 are denoted Mode_DIP_V7_1 and Mode_DIP_V7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_V7_1 and SAD_DIP_V7_2;

the optimal prediction directions of the upper left and lower right sub-blocks of the 1st partition type T1 in the diagonal direction from top right to bottom left are denoted Mode_DIP_T1_1 and Mode_DIP_T1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_T1_1 and SAD_DIP_T1_2; by analogy, the optimal prediction directions of the upper left and lower right sub-blocks of the 7th partition type T7 are denoted Mode_DIP_T7_1 and Mode_DIP_T7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_T7_1 and SAD_DIP_T7_2;

the optimal prediction directions of the upper right and lower left sub-blocks of the 1st partition type D1 in the diagonal direction from top left to bottom right are denoted Mode_DIP_D1_1 and Mode_DIP_D1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_D1_1 and SAD_DIP_D1_2; by analogy, the optimal prediction directions of the upper right and lower left sub-blocks of the 7th partition type D7 are denoted Mode_DIP_D7_1 and Mode_DIP_D7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_D7_1 and SAD_DIP_D7_2.
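The bookkeeping described for IBC motion vectors and for DIP directions has the same shape: each candidate prediction of the whole block is scored once per sub-block of every partition, and the minimum is kept per sub-block. A minimal sketch of that pattern follows; the names `subblock_sads` and `best_per_subblock`, the mask representation (1 or 2 per pixel) and the toy data are assumptions for illustration only.

```python
def subblock_sads(cur, pred, mask):
    """Return (SAD of sub-block 1, SAD of sub-block 2) for one partition mask."""
    sums = {1: 0, 2: 0}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            sums[label] += abs(cur[r][c] - pred[r][c])
    return sums[1], sums[2]


def best_per_subblock(cur, candidates, masks):
    """Map (partition name, sub-block index) -> (min SAD, best candidate key).

    candidates maps a key (a motion vector for IBC, a mode index for DIP)
    to the predicted block that key produces.
    """
    best = {}
    for choice, pred in candidates.items():
        for name, mask in masks.items():
            for sub, err in enumerate(subblock_sads(cur, pred, mask), start=1):
                key = (name, sub)
                if key not in best or err < best[key][0]:
                    best[key] = (err, choice)
    return best


cur = [[10, 10], [20, 20]]
masks = {"H1": [[1, 1], [2, 2]]}  # toy 2x2 block split into upper and lower halves
candidates = {"a": [[10, 10], [0, 0]], "b": [[0, 0], [20, 20]]}
best = best_per_subblock(cur, candidates, masks)
print(best)
```

In this toy case candidate "a" wins the upper sub-block and candidate "b" wins the lower one, which mirrors how SAD_IBC_H1_1 and SAD_IBC_H1_2 can come from different motion vectors.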

Further, the determining the optimal partition and the corresponding prediction combination in all partition types through the fast rate distortion optimization RDO includes:

determining the partition and prediction combination with the minimum total cost in 7 partition types in the horizontal direction, the vertical direction, the diagonal direction from the top right to the bottom left and the diagonal direction from the top left to the bottom right respectively, wherein the process is as follows:

for the 7 division types H1-H7 in the horizontal direction, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_H\_i\_1} = \mathrm{SAD\_IBC\_Hi\_1} + \mathrm{SAD\_DIP\_Hi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_1}, \mathrm{Mode\_DIP\_Hi\_2}) \\
\mathrm{C\_H\_i\_2} = \mathrm{SAD\_IBC\_Hi\_2} + \mathrm{SAD\_DIP\_Hi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_2}, \mathrm{Mode\_DIP\_Hi\_1})
\end{matrix};

in the above formula, i = 1, 2, ..., 7; λ is a Lagrange multiplier determined by a parameter specified by the encoder, and the Bits function estimates the number of bits consumed by entropy coding the MV and the DIP direction; the minimum among all C_H_i_1 and C_H_i_2 is found and denoted C_H_k1_p1, and the corresponding prediction mode is denoted (k1, p1), wherein k1 is a number from 1 to 7 indicating that the division type is Hk1; p1 is 1 or 2 and indicates the prediction combination: 1 means that the upper part uses IBC prediction and the lower part uses DIP prediction, and 2 is the opposite;
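The fast RDO step for one direction can be sketched as follows. This is an illustration of the cost formula above under stated assumptions: `fast_rdo` is a hypothetical name, the per-partition SAD pairs are passed in as plain lists, `bits` is a stand-in for the encoder's bit estimate, and the example uses two partitions with made-up numbers instead of seven.

```python
import math


def fast_rdo(sad_ibc, sad_dip, bits, lam):
    """Return (min cost, k, p) over all partitions of one direction.

    sad_ibc[i] and sad_dip[i] hold the (sub-block 1, sub-block 2) errors of
    partition i + 1; bits(k, p) is a caller-supplied estimate of the bits
    needed for the MV and the DIP direction of combination (k, p).
    """
    best = None
    for i in range(len(sad_ibc)):
        for p in (1, 2):
            ibc_sub = p - 1        # index of the sub-block predicted by IBC
            dip_sub = 1 - ibc_sub  # the other sub-block is predicted by DIP
            cost = (sad_ibc[i][ibc_sub] + sad_dip[i][dip_sub]
                    + math.sqrt(lam) * bits(i + 1, p))
            if best is None or cost < best[0]:
                best = (cost, i + 1, p)
    return best


sad_ibc = [(100, 5), (50, 60)]   # made-up errors for two partitions
sad_dip = [(8, 90), (70, 40)]
best = fast_rdo(sad_ibc, sad_dip, bits=lambda k, p: 4, lam=4.0)
print(best)
```

Because only SAD and a bit estimate are involved, no transform, quantization or entropy coding is run at this stage; that is left to the complete RDO applied to the surviving candidate.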

for the 7 division types V1-V7 in the vertical direction, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_V\_i\_1} = \mathrm{SAD\_IBC\_Vi\_1} + \mathrm{SAD\_DIP\_Vi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_1}, \mathrm{Mode\_DIP\_Vi\_2}) \\
\mathrm{C\_V\_i\_2} = \mathrm{SAD\_IBC\_Vi\_2} + \mathrm{SAD\_DIP\_Vi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_2}, \mathrm{Mode\_DIP\_Vi\_1})
\end{matrix};

the minimum among all C_V_i_1 and C_V_i_2 in the above formula is found and denoted C_V_k2_p2, and the corresponding prediction mode is denoted (k2, p2), wherein k2 is a number from 1 to 7 indicating that the division type is Vk2; p2 is 1 or 2 and indicates the prediction combination: 1 means that the left part uses IBC prediction and the right part uses DIP prediction, and 2 is the opposite;

for the 7 division types T1-T7 in the diagonal direction from top right to bottom left, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_T\_i\_1} = \mathrm{SAD\_IBC\_Ti\_1} + \mathrm{SAD\_DIP\_Ti\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_1}, \mathrm{Mode\_DIP\_Ti\_2}) \\
\mathrm{C\_T\_i\_2} = \mathrm{SAD\_IBC\_Ti\_2} + \mathrm{SAD\_DIP\_Ti\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_2}, \mathrm{Mode\_DIP\_Ti\_1})
\end{matrix};

the minimum among all C_T_i_1 and C_T_i_2 in the above formula is found and denoted C_T_k3_p3, and the corresponding prediction mode is denoted (k3, p3), wherein k3 is a number from 1 to 7 indicating that the division type is Tk3; p3 is 1 or 2 and indicates the prediction combination: 1 means that the upper left part uses IBC prediction and the lower right part uses DIP prediction, and 2 is the opposite;

for the 7 division types D1-D7 in the diagonal direction from top left to bottom right, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_D\_i\_1} = \mathrm{SAD\_IBC\_Di\_1} + \mathrm{SAD\_DIP\_Di\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_1}, \mathrm{Mode\_DIP\_Di\_2}) \\
\mathrm{C\_D\_i\_2} = \mathrm{SAD\_IBC\_Di\_2} + \mathrm{SAD\_DIP\_Di\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_2}, \mathrm{Mode\_DIP\_Di\_1})
\end{matrix};

the minimum among all C_D_i_1 and C_D_i_2 in the above formula is found and denoted C_D_k4_p4, and the corresponding prediction mode is denoted (k4, p4), wherein k4 is a number from 1 to 7 indicating that the division type is Dk4; p4 is 1 or 2 and indicates the prediction combination: 1 means that the upper right part uses IBC prediction and the lower left part uses DIP prediction, and 2 is the opposite.
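The four cost definitions above share one form. As a compact restatement (the indices X, p and 3 - p are shorthand introduced here and do not appear in the source), for each direction X in {H, V, T, D}, each i = 1, ..., 7 and each p in {1, 2}:

```latex
C_{X,i,p} = \mathrm{SAD\_IBC}_{Xi,\,p} + \mathrm{SAD\_DIP}_{Xi,\,3-p}
          + \sqrt{\lambda} \times \mathrm{Bits}\bigl(\mathrm{MV\_IBC}_{Xi,\,p},\ \mathrm{Mode\_DIP}_{Xi,\,3-p}\bigr),
\qquad
(k_X, p_X) = \arg\min_{i,\,p} C_{X,i,p}
```

Counting the work: fast RDO scores 4 directions x 7 division types x 2 prediction combinations = 56 candidates using only SAD and a bit estimate, and exactly one candidate per direction, four in total, is carried forward.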

Further, the selecting, through complete RDO comparison, a prediction mode with the minimum coding cost as a final prediction mode of the current coding block includes:

carrying out DIP prediction on the current coding block, and recording the coding cost calculated by complete RDO as C_DIP;

carrying out IBC prediction on the current coding block, and recording the coding cost calculated by complete RDO as C_IBC;

for the horizontal direction, according to the prediction mode (k1, p1), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Hk1_p1, the intra prediction direction of the other sub-block (Mode_DIP_Hk1_2 when p1 = 1, Mode_DIP_Hk1_1 when p1 = 2) and the partition type Hk1, and recording the coding cost calculated by complete RDO as C_H;

for the vertical direction, according to the prediction mode (k2, p2), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Vk2_p2, the intra prediction direction of the other sub-block (Mode_DIP_Vk2_2 when p2 = 1, Mode_DIP_Vk2_1 when p2 = 2) and the partition type Vk2, and recording the coding cost calculated by complete RDO as C_V;

for the diagonal direction from top right to bottom left, according to the prediction mode (k3, p3), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Tk3_p3, the intra prediction direction of the other sub-block (Mode_DIP_Tk3_2 when p3 = 1, Mode_DIP_Tk3_1 when p3 = 2) and the partition type Tk3, and recording the coding cost calculated by complete RDO as C_T;

for the diagonal direction from top left to bottom right, according to the prediction mode (k4, p4), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Dk4_p4, the intra prediction direction of the other sub-block (Mode_DIP_Dk4_2 when p4 = 1, Mode_DIP_Dk4_1 when p4 = 2) and the partition type Dk4, and recording the coding cost calculated by complete RDO as C_D;

and comparing C_DIP, C_IBC, C_H, C_V, C_T and C_D, and taking the prediction mode corresponding to the minimum coding cost as the final prediction mode of the current coding block.
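The final comparison is a minimum over six complete-RDO costs. A minimal sketch, with a hypothetical function name and made-up cost values:

```python
def select_final_mode(costs):
    """Return the key of the smallest cost; ties go to the earliest entry."""
    return min(costs, key=costs.get)


# Made-up complete-RDO costs for one coding block.
costs = {
    "DIP": 1200.0,  # C_DIP: whole block predicted by DIP
    "IBC": 1350.0,  # C_IBC: whole block predicted by IBC
    "H": 1100.0,    # C_H: best horizontal division and combination
    "V": 1180.0,    # C_V
    "T": 1250.0,    # C_T
    "D": 1300.0,    # C_D
}
final_mode = select_final_mode(costs)
print(final_mode)
```

Complete RDO is thus run on only six candidates per block (two whole-block modes plus one survivor per direction) rather than on every division and combination.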

Further, if the multiple partition types include 7 partition types with an L-shape at the top left and 7 partition types with an L-shape at the top right, determining the optimal motion vector of each sub-block when performing intra block copy IBC prediction on all sub-blocks, and recording a prediction error of each sub-block, which is predicted by the optimal motion vector, includes:

recording 7 partition types with L-shaped upper left as L1-L7, and recording 7 partition types with L-shaped upper right as R1-R7;

the optimal motion vectors (MVs) of the upper left and lower right sub-blocks of the 1st partition type L1 with an L-shape at the upper left are denoted MV_IBC_L1_1 and MV_IBC_L1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_L1_1 and SAD_IBC_L1_2; by analogy, the optimal motion vectors of the upper left and lower right sub-blocks of the 7th partition type L7 are denoted MV_IBC_L7_1 and MV_IBC_L7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_L7_1 and SAD_IBC_L7_2;

the optimal motion vectors of the upper right and lower left sub-blocks of the 1st partition type R1 with an L-shape at the upper right are denoted MV_IBC_R1_1 and MV_IBC_R1_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_R1_1 and SAD_IBC_R1_2; by analogy, the optimal motion vectors of the upper right and lower left sub-blocks of the 7th partition type R7 are denoted MV_IBC_R7_1 and MV_IBC_R7_2, respectively, and the corresponding prediction errors are denoted SAD_IBC_R7_1 and SAD_IBC_R7_2.

the optimal prediction directions of the upper left and lower right sub-blocks of the 1st partition type L1 with an L-shape at the upper left are denoted Mode_DIP_L1_1 and Mode_DIP_L1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_L1_1 and SAD_DIP_L1_2; by analogy, the optimal prediction directions of the upper left and lower right sub-blocks of the 7th partition type L7 are denoted Mode_DIP_L7_1 and Mode_DIP_L7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_L7_1 and SAD_DIP_L7_2;

the optimal prediction directions of the upper right and lower left sub-blocks of the 1st partition type R1 with an L-shape at the upper right are denoted Mode_DIP_R1_1 and Mode_DIP_R1_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_R1_1 and SAD_DIP_R1_2; by analogy, the optimal prediction directions of the upper right and lower left sub-blocks of the 7th partition type R7 are denoted Mode_DIP_R7_1 and Mode_DIP_R7_2, respectively, and the corresponding prediction errors are denoted SAD_DIP_R7_1 and SAD_DIP_R7_2.

determining the partition and prediction combination with the minimum overall cost in 7 partition types of L-shaped at the upper left and L-shaped at the upper right, wherein the process is as follows:

for the 7 partition types L1-L7 with an L-shape at the upper left, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_L\_i\_1} = \mathrm{SAD\_IBC\_Li\_1} + \mathrm{SAD\_DIP\_Li\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_1}, \mathrm{Mode\_DIP\_Li\_2}) \\
\mathrm{C\_L\_i\_2} = \mathrm{SAD\_IBC\_Li\_2} + \mathrm{SAD\_DIP\_Li\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_2}, \mathrm{Mode\_DIP\_Li\_1})
\end{matrix};

in the above formula, i = 1, 2, ..., 7; the Bits function estimates the number of bits consumed by entropy coding the MV and the DIP direction; the minimum among all C_L_i_1 and C_L_i_2 is found and denoted C_L_k1_p1, and the corresponding prediction mode is denoted (k1, p1), wherein k1 is a number from 1 to 7 indicating that the division type is Lk1; p1 is 1 or 2 and indicates the prediction combination: 1 means that the upper left L-shaped part uses IBC prediction and the lower right square part uses DIP prediction, and 2 is the opposite;

for the 7 partition types R1-R7 with an L-shape at the upper right, there are 14 combinations, and the corresponding costs are:

\begin{matrix}
\mathrm{C\_R\_i\_1} = \mathrm{SAD\_IBC\_Ri\_1} + \mathrm{SAD\_DIP\_Ri\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_1}, \mathrm{Mode\_DIP\_Ri\_2}) \\
\mathrm{C\_R\_i\_2} = \mathrm{SAD\_IBC\_Ri\_2} + \mathrm{SAD\_DIP\_Ri\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_2}, \mathrm{Mode\_DIP\_Ri\_1})
\end{matrix};

the minimum among all C_R_i_1 and C_R_i_2 is found and denoted C_R_k2_p2, and the corresponding prediction mode is denoted (k2, p2), wherein k2 is a number from 1 to 7 indicating that the division type is Rk2; p2 is 1 or 2 and indicates the prediction combination: 1 means that the upper right L-shaped part uses IBC prediction and the lower left square part uses DIP prediction, and 2 is the opposite.

for the partition type with an L-shape at the upper left, according to the prediction mode (k1, p1), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Lk1_p1, the intra prediction direction of the other sub-block (Mode_DIP_Lk1_2 when p1 = 1, Mode_DIP_Lk1_1 when p1 = 2) and the partition type Lk1, and recording the coding cost calculated by complete RDO as C_L;

for the partition type with an L-shape at the upper right, according to the prediction mode (k2, p2), generating a prediction block of the current block from the corresponding motion vector MV_IBC_Rk2_p2, the intra prediction direction of the other sub-block (Mode_DIP_Rk2_2 when p2 = 1, Mode_DIP_Rk2_1 when p2 = 2) and the partition type Rk2, and recording the coding cost calculated by complete RDO as C_R;

and comparing C_DIP, C_IBC, C_L and C_R, and taking the prediction mode corresponding to the minimum coding cost as the final prediction mode of the current coding block.
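The geometry of the L-shaped scheme can be sketched with per-pixel masks. The text says only that one sub-block is a square and the other an L-shaped block, so the square sizes used here (k x N/8 for the k-th type) are an assumption for illustration, and `l_shape_masks` is a hypothetical helper name.

```python
def l_shape_masks(n, corner):
    """Return 7 masks for an n x n block; 1 marks the L-shaped sub-block, 2 the square.

    corner is "L" (L-shape at the upper left, square at the lower right) or
    "R" (L-shape at the upper right, square at the lower left).
    """
    masks = []
    for k in range(1, 8):
        side = k * n // 8  # assumed square side, not specified in the text
        mask = []
        for r in range(n):
            row = []
            for c in range(n):
                in_rows = r >= n - side
                in_cols = c >= n - side if corner == "L" else c < side
                row.append(2 if in_rows and in_cols else 1)
            mask.append(row)
        masks.append(mask)
    return masks


masks_l = l_shape_masks(8, "L")
masks_r = l_shape_masks(8, "R")
square_pixels = sum(row.count(2) for row in masks_l[2])  # third type: 3x3 square
print(square_pixels)
```

With these masks, sub-block 1 is the L-shaped part and sub-block 2 the square, matching the order used for the labels MV_IBC_Li_1 and MV_IBC_Li_2 above.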

According to the technical scheme provided by the invention, the coding block is divided into two sub-blocks through flexible block division, so that the prediction of a single block can contain both local and non-local information, and RDO then determines the minimum-cost prediction for each block. In addition, fast RDO and complete RDO are combined, which reduces coding complexity without affecting coding performance. The overall effect is higher compression efficiency with essentially no increase in complexity at the decoding end.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a video intra-coding method combining directional prediction and block copy prediction according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the divisions V1, V2, ..., V7 in the vertical direction according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the divisions H1, H2, ..., H7 in the horizontal direction according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the divisions T1, T2, ..., T7 in the diagonal direction from top right to bottom left according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the divisions D1, D2, ..., D7 in the diagonal direction from top left to bottom right according to an embodiment of the present invention;

Fig. 6 is a schematic illustration of IBC prediction provided by an embodiment of the present invention;

Fig. 7 is a schematic diagram of the divisions L1, L2, ..., L7 of the division type with an L-shape at the upper left according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of the divisions R1, R2, ..., R7 of the division type with an L-shape at the upper right according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of the coding performance measured for the scheme of the first embodiment.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 is a flowchart of a video intra-coding method combining directional prediction and block copy prediction according to an embodiment of the present invention, as shown in fig. 1, which mainly includes the following steps:

1) the current coding block is divided by utilizing a predefined division template containing a plurality of division types, and each division type divides the current coding block into two sub-blocks.

2) Intra Block Copy (IBC) prediction for all sub-blocks: determining a search range of IBC prediction, performing IBC prediction on all subblocks by using each reference block in the search range, determining an optimal motion vector of each subblock, and recording a prediction error of each subblock, which is obtained by predicting the optimal motion vector.

3) And carrying out DIP prediction (intra-frame direction prediction) on all the sub-blocks to obtain the optimal prediction direction of each sub-block, and recording the prediction error of each sub-block, which is obtained by predicting the optimal prediction direction.

4) After prediction errors of all subblocks under IBC prediction and DIP prediction are obtained, the optimal partition and the corresponding prediction combination are determined in all partition types through fast Rate Distortion Optimization (RDO).

5) And through complete RDO comparison, the coding cost of IBC prediction on the current coding block, the coding cost of DIP prediction on the current coding block and the coding cost of optimal division and corresponding prediction combination are selected, and the prediction mode with the minimum coding cost is selected as the final prediction mode of the current coding block.

Steps 2) and 3) need not be executed in any particular order. In addition, the corresponding IBC prediction and DIP prediction can be performed on the whole coding block at the same time as steps 2) and 3) are executed.

In the embodiment of the present invention, the number of partition types determines the variety of available predictions: the more flexible the partition types, the more kinds of prediction are available, but the higher the encoding complexity. Taking both coding performance and complexity into account, this embodiment proposes two forms of partition modes. The first consists of 28 partition types covering the horizontal direction, the vertical direction, the diagonal direction from the top right to the bottom left, and the diagonal direction from the top left to the bottom right, with 7 partition types in each direction; the partition types of the horizontal and vertical directions are collectively called rectangular partitions, and the partition types of the two diagonal directions are collectively called triangular partitions. The second consists of 7 partition types with the L shape at the upper left and 7 partition types with the L shape at the upper right; each of these divides the current coding block into two sub-blocks, one of which is a square block and the other an L-shaped block formed by two adjacent rectangular blocks.

For the sake of easy understanding, the following describes the process of performing the above steps 1) to 5) for the above two division modes in detail with reference to two embodiments.

Example one

The video intra-frame coding method combining the direction prediction and the block copy prediction provided by the embodiment comprises the following steps:

Firstly, the partition types are determined.

In this embodiment, the plurality of partition types are 28 partition types in total, which covers the horizontal direction, the vertical direction, the diagonal direction from the top right to the bottom left, and the diagonal direction from the top left to the bottom right.

Marking 7 divisions in the horizontal direction as H1-H7, and 7 divisions in the vertical direction as V1-V7; the 7 kinds of divisions in the diagonal direction from the upper right to the lower left are denoted as T1 to T7, and the 7 kinds of divisions in the diagonal direction from the upper left to the lower right are denoted as D1 to D7.

Fig. 2 shows the partitions V1, V2, …, V7. The square in V1 represents the current coding block; it is divided into a left rectangle and a right rectangle of the same height, where the width of the left rectangle is 1/8 of the width of the square and the width of the right rectangle is 7/8 of the width of the square. In V2, the square is likewise divided into a left rectangle and a right rectangle of the same height, where the width of the left rectangle is 2/8 of the width of the square and the width of the right rectangle is 6/8 of the width of the square. Similarly, in V7 the width of the left rectangle is 7/8 of the width of the square and the width of the right rectangle is 1/8 of the width of the square. H1, H2, …, H7 correspond to V1, V2, …, V7, except that the former divide the block in the horizontal direction and the latter in the vertical direction. The partitions H1, H2, …, H7 are shown in Fig. 3.
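The rectangular partitions can be represented as boolean masks over the block. The following is a minimal sketch only: the function names, the list-of-lists mask layout (`mask[y][x]`), the choice of marking the left (or top) rectangle as sub-block 1, and the requirement that the block size be a multiple of 8 are all assumptions made for illustration and are not specified in the text.

```python
def vertical_mask(n, i):
    """Mask for V_i on an n x n block: True marks the left rectangle.

    The left rectangle has width i/8 of the block width (i = 1..7).
    """
    if not 1 <= i <= 7 or n % 8 != 0:
        raise ValueError("expected 1 <= i <= 7 and n divisible by 8")
    width = i * n // 8
    return [[x < width for x in range(n)] for _ in range(n)]


def horizontal_mask(n, i):
    """Mask for H_i on an n x n block: True marks the top rectangle.

    Assumption: H_i mirrors V_i with rows in place of columns, so the
    top rectangle has height i/8 of the block height.
    """
    if not 1 <= i <= 7 or n % 8 != 0:
        raise ValueError("expected 1 <= i <= 7 and n divisible by 8")
    height = i * n // 8
    return [[y < height for _ in range(n)] for y in range(n)]


def count_true(mask):
    """Number of pixels belonging to the marked sub-block."""
    return sum(sum(1 for v in row if v) for row in mask)
```

The second sub-block of each partition is simply the complement of the mask, so one mask per partition type is enough.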

Next, the diagonal partitions from the top right to the bottom left among the triangular partitions are described. The partitions T1, T2, T3 and T4 divide the square into an isosceles right triangle at the upper left corner and the remaining part at the lower right; the partitions T5, T6 and T7 divide the square into an isosceles right triangle at the lower right corner and the remaining part at the upper left. As shown in Fig. 4, in the T1 partition the leg of the isosceles right triangle is 1/4 of the side length of the square; in the T2 partition it is 2/4 of the side length of the square; likewise, in the T4 partition it is 4/4 of the side length of the square, that is, equal to the side length of the square. The partitions T5, T6 and T7 are analogous to T1 through T4. The partitions D1, D2, …, D7 are mirror images of T1, T2, …, T7 about the vertical direction, as shown in Fig. 5.
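As an illustration of the triangular partitions, the sketch below builds masks for T1 to T4 and their mirrored counterparts D1 to D4. Several points are assumptions rather than statements from the text: the function names, the rule that pixel (x, y) belongs to the triangle when x + y is smaller than the leg length (the text does not say how pixels on the diagonal are assigned), and the block size being a multiple of 4. T5 to T7 are deliberately left out because this section does not give their leg lengths.

```python
def t_mask(n, i):
    """Mask for T_i (i = 1..4): True marks the upper-left triangle.

    The triangle's leg is i/4 of the block side.
    Assumed membership rule: x + y < leg.
    """
    if not 1 <= i <= 4 or n % 4 != 0:
        raise ValueError("expected 1 <= i <= 4 and n divisible by 4")
    leg = i * n // 4
    return [[x + y < leg for x in range(n)] for y in range(n)]


def d_mask(n, i):
    """Mask for D_i (i = 1..4): T_i mirrored left to right,
    so the triangle sits at the upper-right corner."""
    return [row[::-1] for row in t_mask(n, i)]


def count_true(mask):
    """Number of pixels belonging to the marked sub-block."""
    return sum(sum(1 for v in row if v) for row in mask)
```

With this membership rule the two sub-blocks of T4 are not equal in size on a discrete grid, since the pixels on the diagonal all fall into the triangle; a different convention would be equally plausible.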

Secondly, after the current coding block is divided according to the partition templates of the partition types, Intra Block Copy (IBC) prediction is performed on all sub-blocks.

The search range needs to be determined prior to the search, and may be smaller than or equal to the reconstructed portion of the current frame. A larger search range increases the search complexity, but also makes it more likely that a better prediction is found for the current block. The search starts once the search range is determined.

The present embodiment employs a full search: for each reference block within the search range, a prediction error such as the SAD (sum of absolute differences) is calculated for each sub-block. Since the optimal prediction mode is determined later by RDO (Rate Distortion Optimization), the SAD between the reference block and the current coding block is calculated here for each of the different sub-block shapes.
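The full search described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the encoder's actual implementation: the frame is a list of rows of integer samples (`frame[y][x]`), `masks` maps a hypothetical sub-block label to a boolean mask, and `candidates` is assumed to contain only vectors whose reference block lies entirely inside the reconstructed area.

```python
def full_search(frame, bx, by, n, masks, candidates):
    """Return {label: (best_sad, best_vector)} for every sub-block mask.

    frame      : 2-D list of samples, indexed frame[y][x]
    (bx, by)   : top-left corner of the current n x n coding block
    masks      : {label: n x n boolean mask of one sub-block}
    candidates : iterable of (dx, dy) vectors inside the search range
    """
    cur = [row[bx:bx + n] for row in frame[by:by + n]]
    best = {}
    for dx, dy in candidates:
        # Reference block displaced by the candidate vector.
        ref = [row[bx + dx:bx + dx + n]
               for row in frame[by + dy:by + dy + n]]
        for label, mask in masks.items():
            # SAD restricted to the pixels of this sub-block.
            sad = sum(abs(cur[y][x] - ref[y][x])
                      for y in range(n) for x in range(n) if mask[y][x])
            if label not in best or sad < best[label][0]:
                best[label] = (sad, (dx, dy))
    return best
```

Each sub-block keeps its own best vector, so two sub-blocks of the same partition may end up pointing at different reference blocks.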

Fig. 6 is a schematic diagram of one step of the IBC process. The rectangular area outlined by the dotted line (i.e., the area where the current coding block is located) is the uncoded part, and the surrounding area is the reconstructed part of the current frame. The search for the current coding block has reached one of the reference blocks, and the arrow indicates the BV (Block Vector), i.e., the coordinate difference between the reference block and the current block, which is analogous to the Motion Vector (MV) in inter-frame motion estimation; following customary usage, the BV is denoted MV hereinafter. As described above, there are 28 partition types for the current coding block, each of which divides it into two sub-blocks, and the SAD of every sub-block is needed; together with the SAD of the whole coding block, this amounts to 57 SAD values (56 sub-block SADs plus one). Since, for each of the 28 partition types, the sum of the two sub-block SADs equals the SAD of the whole block, only 29 SADs actually need to be calculated directly; the remaining 28 are obtained by subtracting the SAD of the corresponding sub-block from the SAD of the whole block.
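The saving from 57 to 29 SAD computations can be illustrated with a short sketch. The function names and the mask representation (one boolean mask per partition type, marking one of its two sub-blocks) are assumptions for illustration.

```python
def sad(cur, ref, mask=None):
    """Sum of absolute differences, optionally limited to mask pixels."""
    total = 0
    for y, row in enumerate(cur):
        for x, sample in enumerate(row):
            if mask is None or mask[y][x]:
                total += abs(sample - ref[y][x])
    return total


def partition_sads(cur, ref, masks):
    """Return (whole_sad, {label: (sad_sub1, sad_sub2)}).

    One SAD is computed for the whole block and one per partition type
    (for the masked sub-block); the SAD of the other sub-block follows
    by subtraction. With 28 masks this gives 29 direct computations.
    """
    whole = sad(cur, ref)
    result = {}
    for label, mask in masks.items():
        first = sad(cur, ref, mask)
        result[label] = (first, whole - first)
    return whole, result
```

The subtraction is exact because the two sub-blocks of a partition cover the block without overlap.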

Thirdly, DIP prediction is performed on all sub-blocks to obtain the optimal prediction direction of each sub-block, and the prediction error of each sub-block obtained with its optimal prediction direction is recorded.

DIP prediction determines the optimal prediction direction among the 35 intra prediction modes of HEVC. Because the optimal partition of the current coding unit cannot be determined before RDO is performed, the optimal prediction direction must be found for each sub-block of each partition, and the SAD of each sub-block under its optimal prediction direction is recorded at the same time. After the prediction block for each direction is generated, the prediction error (i.e., SAD) between the prediction block and the current coding block is calculated for each sub-block; as described above, the SADs of the 56 sub-blocks plus the SAD of the whole coding block, i.e., 57 SAD values, are needed.

After all 35 prediction directions have been traversed, the minimum prediction error and the corresponding intra prediction direction of each sub-block in each partition are obtained and labeled as follows:

Fourthly, the optimal partition and the corresponding prediction combination are determined among all partition types through fast Rate Distortion Optimization (RDO).

\begin{aligned}
\mathrm{C\_Hi\_1} &= \mathrm{SAD\_IBC\_Hi\_1} + \mathrm{SAD\_DIP\_Hi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_1},\ \mathrm{Mode\_DIP\_Hi\_2}) \\
\mathrm{C\_Hi\_2} &= \mathrm{SAD\_IBC\_Hi\_2} + \mathrm{SAD\_DIP\_Hi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_2},\ \mathrm{Mode\_DIP\_Hi\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Vi\_1} &= \mathrm{SAD\_IBC\_Vi\_1} + \mathrm{SAD\_DIP\_Vi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_1},\ \mathrm{Mode\_DIP\_Vi\_2}) \\
\mathrm{C\_Vi\_2} &= \mathrm{SAD\_IBC\_Vi\_2} + \mathrm{SAD\_DIP\_Vi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_2},\ \mathrm{Mode\_DIP\_Vi\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Ti\_1} &= \mathrm{SAD\_IBC\_Ti\_1} + \mathrm{SAD\_DIP\_Ti\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_1},\ \mathrm{Mode\_DIP\_Ti\_2}) \\
\mathrm{C\_Ti\_2} &= \mathrm{SAD\_IBC\_Ti\_2} + \mathrm{SAD\_DIP\_Ti\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_2},\ \mathrm{Mode\_DIP\_Ti\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Di\_1} &= \mathrm{SAD\_IBC\_Di\_1} + \mathrm{SAD\_DIP\_Di\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_1},\ \mathrm{Mode\_DIP\_Di\_2}) \\
\mathrm{C\_Di\_2} &= \mathrm{SAD\_IBC\_Di\_2} + \mathrm{SAD\_DIP\_Di\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_2},\ \mathrm{Mode\_DIP\_Di\_1})
\end{aligned}
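The cost expressions above can be evaluated with a few lines of code. The sketch below is only an illustration under assumptions: the function name, the dictionary layout (each partition label maps to the pair of sub-block SADs), and the `bits` callback that estimates the signalling cost are all hypothetical, and the text of this section does not spell out whether the minimum is taken over all partition types at once or separately per direction family, so the function simply minimizes over whatever labels it is given.

```python
import math


def fast_rdo(sad_ibc, sad_dip, bits, lam):
    """Pick the cheapest (partition, IBC sub-block) combination.

    sad_ibc[label] = (SAD of sub-block 1, SAD of sub-block 2) with IBC
    sad_dip[label] = (SAD of sub-block 1, SAD of sub-block 2) with DIP
    bits(label, p) = estimated bits when sub-block p uses IBC and the
                     other sub-block uses DIP (hypothetical callback)
    lam            = Lagrange multiplier; the cost uses sqrt(lam)

    Returns (cost, label, p).
    """
    best = None
    for label in sad_ibc:
        for p in (1, 2):
            q = 3 - p  # index of the sub-block predicted with DIP
            cost = (sad_ibc[label][p - 1]
                    + sad_dip[label][q - 1]
                    + math.sqrt(lam) * bits(label, p))
            if best is None or cost < best[0]:
                best = (cost, label, p)
    return best
```

Because only SADs and a bit estimate are involved, no transform, quantization or entropy coding is needed at this stage, which is what keeps this pass inexpensive compared with full RDO.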

Fifthly, the coding cost of IBC prediction on the whole current coding block, the coding cost of DIP prediction on the whole current coding block, and the coding cost of the optimal partition with its corresponding prediction combination are compared through full RDO, and the prediction mode with the minimum coding cost is selected as the final prediction mode of the current coding block.

Example two

Firstly, the partition types are determined.

In this embodiment, the plurality of partition types include 7 partition types of L-shaped upper left and 7 partition types of L-shaped upper right.

Fig. 7 shows the partitions L1, L2, …, L7. In L1, the square is divided into an upper-left L shape and a lower-right square, where the width of the L shape is 1/8 of the width of the whole square and the width of the lower-right square is 7/8 of the width of the whole square. In L2, the width of the upper-left L shape is 2/8 of the width of the whole square and the width of the lower-right square is 6/8 of the width of the whole square. Similarly, in L7 the width of the upper-left L shape is 7/8 of the width of the whole square and the width of the lower-right square is 1/8 of the width of the whole square. The partitions R1, R2, …, R7 are mirror images of L1, L2, …, L7 about the vertical direction, as shown in Fig. 8.
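The L-shaped partitions can also be written as masks. In this sketch the function names, the interpretation of the L shape's "width" as the thickness of both of its arms, and the block size being a multiple of 8 are assumptions made for illustration.

```python
def l_mask(n, i):
    """Mask for L_i (i = 1..7): True marks the upper-left L shape.

    The L shape covers the left i/8 columns and the top i/8 rows;
    the remaining lower-right square has side (8 - i)/8 of the block.
    """
    if not 1 <= i <= 7 or n % 8 != 0:
        raise ValueError("expected 1 <= i <= 7 and n divisible by 8")
    arm = i * n // 8
    return [[x < arm or y < arm for x in range(n)] for y in range(n)]


def r_mask(n, i):
    """Mask for R_i: L_i mirrored left to right, so the L shape sits at
    the upper right and the square at the lower left."""
    return [row[::-1] for row in l_mask(n, i)]


def count_true(mask):
    """Number of pixels belonging to the L-shaped sub-block."""
    return sum(sum(1 for v in row if v) for row in mask)
```

For an 8 x 8 block, L1 therefore contains 64 - 49 = 15 pixels in the L shape and 49 pixels in the square.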

The calculation principle of this step is the same as that of the second step in the first embodiment, and therefore, the description is omitted, and the difference between the two steps is mainly that the marks of the sub-blocks are not the same.

In this embodiment, after every reference block in the search range has been searched, the minimum prediction error and the corresponding optimal motion vector of each sub-block in each partition are obtained and labeled as follows:

In this embodiment, after all 35 prediction directions have been traversed, the minimum prediction error and the corresponding intra prediction direction of each sub-block in each partition are obtained and labeled as follows:

In this embodiment, the partition and prediction combination with the minimum overall cost is determined among the 7 upper-left L-shaped partition types and among the 7 upper-right L-shaped partition types, respectively, as follows:

\begin{aligned}
\mathrm{C\_Li\_1} &= \mathrm{SAD\_IBC\_Li\_1} + \mathrm{SAD\_DIP\_Li\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_1},\ \mathrm{Mode\_DIP\_Li\_2}) \\
\mathrm{C\_Li\_2} &= \mathrm{SAD\_IBC\_Li\_2} + \mathrm{SAD\_DIP\_Li\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_2},\ \mathrm{Mode\_DIP\_Li\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Ri\_1} &= \mathrm{SAD\_IBC\_Ri\_1} + \mathrm{SAD\_DIP\_Ri\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_1},\ \mathrm{Mode\_DIP\_Ri\_2}) \\
\mathrm{C\_Ri\_2} &= \mathrm{SAD\_IBC\_Ri\_2} + \mathrm{SAD\_DIP\_Ri\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_2},\ \mathrm{Mode\_DIP\_Ri\_1})
\end{aligned};

The process of this step is the same as the fifth step in the first embodiment, and the difference is only that the marks of some parameters are not consistent; the process of this step is specifically as follows:

For the partition types with the L shape at the upper right, the intra prediction direction is determined by the prediction mode (k2, p2); a prediction block of the current block is generated according to the corresponding motion vector MV_IBC_Rk2_p2 and the partition type Rk2, and the coding cost calculated by full RDO is recorded as C_R; wherein,

In the two embodiments of the present invention, the coding block is divided into two sub-blocks by a flexible block division manner, so that the prediction content of the same block can simultaneously include local information and non-local information, and then the RDO determines the prediction with the minimum cost for each block. Meanwhile, the combination of the rapid RDO and the complete RDO is designed, and the coding complexity is reduced on the premise of not influencing the coding performance. The overall effect is to achieve higher compression efficiency without substantially increasing the complexity at the decoding end.

To further illustrate the effect of the above scheme, tests were carried out based on the first embodiment. The test conditions were as follows: the All Intra (AI) configuration is used; the quantization parameter (QP) is set to {22, 27, 32, 37}; the search range is |mv_x| + |mv_y| < 128; the number of frames tested is 5; the base software is HM12.0; and the test sequences are the common test sequences of HEVC (excluding Class F, since it consists of screen content coding (SCC) sequences). There are two groups of comparative experiments: the first compares HM12.0 itself with HM12.0 after the IBC technique has been integrated; the second compares HM12.0 with IBC integrated against HM12.0 with the proposed technique integrated. The two sets of experimental results are shown in Fig. 9, where the first column is the name of the test sequence, the second column is the first set of experimental results, the third column is the second set of experimental results, and the last two rows give the change in complexity at the encoding side and the decoding side. It can be seen that the scheme proposed by the present invention obtains a bit-rate saving of 2.1% relative to HM and a bit-rate saving of 0.9% relative to HM with IBC integrated, while the decoding complexity remains substantially unchanged.
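To give a feel for the size of the search range used in the test, the sketch below enumerates all integer vectors satisfying |mv_x| + |mv_y| < 128. The function names are hypothetical, and the resulting count is only an upper bound on the number of candidates actually searched: vectors whose reference block is not fully inside the reconstructed area, including the zero vector, would have to be discarded by the encoder.

```python
def in_search_range(mv_x, mv_y, limit=128):
    """True when the vector satisfies |mv_x| + |mv_y| < limit."""
    return abs(mv_x) + abs(mv_y) < limit


def candidate_vectors(limit=128):
    """All integer vectors inside the diamond-shaped search range."""
    return [(x, y)
            for y in range(-limit + 1, limit)
            for x in range(-limit + 1, limit)
            if in_search_range(x, y, limit)]


# Upper bound on the number of candidates for the tested range.
print(len(candidate_vectors()))  # prints 32513
```

The count agrees with the closed form 2R² - 2R + 1 for R = 128, which is the number of lattice points strictly inside a diamond of radius R.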

Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions that enable a computer device (such as a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for video intra-coding with combined directional prediction and block copy prediction, comprising:

2. The method of claim 1, wherein the plurality of partition types comprise:

3. The method of claim 2, wherein if the plurality of partition types are 28 partition types including a horizontal direction, a vertical direction, a diagonal direction from top right to bottom left, and a diagonal direction from top left to bottom right, determining an optimal motion vector for each sub-block when performing IBC prediction on all sub-blocks, and recording a prediction error of each sub-block obtained by optimal motion vector prediction comprises:

4. The method of claim 3, wherein the DIP prediction is performed on all sub-blocks to obtain an optimal prediction direction of each sub-block, and the recording of the prediction error of each sub-block obtained by the optimal prediction direction prediction comprises:

5. The method of claim 4, wherein determining the optimal partition and corresponding prediction combination among all partition types by fast Rate Distortion Optimization (RDO) comprises:

\begin{aligned}
\mathrm{C\_Hi\_1} &= \mathrm{SAD\_IBC\_Hi\_1} + \mathrm{SAD\_DIP\_Hi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_1},\ \mathrm{Mode\_DIP\_Hi\_2}) \\
\mathrm{C\_Hi\_2} &= \mathrm{SAD\_IBC\_Hi\_2} + \mathrm{SAD\_DIP\_Hi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Hi\_2},\ \mathrm{Mode\_DIP\_Hi\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Vi\_1} &= \mathrm{SAD\_IBC\_Vi\_1} + \mathrm{SAD\_DIP\_Vi\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_1},\ \mathrm{Mode\_DIP\_Vi\_2}) \\
\mathrm{C\_Vi\_2} &= \mathrm{SAD\_IBC\_Vi\_2} + \mathrm{SAD\_DIP\_Vi\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Vi\_2},\ \mathrm{Mode\_DIP\_Vi\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Ti\_1} &= \mathrm{SAD\_IBC\_Ti\_1} + \mathrm{SAD\_DIP\_Ti\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_1},\ \mathrm{Mode\_DIP\_Ti\_2}) \\
\mathrm{C\_Ti\_2} &= \mathrm{SAD\_IBC\_Ti\_2} + \mathrm{SAD\_DIP\_Ti\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ti\_2},\ \mathrm{Mode\_DIP\_Ti\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Di\_1} &= \mathrm{SAD\_IBC\_Di\_1} + \mathrm{SAD\_DIP\_Di\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_1},\ \mathrm{Mode\_DIP\_Di\_2}) \\
\mathrm{C\_Di\_2} &= \mathrm{SAD\_IBC\_Di\_2} + \mathrm{SAD\_DIP\_Di\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Di\_2},\ \mathrm{Mode\_DIP\_Di\_1})
\end{aligned}

6. The method of claim 5, wherein the selecting the prediction mode with the lowest coding cost as the final prediction mode of the current coding block comprises:

7. The method of claim 2, wherein if the plurality of partition types include 7 partition types with an L-shaped upper left and 7 partition types with an L-shaped upper right, determining an optimal motion vector for each sub-block when performing IBC prediction on all sub-blocks, and recording a prediction error of each sub-block, which is predicted by the optimal motion vector, comprises:

8. The method of claim 7, wherein the DIP prediction is performed on all sub-blocks to obtain an optimal prediction direction of each sub-block, and the recording of the prediction error of each sub-block obtained by the optimal prediction direction prediction comprises:

9. The method of claim 8, wherein determining the optimal partition and corresponding prediction combination among all partition types by fast Rate Distortion Optimization (RDO) comprises:

\begin{aligned}
\mathrm{C\_Li\_1} &= \mathrm{SAD\_IBC\_Li\_1} + \mathrm{SAD\_DIP\_Li\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_1},\ \mathrm{Mode\_DIP\_Li\_2}) \\
\mathrm{C\_Li\_2} &= \mathrm{SAD\_IBC\_Li\_2} + \mathrm{SAD\_DIP\_Li\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Li\_2},\ \mathrm{Mode\_DIP\_Li\_1})
\end{aligned};

\begin{aligned}
\mathrm{C\_Ri\_1} &= \mathrm{SAD\_IBC\_Ri\_1} + \mathrm{SAD\_DIP\_Ri\_2} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_1},\ \mathrm{Mode\_DIP\_Ri\_2}) \\
\mathrm{C\_Ri\_2} &= \mathrm{SAD\_IBC\_Ri\_2} + \mathrm{SAD\_DIP\_Ri\_1} + \sqrt{\lambda} \times \mathrm{Bits}(\mathrm{MV\_IBC\_Ri\_2},\ \mathrm{Mode\_DIP\_Ri\_1})
\end{aligned};

10. The method of claim 9, wherein the selecting the prediction mode with the lowest coding cost as the final prediction mode of the current coding block comprises: