CN111223046B - Image super-resolution reconstruction method and device - Google Patents
Image super-resolution reconstruction method and device
- Publication number
- CN111223046B CN111223046B CN201911140450.XA CN201911140450A CN111223046B CN 111223046 B CN111223046 B CN 111223046B CN 201911140450 A CN201911140450 A CN 201911140450A CN 111223046 B CN111223046 B CN 111223046B
- Authority
- CN
- China
- Prior art keywords
- module
- image
- output
- convolution layer
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The application provides an image super-resolution reconstruction method and device. The method includes: inputting a low-resolution image and a resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image. The trained preset network comprises a preset number of multi-perception branch modules, any multi-perception branch module comprises a plurality of cascaded residual channel attention groups, and any residual channel attention group includes a plurality of cascaded enhanced residual blocks. Any enhanced residual block includes: a second convolution layer, a third convolution layer, a rectifying module and a second summation module. The image input to the enhanced residual block is fed into the second convolution layer; the output of the second convolution layer is fed into the rectifying module; the output of the rectifying module is fed into the third convolution layer; and the image input to the enhanced residual block, the output of the rectifying module and the output of the third convolution layer are each fed into the second summation module. The reconstructed image has higher spatial resolution and higher information fidelity.
Description
Technical Field
The present disclosure relates to the field of image processing, and in particular, to a method and an apparatus for reconstructing super-resolution images.
Background
Super-resolution reconstruction (SR) refers to restoring a high-resolution image from one or more low-resolution images of the same scene.
Super-resolution reconstruction is an important digital image processing technology with wide application in medicine, remote sensing and many areas of daily life. The current mainstream image super-resolution reconstruction approach is based on deep learning: a neural network is constructed to learn the mapping between pairs of high-resolution and low-resolution training samples, and the learned prior knowledge is then used to perform high-resolution reconstruction of the various low-resolution images fed into the network.
However, the fidelity of the reconstructed high resolution image is low.
Disclosure of Invention
The application provides an image super-resolution reconstruction method and device, and aims to solve the problem that the fidelity of an image obtained by super-resolution reconstruction is low.
In order to achieve the above object, the present application provides the following technical solutions:
the application provides an image super-resolution reconstruction method, which comprises the following steps:
acquiring a low-resolution image to be reconstructed and a preset resolution improvement multiple;
inputting the low-resolution image and the resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image; the trained preset network comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module and an up-sampling module; any one of the multi-perception branching modules comprises a plurality of cascaded residual channel attention groups; any one of the residual channel attention groups comprises a plurality of cascaded enhanced residual blocks;
The low-resolution image is input into the first convolution layer, and the output of the first convolution layer is respectively input into each multi-perception branch module; the output of each multi-perception branching module and the output of the first convolution layer are respectively input into the first summation module; the first summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image; the output of the first summing module is input to the first channel attention module; the output of the first channel attention module is input into the up-sampling module; the up-sampling module is used for up-sampling the output of the first channel attention module by the resolution improvement times; the up-sampling module obtains the high-resolution reconstructed image;
any one of the enhanced residual blocks includes: the second convolution layer, the third convolution layer, the rectifying module and the second summation module; an image for inputting the enhanced residual block is input to the second convolution layer; the output of the second convolution layer is input into the rectification module; the output of the rectifying module is input into the third convolution layer; the image for inputting the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer are respectively input to the second summing module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain a multi-channel image;
Outputting the high-resolution reconstructed image.
Optionally, the preset number is not less than 2.
Optionally, the preset network further includes: a fourth convolution layer;
the output of the up-sampling module is input into the fourth convolution layer;
and the fourth convolution layer carries out convolution operation on the output of the up-sampling module to obtain the high-resolution reconstructed image.
Optionally, any one of the residual channel attention groups further includes: a second channel attention module, a fifth convolution layer, and a third summation module;
the image input to the residual channel attention group is fed into the first enhanced residual block of the residual channel attention group; the output of the first enhanced residual block is input into the second enhanced residual block of the residual channel attention group; the output of the (B-1)-th enhanced residual block in the residual channel attention group is input into the B-th enhanced residual block in the residual channel attention group; the output of the B-th enhanced residual block is input into the second channel attention module; the output of the second channel attention module is input into the fifth convolution layer;
the image for inputting the residual channel attention group and the output of the fifth convolution layer are respectively input into the third summation module;
And the third summation module is used for summing pixel values of the pixel points at the same position of the same channel in the input image and outputting a multi-channel image.
Optionally, the rectifying module is a linear rectifying module.
The application also provides an image super-resolution reconstruction device, which comprises:
the acquisition module is used for acquiring the low-resolution image to be reconstructed and a preset resolution improvement multiple;
the reconstruction module is used for inputting the low-resolution image and the resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image; the trained preset network comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module and an up-sampling module; any one of the multi-perception branching modules comprises a plurality of cascaded residual channel attention groups; any one of the residual channel attention groups comprises a plurality of cascaded enhanced residual blocks;
the low-resolution image is input into the first convolution layer, and the output of the first convolution layer is respectively input into each multi-perception branch module; the output of each multi-perception branching module and the output of the first convolution layer are respectively input into the first summation module; the first summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image; the output of the first summing module is input to the first channel attention module; the output of the first channel attention module is input into the up-sampling module; the up-sampling module is used for up-sampling the output of the first channel attention module by the resolution improvement times; the up-sampling module obtains the high-resolution reconstructed image;
Any one of the enhanced residual blocks includes: the second convolution layer, the third convolution layer, the rectifying module and the second summation module; an image for inputting the enhanced residual block is input to the second convolution layer; the output of the second convolution layer is input into the rectification module; the output of the rectifying module is input into the third convolution layer; the image for inputting the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer are respectively input to the second summing module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain a multi-channel image;
and the output module is used for outputting the high-resolution reconstructed image.
Optionally, the preset number is not less than 2.
Optionally, the preset network further includes: a fourth convolution layer;
the output of the up-sampling module is input into the fourth convolution layer;
and the fourth convolution layer carries out convolution operation on the output of the up-sampling module to obtain the high-resolution reconstructed image.
Optionally, any one of the residual channel attention groups further includes: a second channel attention module, a fifth convolution layer, and a third summation module;
The image input to the residual channel attention group is fed into the first enhanced residual block of the residual channel attention group; the output of the first enhanced residual block is input into the second enhanced residual block of the residual channel attention group; the output of the (B-1)-th enhanced residual block in the residual channel attention group is input into the B-th enhanced residual block in the residual channel attention group; the output of the B-th enhanced residual block is input into the second channel attention module; the output of the second channel attention module is input into the fifth convolution layer;
the image for inputting the residual channel attention group and the output of the fifth convolution layer are respectively input into the third summation module;
and the third summation module is used for summing pixel values of the pixel points at the same position of the same channel in the input image and outputting a multi-channel image.
Optionally, the rectifying module is a linear rectifying module.
The application also provides a storage medium comprising a stored program, wherein the program executes any one of the image super-resolution reconstruction methods.
The application also provides a device comprising at least one processor, and at least one memory and a bus connected with the processor; the processor and the memory complete communication with each other through the bus; the processor is configured to invoke the program instructions in the memory to perform any of the above-described image super-resolution reconstruction methods.
In the image super-resolution reconstruction method and device, the preset network comprises a preset number of multi-perception branch modules, any one multi-perception branch module comprises a plurality of cascaded residual channel attention groups, and each residual channel attention group comprises a plurality of cascaded enhanced residual blocks. Any one enhanced residual block comprises a second convolution layer, a third convolution layer, a rectifying module and a second summation module. The image input to the enhanced residual block is fed into the second convolution layer, the output of the second convolution layer is fed into the rectifying module, and the output of the rectifying module is fed into the third convolution layer, so that the input of the third convolution layer in any one enhanced residual block is obtained through the calculation of the second convolution layer; the third convolution layer thus reaches a perception scale different from, and larger than, that of the second convolution layer.
The image used for inputting the enhancement residual block, the output of the rectifying module and the output of the third convolution layer are respectively input into the second summation module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain images of all channels, namely the enhanced residual block can extract three levels of information, namely an image for inputting the enhanced residual block, the output of the rectification module and the output of the third convolution layer. And, the characteristic information of these three levels is all as the input of the enhancement residual block of the subsequent cascade, and so on, any multi-perception branch module can extract the information of more levels (each level includes a plurality of channels), and then, the preset number of multi-perception branch modules can extract the information of more levels.
In addition, the preset network further comprises a first summation module and a first channel attention module, wherein the first summation module sums pixel values of pixels at the same position of the same channel in the multi-level information output by the first convolution layer and each multi-perception branch module to obtain a multi-channel image, and the first channel attention module gives different weights to different channels in the multi-channel image output by the first summation module, so that different channel characteristics are utilized in a self-adaptive manner. Meanwhile, the up-sampling module improves the output of the first channel attention module by a preset multiple of resolution, so that compared with the high-resolution image reconstructed by the existing mode, the high-resolution image reconstructed by the embodiment of the application has higher fidelity.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a multi-perception attention network according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of any multi-perception branch module according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of any one residual channel attention group disclosed in the embodiments of the present application;
Fig. 4 is a schematic structural diagram of any one enhanced residual block disclosed in the embodiments of the present application;
Fig. 5 is a schematic diagram of a training process of a multi-perception attention network according to an embodiment of the present disclosure;
Fig. 6 is a flowchart of an image super-resolution reconstruction method disclosed in an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an image super-resolution reconstruction device according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The inventor of the present application found in research that the reasons why high-resolution images reconstructed by existing deep-learning-based image super-resolution methods have low fidelity include: first, the image information learned by the neural network is not fully utilized, i.e., the perception capability is limited; second, the information extracted at different levels of the neural network is used directly for the final reconstruction, i.e., the channel-level feature differences among the information extracted at different levels of the neural network are ignored.
In one aspect, the network provided in the embodiment of the present application includes a preset number of multi-perception branch modules, where any one multi-perception branch module includes a plurality of cascaded residual channel attention groups, and each residual channel attention group includes a plurality of cascaded enhanced residual blocks. Any one enhanced residual block comprises a second convolution layer, a third convolution layer, a rectifying module and a second summation module. The image input to the enhanced residual block is fed into the second convolution layer, the output of the second convolution layer is fed into the rectifying module, and the output of the rectifying module is fed into the third convolution layer, so that the input of the third convolution layer in any one enhanced residual block is obtained through the calculation of the second convolution layer; the third convolution layer thus reaches a perception scale different from, and larger than, that of the second convolution layer.
On the other hand, the image input to the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer are each input to the second summation module; the second summation module sums the pixel values of pixel points at the same position of the same channel in its input images to obtain the images of all channels. That is, the enhanced residual block can extract three levels of information: the image input to the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer. The feature information of these three levels all serves as the input of the subsequently cascaded enhanced residual block; by analogy, any multi-perception branch module can extract information at more levels (each level including a plurality of channels), and in turn the preset number of multi-perception branch modules can extract information at still more levels.
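To make this three-level fusion concrete, the following is a minimal NumPy sketch of one enhanced residual block. The 1×1 convolutions (`conv1x1`) and the weight names `w2`/`w3` are illustrative assumptions, since the patent does not fix kernel sizes here; ReLU matches the optional linear rectifying module mentioned later.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W) feature image; w: (C_out, C_in) channel-mixing weights.
    # A 1x1 convolution is simply a per-pixel linear mix of channels.
    return np.einsum('oc,chw->ohw', w, x)

def enhanced_residual_block(x, w2, w3):
    h = np.maximum(conv1x1(x, w2), 0.0)  # second convolution layer + rectifying module (ReLU)
    y = conv1x1(h, w3)                   # third convolution layer, at an enlarged perception scale
    return x + h + y                     # second summation module: sums the three levels pixel-wise
```

With identity weights and a non-negative input, each of the three levels reproduces the input, so the block returns three times its input, which is a quick way to check the three-way summation.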
In addition, the network provided by the embodiment of the application further comprises a first summation module and a first channel attention module, wherein the first summation module sums the pixel values of the pixel points at the same position of the same channel in the multi-level information output by the first convolution layer and each multi-perception branch module to obtain a multi-channel image, and the first channel attention module gives different weights to different channels in the multi-channel image output by the first summation module, so that the characteristics of the different channels are utilized to different extents in a self-adaptive manner.
In summary, compared with the high-resolution image reconstructed by the existing method, the high-resolution image reconstructed by the embodiment of the application has higher fidelity.
Fig. 1 shows the structure of a multi-perception attention network according to an embodiment of the present application, which includes:
the device comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module, an up-sampling module and a fourth convolution layer.
The low-resolution image to be reconstructed is input into the first convolution layer; the output of the first convolution layer is input into each multi-perception branch module; the output of each multi-perception branch module and the output of the first convolution layer are each input into the first summation module, which sums the pixel values of pixel points at the same position of the same channel in its input images. The output of the first summation module is input into the first channel attention module; the output of the first channel attention module is input into the up-sampling module, which performs an r-times up-sampling operation on it; the output of the up-sampling module is input into the fourth convolution layer, and the fourth convolution layer performs a convolution operation on the output of the up-sampling module to obtain the high-resolution reconstructed image.
Specifically, the meaning of the first summation module summing the pixel values of pixel points at the same position of the same channel is as follows. Assume the output image of each multi-perception branch module is an n-channel image, consisting of the 1st channel, the 2nd channel, the 3rd channel, …, the n-th channel; the output of the first convolution layer is likewise an n-channel image with the same channels. In this embodiment, the first summation module sums the pixel values of pixel points at the same position of the 1st channel in the image output by each multi-perception branch module and the 1st channel in the image output by the first convolution layer; it sums the pixel values of pixel points at the same position of the 2nd channel in the image output by each multi-perception branch module and the 2nd channel in the image output by the first convolution layer; and so on, up to the n-th channel. That is, the first summation module sums, channel by channel and position by position, the image output by each multi-perception branch module and the image output by the first convolution layer.
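This channel-by-channel, position-by-position summation amounts to an element-wise addition of equally shaped (n, M, N) arrays; a sketch:

```python
import numpy as np

def first_summation(x0, branch_outputs):
    # x0: (n, M, N) output of the first convolution layer.
    # branch_outputs: list of (n, M, N) outputs, one per multi-perception branch module.
    y = x0.copy()
    for b in branch_outputs:
        y = y + b   # same channel, same position: pixel values are added
    return y
```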
It should be noted that, in the embodiment of the present application, the fourth convolution layer is optional: if the multi-perception attention network includes the fourth convolution layer, the fourth convolution layer outputs the high-resolution image; if it does not, the up-sampling module outputs the high-resolution image.
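The patent does not state how the up-sampling module realizes the r-times enlargement; a common implementation in super-resolution networks is sub-pixel convolution (pixel shuffle), sketched here under that assumption:

```python
import numpy as np

def pixel_shuffle(x, r):
    # Rearranges a (C*r*r, H, W) feature image into a (C, H*r, W*r) image:
    # each group of r*r channels supplies the r x r sub-pixel positions of one output channel.
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)   # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```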
Specifically, in the present embodiment, the low-resolution image input to the multi-perception attention network (hereinafter referred to as the MPAN for convenience of description) is denoted X, where X is an image of M rows, N columns and C channels. The parameters of the first convolution layer are denoted w_MPAN,1 and the parameters of the fourth convolution layer are denoted w_MPAN,2. The output of the first convolution layer is denoted X_0; specifically, X_0 = F_Conv(X, w_MPAN,1), where F_Conv represents a convolution operation. If the first convolution layer includes n convolution kernels, then X_0 is an n-channel image of M rows and N columns.
In this embodiment, the output of the first convolution layer is input into each multi-perception branch module (for convenience of description, any one multi-perception branch module is simply referred to as an MPB). The convolution layer set of the k-th MPB in the MPAN is denoted w_MPB^(k), where k = 1, 2, …, K and K is the number of multi-perception branch modules in the MPAN. In practice, to ensure that the high-resolution image reconstructed by the MPAN has high fidelity, testing shows that K should be not less than 2; of course, K may also take other values, and this embodiment does not specifically limit the value of K.
In this embodiment, the low-resolution image X input into the MPAN is processed according to the following formula (1):

Y = F_Conv( f_up( F_CA( X_0 + Σ_{k=1}^{K} F_MPB^(k)(X_0) ), r ), w_MPAN,2 )    (1)

where F_MPB^(1)(X_0) represents the result of the first MPB operating on the output X_0 of the first convolution layer, F_MPB^(2)(X_0) represents the result of the second MPB operating on X_0, and F_MPB^(K)(X_0) represents the result of the K-th MPB operating on X_0; the output of any MPB for X_0 is an n-channel image of M rows and N columns. The sum X_0 + Σ_{k=1}^{K} F_MPB^(k)(X_0) represents the first summation module summing the pixel values of pixel points at the same position of the same channel in X_0 and in the output of each MPB; the first summation module outputs an n-channel image of M rows and N columns. F_CA(·) represents the first channel attention module operating on the n-channel image of M rows and N columns output by the first summation module; denoting that image by X_1, F_CA(X_1) re-weights the channels of X_1 using two convolution layers, where w_down is a convolution layer consisting of n/r' convolution kernels of size 1×1×n, w_up is a convolution layer consisting of n convolution kernels of size 1×1×(n/r'), and r' is the vector dimension transform factor in the channel attention module. f_up(·, r) represents the up-sampling module up-sampling the output of the first channel attention module, where r is the image size magnification, i.e., the resolution improvement multiple or up-sampling rate. Finally, F_Conv(·, w_MPAN,2) represents the fourth convolution layer performing a convolution calculation on the output of the up-sampling module to reconstruct the high-resolution image.
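A NumPy sketch of this kind of channel attention follows. The global average pooling and sigmoid gate are assumptions taken from the standard squeeze-and-excitation form of channel attention, since the original formula image is not reproduced in this text; only the w_down/w_up bottleneck is given by the description above:

```python
import numpy as np

def channel_attention(x, w_down, w_up):
    # x: (n, H, W); w_down: (n//r', n) reduces the channel descriptor;
    # w_up: (n, n//r') restores it. Both act like 1x1 convolutions on a 1x1 map.
    z = x.mean(axis=(1, 2))                 # global average pooling: one scalar per channel
    s = np.maximum(w_down @ z, 0.0)         # dimension reduction + ReLU
    a = 1.0 / (1.0 + np.exp(-(w_up @ s)))   # dimension restoration + sigmoid gate
    return x * a[:, None, None]             # adaptively re-weight each channel of x
```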
In the present embodiment, the k-th MPB in the MPAN includes G^(k) residual channel attention groups (hereinafter referred to as RCAGs for convenience of description) and a sixth convolution layer; the structure of the k-th MPB is shown in Fig. 2. As can be seen from Fig. 2, the G^(k) RCAGs and the sixth convolution layer are cascaded: the output of the first RCAG is input to the second RCAG, the output of the second RCAG is input to the third RCAG, ..., the output of the (G^(k)-1)-th RCAG is input to the G^(k)-th RCAG, and the output of the G^(k)-th RCAG is input to the sixth convolution layer. In this embodiment, the convolution layer set of the g-th RCAG in the k-th MPB is denoted w_RCAG^(g) and the sixth convolution layer is denoted w_MPB; the convolution layer set w_MPB^(k) of the k-th MPB in the MPAN therefore consists of w_RCAG^(1), …, w_RCAG^(G^(k)) together with w_MPB.
For any one MPB, assuming that the input data is X, the calculation of the MPB on X is given by the following formula (2):

F_MPB(X) = F_Conv(F_RCAG,G(k)(... F_RCAG,2(F_RCAG,1(X)) ...), w_MPB)  (2)
where X is the input image; F_RCAG,g denotes the operation of the gth RCAG in the MPB on its input, calculated with the convolution layer set w_RCAG,g of that RCAG; and w_MPB denotes the sixth convolution layer in the MPB. For example, F_RCAG,1(X) denotes calculating X with the convolution layer set of the first RCAG.
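The cascade described by formula (2) can be sketched with stand-in callables. The toy RCAGs and sixth convolution layer below are illustrations only, not the patent's layers; whether the MPB adds its own internal skip connection is not recoverable from the missing formula image, so none is added here.

```python
import numpy as np

def mpb_forward(x, rcags, w_mpb_conv):
    """Multi-perception branch (MPB) forward pass sketch: G(k) cascaded
    RCAGs followed by the sixth convolution layer (stubbed as a callable)."""
    h = x
    for rcag in rcags:        # output of RCAG g feeds RCAG g+1
        h = rcag(h)
    return w_mpb_conv(h)      # last RCAG output enters the sixth conv layer

# Toy stand-ins: each "RCAG" is a residual scale-and-add map.
rcags = [lambda t, a=a: t + a * t for a in (0.1, 0.2)]
out = mpb_forward(np.ones((1, 2, 2)), rcags, lambda t: 2.0 * t)
```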
The structure of the gth RCAG in the kth MPB is shown in Fig. 3 and comprises a plurality of cascaded enhanced residual blocks, a second channel attention module, a fifth convolution layer, and a third summation module. Assuming the image input into the RCAG is X_2, X_2 is first input into the first enhanced residual block, the output of the first enhanced residual block is input into the second enhanced residual block, the output of the second enhanced residual block is input into the third enhanced residual block, and so on; the output of the last enhanced residual block is input into the second channel attention module, and the output of the second channel attention module is input into the fifth convolution layer. The output of the fifth convolution layer and the image X_2 input into the RCAG are both input into the third summation module, which sums pixel values of pixel points at the same position of the same channel in its input images and outputs a multi-channel image. If the image X_2 input into the RCAG has M rows, N columns, and n channels, the third summation module outputs an image of M rows, N columns, and n channels.
Specifically, the calculation of any one RCAG on its input image is given by the following formula (3):

F_RCAG(X) = F_Conv(F_CA(F_ERB,B(... F_ERB,1(X) ...)), w_RCAG) + X  (3)
where X is the input image of the RCAG; F_ERB,b denotes the operation of the bth ERB in the RCAG on its input, calculated with the convolution layer set of that ERB; w_RCAG denotes the convolution layer at the end of the RCAG, i.e., the fifth convolution layer in this embodiment; and the w_up and w_down inside the channel attention operation F_CA are as described for the first channel attention module in the MPAN and are not repeated here.
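The RCAG data flow, including the long skip connection realised by the third summation module, can be sketched as follows. The ERB, attention, and convolution callables are toy stand-ins, not the patent's trained layers.

```python
import numpy as np

def rcag_forward(x, erbs, attention, w_rcag_conv):
    """Residual channel attention group (RCAG) forward pass sketch:
    cascaded enhanced residual blocks, a channel attention module, a
    trailing (fifth) convolution layer, then a pixelwise sum with the
    group's own input (the third summation module)."""
    h = x
    for erb in erbs:
        h = erb(h)
    h = attention(h)
    h = w_rcag_conv(h)
    return h + x              # third summation module: skip connection

out = rcag_forward(
    np.full((1, 2, 2), 3.0),
    erbs=[lambda t: t + 1.0],     # toy ERB
    attention=lambda t: 0.5 * t,  # toy per-channel gate
    w_rcag_conv=lambda t: t,      # toy fifth convolution layer
)
```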
In this embodiment, the structure of any one enhanced residual block in the gth RCAG of the kth MPB is shown in Fig. 4. It comprises a second convolution layer, a third convolution layer, a rectification module, and a second summation module. The calculation process of the enhanced residual block on its input image is as follows: the input image is first input into the second convolution layer, the output of the second convolution layer is input into the rectification module, the output of the rectification module is input into the third convolution layer, and the output of the third convolution layer, the output of the rectification module, and the input image are respectively input into the second summation module.
Assuming that the image input into the enhanced residual block is denoted X, the calculation of the enhanced residual block on the image is given by the following formula (4):

F_ERB(X) = F_Conv(F_ReLU(F_Conv(X, w_ERB,1)), w_ERB,2) + F_ReLU(F_Conv(X, w_ERB,1)) + X  (4)
In the formula, w_ERB = {w_ERB,1, w_ERB,2} is the set of convolution layers in the enhanced residual block, where w_ERB,1 denotes the second convolution layer and w_ERB,2 denotes the third convolution layer. F_Conv(X, w_ERB,1) denotes the convolution operation performed on the image X by the second convolution layer; F_ReLU(F_Conv(X, w_ERB,1)) denotes the rectification module's rectification calculation on the output of the second convolution layer; and F_Conv(F_ReLU(F_Conv(X, w_ERB,1)), w_ERB,2) denotes the third convolution layer's convolution calculation on the output of the rectification module.
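Formula (4) can be sketched directly: the second summation module adds three terms, and the extra rectifier-output branch is what distinguishes the enhanced residual block from a plain residual block. The conv1/conv2 callables below are toy stand-ins for the second and third convolution layers.

```python
import numpy as np

def erb_forward(x, conv1, conv2):
    """Enhanced residual block sketch, formula (4): the second summation
    module adds the third-conv output, the rectifier output, and the
    block input."""
    a = np.maximum(conv1(x), 0.0)   # rectification module (ReLU)
    b = conv2(a)                    # third convolution layer
    return b + a + x                # second summation module: three terms

out = erb_forward(np.full((1, 2, 2), 2.0),
                  conv1=lambda t: t + 1.0,   # toy second conv layer
                  conv2=lambda t: 2.0 * t)   # toy third conv layer
```

With input 2, the rectifier branch yields 3, the third-conv branch yields 6, and the sum with the input gives 11 at every pixel.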
In this embodiment, the rectification module may specifically be a linear rectification module, i.e., the rectification module calculates the output of the second convolution layer according to a linear rectification (ReLU) function. The linear rectification function is prior art and is not described here.
Fig. 5 shows a training process of the multi-perception attention network (MPAN) according to an embodiment of the present application, comprising the following steps:
s501, acquiring an image set to be trained and resolution improvement multiples.
In this embodiment, the resolution improvement multiple is the super-resolution improvement multiple that the MPAN network obtained by training needs to achieve. For example, if the MPAN network trained in this embodiment is to achieve a 3-fold resolution improvement effect, the resolution improvement multiple in this step is 3.
In this step, the image set to be trained includes: a preset high resolution image set and a preset low resolution image set. The low-resolution image set is obtained by r times degradation of the high-resolution image set. Wherein, the value of r is the resolution improvement multiple in the step.
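The r-times degradation that produces the low-resolution training set can be sketched as follows. The patent does not specify the degradation operator; simple r×r box averaging is used here as an assumption (bicubic downscaling is the usual choice in super-resolution work).

```python
import numpy as np

def degrade(hr, r):
    """Build a low-resolution training image by r-times degradation of a
    high-resolution one, using r x r box averaging (an assumed operator)."""
    h, w = hr.shape[0] // r * r, hr.shape[1] // r * r
    hr = hr[:h, :w]                  # crop so both sides divide by r
    return hr.reshape(h // r, r, w // r, r).mean(axis=(1, 3))

hr = np.arange(36, dtype=float).reshape(6, 6)
lr = degrade(hr, 3)                  # r = 3 resolution improvement multiple
```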
It should be noted that the resolution improvement multiple in this embodiment may be set according to actual situations, and this embodiment does not limit its specific value.
Specifically, the high-resolution image set is denoted {I_HR^i} and the corresponding r-times-degraded low-resolution image set is denoted {I_LR^i}, where r is the resolution improvement multiple.
S502, initializing convolution kernels in each convolution layer in the MPAN network.
Specifically, the MPAN network in this embodiment is an MPAN network provided in fig. 1 in this embodiment of the present application.
In this step, the size of the convolution kernel in each convolution layer in the MPAN network is t×t×c, and the number of convolution kernels in each convolution layer may be n.
S503, inputting the low-resolution image set and the resolution improvement multiple into the MPAN network to obtain a result image set after the MPAN network reconstructs the low-resolution image set.
In this step, after the low resolution image set and the resolution improvement factor are input into the MPAN network, the MPAN network reconstructs the low resolution image, and outputs a high resolution image reconstructed from the low resolution image set, which is referred to as a result image for convenience of description.
S504, calculating a loss function value between the result image set and the high-resolution image set according to a preset loss function.
In this embodiment, the loss function may be the formula shown in the following formula (6):

L_1(W_MPAN) = (1/T) Σ_{i=1}^{T} || F_MPAN(I_LR^i, W_MPAN) − I_HR^i ||_1  (6)

where T is the number of image pairs in the image set.
where L_1(W_MPAN) denotes the loss function value, W_MPAN denotes the set of convolution operation weights of the MPAN network, F_MPAN(I_LR^i, W_MPAN) denotes the result image calculated by the MPAN network from the low-resolution image I_LR^i, and I_HR^i denotes the image in the high-resolution image set corresponding to I_LR^i.
In this embodiment, the loss function may also be the formula shown in the following formula (7):

L_2(W_MPAN) = (1/T) Σ_{i=1}^{T} || F_MPAN(I_LR^i, W_MPAN) − I_HR^i ||_2^2  (7)

where T is the number of image pairs in the image set.
where L_2(W_MPAN) denotes the loss function value, W_MPAN denotes the set of convolution operation weights of the MPAN network, F_MPAN(I_LR^i, W_MPAN) denotes the result image calculated by the MPAN network from the low-resolution image I_LR^i, and I_HR^i denotes the image in the high-resolution image set corresponding to I_LR^i.
This embodiment gives only two specific formulas for the loss function; in practice, the loss function may also take other forms, and this embodiment does not limit its specific form.
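The two loss options amount to mean absolute error and mean squared error over the image pairs; the exact per-pixel normalisation used in the patent is an assumption in this sketch.

```python
import numpy as np

def l1_loss(results, targets):
    """Formula (6) as reconstructed: mean absolute error between each
    result image and its high-resolution counterpart, averaged over pairs."""
    return np.mean([np.mean(np.abs(y - t)) for y, t in zip(results, targets)])

def l2_loss(results, targets):
    """Formula (7) as reconstructed: mean squared error, averaged the
    same way."""
    return np.mean([np.mean((y - t) ** 2) for y, t in zip(results, targets)])

targets = [np.zeros((2, 2)), np.zeros((2, 2))]
results = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
```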
S505, adjusting the convolution operation weights of all convolution layers in the MPAN network according to the loss function value, and returning to the step of inputting the low-resolution image set and the resolution improvement multiple into the MPAN network to obtain a result image set after the MPAN network reconstructs the low-resolution image set, until the loss function value no longer decreases, thereby obtaining the trained MPAN network.
The purpose of training the MPAN network in this embodiment is as follows: by adjusting the convolution operation weights of all convolution layers in the MPAN network, the loss function value between the result image calculated by the MPAN network from an input low-resolution image and the corresponding high-resolution image reaches its minimum, at which point the trained MPAN network is obtained. That is, the set formed by the convolution operation weights of all convolution layers in the trained MPAN network is optimal for the specified resolution improvement multiple r.
Specifically, in this step, the specific implementation process of adjusting the convolution operation weights of all the convolution layers in the MPAN network according to the loss function value is the prior art, and will not be described herein.
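Steps S503 to S505 amount to the loop sketched below, iterating until the loss value no longer decreases. The step_fn and loss_fn callables stand in for the MPAN forward pass and the weight-adjustment machinery, which the patent treats as prior art; the toy quadratic problem is only for illustration.

```python
import numpy as np

def train(step_fn, loss_fn, w, tol=1e-8, max_iter=10000):
    """S503-S505 as a loop sketch: repeat forward pass + weight update
    until the loss function value no longer decreases (change <= tol)."""
    prev = np.inf
    for _ in range(max_iter):
        loss = loss_fn(w)
        if prev - loss <= tol:      # loss no longer decreasing: stop
            break
        prev = loss
        w = step_fn(w, loss)
    return w, prev

# Toy stand-in problem: minimise (w - 4)^2 by plain gradient descent.
loss_fn = lambda w: (w - 4.0) ** 2
step_fn = lambda w, _: w - 0.1 * 2.0 * (w - 4.0)
w_star, final_loss = train(step_fn, loss_fn, 0.0)
```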
Fig. 6 is a schematic diagram of an image super-resolution reconstruction method according to an embodiment of the present application, including the following steps:
s601, acquiring a low-resolution image to be reconstructed and a preset resolution improvement multiple.
In this step, the manner of acquiring the low resolution image to be reconstructed is the prior art, and is not described herein. The resolution improvement factor in this step may be set by the user according to the actual situation, and the value of the resolution improvement factor is not limited in this embodiment.
S602, inputting the low-resolution image and the resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image.
The trained preset network in this step is the MPAN network obtained through training in the embodiment corresponding to Fig. 5, and the high-resolution reconstructed image output by the trained MPAN network is obtained.
S603, outputting a high-resolution reconstruction image.
In this step, a specific implementation manner of outputting the high-resolution reconstructed image is the prior art, and will not be described herein.
Fig. 7 is a schematic diagram of an image super-resolution reconstruction device according to an embodiment of the present application, including: an acquisition module 701, a reconstruction module 702 and an output module 703.
The acquiring module 701 is configured to acquire a low-resolution image to be reconstructed and a preset resolution improvement factor. The reconstruction module 702 is configured to input the low-resolution image and the resolution enhancement multiple into a trained preset network to obtain a high-resolution reconstructed image; the trained preset network comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module and an up-sampling module; any multi-perception branching module comprises a plurality of cascaded residual channel attention groups; any one of the residual channel attention groups includes a plurality of cascaded enhanced residual blocks;
The low-resolution image is input into a first convolution layer, and the output of the first convolution layer is respectively input into each multi-perception branch module; the output of each multi-perception branch module and the output of the first convolution layer are respectively input into a first summation module; the first summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image; the output of the first summing module is input into the first channel attention module; the output of the first channel attention module is input into the up-sampling module; the up-sampling module is used for up-sampling the output of the first channel attention module by a resolution improvement multiple; the up-sampling module obtains a high-resolution reconstruction image;
any one of the enhanced residual blocks includes: the second convolution layer, the third convolution layer, the rectifying module and the second summation module; an image for inputting the enhanced residual block is input to a second convolution layer; the output of the second convolution layer is input into the rectification module; the output of the rectifying module is input into the third convolution layer; the image used for inputting the enhancement residual block, the output of the rectifying module and the output of the third convolution layer are respectively input into the second summation module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain a multi-channel image;
An output module 703 for outputting a high resolution reconstructed image.
Optionally, the preset number is not less than 2.
Optionally, the preset network further includes: a fourth convolution layer; the output of the up-sampling module is input into a fourth convolution layer; and the fourth convolution layer carries out convolution operation on the output of the up-sampling module to obtain a high-resolution reconstructed image.
Optionally, any one of the residual channel attention groups further includes: a second channel attention module, a fifth convolution layer, and a third summation module; an image for inputting the residual channel attention group is input into a first enhanced residual block in the residual channel attention group; the output of the first enhanced residual block is input into a second enhanced residual block of the residual channel attention group; the output of the (B-1)th enhanced residual block in the residual channel attention group is input into the Bth enhanced residual block in the residual channel attention group; the output of the Bth enhanced residual block is input into the second channel attention module; the output of the second channel attention module is input into the fifth convolution layer; the image for inputting the residual channel attention group and the output of the fifth convolution layer are respectively input into the third summation module; and the third summation module is used for summing pixel values of the pixel points at the same position of the same channel in the input image and outputting a multi-channel image.
Optionally, the rectifying module is a linear rectifying module.
An embodiment of the present application provides an apparatus, as shown in fig. 8, including at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory complete communication with each other through a bus; the processor is used for calling the program instructions in the memory to execute the image super-resolution reconstruction method. The device herein may be a server, PC, PAD, cell phone, etc.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, the device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.
Claims (10)
1. An image super-resolution reconstruction method, which is characterized by comprising the following steps:
acquiring a low-resolution image to be reconstructed and a preset resolution improvement multiple;
inputting the low-resolution image and the resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image; the trained preset network comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module and an up-sampling module; any one of the multi-perception branching modules comprises a plurality of cascaded residual channel attention groups; any one of the residual channel attention groups comprises a plurality of cascaded enhanced residual blocks;
the low-resolution image is input into the first convolution layer, and the output of the first convolution layer is respectively input into each multi-perception branch module; the output of each multi-perception branching module and the output of the first convolution layer are respectively input into the first summation module; the first summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image; the output of the first summing module is input to the first channel attention module; the output of the first channel attention module is input into the up-sampling module; the up-sampling module is used for up-sampling the output of the first channel attention module by the resolution improvement times; the up-sampling module obtains the high-resolution reconstructed image;
Any one of the enhanced residual blocks includes: the second convolution layer, the third convolution layer, the rectifying module and the second summation module; an image for inputting the enhanced residual block is input to the second convolution layer; the output of the second convolution layer is input into the rectification module; the output of the rectifying module is input into the third convolution layer; the image for inputting the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer are respectively input to the second summing module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain a multi-channel image;
outputting the high-resolution reconstructed image.
2. The method of claim 1, wherein the predetermined number is not less than 2.
3. The method of claim 1, wherein the pre-set network further comprises: a fourth convolution layer;
the output of the up-sampling module is input into the fourth convolution layer;
and the fourth convolution layer carries out convolution operation on the output of the up-sampling module to obtain the high-resolution reconstructed image.
4. A method according to claim 3, wherein any one of said residual channel attention groups further comprises: a second channel attention module, a fifth convolution layer, and a third summation module;
An image for inputting the residual channel attention group is input into a first enhanced residual block in the residual channel attention group; the output of the first enhanced residual block is input into a second enhanced residual block of the residual channel attention group; the output of the (B-1)th enhanced residual block in the residual channel attention group is input into the Bth enhanced residual block in the residual channel attention group; the output of the Bth enhanced residual block is input into the second channel attention module; the output of the second channel attention module is input into the fifth convolution layer;
the image for inputting the residual channel attention group and the output of the fifth convolution layer are respectively input into the third summation module;
and the third summation module is used for summing pixel values of the pixel points at the same position of the same channel in the input image and outputting a multi-channel image.
5. The method of any one of claims 1-4, wherein the rectifying module is a linear rectifying module.
6. An image super-resolution reconstruction apparatus, comprising:
the acquisition module is used for acquiring the low-resolution image to be reconstructed and a preset resolution improvement multiple;
the reconstruction module is used for inputting the low-resolution image and the resolution improvement multiple into a trained preset network to obtain a high-resolution reconstructed image; the trained preset network comprises a first convolution layer, a preset number of multi-perception branch modules, a first summation module, a first channel attention module and an up-sampling module; any one of the multi-perception branching modules comprises a plurality of cascaded residual channel attention groups; any one of the residual channel attention groups comprises a plurality of cascaded enhanced residual blocks;
The low-resolution image is input into the first convolution layer, and the output of the first convolution layer is respectively input into each multi-perception branch module; the output of each multi-perception branching module and the output of the first convolution layer are respectively input into the first summation module; the first summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image; the output of the first summing module is input to the first channel attention module; the output of the first channel attention module is input into the up-sampling module; the up-sampling module is used for up-sampling the output of the first channel attention module by the resolution improvement times; the up-sampling module obtains the high-resolution reconstructed image;
any one of the enhanced residual blocks includes: the second convolution layer, the third convolution layer, the rectifying module and the second summation module; an image for inputting the enhanced residual block is input to the second convolution layer; the output of the second convolution layer is input into the rectification module; the output of the rectifying module is input into the third convolution layer; the image for inputting the enhanced residual block, the output of the rectifying module, and the output of the third convolution layer are respectively input to the second summing module; the second summation module is used for summing pixel values of pixel points at the same position of the same channel in the input image to obtain a multi-channel image;
And the output module is used for outputting the high-resolution reconstructed image.
7. The apparatus of claim 6, wherein the pre-set network further comprises: a fourth convolution layer;
the output of the up-sampling module is input into the fourth convolution layer;
and the fourth convolution layer carries out convolution operation on the output of the up-sampling module to obtain the high-resolution reconstructed image.
8. The apparatus of claim 6, wherein any one of the residual channel attention groups further comprises: a second channel attention module, a fifth convolution layer, and a third summation module;
an image for inputting the residual channel attention group, a first enhanced residual block in the residual channel attention group; the output of the first enhanced residual block inputs a second enhanced residual block of the residual channel attention group; the output of the (B-1) th enhanced residual block in the residual channel attention group is input into the (B) th enhanced residual block in the residual channel attention group; the output of the B enhanced residual block is input into the second channel attention module; the output of the second channel attention module is input into the fifth convolution layer;
the image for inputting the residual channel attention group and the output of the fifth convolution layer are respectively input into the third summation module;
And the third summation module is used for summing pixel values of the pixel points at the same position of the same channel in the input image and outputting a multi-channel image.
9. A storage medium comprising a stored program, wherein the program performs the image super-resolution reconstruction method according to any one of claims 1 to 5.
10. An apparatus comprising at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory complete communication with each other through the bus; the processor is configured to invoke program instructions in the memory to perform the image super-resolution reconstruction method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911140450.XA CN111223046B (en) | 2019-11-20 | 2019-11-20 | Image super-resolution reconstruction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111223046A CN111223046A (en) | 2020-06-02 |
CN111223046B true CN111223046B (en) | 2023-04-25 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108734660A (en) * | 2018-05-25 | 2018-11-02 | 上海通途半导体科技有限公司 | A kind of image super-resolution rebuilding method and device based on deep learning |
CN109003229A (en) * | 2018-08-09 | 2018-12-14 | 成都大学 | Magnetic resonance super resolution ratio reconstruction method based on three-dimensional enhancing depth residual error network |
CN110111256A (en) * | 2019-04-28 | 2019-08-09 | 西安电子科技大学 | Image Super-resolution Reconstruction method based on residual error distillation network |
WO2019192588A1 (en) * | 2018-04-04 | 2019-10-10 | 华为技术有限公司 | Image super resolution method and device |
Non-Patent Citations (1)

Title |
---|
Cui Shun. Research on image super-resolution reconstruction technology based on deep learning. Information Science and Technology, China Excellent Master's Theses Full-text Database, 2018, pp. 9-46. * |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |