CN109035142B - Satellite image super-resolution method combining countermeasure network with aerial image prior - Google Patents


Publication number
CN109035142B
Authority
CN
China
Prior art keywords
image
loss
resolution
model
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810777731.5A
Other languages
Chinese (zh)
Other versions
CN109035142A (en)
Inventor
黄源
侯兴松
赵世正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGDONG XI'AN JIAOTONG UNIVERSITY ACADEMY
Xian Jiaotong University
Original Assignee
GUANGDONG XI'AN JIAOTONG UNIVERSITY ACADEMY
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGDONG XI'AN JIAOTONG UNIVERSITY ACADEMY, Xian Jiaotong University filed Critical GUANGDONG XI'AN JIAOTONG UNIVERSITY ACADEMY
Priority to CN201810777731.5A priority Critical patent/CN109035142B/en
Publication of CN109035142A publication Critical patent/CN109035142A/en
Application granted granted Critical
Publication of CN109035142B publication Critical patent/CN109035142B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a satellite image super-resolution method that combines a generative adversarial network (GAN) with an aerial image prior. A denoising model is first trained on image pairs, each consisting of a noisy level-16 image and the corresponding noise-free level-16 image, and an image super-resolution model is then trained on clear aerial data. Because satellite images and aerial images are not available as matched pairs, the generated super-resolution image is post-processed: clear aerial images are used to construct an external prior dictionary from a Gaussian mixture model (GMM), and this dictionary guides the reconstruction of the unclear internal (satellite) image. After reconstruction, the image is sharpened by Gaussian filtering to further improve its quality. The final result is a high-resolution version of the original satellite image with improved visual quality. The experiments show the effectiveness of the scheme, which offers a practical approach to satellite image super-resolution and quality improvement under real-world data constraints.

Description

Satellite image super-resolution method combining countermeasure network with aerial image prior
Technical Field
The invention belongs to the technical field of image super-resolution, and in particular relates to a satellite image super-resolution method that combines a generative adversarial network trained with a multi-scale perceptual loss and an aerial image prior.
Background
Image resolution is an important index of image quality: a higher-resolution image shows more detail more clearly. During acquisition, however, hardware and the external environment limit the resolution that can be obtained, which raises the problem of how to recover a high-resolution image from a low-resolution one. As the number of satellites has grown, satellites now cover more than 90% of the earth, so the area that can be monitored by satellite is much larger than the area covered by images obtained by other means. Satellite images nevertheless have relatively low resolution for a variety of reasons. Compared with an aerial image, for example, a satellite image is blurrier and lacks detail, while the coverage of aerial images is far smaller than that of satellite images. Obtaining satellite images of higher resolution is therefore of considerable significance and value.
In the field of image super-resolution, combining deep neural networks with the traditional super-resolution problem has produced a new breakthrough. With the development of computer hardware, the cost of large-scale accelerated computation, and therefore of training deep neural networks, has fallen markedly; this has greatly benefited researchers, and the technology is now widely applied in many fields. From SRCNN, the first network to apply deep learning to super-resolution, to SRGAN, a super-resolution algorithm built on a generative adversarial network (GAN), the approach is the same: network parameters are trained on pairs of low-resolution and high-resolution images to obtain a model that converts a low-resolution image into a high-resolution one, so that at inference time a high-resolution image is generated from the low-resolution image alone.
The image super-resolution problem is described as follows:
Image super-resolution is the process of obtaining a high-resolution image from the corresponding low-resolution image; it overcomes the limits of the original imaging hardware to produce a clearer image. Super-resolution methods are generally divided into two cases: methods based on a single image and methods based on multiple images. Single-image super-resolution enlarges a low-resolution image and raises its resolution through a reconstruction algorithm. Multi-image super-resolution reconstructs a high-resolution image by fusing a sequence of similar frames.
In single-image super-resolution, the algorithm establishes a relationship between the low-resolution image and the high-resolution image and uses it to reconstruct the high-resolution image from the low-resolution one. Conventional algorithms model the causes of low resolution in various ways, constructing degradation models that fit the process by which the low-resolution image is generated; the resulting relationship between the two images is then used to predict the high-resolution image. This process can be described by the following equation:
$I_L = H I_H + n$
where $I_L$ is the low-resolution image, $I_H$ is the high-resolution image corresponding to $I_L$, $H$ is the degradation model that generates the low-resolution image, and $n$ is the noise introduced while the low-resolution image is generated. The degradation model $H$ can in turn be expressed as:
$H = D_{Sub} \times B \times G$
where $D_{Sub}$ denotes the down-sampling method, $B$ is a blurring factor, and $G$ is a geometric deformation factor.
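The degradation equation above can be illustrated with a minimal NumPy sketch. It is not part of the patent: the box blur standing in for $B$, the identity used for $G$, the decimation used for $D_{Sub}$, and the function names `box_blur` and `degrade` are all assumptions chosen for illustration.

```python
import numpy as np

def box_blur(img, k=3):
    """Blur with a k x k mean filter (an assumed stand-in for the blur factor B)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def degrade(hr, scale=2, noise_std=0.01, seed=0):
    """Simulate I_L = D_Sub(B(G(I_H))) + n, with G taken as the identity (assumption)."""
    rng = np.random.default_rng(seed)
    blurred = box_blur(hr.astype(np.float64))
    lr = blurred[::scale, ::scale]                 # D_Sub: keep every scale-th pixel
    return lr + rng.normal(0.0, noise_std, lr.shape)  # n: additive Gaussian noise

hr = np.ones((8, 8))
lr = degrade(hr, scale=2, noise_std=0.0)
print(lr.shape)
```

A learning-based method tries to invert this mapping from data, whereas interpolation only undoes the down-sampling step and cannot restore the detail removed by the blur.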
The main approaches to the degradation model above are interpolation-based methods, reconstruction-based methods, and learning-based methods. Interpolation methods achieve super-resolution by decomposing the image, interpolating, and writing back the interpolated values; they run fast, can be computed in parallel, and can meet real-time requirements. Interpolation, however, cannot predict the high-frequency information missing from the low-resolution image, so the resulting high-resolution image lacks texture detail and sharp edges. Reconstruction-based algorithms are further divided into spatial-domain and frequency-domain methods; they establish the correspondence between low-resolution and high-resolution images in the spatial or frequency domain and rely on a manually designed model of that correspondence. Classical examples include projection onto convex sets and maximum a posteriori estimation. Their drawback is that a manually designed model cannot recover detail across many kinds of images: it works well on only a small amount of data, and image detail does not improve further as more data becomes available.
Learning-based methods, like reconstruction-based methods, map a low-resolution image to a high-resolution image by establishing the relationship between the two, but they use external training samples to obtain prior knowledge about that relationship. Examples include methods based on manifold learning, sparse representation, and deep neural networks. Methods such as sparse representation are limited by the size of the dictionary that is built and by the difficulty of guaranteeing data sparsity, so they cannot deliver a stable super-resolution effect. Among deep-neural-network methods, those based on residual networks and on generative adversarial networks must learn from pairs of low-resolution and high-resolution images using a large number of parameters. In addition, the predicted high-frequency information of the high-resolution image is still incomplete, so texture-rich areas look smooth.
For satellite images these methods face practical limits. Satellite images of very high resolution cannot currently be acquired, so pairs of high-resolution and low-resolution satellite images are difficult to obtain, and many super-resolution methods that require such pairs cannot be applied directly to satellite images. Satellite images are also strongly affected by noise at acquisition, so grain noise is conspicuous, and applying single-image super-resolution directly amplifies this noise and reduces clarity. Aerial images can serve as auxiliary data: although they cover far less area than satellite images, they closely resemble satellite images and are much clearer. The aerial and satellite image data currently available are not paired, that is, they were not captured at the same place and time. Under these constraints, how to denoise satellite images, how to super-resolve them, and how to use clear aerial image data to enhance the clarity of satellite data are the problems to be solved.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects of the prior art, a satellite image super-resolution method that combines a generative adversarial network trained with a multi-scale perceptual loss and an aerial image prior. The method addresses the lack of a clear-image prior (no clear satellite images exist) in super-resolution algorithms that use satellite images only, and so produces clearer satellite images. Even when only satellite data is used, the added multi-scale perceptual loss yields a clearer super-resolution image than other methods.
The invention adopts the following technical scheme:
A satellite image super-resolution method combining a generative adversarial network with an aerial image prior trains a denoising model on image pairs formed by a noisy level-16 image and the corresponding noise-free level-16 image, and then trains an image super-resolution model on aerial data. Aerial images are used to construct a GMM external prior dictionary, which guides the reconstruction of the unclear internal satellite image and completes the post-processing of the generated super-resolution image. The image is then sharpened by Gaussian filtering, and a high-resolution version of the original satellite image is obtained, improving visual quality relative to the original satellite image.
Specifically, the method comprises the following steps:
S1, defining the generator, the discriminator and the multi-scale perceptual loss network of the generative adversarial network;
S2, down-sampling images extracted at level 18 from existing satellite data to level 16; the resulting level-16 satellite image is taken as the denoising target $I_{D\_H}$, satellite data extracted directly at level 16 is taken as the noisy image $I_{D\_L}$, the two form an image pair, and the generated noise-free satellite image is denoted $I_{D\_GH}$;
S3, performing initialization training on the generator of the denoising model using the image pairs formed in step S2; during initialization training the mean square error is used as the loss function, the pixel-wise mean square error between the image produced by the generator and the corresponding target image gives the MSE generator loss $\mathrm{loss}_{MSE}$, and the gradient is calculated and backpropagated to adjust the model parameters;
S4, after 100 epochs of initialization training, training the complete model: the losses and corresponding gradients are calculated and backpropagated to adjust the parameters of the generator and the discriminator, while the parameters of the perceptual loss network VGG19 are not adjusted;
S5, training for 200 epochs until convergence under the above settings and saving the model; the trained generator is used for denoising, the resulting denoised image $I_{D\_GH}$ serves as the input for image super-resolution, and a satellite image super-resolution model is defined;
S6, repeating steps S3 to S5 to train the super-resolution network in the same way as the denoising model, then generating the super-resolution image $I_{SR\_GH}$ and constructing an external prior dictionary using a Gaussian mixture model;
S7, constructing the GMM external prior dictionary: clear level-17 aerial images are divided into 15 × 15 patches, which are then preliminarily grouped by Euclidean distance;
and S8, reconstructing the satellite image from the reconstructed groups of internal image patches, and sharpening the reconstructed satellite image to obtain the final result image.
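The source states only that the final sharpening uses Gaussian filtering. One common reading is unsharp masking, in which a Gaussian-blurred copy is subtracted from the image to boost edges; the sketch below follows that reading, which is an assumption, and the kernel size, `sigma` and `amount` values are hypothetical.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian kernel normalized to sum to 1 (size assumed odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, size=5, sigma=1.0):
    """Separable Gaussian blur of a 2-D image with edge padding."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    rows = sum(k[i] * padded[:, i:i + w] for i in range(size))   # horizontal pass
    return sum(k[i] * rows[i:i + h, :] for i in range(size))     # vertical pass

def unsharp_mask(img, amount=1.0, size=5, sigma=1.0):
    """Sharpen by adding back the detail removed by the Gaussian blur."""
    blurred = gaussian_blur(img, size, sigma)
    return img + amount * (img - blurred)
```

Flat regions are left unchanged, while intensity steps gain a small overshoot on each side, which is what makes edges look crisper.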
Further, in step S1, the generator of the generative adversarial network is defined as follows: a residual network is used as the generator; it comprises 16 residual modules, and each residual module contains three convolutional layers;
the discriminator is defined as follows: a 10-layer convolutional neural network is used as the discriminator, and its convolutional layers use dilated convolution;
the multi-scale perceptual loss is defined as follows: a VGG19 network pre-trained on the 1000-class ImageNet classification database is used as the perceptual loss network, and the multi-scale feature maps from its conv2_2, conv3_4 and conv4_4 layers are used to construct the multi-scale perceptual loss.
Further, in step S3, the MSE generator loss function $\mathrm{loss}_{MSE}$ is as follows:
$\mathrm{loss}_{MSE} = \mathrm{MSE}(I_{D\_GH}, I_{D\_H})$
Further, in step S4, during training of the complete model, the generator loss function is the weighted sum of the MSE generator loss $\mathrm{loss}_{MSE}$, the perceptual loss $\mathrm{loss}_{vgg}$ and the adversarial loss $\mathrm{loss}_{GAN}$:
$\mathrm{loss}_{G} = \mathrm{loss}_{MSE} + \mathrm{loss}_{vgg} + \mathrm{loss}_{GAN}$
Further, the perceptual loss $\mathrm{loss}_{vgg}$ is as follows:
$\mathrm{loss}_{vgg} = 10^{-6} \times (\mathrm{loss}_{mse\_conv2\_2} + \mathrm{loss}_{mse\_conv3\_4} + \mathrm{loss}_{mse\_conv4\_4})$
$\mathrm{loss}_{mse\_conv2\_2} = \mathrm{MSE}(f_{i\_conv2\_2}, f_{t\_conv2\_2})$
$\mathrm{loss}_{mse\_conv3\_4} = \mathrm{MSE}(f_{i\_conv3\_4}, f_{t\_conv3\_4})$
$\mathrm{loss}_{mse\_conv4\_4} = \mathrm{MSE}(f_{i\_conv4\_4}, f_{t\_conv4\_4})$
where $f_{i\_conv2\_2}$, $f_{i\_conv3\_4}$ and $f_{i\_conv4\_4}$ are the conv2_2, conv3_4 and conv4_4 feature maps obtained by feeding the generated image into the perceptual model, and $f_{t\_conv2\_2}$, $f_{t\_conv3\_4}$ and $f_{t\_conv4\_4}$ are the corresponding feature maps obtained by feeding the target image of the generated image into the perceptual model;
the adversarial loss function $\mathrm{loss}_{GAN}$ is as follows:
$\mathrm{loss}_{GAN} = 10^{-4} \times \mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True})$
$\mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True}) = \log(D(I_{D\_GH}))$
where $D(\cdot)$ is the discriminator.
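The three generator loss terms can be combined as in the following sketch. It assumes the VGG19 feature maps have already been computed and are passed in as dictionaries keyed by layer name, and it assumes the discriminator $D$ outputs a probability. The adversarial term uses the standard cross-entropy sign, $-\log D(\cdot)$, so that minimizing it pushes $D(I_{D\_GH})$ towards 1; that sign convention and all function names are assumptions, not taken from the source.

```python
import numpy as np

LAYERS = ("conv2_2", "conv3_4", "conv4_4")

def mse(a, b):
    """Mean squared error between two arrays of equal shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def perceptual_loss(feats_gen, feats_tgt, weight=1e-6):
    """loss_vgg: weighted sum of feature-map MSEs over the three VGG19 layers."""
    return weight * sum(mse(feats_gen[k], feats_tgt[k]) for k in LAYERS)

def adversarial_loss(d_prob_gen, weight=1e-4, eps=1e-12):
    """loss_GAN with the assumed -log D(.) convention; eps avoids log(0)."""
    p = np.asarray(d_prob_gen, dtype=np.float64)
    return weight * float(np.mean(-np.log(p + eps)))

def generator_loss(gen, tgt, feats_gen, feats_tgt, d_prob_gen):
    """loss_G = loss_MSE + loss_vgg + loss_GAN."""
    return (mse(gen, tgt)
            + perceptual_loss(feats_gen, feats_tgt)
            + adversarial_loss(d_prob_gen))
```

The small weights keep the feature-domain and adversarial terms from dominating the pixel-level MSE, since raw VGG feature differences are typically much larger in magnitude than pixel differences.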
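Further, in step S4, the discriminator loss function $\mathrm{loss}_{D}$ used during training of the complete model is defined as:
$\mathrm{loss}_{D} = \mathrm{loss}_{1} + \mathrm{loss}_{2}$
$\mathrm{loss}_{1} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_GH}, \mathrm{False})$
$\mathrm{loss}_{2} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_H}, \mathrm{True})$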
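A minimal sketch of the discriminator loss follows, assuming the discriminator returns raw logits (pre-sigmoid scores) for generated and real images. The numerically stable form `max(x, 0) - x*z + log(1 + exp(-|x|))` is the usual way to evaluate sigmoid cross-entropy; the function names are illustrative.

```python
import numpy as np

def sigmoid_cross_entropy(logits, label):
    """Mean sigmoid cross-entropy of logits against a constant True/False label."""
    x = np.asarray(logits, dtype=np.float64)
    z = 1.0 if label else 0.0
    return float(np.mean(np.maximum(x, 0.0) - x * z + np.log1p(np.exp(-np.abs(x)))))

def discriminator_loss(logits_generated, logits_real):
    """loss_D = loss_1 + loss_2."""
    loss1 = sigmoid_cross_entropy(logits_generated, False)  # generated image labeled False
    loss2 = sigmoid_cross_entropy(logits_real, True)        # target image labeled True
    return loss1 + loss2
```

An undecided discriminator (logit 0 for both inputs) pays log 2 on each term, and the loss approaches zero as it learns to separate generated images from targets.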
Further, in step S5, the super-resolution model comprises a generator, a perceptual model and a discriminator; the perceptual model and the discriminator have the same structure as those used in the denoising model, and the generator of the image super-resolution model is defined as follows:
a residual module is constructed, several residual modules are stacked to form the main body of the network, and the image is enlarged by a sub-pixel convolution layer.
Furthermore, the generator of the super-resolution model is trained on aerial data: the input is an image pair formed by a low-resolution level-16 aerial image $I_{SR\_L}$ and the corresponding high-resolution level-17 aerial image $I_{SR\_H}$, the generator output is $I_{SR\_GH}$, and the loss function of the generator is defined as follows:
$\mathrm{loss}_{MSE\_SR} = \mathrm{MSE}(I_{SR\_GH}, I_{SR\_H})$
Further, step S7 is specifically as follows:
S701, constructing a GMM (Gaussian mixture model) from the grouped image patches, performing SVD (singular value decomposition) on the covariance matrices of the resulting model, and constructing a dictionary that serves as the external prior guiding the reconstruction of the satellite image;
S702, taking the output $I_{SR\_GH}$ of the preceding super-resolution model as the internal image, dividing it into 15 × 15 patches, and clustering these patches under the guidance of the GMM obtained when the external prior dictionary was constructed;
S703, using the dictionary formed from the external prior to guide the internal image patches in constructing an internal dictionary;
and S704, performing sparse coding over the internal dictionary and reconstructing a new group of internal image patches by combining the result with the original group of internal image patches.
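Steps S701 and S702 can be sketched as follows. This is a simplified illustration under stated assumptions: the group labels are taken as given by the preliminary Euclidean-distance grouping, each group is modeled by a single Gaussian without running EM, the components are given equal weights, and a small ridge term is added to each covariance for numerical stability. The sparse coding of S703 and S704 is not covered, and all function names are hypothetical.

```python
import numpy as np

def extract_patches(img, size=15, stride=15):
    """Cut a 2-D image into size x size blocks and flatten each to a vector."""
    h, w = img.shape
    return np.array([img[y:y + size, x:x + size].ravel()
                     for y in range(0, h - size + 1, stride)
                     for x in range(0, w - size + 1, stride)], dtype=np.float64)

def component_dictionaries(patches, labels, n_groups, ridge=1e-6):
    """Per group: mean, covariance, and a dictionary from the SVD of the covariance."""
    dim = patches.shape[1]
    comps = []
    for g in range(n_groups):
        group = patches[labels == g]          # each group needs at least 2 patches
        mean = group.mean(axis=0)
        cov = np.cov(group, rowvar=False) + ridge * np.eye(dim)
        u, s, _ = np.linalg.svd(cov)          # columns of u form an orthogonal dictionary
        comps.append({"mean": mean, "cov": cov, "dictionary": u, "singular_values": s})
    return comps

def assign_component(patch, comps):
    """Assign an internal patch to the component with the highest Gaussian log-likelihood."""
    scores = []
    for c in comps:
        d = patch - c["mean"]
        _, logdet = np.linalg.slogdet(c["cov"])
        scores.append(-0.5 * (d @ np.linalg.solve(c["cov"], d) + logdet))
    return int(np.argmax(scores))
```

In this reading, the components are learned from aerial patches only, and each 15 × 15 patch of $I_{SR\_GH}$ is routed to the aerial component that explains it best, so that the matching dictionary can guide its reconstruction.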
Compared with the prior art, the invention has at least the following beneficial effects:
The invention provides a satellite image super-resolution method combining a generative adversarial network with an aerial image prior. It is designed for the real-world situation in which the resolution and visual quality of a satellite image are to be improved but no clear aerial image paired with the satellite image exists. The method comprises three parts, image denoising, image super-resolution and image post-processing, which together form a workflow that progressively improves the final satellite image super-resolution result within the range of available data.
Furthermore, both the denoising model and the super-resolution model in the satellite image super-resolution workflow are built as generative adversarial networks, and a multi-scale perceptual loss is added on top of them, which further improves the performance of the generative adversarial network in image denoising and image super-resolution. The perceptual loss constrains the image produced by the generator and its target in the feature domain, so that the generated image is visually closer to the real target image. The multi-scale perceptual loss combines perceptual losses at several scales and so imposes a stronger constraint, further improving the generated result.
Further, the generator and the discriminator are different modules of the generative adversarial network with different roles, and how they are defined is important for denoising and super-resolving the satellite image. The generator builds its loss mainly on point-to-point pixel differences, and its network body is designed to extract the high-frequency information of the image (through the residual structure). The discriminator works at the high-level semantic level to keep the generated image consistent with the real target image, and therefore needs a larger receptive field (achieved with dilated convolution). The multi-scale perceptual loss is a feature-domain constraint between the generated image and the real target image, implemented with a network pre-trained on ImageNet.
Further, the generator is required to produce an image that is noise-free and similar to the real clear image at the pixel level, so its loss function is an MSE function based on the differences between pixels.
Further, the discriminator constrains the similarity between the generated image and the real clear image at the high-level semantic level. The cross-entropy function is a loss based on the discrimination probability; the aim is to maximize the probability that the generated image and the real target image are judged to belong to the same category semantically, that is, to make the generated image as similar as possible to the real target image.
Further, in the super-resolution model the discriminator and the perceptual model play the same roles as in the denoising model, so the same structures are used. The network body of the generator is also similar (it must still generate more high-frequency information, so it also adopts a residual structure), but because the super-resolution model must produce an image larger than the low-resolution input, a sub-pixel convolution layer paired with a convolutional layer is used to enlarge the image.
Further, in practice, pairs consisting of a clearer satellite image (close to the clarity of an aerial image) and a lower-resolution satellite image cannot be obtained, which limits the achievable quality of satellite image super-resolution.
In conclusion, the invention builds the denoising model and the image super-resolution model by combining constraints at the pixel level, the semantic level and the multi-scale feature domain. To handle unpaired satellite training data, it introduces aerial images both to train the super-resolution model and to build the GMM dictionary used in image post-processing, which guides the reconstruction of clearer satellite images.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is an overall flow diagram;
FIG. 2 is a diagram of a generator in a denoising model;
FIG. 3 is a structural diagram of a discriminator in a denoising model;
FIG. 4 is a structural diagram of VGG19 in the denoising model;
FIG. 5 is a diagram of a generator structure in an image super-resolution model;
FIG. 6 is a flow chart for constructing a GMM model using aerial images and guiding satellite image reconstruction;
FIG. 7 is a diagram illustrating the effect of the present invention;
FIG. 8 is a graph comparing the results of the present invention.
Detailed Description
The invention provides a satellite image super-resolution method that combines a generative adversarial network trained with a multi-scale perceptual loss and an aerial image prior. A denoising model is first trained on image pairs formed by a noisy level-16 image and the corresponding noise-free level-16 image, and an image super-resolution model is then trained on clear aerial data. Because satellite images and aerial images are not available as matched pairs, the generated super-resolution image is post-processed: clear aerial images are used to construct the external prior dictionary of a GMM, which guides the reconstruction of the unclear internal satellite image. After reconstruction, the image is sharpened by Gaussian filtering to further improve its quality. The final result is a high-resolution version of the original satellite image with improved visual quality. The experiments show the effectiveness of the scheme, which offers a practical approach to satellite image super-resolution and quality improvement under real-world data constraints.
Referring to FIG. 1, the satellite image super-resolution method of the invention, which combines a generative adversarial network trained with a multi-scale perceptual loss and an aerial image prior, comprises the following specific steps:
S1, the generative adversarial network that performs denoising comprises three parts: a generator, a discriminator, and a VGG19 network pre-trained on the ImageNet database;
S101, defining the generator of the generative adversarial network: a residual network is used as the generator; it comprises 16 residual modules, and each residual module contains three convolutional layers. The denoising function to be realized here does not need to enlarge the image. The specific structure is shown in FIG. 2.
S102, defining the structure of the discriminator: the discriminator is a 10-layer convolutional neural network whose convolutional layers use dilated convolution. Setting the dilation range enlarges the receptive field without using pooling layers, which improves the accuracy of the discriminator. The specific structure is shown in FIG. 3. The 10 convolutional layers have 64, 128, 256, 512, 1024, 512, 256, 128, 128 and 128 convolution kernels respectively, first increasing and then decreasing. The first 7 layers all use 4 × 4 kernels with a stride of 2, applied as successive sliding convolutions; increasing the number of kernels allows as many feature types as possible to be captured. The last layer uses 1 × 1 kernels to reduce the number of parameters: because more kernels mean more channels, such a layer is added to adjust the channel count.
S103, defining the multi-scale perceptual loss: a VGG19 network pre-trained on the 1000-class ImageNet classification database is used as the perceptual loss network. Unlike other perceptual losses, the multi-scale feature maps from the conv2_2, conv3_4 and conv4_4 layers of the VGG19 network are used together to construct a multi-scale perceptual loss, which improves the quality of the images produced by the generator. The specific structure is shown in FIG. 4: the first two convolution modules each comprise two convolutional layers and one pooling layer, and the subsequent convolution modules each comprise four convolutional layers and one pooling layer. All convolutional layers use 3 × 3 kernels with a stride of 1, and the number of kernels increases layer by layer, similarly to the discriminator: 64, 64, 128, 128, 256, 256, 256, 256, 512, 512, 512, 512, 512, where conv2_2, conv3_4 and conv4_4 are the outputs of the second, third and fourth convolution modules respectively.
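The effect of dilated convolution on the discriminator's receptive field (S102) can be checked with a short calculation. A dilated kernel of size k and dilation d covers d·(k − 1) + 1 input pixels, and each layer widens the receptive field by that footprint minus one, scaled by the product of the preceding strides. The sketch below applies this to the seven 4 × 4, stride-2 layers described in S102; the dilation rate of 2 is a hypothetical value, since the source does not state the rates used.

```python
def receptive_field(layers):
    """layers: iterable of (kernel, stride, dilation); returns the receptive field in input pixels."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        effective = dilation * (kernel - 1) + 1   # footprint of a dilated kernel
        rf += (effective - 1) * jump              # growth scaled by accumulated stride
        jump *= stride
    return rf

plain = [(4, 2, 1)] * 7      # seven 4x4, stride-2 layers without dilation
dilated = [(4, 2, 2)] * 7    # same layers with an assumed dilation rate of 2

print(receptive_field(plain), receptive_field(dilated))
```

Under these assumptions the receptive field roughly doubles with no pooling layer and no additional parameters, which matches the motivation given for dilated convolution in the discriminator.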
S2, images extracted at level 18 from existing satellite data are down-sampled to level 16. Common satellite images are generally level 16, and level-18 data is expensive to acquire; the level-16 data obtained by down-sampling is clearer, but very little such clear data is available because of the high acquisition cost of level-18 satellite data.
The level-16 satellite image obtained in this way is taken as the denoising target $I_{D\_H}$, and ordinary satellite data extracted directly at level 16 is taken as the noisy image $I_{D\_L}$; the two form an image pair. The noise-free satellite image produced by the generator is denoted $I_{D\_GH}$.
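The construction of the denoising target can be sketched as follows, assuming that "level" refers to a tile-pyramid zoom level in which each level doubles the linear resolution, so that going from level 18 to level 16 is a reduction by a factor of 4 along each axis. Both that interpretation and the use of block averaging as the down-sampling method are assumptions; the source says only that the level-18 image is down-sampled.

```python
import numpy as np

def downsample_levels(img, from_level=18, to_level=16):
    """Down-sample a 2-D image by block averaging across the given number of levels."""
    factor = 2 ** (from_level - to_level)          # 4 for level 18 -> level 16
    h = img.shape[0] // factor * factor            # crop to a multiple of the factor
    w = img.shape[1] // factor * factor
    blocks = img[:h, :w].astype(np.float64).reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 2 + 1))            # average over each factor x factor block

level18 = np.arange(64).reshape(8, 8)
target = downsample_levels(level18)
print(target.shape)
```

Averaging 16 source pixels per output pixel suppresses independent noise, which is consistent with the statement that the down-sampled level-16 image is clearer than data captured directly at level 16.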
S3, the generator of the denoising model is given initialization training using the image pairs formed in step S2. During initialization training the mean square error (MSE) is used as the loss function: the pixel-wise mean square error between the image produced by the generator and the corresponding target image is computed, and the gradient is calculated and backpropagated to adjust the model parameters. $\mathrm{loss}_{MSE}$ is as follows:
$\mathrm{loss}_{MSE} = \mathrm{MSE}(I_{D\_GH}, I_{D\_H})$
S4, after about 100 epochs of initialization training (one epoch is one pass over all image data in the image library), the complete model is trained;
At this stage all three networks take part in training, but the parameters of VGG19 are not adjusted; it only outputs the perceptual loss, which is passed to the generator and the discriminator to adjust their parameters. The loss function of the generator in the overall training differs from the one used in the separate initialization training.
In the overall training, the loss function of the generator has three parts: the MSE generator loss, the perceptual loss and the adversarial loss. Their weighted sum forms the generator loss function for the overall training:
$\mathrm{loss}_{G} = \mathrm{loss}_{MSE} + \mathrm{loss}_{vgg} + \mathrm{loss}_{GAN}$
where $\mathrm{loss}_{MSE}$ is the loss function used in initialization training and $\mathrm{loss}_{vgg}$ is the perceptual loss:
$\mathrm{loss}_{vgg} = 10^{-6} \times (\mathrm{loss}_{mse\_conv2\_2} + \mathrm{loss}_{mse\_conv3\_4} + \mathrm{loss}_{mse\_conv4\_4})$
$\mathrm{loss}_{mse\_conv2\_2} = \mathrm{MSE}(f_{i\_conv2\_2}, f_{t\_conv2\_2})$
$\mathrm{loss}_{mse\_conv3\_4} = \mathrm{MSE}(f_{i\_conv3\_4}, f_{t\_conv3\_4})$
$\mathrm{loss}_{mse\_conv4\_4} = \mathrm{MSE}(f_{i\_conv4\_4}, f_{t\_conv4\_4})$
where $f_{i\_conv2\_2}$, $f_{i\_conv3\_4}$ and $f_{i\_conv4\_4}$ are the conv2_2, conv3_4 and conv4_4 feature maps obtained by feeding the generated image into the perceptual model, and $f_{t\_conv2\_2}$, $f_{t\_conv3\_4}$ and $f_{t\_conv4\_4}$ are the corresponding feature maps obtained by feeding the target image of the generated image into the perceptual model;
$\mathrm{loss}_{GAN}$ is the adversarial loss function:
$\mathrm{loss}_{GAN} = 10^{-4} \times \mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True})$
$\mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True}) = \log(D(I_{D\_GH}))$
where $D(\cdot)$ is the discriminator.
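In the overall training, the discriminator loss function is defined as:
$$\mathrm{loss}_{D} = \mathrm{loss}_{1} + \mathrm{loss}_{2}$$
$$\mathrm{loss}_{1} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_GH}, \mathrm{False})$$
$$\mathrm{loss}_{2} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_H}, \mathrm{True})$$
where $\mathrm{loss}_{D}$ is the discriminator loss; the loss and the corresponding gradient are calculated and back-propagated to adjust the model parameters of the discriminator.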
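The two-term loss of the discriminator (the judger) can be sketched in NumPy as below. The sketch assumes that the discriminator outputs raw logits and that `sigmoid_cross_entropy` applies the sigmoid internally; the patent names the function but does not state these details, so both are assumptions, as are the function signatures.

```python
import numpy as np

def sigmoid_cross_entropy(logits, label):
    """Numerically stable binary cross-entropy on logits against a constant label."""
    x = np.asarray(logits, dtype=np.float64)
    z = 1.0 if label else 0.0
    # Equivalent to -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)).
    return float(np.mean(np.maximum(x, 0.0) - x * z + np.log1p(np.exp(-np.abs(x)))))

def discriminator_loss(logits_generated, logits_real):
    """loss_D = loss_1 (generated image labelled False) + loss_2 (target image labelled True)."""
    loss1 = sigmoid_cross_entropy(logits_generated, False)
    loss2 = sigmoid_cross_entropy(logits_real, True)
    return loss1 + loss2

undecided = discriminator_loss([0.0], [0.0])      # D outputs probability 0.5 for both inputs
confident = discriminator_loss([-10.0], [10.0])   # D separates generated and real images well
```

An undecided discriminator gives 2·ln 2 here, and the loss falls toward zero as the discriminator separates generated images from target images.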
S5. Train for 200 epochs with the above settings until convergence and save the model. The trained generator is used for subsequent denoising, and the resulting denoised image $I_{D\_GH}$ serves as the input to the subsequent image super-resolution. A satellite image super-resolution model is then defined;
The super-resolution model likewise consists of three main parts: a generator, a perceptual model and a discriminator. The perceptual model and the discriminator have the same structure as in the denoising model.
The generator of the image super-resolution model is defined as follows. Its main structure is again a residual network: a residual module is constructed, several residual modules are stacked to form the main body of the network, and sub-pixel convolution layers are then used to enlarge the image. The specific structure is shown in fig. 5. The structure of the super-resolution generator is similar to that of the generator in the denoising model defined above and also stacks a plurality of residual modules. The convolution layers in the residual modules all use 3 × 3 kernels, 64 per layer; the sub-pixel convolution layers that follow, and the convolution layers connected to them, all use 256 kernels of size 3 × 3. In the super-resolution model that realizes ×2 magnification, the scale of the first sub-pixel convolution layer is 1 and the scale of the second sub-pixel convolution layer is 2.
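The rearrangement performed by a sub-pixel convolution layer can be sketched in NumPy as follows. The sketch only shows the channel-to-space step that follows the convolution; the function name `pixel_shuffle`, the channel-first layout and the channel ordering (the one commonly used by deep learning frameworks) are assumptions, since the patent does not specify them.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature tensor into (C, H*r, W*r)."""
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, row offset, column offset)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, row offset, w, column offset)
    return x.reshape(c, h * r, w * r)

# 256 feature maps, as produced by the 256-kernel convolution described above,
# become 64 feature maps at twice the spatial size for a x2 layer.
features = np.zeros((256, 8, 8))
upscaled = pixel_shuffle(features, 2)

# Small example that makes the interleaving visible.
tiny = np.arange(16).reshape(4, 2, 2)
tiny_up = pixel_shuffle(tiny, 2)
```

For ×2 magnification, each group of four input channels supplies the four pixels of one 2 × 2 output cell, which is why 256 input channels yield 64 output channels.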
The generator of the super-resolution model is trained on aerial data. The training data are image pairs formed by a low-resolution level-16 aerial image $I_{SR\_L}$, which is the input, and the corresponding high-resolution level-17 aerial image $I_{SR\_H}$; the output of the generator is $I_{SR\_GH}$.
The loss function of the generator is defined as:
$$\mathrm{loss}_{MSE\_SR} = \mathrm{MSE}(I_{SR\_GH}, I_{SR\_H})$$
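The construction of low-resolution/high-resolution training pairs from aerial images can be sketched as below. The patent states that the level-16 image is obtained by down-sampling the level-17 image but does not name the down-sampling filter; the 2 × 2 block averaging used here, and the function names, are assumptions for illustration.

```python
import numpy as np

def downsample_2x(img):
    """Halve height and width by averaging non-overlapping 2x2 blocks (assumed filter)."""
    img = np.asarray(img, dtype=np.float64)
    h2 = img.shape[0] - img.shape[0] % 2   # crop odd edges so blocks tile exactly
    w2 = img.shape[1] - img.shape[1] % 2
    img = img[:h2, :w2]
    blocks = img.reshape(h2 // 2, 2, w2 // 2, 2, *img.shape[2:])
    return blocks.mean(axis=(1, 3))

def make_sr_pair(level17_image):
    """Return (I_SR_L, I_SR_H): the down-sampled input and its level-17 target."""
    i_sr_h = np.asarray(level17_image, dtype=np.float64)
    return downsample_2x(i_sr_h), i_sr_h

high = np.arange(16, dtype=np.float64).reshape(4, 4)   # toy level-17 tile
low, target = make_sr_pair(high)
```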
S6. Repeat steps S3 to S5 to complete the training of the super-resolution network, and use it together with the denoising model to generate the super-resolution image $I_{SR\_GH}$. To further bring the sharp prior of the aerial images into the satellite image, a Gaussian mixture model (GMM) is used to construct an external prior dictionary that guides the internal image reconstruction; this is combined with image sharpening to further improve the quality of the generated super-resolution satellite image;
S7. Construct a GMM external prior dictionary to guide the reconstruction of a sharper satellite image from the internal image (the technique was originally used for image denoising). Because no image pairs can be formed between the aerial images and the satellite images, the generative adversarial network model proposed above cannot be trained on them directly; constructing a GMM external prior dictionary instead allows the rich details of the sharp aerial images to be introduced indirectly into the super-resolved satellite image. To construct the GMM external prior dictionary, the sharp level-17 aerial images are divided into 15 × 15 blocks, and the blocks are then preliminarily grouped according to Euclidean distance, as shown in fig. 6;
S701. Construct a GMM from the grouped image blocks, perform singular value decomposition (SVD) on the covariance matrices of the resulting model, and construct a dictionary that serves as the external prior guiding the reconstruction of the satellite image;
S702. Take the image $I_{SR\_GH}$ output by the preceding super-resolution model as the internal image, divide it into 15 × 15 blocks, and cluster the blocks under the guidance of the GMM obtained when the external prior dictionary was constructed;
S703. Use the dictionary formed from the external prior to guide the internal image blocks in constructing an internal dictionary;
S704. Sparsely encode over the internal dictionary, and reconstruct new internal image block groups in combination with the original internal image block groups.
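The blocking, grouping and SVD steps above can be sketched in NumPy. The sketch uses the 15 × 15 block size from the text and the step of 3 and group size of 10 given in the experimental parameter settings. It is a simplification under stated assumptions: fitting the full GMM (for example by expectation-maximization) is omitted, and the covariance of a single block group stands in for the covariance of one Gaussian component. All function names are hypothetical.

```python
import numpy as np

def extract_patches(img, size=15, stride=3):
    """Slide a size x size window over a 2-D image and return flattened blocks."""
    h, w = img.shape
    patches = [img[r:r + size, c:c + size].ravel()
               for r in range(0, h - size + 1, stride)
               for c in range(0, w - size + 1, stride)]
    return np.asarray(patches, dtype=np.float64)

def nearest_group(patches, ref_index, k=10):
    """Indices of the k blocks closest to a reference block in Euclidean distance."""
    dist = np.linalg.norm(patches - patches[ref_index], axis=1)
    return np.argsort(dist, kind="stable")[:k]

def component_dictionary(group):
    """SVD of a group's covariance matrix; the columns of u act as dictionary atoms."""
    centered = group - group.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(group) - 1, 1)
    u, s, _ = np.linalg.svd(cov)
    return u, s

rng = np.random.default_rng(0)
image = rng.random((30, 30))                  # toy stand-in for a sharp aerial tile
patches = extract_patches(image)              # 6 x 6 window positions, 225 values each
group_idx = nearest_group(patches, ref_index=0)
dictionary, singular_values = component_dictionary(patches[group_idx])
```

The orthogonal matrix returned by the SVD gives one candidate dictionary per component; in the described method such external dictionaries then guide the construction of the internal dictionary and the sparse coding of the satellite image blocks, which this sketch does not cover.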
S8. Reconstruct the satellite image from the reconstructed internal image block groups, and apply an image sharpening operation to the reconstructed satellite image so that the edges in the image become clearer, yielding the final result image.
The invention combines a multi-scale perceptual loss with a generative adversarial network and realizes super-resolution of satellite images under the stated restrictive conditions. A denoising network is trained with satellite images, an image super-resolution network is trained with aerial images, and the super-resolved image is further reconstructed using the sharp feature prior that a Gaussian mixture model extracts from the aerial images. The edges in the image are then sharpened by a further Gaussian filtering step, and a clearer satellite image is finally generated.
The invention improves image resolution and image quality under restrictive conditions. The multi-scale perceptual loss imposes multi-scale constraints on the feature domain of the generated image, so that better-performing images are generated.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A. Conditions of the experiment
1. Database for experimental use
The experimental data are satellite image data and aerial image data provided in a satellite image super-resolution project. They are not public data sets, and only part of them is shown here. The data include:
Data type 1: satellite images extracted at level 16, which contain obvious granular noise and are of low definition;
Data type 2: satellite images extracted at level 18, in which granular noise is not noticeable and which are slightly sharper. When satellite images extracted at level 18 are down-sampled to level 16, they are clearer than satellite images extracted directly at level 16. However, satellite images extracted at level 18 are costly and generally difficult to obtain in large quantities, whereas satellite images extracted at level 16 are common. It is therefore of great research significance and value to train an image super-resolution model on a small set of satellite images extracted at level 18 and then use it, assisted by sharp aerial images, to produce from low-definition level-16 satellite images results similar or even superior to satellite images extracted at level 18. Little data of this type is available here, but its coverage area overlaps with that of data type 1, so a small number of image pairs can be constructed for model training.
Data type 3: sharp aerial data. Owing to the shooting height and shooting mode, aerial images are clearer than satellite images: at the same level, an aerial image is much sharper and contains abundant texture information. However, the coverage and sources of aerial images are limited, and aerial images of the same locations and similar time periods as data types 1 and 2 cannot be obtained. No image pairs of satellite and aerial images exist, so the aerial images cannot be used directly for training. The data sets are summarized in table 1.
TABLE 1 Data sets and their distribution
| Data type / level | Level 15 | Level 16 | Level 17 | Level 18 | Total |
| --- | --- | --- | --- | --- | --- |
| Data type 1 | 12989 | 51956 | None | None | 64945 |
| Data type 2 | 1583 | 6332 | 25328 | 101302 | 134555 |
| Data type 3 | 1689 | 7104 | 27988 | 111952 | 148733 |
2. Experimental requirements
The experiment is divided into three parts: denoising model training, image super-resolution model training and image post-processing.
Denoising model training: image pairs are formed from a satellite image extracted at level 16 (containing noise) and a satellite image extracted at level 18 and down-sampled to level 16 (free of noise, but of low definition). These pairs are the training data for the generative adversarial network proposed in this scheme. After training, a noisy satellite image is fed to the generator model to obtain a noise-free satellite image. To ensure the robustness of the model, all test satellite images come from urban areas different from those used in training, and are again noisy images extracted at level 16.
Image super-resolution model training: image pairs are constructed from a level-17 aerial image and the level-16 aerial image obtained by down-sampling it, and the model is trained on these pairs. After the generative adversarial network proposed in this scheme has been trained, a noise-free level-16 satellite image is fed to the generator model to produce the corresponding level-17 high-resolution image, and the visual quality of the resulting high-resolution images is compared.
Image post-processing experiment: the satellite image that has undergone denoising and super-resolution is post-processed to further improve image quality. First, a GMM external prior dictionary is trained on sharp level-17 aerial images to serve as guidance; the level-17 satellite image obtained by super-resolution is then input, an internal dictionary is constructed under the guidance of the external prior, and the image is reconstructed, yielding a satellite image that incorporates the sharp prior of the aerial images. The image is then sharpened using Gaussian filtering to obtain the final post-processed image, which is compared with the original image in terms of clarity and visual effect.
3. Experimental parameter settings
The denoising model and the image super-resolution model are trained with the same settings. The generator is first trained on its own, with an initial learning rate of 0.0001 for 100 epochs (one epoch is one complete pass over the training data). In the overall training of the network, the initial learning rate is again 0.0001 and training runs for 200 epochs; the learning rate is decayed once, to 0.00001, when training reaches the halfway point.
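The learning-rate schedule of the overall training can be written as a small step function. This is a minimal sketch; the function name is hypothetical, and the choice that the decay takes effect at the start of epoch 100 (zero-based) is an assumption about what "reaching half" means.

```python
def overall_training_lr(epoch, total_epochs=200, base_lr=1e-4, decayed_lr=1e-5):
    """Learning rate for a zero-based epoch index during the overall training."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch out of range")
    # Single step decay once half of the training epochs have been completed.
    return base_lr if epoch < total_epochs // 2 else decayed_lr

schedule = [overall_training_lr(e) for e in range(200)]
```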
In image post-processing, the parameters for constructing the GMM external prior dictionary are as follows: the block step is 3 and the block size is 15 × 15; during clustering, the 10 image blocks with the smallest Euclidean distance are selected as a group; the GMM contains 32 Gaussian components, i.e. 32 categories are fitted. Gaussian filtering is used for image sharpening, with the filter radius set to 1.5 and the sharpening intensity set to 2.
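One common way to sharpen with a Gaussian filter is unsharp masking, sketched below in NumPy. The patent only states that Gaussian filtering is used with radius 1.5 and intensity 2; treating the operation as unsharp masking, interpreting the radius as the Gaussian standard deviation and the intensity as the multiplier on the detail layer, and padding by edge replication are all assumptions.

```python
import numpy as np

def gaussian_kernel(sigma=1.5):
    """Normalized 1-D Gaussian kernel truncated at about three standard deviations."""
    half = int(np.ceil(3 * sigma))
    x = np.arange(-half, half + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma=1.5):
    """Separable Gaussian blur of a 2-D image with edge-replicated borders."""
    k = gaussian_kernel(sigma)
    half = len(k) // 2
    padded = np.pad(np.asarray(img, dtype=np.float64), half, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def unsharp_mask(img, sigma=1.5, amount=2.0):
    """Add back the detail layer (image minus its blur) scaled by the sharpening amount."""
    img = np.asarray(img, dtype=np.float64)
    return img + amount * (img - gaussian_blur(img, sigma))

flat = np.full((8, 8), 0.5)                                       # no edges: nothing to sharpen
edge = np.hstack([np.zeros((8, 8)), np.ones((8, 8))])             # vertical step edge
sharp_flat = unsharp_mask(flat)
sharp_edge = unsharp_mask(edge)
```

A flat region is left unchanged, while a step edge gains overshoot on both sides, which is what makes edges look crisper; in practice the result would usually be clipped back to the valid pixel range.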
B. Evaluation criteria for experimental results
Since the actual test input is a satellite image extracted at level 16 (containing noise), no corresponding sharp level-17 satellite image exists, so common reference-based metrics such as PSNR and SSIM cannot be applied directly. The effectiveness of the scheme is therefore illustrated by comparing images from a selection of test results.
C. Comparative test protocol
Referring to fig. 7 and fig. 8, the test results show the effectiveness of the proposed scheme in a practical setting. Because of the restrictive conditions described in the background, a general image super-resolution algorithm cannot be trained and applied directly, and the expected effect can only be achieved through a series of image processing algorithms. Starting from the original noisy level-16 satellite image, the final generated image not only has the noise removed but is also super-resolved to level 17 (i.e., its side length is doubled). In addition, the sharp prior of aerial images (not co-located with the satellite images) is used to improve the definition of the generated level-17 satellite image.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (6)

1. A satellite image super-resolution method combining a generative adversarial network with an aerial image prior, characterized in that image pairs formed by a level-16 noisy image and the corresponding level-16 noise-free image are used to train a denoising model, and aerial data are then used to train an image super-resolution model; aerial images are used to construct a GMM external prior dictionary that guides the reconstruction of the unclear internal satellite image, completing the post-processing of the generated super-resolution image; the image is sharpened by Gaussian filtering, and a high-resolution image of the original satellite image is finally obtained, thereby improving the visual quality of the image relative to the original satellite image; the method comprises the following steps:
S1. Define the generator, the discriminator and the multi-scale perceptual loss network of the generative adversarial network, wherein the generator of the generative adversarial network is defined as follows: a residual network is used as the generator, the residual network comprising 16 residual modules, each residual module comprising three convolutional layers;
the structure of the discriminator is defined as follows: a 10-layer convolutional neural network is used as the discriminator, and the convolutional layers of the convolutional neural network use dilated (hole) convolution;
the multi-scale perceptual loss is defined as follows: a VGG19 network pre-trained on the ImageNet 1000-class classification database is used as the perceptual loss network, and the multi-scale perceptual loss is constructed from the multi-scale feature maps of the conv2_2, conv3_4 and conv4_4 layers;
S2. Down-sample images extracted at level 18 in the existing satellite data to level 16 and take the resulting level-16 satellite image as the denoising target $I_{D\_H}$; take satellite data extracted at level 16 as the noisy image $I_{D\_L}$; the two form an image pair, and the generated noise-free satellite image is denoted $I_{D\_GH}$;
S3. Perform initialization training on the generator of the denoising model using the image pairs formed in step S2; in the initialization training, the mean square error is used as the loss function, the pixel-wise mean square error between the image generated by the generator and its corresponding target image is calculated to obtain the MSE generator loss function $\mathrm{loss}_{MSE}$, and the gradient is calculated and back-propagated to adjust the model parameters;
S4. After 100 epochs of initialization training, train the complete model: the loss and the corresponding gradient are calculated and back-propagated to adjust the model parameters of the generator and the discriminator, while the parameters of the perceptual loss network VGG19 are not adjusted; during complete model training, the MSE generator loss function $\mathrm{loss}_{MSE}$, the perceptual loss function $\mathrm{loss}_{vgg}$ and the adversarial loss function $\mathrm{loss}_{GAN}$ are weighted and summed to form the generator loss function of the overall training, as follows:
$$\mathrm{loss}_{G} = \mathrm{loss}_{MSE} + \mathrm{loss}_{vgg} + \mathrm{loss}_{GAN}$$
S5. Train for 200 epochs until convergence and save the model; the trained generator is used for denoising, and the resulting denoised image $I_{D\_GH}$ serves as the input to image super-resolution; define an image super-resolution model comprising a generator, a perceptual model and a discriminator, wherein the perceptual model and the discriminator have the same structure as in the denoising model, and the generator of the image super-resolution model is defined as follows:
construct a residual module, stack a plurality of residual modules to form the main body of the network, and enlarge the image through sub-pixel convolution layers;
S6. Repeat steps S3 to S5 to complete the training of the super-resolution network, use it together with the denoising model to generate the super-resolution image $I_{SR\_GH}$, and construct an external prior dictionary using a Gaussian mixture model;
S7. Construct the GMM external prior dictionary: divide the sharp level-17 aerial images into 15 × 15 blocks, and then preliminarily group the blocks according to Euclidean distance;
S8. Reconstruct the satellite image from the reconstructed internal image block groups, and apply an image sharpening operation to the reconstructed satellite image to obtain the final result image.
2. The satellite image super-resolution method combining a generative adversarial network with an aerial image prior as claimed in claim 1, characterized in that, in step S3, the MSE generator loss function $\mathrm{loss}_{MSE}$ is as follows:
$$\mathrm{loss}_{MSE} = \mathrm{MSE}(I_{D\_GH}, I_{D\_H})$$
3. The satellite image super-resolution method combining a generative adversarial network with an aerial image prior as claimed in claim 1, characterized in that, in step S4, the perceptual loss $\mathrm{loss}_{vgg}$ is as follows:
$$\mathrm{loss}_{vgg} = 10^{-6} \times (\mathrm{loss}_{mse\_conv2\_2} + \mathrm{loss}_{mse\_conv3\_4} + \mathrm{loss}_{mse\_conv4\_4})$$
$$\mathrm{loss}_{mse\_conv2\_2} = \mathrm{MSE}(f_{i\_conv2\_2}, f_{t\_conv2\_2})$$
$$\mathrm{loss}_{mse\_conv3\_4} = \mathrm{MSE}(f_{i\_conv3\_4}, f_{t\_conv3\_4})$$
$$\mathrm{loss}_{mse\_conv4\_4} = \mathrm{MSE}(f_{i\_conv4\_4}, f_{t\_conv4\_4})$$
where $f_{i\_conv2\_2}$, $f_{i\_conv3\_4}$ and $f_{i\_conv4\_4}$ are the conv2_2, conv3_4 and conv4_4 layer feature maps obtained by feeding the generated image into the perceptual model, and $f_{t\_conv2\_2}$, $f_{t\_conv3\_4}$ and $f_{t\_conv4\_4}$ are the corresponding conv2_2, conv3_4 and conv4_4 layer feature maps obtained by feeding the target image of the generated image into the perceptual model;
the adversarial loss function $\mathrm{loss}_{GAN}$ is as follows:
$$\mathrm{loss}_{GAN} = 10^{-4} \times \mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True})$$
$$\mathrm{cross\_entropy}(I_{D\_GH}, \mathrm{True}) = \log(D(I_{D\_GH}))$$
where $D(\cdot)$ is the discriminator.
4. The satellite image super-resolution method combining a generative adversarial network with an aerial image prior as claimed in claim 1, characterized in that, in step S4, the discriminator loss function $\mathrm{loss}_{D}$ used in the overall model training is defined as:
$$\mathrm{loss}_{D} = \mathrm{loss}_{1} + \mathrm{loss}_{2}$$
$$\mathrm{loss}_{1} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_GH}, \mathrm{False})$$
$$\mathrm{loss}_{2} = \mathrm{sigmoid\_cross\_entropy}(I_{D\_H}, \mathrm{True})$$
5. The satellite image super-resolution method combining a generative adversarial network with an aerial image prior as claimed in claim 1, characterized in that the generator of the super-resolution model is trained on aerial data, the training data being image pairs formed by a low-resolution level-16 aerial image $I_{SR\_L}$, which is the input, and the corresponding high-resolution level-17 aerial image $I_{SR\_H}$; the output of the generator is $I_{SR\_GH}$, and the loss function of the generator is defined as follows:
$$\mathrm{loss}_{MSE\_SR} = \mathrm{MSE}(I_{SR\_GH}, I_{SR\_H})$$
6. The satellite image super-resolution method combining a generative adversarial network with an aerial image prior as claimed in claim 1, characterized in that step S7 is specifically as follows:
S701. Construct a GMM from the grouped image blocks, perform singular value decomposition (SVD) on the covariance matrices of the resulting model, and construct a dictionary that serves as the external prior guiding the reconstruction of the satellite image;
S702. Take the image $I_{SR\_GH}$ output by the preceding super-resolution model as the internal image, divide it into 15 × 15 blocks, and cluster the blocks under the guidance of the GMM obtained when the external prior dictionary was constructed;
S703. Use the dictionary formed from the external prior to guide the internal image blocks in constructing an internal dictionary;
S704. Sparsely encode over the internal dictionary, and reconstruct new internal image block groups in combination with the original internal image block groups.
CN201810777731.5A 2018-07-16 2018-07-16 Satellite image super-resolution method combining countermeasure network with aerial image prior Active CN109035142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810777731.5A CN109035142B (en) 2018-07-16 2018-07-16 Satellite image super-resolution method combining countermeasure network with aerial image prior

Publications (2)

Publication Number Publication Date
CN109035142A CN109035142A (en) 2018-12-18
CN109035142B true CN109035142B (en) 2020-06-19

Family

ID=64642502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810777731.5A Active CN109035142B (en) 2018-07-16 2018-07-16 Satellite image super-resolution method combining countermeasure network with aerial image prior

Country Status (1)

Country Link
CN (1) CN109035142B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant