CN118195930A - Image denoising model training method, image denoising device and electronic equipment - Google Patents

Info

Publication number: CN118195930A (application CN202211584782.9A)
Authority: CN (China)
Prior art keywords: image, pixel, pixel value, noise, pixel position
Legal status: Pending (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202211584782.9A
Other languages: Chinese (zh)
Inventors: 王林, 汝佩哲, 李�诚, 周晓, 杨再初
Current assignee (listing may be inaccurate): Beijing Yingtelinda Information Technology Co ltd; Intelingda Information Technology Shenzhen Co ltd
Original assignee: Beijing Yingtelinda Information Technology Co ltd; Intelingda Information Technology Shenzhen Co ltd
Application filed by Beijing Yingtelinda Information Technology Co ltd and Intelingda Information Technology Shenzhen Co ltd
Priority claimed from application CN202211584782.9A
Publication of CN118195930A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]

Abstract

The embodiments of the application provide an image denoising model training method, an image denoising device, and an electronic device, relating to the technical field of image processing. In the training method, each noise image in the sample data is input into an image denoising model to be trained to obtain a predicted image. For each pixel position in the noise image, if the pixel value of that position in the predicted image lies in a first pixel value interval, a loss value at that position is calculated based on the pixel value of that position in the predicted image and a second pixel value interval. The model parameters of the image denoising model to be trained are then adjusted based on a loss value for the noise image, computed from the loss values at the individual pixel positions, until a convergence condition is reached and training is complete. This approach avoids over-fitting of the image denoising model to a certain extent, so that processing images with the trained image denoising model can improve the image quality of the denoised images.

Description

Image denoising model training method, image denoising device and electronic equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image denoising model training method, an image denoising device, and an electronic apparatus.
Background
With the rapid development of computer technology, image information is used ever more widely, and requirements on image quality have risen accordingly. Currently, the sensor in an image acquisition device forms an image from the photons of incident light. However, during acquisition the sensor is easily affected by factors such as ambient light intensity, so the resulting image can contain considerable noise. Images therefore generally need to be denoised to obtain high-quality results.
In the related art, image denoising is performed by an image denoising model. During training, a loss function value is calculated based on the difference between the denoised image output by the image denoising model and noise-free label data (also referred to as ground-truth data, i.e., a noise-free image), and the parameters of the image denoising model are updated based on the loss function value.
In this approach, the noisy image must be fitted to a completely noise-free image. This fitting is difficult and causes the image denoising model to over-fit; as a result, the denoised image is blurred, image details are lost, and the image quality of the denoised image is low.
Disclosure of Invention
The embodiment of the application aims to provide an image denoising model training method, an image denoising device and electronic equipment, which can avoid the phenomenon that an image denoising model is over-fitted to a certain extent, and further, the image is processed based on the obtained image denoising model, so that the image quality of the denoised image can be improved. The specific technical scheme is as follows:
In a first aspect of the present application, there is provided an image denoising model training method, the method comprising:
Acquiring sample data; wherein the sample data comprises: a plurality of noise images including noise, first tag images not including noise corresponding to the respective noise images, and second tag images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the first label image;
Inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a predicted image;
For each pixel position in the noise image, if the pixel value of the pixel position in the predicted image is in a first pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and a second pixel value interval;
Wherein, the end point of the first pixel value interval includes: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one end point of the second pixel value interval is a pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to a third pixel value interval; the end point of the third pixel value interval includes: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the corresponding second label image;
calculating a loss value of the noise image based on the loss value at each pixel position;
Based on the loss value of the noise image, the model parameters of the image denoising model to be trained are adjusted until convergence conditions are reached, and the trained image denoising model is obtained.
Optionally, if the pixel value of the pixel location in the predicted image is located in the first pixel value interval, calculating the loss value at the pixel location based on the pixel value of the pixel location in the predicted image and the second pixel value interval includes:
If the pixel value of the pixel position in the predicted image is within the first pixel value interval, calculating a loss value at the pixel position based on the difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding second label image.
Optionally, the method further comprises:
If the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and the fifth pixel value interval; wherein, the end point of the fourth pixel value interval includes: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one end point of the fifth pixel value interval is a pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval belongs to the third pixel value interval.
Optionally, if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval, calculating the loss value at the pixel location based on the pixel value of the pixel location in the predicted image and the fifth pixel value interval includes:
If the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on the difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding first label image.
Optionally, the method further comprises:
And if the pixel value of the pixel position in the predicted image is in the third pixel value interval, determining the loss value of the pixel position as 0.
Optionally, each image in the sample data is an image in a RAW format.
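The interval logic in the first aspect above can be sketched as a per-pixel loss function. This is a non-authoritative sketch: the claims say only that the loss is calculated based on the difference between pixel values, so the absolute difference used here is an assumption, as are all function and variable names.

```python
def pixel_loss(pred, noise, label1, label2):
    """Piecewise per-pixel loss sketched from the claims.

    pred:   pixel value at this position in the predicted image
    noise:  pixel value at this position in the noise image
    label1: pixel value in the first (noise-free, "hard") label image
    label2: pixel value in the second ("softened") label image
    """
    lo3, hi3 = sorted((label1, label2))    # third pixel value interval
    if lo3 <= pred <= hi3:
        return 0.0                         # inside the third interval: loss is 0
    lo1, hi1 = sorted((label2, noise))     # first pixel value interval
    if lo1 <= pred <= hi1:
        return abs(pred - label2)          # pull toward the second label
    lo4, hi4 = sorted((label1, noise))     # fourth pixel value interval
    if not (lo4 <= pred <= hi4):
        return abs(pred - label1)          # pull toward the first label
    return 0.0                             # fallback (unreachable when label2
                                           # lies between label1 and noise)
```

For example, with label1 = 0.2, label2 = 0.4, and noise = 1.0, a prediction of 0.7 falls in the first interval and incurs loss |0.7 - 0.4| = 0.3, while any prediction in [0.2, 0.4] incurs no loss.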
In a second aspect of the present application, there is provided an image denoising method, the method comprising:
Acquiring an image to be denoised;
Inputting the image to be denoised into a pre-trained image denoising model to obtain a denoised image; wherein the image denoising model is trained using the image denoising model training method of any of the above embodiments.
In a third aspect of the present application, there is provided an image denoising model training apparatus, the apparatus comprising:
The sample data acquisition module is used for acquiring sample data; wherein the sample data comprises: a plurality of noise images including noise, first tag images not including noise corresponding to the respective noise images, and second tag images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the first label image;
the prediction image acquisition module is used for inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a prediction image;
A first calculation module, configured to calculate, for each pixel position in the noise image, a loss value at the pixel position based on a pixel value of the pixel position in the prediction image and a second pixel value interval if the pixel value of the pixel position in the prediction image is located in the first pixel value interval;
Wherein, the end point of the first pixel value interval includes: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one end point of the second pixel value interval is a pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to a third pixel value interval; the end point of the third pixel value interval includes: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the corresponding second label image;
The second calculation module is used for calculating the loss value of the noise image based on the loss value of each pixel position;
and the model parameter adjustment module is used for adjusting the model parameters of the image denoising model to be trained based on the loss value of the noise image until convergence conditions are reached, so as to obtain the trained image denoising model.
Optionally, the first calculating module is specifically configured to calculate, if the pixel value of the pixel position in the predicted image is located in the first pixel value interval, a loss value at the pixel position based on a difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding second label image.
Optionally, the apparatus further includes:
A third calculation module, configured to calculate a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and a fifth pixel value interval if the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval; wherein, the end point of the fourth pixel value interval includes: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one end point of the fifth pixel value interval is a pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval belongs to the third pixel value interval.
Optionally, the third calculating module is specifically configured to calculate, if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval, a loss value at the pixel location based on a difference between the pixel value of the pixel location in the predicted image and the pixel value of the pixel location in the corresponding first label image.
Optionally, the apparatus further includes:
and a fourth calculation module, configured to determine a loss value at the pixel position as 0 if the pixel value of the pixel position in the predicted image is within the third pixel value interval.
Optionally, each image in the sample data is an image in a RAW format.
In a fourth aspect of the present application, there is also provided an image denoising apparatus, the apparatus comprising:
The image acquisition module to be denoised is used for acquiring the image to be denoised;
The denoising module is used for inputting the image to be denoised into a pre-trained image denoising model to obtain a denoised image; the image denoising model is obtained by training based on the image denoising model training method.
In yet another aspect of the present application, there is also provided an electronic device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory perform communication with each other through the communication bus;
a memory for storing a computer program;
And the processor is configured to implement the steps of any of the above image denoising model training methods or image denoising methods when executing the program stored in the memory.
In yet another aspect of the present application, there is also provided a computer-readable storage medium having a computer program stored therein, where the computer program, when executed by a processor, implements the steps of any of the above image denoising model training methods or image denoising methods.
The embodiment of the application also provides a computer program product containing instructions, which when run on a computer, cause the computer to execute the image denoising model training method or the image denoising method.
The image denoising model training method provided by the embodiment of the application acquires sample data, where the sample data comprises: a plurality of noise images containing noise, first label images not containing noise corresponding to the respective noise images, and second label images corresponding to the respective noise images; each second label image is a weighted sum of the corresponding noise image and the first label image. Each noise image is input into an image denoising model to be trained, and the image output by the model is obtained as a predicted image. For each pixel position in the noise image, if the pixel value of the pixel position in the predicted image lies in a first pixel value interval, a loss value at the pixel position is calculated based on the pixel value of the pixel position in the predicted image and a second pixel value interval. The endpoints of the first pixel value interval are the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one endpoint of the second pixel value interval is the pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to the third pixel value interval, whose endpoints are the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the corresponding second label image. A loss value of the noise image is calculated based on the loss values at the individual pixel positions, and the model parameters of the image denoising model to be trained are adjusted based on this loss value until a convergence condition is reached, yielding the trained image denoising model.
Based on the above processing, if the pixel value of the pixel position in the predicted image is within the first pixel value interval, that is, between the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image, the loss value at the pixel position may be calculated based on the pixel value of the pixel position in the predicted image and the second pixel value interval. Correspondingly, the model parameters of the image denoising model are adjusted based on the loss value, so that the pixel value of the pixel position output by the image denoising model tends to the second pixel value interval. Since the second pixel value interval belongs to the third pixel value interval taking the pixel value of the pixel position in the corresponding first label image (i.e. the image without noise) and the pixel value in the corresponding second label image as endpoints, the trained image denoising model can effectively denoise the image.
In addition, the related art calculates a loss value using a pixel value of the pixel position in the corresponding first label image to force fitting of the noise image to the image containing no noise. In the scheme of the application, the loss value calculated based on the second pixel value interval can promote the output pixel value to be positioned in the third pixel value interval, so that the forced fitting of a noise image to an image which does not contain noise can be avoided, the phenomenon that an image denoising model is over-fitted can be avoided to a certain extent, and the image is processed based on the obtained image denoising model, so that the image quality of the denoised image can be improved.
Of course, it is not necessary for any one product or method of practicing the application to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the embodiments of the application and the technical solutions in the prior art, the drawings used in the description of the embodiments and the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the application; those skilled in the art may obtain other drawings from them without creative effort.
FIG. 1 is a first flowchart of an image denoising model training method according to an embodiment of the present application;
FIG. 2A is a schematic diagram of a noise image according to an embodiment of the present application;
FIG. 2B is a schematic diagram of an image obtained by denoising the noisy image shown in FIG. 2A using an image denoising model obtained by a method in the related art;
FIG. 2C is a schematic diagram of an image obtained by denoising the noisy image shown in FIG. 2A using an image denoising model obtained by the method according to an embodiment of the present application;
FIG. 2D is a schematic diagram of a noiseless image corresponding to the noisy image shown in FIG. 2A according to an embodiment of the present application;
FIG. 3A is a schematic diagram of another noise image according to an embodiment of the present application;
FIG. 3B is a schematic diagram of an image obtained by denoising the noisy image shown in FIG. 3A using an image denoising model obtained by a method in the related art;
FIG. 3C is a schematic diagram of an image obtained by denoising the noisy image shown in FIG. 3A using an image denoising model obtained by the method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an image denoising model according to an embodiment of the present application;
FIG. 5A is a first schematic diagram of an interval in which pixel values in a predicted image according to an embodiment of the present application are located;
FIG. 5B is a second schematic diagram of an interval in which pixel values in a predicted image according to an embodiment of the present application are located;
FIG. 5C is a third schematic diagram of an interval in which pixel values in a predicted image according to an embodiment of the present application are located;
FIG. 6 is a second flowchart of an image denoising model training method according to an embodiment of the present application;
FIG. 7 is a third flowchart of an image denoising model training method according to an embodiment of the present application;
FIG. 8 is a flowchart of an image denoising method according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an image denoising model training apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an image denoising apparatus according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the scope of protection of the application.
With the rapid development of computer technology, image information is used ever more widely, and requirements on image quality have risen accordingly. Currently, the sensor in an image acquisition device forms an image from the photons of incident light. However, during acquisition the sensor is easily affected by factors such as ambient light intensity, so the resulting image can contain considerable noise. Images therefore generally need to be denoised to obtain high-quality results.
For example, the image acquisition device may be a device such as a mobile phone or a monitoring camera, and when the intensity of ambient light is weak (for example, shooting at night), the device may denoise the acquired image based on the image denoising model obtained by the method provided by the embodiment of the present application, so as to obtain a denoised image.
In addition, the image denoising model can be trained in the equipment in advance, or the training of the image denoising model can be performed in other equipment, and correspondingly, the equipment can acquire the trained image denoising model from the other equipment and denoise the image.
In the related art, image denoising is performed by an image denoising model. During training, a loss function value is calculated based on the difference between the denoised image output by the image denoising model and noise-free label data (also referred to as ground-truth data, i.e., a noise-free image), and the parameters of the image denoising model are updated based on the loss function value.
In this approach, the noisy image must be fitted to a completely noise-free image. This fitting is difficult and causes the image denoising model to over-fit; as a result, the denoised image is blurred, image details are lost, and the image quality of the denoised image is low.
In order to avoid the phenomenon of over fitting of an image denoising model to a certain extent, and further, the image is processed based on the obtained image denoising model, so that the image quality of the denoised image is improved, and the embodiment of the application provides an image denoising model training method. The method is applied to the electronic equipment, the electronic equipment can train the image denoising model, and in addition, the electronic equipment can denoise the image based on the trained image denoising model. For example, the electronic device may be a mobile phone or a monitoring camera.
Referring to fig. 1, fig. 1 is a first flowchart of an image denoising model training method according to an embodiment of the present application, where the method may include the following steps:
step S101: sample data is acquired.
Wherein the sample data comprises: a plurality of noise images containing noise, first label images not containing noise corresponding to the respective noise images, and second label images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the first label image.
Step S102: and inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a predicted image.
Step S103: for each pixel position in the noise image, if the pixel value of the pixel position in the predicted image is within the first pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and the second pixel value interval.
The end point of the first pixel value interval comprises: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one end point of the second pixel value interval is a pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to the third pixel value interval; the end points of the third pixel value interval include: the pixel value of the pixel location in the corresponding first label image and the pixel value of the pixel location in the corresponding second label image.
Step S104: the loss value of the noise image is calculated based on the loss value at each pixel position.
Step S105: based on the loss value of the noise image, the model parameters of the image denoising model to be trained are adjusted until convergence conditions are reached, and the trained image denoising model is obtained.
Based on the above processing, if the pixel value of the pixel position in the predicted image is within the first pixel value interval, that is, between the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image, the loss value at the pixel position may be calculated based on the pixel value of the pixel position in the predicted image and the second pixel value interval. Correspondingly, the model parameters of the image denoising model are adjusted based on the loss value, so that the pixel value of the pixel position output by the image denoising model tends to the second pixel value interval. Since the second pixel value interval belongs to the third pixel value interval taking the pixel value of the pixel position in the corresponding first label image (i.e. the image without noise) and the pixel value in the corresponding second label image as endpoints, the trained image denoising model can effectively denoise the image.
In addition, the related art calculates a loss value using a pixel value of the pixel position in the corresponding first label image to force fitting of the noise image to the image containing no noise. In the scheme of the application, the loss value calculated based on the second pixel value interval can promote the output pixel value to be positioned in the third pixel value interval, so that the forced fitting of a noise image to an image which does not contain noise can be avoided, the phenomenon that an image denoising model is over-fitted can be avoided to a certain extent, and the image is processed based on the obtained image denoising model, so that the image quality of the denoised image can be improved.
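The tendency described above, namely that minimizing the loss drives the output pixel value into the third pixel value interval rather than all the way to the noise-free label, can be checked with a toy single-pixel experiment. Everything here is illustrative and restated so the sketch is self-contained: the absolute-difference loss, the numeric gradient, and the step size are assumptions, not part of the patent.

```python
def pixel_loss(pred, noise, label1, label2):
    # Same piecewise rule as the claims: zero inside the third interval,
    # distance to the second label inside the first interval,
    # distance to the first label outside the fourth interval.
    lo3, hi3 = sorted((label1, label2))
    if lo3 <= pred <= hi3:
        return 0.0
    lo1, hi1 = sorted((label2, noise))
    if lo1 <= pred <= hi1:
        return abs(pred - label2)
    lo4, hi4 = sorted((label1, noise))
    if not (lo4 <= pred <= hi4):
        return abs(pred - label1)
    return 0.0

# Gradient descent on a single predicted pixel value, starting near the noise value.
p, noise, l1, l2 = 0.9, 1.0, 0.2, 0.4
lr, eps = 0.05, 1e-6
for _ in range(100):
    g = (pixel_loss(p + eps, noise, l1, l2)
         - pixel_loss(p - eps, noise, l1, l2)) / (2 * eps)
    p -= lr * g
# p has settled inside the third interval [0.2, 0.4], not at the hard label 0.2.
```

The prediction stops once it enters the zero-loss region between the two labels, which is exactly the behavior that relaxes the forced fit to the noise-free image.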
In an embodiment, as shown in fig. 2A, fig. 2A is a schematic diagram of a noise image provided in an embodiment of the present application, for example, the noise image may be a visual result of a RAW image acquired by an image acquisition device under a condition that ambient light is darker. In fig. 2A, the image contains more noise.
As shown in fig. 2B, fig. 2B is a schematic diagram of an image obtained by denoising the noise image shown in fig. 2A using an image denoising model obtained by a method in the related art. It can be seen that the image in fig. 2B contains less noise relative to the noisy image shown in fig. 2A, however, the image is overly smooth, losing more image detail.
As shown in fig. 2C, fig. 2C is a schematic diagram of an image obtained by denoising the noise image shown in fig. 2A by using the image denoising model obtained by the method according to the embodiment of the present application. It can be seen that the image in fig. 2C contains less noise than the noisy image shown in fig. 2A, and that the image in fig. 2C retains more image detail than in fig. 2B.
In addition, as shown in fig. 2D, fig. 2D is a schematic diagram of a noiseless image corresponding to the noise image shown in fig. 2A according to an embodiment of the present application, for example, the image may be a long exposure image acquired under the same scene by an image acquisition device.
In one embodiment, as shown in fig. 3A, fig. 3A is a schematic diagram of another noise image according to an embodiment of the present application. In fig. 3A, the image contains more noise.
As shown in fig. 3B, fig. 3B is a schematic diagram of an image obtained by denoising the noise image shown in fig. 3A using an image denoising model obtained by a method in the related art. It can be seen that the image in fig. 3B contains less noise relative to the noise image shown in fig. 3A; however, the image is overly smooth and loses more image detail.
As shown in fig. 3C, fig. 3C is a schematic diagram of an image obtained by denoising the noise image shown in fig. 3A using the image denoising model obtained by the method according to the embodiment of the present application. It can be seen that the image in fig. 3C contains less noise than the noise image shown in fig. 3A, that is, the image denoising model obtained by the method in the embodiment of the application has an obvious denoising effect on the noise image. And the image in fig. 3C retains image details such as texture edges, as compared with fig. 3B.
For step S101, sample data may be acquired before training the image denoising model.
For example, the noise image in the sample data may be an image acquired by the image acquisition device under the condition of darker ambient light, and correspondingly, the first label image corresponding to the noise image may be a long exposure image acquired by the image acquisition device under the same environment. Or the first label image in the sample data may be an image acquired by the image acquisition device under the condition that the ambient light is brighter, and correspondingly, noise may be added to the first label image to obtain a corresponding noise image.
Each second label image is a weighted sum of the corresponding noise image and the first label image, wherein the weight of the noise image and the weight of the first label image sum to 1. That is, for each pixel position in each second label image, the pixel value at that position in the second label image lies between the pixel value at that position in the corresponding noise image and the pixel value at that position in the corresponding first label image. A noise image containing noise, the corresponding first label image that does not contain noise, and the corresponding second label image may be referred to as a set of sample images; that is, the sample data may contain multiple sets of sample images. The first label image may also be referred to as "hard" label data, and the second label image as "softened" label data.
In one embodiment, the pixel value for each pixel location in the second label image may be calculated based on a preset formula. For example, the preset formula may be formula (1):
gt′ = input*alpha + gt*(1-alpha), alpha∈[0,1] (1)
Wherein input is the pixel value of each pixel position in the noise image, gt is the pixel value of the pixel position in the first label image corresponding to the noise image, gt′ is the pixel value of the pixel position in the second label image corresponding to the noise image, and alpha is a preset coefficient.
In formula (1), alpha controls the denoising strength: the closer alpha is to 0, the greater the denoising strength. For example, when alpha=0, gt′ is the same as gt, that is, the first label image and the second label image are the same; in this case, training the image denoising model based on the loss value obtained from the second label image is equivalent to training based on the loss value obtained from the first label image, i.e., training according to the method in the related art, and the denoising strength of the obtained image denoising model is at its maximum (strong denoising). When alpha=1, gt′ is the same as input, that is, the second label image and the noise image are the same; in this case, if the image denoising model is trained based on the loss value obtained from the second label image, the denoising strength of the obtained image denoising model is at its minimum (no denoising). The preset coefficient may be set empirically and is not particularly limited; for example, it may be 0.2 or 0.3. That is, in the embodiment of the present application, when the second label image is calculated, neither the weight of the first label image nor the weight of the noise image is 0.
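As an illustration, the label softening of formula (1) can be sketched in a few lines of NumPy (the array values and the name soften_label are hypothetical, not from the application):

```python
import numpy as np

def soften_label(noise_img, clean_img, alpha):
    """Formula (1): gt' = input*alpha + gt*(1-alpha), with alpha in [0, 1].

    noise_img plays the role of "input" (the noisy sample) and clean_img
    the role of "gt" (the "hard" first label image)."""
    assert 0.0 <= alpha <= 1.0
    return noise_img * alpha + clean_img * (1.0 - alpha)

noise = np.array([0.8, 0.2, 0.5])   # hypothetical noisy pixel values
clean = np.array([0.6, 0.3, 0.5])   # hypothetical clean ("hard") label values
soft = soften_label(noise, clean, alpha=0.2)   # softened second label gt'
```

Every softened pixel value lies between the noisy value and the clean value, which is exactly the property the second label image is required to have; alpha=0 reproduces the hard label and alpha=1 reproduces the noisy input.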
For step S102, when training the image denoising model, a noise image may be input to the image denoising model to be trained, to obtain a loss value of the noise image, and further, model parameters of the image denoising model may be adjusted based on the loss value of the noise image. After the adjustment is completed, the next noise image can be input into the image denoising model to be trained, and the model parameters of the image denoising model can be continuously adjusted.
Or the noise images can be grouped, then, a group of noise images can be input into an image denoising model to be trained to obtain loss values of the group of noise images, and further, model parameters of the image denoising model can be adjusted based on the loss values of the group of noise images. After the adjustment is completed, the next group of noise images can be input into the image denoising model to be trained, and the model parameters of the image denoising model are continuously adjusted.
For each noise image, after the noise image is input into an image denoising model to be trained, a prediction image corresponding to the noise image can be obtained.
The image denoising model to be trained may be DnCNN (Denoising Convolutional Neural Network), or may be UNet (a U-shaped network structure).
For example, the structure of the image denoising model to be trained may be as shown in fig. 4, which is a schematic structural diagram of the image denoising model according to an embodiment of the present application. The image denoising model in fig. 4 includes a plurality of conv1 (convolutional layer 1), conv2 (convolutional layer 2), Relu (Rectified Linear Unit, activation function layer), and deconv (transposed convolutional layer) blocks. The convolution kernel of conv1 is 3×3 with stride (convolution step size) 1; the convolution kernel of conv2 is 2×2 with stride 2; the convolution kernel of deconv is 2×2 with stride 2. The plus sign indicates the superposition process. The noise image is input into the image denoising model to obtain a predicted image.
For step S103, for each pixel position in each noise image, the pixel value of that position in the corresponding second label image (which may be denoted by gt′) lies between the pixel value of that position in the noise image (which may be denoted by input) and the pixel value of that position in the corresponding first label image (which may be denoted by gt). As shown in fig. 5A, fig. 5A is a first schematic diagram of the interval in which a pixel value in the predicted image lies: if the pixel value of the pixel position in the predicted image (which may be represented by pred) is between gt′ and input, the loss value at the pixel position may be calculated based on pred and the second pixel value interval. The second pixel value interval belongs to the third pixel value interval (i.e., the interval with gt and gt′ as endpoints), and gt′ is an endpoint of the second pixel value interval; that is, the length of the second pixel value interval is smaller than that of the third pixel value interval.
For example, a pixel value may be selected from the second interval of pixel values and a loss value at the pixel location may be calculated based on the selected pixel value and pred. Correspondingly, the model parameters of the image denoising model are adjusted based on the loss value, so that the pixel value of the pixel position output by the image denoising model tends to the second pixel value interval.
In one embodiment, the noise image may be a RAW image, where each pixel position in the noise image corresponds to the four channels of an RGGB pattern (Red, Green, Green, Blue), and accordingly, a predicted image corresponding to the four channels may also be obtained. Furthermore, for each channel, if the pixel value of the pixel position in the corresponding predicted image is located in the first pixel value interval, the loss value of the pixel position is calculated based on the pixel value of the pixel position in the corresponding predicted image and the second pixel value interval, so that the loss value of the channel at the pixel position can be obtained. Further, the loss value at the pixel position can be obtained by combining the loss values of the channels at the pixel position.
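A minimal sketch of the per-channel combination, assuming the RAW data is packed as an (H, W, 4) array of R, G, G, B planes and using a plain L1 term per channel (the interval-dependent case selection described above would be applied per channel in the same way; the names, shapes, and the choice of summation as the combining rule are illustrative assumptions):

```python
import numpy as np

def per_pixel_channel_loss(pred, soft_label):
    """For a packed RAW image of shape (H, W, 4) -- the R, G, G, B planes --
    compute an L1 loss per channel and combine the four channel losses at
    each pixel position by summation."""
    assert pred.shape == soft_label.shape and pred.shape[-1] == 4
    channel_loss = np.abs(pred - soft_label)   # (H, W, 4), one loss per channel
    return channel_loss.sum(axis=-1)           # (H, W), combined per pixel position

pred = np.zeros((2, 2, 4))                     # hypothetical model output
label = np.full((2, 2, 4), 0.25)               # hypothetical softened label
pixel_loss_map = per_pixel_channel_loss(pred, label)
```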
In one embodiment, if the pixel value of the pixel location in the predicted image is within the first pixel value interval, a loss value at the pixel location is calculated based on the pixel value of the pixel location in the predicted image and the second pixel value interval (S103), comprising:
Step S1031: if the pixel value of the pixel position in the predicted image is within the first pixel value interval, a loss value at the pixel position is calculated based on a difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding second label image.
For example, referring to fig. 5A, if pred is located between input and gt′, the loss value at that pixel position can be calculated directly based on the difference between pred and gt′. For example, the loss value at the pixel position may be calculated based on the L1 loss function, or the loss value at the pixel position may also be calculated based on the L2 loss function.
For example, the loss value at the pixel location may be calculated based on equation (2),
loss_b = |pred - gt′| (2)
Where loss_b is the loss value at the pixel position, pred is the pixel value of the pixel position in the predicted image, and gt′ is the pixel value of the pixel position in the corresponding second label image.
Based on the above processing, the loss value calculated based on gt′ promotes the output pixel value toward gt′, so that forcibly fitting the noise image to the image that does not contain noise can be avoided, the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, and when an image is processed based on the obtained image denoising model, more image details can be retained while noise is removed, improving the image quality of the denoised image.
If the pixel value of the pixel position in the predicted image is within the first pixel value section, the loss value may be calculated based on the difference between the pixel value of the pixel position in the predicted image and the pixel value in the corresponding first label image, and the model parameters of the image denoising model may be adjusted. However, this approach fits the pixel values of the pixel locations in the predicted image to the pixel values of the pixel locations in the corresponding first label image, and requires a large-scale adjustment of the model parameters of the image denoising model, which also increases the requirements for the structural complexity of the image denoising model.
Therefore, in the embodiment of the present application, if the pixel value of the pixel position in the predicted image is located in the first pixel value interval, that is, the pixel value of the pixel position in the predicted image is close to the pixel value in the corresponding second label image, the loss value at the pixel position is calculated based on the difference between the pixel value of the pixel position in the predicted image and the pixel value in the corresponding second label image, so that the above situation can be avoided, the adjustment of the model parameters of the image denoising model with a larger amplitude is avoided, and the requirement on the structural complexity of the image denoising model is also reduced.
For step S104, a loss value for the noise image may be calculated based on a weighted sum of the loss values at each pixel location. For example, the weights of the pixel positions may all be 1, and the loss value of the noise image is the sum of the loss values at the pixel positions. The weight of each pixel position may be empirically set, and is not particularly limited.
For step S105, based on the loss value of the noise image, the model parameters of the image denoising model to be trained are adjusted until convergence conditions are reached, that is, the pixel value of each pixel position in the predicted image output by the image denoising model is located in the third pixel value interval, so that the noise image can be prevented from being forcibly fitted to the image which does not contain noise, and further, the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, further, the image is processed based on the obtained image denoising model, more image details can be reserved while noise is removed, and the image quality of the denoised image can be improved.
For example, the model parameters of the image denoising model to be trained can be adjusted by using a gradient descent method.
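As a toy illustration of the gradient-descent update, the sketch below stands in for the denoising model with a single gain parameter w (pred = w * input) and fits it toward a hypothetical softened label with an L2 loss; a real DnCNN or UNet has many parameters, but each is updated by the same rule:

```python
import numpy as np

# Toy stand-in for the denoising model: pred = w * input. All values here
# (data distribution, label, learning rate) are illustrative assumptions.
rng = np.random.default_rng(0)
noise_img = rng.uniform(0.4, 0.6, size=100)   # hypothetical noisy pixels
soft_label = 0.5 * np.ones(100)               # hypothetical softened label gt'

w = 2.0                                        # deliberately bad initial parameter
lr = 0.1
for _ in range(200):
    pred = w * noise_img
    # Gradient of the L2 loss mean((pred - gt')^2) with respect to w
    grad = np.mean(2.0 * (pred - soft_label) * noise_img)
    w -= lr * grad                             # gradient-descent parameter update

final_loss = np.mean((w * noise_img - soft_label) ** 2)
```

After the loop, w has moved from its bad initial value to near the loss minimum, which is the convergence behavior the training procedure relies on.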
In one embodiment, referring to fig. 6, fig. 6 is a second flowchart of an image denoising model training method according to an embodiment of the present application, where the image denoising model training method further includes:
step S106: if the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and the fifth pixel value interval.
The end point of the fourth pixel value interval comprises: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one end point of the fifth pixel value interval is a pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval belongs to the third pixel value interval.
In the embodiment of the present application, as shown in fig. 5B, fig. 5B is a second schematic diagram of the interval in which a pixel value in the predicted image lies according to an embodiment of the present application. For each pixel position in each noise image, the pixel value of that position in the corresponding second label image (which may be represented by gt′) lies between the pixel value of that position in the noise image (which may be represented by input) and the pixel value of that position in the corresponding first label image (which may be represented by gt). If the pixel value of the pixel position in the predicted image (which may be represented by pred) is outside the interval with input and gt as endpoints, the loss value at the pixel position may be calculated based on pred and the fifth pixel value interval. The fifth pixel value interval belongs to the third pixel value interval, and gt is an endpoint of the fifth pixel value interval; that is, the length of the fifth pixel value interval is smaller than that of the third pixel value interval.
For example, a pixel value may be selected from the fifth pixel value interval and a loss value at that pixel location may be calculated based on the selected pixel value and pred. Accordingly, the model parameters of the image denoising model are adjusted based on the loss value, so that the pixel value of the pixel position output by the image denoising model tends to the fifth pixel value interval.
The loss value calculated based on the fifth pixel value interval can promote the output pixel value to be located in the third pixel value interval, so that the phenomenon that the noise image is forcedly fitted to the image which does not contain noise can be avoided, the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, the image is processed based on the obtained image denoising model, more image details can be reserved while noise is removed, and the image quality of the denoised image can be improved.
In one embodiment, if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval, a loss value at the pixel location is calculated based on the pixel value of the pixel location in the predicted image and the fifth pixel value interval (S106), comprising:
Step S1061: if the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on a difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding first label image.
In an embodiment of the present application, as shown in fig. 5B, if pred is located outside the interval with input and gt as the end points, the loss value at the pixel position may be calculated directly based on the difference between pred and gt. For example, the loss value at the pixel position may be calculated based on the L1 loss function, or the loss value at the pixel position may be calculated based on the L2 loss function.
For example, the loss value at the pixel location may be calculated based on equation (3),
loss_c=|pred-gt| (3)
Where loss_c is the loss value at the pixel position, pred is the pixel value of the pixel position in the predicted image, and gt is the pixel value of the pixel position in the corresponding first label image.
Based on the above processing, the loss value calculated based on gt promotes the output pixel value toward gt and thus toward the third pixel value interval, so that forcibly fitting the noise image to the image that does not contain noise can be avoided, the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, and when an image is processed based on the obtained image denoising model, more image details can be retained while noise is removed, improving the image quality of the denoised image.
If the pixel value of the pixel position in the predicted image is outside the fourth pixel value section, the loss value may be calculated based on the difference between the pixel value of the pixel position in the predicted image and the pixel value in the corresponding second label image, and the model parameters of the image denoising model may be adjusted. However, this approach fits the pixel values of the pixel locations in the predicted image to the pixel values of the pixel locations in the corresponding second label image, and requires a large-scale adjustment of the model parameters of the image denoising model, which also increases the requirements for the structural complexity of the image denoising model.
Therefore, in the embodiment of the present application, if the pixel value of the pixel position in the predicted image is located outside the fourth pixel value interval, that is, the pixel value of the pixel position in the predicted image is close to the pixel value in the corresponding first label image, the loss value at the pixel position is calculated based on the difference between the pixel value of the pixel position in the predicted image and the pixel value in the corresponding first label image, so that the above situation can be avoided, the adjustment of the model parameter of the image denoising model with a larger amplitude is avoided, and the requirement on the structural complexity of the image denoising model is also reduced.
In one embodiment, referring to fig. 7, fig. 7 is a third flowchart of the image denoising model training method provided by the embodiment of the present application, where the image denoising model training method further includes:
step S107: if the pixel value of the pixel position in the predicted image is within the third pixel value interval, the loss value at the pixel position is determined to be 0.
In the embodiment of the present application, as shown in fig. 5C, fig. 5C is a third schematic diagram of the interval in which a pixel value in the predicted image lies according to an embodiment of the present application. For each pixel position in each noise image, the pixel value of that position in the corresponding second label image (which may be represented by gt′) lies between the pixel value of that position in the noise image (which may be represented by input) and the pixel value of that position in the corresponding first label image (which may be represented by gt). If the pixel value of the pixel position in the predicted image (which may be represented by pred) lies between gt and gt′, the output for the pixel position in the noise image already meets the condition, so the loss value at the pixel position is determined to be 0; that is, for the currently input noise image, the model parameters of the image denoising model need not be adjusted based on the loss value at the pixel position.
For example, the loss value at the pixel location may be calculated based on equation (4),
loss_a=0 (4)
Where loss_a is the loss value at that pixel location.
Based on the processing, the loss value of the pixel position meeting the condition can be determined to be 0, the output pixel value is promoted to trend to the third pixel value interval, the phenomenon that the noise image is forcedly fitted to the image which does not contain noise can be avoided, the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, the image is processed based on the obtained image denoising model, more image details can be reserved while the noise is removed, and the image quality of the denoised image can be improved.
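Putting the three cases together, formulas (2), (3) and (4) amount to the piecewise per-pixel loss sketched below (NumPy, elementwise; alpha and the sample values are illustrative assumptions, and gt′ is recomputed from formula (1)):

```python
import numpy as np

def pixel_loss(pred, inp, gt, alpha=0.2):
    """Piecewise per-pixel loss: 0 inside [gt, gt'] (formula (4)),
    |pred - gt'| between gt' and inp (formula (2)), and |pred - gt|
    outside the interval with inp and gt as endpoints (formula (3))."""
    pred = np.asarray(pred, dtype=float)
    gt_soft = inp * alpha + gt * (1.0 - alpha)                   # formula (1)
    lo3, hi3 = np.minimum(gt, gt_soft), np.maximum(gt, gt_soft)  # third interval
    lo4, hi4 = np.minimum(gt, inp), np.maximum(gt, inp)          # fourth interval
    in_third = (pred >= lo3) & (pred <= hi3)                     # loss_a = 0
    outside = (pred < lo4) | (pred > hi4)                        # loss_c = |pred - gt|
    loss = np.where(outside, np.abs(pred - gt),
                    np.abs(pred - gt_soft))                      # loss_b = |pred - gt'|
    return np.where(in_third, 0.0, loss)

# Illustrative values: inp = 0.9 (noisy), gt = 0.5 (clean), so gt' = 0.58.
case_a = pixel_loss(0.55, 0.9, 0.5)   # inside [gt, gt']    -> 0
case_b = pixel_loss(0.70, 0.9, 0.5)   # between gt' and inp -> |0.70 - 0.58|
case_c = pixel_loss(0.95, 0.9, 0.5)   # outside [gt, inp]   -> |0.95 - 0.5|
```

The loss of the whole noise image (step S104) can then be obtained as a (weighted) sum of these per-pixel values, e.g. pixel_loss(pred_img, noise_img, gt_img).sum().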
In one embodiment, each image in the sample data is an image in RAW format.
In an embodiment of the present application, an ISP (Image Signal Processing) chip in the image acquisition device may perform a series of processes on the RAW image (i.e., an image in RAW format) input from the sensor, including AEC (Automatic Exposure Control), AGC (Automatic Gain Control), AWB (Automatic White Balance), color correction, and the like, to obtain an sRGB (Standard Red Green Blue, a general color standard) image. Since the sRGB image is obtained by performing a series of processes on the RAW image, denoising the sRGB image after it has been generated may fail to effectively remove noise contained in the original RAW image. Moreover, compared with the RAW image, the sRGB image has lost much image detail during processing, so even after denoising, the image quality of the resulting denoised sRGB image is not high. Therefore, in the embodiment of the application, the image denoising model can be trained based on RAW images, and RAW images can then be processed based on the trained image denoising model, so that more image details can be retained while noise is removed, the image quality of the denoised RAW image can be improved, the denoising effect of the image denoising model is improved, and further, the image quality of the generated sRGB image can be improved.
The embodiment of the application also provides an image denoising method, referring to fig. 8, fig. 8 is a flowchart of the image denoising method provided by the embodiment of the application, and the method may include:
step 801: and acquiring an image to be denoised.
Step 802: and inputting the image to be denoised into a pre-trained image denoising model to obtain a denoised image.
The image denoising model is obtained by training based on the image denoising model training method.
In the embodiment of the application, the image to be denoised is denoised based on the image denoising model obtained by training in any one of the image denoising model training method steps.
The image denoising model training method can promote the output pixel value to be positioned in the third pixel value interval, so that the phenomenon that the image denoising model is over-fitted can be avoided to a certain extent, correspondingly, the image is processed based on the obtained image denoising model, more image details can be reserved while the noise is removed, and the image quality of the denoised image can be improved.
The embodiment of the application also provides an image denoising model training device, referring to fig. 9, fig. 9 is a schematic structural diagram of the image denoising model training device provided by the embodiment of the application, and the device comprises:
the sample data obtaining module 901 is configured to obtain sample data.
Wherein the sample data comprises: a plurality of noise images including noise, first tag images not including noise corresponding to the respective noise images, and second tag images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the first label image.
The predicted image obtaining module 902 is configured to input each noise image to an image denoising model to be trained, and obtain an image output by the image denoising model as a predicted image.
The first calculating module 903 is configured to calculate, for each pixel position in the noise image, a loss value at the pixel position based on a pixel value of the pixel position in the predicted image and a second pixel value interval if the pixel value of the pixel position in the predicted image is within the first pixel value interval.
The end point of the first pixel value interval comprises: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one end point of the second pixel value interval is a pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to the third pixel value interval; the end points of the third pixel value interval include: the pixel value of the pixel location in the corresponding first label image and the pixel value of the pixel location in the corresponding second label image.
A second calculation module 904, configured to calculate a loss value of the noise image based on the loss value at each pixel position.
The model parameter adjustment module 905 is configured to adjust model parameters of the image denoising model to be trained based on the loss value of the noise image until convergence conditions are reached, thereby obtaining a trained image denoising model.
Based on the above processing, if the pixel value of the pixel position in the predicted image is within the first pixel value interval, that is, between the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image, the loss value at the pixel position may be calculated based on the pixel value of the pixel position in the predicted image and the second pixel value interval. Correspondingly, the model parameters of the image denoising model are adjusted based on the loss value, so that the pixel value of the pixel position output by the image denoising model tends to the second pixel value interval. Since the second pixel value interval belongs to the third pixel value interval taking the pixel value of the pixel position in the corresponding first label image (i.e. the image without noise) and the pixel value in the corresponding second label image as endpoints, the trained image denoising model can effectively denoise the image.
In addition, the related art calculates a loss value using a pixel value of the pixel position in the corresponding first label image to force fitting of the noise image to the image containing no noise. In the scheme of the application, the loss value calculated based on the second pixel value interval can promote the output pixel value to be positioned in the third pixel value interval, so that the forced fitting of a noise image to an image which does not contain noise can be avoided, the phenomenon that an image denoising model is over-fitted can be avoided to a certain extent, and the image is processed based on the obtained image denoising model, so that the image quality of the denoised image can be improved.
In one embodiment, the first calculating module 903 is specifically configured to calculate, if a pixel value of the pixel location in the predicted image is within the first pixel value interval, a loss value at the pixel location based on a difference between the pixel value of the pixel location in the predicted image and a pixel value of the pixel location in the corresponding second label image.
In one embodiment, the image denoising model training apparatus further comprises:
And the third calculation module is used for calculating the loss value at the pixel position based on the pixel value of the pixel position in the predicted image and the fifth pixel value interval if the pixel value of the pixel position in the predicted image is positioned outside the fourth pixel value interval.
The end point of the fourth pixel value interval comprises: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one end point of the fifth pixel value interval is a pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval belongs to the third pixel value interval.
In one embodiment, the third calculating module is specifically configured to calculate the loss value at the pixel location based on a difference between the pixel value of the pixel location in the predicted image and the pixel value of the pixel location in the corresponding first label image if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval.
In one embodiment, the image denoising model training apparatus further comprises:
And a fourth calculation module, configured to determine a loss value at the pixel position as 0 if the pixel value of the pixel position in the predicted image is within the third pixel value interval.
In one embodiment, each image in the sample data is an image in RAW format.
The embodiment of the application also provides an image denoising device, referring to fig. 10, fig. 10 is a schematic structural diagram of the image denoising device provided by the embodiment of the application, and the device comprises:
the image to be denoised acquisition module 1001 is configured to acquire an image to be denoised.
The denoising module 1002 is configured to input an image to be denoised to a pre-trained image denoising model, so as to obtain a denoised image.
The image denoising model is obtained by training based on the image denoising model training method.
The embodiment of the present application further provides an electronic device, as shown in fig. 11, including a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, where the processor 1101, the communication interface 1102 and the memory 1103 complete communication with each other through the communication bus 1104,
A memory 1103 for storing a computer program;
the processor 1101 is configured to execute a program stored in the memory 1103, and implement the following steps:
sample data is acquired. Wherein the sample data comprises: a plurality of noise images including noise, first label images not including noise corresponding to the respective noise images, and second label images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the first label image.
And inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a predicted image.
For each pixel position in the noise image, if the pixel value of the pixel position in the predicted image is within the first pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and the second pixel value interval. The end point of the first pixel value interval comprises: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one end point of the second pixel value interval is a pixel value of the pixel position in the corresponding second label image, and the second pixel value interval belongs to the third pixel value interval; the end points of the third pixel value interval include: the pixel value of the pixel location in the corresponding first label image and the pixel value of the pixel location in the corresponding second label image.
The loss value of the noise image is calculated based on the loss value at each pixel position.
Based on the loss value of the noise image, the model parameters of the image denoising model to be trained are adjusted until a convergence condition is met, yielding the trained image denoising model.
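The interval-based per-pixel loss described in the steps above can be sketched as follows. This is a minimal NumPy illustration, not the disclosed implementation: it assumes the specific case recited later (claims 2, 4 and 5), where the loss is the absolute difference from the relevant label value and is zero inside the third interval; the function and variable names are invented for the sketch:

```python
import numpy as np

def _within(x, a, b):
    """True where x lies in the closed interval spanned by a and b."""
    lo, hi = np.minimum(a, b), np.maximum(a, b)
    return (x >= lo) & (x <= hi)

def pixelwise_loss(pred, noise, first_label, second_label):
    """Hypothetical per-pixel loss following the interval rules above.

    first_label: noise-free label; second_label: weighted sum of the
    noise image and the first label; pred: model output.
    """
    loss = np.zeros_like(pred, dtype=float)
    # Third interval (first label .. second label): zero loss.
    in_third = _within(pred, first_label, second_label)
    # First interval (second label .. noise): pull the prediction
    # toward the second label.
    in_first = _within(pred, second_label, noise) & ~in_third
    loss = np.where(in_first, np.abs(pred - second_label), loss)
    # Outside the fourth interval (first label .. noise): pull the
    # prediction toward the first label.
    out_fourth = ~_within(pred, first_label, noise) & ~in_third
    loss = np.where(out_fourth, np.abs(pred - first_label), loss)
    return loss
```

The loss of a whole noise image would then be an aggregate (e.g. the mean) of these per-pixel values, and the model parameters would be updated against it as in any supervised training loop.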
The communication bus of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in the figure, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include Random Access Memory (RAM) or Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In yet another embodiment of the present application, a computer-readable storage medium is provided, in which a computer program is stored; the computer program, when executed by a processor, implements the steps of any of the image denoising model training methods or image denoising methods described above.
In yet another embodiment of the present application, a computer program product containing instructions is also provided; when run on a computer, the instructions cause the computer to perform any of the image denoising model training methods or image denoising methods of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another, for example by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), etc.
It should be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a correlated manner; for identical and similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiments are described relatively briefly, since they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (16)

1. An image denoising model training method, comprising:
Acquiring sample data; wherein the sample data comprises: a plurality of noise images containing noise, first label images not containing noise corresponding to the respective noise images, and second label images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the corresponding first label image;
Inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a predicted image;
For each pixel position in the noise image, if the pixel value of the pixel position in the predicted image is in a first pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and a second pixel value interval;
Wherein the endpoints of the first pixel value interval include: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one endpoint of the second pixel value interval is the pixel value of the pixel position in the corresponding second label image, and the second pixel value interval falls within a third pixel value interval; the endpoints of the third pixel value interval include: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the corresponding second label image;
calculating a loss value of the noise image based on the loss value at each pixel position;
Based on the loss value of the noise image, the model parameters of the image denoising model to be trained are adjusted until convergence conditions are reached, and the trained image denoising model is obtained.
2. The method of claim 1, wherein if the pixel value of the pixel location in the predicted image is within the first pixel value interval, calculating the loss value at the pixel location based on the pixel value of the pixel location in the predicted image and the second pixel value interval, comprises:
If the pixel value of the pixel position in the predicted image is within the first pixel value interval, calculating a loss value at the pixel position based on the difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding second label image.
3. The method according to claim 1, wherein the method further comprises:
If the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and a fifth pixel value interval; wherein the endpoints of the fourth pixel value interval include: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one endpoint of the fifth pixel value interval is the pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval falls within the third pixel value interval.
4. A method according to claim 3, wherein if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval, calculating the loss value at the pixel location based on the pixel value of the pixel location in the predicted image and the fifth pixel value interval comprises:
If the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval, calculating a loss value at the pixel position based on the difference between the pixel value of the pixel position in the predicted image and the pixel value of the pixel position in the corresponding first label image.
5. The method according to claim 1, wherein the method further comprises:
If the pixel value of the pixel position in the predicted image is within the third pixel value interval, determining the loss value at the pixel position to be 0.
6. The method of any one of claims 1-5, wherein each image in the sample data is an image in RAW format.
7. A method of denoising an image, the method comprising:
Acquiring an image to be denoised;
Inputting the image to be denoised into a pre-trained image denoising model to obtain a denoised image; wherein the image denoising model is trained based on the method steps of any one of claims 1-6.
8. An image denoising model training apparatus, comprising:
A sample data acquisition module, configured to acquire sample data; wherein the sample data comprises: a plurality of noise images containing noise, first label images not containing noise corresponding to the respective noise images, and second label images corresponding to the respective noise images; wherein each second label image is: a weighted sum of the corresponding noise image and the corresponding first label image;
the prediction image acquisition module is used for inputting each noise image into an image denoising model to be trained, and obtaining an image output by the image denoising model as a prediction image;
A first calculation module, configured to calculate, for each pixel position in the noise image, a loss value at the pixel position based on a pixel value of the pixel position in the prediction image and a second pixel value interval if the pixel value of the pixel position in the prediction image is located in the first pixel value interval;
Wherein the endpoints of the first pixel value interval include: the pixel value of the pixel position in the corresponding second label image and the pixel value of the pixel position in the noise image; one endpoint of the second pixel value interval is the pixel value of the pixel position in the corresponding second label image, and the second pixel value interval falls within a third pixel value interval; the endpoints of the third pixel value interval include: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the corresponding second label image;
The second calculation module is used for calculating the loss value of the noise image based on the loss value of each pixel position;
and the model parameter adjustment module is used for adjusting the model parameters of the image denoising model to be trained based on the loss value of the noise image until convergence conditions are reached, so as to obtain the trained image denoising model.
9. The apparatus according to claim 8, wherein the first calculating module is specifically configured to calculate the loss value at the pixel location based on a difference between the pixel value of the pixel location in the predicted image and the pixel value of the pixel location in the corresponding second label image if the pixel value of the pixel location in the predicted image is within a first pixel value interval.
10. The apparatus of claim 8, wherein the apparatus further comprises:
A third calculation module, configured to calculate a loss value at the pixel position based on the pixel value of the pixel position in the predicted image and a fifth pixel value interval if the pixel value of the pixel position in the predicted image is outside the fourth pixel value interval; wherein the endpoints of the fourth pixel value interval include: the pixel value of the pixel position in the corresponding first label image and the pixel value of the pixel position in the noise image; one endpoint of the fifth pixel value interval is the pixel value of the pixel position in the corresponding first label image, and the fifth pixel value interval falls within the third pixel value interval.
11. The apparatus according to claim 10, wherein the third calculating module is specifically configured to calculate the loss value at the pixel location based on a difference between the pixel value of the pixel location in the predicted image and the pixel value of the pixel location in the corresponding first label image if the pixel value of the pixel location in the predicted image is outside the fourth pixel value interval.
12. The apparatus of claim 8, wherein the apparatus further comprises:
and a fourth calculation module, configured to determine a loss value at the pixel position as 0 if the pixel value of the pixel position in the predicted image is within the third pixel value interval.
13. The apparatus according to any one of claims 8-12, wherein each image in the sample data is an image in RAW format.
14. An image denoising apparatus, comprising:
A to-be-denoised image acquisition module, configured to acquire an image to be denoised;
the denoising module is used for inputting the image to be denoised into a pre-trained image denoising model to obtain a denoised image; wherein the image denoising model is trained based on the method steps of any one of claims 1-6.
15. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor, configured to implement the method steps of any one of claims 1-6, or claim 7, when executing the program stored on the memory.
16. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-6, or 7.
CN202211584782.9A 2022-12-09 2022-12-09 Image denoising model training method, image denoising device and electronic equipment Pending CN118195930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211584782.9A CN118195930A (en) 2022-12-09 2022-12-09 Image denoising model training method, image denoising device and electronic equipment


Publications (1)

Publication Number Publication Date
CN118195930A true CN118195930A (en) 2024-06-14

Family

ID=91411208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211584782.9A Pending CN118195930A (en) 2022-12-09 2022-12-09 Image denoising model training method, image denoising device and electronic equipment

Country Status (1)

Country Link
CN (1) CN118195930A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754376A (en) * 2018-12-28 2019-05-14 深圳美图创新科技有限公司 Image de-noising method and device
WO2020248492A1 (en) * 2019-06-14 2020-12-17 平安科技(深圳)有限公司 Method and device for denoising oct image based on cyclic generative adversarial network
CN113012068A (en) * 2021-03-16 2021-06-22 深圳壹账通智能科技有限公司 Image denoising method and device, electronic equipment and computer readable storage medium
CN113781356A (en) * 2021-09-18 2021-12-10 北京世纪好未来教育科技有限公司 Training method of image denoising model, image denoising method, device and equipment
WO2022178995A1 (en) * 2021-02-26 2022-09-01 平安科技(深圳)有限公司 Ct image denoising method and apparatus, computer device, and medium


Similar Documents

Publication Publication Date Title
CN111418201B (en) Shooting method and equipment
CN109712102B (en) Image fusion method and device and image acquisition equipment
CN111598799A (en) Image toning enhancement method and image toning enhancement neural network training method
CN113313661B (en) Image fusion method, device, electronic equipment and computer readable storage medium
US20200082508A1 (en) Information processing method, information processing apparatus, and recording medium
CN101739672B (en) A kind of histogram equalizing method based on sub-regional interpolation and device
CN103973990B (en) wide dynamic fusion method and device
JP6097588B2 (en) Image processing apparatus and image processing method
US12052511B2 (en) Method and apparatus for generating image data
WO2021139635A1 (en) Method and apparatus for generating super night scene image, and electronic device and storage medium
WO2023125750A1 (en) Image denoising method and apparatus, and storage medium
US20240323548A1 (en) Blind image denoising method, electronic device, and storage medium
CN111669560A (en) Real-time automatic white balance correction method and system based on FPGA and storage medium
US20090074318A1 (en) Noise-reduction method and apparatus
CN112150368A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110223244B (en) Image processing method and device, electronic equipment and storage medium
US20240311973A1 (en) Method and apparatus for denoising a low-light image
CN111696058A (en) Image processing method, device and storage medium
CN118247181B (en) Image restoration model training method, electronic device and image restoration method
WO2022151852A1 (en) Image processing method, apparatus, and system, electronic device, and storage medium
CN110717864A (en) Image enhancement method and device, terminal equipment and computer readable medium
CN117893455B (en) Image brightness and contrast adjusting method
CN115082350A (en) Stroboscopic image processing method and device, electronic device and readable storage medium
CN112561818B (en) Image enhancement method and device, electronic equipment and storage medium
CN115002297A (en) Image denoising method, training method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination