WO2020199478A1 - Method for training image generation model, image generation method, device and apparatus, and storage medium - Google Patents
- Publication number
- WO2020199478A1 (application PCT/CN2019/103142)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- comic
- network
- images
- captured
- Prior art date
Links
- 238000012549 training Methods 0.000 title claims abstract description 100
- 238000000034 method Methods 0.000 title claims abstract description 91
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 61
- 238000007781 pre-processing Methods 0.000 claims abstract description 15
- 238000012545 processing Methods 0.000 claims description 52
- 230000006870 function Effects 0.000 claims description 31
- 238000004590 computer program Methods 0.000 claims description 23
- 230000015572 biosynthetic process Effects 0.000 claims description 22
- 238000003786 synthesis reaction Methods 0.000 claims description 22
- 238000003709 image segmentation Methods 0.000 claims description 16
- 238000004364 calculation method Methods 0.000 claims description 6
- 230000011218 segmentation Effects 0.000 claims description 4
- 238000010586 diagram Methods 0.000 description 10
- 230000004913 activation Effects 0.000 description 9
- 238000010606 normalization Methods 0.000 description 5
- 238000013527 convolutional neural network Methods 0.000 description 4
- 239000003086 colorant Substances 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000001815 facial effect Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 230000001427 coherent effect Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000008921 facial expression Effects 0.000 description 1
- 238000009499 grossing Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- This application relates to the field of image processing technology, and in particular to an image generation model training method, image generation method, device, computer equipment and storage medium.
- Comics are an art form widely used in our daily lives, with a broad range of applications such as children's story education. Like other forms of artwork, many comic images are created from real-world scenes. However, converting real-world images into comic-style images is extremely challenging in both computer vision and computer graphics, because comic-style image features often differ greatly from the features of captured images, for example in a character's hairstyle, clothing, facial expression, and facial features. Precisely because this difference is so large, the data dimensions that must be processed to convert captured images into comic-style images are huge, and the required image generation model is very difficult and time-consuming to train.
- This application provides an image generation model training method, image generation method, device, computer equipment, and storage medium, in order to train a model that can convert captured images into comic-style images and, at the same time, to improve the efficiency of model training.
- this application provides an image generation model training method, which includes:
- the first image set includes a plurality of photographed images
- the second image set includes a plurality of cartoon images
- the generative adversarial network including a generative network and a discriminant network
- the trained generation network is saved as an image generation model, and the image generation model is used to generate an image with a comic style.
- this application also provides an image generation method, which includes:
- the target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
- this application also provides an image generation model training device, which includes:
- a data acquisition unit configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;
- a preprocessing unit configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image
- a network acquisition unit configured to acquire a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
- the model training unit is configured to use the target comic image as the input of the generation network, to use the image output by the generation network and the comic image as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network;
- the model saving unit is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value, and the image generation model is used to generate an image with a comic style.
- the present application also provides an image generation device, which includes:
- An image acquisition unit for acquiring an image to be processed, the image to be processed is a captured image
- a segmentation processing unit configured to perform image segmentation processing on the to-be-processed image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure
- An edge processing unit configured to process the to-be-processed image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;
- An image synthesis unit configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image
- the image generation unit is configured to input the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
- the present application also provides a computer device, the computer device including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing the computer program, to implement the above-mentioned image generation model training method or image generation method.
- the present application also provides a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the above-mentioned image generation model training method or image generation method.
- the application discloses an image generation model training method, image generation method, device, computer equipment and storage medium.
- the training method first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discriminant network in the generative adversarial network, so that the generation network and the discriminant network are trained alternately and iteratively until the discriminant probability value output by the discriminant network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application
- FIG. 2 is a schematic flowchart of sub-steps of the image generation model training method provided in FIG. 1;
- FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of an application scenario of an image generation method provided by an embodiment of the present application.
- FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the application.
- FIG. 7 is a schematic block diagram of a preprocessing unit in an image generation model training device provided by an embodiment of the application.
- FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the application.
- FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
- the embodiments of the present application provide an image generation model training method, image generation method, device, computer equipment, and storage medium.
- the image generation model training method is used to quickly train an image generation model that can generate a comic style;
- the image generation method can be applied to a server or a terminal, and uses the image generation model to generate a comic-style image from the captured image, thereby improving the user's experience.
- the server can be an independent server or a server cluster.
- the terminal can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
- For example, the trained image generation model can be installed in a mobile phone, or compressed and then installed in the mobile phone.
- Using the image generation method, the user's mobile phone processes a captured image to obtain the corresponding comic-style image, thereby improving the user experience.
- the comic style images can be comics or classic cartoons, etc., such as One Piece, Crayon Shin-chan or Naruto, etc.
- In the following, the image generation model training method and the image generation method are described in detail using the Naruto style as the example comic style.
- FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application.
- the image generation model is obtained by model training based on a generative adversarial network; of course, it can also be obtained by training other similar networks.
- the image generation model training method includes: step S101 to step S105.
- the first image set and the second image set are acquired as sample data for model training; the first image set is a collection of captured images, and the second image set is a collection of comic images.
- the multiple captured images in the first image set are real-world pictures.
- a certain number of pictures can be downloaded from the Flickr website.
- Some of the images are used for training and the rest for testing; for example, of 6,000 images, 5,500 are used for model training and the remaining 500 for model testing.
- the multiple comic images in the second image set can be images from an anime such as Naruto.
- For example, by selecting the first 700 episodes of the anime Naruto and randomly selecting 10 images from each episode, a total of 7,000 Naruto images are collected as the second image set.
- S102 Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the preset comic generation algorithm uses image processing algorithms to preprocess the captured images in the first image set to extract image information in the captured images, such as hierarchical structure images, edge images, facial features or hairstyle features, etc.
- the target comic image corresponding to the captured image is constructed according to the image information.
- In one embodiment, in order to improve the training speed and accuracy of the model, a step of preprocessing the captured images in the first image set is provided; as shown in FIG. 2, step S102 includes sub-steps S102a to S102c.
- S102a Perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
- a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
- the Mean-shift algorithm is a hill-climbing method based on kernel density estimation; it requires no prior knowledge and relies entirely on computing the density function values of the sample points in feature space.
- the usual histogram method divides the image data into several equal intervals, and the ratio of the data in each interval to the total amount of data gives the probability value of that interval; the Mean-shift algorithm is similar in principle to the histogram method, but additionally applies a kernel function to smooth the data.
- when the image data are sufficient, this kernel density estimation method gradually converges to any density function, that is, it can estimate the density of data obeying any distribution.
- Such a method can be used in many fields such as clustering, image segmentation, tracking, etc., and has a good effect in removing detailed information such as image color and texture.
- the Mean-shift algorithm is mainly used for image segmentation to obtain hierarchical images.
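For illustration only (the patent does not give an implementation), a minimal sketch of this step using OpenCV's pyrMeanShiftFiltering function is shown below; the spatial and color radii are assumed values, not parameters taken from the application.

```python
# Minimal sketch (not from the patent): mean-shift based color flattening with OpenCV.
# pyrMeanShiftFiltering iteratively shifts each pixel toward the local color mode,
# unifying similar colors and yielding the flat-region "hierarchical" image described above.
import cv2

def hierarchical_image(path, spatial_radius=15, color_radius=40, max_level=2):
    img = cv2.imread(path)  # BGR captured image
    if img is None:
        raise FileNotFoundError(path)
    return cv2.pyrMeanShiftFiltering(img, spatial_radius, color_radius, maxLevel=max_level)
```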
- S102b Process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
- a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
- processing the captured image according to the flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines specifically includes: constructing a tangent flow for the captured image according to a tangent flow formula;
- and calculating the difference of Gaussians of the constructed tangent flow according to a binary-like image boundary calculation formula to obtain an edge image with edge contour lines.
- in the tangent flow formula used here:
- Ω(x) represents the neighborhood of x;
- k is the normalization factor;
- t(y) represents the current normalized tangent vector at point y;
- φ(x, y) is a sign function, φ(x, y) ∈ {1, -1};
- w_s(x, y) is the spatial weight function;
- w_m(x, y) is the magnitude weight function;
- w_d(x, y) is the direction weight function;
- t_0(x) is initialized as a vector orthogonal to the image gradient vector.
- in the formula for calculating the boundary of the binary-like image:
- D(x) represents the boundary of the binary-like image;
- H(x) is the filter response of the flow-based difference-of-Gaussians filter;
- τ is the coefficient factor;
- the value range of τ is (0, 1);
- in this embodiment, the value of τ is 0.5.
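The equations themselves do not survive in this text (they appear as images in the published application). For orientation only, the standard edge tangent flow and flow-based difference-of-Gaussians binarization formulas from the coherent-line-drawing literature, which are consistent with the symbol definitions listed above but are not necessarily the exact equations of this application, read:

$$t^{new}(x) = \frac{1}{k} \sum_{y \in \Omega(x)} \varphi(x,y)\, t^{cur}(y)\, w_s(x,y)\, w_m(x,y)\, w_d(x,y)$$

$$D(x) = \begin{cases} 0, & \text{if } H(x) < 0 \ \text{and}\ 1 + \tanh(H(x)) < \tau \\ 1, & \text{otherwise} \end{cases}$$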
- Using this binary-like image boundary calculation formula makes the edge image clear, smooth and coherent, thereby improving the accuracy of the image generation model.
- S102c Perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- image synthesis is performed on the hierarchical image and the edge image to obtain an image that has the hierarchical structure and edge features of the captured image, that is, the target comic image.
- Using the target comic image for image generation model training can reduce the data dimension that needs to be processed for model training, and at the same time improve the training speed of the model and the accuracy of the model.
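The application does not specify how the two images are combined; one minimal, assumed approach is to darken the color-flattened hierarchical image wherever the edge image marks a contour line, as sketched below.

```python
# Minimal sketch (assumption): overlay the FDoG edge lines on the mean-shift flattened image.
# `edges_gray` is a single-channel image where edge pixels are dark (0) and non-edges are bright (255).
import numpy as np

def synthesize_target_comic(hierarchical_bgr: np.ndarray, edges_gray: np.ndarray) -> np.ndarray:
    mask = (edges_gray.astype(np.float32) / 255.0)[..., None]  # 0 on edges, 1 elsewhere
    out = hierarchical_bgr.astype(np.float32) * mask           # darken edge pixels
    return out.clip(0, 255).astype(np.uint8)
```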
- the generative adversarial network includes a generation network and a discrimination network.
- the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
- the generative adversarial network may be any of various types of adversarial networks.
- it can be a Deep Convolutional Generative Adversarial Network (DCGAN).
- the generation network can be a convolutional neural network for image processing (for example, a convolutional neural network structure including convolutional layers, pooling layers, unpooling layers and deconvolutional layers, which perform down-sampling and then up-sampling in sequence);
- the discriminant network can be a convolutional neural network (for example, a convolutional neural network structure including a fully connected layer, where the fully connected layer implements the classification function).
- S104 Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
- performing alternating iterative training includes two training processes, namely: training a generation network and training a discriminant network.
- training the generation network includes: inputting a captured image into the generation network; applying one convolution followed by batch normalization (BN) and a ReLU activation; then applying a down-convolution block consisting of a convolution, batch normalization (BN) and a ReLU activation, this block being applied twice; then passing through 8 identical residual blocks; then applying two up-convolution blocks, each consisting of a convolution, a convolution, batch normalization (BN) and a ReLU activation; and finally applying one convolution to output an image with the same size as the input captured image.
- the activation function uses the ReLU function.
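The exact layer hyperparameters (channel counts, kernel sizes, strides) are not given here; the following is a hypothetical PyTorch sketch of the described layout — one convolution with BN and ReLU, two down-convolution blocks, 8 identical residual blocks, two up-convolution blocks, and a final convolution — with assumed values for everything unspecified.

```python
# Hypothetical sketch of the described generator layout; channel counts, kernel
# sizes and strides are assumptions, not values taken from the patent.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch=256):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # initial convolution + BN + ReLU
            nn.Conv2d(3, 64, 7, padding=3), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # two down-convolution blocks (conv + BN + ReLU)
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            # 8 identical residual blocks
            *[ResidualBlock(256) for _ in range(8)],
            # two up-convolution blocks; a transposed convolution is used for the
            # upsampling step (an interpretation of the described "conv + conv + BN + ReLU")
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.Conv2d(128, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # final convolution producing an image the same size as the input
            nn.Conv2d(64, 3, 7, padding=3),
        )
    def forward(self, x):
        return self.net(x)
```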
- training the discriminant network includes: inputting the images output by the generation network and the comic images into the discriminant network, applying multiple convolutions with batch normalization (BN) and LReLU activation, and then applying Sigmoid function processing;
- the resulting output is the probability that the input is a comic image (Naruto image) from the second image set; here the activation function is the LReLU function.
- the discrimination network is used to determine whether the input image (the output image of the generation network) is a Naruto image in the second image set.
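Likewise, a hypothetical sketch of the described discriminant network (several convolutions with BN and LReLU activations followed by a Sigmoid probability output), with assumed layer sizes:

```python
# Hypothetical sketch of the described discriminator; layer sizes are assumptions.
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 3, padding=1),            # per-patch real/fake score
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Sigmoid(),                               # probability that the input is a comic image
        )
    def forward(self, x):
        return self.net(x)
```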
- By alternately training the two networks, the discriminant network is optimized first; at this stage it can easily distinguish whether its input is a comic image from the second image set (a Naruto image), because the images generated by the network at the beginning deviate strongly from the Naruto images in the second image set. The generation network is then optimized so that its loss function gradually decreases, while the binary classification ability of the discriminant network also improves. The iterations continue until the discriminant network cannot determine whether its input is a Naruto image from the second image set or a Naruto-style image generated by the generation network; at this point the entire generation network has been trained, and the images it generates have the style of the anime Naruto.
- When the discriminant probability value output by the discriminant network is greater than the preset value, the binary classification ability of the discriminant network is considered established, which ensures that the images generated by the generation network are images with the anime Naruto style.
- the size of the preset value is not limited here, and can be set according to expert experience.
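The application does not spell out the loss functions or the optimization schedule. The sketch below shows one conventional alternating-training loop using standard binary cross-entropy GAN losses and Adam optimizers; these choices, and the Generator/Discriminator classes sketched above, are assumptions rather than details taken from the patent.

```python
# Hypothetical alternating training loop; losses, optimizers and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def train(generator, discriminator, target_comic_loader, comic_loader, epochs=100, device="cuda"):
    g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    generator.to(device)
    discriminator.to(device)
    for _ in range(epochs):
        for target_comic, comic in zip(target_comic_loader, comic_loader):
            target_comic, comic = target_comic.to(device), comic.to(device)
            # 1) train the discrimination network: real comic images -> 1, generated images -> 0
            fake = generator(target_comic).detach()
            real_pred, fake_pred = discriminator(comic), discriminator(fake)
            d_loss = (F.binary_cross_entropy(real_pred, torch.ones_like(real_pred)) +
                      F.binary_cross_entropy(fake_pred, torch.zeros_like(fake_pred)))
            d_opt.zero_grad(); d_loss.backward(); d_opt.step()
            # 2) train the generation network: try to make the discriminator output 1
            fake_pred = discriminator(generator(target_comic))
            g_loss = F.binary_cross_entropy(fake_pred, torch.ones_like(fake_pred))
            g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```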
- the training method provided by the foregoing embodiment first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discriminant network in the generative adversarial network, so as to perform alternating iterative training on the generation network and the discriminant network until the discriminant probability value output by the discriminant network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application.
- the image generation model is obtained by model training based on a generative adversarial network; of course, it can also be obtained by training other similar networks.
- the image generation model training method includes: step S201 to step S208.
- S202 Perform cutting processing on the photographed image and the cartoon image respectively to obtain the photographed image and the cartoon image after cutting.
- Specifically, the captured images and the comic images are each cropped so that the cropped captured images and comic images have the same image size; for example, both are cropped to 256×256 images, although other sizes can also be used.
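As a small illustration (not from the patent), a center crop to a common square size could be implemented as follows; the use of OpenCV is an assumption, and the 256×256 size matches the example above.

```python
# Minimal sketch (assumption): resize then center-crop an image to a square of the given size.
import cv2

def center_crop(img, size=256):
    h, w = img.shape[:2]
    scale = size / min(h, w)
    img = cv2.resize(img, (max(size, int(round(w * scale))), max(size, int(round(h * scale)))))
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]
```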
- S203 Construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
- the cut shot images are constructed into a first image set
- the cut cartoon images are constructed into a second image set, so that the sizes of the images in the first image set and the second image set are the same.
- the first image set includes multiple photographed images
- the second image set includes multiple cartoon images. It should be noted that the number of images in the first image set and the number of images in the second image set may be the same or different.
- S205 Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the generative adversarial network includes a generation network and a discrimination network.
- the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
- performing alternate iterative training includes two training processes, namely: training a generation network and training a discriminant network.
- the target comic image is used as the input of the generation network, the image output by the generation network and the comic image are used as the input of the discrimination network, and the generation network and the discrimination network are trained alternately and iteratively; the iterations continue until the discrimination network cannot determine whether its input is a Naruto image from the second image set or a Naruto-style image generated by the generation network.
- At this point, the entire generation network has been trained.
- When the discrimination probability value output by the discrimination network is greater than the preset value, the binary classification ability of the discrimination network is considered established, which ensures that the images generated by the generation network are images with the anime Naruto style.
- When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network can be used to generate comic-style images, so the generation network at this time is saved as the comic-style image generation model.
- the training method provided by the foregoing embodiment first constructs the first image set and the second image set, and then preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images;
- the target comic images are then used as the input of the generation network in the generative adversarial network, and the images output by the generation network together with the comic images in the second image set that relate to the captured images are used as the input of the discrimination network in the generative adversarial network;
- the generation network and the discrimination network are trained alternately and iteratively until the discrimination probability value output by the discrimination network is greater than the preset value, and the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application.
- the image generation method can be applied to a terminal or a server to generate a comic-style image based on the captured image using the above-trained image generation model.
- the application of the image generation method to a terminal is taken as an example for introduction, as shown in FIG. 5, which is a schematic diagram of an application scenario of the image generation method provided by this application.
- the server uses any of the image generation model training methods provided in the above embodiments to train the image generation model, and sends the image generation model to the terminal.
- the terminal receives and saves the image generation model sent by the server.
- the terminal can then run the image generation method, using the image generation model to generate a comic-style image from a captured image.
- the terminal is used to perform: acquiring an image to be processed, which is a captured image; and inputting the image to be processed into an image generation model to generate a corresponding comic image, wherein the image generation model is a model obtained by training using any of the image generation model training methods described above.
- In this way, the image to be processed selected by the user on the terminal (for example, an image just taken or an image stored on disk) is converted into a comic-style image, improving the user's experience.
- the image generation method includes: step S301 to step S305.
- Specifically, the image to be processed may be a picture the user has just taken, or a picture selected from the gallery, for example a picture taken with a mobile phone or chosen from previously taken pictures, which the user wants to convert into a comic-style image. The user can send the picture to the server that stores the comic-style image generation model; the server inputs the image to be processed into the comic-style image generation model to generate the corresponding comic image, and sends the generated comic image to the user.
- another image generation method is also provided.
- the image generation method may also use the acquired image to be processed as the target image, and execute step S305.
- a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
- S303 Process the image to be processed according to the Gaussian difference filter algorithm based on the stream to generate an edge image with edge contour lines.
- a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
- image synthesis is performed on the hierarchical image and the edge image to obtain an image that has the hierarchical structure and edge features of the captured image, that is, the target image.
- Inputting the target image into the image generation model to generate a comic-style image can increase the speed of image generation.
- the image generation model is a model obtained by training using any of the image generation model training methods provided in the foregoing embodiments.
- the target image is input to the image generation model to generate the corresponding comic image.
- Specifically, the target image synthesized from the hierarchical image and the edge image is input into the image generation model, and the image generation model is used to generate an image with a comic style, such as the image displayed by the terminal in FIG. 5, thereby improving the user experience.
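Putting steps S302 to S305 together, a hypothetical end-to-end inference call might look like the sketch below. It reuses the hierarchical_image and synthesize_target_comic helpers and the Generator class sketched earlier, and assumes an edge_fn helper implementing the flow-based difference-of-Gaussians filter; none of these names, nor the normalization conventions, come from the patent.

```python
# Hypothetical end-to-end inference pipeline composed from the sketches above.
import cv2
import numpy as np
import torch

def generate_comic(generator, image_path, edge_fn):
    hier = hierarchical_image(image_path)                  # S302: mean-shift hierarchical image
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    edges = edge_fn(gray)                                  # S303: FDoG edge image (assumed helper, 0 on edges)
    target = synthesize_target_comic(hier, edges)          # S304: image synthesis
    x = torch.from_numpy(target[..., ::-1].copy()).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():                                  # S305: run the image generation model
        y = generator(x).squeeze(0).clamp(0, 1)
    return (y.permute(1, 2, 0).numpy()[..., ::-1] * 255).astype(np.uint8)  # back to BGR uint8
```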
- FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application.
- the image generation model training device may be configured in a server to execute the aforementioned image generation model training method.
- the image generation model training device 400 includes: a photographing acquisition unit 401, a cutting processing unit 402, an atlas construction unit 403, a data acquisition unit 404, a preprocessing unit 405, a network acquisition unit 406, and model training Unit 407 and model saving unit 408.
- the photographing and acquiring unit 401 is configured to acquire multiple photographed images and multiple cartoon images.
- the cropping processing unit 402 is configured to perform cropping processing on the captured image and the cartoon image to obtain a cropped captured image and a cartoon image, wherein the cropped captured image and the cartoon image have the same image size.
- the atlas construction unit 403 is configured to construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
- the data acquisition unit 404 is configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images.
- the preprocessing unit 405 is configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the preprocessing unit 405 includes: a hierarchical processing subunit 4051, an edge processing subunit 4052 and an image synthesis subunit 4053.
- the hierarchical processing subunit 4051 is configured to perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
- the edge processing subunit 4052 is configured to process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;
- an image synthesis subunit 4053 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- the network obtaining unit 406 is configured to obtain a preset generative adversarial network, the generative adversarial network including a generative network and a discriminant network.
- the model training unit 407 is configured to use the target comic image as the input of the generation network, to use the image output by the generation network and the comic image as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network.
- the model saving unit 408 is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value.
- the image generation model is used to generate an image with a comic style.
- FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application, and the image generation device is used to execute the aforementioned image generation method.
- the image generating device can be configured in a server or a terminal.
- the image generation device 500 includes: an image acquisition unit 501, a segmentation processing unit 502, an edge processing unit 503, an image synthesis unit 504, and an image generation unit 505.
- the image acquisition unit 501 is configured to acquire an image to be processed, and the image to be processed is a captured image.
- the acquired image to be processed may also be used as the target image, and the image generating unit 505 may be called.
- the segmentation processing unit 502 is configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
- the edge processing unit 503 is configured to process the to-be-processed image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
- the image synthesis unit 504 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image.
- the image generation unit 505 is configured to input the target image into the image generation model to generate a corresponding comic image.
- the image generation model is a model obtained by training using the above-mentioned image generation model training method.
- the above-mentioned apparatus can be implemented in the form of a computer program, and the computer program can be run on the computer device as shown in FIG. 9.
- FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
- the computer equipment can be a server or a terminal.
- the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
- the non-volatile storage medium can store an operating system and a computer program.
- the computer program includes program instructions.
- the processor can execute any image generation model training method or image generation method.
- the processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
- the internal memory provides an environment for the operation of the computer program in the non-volatile storage medium.
- the processor can execute any image generation model training method or image generation method.
- the network interface is used for network communication, such as sending assigned tasks.
- FIG. 9 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
- the specific computer device may include more or fewer components than shown in the figure, or combine some components, or have a different arrangement of components.
- the processor may be a central processing unit (Central Processing Unit, CPU), and the processor may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
- the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the processor executes the program instructions to implement the present application Any of the image generation model training methods or image generation methods provided in the embodiments.
- the computer-readable storage medium may be the internal storage unit of the computer device described in the foregoing embodiment, such as the hard disk or memory of the computer device.
- the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart memory card (SMC), a Secure Digital (SD) card, or a flash card equipped on the computer device.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
A method for training an image generation model, an image generation method, device and apparatus, and a storage medium. The method for training an image generation model comprises: acquiring a first image set and a second image set, the first image set comprising multiple captured images, and the second image set comprising multiple cartoon images; performing pre-processing of the captured images according to a preset cartoon generation algorithm to obtain corresponding target cartoon images; and iteratively training a generative network and a discriminative network in an alternating manner to obtain an image generation model.
Description
This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on April 3, 2019, with application number 201910267519.9 and invention title "Image generation model training method, image generation method, device, equipment and storage medium", the entire content of which is incorporated into this application by reference.
This application relates to the field of image processing technology, and in particular to an image generation model training method, image generation method, device, computer equipment and storage medium.
Comics are an art form widely used in our daily lives, with a broad range of applications such as children's story education. Like other forms of artwork, many comic images are created from real-world scenes. However, converting real-world images into comic-style images is extremely challenging in both computer vision and computer graphics, because comic-style image features often differ greatly from the features of captured images, for example in a character's hairstyle, clothing, facial expression, and facial features. Precisely because this difference is so large, the data dimensions that must be processed to convert captured images into comic-style images are huge, and the required image generation model is very difficult and time-consuming to train.
Summary of the Invention
This application provides an image generation model training method, image generation method, device, computer equipment, and storage medium, in order to train a model that can convert captured images into comic-style images and, at the same time, to improve the efficiency of model training.
In the first aspect, this application provides an image generation model training method, which includes:
acquiring a first image set and a second image set, the first image set including a plurality of photographed images, and the second image set including a plurality of cartoon images;
preprocessing the captured images according to a preset comic generation algorithm to obtain target comic images corresponding to the captured images;
acquiring a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
using the target comic images as the input of the generation network and the images output by the generation network and the comic images as the input of the discrimination network, and performing alternating iterative training on the generation network and the discrimination network;
when the discrimination probability value output by the discrimination network is greater than a preset value, saving the trained generation network as an image generation model, the image generation model being used to generate images with a comic style.
In the second aspect, this application also provides an image generation method, which includes:
acquiring an image to be processed, the image to be processed being a captured image;
performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
processing the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines;
performing image synthesis on the hierarchical image and the edge image to obtain a target image;
inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
In the third aspect, this application also provides an image generation model training device, which includes:
a data acquisition unit, configured to acquire a first image set and a second image set, the first image set including a plurality of photographed images, and the second image set including a plurality of cartoon images;
a preprocessing unit, configured to preprocess the captured images according to a preset comic generation algorithm to obtain target comic images corresponding to the captured images;
a network acquisition unit, configured to acquire a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
a model training unit, configured to use the target comic images as the input of the generation network and the images output by the generation network and the comic images as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network;
a model saving unit, configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value, the image generation model being used to generate images with a comic style.
In the fourth aspect, this application also provides an image generation device, which includes:
an image acquisition unit, configured to acquire an image to be processed, the image to be processed being a captured image;
a segmentation processing unit, configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
an edge processing unit, configured to process the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines;
an image synthesis unit, configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image;
an image generation unit, configured to input the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
In the fifth aspect, this application also provides a computer device, the computer device including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing the computer program, to implement the above-mentioned image generation model training method or image generation method.
In the sixth aspect, this application also provides a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the above-mentioned image generation model training method or image generation method.
This application discloses an image generation model training method, image generation method, device, computer equipment and storage medium. The training method first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discrimination network in the generative adversarial network, so that the generation network and the discrimination network are trained alternately and iteratively until the discrimination probability value output by the discrimination network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
In order to explain the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the drawings needed in the description of the embodiments. Obviously, the drawings in the following description show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.
FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of sub-steps of the image generation model training method provided in FIG. 1;
FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an application scenario of an image generation method provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of a preprocessing unit in an image generation model training device provided by an embodiment of the present application;
FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application;
FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
The technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, rather than all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
The flowcharts shown in the drawings are merely illustrations and do not necessarily include all contents and operations/steps, nor do they have to be executed in the described order. For example, some operations/steps can be decomposed, combined or partially merged, so the actual execution order may change according to actual conditions.
The embodiments of the present application provide an image generation model training method, image generation method, device, computer equipment, and storage medium. The image generation model training method is used to quickly train an image generation model that can generate comic-style images; the image generation method can be applied to a server or a terminal and uses the image generation model to generate a comic-style image from a captured image, thereby improving the user's experience.
The server may be an independent server or a server cluster. The terminal may be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device.
For example, the trained image generation model can be installed in a mobile phone, or compressed and then installed in the mobile phone. Using the image generation method, the user's mobile phone processes a captured image to obtain the corresponding comic-style image, thereby improving the user experience.
It should be noted that the comic style may be that of a comic or a classic cartoon, such as One Piece, Crayon Shin-chan or Naruto. In the following, the image generation model training method and the image generation method are described in detail using the Naruto style as the example comic style.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. In the case of no conflict, the following embodiments and the features in the embodiments can be combined with each other.
请参阅图1,图1是本申请的实施例提供的一种图像生成模型训练方法的示意流程图。该图像生成模型基于生成式对抗网络进行模型训练得到的,当然也可以采用其他类似网络进行训练得到。Please refer to FIG. 1. FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application. The image generation model is obtained by model training based on a generative confrontation network. Of course, it can also be obtained by training with other similar networks.
如图1所示,该图像生成模型训练方法,包括:步骤S101至步骤S105。As shown in Fig. 1, the image generation model training method includes: step S101 to step S105.
S101、获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。S101. Acquire a first image set and a second image set, where the first image set includes multiple photographed images, and the second image set includes multiple cartoon images.
其中，获取第一图像集和第二图像集作为模型训练用的样本数据，第一图像集为拍摄图像的集合，第二图像集为漫画图像的集合。Specifically, the first image set and the second image set are acquired as sample data for model training, where the first image set is a collection of captured images and the second image set is a collection of comic images.
具体地，第一图像集中的多张拍摄图像为真实世界图片，可以从Flickr网站上下载一定数量的图片，部分图像用于训练，另一部分图像用于测试，比如6000张图像，其中5500张图像用于模型训练，另外500张图像用于模型测试。Specifically, the multiple captured images in the first image set are real-world pictures. A certain number of pictures can be downloaded from the Flickr website, with some of the images used for training and the rest for testing; for example, out of 6000 images, 5500 are used for model training and the other 500 for model testing.
具体地，第二图像集中的多张漫画图像可以为动漫中的图像，比如火影忍者，通过选定动漫火影忍者前700集动漫，并在每集动漫中随机选取10张图像，总共7000张火影忍者的图像作为第二图像集。Specifically, the multiple comic images in the second image set may be images from an anime, such as Naruto. For example, the first 700 episodes of the anime Naruto are selected and 10 images are randomly taken from each episode, giving a total of 7000 Naruto images as the second image set.
S102、根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。S102: Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
具体地，预设漫画生成算法采用图像处理算法对第一图像集中的拍摄图像进行预处理，以提取拍摄图像中的图像信息，比如层级结构图像、边缘图像、人脸特征或者发型特征等等，并根据这些图像信息构成所述拍摄图像对应的目标漫画图像。由此可以消除拍摄图像和动漫图像(火影忍者图像)之间的差异性过高的问题，降低图像生成模型训练需要处理的数据维度，便于模型的训练，同时又提高了模型的准确度。Specifically, the preset comic generation algorithm uses image processing algorithms to preprocess the captured images in the first image set so as to extract image information from the captured images, such as hierarchical structure images, edge images, facial features or hairstyle features, and the target comic image corresponding to the captured image is then constructed from this image information. This alleviates the problem of an excessive gap between the captured images and the animation images (Naruto images), reduces the data dimensions that the image generation model training needs to process, facilitates model training, and at the same time improves the accuracy of the model.
在一实施例中,为了提高模型的训练速度以及模型的准确度,提供了对所述第一图像集中的拍摄图像进行预处理的步骤,如图2所示,即步骤S102包括:子步骤S102a至S102c。In an embodiment, in order to improve the training speed of the model and the accuracy of the model, a step of preprocessing the captured images in the first image set is provided, as shown in FIG. 2, that is, step S102 includes: sub-step S102a To S102c.
S102a、根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像。S102a: Perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
具体地,使用均值漂移(Mean-shift)算法对拍摄图像进行图像分割以及对图像进行层级处理,通过不断迭代将图像中的相似颜色统一,以得到均有层级结构的层级图像。Specifically, a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
其中，均值漂移(Mean-shift)算法属于核密度估计的爬山算法，不需要任何先验知识而完全依靠特征空间中样本点计算其密度函数值。通常的直方图法是将图像分成若干个相等的区间，每个区间内的数据与总数据量的比值为这个区间的概率值；Mean-shift算法的原理类似于直方图法，多了一个用于平滑数据的核函数。采用核函数估计法，在图像数据充分的情况下，能够渐进地收敛于任意的密度函数，即可以对服从任何分布的数据进行密度估计。这样的方法可以用于聚类、图像分割、跟踪等很多领域，并且在去除图像颜色、纹理等细节信息方面有着很好的作用。在本实施例中，主要采用Mean-shift算法用于图像分割以得到层级图像。The Mean-shift algorithm is a hill-climbing algorithm based on kernel density estimation; it requires no prior knowledge and relies entirely on the sample points in the feature space to compute the density function value. The usual histogram method divides the image into several equal intervals, and the ratio of the data in each interval to the total amount of data is the probability value of that interval; the principle of the Mean-shift algorithm is similar to the histogram method, except that it adds a kernel function for smoothing the data. With the kernel function estimation method, when the image data is sufficient, the estimate converges asymptotically to an arbitrary density function, that is, density estimation can be performed on data obeying any distribution. Such a method can be used in many fields such as clustering, image segmentation, and tracking, and works well for removing detailed information such as image color and texture. In this embodiment, the Mean-shift algorithm is mainly used for image segmentation to obtain the hierarchical image.
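As a non-limiting illustration that is not part of the original disclosure, the mean-shift flattening step described above could be sketched as follows; the use of OpenCV's pyrMeanShiftFiltering and the parameter values are assumptions made only for this example.

```python
# Illustrative sketch only: approximates the mean-shift colour flattening
# step with OpenCV's pyrMeanShiftFiltering; the radii and pyramid level are
# assumed values, not taken from the original disclosure.
import cv2

def mean_shift_levels(bgr_image, spatial_radius=15, color_radius=30, max_level=2):
    """Return a colour-flattened version of the input image (the hierarchical image)."""
    # Iteratively merges similar colours, producing the flat colour regions
    # used here as the hierarchical image.
    return cv2.pyrMeanShiftFiltering(bgr_image, spatial_radius, color_radius,
                                     maxLevel=max_level)
```

A call such as mean_shift_levels(cv2.imread("photo.jpg")) would return the hierarchical image of step S102a; the file name is hypothetical.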
S102b、根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像。S102b. Process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
具体地,基于流的高斯差分滤波器(Flow-Based Difference of Gaussian、FDoG)算法对所述拍摄图像进行边缘提取,以提取出所述拍摄图像对应的边缘图像。Specifically, a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
其中,所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像,具体包括:根据切线流公式,在所述拍摄图像中构建切线流;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Wherein, the processing the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with an edge contour line specifically includes: constructing a tangent flow in the captured image according to a tangent flow formula; The binary image boundary calculation formula is to calculate the Gaussian difference of the constructed tangent stream to obtain an edge image with edge contours.
在一个实施例中，所述切线流公式为：In an embodiment, the tangent flow formula is:

$$t^{new}(\mathbf{x})=\frac{1}{k}\sum_{\mathbf{y}\in\Omega(\mathbf{x})}\phi(\mathbf{x},\mathbf{y})\,t(\mathbf{y})\,w_s(\mathbf{x},\mathbf{y})\,w_m(\mathbf{x},\mathbf{y})\,w_d(\mathbf{x},\mathbf{y})\qquad(1)$$

公式(1)中，Ω(x)表示X的邻域，X=(x,y)表示所述拍摄图像的像素点；k是归一化向量；t(y)表示y点处的当前归一化切线向量；φ(x,y)为符号函数，φ(x,y)∈{1,-1}；w_s(x,y)为空间权重向量；w_m(x,y)为量级权重函数；w_d(x,y)为方向权重函数；初始时，t_0(x)设为与图像梯度向量正交的向量。In formula (1), Ω(x) denotes the neighborhood of X, where X=(x,y) is a pixel of the captured image; k is the normalization vector; t(y) is the current normalized tangent vector at point y; φ(x,y) is a sign function with φ(x,y)∈{1,-1}; w_s(x,y) is the spatial weight vector; w_m(x,y) is the magnitude weight function; w_d(x,y) is the direction weight function; initially, t_0(x) is set to a vector orthogonal to the image gradient vector.
在一个实施例中,所述类二值图像边界计算公式为:In an embodiment, the formula for calculating the boundary of the class binary image is:
公式(2)中，D(x)表示二值图像边界，H(x)为所述基于流的高斯差分滤波器算法的滤波器函数；λ为系数因子，λ取值范围为(0,1)；τ取值为0.5。类二值图像边界计算公式，可以使得边缘图像变得清晰、光滑和连贯，进而提高图像生成模型的准确度。In formula (2), D(x) denotes the binary image boundary, and H(x) is the filter function of the flow-based difference-of-Gaussian filter algorithm; λ is a coefficient factor whose value range is (0,1); τ takes the value 0.5. The class-binary image boundary calculation formula makes the edge image clear, smooth and coherent, thereby improving the accuracy of the image generation model.
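For illustration only, and not as the patented implementation, the thresholding described above can be approximated with an isotropic difference of Gaussians; a full FDoG would instead filter along the edge tangent flow of formula (1). The sigma values and the placement of the coefficient λ inside the filter response are assumptions.

```python
# Simplified stand-in for the flow-based DoG step: an isotropic difference of
# Gaussians followed by the tanh thresholding with tau = 0.5 mentioned above.
# The full FDoG filters along the edge tangent flow; the sigma values and the
# use of lambda as the DoG coefficient are assumptions for this sketch.
import cv2
import numpy as np

def dog_edges(gray, sigma_c=1.0, sigma_s=1.6, lam=0.99, tau=0.5):
    gray = gray.astype(np.float32) / 255.0
    g_c = cv2.GaussianBlur(gray, (0, 0), sigma_c)
    g_s = cv2.GaussianBlur(gray, (0, 0), sigma_s)
    h = g_c - lam * g_s                                  # filter response H(x)
    edges = np.ones_like(h)                              # 1 = background
    edges[(h < 0) & (1.0 + np.tanh(h) < tau)] = 0.0      # 0 = edge line pixel
    return edges
```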
S102c、将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。S102c. Perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
具体地,将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的具体层级结构和边缘特征的图像,即目标漫画图像。将该目标漫画图像用于图像生成模型训练,可以降低模型训练需要处理的数据维度,同时又提高了模型的训练速度以及模型的准确度。Specifically, image synthesis is performed on the hierarchical image and the edge image to obtain an image of a specific hierarchical structure and edge feature corresponding to the captured image, that is, a target comic image. Using the target comic image for image generation model training can reduce the data dimension that needs to be processed for model training, and at the same time improve the training speed of the model and the accuracy of the model.
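As a sketch only, the synthesis of the hierarchical image and the edge image can be done by darkening the flattened image wherever an edge pixel was detected; the multiplicative blending is an assumption, since the text does not fix a particular compositing rule.

```python
# Minimal synthesis sketch: overlay the dark edge lines on the colour-
# flattened hierarchical image by multiplication. The blending rule is an
# assumption; the description does not prescribe a compositing formula.
import numpy as np

def compose_target(levels_bgr, edge_mask):
    """levels_bgr: HxWx3 uint8 flattened image; edge_mask: HxW array in {0.0, 1.0}."""
    out = levels_bgr.astype(np.float32) * edge_mask[..., None]
    return out.astype(np.uint8)
```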
S103、获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。S103. Obtain a preset generative confrontation network, where the generative confrontation network includes a generation network and a discrimination network.
具体地，获取预选设置的生成式对抗网络(Generative Adversarial Networks、GAN)，该生成式对抗网络包括生成网络和判别网络，生成网络用于利用拍摄图像生成漫画图像，判别网络用于判别生成网络输出的图像是否为漫画图像。Specifically, a preset Generative Adversarial Network (GAN) is obtained. The generative adversarial network includes a generation network and a discrimination network: the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
其中，该生成式对抗网络可以是各种类型的对抗网络。比如，可以是深度卷积生成对抗网络(Deep Convolutional Generative Adversarial Network、DCGAN)。再比如，生成网络可以是用于进行图像处理的卷积神经网络(例如，包含卷积层、池化层、反池化层、反卷积层的各种卷积神经网络结构，可以依次进行降采样和上采样)；判别网络可以是卷积神经网络(例如，包含全连接层的各种卷积神经网络结构，其中，全连接层可以实现分类功能)。The generative adversarial network can be of various types. For example, it can be a Deep Convolutional Generative Adversarial Network (DCGAN). As another example, the generation network can be a convolutional neural network for image processing (for example, various convolutional neural network structures containing convolutional layers, pooling layers, unpooling layers, and deconvolutional layers, which can perform down-sampling and then up-sampling in sequence); the discrimination network can be a convolutional neural network (for example, various convolutional neural network structures containing a fully connected layer, where the fully connected layer can implement the classification function).
S104、将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练。S104. Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
具体地,进行交替迭代训练包括两个训练过程,分别为:训练生成网络和训练判别网络。Specifically, performing alternating iterative training includes two training processes, namely: training a generation network and training a discriminant network.
其中，训练生成网络，包括：向生成网络输入拍摄图像，经过一次卷积、批归一化(BN)和激活函数(Relu)激活后，再进行了Down-convolution有卷积、批归一化(BN)和激活函数(Relu)激活操作，如此进行了两次训练，再通过8个一样的Residual block操作，再进行了两次Up-convolution有卷积、卷积、批归一化(BN)和激活函数(Relu)激活操作，最后再经过一次卷积操作，输出一张与输入的拍摄图像具有相同大小的图像。其中激活函数用的是ReLU函数。Training the generation network includes: inputting a captured image into the generation network, passing it through one convolution followed by batch normalization (BN) and a ReLU activation; then through a down-convolution block consisting of convolution, batch normalization (BN) and ReLU activation operations, which is performed twice; then through 8 identical residual block operations; then through two up-convolution blocks, each with convolution, convolution, batch normalization (BN) and ReLU activation operations; and finally through one more convolution operation, outputting an image with the same size as the input captured image. The activation function used here is the ReLU function.
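A minimal PyTorch sketch of a generator with the layer pattern described above (an initial convolution, two down-convolution blocks, eight identical residual blocks, two up-convolution blocks and a final convolution) is given below. The channel counts, kernel sizes and strides are assumptions, since the description does not specify them.

```python
# Illustrative generator sketch only; widths, kernels and strides are assumed.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, 1, 3), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # two down-convolution blocks
            nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, 2, 1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            # eight identical residual blocks
            *[ResidualBlock(256) for _ in range(8)],
            # two up-convolution blocks
            nn.ConvTranspose2d(256, 128, 3, 2, 1, output_padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, 2, 1, output_padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 7, 1, 3))   # output has the same spatial size as the input
    def forward(self, x):
        return self.net(x)
```

With the stride-2 down-convolutions mirrored by stride-2 transposed convolutions, the output keeps the spatial size of the input image, as required in the paragraph above.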
其中，训练判别网络，包括：向所述判别网络输入生成网络输出的图像和漫画图像，经过多次卷积、批归一化(BN)以及激活函数(LReLU)激活后，再经过Sigmoid函数处理后的输出是第二图像集中的漫画图像(火影忍者图像)的一个概率值，其中激活函数用的是LReLU函数。判别网络作为生成网络的补充，用于判断输入图像(生成网络的输出图像)是否是第二图像集中的火影忍者图像。Training the discrimination network includes: inputting the image output by the generation network and the comic images into the discrimination network; after multiple convolutions, batch normalization (BN) and LReLU activations, the result is processed by a Sigmoid function, and the output is a probability value that the input is a comic image (Naruto image) from the second image set, where the activation function used is the LReLU function. As a complement to the generation network, the discrimination network is used to judge whether the input image (the output image of the generation network) is a Naruto image from the second image set.
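A corresponding sketch of the discrimination network (stacked convolutions with batch normalization and LReLU activations, ending in a Sigmoid probability) follows; the layer widths and the global average before the final score are assumptions.

```python
# Illustrative discriminator sketch only; layer widths are assumed.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 3, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 3, 1, 1))
    def forward(self, x):
        score = self.features(x).mean(dim=[2, 3])   # one value per input image
        return torch.sigmoid(score)                  # probability of being a comic image
```

Averaging the final map before the Sigmoid yields one probability per input image, matching the single probability value described above.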
通过交替训练两个网络结构，先优化判别网络模型，一开始很容易区分输入的是否是第二图像集中的漫画图像(火影图像)，即生成网络一开始生成的图像与第二图像集中的火影图像具有很大的偏差。接着优化生成网络使得生成网络模型的损失函数慢慢减小，同时也提高判别网络模型的二分类的能力，最后的迭代直至判别网络模型无法判别输入的是第二图像集中的火影图像，还是生成网络模型生成的火影图像，这时整个生成网络模型已经训练好，此时通过生成网络模型生成的图像就是具有了动漫火影风格的图像。The two network structures are trained alternately. The discriminant network model is optimized first; at the beginning it can easily distinguish whether the input is a comic image (Naruto image) from the second image set, because the images initially produced by the generation network deviate greatly from the Naruto images in the second image set. The generation network is then optimized so that the loss function of the generation network model gradually decreases, while the binary classification ability of the discriminant network model is also improved. The iterations continue until the discriminant network model can no longer tell whether the input is a Naruto image from the second image set or a Naruto image produced by the generation network model. At this point the entire generation network model has been trained, and the images it generates have the anime Naruto style.
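The alternating optimisation can be sketched as below, assuming binary cross-entropy losses and externally supplied optimisers and data batches; none of these choices are fixed by the description.

```python
# Minimal sketch of the alternating training scheme: one update of the
# discrimination network on real comic images and generated images, then one
# update of the generation network. Optimisers and batches are assumed inputs.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(G, D, opt_g, opt_d, target_comic, comic):
    # 1) train the discrimination network: real comics -> 1, generated -> 0
    opt_d.zero_grad()
    d_real = D(comic)
    d_fake = D(G(target_comic).detach())
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    loss_d.backward()
    opt_d.step()

    # 2) train the generation network so that its output is judged as "comic"
    opt_g.zero_grad()
    d_gen = D(G(target_comic))
    loss_g = bce(d_gen, torch.ones_like(d_gen))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```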
S105、当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。S105. When the discriminant probability value output by the discriminant network is greater than the preset value, save the trained generation network as an image generation model, and the image generation model is used to generate an image with a comic style.
具体地，通过设置预设值的方式，比如通过判别网络模型输出的概率值大于该预设值时，来确定判别网络模型的二分类的能力，进而确保生成网络模型生成的图像，具有动漫火影风格的图像。其中，所述预设值的大小在此不做限定，可根据专家经验进行设定。当所述判别网络输出的判别概率值大于预设值时，则表明该生成网络模型可以用来生成具有漫画风格的图像，因此保存此时的生成网络作为漫画风格图像生成模型。Specifically, a preset value is set, and the binary classification ability of the discriminant network model is assessed, for example, by checking whether the probability value output by the discriminant network model is greater than this preset value, thereby ensuring that the images generated by the generation network model have the anime Naruto style. The size of the preset value is not limited here and can be set according to expert experience. When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network model can be used to generate comic-style images, so the generation network at this moment is saved as the comic-style image generation model.
上述实施例提供的训练方法先根据预设漫画生成算法对第一图像集中的拍摄图像进行预处理以得到拍摄图像对应的目标漫画图像；然后将目标漫画图像作为生成式对抗网络中生成网络的输入，以及将生成网络输出的图像和第二图像集中与拍摄图像相关的漫画图像作为生成式对抗网络中判别网络的输入，从而对生成网络和判别网络进行交替迭代训练，直至判别网络输出的判别概率值大于预设值，此时得到的训练后的生成网络将作为图像生成模型。该训练方法不但可以训练出将拍摄的图像转换成具有漫画风格的图像的模型，还可以提高训练模型的效率。The training method provided in the foregoing embodiment first preprocesses the captured images in the first image set according to the preset comic generation algorithm to obtain the target comic image corresponding to each captured image; the target comic image is then used as the input of the generation network in the generative adversarial network, and the image output by the generation network together with the comic images in the second image set related to the captured images is used as the input of the discrimination network, so that the generation network and the discrimination network are trained alternately and iteratively until the discriminant probability value output by the discrimination network is greater than the preset value, at which point the trained generation network is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improves the efficiency of training the model.
请参阅图3,图3是本申请的实施例提供的另一种图像生成模型训练方法的示意流程图。该图像生成模型基于生成式对抗网络进行模型训练得到的,当然也可以采用其他类似网络进行训练得到。Please refer to FIG. 3, which is a schematic flowchart of another image generation model training method provided by an embodiment of the present application. The image generation model is obtained by model training based on a generative confrontation network. Of course, it can also be obtained by training with other similar networks.
如图3所示,该图像生成模型训练方法,包括:步骤S201至步骤S208。As shown in FIG. 3, the image generation model training method includes: step S201 to step S208.
S201、获取多张拍摄图像和多张漫画图像。S201. Acquire multiple photographed images and multiple comic images.
S202、分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像。S202: Perform cutting processing on the photographed image and the cartoon image respectively to obtain the photographed image and the cartoon image after cutting.
其中，分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像，以确定剪切后的拍摄图像和漫画图像的图像大小均相同，比如均剪切为256×256尺寸的图像，当然也可以剪切为其他尺寸。The captured images and the comic images are each cut to obtain cut captured images and comic images, ensuring that the cut captured images and comic images all have the same image size, for example, all cut to 256×256 images; of course, other sizes are also possible.
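For illustration, a centre crop to 256×256 with Pillow is one way to perform this step; the choice of a centre crop and the use of Pillow are assumptions, not part of the original disclosure.

```python
# Illustrative cutting step: centre-crop the image to a square and resize it
# to 256x256 with Pillow. The centre-crop strategy is an assumption; the
# description only requires that all images end up the same size.
from PIL import Image

def cut_to_256(path):
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    return img.crop((left, top, left + side, top + side)).resize((256, 256))
```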
S203、根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。S203: Construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
具体地,将剪切后的拍摄图像构建第一图像集,以及将剪切后的漫画图像构建第二图像集,以使得第一图像集和第二图像集中的图像大小均相同。Specifically, the cut shot images are constructed into a first image set, and the cut cartoon images are constructed into a second image set, so that the sizes of the images in the first image set and the second image set are the same.
S204、获取第一图像集和第二图像集。S204. Acquire a first image set and a second image set.
其中,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。需要说明的是,所述第一图像集中的图像数量和第二图像集中的图像数量可以相同,也可以不相同。Wherein, the first image set includes multiple photographed images, and the second image set includes multiple cartoon images. It should be noted that the number of images in the first image set and the number of images in the second image set may be the same or different.
S205、根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。S205: Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
具体地,根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。Specifically, perform image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; perform processing on the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contours ; Image synthesis of the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
S206、获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。S206. Obtain a preset generative countermeasure network, where the generative countermeasure network includes a generative network and a discriminant network.
具体地，获取预选设置的生成式对抗网络(Generative Adversarial Networks、GAN)，该生成式对抗网络包括生成网络和判别网络，生成网络用于利用拍摄图像生成漫画图像，判别网络用于判别生成网络输出的图像是否为漫画图像。Specifically, a preset Generative Adversarial Network (GAN) is obtained. The generative adversarial network includes a generation network and a discrimination network: the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
S207、将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练。S207. Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
其中，进行交替迭代训练包括两个训练过程，分别为：训练生成网络和训练判别网络。具体地，将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入，对所述生成网络和判别网络进行交替迭代训练，最后的迭代直至判别网络模型无法判别输入的是第二图像集中的火影图像，还是生成网络模型生成的火影图像，这时整个生成网络模型已经训练好。The alternate iterative training includes two training processes, namely training the generation network and training the discrimination network. Specifically, the target comic image is used as the input of the generation network, and the image output by the generation network together with the comic image is used as the input of the discrimination network; the generation network and the discrimination network are trained alternately and iteratively until the discriminant network model can no longer tell whether the input is a Naruto image from the second image set or a Naruto image generated by the generation network model, at which point the entire generation network model has been trained.
S208、当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。S208: When the discriminant probability value output by the discriminant network is greater than the preset value, save the trained generation network as an image generation model, and the image generation model is used to generate an image with a comic style.
具体地，通过设置预设值的方式，比如通过判别网络模型输出的概率值大于该预设值时，来确定判别网络模型的二分类的能力，进而确保生成网络模型生成的图像，具有动漫火影风格的图像。Specifically, a preset value is set, and the binary classification ability of the discriminant network model is assessed, for example, by checking whether the probability value output by the discriminant network model is greater than this preset value, thereby ensuring that the images generated by the generation network model have the anime Naruto style.
当所述判别网络输出的判别概率值大于预设值时,则表明该生成网络模型可以用来生成具有漫画风格的图像,因此保存此时的生成网络作为漫画风格图像生成模型。When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network model can be used to generate comic-style images, so the generation network at this time is saved as a comic-style image generation model.
上述实施例提供的训练方法先构建第一图像集和第二图像集，再根据预设漫画生成算法对第一图像集中的拍摄图像进行预处理以得到拍摄图像对应的目标漫画图像；然后将目标漫画图像作为生成式对抗网络中生成网络的输入，以及将生成网络输出的图像和第二图像集中与拍摄图像相关的漫画图像作为生成式对抗网络中判别网络的输入，从而对生成网络和判别网络进行交替迭代训练，直至判别网络输出的判别概率值大于预设值，此时得到的训练后的生成网络将作为图像生成模型。该训练方法不但可以训练出将拍摄的图像转换成具有漫画风格的图像的模型，还可以提高训练模型的效率。The training method provided in the foregoing embodiment first constructs the first image set and the second image set, and then preprocesses the captured images in the first image set according to the preset comic generation algorithm to obtain the target comic image corresponding to each captured image; the target comic image is then used as the input of the generation network in the generative adversarial network, and the image output by the generation network together with the comic images in the second image set related to the captured images is used as the input of the discrimination network, so that the generation network and the discrimination network are trained alternately and iteratively until the discriminant probability value output by the discrimination network is greater than the preset value, at which point the trained generation network is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improves the efficiency of training the model.
请参阅图4,图4是本申请的实施例提供的一种图像生成方法的示意流程图。该图像生成方法可以应用终端或服务器中,根据拍摄图像利用上述训练的图像生成模型生成具有漫画风格的图像。Please refer to FIG. 4, which is a schematic flowchart of an image generation method provided by an embodiment of the present application. The image generation method can be applied to a terminal or a server to generate a comic-style image based on the captured image using the above-trained image generation model.
在本实施例中，以图像生成方法应用在终端(手机)为例进行介绍，具体如图5所示，图5为本申请提供的图像生成方法的应用场景示意图。服务器采用上述实施例提供的任一项图像生成模型训练方法训练出图像生成模型，并将图像生成模型发送至终端中，终端接收服务器发送的图像生成模型并保存，该终端可运行图像生成方法根据拍摄图像利用该图像生成模型生成具有漫画风格的图像。In this embodiment, the image generation method applied to a terminal (mobile phone) is taken as an example, as shown in FIG. 5, which is a schematic diagram of an application scenario of the image generation method provided by this application. The server trains an image generation model using any of the image generation model training methods provided in the foregoing embodiments and sends the image generation model to the terminal; the terminal receives and saves the image generation model sent by the server, and can then run the image generation method to generate a comic-style image from a captured image using the image generation model.
例如，在一个实施例中，终端用于执行：获取待处理图像，所述待处理图像为拍摄图像；将所述待处理图像输入至图像生成模型以生成对应的漫画图像，其中，所述图像生成模型为采用上述任一项所述的图像生成模型训练方法训练得到的模型。进而将用户在终端中选择的待处理图像(比如刚拍摄的图像或磁盘中存储的图像)转换成具有漫画风格图像，以提高用户的体验。For example, in one embodiment, the terminal is configured to: acquire an image to be processed, the image to be processed being a captured image; and input the image to be processed into an image generation model to generate a corresponding comic image, where the image generation model is a model trained by any of the image generation model training methods described above. The image to be processed selected by the user on the terminal (for example, an image just captured or an image stored on disk) is thereby converted into a comic-style image, improving the user experience.
以下将结合图4和图5,对本实施例提供的图像生成方法进行详细介绍,如图4所示,该图像生成方法,包括:步骤S301至步骤S305。The image generation method provided in this embodiment will be described in detail below in conjunction with FIG. 4 and FIG. 5. As shown in FIG. 4, the image generation method includes: step S301 to step S305.
S301、获取待处理图像,所述待处理图像为拍摄图像。S301. Acquire an image to be processed, where the image to be processed is a captured image.
具体地，该待处理图像可以为用户刚拍摄的图片，或者是用户在图库中选择的图片，比如用户用手机拍摄的图片或者从之前拍摄图片中选择一张图片，想将其转换为具有卡通风格的漫画图像，可以将该图片发送至保存有漫画风格图像生成模型的服务器，由服务器将该待处理图像输入至漫画风格图像生成模型以生成对应的漫画图像，并将生成的漫画图像发送给用户。Specifically, the image to be processed may be a picture just taken by the user, or a picture selected by the user from the gallery. For example, when the user takes a picture with a mobile phone or selects one of the previously taken pictures and wants to convert it into a cartoon-style comic image, the picture can be sent to a server that stores the comic-style image generation model; the server inputs the image to be processed into the comic-style image generation model to generate the corresponding comic image, and sends the generated comic image back to the user.
在一个实施例中,还提供另一种图像生成方法,该图像生成方法还可以将获取的待处理图像作为目标图像,并执行步骤S305。In an embodiment, another image generation method is also provided. The image generation method may also use the acquired image to be processed as the target image, and execute step S305.
S302、根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像。S302. Perform image segmentation processing on the image to be processed according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
具体地,使用均值漂移(Mean-shift)算法对拍摄图像进行图像分割以及对图像进行层级处理,通过不断迭代将图像中的相似颜色统一,以得到均有层级结构的层级图像。Specifically, a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
S303、根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像。S303: Process the image to be processed according to the Gaussian difference filter algorithm based on the stream to generate an edge image with edge contour lines.
具体地,基于流的高斯差分滤波器(Flow-Based Difference of Gaussian、FDoG)算法对所述拍摄图像进行边缘提取,以提取出所述拍摄图像对应的边缘图像。Specifically, a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
S304、将所述层级图像和所述边缘图像进行图像合成以得到目标图像。S304. Perform image synthesis on the hierarchical image and the edge image to obtain a target image.
具体地,将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的具体层级结构和边缘特征的图像,即目标图像。将该目标图像输入至图像生成模型以生成具有漫画风格的图像,可以提高生成图像的速度。Specifically, image synthesis is performed on the hierarchical image and the edge image to obtain an image of a specific hierarchical structure and edge feature corresponding to the captured image, that is, a target image. Inputting the target image to the image generation model to generate a comic-style image can increase the speed of image generation.
S305、将所述目标图像输入至图像生成模型以生成对应的漫画图像。S305. Input the target image to an image generation model to generate a corresponding comic image.
其中，所述图像生成模型为采用上述实施例提供的任一项所述的图像生成模型训练方法训练得到的模型。将所述目标图像输入至图像生成模型以生成对应的漫画图像，如图5所示，将根据层级图像和边缘图像合成的目标图像输入至模型，该模型为图像生成模型，利用该图像生成模型生成具有漫画风格的图像，如图5中的终端显示的图像，由此提高了用户的体验。The image generation model is a model trained by any of the image generation model training methods provided in the foregoing embodiments. The target image is input into the image generation model to generate the corresponding comic image; as shown in FIG. 5, the target image synthesized from the hierarchical image and the edge image is input into the model, which is the image generation model, and the image generation model is used to generate a comic-style image, such as the image displayed by the terminal in FIG. 5, thereby improving the user experience.
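Tying the earlier sketches together, an end-to-end inference helper could look like the following; the helper names refer to the illustrative functions sketched above and are therefore also hypothetical, and the trained generator is assumed to be supplied by the caller.

```python
# End-to-end sketch: build the target image from a photo and run it through a
# trained generator. mean_shift_levels, dog_edges and compose_target are the
# illustrative helpers defined earlier, not the original implementation.
import cv2
import torch

def photo_to_comic(path, generator):
    bgr = cv2.imread(path)
    levels = mean_shift_levels(bgr)                           # hierarchical image
    edges = dog_edges(cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY))  # edge image
    target = compose_target(levels, edges)                    # synthesised target image
    x = torch.from_numpy(target[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        comic = generator(x.unsqueeze(0))                     # comic-style output
    return comic.squeeze(0)
```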
请参阅图6,图6是本申请的实施例提供的一种图像生成模型训练装置的示意性框图,该图像生成模型训练装置可以配置于服务器中,用于执行前述的图像生成模型训练方法。Please refer to FIG. 6. FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application. The image generation model training device may be configured in a server to execute the aforementioned image generation model training method.
如图6所示,该图像生成模型训练装置400,包括:拍摄获取单元401、剪切处理单元402、图集构建单元403、数据获取单元404、预处理单元405、网络获取单元406、模型训练单元407和模型保存单元408。As shown in FIG. 6, the image generation model training device 400 includes: a photographing acquisition unit 401, a cutting processing unit 402, an atlas construction unit 403, a data acquisition unit 404, a preprocessing unit 405, a network acquisition unit 406, and model training Unit 407 and model saving unit 408.
拍摄获取单元401,用于获取多张拍摄图像和多张漫画图像。The photographing and acquiring unit 401 is configured to acquire multiple photographed images and multiple cartoon images.
剪切处理单元402,用于分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同。The cropping processing unit 402 is configured to perform cropping processing on the captured image and the cartoon image to obtain a cropped captured image and a cartoon image, wherein the cropped captured image and the cartoon image have the same image size.
图集构建单元403,用于根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The atlas construction unit 403 is configured to construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
数据获取单元404,用于获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。The data acquisition unit 404 is configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images.
预处理单元405,用于根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。The preprocessing unit 405 is configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
在一个实施例中,如图7所示,预处理单元405包括:层级处理子单元4051、边缘处理子单元4052和图像合成子单元4053。In an embodiment, as shown in FIG. 7, the preprocessing unit 405 includes: a hierarchical processing subunit 4051, an edge processing subunit 4052 and an image synthesis subunit 4053.
层级处理子单元4051,用于根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;边缘处理子单元4052,用于根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;图像合成子单元4053,用于将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。The hierarchical processing subunit 4051 is configured to perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure; The captured image is processed to generate an edge image with an edge contour line; an image synthesis subunit 4053 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
网络获取单元406,用于获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。The network obtaining unit 406 is configured to obtain a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network.
模型训练单元407，用于将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入，对所述生成网络和判别网络进行交替迭代训练。The model training unit 407 is configured to use the target comic image as the input of the generation network, use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
模型保存单元408,用于当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。The model saving unit 408 is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value. The image generation model is used to generate an image with a comic style.
请参阅图8,图8是本申请的实施例提供一种图像生成装置的示意性框图,该图像生成装置用于执行前述的图像生成方法。其中,该图像生成装置可以配置于服务器或终端中。Please refer to FIG. 8. FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application, and the image generation device is used to execute the aforementioned image generation method. Wherein, the image generating device can be configured in a server or a terminal.
如图8所示,该图像生成装置500,包括:图像获取单元501、分割处理单元502、边缘处理单元503、图像合成单元504和图像生成单元505。As shown in FIG. 8, the image generation device 500 includes: an image acquisition unit 501, a segmentation processing unit 502, an edge processing unit 503, an image synthesis unit 504, and an image generation unit 505.
图像获取单元501,用于获取待处理图像,所述待处理图像为拍摄图像。The image acquisition unit 501 is configured to acquire an image to be processed, and the image to be processed is a captured image.
在一个实施例中,还可以将获取的待处理图像作为目标图像,并调用图像生成单元505。In an embodiment, the acquired image to be processed may also be used as the target image, and the image generating unit 505 may be called.
分割处理单元502,用于根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像。The segmentation processing unit 502 is configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
边缘处理单元503,用于根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像。The edge processing unit 503 is configured to process the to-be-processed image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
图像合成单元504,用于将所述层级图像和所述边缘图像进行图像合成以得到目标图像。The image synthesis unit 504 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image.
图像生成单元505,用于将所述目标图像输入至图像生成模型以生成对应的漫画图像。其中,所述图像生成模型为采用上述的图像生成模型训练方法训练得到的模型。The image generation unit 505 is configured to input the target image into the image generation model to generate a corresponding comic image. Wherein, the image generation model is a model obtained by training using the above-mentioned image generation model training method.
需要说明的是，所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，上述描述的装置和各单元的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the device and each unit described above may refer to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
上述的装置可以实现为一种计算机程序的形式,该计算机程序可以在如图9所示的计算机设备上运行。The above-mentioned apparatus can be implemented in the form of a computer program, and the computer program can be run on the computer device as shown in FIG. 9.
请参阅图9,图9是本申请实施例提供的一种计算机设备的结构示意性框图。该计算机设备可以是服务器或终端。Please refer to FIG. 9, which is a schematic block diagram of the structure of a computer device according to an embodiment of the present application. The computer equipment can be a server or a terminal.
参阅图9,该计算机设备包括通过系统总线连接的处理器、存储器和网络接口,其中,存储器可以包括非易失性存储介质和内存储器。Referring to FIG. 9, the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
非易失性存储介质可存储操作系统和计算机程序。该计算机程序包括程序指令,该程序指令被执行时,可使得处理器执行任意一种图像生成模型训练方法或图像生成方法。The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions. When the program instructions are executed, the processor can execute any image generation model training method or image generation method.
处理器用于提供计算和控制能力,支撑整个计算机设备的运行。The processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
内存储器为非易失性存储介质中的计算机程序的运行提供环境,该计算机程序被处理器执行时,可使得处理器执行任意一种图像生成模型训练方法或图像生成方法。The internal memory provides an environment for the operation of the computer program in the non-volatile storage medium. When the computer program is executed by the processor, the processor can execute any image generation model training method or image generation method.
该网络接口用于进行网络通信，如发送分配的任务等。本领域技术人员可以理解，图9中示出的结构，仅仅是与本申请方案相关的部分结构的框图，并不构成对本申请方案所应用于其上的计算机设备的限定，具体的计算机设备可以包括比图中所示更多或更少的部件，或者组合某些部件，或者具有不同的部件布置。The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art can understand that the structure shown in FIG. 9 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
应当理解的是，处理器可以是中央处理单元(Central Processing Unit,CPU)，该处理器还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。其中，通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。It should be understood that the processor may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or it may be any conventional processor.
本申请的实施例中还提供一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序中包括程序指令，所述处理器执行所述程序指令，实现本申请实施例提供的任意一项图像生成模型训练方法或图像生成方法。The embodiments of the present application also provide a computer-readable storage medium that stores a computer program, the computer program including program instructions, and the processor executes the program instructions to implement any of the image generation model training methods or image generation methods provided by the embodiments of the present application.
其中，所述计算机可读存储介质可以是前述实施例所述的计算机设备的内部存储单元，例如所述计算机设备的硬盘或内存。所述计算机可读存储介质也可以是所述计算机设备的外部存储设备，例如所述计算机设备上配备的插接式硬盘、智能存储卡(Smart Media Card,SMC)、安全数字(Secure Digital,SD)卡、闪存卡(Flash Card)等。The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到各种等效的修改或替换，这些修改或替换都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以权利要求的保护范围为准。The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (20)
- 一种图像生成模型训练方法,包括:An image generation model training method, including:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- 根据权利要求1所述的图像生成模型训练方法,其中,所述获取第一图像集和第二图像集之前,还包括:The image generation model training method according to claim 1, wherein before said acquiring the first image set and the second image set, the method further comprises:获取多张拍摄图像和多张漫画图像;Acquire multiple photographed images and multiple comic images;分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同;Performing cutting processing on the photographed image and the comic image respectively to obtain a photographed image and a comic image after cutting, wherein the photographed image and the comic image after the cutting have the same image size;根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The first image set is constructed based on the cut shot images, and the second image set is constructed based on the cut cartoon images.
- 根据权利要求1或2所述的图像生成模型训练方法,其中,所述根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像,包括:The image generation model training method according to claim 1 or 2, wherein the preprocessing the captured image according to a preset comic generation algorithm to obtain the target comic image corresponding to the captured image comprises:根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。Image synthesis is performed on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- 根据权利要求3所述的图像生成模型训练方法,其中,所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像,包括:The image generation model training method according to claim 3, wherein the processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with an edge contour line comprises:根据切线流公式,在所述拍摄图像中构建切线流;According to the tangent flow formula, construct a tangent flow in the captured image;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Through the calculation formula of the similar binary image boundary, the Gaussian difference of the constructed tangent flow is calculated to obtain the edge image with the edge contour line.
- 根据权利要求4所述的图像生成模型训练方法,其中,所述切线流公式为:The image generation model training method according to claim 4, wherein the tangent flow formula is:其中,Ω(x)表示X的邻域,X=(x,y)表示所述拍摄图像的像素点;k是归一化向量;t(y)表示y点处的当前归一化切线向量;φ(x,y)为符号函数,φ(x,y)∈{1,-1};w s(x,y)为空间权重向量;w m(x,y)为量级权重函数;w d(x,y)为方向权重函数;初始时,t 0(x)设为与图像梯度向量正交的向量; Among them, Ω(x) represents the neighborhood of X, X=(x,y) represents the pixel of the captured image; k is the normalized vector; t(y) represents the current normalized tangent vector at point y ;Φ(x,y) is a symbolic function, φ(x,y)∈{1,-1}; w s (x,y) is a spatial weight vector; w m (x,y) is a magnitude weight function; w d (x,y) is the direction weight function; initially, t 0 (x) is set to a vector orthogonal to the image gradient vector;所述类二值图像边界计算公式为:The formula for calculating the boundary of the binary image is:其中,D(x)表示二值图像边界,H(x)为所述基于流的高斯差分滤波器算法的滤波器函数;λ为系数因子,λ取值范围为(0,1);τ取值为0.5。Among them, D(x) represents the boundary of the binary image, H(x) is the filter function of the flow-based Gaussian difference filter algorithm; λ is the coefficient factor, and the value range of λ is (0,1); τ is taken The value is 0.5.
- 一种图像生成方法,包括:An image generation method, including:获取待处理图像,所述待处理图像为拍摄图像;Acquiring an image to be processed, where the image to be processed is a captured image;根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the image to be processed according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到目标图像;Image synthesis of the hierarchical image and the edge image to obtain a target image;将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。The target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
- 一种图像生成模型训练装置,包括:An image generation model training device, including:数据获取单元,用于获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;A data acquisition unit, configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;预处理单元,用于根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;A preprocessing unit, configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;网络获取单元,用于获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;A network acquisition unit, configured to acquire a preset generative confrontation network, the generative confrontation network including a generation network and a discrimination network;模型训练单元,用于将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;The model training unit is configured to use the target comic image as the input of the generation network, and use the image output by the generation network and the comic image as the input of the discrimination network, and perform operations on the generation network and the discrimination network Alternate iterative training;模型保存单元,用于当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。The model saving unit is used to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than the preset value. The image generation model is used to generate a comic-style image. The comic style image is a comic image generated by the image generation model based on the captured image.
- 一种图像生成装置,包括:An image generating device, including:图像获取单元,用于获取待处理图像,所述待处理图像为拍摄图像;An image acquisition unit for acquiring an image to be processed, the image to be processed is a captured image;分割处理单元,用于根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;A segmentation processing unit, configured to perform image segmentation processing on the to-be-processed image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;边缘处理单元,用于根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;An edge processing unit, configured to process the to-be-processed image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;图像合成单元,用于将所述层级图像和所述边缘图像进行图像合成以得到目标图像;An image synthesis unit, configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image;图像生成单元,用于将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。An image generation unit for inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is trained using the image generation model training method according to any one of claims 1 to 5 The resulting model.
- 一种计算机设备,所述计算机设备包括存储器和处理器;A computer device including a memory and a processor;所述存储器用于存储计算机程序;The memory is used to store computer programs;所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现如下步骤:The processor is configured to execute the computer program, and when executing the computer program, implement the following steps:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- 根据权利要求9所述的计算机设备,其中,所述处理器在实现所述获取第一图像集和第二图像集之前,还实现如下步骤:The computer device according to claim 9, wherein the processor further implements the following steps before implementing the acquiring of the first image set and the second image set:获取多张拍摄图像和多张漫画图像;Acquire multiple photographed images and multiple comic images;分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同;Performing cutting processing on the photographed image and the comic image respectively to obtain a photographed image and a comic image after cutting, wherein the photographed image and the comic image after the cutting have the same image size;根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The first image set is constructed based on the cut shot images, and the second image set is constructed based on the cut cartoon images.
- 根据权利要求9或10所述的计算机设备,其中,所述处理器在实现所述根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像时,具体实现:The computer device according to claim 9 or 10, wherein the processor implements the preprocessing of the captured image according to a preset comic generation algorithm to obtain the target comic image corresponding to the captured image, specifically achieve:根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的 目标漫画图像。Image synthesis is performed on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- 根据权利要求11所述的计算机设备,其中,所述处理器在实现所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像时,具体实现:11. The computer device according to claim 11, wherein, when the processor implements the processing of the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contours, it specifically implements:根据切线流公式,在所述拍摄图像中构建切线流;According to the tangent flow formula, construct a tangent flow in the captured image;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Through the calculation formula of the similar binary image boundary, the Gaussian difference of the constructed tangent flow is calculated to obtain the edge image with the edge contour line.
- 根据权利要求12所述的计算机设备,其中,所述切线流公式为:The computer device according to claim 12, wherein the tangent flow formula is:其中,Ω(x)表示X的邻域,X=(x,y)表示所述拍摄图像的像素点;k是归一化向量;t(y)表示y点处的当前归一化切线向量;φ(x,y)为符号函数,φ(x,y)∈{1,-1};w s(x,y)为空间权重向量;w m(x,y)为量级权重函数;w d(x,y)为方向权重函数;初始时,t 0(x)设为与图像梯度向量正交的向量; Among them, Ω(x) represents the neighborhood of X, X=(x,y) represents the pixel of the captured image; k is the normalized vector; t(y) represents the current normalized tangent vector at point y ;Φ(x,y) is a symbolic function, φ(x,y)∈{1,-1}; w s (x,y) is a spatial weight vector; w m (x,y) is a magnitude weight function; w d (x,y) is the direction weight function; initially, t 0 (x) is set to a vector orthogonal to the image gradient vector;所述类二值图像边界计算公式为:The formula for calculating the boundary of the binary image is:其中,D(x)表示二值图像边界,H(x)为所述基于流的高斯差分滤波器算法的滤波器函数;λ为系数因子,λ取值范围为(0,1);τ取值为0.5。Among them, D(x) represents the boundary of the binary image, H(x) is the filter function of the flow-based Gaussian difference filter algorithm; λ is the coefficient factor, and the value range of λ is (0,1); τ is taken The value is 0.5.
- 一种计算机设备,所述计算机设备包括存储器和处理器;A computer device including a memory and a processor;所述存储器用于存储计算机程序;The memory is used to store computer programs;所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现如下步骤:The processor is configured to execute the computer program, and when executing the computer program, implement the following steps:获取待处理图像,所述待处理图像为拍摄图像;Acquiring an image to be processed, where the image to be processed is a captured image;根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the image to be processed according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到目标图像;Image synthesis of the hierarchical image and the edge image to obtain a target image;将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。The target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
- 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如下步骤:A computer-readable storage medium that stores a computer program, and when the computer program is executed by a processor, the processor implements the following steps:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对 应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- The computer-readable storage medium according to claim 15, wherein, before implementing the acquiring of the first image set and the second image set, the processor further implements the following steps: acquiring a plurality of captured images and a plurality of comic images; cropping the captured images and the comic images respectively to obtain cropped captured images and cropped comic images, the cropped captured images and the cropped comic images having the same image size; and constructing the first image set from the cropped captured images and the second image set from the cropped comic images.
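As an illustration of the cropping and set-construction step, the sketch below center-crops every image in a folder to a common square size and collects the results into an image set. The directory names, the 256-pixel size, the JPEG-only glob, and the center-crop choice are assumptions, since the claim only requires that the cropped photographs and comics end up with the same image size.

```python
from pathlib import Path
import cv2

def build_image_set(src_dir, dst_dir, size=256):
    """Center-crop every image to a square of the same size and collect the results."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    image_set = []
    for path in sorted(Path(src_dir).glob("*.jpg")):
        img = cv2.imread(str(path))
        if img is None:
            continue
        h, w = img.shape[:2]
        side = min(h, w)
        top, left = (h - side) // 2, (w - side) // 2
        cropped = cv2.resize(img[top:top + side, left:left + side], (size, size))
        out_path = dst / path.name
        cv2.imwrite(str(out_path), cropped)
        image_set.append(out_path)
    return image_set

# first image set from captured photos, second from comic images (paths are assumptions)
first_set = build_image_set("data/photos", "data/photos_cropped", size=256)
second_set = build_image_set("data/comics", "data/comics_cropped", size=256)
```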
- The computer-readable storage medium according to claim 15 or 16, wherein, when implementing the preprocessing of the captured images according to the preset comic generation algorithm to obtain the target comic images corresponding to the captured images, the processor specifically implements: performing image segmentation on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; processing the captured image according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines; and performing image synthesis on the hierarchical image and the edge image to obtain the target comic image corresponding to the captured image.
- The computer-readable storage medium according to claim 17, wherein, when implementing the processing of the captured image according to the flow-based difference-of-Gaussians filter algorithm to generate the edge image with edge contour lines, the processor specifically implements: constructing a tangent flow in the captured image according to a tangent flow formula; and calculating the difference of Gaussians of the constructed tangent flow through a class-binary image boundary calculation formula to obtain the edge image with edge contour lines.
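The two sub-steps of this claim, sampling a difference of Gaussians across the constructed tangent flow and collapsing the response into a class-binary edge map, can be sketched as follows. The sampling length, the σ values, the placement of λ inside the DoG kernel, and the tanh-style soft threshold are assumptions; the filing's exact boundary formula is not reproduced here.

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-(x ** 2) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def fdog_edges(gray, tangent, sigma_c=1.0, sigma_s=1.6, lam=0.99, tau=0.5, half_len=4):
    """Difference of Gaussians sampled across the tangent flow, then soft-thresholded
    into a class-binary edge map (1 = background, 0 = contour line)."""
    h, w = gray.shape
    response = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            tx, ty = tangent[y, x]
            nx, ny = -ty, tx                    # direction perpendicular to the flow
            acc, norm = 0.0, 0.0
            for s in range(-half_len, half_len + 1):
                xs, ys = int(round(x + nx * s)), int(round(y + ny * s))
                if 0 <= xs < w and 0 <= ys < h:
                    weight = gaussian(s, sigma_c) - lam * gaussian(s, sigma_s)  # DoG kernel
                    acc += weight * float(gray[ys, xs])
                    norm += abs(weight)
            response[y, x] = acc / (norm + 1e-8)
    # soft threshold into a class-binary boundary map
    edges = np.where(response > 0, 1.0, np.clip(1.0 + np.tanh(response), 0.0, 1.0))
    return (edges >= tau).astype(np.uint8)
```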
- The computer-readable storage medium according to claim 18, wherein the tangent flow formula is t^new(x) = (1/k) Σ_{y∈Ω(x)} φ(x,y)·t(y)·w_s(x,y)·w_m(x,y)·w_d(x,y), where Ω(x) denotes the neighborhood of x, x = (x,y) denotes a pixel of the captured image, k is the vector normalizing term, t(y) denotes the current normalized tangent vector at point y, φ(x,y) is a sign function with φ(x,y) ∈ {1, -1}, w_s(x,y) is the spatial weight function, w_m(x,y) is the magnitude weight function, and w_d(x,y) is the direction weight function, and where t_0(x) is initially set to the vector orthogonal to the image gradient vector; and the class-binary image boundary calculation formula is expressed in terms of D(x), H(x), λ and τ, where D(x) denotes the binary image boundary, H(x) is the filter function of the flow-based difference-of-Gaussians filter algorithm, λ is a coefficient factor with a value range of (0,1), and τ takes the value 0.5.
- A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the following steps: acquiring an image to be processed, the image to be processed being a captured image; performing image segmentation on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; processing the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines; performing image synthesis on the hierarchical image and the edge image to obtain a target image; and inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910267519.9 | 2019-04-03 | ||
CN201910267519.9A CN110097086B (en) | 2019-04-03 | 2019-04-03 | Image generation model training method, image generation method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020199478A1 (en) | 2020-10-08 |
Family
ID=67444266
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/103142 WO2020199478A1 (en) | 2019-04-03 | 2019-08-28 | Method for training image generation model, image generation method, device and apparatus, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110097086B (en) |
WO (1) | WO2020199478A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529058A (en) * | 2020-12-03 | 2021-03-19 | 北京百度网讯科技有限公司 | Image generation model training method and device and image generation method and device |
CN113989441A (en) * | 2021-11-16 | 2022-01-28 | 北京航空航天大学 | Three-dimensional cartoon model automatic generation method and system based on single face image |
CN114758029A (en) * | 2022-04-25 | 2022-07-15 | 杭州小影创新科技股份有限公司 | Cartoon special-effect image color changing method and system |
CN116862766A (en) * | 2023-06-28 | 2023-10-10 | 北京金阳普泰石油技术股份有限公司 | Intelligent mapping and iterative seamless splicing method and device based on edge generation model |
CN116912345A (en) * | 2023-07-12 | 2023-10-20 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110097086B (en) * | 2019-04-03 | 2023-07-18 | 平安科技(深圳)有限公司 | Image generation model training method, image generation method, device, equipment and storage medium |
CN110516201B (en) * | 2019-08-20 | 2023-03-28 | Oppo广东移动通信有限公司 | Image processing method, image processing device, electronic equipment and storage medium |
CN110620884B (en) * | 2019-09-19 | 2022-04-22 | 平安科技(深圳)有限公司 | Expression-driven-based virtual video synthesis method and device and storage medium |
CN111080512B (en) * | 2019-12-13 | 2023-08-15 | 咪咕动漫有限公司 | Cartoon image generation method and device, electronic equipment and storage medium |
CN113139893B (en) * | 2020-01-20 | 2023-10-03 | 北京达佳互联信息技术有限公司 | Image translation model construction method and device and image translation method and device |
CN111589156A (en) * | 2020-05-20 | 2020-08-28 | 北京字节跳动网络技术有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN111899154A (en) * | 2020-06-24 | 2020-11-06 | 广州梦映动漫网络科技有限公司 | Cartoon video generation method, cartoon generation device, cartoon generation equipment and cartoon generation medium |
CN112101204B (en) * | 2020-09-14 | 2024-01-23 | 北京百度网讯科技有限公司 | Training method, image processing method, device and equipment for generating type countermeasure network |
CN112132208B (en) * | 2020-09-18 | 2023-07-14 | 北京奇艺世纪科技有限公司 | Image conversion model generation method and device, electronic equipment and storage medium |
CN114067052A (en) * | 2021-11-16 | 2022-02-18 | 百果园技术(新加坡)有限公司 | Cartoon model construction method, device, equipment, storage medium and program product |
CN114359449A (en) * | 2022-01-13 | 2022-04-15 | 北京大橘大栗文化传媒有限公司 | Face digital asset manufacturing method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364029A (en) * | 2018-03-19 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating model |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN110097086A (en) * | 2019-04-03 | 2019-08-06 | 平安科技(深圳)有限公司 | Image generates model training method, image generating method, device, equipment and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7171042B2 (en) * | 2000-12-04 | 2007-01-30 | Intel Corporation | System and method for classification of images and videos |
KR101049928B1 (en) * | 2011-02-21 | 2011-07-15 | (주)올라웍스 | Method, terminal and computer-readable recording medium for generating panoramic images |
US10891485B2 (en) * | 2017-05-16 | 2021-01-12 | Google Llc | Image archival based on image categories |
US10268928B2 (en) * | 2017-06-07 | 2019-04-23 | Adobe Inc. | Combined structure and style network |
CN107330956B (en) * | 2017-07-03 | 2020-08-07 | 广东工业大学 | Cartoon hand drawing unsupervised coloring method and device |
CN107423701B (en) * | 2017-07-17 | 2020-09-01 | 智慧眼科技股份有限公司 | Face unsupervised feature learning method and device based on generative confrontation network |
CN108596267B (en) * | 2018-05-03 | 2020-08-28 | Oppo广东移动通信有限公司 | Image reconstruction method, terminal equipment and computer readable storage medium |
CN109087380B (en) * | 2018-08-02 | 2023-10-20 | 咪咕文化科技有限公司 | Cartoon drawing generation method, device and storage medium |
CN109376582B (en) * | 2018-09-04 | 2022-07-29 | 电子科技大学 | Interactive face cartoon method based on generation of confrontation network |
- 2019-04-03: CN application CN201910267519.9A, patent CN110097086B (en), status: Active
- 2019-08-28: WO application PCT/CN2019/103142, publication WO2020199478A1 (en), status: Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364029A (en) * | 2018-03-19 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating model |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN110097086A (en) * | 2019-04-03 | 2019-08-06 | 平安科技(深圳)有限公司 | Image generates model training method, image generating method, device, equipment and storage medium |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529058A (en) * | 2020-12-03 | 2021-03-19 | 北京百度网讯科技有限公司 | Image generation model training method and device and image generation method and device |
CN113989441A (en) * | 2021-11-16 | 2022-01-28 | 北京航空航天大学 | Three-dimensional cartoon model automatic generation method and system based on single face image |
CN113989441B (en) * | 2021-11-16 | 2024-05-24 | 北京航空航天大学 | Automatic three-dimensional cartoon model generation method and system based on single face image |
CN114758029A (en) * | 2022-04-25 | 2022-07-15 | 杭州小影创新科技股份有限公司 | Cartoon special-effect image color changing method and system |
CN116862766A (en) * | 2023-06-28 | 2023-10-10 | 北京金阳普泰石油技术股份有限公司 | Intelligent mapping and iterative seamless splicing method and device based on edge generation model |
CN116912345A (en) * | 2023-07-12 | 2023-10-20 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
CN116912345B (en) * | 2023-07-12 | 2024-04-26 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110097086A (en) | 2019-08-06 |
CN110097086B (en) | 2023-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020199478A1 (en) | Method for training image generation model, image generation method, device and apparatus, and storage medium | |
US11481869B2 (en) | Cross-domain image translation | |
CN109493350B (en) | Portrait segmentation method and device | |
WO2020119527A1 (en) | Human action recognition method and apparatus, and terminal device and storage medium | |
CN106803055B (en) | Face identification method and device | |
CN109829448B (en) | Face recognition method, face recognition device and storage medium | |
CN111144242B (en) | Three-dimensional target detection method, device and terminal | |
US20200151849A1 (en) | Visual style transfer of images | |
WO2019011249A1 (en) | Method, apparatus, and device for determining pose of object in image, and storage medium | |
EP3204888A1 (en) | Spatial pyramid pooling networks for image processing | |
CN109583509B (en) | Data generation method and device and electronic equipment | |
CN110852349A (en) | Image processing method, detection method, related equipment and storage medium | |
WO2018082308A1 (en) | Image processing method and terminal | |
CN111666905B (en) | Model training method, pedestrian attribute identification method and related device | |
CN113112518B (en) | Feature extractor generation method and device based on spliced image and computer equipment | |
CN107784288A (en) | A kind of iteration positioning formula method for detecting human face based on deep neural network | |
CN112308866A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
WO2010043954A1 (en) | Method, apparatus and computer program product for providing pattern detection with unknown noise levels | |
CN112862807A (en) | Data processing method and device based on hair image | |
WO2019209751A1 (en) | Superpixel merging | |
TWI711004B (en) | Picture processing method and device | |
US9940718B2 (en) | Apparatus and method for extracting peak image from continuously photographed images | |
CN111639537A (en) | Face action unit identification method and device, electronic equipment and storage medium | |
CN112884884A (en) | Candidate region generation method and system | |
CN108038864B (en) | Method and system for extracting animal target image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19923585; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19923585; Country of ref document: EP; Kind code of ref document: A1 |