WO2020199478A1 - Method for training image generation model, image generation method, device and apparatus, and storage medium - Google Patents
- Publication number
- WO2020199478A1 (application PCT/CN2019/103142)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- comic
- network
- images
- captured
- Prior art date
Links
- 238000012549 training Methods 0.000 title claims abstract description 100
- 238000000034 method Methods 0.000 title claims abstract description 91
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 61
- 238000007781 pre-processing Methods 0.000 claims abstract description 15
- 238000012545 processing Methods 0.000 claims description 52
- 230000006870 function Effects 0.000 claims description 31
- 238000004590 computer program Methods 0.000 claims description 23
- 230000015572 biosynthetic process Effects 0.000 claims description 22
- 238000003786 synthesis reaction Methods 0.000 claims description 22
- 238000003709 image segmentation Methods 0.000 claims description 16
- 238000004364 calculation method Methods 0.000 claims description 6
- 230000011218 segmentation Effects 0.000 claims description 4
- 238000010586 diagram Methods 0.000 description 10
- 230000004913 activation Effects 0.000 description 9
- 238000010606 normalization Methods 0.000 description 5
- 238000013527 convolutional neural network Methods 0.000 description 4
- 239000003086 colorant Substances 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000001815 facial effect Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 230000001427 coherent effect Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000008921 facial expression Effects 0.000 description 1
- 238000009499 grossing Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- This application relates to the field of image processing technology, and in particular to an image generation model training method, image generation method, device, computer equipment and storage medium.
- Comics are an art form widely used in our daily lives, with a broad range of applications such as children's story education. Like other forms of artwork, many comic images are created from real-world scenes. However, converting real-world images into comic-style images is extremely challenging in both computer vision and computer graphics, because comic-style image features often differ greatly from the features of captured images, for example in a character's hairstyle, clothing, facial expression, and facial features. Precisely because this difference is so large, the data dimensions that must be processed to convert captured images into comic-style images are huge, and the required image generation model is very difficult and time-consuming to train.
- This application provides an image generation model training method, image generation method, device, computer equipment, and storage medium, in order to train a model that can convert captured images into comic-style images and, at the same time, to improve the efficiency of model training.
- this application provides an image generation model training method, which includes:
- the first image set includes a plurality of photographed images
- the second image set includes a plurality of cartoon images
- the generative adversarial network including a generative network and a discriminant network
- the trained generation network is saved as an image generation model, and the image generation model is used to generate an image with a comic style.
- this application also provides an image generation method, which includes:
- the target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
- this application also provides an image generation model training device, which includes:
- a data acquisition unit configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;
- a preprocessing unit configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image
- a network acquisition unit configured to acquire a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
- the model training unit is configured to use the target comic image as the input of the generation network, to use the image output by the generation network and the comic image as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network;
- the model saving unit is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value, and the image generation model is used to generate an image with a comic style.
- the present application also provides an image generation device, which includes:
- An image acquisition unit for acquiring an image to be processed, the image to be processed is a captured image
- a segmentation processing unit configured to perform image segmentation processing on the to-be-processed image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure
- An edge processing unit configured to process the to-be-processed image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;
- An image synthesis unit configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image
- the image generation unit is configured to input the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
- the present application also provides a computer device, the computer device including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing the computer program, to implement the above-mentioned image generation model training method or image generation method.
- the present application also provides a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the above-mentioned image generation model training method or image generation method.
- the application discloses an image generation model training method, image generation method, device, computer equipment and storage medium.
- the training method first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discriminant network in the generative adversarial network, so that the generation network and the discriminant network are trained alternately and iteratively until the discriminant probability value output by the discriminant network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application
- FIG. 2 is a schematic flowchart of sub-steps of the image generation model training method provided in FIG. 1;
- FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of an application scenario of an image generation method provided by an embodiment of the present application.
- FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the application.
- FIG. 7 is a schematic block diagram of a preprocessing unit in an image generation model training device provided by an embodiment of the application.
- FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the application.
- FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
- the embodiments of the present application provide an image generation model training method, image generation method, device, computer equipment, and storage medium.
- the image generation model training method is used to quickly train an image generation model that can generate a comic style;
- the image generation method can be applied to a server or a terminal, and uses the image generation model to generate a comic-style image from the captured image, thereby improving the user's experience.
- the server can be an independent server or a server cluster.
- the terminal can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
- For example, the trained image generation model can be installed in a mobile phone, or compressed and then installed in the mobile phone.
- Using the image generation method, the user's mobile phone processes a captured image to obtain the corresponding comic-style image, thereby improving the user experience.
- the comic style images can be comics or classic cartoons, etc., such as One Piece, Crayon Shin-chan or Naruto, etc.
- In the following, the image generation model training method and the image generation method are described in detail using the Naruto style as the example comic style.
- FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application.
- the image generation model is obtained by model training based on a generative adversarial network; of course, it can also be obtained by training other similar networks.
- the image generation model training method includes: step S101 to step S105.
- the first image set and the second image set are acquired as sample data for model training; the first image set is a collection of captured images, and the second image set is a collection of comic images.
- the multiple captured images in the first image set are real-world pictures.
- a certain number of pictures can be downloaded from the Flickr website.
- Some of the images are used for training and the rest for testing; for example, of 6,000 images, 5,500 are used for model training and the remaining 500 for model testing.
- the multiple comic images in the second image set can be images from an anime such as Naruto.
- For example, by selecting the first 700 episodes of the anime Naruto and randomly selecting 10 images from each episode, a total of 7,000 Naruto images are collected as the second image set.
- S102 Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the preset comic generation algorithm uses image processing algorithms to preprocess the captured images in the first image set to extract image information in the captured images, such as hierarchical structure images, edge images, facial features or hairstyle features, etc.
- the target comic image corresponding to the captured image is constructed according to the image information.
- In one embodiment, in order to improve the training speed and accuracy of the model, a step of preprocessing the captured images in the first image set is provided; as shown in FIG. 2, step S102 includes sub-steps S102a to S102c.
- S102a Perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
- a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
- the Mean-shift algorithm is a hill-climbing method based on kernel density estimation; it requires no prior knowledge and relies entirely on computing the density function values of the sample points in feature space.
- the usual histogram method divides the image data into several equal intervals, and the ratio of the data in each interval to the total amount of data gives the probability value of that interval; the Mean-shift algorithm is similar in principle to the histogram method, but additionally applies a kernel function to smooth the data.
- when the image data are sufficient, this kernel density estimation method gradually converges to any density function, that is, it can estimate the density of data obeying any distribution.
- Such a method can be used in many fields such as clustering, image segmentation, tracking, etc., and has a good effect in removing detailed information such as image color and texture.
- the Mean-shift algorithm is mainly used for image segmentation to obtain hierarchical images.
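For illustration only (the patent does not give an implementation), a minimal sketch of this step using OpenCV's pyrMeanShiftFiltering function is shown below; the spatial and color radii are assumed values, not parameters taken from the application.

```python
# Minimal sketch (not from the patent): mean-shift based color flattening with OpenCV.
# pyrMeanShiftFiltering iteratively shifts each pixel toward the local color mode,
# unifying similar colors and yielding the flat-region "hierarchical" image described above.
import cv2

def hierarchical_image(path, spatial_radius=15, color_radius=40, max_level=2):
    img = cv2.imread(path)  # BGR captured image
    if img is None:
        raise FileNotFoundError(path)
    return cv2.pyrMeanShiftFiltering(img, spatial_radius, color_radius, maxLevel=max_level)
```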
- S102b Process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
- a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
- processing the captured image according to the flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines specifically includes: constructing a tangent flow for the captured image according to a tangent flow formula;
- and calculating the difference of Gaussians of the constructed tangent flow according to a binary-like image boundary calculation formula to obtain an edge image with edge contour lines.
- in the tangent flow formula used here:
- Ω(x) represents the neighborhood of x;
- k is the normalization factor;
- t(y) represents the current normalized tangent vector at point y;
- φ(x, y) is a sign function, φ(x, y) ∈ {1, -1};
- w_s(x, y) is the spatial weight function;
- w_m(x, y) is the magnitude weight function;
- w_d(x, y) is the direction weight function;
- t_0(x) is initialized as a vector orthogonal to the image gradient vector.
- in the formula for calculating the boundary of the binary-like image:
- D(x) represents the boundary of the binary-like image;
- H(x) is the filter response of the flow-based difference-of-Gaussians filter;
- τ is the coefficient factor;
- the value range of τ is (0, 1);
- in this embodiment, the value of τ is 0.5.
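The equations themselves do not survive in this text (they appear as images in the published application). For orientation only, the standard edge tangent flow and flow-based difference-of-Gaussians binarization formulas from the coherent-line-drawing literature, which are consistent with the symbol definitions listed above but are not necessarily the exact equations of this application, read:

$$t^{new}(x) = \frac{1}{k} \sum_{y \in \Omega(x)} \varphi(x,y)\, t^{cur}(y)\, w_s(x,y)\, w_m(x,y)\, w_d(x,y)$$

$$D(x) = \begin{cases} 0, & \text{if } H(x) < 0 \ \text{and}\ 1 + \tanh(H(x)) < \tau \\ 1, & \text{otherwise} \end{cases}$$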
- Using this binary-like image boundary calculation formula makes the edge image clear, smooth and coherent, thereby improving the accuracy of the image generation model.
- S102c Perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- image synthesis is performed on the hierarchical image and the edge image to obtain an image that has the hierarchical structure and edge features of the captured image, that is, the target comic image.
- Using the target comic image for image generation model training can reduce the data dimension that needs to be processed for model training, and at the same time improve the training speed of the model and the accuracy of the model.
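The application does not specify how the two images are combined; one minimal, assumed approach is to darken the color-flattened hierarchical image wherever the edge image marks a contour line, as sketched below.

```python
# Minimal sketch (assumption): overlay the FDoG edge lines on the mean-shift flattened image.
# `edges_gray` is a single-channel image where edge pixels are dark (0) and non-edges are bright (255).
import numpy as np

def synthesize_target_comic(hierarchical_bgr: np.ndarray, edges_gray: np.ndarray) -> np.ndarray:
    mask = (edges_gray.astype(np.float32) / 255.0)[..., None]  # 0 on edges, 1 elsewhere
    out = hierarchical_bgr.astype(np.float32) * mask           # darken edge pixels
    return out.clip(0, 255).astype(np.uint8)
```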
- the generative adversarial network includes a generation network and a discrimination network.
- the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
- the generative adversarial network may be any of various types of adversarial networks.
- it can be a Deep Convolutional Generative Adversarial Network (DCGAN).
- the generation network can be a convolutional neural network for image processing (for example, a convolutional neural network structure including convolutional layers, pooling layers, unpooling layers and deconvolutional layers, which perform down-sampling and then up-sampling in sequence);
- the discriminant network can be a convolutional neural network (for example, a convolutional neural network structure including a fully connected layer, where the fully connected layer implements the classification function).
- S104 Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
- performing alternating iterative training includes two training processes, namely: training a generation network and training a discriminant network.
- training the generation network includes: inputting a captured image into the generation network; applying one convolution followed by batch normalization (BN) and a ReLU activation; then applying a down-convolution block consisting of a convolution, batch normalization (BN) and a ReLU activation, this block being applied twice; then passing through 8 identical residual blocks; then applying two up-convolution blocks, each consisting of a convolution, a convolution, batch normalization (BN) and a ReLU activation; and finally applying one convolution to output an image with the same size as the input captured image.
- the activation function uses the ReLU function.
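The exact layer hyperparameters (channel counts, kernel sizes, strides) are not given here; the following is a hypothetical PyTorch sketch of the described layout — one convolution with BN and ReLU, two down-convolution blocks, 8 identical residual blocks, two up-convolution blocks, and a final convolution — with assumed values for everything unspecified.

```python
# Hypothetical sketch of the described generator layout; channel counts, kernel
# sizes and strides are assumptions, not values taken from the patent.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch=256):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # initial convolution + BN + ReLU
            nn.Conv2d(3, 64, 7, padding=3), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # two down-convolution blocks (conv + BN + ReLU)
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            # 8 identical residual blocks
            *[ResidualBlock(256) for _ in range(8)],
            # two up-convolution blocks; a transposed convolution is used for the
            # upsampling step (an interpretation of the described "conv + conv + BN + ReLU")
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.Conv2d(128, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # final convolution producing an image the same size as the input
            nn.Conv2d(64, 3, 7, padding=3),
        )
    def forward(self, x):
        return self.net(x)
```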
- training the discriminant network includes: inputting the images output by the generation network and the comic images into the discriminant network, applying multiple convolutions with batch normalization (BN) and LReLU activation, and then applying Sigmoid function processing;
- the resulting output is the probability that the input is a comic image (Naruto image) from the second image set; here the activation function is the LReLU function.
- the discrimination network is used to determine whether the input image (the output image of the generation network) is a Naruto image in the second image set.
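Likewise, a hypothetical sketch of the described discriminant network (several convolutions with BN and LReLU activations followed by a Sigmoid probability output), with assumed layer sizes:

```python
# Hypothetical sketch of the described discriminator; layer sizes are assumptions.
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 3, padding=1),            # per-patch real/fake score
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Sigmoid(),                               # probability that the input is a comic image
        )
    def forward(self, x):
        return self.net(x)
```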
- By alternately training the two networks, the discriminant network is optimized first; at this stage it can easily distinguish whether its input is a comic image from the second image set (a Naruto image), because the images generated by the network at the beginning deviate strongly from the Naruto images in the second image set. The generation network is then optimized so that its loss function gradually decreases, while the binary classification ability of the discriminant network also improves. The iterations continue until the discriminant network cannot determine whether its input is a Naruto image from the second image set or a Naruto-style image generated by the generation network; at this point the entire generation network has been trained, and the images it generates have the style of the anime Naruto.
- When the discriminant probability value output by the discriminant network is greater than the preset value, the binary classification ability of the discriminant network is considered established, which ensures that the images generated by the generation network are images with the anime Naruto style.
- the size of the preset value is not limited here, and can be set according to expert experience.
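The application does not spell out the loss functions or the optimization schedule. The sketch below shows one conventional alternating-training loop using standard binary cross-entropy GAN losses and Adam optimizers; these choices, and the Generator/Discriminator classes sketched above, are assumptions rather than details taken from the patent.

```python
# Hypothetical alternating training loop; losses, optimizers and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def train(generator, discriminator, target_comic_loader, comic_loader, epochs=100, device="cuda"):
    g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    generator.to(device)
    discriminator.to(device)
    for _ in range(epochs):
        for target_comic, comic in zip(target_comic_loader, comic_loader):
            target_comic, comic = target_comic.to(device), comic.to(device)
            # 1) train the discrimination network: real comic images -> 1, generated images -> 0
            fake = generator(target_comic).detach()
            real_pred, fake_pred = discriminator(comic), discriminator(fake)
            d_loss = (F.binary_cross_entropy(real_pred, torch.ones_like(real_pred)) +
                      F.binary_cross_entropy(fake_pred, torch.zeros_like(fake_pred)))
            d_opt.zero_grad(); d_loss.backward(); d_opt.step()
            # 2) train the generation network: try to make the discriminator output 1
            fake_pred = discriminator(generator(target_comic))
            g_loss = F.binary_cross_entropy(fake_pred, torch.ones_like(fake_pred))
            g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```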
- the training method provided by the foregoing embodiment first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discriminant network in the generative adversarial network, so as to perform alternating iterative training on the generation network and the discriminant network until the discriminant probability value output by the discriminant network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application.
- the image generation model is obtained by model training based on a generative adversarial network; of course, it can also be obtained by training other similar networks.
- the image generation model training method includes: step S201 to step S208.
- S202 Perform cutting processing on the photographed image and the cartoon image respectively to obtain the photographed image and the cartoon image after cutting.
- Specifically, the captured images and the comic images are each cropped so that the cropped captured images and comic images have the same image size; for example, both are cropped to 256×256 images, although other sizes can also be used.
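As a small illustration (not from the patent), a center crop to a common square size could be implemented as follows; the use of OpenCV is an assumption, and the 256×256 size matches the example above.

```python
# Minimal sketch (assumption): resize then center-crop an image to a square of the given size.
import cv2

def center_crop(img, size=256):
    h, w = img.shape[:2]
    scale = size / min(h, w)
    img = cv2.resize(img, (max(size, int(round(w * scale))), max(size, int(round(h * scale)))))
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]
```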
- S203 Construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
- the cut shot images are constructed into a first image set
- the cut cartoon images are constructed into a second image set, so that the sizes of the images in the first image set and the second image set are the same.
- the first image set includes multiple photographed images
- the second image set includes multiple cartoon images. It should be noted that the number of images in the first image set and the number of images in the second image set may be the same or different.
- S205 Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the generative adversarial network includes a generation network and a discrimination network.
- the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
- performing alternate iterative training includes two training processes, namely: training a generation network and training a discriminant network.
- the target comic image is used as the input of the generation network, the image output by the generation network and the comic image are used as the input of the discrimination network, and the generation network and the discrimination network are trained alternately and iteratively; the iterations continue until the discrimination network cannot determine whether its input is a Naruto image from the second image set or a Naruto-style image generated by the generation network.
- At this point, the entire generation network has been trained.
- When the discrimination probability value output by the discrimination network is greater than the preset value, the binary classification ability of the discrimination network is considered established, which ensures that the images generated by the generation network are images with the anime Naruto style.
- When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network can be used to generate comic-style images, so the generation network at this time is saved as the comic-style image generation model.
- the training method provided by the foregoing embodiment first constructs the first image set and the second image set, and then preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images;
- the target comic images are then used as the input of the generation network in the generative adversarial network, and the images output by the generation network together with the comic images in the second image set that relate to the captured images are used as the input of the discrimination network in the generative adversarial network;
- the generation network and the discrimination network are trained alternately and iteratively until the discrimination probability value output by the discrimination network is greater than the preset value, and the trained generation network obtained at this point is used as the image generation model.
- This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
- FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application.
- the image generation method can be applied to a terminal or a server to generate a comic-style image based on the captured image using the above-trained image generation model.
- the application of the image generation method to a terminal is taken as an example for introduction, as shown in FIG. 5, which is a schematic diagram of an application scenario of the image generation method provided by this application.
- the server uses any of the image generation model training methods provided in the above embodiments to train the image generation model, and sends the image generation model to the terminal.
- the terminal receives and saves the image generation model sent by the server.
- the terminal can then run the image generation method, using the image generation model to generate a comic-style image from a captured image.
- the terminal is used to perform: acquiring an image to be processed, which is a captured image; and inputting the image to be processed into an image generation model to generate a corresponding comic image, wherein the image generation model is a model obtained by training using any of the image generation model training methods described above.
- In this way, the image to be processed selected by the user on the terminal (for example, an image just taken or an image stored on disk) is converted into a comic-style image, improving the user's experience.
- the image generation method includes: step S301 to step S305.
- Specifically, the image to be processed may be a picture the user has just taken, or a picture selected from the gallery, for example a picture taken with a mobile phone or chosen from previously taken pictures, which the user wants to convert into a comic-style image. The user can send the picture to the server that stores the comic-style image generation model; the server inputs the image to be processed into the comic-style image generation model to generate the corresponding comic image, and sends the generated comic image to the user.
- another image generation method is also provided.
- the image generation method may also use the acquired image to be processed as the target image, and execute step S305.
- a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
- S303 Process the image to be processed according to the Gaussian difference filter algorithm based on the stream to generate an edge image with edge contour lines.
- a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
- image synthesis is performed on the hierarchical image and the edge image to obtain an image that has the hierarchical structure and edge features of the captured image, that is, the target image.
- Inputting the target image into the image generation model to generate a comic-style image can increase the speed of image generation.
- the image generation model is a model obtained by training using any of the image generation model training methods provided in the foregoing embodiments.
- the target image is input to the image generation model to generate the corresponding comic image.
- Specifically, the target image synthesized from the hierarchical image and the edge image is input into the image generation model, and the image generation model is used to generate an image with a comic style, such as the image displayed by the terminal in FIG. 5, thereby improving the user experience.
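Putting steps S302 to S305 together, a hypothetical end-to-end inference call might look like the sketch below. It reuses the hierarchical_image and synthesize_target_comic helpers and the Generator class sketched earlier, and assumes an edge_fn helper implementing the flow-based difference-of-Gaussians filter; none of these names, nor the normalization conventions, come from the patent.

```python
# Hypothetical end-to-end inference pipeline composed from the sketches above.
import cv2
import numpy as np
import torch

def generate_comic(generator, image_path, edge_fn):
    hier = hierarchical_image(image_path)                  # S302: mean-shift hierarchical image
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    edges = edge_fn(gray)                                  # S303: FDoG edge image (assumed helper, 0 on edges)
    target = synthesize_target_comic(hier, edges)          # S304: image synthesis
    x = torch.from_numpy(target[..., ::-1].copy()).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():                                  # S305: run the image generation model
        y = generator(x).squeeze(0).clamp(0, 1)
    return (y.permute(1, 2, 0).numpy()[..., ::-1] * 255).astype(np.uint8)  # back to BGR uint8
```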
- FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application.
- the image generation model training device may be configured in a server to execute the aforementioned image generation model training method.
- the image generation model training device 400 includes: a photographing acquisition unit 401, a cutting processing unit 402, an atlas construction unit 403, a data acquisition unit 404, a preprocessing unit 405, a network acquisition unit 406, and model training Unit 407 and model saving unit 408.
- the photographing and acquiring unit 401 is configured to acquire multiple photographed images and multiple cartoon images.
- the cropping processing unit 402 is configured to perform cropping processing on the captured image and the cartoon image to obtain a cropped captured image and a cartoon image, wherein the cropped captured image and the cartoon image have the same image size.
- the atlas construction unit 403 is configured to construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
- the data acquisition unit 404 is configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images.
- the preprocessing unit 405 is configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
- the preprocessing unit 405 includes: a hierarchical processing subunit 4051, an edge processing subunit 4052 and an image synthesis subunit 4053.
- the hierarchical processing subunit 4051 is configured to perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
- the edge processing subunit 4052 is configured to process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;
- an image synthesis subunit 4053 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- the network obtaining unit 406 is configured to obtain a preset generative adversarial network, the generative adversarial network including a generative network and a discriminant network.
- the model training unit 407 is configured to use the target comic image as the input of the generation network, to use the image output by the generation network and the comic image as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network.
- the model saving unit 408 is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value.
- the image generation model is used to generate an image with a comic style.
- FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application, and the image generation device is used to execute the aforementioned image generation method.
- the image generating device can be configured in a server or a terminal.
- the image generation device 500 includes: an image acquisition unit 501, a segmentation processing unit 502, an edge processing unit 503, an image synthesis unit 504, and an image generation unit 505.
- the image acquisition unit 501 is configured to acquire an image to be processed, and the image to be processed is a captured image.
- the acquired image to be processed may also be used as the target image, and the image generating unit 505 may be called.
- the segmentation processing unit 502 is configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
- the edge processing unit 503 is configured to process the to-be-processed image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
- the image synthesis unit 504 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image.
- the image generation unit 505 is configured to input the target image into the image generation model to generate a corresponding comic image.
- the image generation model is a model obtained by training using the above-mentioned image generation model training method.
- the above-mentioned apparatus can be implemented in the form of a computer program, and the computer program can be run on the computer device as shown in FIG. 9.
- FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
- the computer equipment can be a server or a terminal.
- the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
- the non-volatile storage medium can store an operating system and a computer program.
- the computer program includes program instructions.
- the processor can execute any image generation model training method or image generation method.
- the processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
- the internal memory provides an environment for the operation of the computer program in the non-volatile storage medium.
- the processor can execute any image generation model training method or image generation method.
- the network interface is used for network communication, such as sending assigned tasks.
- FIG. 9 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
- the specific computer device may include more or fewer components than shown in the figure, or combine some components, or have a different arrangement of components.
- the processor may be a central processing unit (Central Processing Unit, CPU), and the processor may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
- the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the processor executes the program instructions to implement the present application Any of the image generation model training methods or image generation methods provided in the embodiments.
- the computer-readable storage medium may be the internal storage unit of the computer device described in the foregoing embodiment, such as the hard disk or memory of the computer device.
- the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart memory card (SMC), a Secure Digital (SD) card, or a flash card equipped on the computer device.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
A method for training an image generation model, an image generation method, device and apparatus, and a storage medium. The method for training an image generation model comprises: acquiring a first image set and a second image set, the first image set comprising multiple captured images, and the second image set comprising multiple cartoon images; performing pre-processing of the captured images according to a preset cartoon generation algorithm to obtain corresponding target cartoon images; and iteratively training a generative network and a discriminative network in an alternating manner to obtain an image generation model.
Description
This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on April 3, 2019, with application number 201910267519.9 and invention title "Image generation model training method, image generation method, device, equipment and storage medium", the entire content of which is incorporated into this application by reference.
This application relates to the field of image processing technology, and in particular to an image generation model training method, image generation method, device, computer equipment and storage medium.
Comics are an art form widely used in our daily lives, with a broad range of applications such as children's story education. Like other forms of artwork, many comic images are created from real-world scenes. However, converting real-world images into comic-style images is extremely challenging in both computer vision and computer graphics, because comic-style image features often differ greatly from the features of captured images, for example in a character's hairstyle, clothing, facial expression, and facial features. Precisely because this difference is so large, the data dimensions that must be processed to convert captured images into comic-style images are huge, and the required image generation model is very difficult and time-consuming to train.
Summary of the Invention
This application provides an image generation model training method, image generation method, device, computer equipment, and storage medium, in order to train a model that can convert captured images into comic-style images and, at the same time, to improve the efficiency of model training.
In the first aspect, this application provides an image generation model training method, which includes:
acquiring a first image set and a second image set, the first image set including a plurality of photographed images, and the second image set including a plurality of cartoon images;
preprocessing the captured images according to a preset comic generation algorithm to obtain target comic images corresponding to the captured images;
acquiring a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
using the target comic images as the input of the generation network and the images output by the generation network and the comic images as the input of the discrimination network, and performing alternating iterative training on the generation network and the discrimination network;
when the discrimination probability value output by the discrimination network is greater than a preset value, saving the trained generation network as an image generation model, the image generation model being used to generate images with a comic style.
In the second aspect, this application also provides an image generation method, which includes:
acquiring an image to be processed, the image to be processed being a captured image;
performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
processing the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines;
performing image synthesis on the hierarchical image and the edge image to obtain a target image;
inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
In the third aspect, this application also provides an image generation model training device, which includes:
a data acquisition unit, configured to acquire a first image set and a second image set, the first image set including a plurality of photographed images, and the second image set including a plurality of cartoon images;
a preprocessing unit, configured to preprocess the captured images according to a preset comic generation algorithm to obtain target comic images corresponding to the captured images;
a network acquisition unit, configured to acquire a preset generative adversarial network, the generative adversarial network including a generation network and a discrimination network;
a model training unit, configured to use the target comic images as the input of the generation network and the images output by the generation network and the comic images as the input of the discrimination network, and to perform alternating iterative training on the generation network and the discrimination network;
a model saving unit, configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value, the image generation model being used to generate images with a comic style.
In the fourth aspect, this application also provides an image generation device, which includes:
an image acquisition unit, configured to acquire an image to be processed, the image to be processed being a captured image;
a segmentation processing unit, configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;
an edge processing unit, configured to process the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines;
an image synthesis unit, configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image;
an image generation unit, configured to input the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the above-mentioned image generation model training method.
In the fifth aspect, this application also provides a computer device, the computer device including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing the computer program, to implement the above-mentioned image generation model training method or image generation method.
In the sixth aspect, this application also provides a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the above-mentioned image generation model training method or image generation method.
This application discloses an image generation model training method, image generation method, device, computer equipment and storage medium. The training method first preprocesses the captured images in the first image set according to a preset comic generation algorithm to obtain the target comic images corresponding to the captured images; it then uses the target comic images as the input of the generation network in the generative adversarial network, and uses the images output by the generation network together with the comic images in the second image set that relate to the captured images as the input of the discrimination network in the generative adversarial network, so that the generation network and the discrimination network are trained alternately and iteratively until the discrimination probability value output by the discrimination network is greater than the preset value; the trained generation network obtained at this point is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improve the efficiency of training the model.
In order to explain the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the drawings needed in the description of the embodiments. Obviously, the drawings in the following description show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.
FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of sub-steps of the image generation model training method provided in FIG. 1;
FIG. 3 is a schematic flowchart of another image generation model training method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of an image generation method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an application scenario of an image generation method provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of a preprocessing unit in an image generation model training device provided by an embodiment of the present application;
FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application;
FIG. 9 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
The technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, rather than all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
The flowcharts shown in the drawings are merely illustrations and do not necessarily include all contents and operations/steps, nor do they have to be executed in the described order. For example, some operations/steps can be decomposed, combined or partially merged, so the actual execution order may change according to actual conditions.
The embodiments of the present application provide an image generation model training method, image generation method, device, computer equipment, and storage medium. The image generation model training method is used to quickly train an image generation model that can generate comic-style images; the image generation method can be applied to a server or a terminal and uses the image generation model to generate a comic-style image from a captured image, thereby improving the user's experience.
The server may be an independent server or a server cluster. The terminal may be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device.
For example, the trained image generation model can be installed in a mobile phone, or compressed and then installed in the mobile phone. Using the image generation method, the user's mobile phone processes a captured image to obtain the corresponding comic-style image, thereby improving the user experience.
It should be noted that the comic style may be that of a comic or a classic cartoon, such as One Piece, Crayon Shin-chan or Naruto. In the following, the image generation model training method and the image generation method are described in detail using the Naruto style as the example comic style.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. In the case of no conflict, the following embodiments and the features in the embodiments can be combined with each other.
请参阅图1,图1是本申请的实施例提供的一种图像生成模型训练方法的示意流程图。该图像生成模型基于生成式对抗网络进行模型训练得到的,当然也可以采用其他类似网络进行训练得到。Please refer to FIG. 1. FIG. 1 is a schematic flowchart of an image generation model training method provided by an embodiment of the present application. The image generation model is obtained by model training based on a generative confrontation network. Of course, it can also be obtained by training with other similar networks.
如图1所示,该图像生成模型训练方法,包括:步骤S101至步骤S105。As shown in Fig. 1, the image generation model training method includes: step S101 to step S105.
S101、获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。S101. Acquire a first image set and a second image set, where the first image set includes multiple photographed images, and the second image set includes multiple cartoon images.
其中，获取第一图像集和第二图像集作为模型训练用的样本数据，第一图像集为拍摄图像的集合，第二图像集为漫画图像的集合。Specifically, the first image set and the second image set are acquired as sample data for model training, where the first image set is a collection of captured images and the second image set is a collection of comic images.
具体地，第一图像集中的多张拍摄图像为真实世界图片，可以从Flickr网站上下载一定数量的图片，部分图像用于训练，另一部分图像用于测试，比如6000张图像，其中5500张图像用于模型训练，另外500张图像用于模型测试。Specifically, the multiple captured images in the first image set are real-world pictures. A certain number of pictures can be downloaded from the Flickr website, with some of the images used for training and the rest for testing; for example, out of 6000 images, 5500 are used for model training and the other 500 for model testing.
具体地，第二图像集中的多张漫画图像可以为动漫中的图像，比如火影忍者，通过选定动漫火影忍者前700集动漫，并在每集动漫中随机选取10张图像，总共7000张火影忍者的图像作为第二图像集。Specifically, the multiple comic images in the second image set may be images from an anime, such as Naruto. For example, the first 700 episodes of the anime Naruto are selected and 10 images are randomly taken from each episode, giving a total of 7000 Naruto images as the second image set.
S102、根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。S102: Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
具体地，预设漫画生成算法采用图像处理算法对第一图像集中的拍摄图像进行预处理，以提取拍摄图像中的图像信息，比如层级结构图像、边缘图像、人脸特征或者发型特征等等，并根据这些图像信息构成所述拍摄图像对应的目标漫画图像。由此可以消除拍摄图像和动漫图像(火影忍者图像)之间的差异性过高的问题，降低图像生成模型训练需要处理的数据维度，便于模型的训练，同时又提高了模型的准确度。Specifically, the preset comic generation algorithm uses image processing algorithms to preprocess the captured images in the first image set so as to extract image information from the captured images, such as hierarchical structure images, edge images, facial features or hairstyle features, and the target comic image corresponding to the captured image is then constructed from this image information. This alleviates the problem of an excessive gap between the captured images and the animation images (Naruto images), reduces the data dimensions that the image generation model training needs to process, facilitates model training, and at the same time improves the accuracy of the model.
在一实施例中,为了提高模型的训练速度以及模型的准确度,提供了对所述第一图像集中的拍摄图像进行预处理的步骤,如图2所示,即步骤S102包括:子步骤S102a至S102c。In an embodiment, in order to improve the training speed of the model and the accuracy of the model, a step of preprocessing the captured images in the first image set is provided, as shown in FIG. 2, that is, step S102 includes: sub-step S102a To S102c.
S102a、根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像。S102a: Perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
具体地,使用均值漂移(Mean-shift)算法对拍摄图像进行图像分割以及对图像进行层级处理,通过不断迭代将图像中的相似颜色统一,以得到均有层级结构的层级图像。Specifically, a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
其中，均值漂移(Mean-shift)算法属于核密度估计的爬山算法，不需要任何先验知识而完全依靠特征空间中样本点计算其密度函数值。通常的直方图法是将图像分成若干个相等的区间，每个区间内的数据与总数据量的比值为这个区间的概率值；Mean-shift算法的原理类似于直方图法，多了一个用于平滑数据的核函数。采用核函数估计法，在图像数据充分的情况下，能够渐进地收敛于任意的密度函数，即可以对服从任何分布的数据进行密度估计。这样的方法可以用于聚类、图像分割、跟踪等很多领域，并且在去除图像颜色、纹理等细节信息方面有着很好的作用。在本实施例中，主要采用Mean-shift算法用于图像分割以得到层级图像。The Mean-shift algorithm is a hill-climbing algorithm based on kernel density estimation; it requires no prior knowledge and relies entirely on the sample points in the feature space to compute the density function value. The usual histogram method divides the image into several equal intervals, and the ratio of the data in each interval to the total amount of data is the probability value of that interval; the principle of the Mean-shift algorithm is similar to the histogram method, except that it adds a kernel function for smoothing the data. With the kernel function estimation method, when the image data is sufficient, the estimate converges asymptotically to an arbitrary density function, that is, density estimation can be performed on data obeying any distribution. Such a method can be used in many fields such as clustering, image segmentation, and tracking, and works well for removing detailed information such as image color and texture. In this embodiment, the Mean-shift algorithm is mainly used for image segmentation to obtain the hierarchical image.
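As a non-limiting illustration that is not part of the original disclosure, the mean-shift flattening step described above could be sketched as follows; the use of OpenCV's pyrMeanShiftFiltering and the parameter values are assumptions made only for this example.

```python
# Illustrative sketch only: approximates the mean-shift colour flattening
# step with OpenCV's pyrMeanShiftFiltering; the radii and pyramid level are
# assumed values, not taken from the original disclosure.
import cv2

def mean_shift_levels(bgr_image, spatial_radius=15, color_radius=30, max_level=2):
    """Return a colour-flattened version of the input image (the hierarchical image)."""
    # Iteratively merges similar colours, producing the flat colour regions
    # used here as the hierarchical image.
    return cv2.pyrMeanShiftFiltering(bgr_image, spatial_radius, color_radius,
                                     maxLevel=max_level)
```

A call such as mean_shift_levels(cv2.imread("photo.jpg")) would return the hierarchical image of step S102a; the file name is hypothetical.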
S102b、根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像。S102b. Process the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
具体地,基于流的高斯差分滤波器(Flow-Based Difference of Gaussian、FDoG)算法对所述拍摄图像进行边缘提取,以提取出所述拍摄图像对应的边缘图像。Specifically, a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
其中,所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像,具体包括:根据切线流公式,在所述拍摄图像中构建切线流;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Wherein, the processing the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with an edge contour line specifically includes: constructing a tangent flow in the captured image according to a tangent flow formula; The binary image boundary calculation formula is to calculate the Gaussian difference of the constructed tangent stream to obtain an edge image with edge contours.
在一个实施例中，所述切线流公式为：In an embodiment, the tangent flow formula is:

$$t^{new}(\mathbf{x})=\frac{1}{k}\sum_{\mathbf{y}\in\Omega(\mathbf{x})}\phi(\mathbf{x},\mathbf{y})\,t(\mathbf{y})\,w_s(\mathbf{x},\mathbf{y})\,w_m(\mathbf{x},\mathbf{y})\,w_d(\mathbf{x},\mathbf{y})\qquad(1)$$

公式(1)中，Ω(x)表示X的邻域，X=(x,y)表示所述拍摄图像的像素点；k是归一化向量；t(y)表示y点处的当前归一化切线向量；φ(x,y)为符号函数，φ(x,y)∈{1,-1}；w_s(x,y)为空间权重向量；w_m(x,y)为量级权重函数；w_d(x,y)为方向权重函数；初始时，t_0(x)设为与图像梯度向量正交的向量。In formula (1), Ω(x) denotes the neighborhood of X, where X=(x,y) is a pixel of the captured image; k is the normalization vector; t(y) is the current normalized tangent vector at point y; φ(x,y) is a sign function with φ(x,y)∈{1,-1}; w_s(x,y) is the spatial weight vector; w_m(x,y) is the magnitude weight function; w_d(x,y) is the direction weight function; initially, t_0(x) is set to a vector orthogonal to the image gradient vector.
在一个实施例中,所述类二值图像边界计算公式为:In an embodiment, the formula for calculating the boundary of the class binary image is:
公式(2)中，D(x)表示二值图像边界，H(x)为所述基于流的高斯差分滤波器算法的滤波器函数；λ为系数因子，λ取值范围为(0,1)；τ取值为0.5。类二值图像边界计算公式，可以使得边缘图像变得清晰、光滑和连贯，进而提高图像生成模型的准确度。In formula (2), D(x) denotes the binary image boundary, and H(x) is the filter function of the flow-based difference-of-Gaussian filter algorithm; λ is a coefficient factor whose value range is (0,1); τ takes the value 0.5. The class-binary image boundary calculation formula makes the edge image clear, smooth and coherent, thereby improving the accuracy of the image generation model.
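For illustration only, and not as the patented implementation, the thresholding described above can be approximated with an isotropic difference of Gaussians; a full FDoG would instead filter along the edge tangent flow of formula (1). The sigma values and the placement of the coefficient λ inside the filter response are assumptions.

```python
# Simplified stand-in for the flow-based DoG step: an isotropic difference of
# Gaussians followed by the tanh thresholding with tau = 0.5 mentioned above.
# The full FDoG filters along the edge tangent flow; the sigma values and the
# use of lambda as the DoG coefficient are assumptions for this sketch.
import cv2
import numpy as np

def dog_edges(gray, sigma_c=1.0, sigma_s=1.6, lam=0.99, tau=0.5):
    gray = gray.astype(np.float32) / 255.0
    g_c = cv2.GaussianBlur(gray, (0, 0), sigma_c)
    g_s = cv2.GaussianBlur(gray, (0, 0), sigma_s)
    h = g_c - lam * g_s                                  # filter response H(x)
    edges = np.ones_like(h)                              # 1 = background
    edges[(h < 0) & (1.0 + np.tanh(h) < tau)] = 0.0      # 0 = edge line pixel
    return edges
```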
S102c、将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。S102c. Perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
具体地,将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的具体层级结构和边缘特征的图像,即目标漫画图像。将该目标漫画图像用于图像生成模型训练,可以降低模型训练需要处理的数据维度,同时又提高了模型的训练速度以及模型的准确度。Specifically, image synthesis is performed on the hierarchical image and the edge image to obtain an image of a specific hierarchical structure and edge feature corresponding to the captured image, that is, a target comic image. Using the target comic image for image generation model training can reduce the data dimension that needs to be processed for model training, and at the same time improve the training speed of the model and the accuracy of the model.
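As a sketch only, the synthesis of the hierarchical image and the edge image can be done by darkening the flattened image wherever an edge pixel was detected; the multiplicative blending is an assumption, since the text does not fix a particular compositing rule.

```python
# Minimal synthesis sketch: overlay the dark edge lines on the colour-
# flattened hierarchical image by multiplication. The blending rule is an
# assumption; the description does not prescribe a compositing formula.
import numpy as np

def compose_target(levels_bgr, edge_mask):
    """levels_bgr: HxWx3 uint8 flattened image; edge_mask: HxW array in {0.0, 1.0}."""
    out = levels_bgr.astype(np.float32) * edge_mask[..., None]
    return out.astype(np.uint8)
```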
S103、获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。S103. Obtain a preset generative confrontation network, where the generative confrontation network includes a generation network and a discrimination network.
具体地，获取预选设置的生成式对抗网络(Generative Adversarial Networks、GAN)，该生成式对抗网络包括生成网络和判别网络，生成网络用于利用拍摄图像生成漫画图像，判别网络用于判别生成网络输出的图像是否为漫画图像。Specifically, a preset Generative Adversarial Network (GAN) is obtained. The generative adversarial network includes a generation network and a discrimination network: the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
其中，该生成式对抗网络可以是各种类型的对抗网络。比如，可以是深度卷积生成对抗网络(Deep Convolutional Generative Adversarial Network、DCGAN)。再比如，生成网络可以是用于进行图像处理的卷积神经网络(例如，包含卷积层、池化层、反池化层、反卷积层的各种卷积神经网络结构，可以依次进行降采样和上采样)；判别网络可以是卷积神经网络(例如，包含全连接层的各种卷积神经网络结构，其中，全连接层可以实现分类功能)。The generative adversarial network can be of various types. For example, it can be a Deep Convolutional Generative Adversarial Network (DCGAN). As another example, the generation network can be a convolutional neural network for image processing (for example, various convolutional neural network structures containing convolutional layers, pooling layers, unpooling layers, and deconvolutional layers, which can perform down-sampling and then up-sampling in sequence); the discrimination network can be a convolutional neural network (for example, various convolutional neural network structures containing a fully connected layer, where the fully connected layer can implement the classification function).
S104、将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练。S104. Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
具体地,进行交替迭代训练包括两个训练过程,分别为:训练生成网络和训练判别网络。Specifically, performing alternating iterative training includes two training processes, namely: training a generation network and training a discriminant network.
其中，训练生成网络，包括：向生成网络输入拍摄图像，经过一次卷积、批归一化(BN)和激活函数(Relu)激活后，再进行了Down-convolution有卷积、批归一化(BN)和激活函数(Relu)激活操作，如此进行了两次训练，再通过8个一样的Residual block操作，再进行了两次Up-convolution有卷积、卷积、批归一化(BN)和激活函数(Relu)激活操作，最后再经过一次卷积操作，输出一张与输入的拍摄图像具有相同大小的图像。其中激活函数用的是ReLU函数。Training the generation network includes: inputting a captured image into the generation network, passing it through one convolution followed by batch normalization (BN) and a ReLU activation; then through a down-convolution block consisting of convolution, batch normalization (BN) and ReLU activation operations, which is performed twice; then through 8 identical residual block operations; then through two up-convolution blocks, each with convolution, convolution, batch normalization (BN) and ReLU activation operations; and finally through one more convolution operation, outputting an image with the same size as the input captured image. The activation function used here is the ReLU function.
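A minimal PyTorch sketch of a generator with the layer pattern described above (an initial convolution, two down-convolution blocks, eight identical residual blocks, two up-convolution blocks and a final convolution) is given below. The channel counts, kernel sizes and strides are assumptions, since the description does not specify them.

```python
# Illustrative generator sketch only; widths, kernels and strides are assumed.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 1, 1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, 1, 3), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # two down-convolution blocks
            nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, 2, 1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            # eight identical residual blocks
            *[ResidualBlock(256) for _ in range(8)],
            # two up-convolution blocks
            nn.ConvTranspose2d(256, 128, 3, 2, 1, output_padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, 2, 1, output_padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 7, 1, 3))   # output has the same spatial size as the input
    def forward(self, x):
        return self.net(x)
```

With the stride-2 down-convolutions mirrored by stride-2 transposed convolutions, the output keeps the spatial size of the input image, as required in the paragraph above.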
其中，训练判别网络，包括：向所述判别网络输入生成网络输出的图像和漫画图像，经过多次卷积、批归一化(BN)以及激活函数(LReLU)激活后，再经过Sigmoid函数处理后的输出是第二图像集中的漫画图像(火影忍者图像)的一个概率值，其中激活函数用的是LReLU函数。判别网络作为生成网络的补充，用于判断输入图像(生成网络的输出图像)是否是第二图像集中的火影忍者图像。Training the discrimination network includes: inputting the image output by the generation network and the comic images into the discrimination network; after multiple convolutions, batch normalization (BN) and LReLU activations, the result is processed by a Sigmoid function, and the output is a probability value that the input is a comic image (Naruto image) from the second image set, where the activation function used is the LReLU function. As a complement to the generation network, the discrimination network is used to judge whether the input image (the output image of the generation network) is a Naruto image from the second image set.
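A corresponding sketch of the discrimination network (stacked convolutions with batch normalization and LReLU activations, ending in a Sigmoid probability) follows; the layer widths and the global average before the final score are assumptions.

```python
# Illustrative discriminator sketch only; layer widths are assumed.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 3, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 3, 1, 1))
    def forward(self, x):
        score = self.features(x).mean(dim=[2, 3])   # one value per input image
        return torch.sigmoid(score)                  # probability of being a comic image
```

Averaging the final map before the Sigmoid yields one probability per input image, matching the single probability value described above.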
通过交替训练两个网络结构，先优化判别网络模型，一开始很容易区分输入的是否是第二图像集中的漫画图像(火影图像)，即生成网络一开始生成的图像与第二图像集中的火影图像具有很大的偏差。接着优化生成网络使得生成网络模型的损失函数慢慢减小，同时也提高判别网络模型的二分类的能力，最后的迭代直至判别网络模型无法判别输入的是第二图像集中的火影图像，还是生成网络模型生成的火影图像，这时整个生成网络模型已经训练好，此时通过生成网络模型生成的图像就是具有了动漫火影风格的图像。The two network structures are trained alternately. The discriminant network model is optimized first; at the beginning it can easily distinguish whether the input is a comic image (Naruto image) from the second image set, because the images initially produced by the generation network deviate greatly from the Naruto images in the second image set. The generation network is then optimized so that the loss function of the generation network model gradually decreases, while the binary classification ability of the discriminant network model is also improved. The iterations continue until the discriminant network model can no longer tell whether the input is a Naruto image from the second image set or a Naruto image produced by the generation network model. At this point the entire generation network model has been trained, and the images it generates have the anime Naruto style.
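The alternating optimisation can be sketched as below, assuming binary cross-entropy losses and externally supplied optimisers and data batches; none of these choices are fixed by the description.

```python
# Minimal sketch of the alternating training scheme: one update of the
# discrimination network on real comic images and generated images, then one
# update of the generation network. Optimisers and batches are assumed inputs.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(G, D, opt_g, opt_d, target_comic, comic):
    # 1) train the discrimination network: real comics -> 1, generated -> 0
    opt_d.zero_grad()
    d_real = D(comic)
    d_fake = D(G(target_comic).detach())
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    loss_d.backward()
    opt_d.step()

    # 2) train the generation network so that its output is judged as "comic"
    opt_g.zero_grad()
    d_gen = D(G(target_comic))
    loss_g = bce(d_gen, torch.ones_like(d_gen))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```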
S105、当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。S105. When the discriminant probability value output by the discriminant network is greater than the preset value, save the trained generation network as an image generation model, and the image generation model is used to generate an image with a comic style.
具体地，通过设置预设值的方式，比如通过判别网络模型输出的概率值大于该预设值时，来确定判别网络模型的二分类的能力，进而确保生成网络模型生成的图像，具有动漫火影风格的图像。其中，所述预设值的大小在此不做限定，可根据专家经验进行设定。当所述判别网络输出的判别概率值大于预设值时，则表明该生成网络模型可以用来生成具有漫画风格的图像，因此保存此时的生成网络作为漫画风格图像生成模型。Specifically, a preset value is set, and the binary classification ability of the discriminant network model is assessed, for example, by checking whether the probability value output by the discriminant network model is greater than this preset value, thereby ensuring that the images generated by the generation network model have the anime Naruto style. The size of the preset value is not limited here and can be set according to expert experience. When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network model can be used to generate comic-style images, so the generation network at this moment is saved as the comic-style image generation model.
上述实施例提供的训练方法先根据预设漫画生成算法对第一图像集中的拍摄图像进行预处理以得到拍摄图像对应的目标漫画图像；然后将目标漫画图像作为生成式对抗网络中生成网络的输入，以及将生成网络输出的图像和第二图像集中与拍摄图像相关的漫画图像作为生成式对抗网络中判别网络的输入，从而对生成网络和判别网络进行交替迭代训练，直至判别网络输出的判别概率值大于预设值，此时得到的训练后的生成网络将作为图像生成模型。该训练方法不但可以训练出将拍摄的图像转换成具有漫画风格的图像的模型，还可以提高训练模型的效率。The training method provided in the foregoing embodiment first preprocesses the captured images in the first image set according to the preset comic generation algorithm to obtain the target comic image corresponding to each captured image; the target comic image is then used as the input of the generation network in the generative adversarial network, and the image output by the generation network together with the comic images in the second image set related to the captured images is used as the input of the discrimination network, so that the generation network and the discrimination network are trained alternately and iteratively until the discriminant probability value output by the discrimination network is greater than the preset value, at which point the trained generation network is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improves the efficiency of training the model.
请参阅图3,图3是本申请的实施例提供的另一种图像生成模型训练方法的示意流程图。该图像生成模型基于生成式对抗网络进行模型训练得到的,当然也可以采用其他类似网络进行训练得到。Please refer to FIG. 3, which is a schematic flowchart of another image generation model training method provided by an embodiment of the present application. The image generation model is obtained by model training based on a generative confrontation network. Of course, it can also be obtained by training with other similar networks.
如图3所示,该图像生成模型训练方法,包括:步骤S201至步骤S208。As shown in FIG. 3, the image generation model training method includes: step S201 to step S208.
S201、获取多张拍摄图像和多张漫画图像。S201. Acquire multiple photographed images and multiple comic images.
S202、分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像。S202: Perform cutting processing on the photographed image and the cartoon image respectively to obtain the photographed image and the cartoon image after cutting.
其中，分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像，以确定剪切后的拍摄图像和漫画图像的图像大小均相同，比如均剪切为256×256尺寸的图像，当然也可以剪切为其他尺寸。The captured images and the comic images are each cut to obtain cut captured images and comic images, ensuring that the cut captured images and comic images all have the same image size, for example, all cut to 256×256 images; of course, other sizes are also possible.
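For illustration, a centre crop to 256×256 with Pillow is one way to perform this step; the choice of a centre crop and the use of Pillow are assumptions, not part of the original disclosure.

```python
# Illustrative cutting step: centre-crop the image to a square and resize it
# to 256x256 with Pillow. The centre-crop strategy is an assumption; the
# description only requires that all images end up the same size.
from PIL import Image

def cut_to_256(path):
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    return img.crop((left, top, left + side, top + side)).resize((256, 256))
```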
S203、根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。S203: Construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
具体地,将剪切后的拍摄图像构建第一图像集,以及将剪切后的漫画图像构建第二图像集,以使得第一图像集和第二图像集中的图像大小均相同。Specifically, the cut shot images are constructed into a first image set, and the cut cartoon images are constructed into a second image set, so that the sizes of the images in the first image set and the second image set are the same.
S204、获取第一图像集和第二图像集。S204. Acquire a first image set and a second image set.
其中,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。需要说明的是,所述第一图像集中的图像数量和第二图像集中的图像数量可以相同,也可以不相同。Wherein, the first image set includes multiple photographed images, and the second image set includes multiple cartoon images. It should be noted that the number of images in the first image set and the number of images in the second image set may be the same or different.
S205、根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。S205: Preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
具体地,根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。Specifically, perform image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; perform processing on the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contours ; Image synthesis of the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
S206、获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。S206. Obtain a preset generative countermeasure network, where the generative countermeasure network includes a generative network and a discriminant network.
具体地，获取预选设置的生成式对抗网络(Generative Adversarial Networks、GAN)，该生成式对抗网络包括生成网络和判别网络，生成网络用于利用拍摄图像生成漫画图像，判别网络用于判别生成网络输出的图像是否为漫画图像。Specifically, a preset Generative Adversarial Network (GAN) is obtained. The generative adversarial network includes a generation network and a discrimination network: the generation network is used to generate comic images from captured images, and the discrimination network is used to determine whether the image output by the generation network is a comic image.
S207、将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练。S207. Use the target comic image as the input of the generation network and use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
其中，进行交替迭代训练包括两个训练过程，分别为：训练生成网络和训练判别网络。具体地，将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入，对所述生成网络和判别网络进行交替迭代训练，最后的迭代直至判别网络模型无法判别输入的是第二图像集中的火影图像，还是生成网络模型生成的火影图像，这时整个生成网络模型已经训练好。The alternate iterative training includes two training processes, namely training the generation network and training the discrimination network. Specifically, the target comic image is used as the input of the generation network, and the image output by the generation network together with the comic image is used as the input of the discrimination network; the generation network and the discrimination network are trained alternately and iteratively until the discriminant network model can no longer tell whether the input is a Naruto image from the second image set or a Naruto image generated by the generation network model, at which point the entire generation network model has been trained.
S208、当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。S208: When the discriminant probability value output by the discriminant network is greater than the preset value, save the trained generation network as an image generation model, and the image generation model is used to generate an image with a comic style.
具体地，通过设置预设值的方式，比如通过判别网络模型输出的概率值大于该预设值时，来确定判别网络模型的二分类的能力，进而确保生成网络模型生成的图像，具有动漫火影风格的图像。Specifically, a preset value is set, and the binary classification ability of the discriminant network model is assessed, for example, by checking whether the probability value output by the discriminant network model is greater than this preset value, thereby ensuring that the images generated by the generation network model have the anime Naruto style.
当所述判别网络输出的判别概率值大于预设值时,则表明该生成网络模型可以用来生成具有漫画风格的图像,因此保存此时的生成网络作为漫画风格图像生成模型。When the discriminant probability value output by the discriminant network is greater than the preset value, it indicates that the generation network model can be used to generate comic-style images, so the generation network at this time is saved as a comic-style image generation model.
上述实施例提供的训练方法先构建第一图像集和第二图像集，再根据预设漫画生成算法对第一图像集中的拍摄图像进行预处理以得到拍摄图像对应的目标漫画图像；然后将目标漫画图像作为生成式对抗网络中生成网络的输入，以及将生成网络输出的图像和第二图像集中与拍摄图像相关的漫画图像作为生成式对抗网络中判别网络的输入，从而对生成网络和判别网络进行交替迭代训练，直至判别网络输出的判别概率值大于预设值，此时得到的训练后的生成网络将作为图像生成模型。该训练方法不但可以训练出将拍摄的图像转换成具有漫画风格的图像的模型，还可以提高训练模型的效率。The training method provided in the foregoing embodiment first constructs the first image set and the second image set, and then preprocesses the captured images in the first image set according to the preset comic generation algorithm to obtain the target comic image corresponding to each captured image; the target comic image is then used as the input of the generation network in the generative adversarial network, and the image output by the generation network together with the comic images in the second image set related to the captured images is used as the input of the discrimination network, so that the generation network and the discrimination network are trained alternately and iteratively until the discriminant probability value output by the discrimination network is greater than the preset value, at which point the trained generation network is used as the image generation model. This training method can not only train a model that converts captured images into comic-style images, but also improves the efficiency of training the model.
请参阅图4,图4是本申请的实施例提供的一种图像生成方法的示意流程图。该图像生成方法可以应用终端或服务器中,根据拍摄图像利用上述训练的图像生成模型生成具有漫画风格的图像。Please refer to FIG. 4, which is a schematic flowchart of an image generation method provided by an embodiment of the present application. The image generation method can be applied to a terminal or a server to generate a comic-style image based on the captured image using the above-trained image generation model.
在本实施例中，以图像生成方法应用在终端(手机)为例进行介绍，具体如图5所示，图5为本申请提供的图像生成方法的应用场景示意图。服务器采用上述实施例提供的任一项图像生成模型训练方法训练出图像生成模型，并将图像生成模型发送至终端中，终端接收服务器发送的图像生成模型并保存，该终端可运行图像生成方法根据拍摄图像利用该图像生成模型生成具有漫画风格的图像。In this embodiment, the image generation method applied to a terminal (mobile phone) is taken as an example, as shown in FIG. 5, which is a schematic diagram of an application scenario of the image generation method provided by this application. The server trains an image generation model using any of the image generation model training methods provided in the foregoing embodiments and sends the image generation model to the terminal; the terminal receives and saves the image generation model sent by the server, and can then run the image generation method to generate a comic-style image from a captured image using the image generation model.
例如，在一个实施例中，终端用于执行：获取待处理图像，所述待处理图像为拍摄图像；将所述待处理图像输入至图像生成模型以生成对应的漫画图像，其中，所述图像生成模型为采用上述任一项所述的图像生成模型训练方法训练得到的模型。进而将用户在终端中选择的待处理图像(比如刚拍摄的图像或磁盘中存储的图像)转换成具有漫画风格图像，以提高用户的体验。For example, in one embodiment, the terminal is configured to: acquire an image to be processed, the image to be processed being a captured image; and input the image to be processed into an image generation model to generate a corresponding comic image, where the image generation model is a model trained by any of the image generation model training methods described above. The image to be processed selected by the user on the terminal (for example, an image just captured or an image stored on disk) is thereby converted into a comic-style image, improving the user experience.
以下将结合图4和图5,对本实施例提供的图像生成方法进行详细介绍,如图4所示,该图像生成方法,包括:步骤S301至步骤S305。The image generation method provided in this embodiment will be described in detail below in conjunction with FIG. 4 and FIG. 5. As shown in FIG. 4, the image generation method includes: step S301 to step S305.
S301、获取待处理图像,所述待处理图像为拍摄图像。S301. Acquire an image to be processed, where the image to be processed is a captured image.
具体地，该待处理图像可以为用户刚拍摄的图片，或者是用户在图库中选择的图片，比如用户用手机拍摄的图片或者从之前拍摄图片中选择一张图片，想将其转换为具有卡通风格的漫画图像，可以将该图片发送至保存有漫画风格图像生成模型的服务器，由服务器将该待处理图像输入至漫画风格图像生成模型以生成对应的漫画图像，并将生成的漫画图像发送给用户。Specifically, the image to be processed may be a picture just taken by the user, or a picture selected by the user from the gallery. For example, when the user takes a picture with a mobile phone or selects one of the previously taken pictures and wants to convert it into a cartoon-style comic image, the picture can be sent to a server that stores the comic-style image generation model; the server inputs the image to be processed into the comic-style image generation model to generate the corresponding comic image, and sends the generated comic image back to the user.
在一个实施例中,还提供另一种图像生成方法,该图像生成方法还可以将获取的待处理图像作为目标图像,并执行步骤S305。In an embodiment, another image generation method is also provided. The image generation method may also use the acquired image to be processed as the target image, and execute step S305.
S302、根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像。S302. Perform image segmentation processing on the image to be processed according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
具体地,使用均值漂移(Mean-shift)算法对拍摄图像进行图像分割以及对图像进行层级处理,通过不断迭代将图像中的相似颜色统一,以得到均有层级结构的层级图像。Specifically, a Mean-shift algorithm is used to segment the captured images and perform hierarchical processing on the images, and the similar colors in the images are unified through continuous iteration to obtain hierarchical images with a hierarchical structure.
S303、根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像。S303: Process the image to be processed according to the Gaussian difference filter algorithm based on the stream to generate an edge image with edge contour lines.
具体地,基于流的高斯差分滤波器(Flow-Based Difference of Gaussian、FDoG)算法对所述拍摄图像进行边缘提取,以提取出所述拍摄图像对应的边缘图像。Specifically, a flow-based difference of Gaussian (FDoG) algorithm performs edge extraction on the captured image to extract an edge image corresponding to the captured image.
S304、将所述层级图像和所述边缘图像进行图像合成以得到目标图像。S304. Perform image synthesis on the hierarchical image and the edge image to obtain a target image.
具体地,将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的具体层级结构和边缘特征的图像,即目标图像。将该目标图像输入至图像生成模型以生成具有漫画风格的图像,可以提高生成图像的速度。Specifically, image synthesis is performed on the hierarchical image and the edge image to obtain an image of a specific hierarchical structure and edge feature corresponding to the captured image, that is, a target image. Inputting the target image to the image generation model to generate a comic-style image can increase the speed of image generation.
S305、将所述目标图像输入至图像生成模型以生成对应的漫画图像。S305. Input the target image to an image generation model to generate a corresponding comic image.
其中，所述图像生成模型为采用上述实施例提供的任一项所述的图像生成模型训练方法训练得到的模型。将所述目标图像输入至图像生成模型以生成对应的漫画图像，如图5所示，将根据层级图像和边缘图像合成的目标图像输入至模型，该模型为图像生成模型，利用该图像生成模型生成具有漫画风格的图像，如图5中的终端显示的图像，由此提高了用户的体验。The image generation model is a model trained by any of the image generation model training methods provided in the foregoing embodiments. The target image is input into the image generation model to generate the corresponding comic image; as shown in FIG. 5, the target image synthesized from the hierarchical image and the edge image is input into the model, which is the image generation model, and the image generation model is used to generate a comic-style image, such as the image displayed by the terminal in FIG. 5, thereby improving the user experience.
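Tying the earlier sketches together, an end-to-end inference helper could look like the following; the helper names refer to the illustrative functions sketched above and are therefore also hypothetical, and the trained generator is assumed to be supplied by the caller.

```python
# End-to-end sketch: build the target image from a photo and run it through a
# trained generator. mean_shift_levels, dog_edges and compose_target are the
# illustrative helpers defined earlier, not the original implementation.
import cv2
import torch

def photo_to_comic(path, generator):
    bgr = cv2.imread(path)
    levels = mean_shift_levels(bgr)                           # hierarchical image
    edges = dog_edges(cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY))  # edge image
    target = compose_target(levels, edges)                    # synthesised target image
    x = torch.from_numpy(target[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        comic = generator(x.unsqueeze(0))                     # comic-style output
    return comic.squeeze(0)
```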
请参阅图6,图6是本申请的实施例提供的一种图像生成模型训练装置的示意性框图,该图像生成模型训练装置可以配置于服务器中,用于执行前述的图像生成模型训练方法。Please refer to FIG. 6. FIG. 6 is a schematic block diagram of an image generation model training device provided by an embodiment of the present application. The image generation model training device may be configured in a server to execute the aforementioned image generation model training method.
如图6所示,该图像生成模型训练装置400,包括:拍摄获取单元401、剪切处理单元402、图集构建单元403、数据获取单元404、预处理单元405、网络获取单元406、模型训练单元407和模型保存单元408。As shown in FIG. 6, the image generation model training device 400 includes: a photographing acquisition unit 401, a cutting processing unit 402, an atlas construction unit 403, a data acquisition unit 404, a preprocessing unit 405, a network acquisition unit 406, and model training Unit 407 and model saving unit 408.
拍摄获取单元401,用于获取多张拍摄图像和多张漫画图像。The photographing and acquiring unit 401 is configured to acquire multiple photographed images and multiple cartoon images.
剪切处理单元402,用于分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同。The cropping processing unit 402 is configured to perform cropping processing on the captured image and the cartoon image to obtain a cropped captured image and a cartoon image, wherein the cropped captured image and the cartoon image have the same image size.
图集构建单元403,用于根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The atlas construction unit 403 is configured to construct a first image set based on the cut shot images, and construct a second image set based on the cut cartoon images.
数据获取单元404,用于获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像。The data acquisition unit 404 is configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images.
预处理单元405,用于根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像。The preprocessing unit 405 is configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image.
在一个实施例中,如图7所示,预处理单元405包括:层级处理子单元4051、边缘处理子单元4052和图像合成子单元4053。In an embodiment, as shown in FIG. 7, the preprocessing unit 405 includes: a hierarchical processing subunit 4051, an edge processing subunit 4052 and an image synthesis subunit 4053.
层级处理子单元4051,用于根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;边缘处理子单元4052,用于根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;图像合成子单元4053,用于将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。The hierarchical processing subunit 4051 is configured to perform image segmentation processing on the captured image according to the mean shift algorithm to obtain a hierarchical image with a hierarchical structure; The captured image is processed to generate an edge image with an edge contour line; an image synthesis subunit 4053 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
网络获取单元406,用于获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络。The network obtaining unit 406 is configured to obtain a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network.
模型训练单元407，用于将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入，对所述生成网络和判别网络进行交替迭代训练。The model training unit 407 is configured to use the target comic image as the input of the generation network, use the image output by the generation network and the comic image as the input of the discrimination network, and perform alternate iterative training on the generation network and the discrimination network.
模型保存单元408,用于当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像。The model saving unit 408 is configured to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than a preset value. The image generation model is used to generate an image with a comic style.
请参阅图8,图8是本申请的实施例提供一种图像生成装置的示意性框图,该图像生成装置用于执行前述的图像生成方法。其中,该图像生成装置可以配置于服务器或终端中。Please refer to FIG. 8. FIG. 8 is a schematic block diagram of an image generation device provided by an embodiment of the present application, and the image generation device is used to execute the aforementioned image generation method. Wherein, the image generating device can be configured in a server or a terminal.
如图8所示,该图像生成装置500,包括:图像获取单元501、分割处理单元502、边缘处理单元503、图像合成单元504和图像生成单元505。As shown in FIG. 8, the image generation device 500 includes: an image acquisition unit 501, a segmentation processing unit 502, an edge processing unit 503, an image synthesis unit 504, and an image generation unit 505.
图像获取单元501,用于获取待处理图像,所述待处理图像为拍摄图像。The image acquisition unit 501 is configured to acquire an image to be processed, and the image to be processed is a captured image.
在一个实施例中,还可以将获取的待处理图像作为目标图像,并调用图像生成单元505。In an embodiment, the acquired image to be processed may also be used as the target image, and the image generating unit 505 may be called.
分割处理单元502,用于根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像。The segmentation processing unit 502 is configured to perform image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure.
边缘处理单元503,用于根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像。The edge processing unit 503 is configured to process the to-be-processed image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contour lines.
图像合成单元504,用于将所述层级图像和所述边缘图像进行图像合成以得到目标图像。The image synthesis unit 504 is configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image.
图像生成单元505,用于将所述目标图像输入至图像生成模型以生成对应的漫画图像。其中,所述图像生成模型为采用上述的图像生成模型训练方法训练得到的模型。The image generation unit 505 is configured to input the target image into the image generation model to generate a corresponding comic image. Wherein, the image generation model is a model obtained by training using the above-mentioned image generation model training method.
需要说明的是，所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，上述描述的装置和各单元的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the device and each unit described above may refer to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
上述的装置可以实现为一种计算机程序的形式,该计算机程序可以在如图9所示的计算机设备上运行。The above-mentioned apparatus can be implemented in the form of a computer program, and the computer program can be run on the computer device as shown in FIG. 9.
请参阅图9,图9是本申请实施例提供的一种计算机设备的结构示意性框图。该计算机设备可以是服务器或终端。Please refer to FIG. 9, which is a schematic block diagram of the structure of a computer device according to an embodiment of the present application. The computer equipment can be a server or a terminal.
参阅图9,该计算机设备包括通过系统总线连接的处理器、存储器和网络接口,其中,存储器可以包括非易失性存储介质和内存储器。Referring to FIG. 9, the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
非易失性存储介质可存储操作系统和计算机程序。该计算机程序包括程序指令,该程序指令被执行时,可使得处理器执行任意一种图像生成模型训练方法或图像生成方法。The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions. When the program instructions are executed, the processor can execute any image generation model training method or image generation method.
处理器用于提供计算和控制能力,支撑整个计算机设备的运行。The processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
内存储器为非易失性存储介质中的计算机程序的运行提供环境,该计算机程序被处理器执行时,可使得处理器执行任意一种图像生成模型训练方法或图像生成方法。The internal memory provides an environment for the operation of the computer program in the non-volatile storage medium. When the computer program is executed by the processor, the processor can execute any image generation model training method or image generation method.
该网络接口用于进行网络通信，如发送分配的任务等。本领域技术人员可以理解，图9中示出的结构，仅仅是与本申请方案相关的部分结构的框图，并不构成对本申请方案所应用于其上的计算机设备的限定，具体的计算机设备可以包括比图中所示更多或更少的部件，或者组合某些部件，或者具有不同的部件布置。The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art can understand that the structure shown in FIG. 9 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
应当理解的是，处理器可以是中央处理单元(Central Processing Unit,CPU)，该处理器还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。其中，通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。It should be understood that the processor may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or it may be any conventional processor.
本申请的实施例中还提供一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序中包括程序指令，所述处理器执行所述程序指令，实现本申请实施例提供的任意一项图像生成模型训练方法或图像生成方法。The embodiments of the present application also provide a computer-readable storage medium that stores a computer program, the computer program including program instructions, and the processor executes the program instructions to implement any of the image generation model training methods or image generation methods provided by the embodiments of the present application.
其中，所述计算机可读存储介质可以是前述实施例所述的计算机设备的内部存储单元，例如所述计算机设备的硬盘或内存。所述计算机可读存储介质也可以是所述计算机设备的外部存储设备，例如所述计算机设备上配备的插接式硬盘、智能存储卡(Smart Media Card,SMC)、安全数字(Secure Digital,SD)卡、闪存卡(Flash Card)等。The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到各种等效的修改或替换，这些修改或替换都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以权利要求的保护范围为准。The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (20)
- 一种图像生成模型训练方法,包括:An image generation model training method, including:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- 根据权利要求1所述的图像生成模型训练方法,其中,所述获取第一图像集和第二图像集之前,还包括:The image generation model training method according to claim 1, wherein before said acquiring the first image set and the second image set, the method further comprises:获取多张拍摄图像和多张漫画图像;Acquire multiple photographed images and multiple comic images;分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同;Performing cutting processing on the photographed image and the comic image respectively to obtain a photographed image and a comic image after cutting, wherein the photographed image and the comic image after the cutting have the same image size;根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The first image set is constructed based on the cut shot images, and the second image set is constructed based on the cut cartoon images.
- 根据权利要求1或2所述的图像生成模型训练方法,其中,所述根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像,包括:The image generation model training method according to claim 1 or 2, wherein the preprocessing the captured image according to a preset comic generation algorithm to obtain the target comic image corresponding to the captured image comprises:根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的目标漫画图像。Image synthesis is performed on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- 根据权利要求3所述的图像生成模型训练方法,其中,所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像,包括:The image generation model training method according to claim 3, wherein the processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with an edge contour line comprises:根据切线流公式,在所述拍摄图像中构建切线流;According to the tangent flow formula, construct a tangent flow in the captured image;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Through the calculation formula of the similar binary image boundary, the Gaussian difference of the constructed tangent flow is calculated to obtain the edge image with the edge contour line.
- 根据权利要求4所述的图像生成模型训练方法,其中,所述切线流公式为:The image generation model training method according to claim 4, wherein the tangent flow formula is:其中,Ω(x)表示X的邻域,X=(x,y)表示所述拍摄图像的像素点;k是归一化向量;t(y)表示y点处的当前归一化切线向量;φ(x,y)为符号函数,φ(x,y)∈{1,-1};w s(x,y)为空间权重向量;w m(x,y)为量级权重函数;w d(x,y)为方向权重函数;初始时,t 0(x)设为与图像梯度向量正交的向量; Among them, Ω(x) represents the neighborhood of X, X=(x,y) represents the pixel of the captured image; k is the normalized vector; t(y) represents the current normalized tangent vector at point y ;Φ(x,y) is a symbolic function, φ(x,y)∈{1,-1}; w s (x,y) is a spatial weight vector; w m (x,y) is a magnitude weight function; w d (x,y) is the direction weight function; initially, t 0 (x) is set to a vector orthogonal to the image gradient vector;所述类二值图像边界计算公式为:The formula for calculating the boundary of the binary image is:其中,D(x)表示二值图像边界,H(x)为所述基于流的高斯差分滤波器算法的滤波器函数;λ为系数因子,λ取值范围为(0,1);τ取值为0.5。Among them, D(x) represents the boundary of the binary image, H(x) is the filter function of the flow-based Gaussian difference filter algorithm; λ is the coefficient factor, and the value range of λ is (0,1); τ is taken The value is 0.5.
- 一种图像生成方法,包括:An image generation method, including:获取待处理图像,所述待处理图像为拍摄图像;Acquiring an image to be processed, where the image to be processed is a captured image;根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the image to be processed according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到目标图像;Image synthesis of the hierarchical image and the edge image to obtain a target image;将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。The target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
- 一种图像生成模型训练装置,包括:An image generation model training device, including:数据获取单元,用于获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;A data acquisition unit, configured to acquire a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;预处理单元,用于根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;A preprocessing unit, configured to preprocess the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;网络获取单元,用于获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;A network acquisition unit, configured to acquire a preset generative confrontation network, the generative confrontation network including a generation network and a discrimination network;模型训练单元,用于将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;The model training unit is configured to use the target comic image as the input of the generation network, and use the image output by the generation network and the comic image as the input of the discrimination network, and perform operations on the generation network and the discrimination network Alternate iterative training;模型保存单元,用于当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。The model saving unit is used to save the trained generation network as an image generation model when the discrimination probability value output by the discrimination network is greater than the preset value. The image generation model is used to generate a comic-style image. The comic style image is a comic image generated by the image generation model based on the captured image.
- 一种图像生成装置,包括:An image generating device, including:图像获取单元,用于获取待处理图像,所述待处理图像为拍摄图像;An image acquisition unit for acquiring an image to be processed, the image to be processed is a captured image;分割处理单元,用于根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;A segmentation processing unit, configured to perform image segmentation processing on the to-be-processed image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;边缘处理单元,用于根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;An edge processing unit, configured to process the to-be-processed image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;图像合成单元,用于将所述层级图像和所述边缘图像进行图像合成以得到目标图像;An image synthesis unit, configured to perform image synthesis on the hierarchical image and the edge image to obtain a target image;图像生成单元,用于将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。An image generation unit for inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is trained using the image generation model training method according to any one of claims 1 to 5 The resulting model.
- 一种计算机设备,所述计算机设备包括存储器和处理器;A computer device including a memory and a processor;所述存储器用于存储计算机程序;The memory is used to store computer programs;所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现如下步骤:The processor is configured to execute the computer program, and when executing the computer program, implement the following steps:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- 根据权利要求9所述的计算机设备,其中,所述处理器在实现所述获取第一图像集和第二图像集之前,还实现如下步骤:The computer device according to claim 9, wherein the processor further implements the following steps before implementing the acquiring of the first image set and the second image set:获取多张拍摄图像和多张漫画图像;Acquire multiple photographed images and multiple comic images;分别对所述拍摄图像和所述漫画图像进行剪切处理以得到剪切后的拍摄图像和漫画图像,其中剪切后的拍摄图像和漫画图像的图像大小相同;Performing cutting processing on the photographed image and the comic image respectively to obtain a photographed image and a comic image after cutting, wherein the photographed image and the comic image after the cutting have the same image size;根据剪切后的拍摄图像构建第一图像集,以及根据剪切后的漫画图像构建第二图像集。The first image set is constructed based on the cut shot images, and the second image set is constructed based on the cut cartoon images.
- 根据权利要求9或10所述的计算机设备,其中,所述处理器在实现所述根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对应的目标漫画图像时,具体实现:The computer device according to claim 9 or 10, wherein the processor implements the preprocessing of the captured image according to a preset comic generation algorithm to obtain the target comic image corresponding to the captured image, specifically achieve:根据均值漂移算法对所述拍摄图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the captured image according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到所述拍摄图像对应的 目标漫画图像。Image synthesis is performed on the hierarchical image and the edge image to obtain a target comic image corresponding to the captured image.
- 根据权利要求11所述的计算机设备,其中,所述处理器在实现所述根据基于流的高斯差分滤波器算法对所述拍摄图像进行处理以生成具有边缘轮廓线的边缘图像时,具体实现:11. The computer device according to claim 11, wherein, when the processor implements the processing of the captured image according to the flow-based Gaussian difference filter algorithm to generate an edge image with edge contours, it specifically implements:根据切线流公式,在所述拍摄图像中构建切线流;According to the tangent flow formula, construct a tangent flow in the captured image;通过类二值图像边界计算公式,计算构建的切线流的高斯差值以得到具有边缘轮廓线的边缘图像。Through the calculation formula of the similar binary image boundary, the Gaussian difference of the constructed tangent flow is calculated to obtain the edge image with the edge contour line.
- 根据权利要求12所述的计算机设备,其中,所述切线流公式为:The computer device according to claim 12, wherein the tangent flow formula is:其中,Ω(x)表示X的邻域,X=(x,y)表示所述拍摄图像的像素点;k是归一化向量;t(y)表示y点处的当前归一化切线向量;φ(x,y)为符号函数,φ(x,y)∈{1,-1};w s(x,y)为空间权重向量;w m(x,y)为量级权重函数;w d(x,y)为方向权重函数;初始时,t 0(x)设为与图像梯度向量正交的向量; Among them, Ω(x) represents the neighborhood of X, X=(x,y) represents the pixel of the captured image; k is the normalized vector; t(y) represents the current normalized tangent vector at point y ;Φ(x,y) is a symbolic function, φ(x,y)∈{1,-1}; w s (x,y) is a spatial weight vector; w m (x,y) is a magnitude weight function; w d (x,y) is the direction weight function; initially, t 0 (x) is set to a vector orthogonal to the image gradient vector;所述类二值图像边界计算公式为:The formula for calculating the boundary of the binary image is:其中,D(x)表示二值图像边界,H(x)为所述基于流的高斯差分滤波器算法的滤波器函数;λ为系数因子,λ取值范围为(0,1);τ取值为0.5。Among them, D(x) represents the boundary of the binary image, H(x) is the filter function of the flow-based Gaussian difference filter algorithm; λ is the coefficient factor, and the value range of λ is (0,1); τ is taken The value is 0.5.
- 一种计算机设备,所述计算机设备包括存储器和处理器;A computer device including a memory and a processor;所述存储器用于存储计算机程序;The memory is used to store computer programs;所述处理器,用于执行所述计算机程序并在执行所述计算机程序时,实现如下步骤:The processor is configured to execute the computer program, and when executing the computer program, implement the following steps:获取待处理图像,所述待处理图像为拍摄图像;Acquiring an image to be processed, where the image to be processed is a captured image;根据均值漂移算法对所述待处理图像进行图像分割处理以得到具有层级结构的层级图像;Performing image segmentation processing on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure;根据基于流的高斯差分滤波器算法对所述待处理图像进行处理以生成具有边缘轮廓线的边缘图像;Processing the image to be processed according to a stream-based Gaussian difference filter algorithm to generate an edge image with edge contour lines;将所述层级图像和所述边缘图像进行图像合成以得到目标图像;Image synthesis of the hierarchical image and the edge image to obtain a target image;将所述目标图像输入至图像生成模型以生成对应的漫画图像,其中,所述图像生成模型为采用权利要求1至5中任一项所述的图像生成模型训练方法训练得到的模型。The target image is input to an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
- 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如下步骤:A computer-readable storage medium that stores a computer program, and when the computer program is executed by a processor, the processor implements the following steps:获取第一图像集和第二图像集,所述第一图像集包括多张拍摄图像,所述第二图像集包括多张漫画图像;Acquiring a first image set and a second image set, the first image set includes a plurality of photographed images, and the second image set includes a plurality of cartoon images;根据预设漫画生成算法对所述拍摄图像进行预处理以得到所述拍摄图像对 应的目标漫画图像;Preprocessing the captured image according to a preset comic generation algorithm to obtain a target comic image corresponding to the captured image;获取预设的生成式对抗网络,所述生成式对抗网络包括生成网络和判别网络;Acquiring a preset generative countermeasure network, the generative countermeasure network including a generative network and a discriminant network;将所述目标漫画图像作为所述生成网络的输入以及将所述生成网络输出的图像和所述漫画图像作为所述判别网络的输入,对所述生成网络和判别网络进行交替迭代训练;Using the target comic image as the input of the generating network and using the image output by the generating network and the comic image as the input of the discriminating network, and performing alternating iterative training on the generating network and the discriminating network;当所述判别网络输出的判别概率值大于预设值时,保存训练后的生成网络作为图像生成模型,所述图像生成模型用于生成具有漫画风格的图像,所述具有漫画风格的图像为所述图像生成模型根据拍摄图像生成的漫画图像。When the discriminant probability value output by the discriminant network is greater than the preset value, the trained generation network is saved as an image generation model. The image generation model is used to generate comic-style images, and the comic-style images are all The image generation model generates a comic image based on the captured image.
- The computer-readable storage medium according to claim 15, wherein, before implementing the acquiring of the first image set and the second image set, the processor further implements the following steps: acquiring a plurality of captured images and a plurality of comic images; cropping the captured images and the comic images respectively to obtain cropped captured images and cropped comic images, the cropped captured images and the cropped comic images having the same image size; and constructing the first image set from the cropped captured images and the second image set from the cropped comic images.
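As an illustration of the cropping and set-construction step, the sketch below center-crops every image in a folder to a common square size and collects the results into an image set. The directory names, the 256-pixel size, the JPEG-only glob, and the center-crop choice are assumptions, since the claim only requires that the cropped photographs and comics end up with the same image size.

```python
from pathlib import Path
import cv2

def build_image_set(src_dir, dst_dir, size=256):
    """Center-crop every image to a square of the same size and collect the results."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    image_set = []
    for path in sorted(Path(src_dir).glob("*.jpg")):
        img = cv2.imread(str(path))
        if img is None:
            continue
        h, w = img.shape[:2]
        side = min(h, w)
        top, left = (h - side) // 2, (w - side) // 2
        cropped = cv2.resize(img[top:top + side, left:left + side], (size, size))
        out_path = dst / path.name
        cv2.imwrite(str(out_path), cropped)
        image_set.append(out_path)
    return image_set

# first image set from captured photos, second from comic images (paths are assumptions)
first_set = build_image_set("data/photos", "data/photos_cropped", size=256)
second_set = build_image_set("data/comics", "data/comics_cropped", size=256)
```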
- The computer-readable storage medium according to claim 15 or 16, wherein, when implementing the preprocessing of the captured images according to the preset comic generation algorithm to obtain the target comic images corresponding to the captured images, the processor specifically implements: performing image segmentation on the captured image according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; processing the captured image according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines; and performing image synthesis on the hierarchical image and the edge image to obtain the target comic image corresponding to the captured image.
- The computer-readable storage medium according to claim 17, wherein, when implementing the processing of the captured image according to the flow-based difference-of-Gaussians filter algorithm to generate the edge image with edge contour lines, the processor specifically implements: constructing a tangent flow in the captured image according to a tangent flow formula; and calculating the difference of Gaussians of the constructed tangent flow through a class-binary image boundary calculation formula to obtain the edge image with edge contour lines.
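The two sub-steps of this claim, sampling a difference of Gaussians across the constructed tangent flow and collapsing the response into a class-binary edge map, can be sketched as follows. The sampling length, the σ values, the placement of λ inside the DoG kernel, and the tanh-style soft threshold are assumptions; the filing's exact boundary formula is not reproduced here.

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-(x ** 2) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def fdog_edges(gray, tangent, sigma_c=1.0, sigma_s=1.6, lam=0.99, tau=0.5, half_len=4):
    """Difference of Gaussians sampled across the tangent flow, then soft-thresholded
    into a class-binary edge map (1 = background, 0 = contour line)."""
    h, w = gray.shape
    response = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            tx, ty = tangent[y, x]
            nx, ny = -ty, tx                    # direction perpendicular to the flow
            acc, norm = 0.0, 0.0
            for s in range(-half_len, half_len + 1):
                xs, ys = int(round(x + nx * s)), int(round(y + ny * s))
                if 0 <= xs < w and 0 <= ys < h:
                    weight = gaussian(s, sigma_c) - lam * gaussian(s, sigma_s)  # DoG kernel
                    acc += weight * float(gray[ys, xs])
                    norm += abs(weight)
            response[y, x] = acc / (norm + 1e-8)
    # soft threshold into a class-binary boundary map
    edges = np.where(response > 0, 1.0, np.clip(1.0 + np.tanh(response), 0.0, 1.0))
    return (edges >= tau).astype(np.uint8)
```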
- The computer-readable storage medium according to claim 18, wherein the tangent flow formula is t^new(x) = (1/k) Σ_{y∈Ω(x)} φ(x,y)·t(y)·w_s(x,y)·w_m(x,y)·w_d(x,y), where Ω(x) denotes the neighborhood of x, x = (x,y) denotes a pixel of the captured image, k is the vector normalizing term, t(y) denotes the current normalized tangent vector at point y, φ(x,y) is a sign function with φ(x,y) ∈ {1, -1}, w_s(x,y) is the spatial weight function, w_m(x,y) is the magnitude weight function, and w_d(x,y) is the direction weight function, and where t_0(x) is initially set to the vector orthogonal to the image gradient vector; and the class-binary image boundary calculation formula is expressed in terms of D(x), H(x), λ and τ, where D(x) denotes the binary image boundary, H(x) is the filter function of the flow-based difference-of-Gaussians filter algorithm, λ is a coefficient factor with a value range of (0,1), and τ takes the value 0.5.
- A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the following steps: acquiring an image to be processed, the image to be processed being a captured image; performing image segmentation on the image to be processed according to a mean shift algorithm to obtain a hierarchical image with a hierarchical structure; processing the image to be processed according to a flow-based difference-of-Gaussians filter algorithm to generate an edge image with edge contour lines; performing image synthesis on the hierarchical image and the edge image to obtain a target image; and inputting the target image into an image generation model to generate a corresponding comic image, wherein the image generation model is a model trained by the image generation model training method according to any one of claims 1 to 5.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910267519.9 | 2019-04-03 | ||
CN201910267519.9A CN110097086B (en) | 2019-04-03 | 2019-04-03 | Image generation model training method, image generation method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020199478A1 (en) | 2020-10-08 |
Family
ID=67444266
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/103142 WO2020199478A1 (en) | 2019-04-03 | 2019-08-28 | Method for training image generation model, image generation method, device and apparatus, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110097086B (en) |
WO (1) | WO2020199478A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529058A (en) * | 2020-12-03 | 2021-03-19 | 北京百度网讯科技有限公司 | Image generation model training method and device and image generation method and device |
CN113989441A (en) * | 2021-11-16 | 2022-01-28 | 北京航空航天大学 | Three-dimensional cartoon model automatic generation method and system based on single face image |
CN114758029A (en) * | 2022-04-25 | 2022-07-15 | 杭州小影创新科技股份有限公司 | Cartoon special-effect image color changing method and system |
CN116862766A (en) * | 2023-06-28 | 2023-10-10 | 北京金阳普泰石油技术股份有限公司 | Intelligent mapping and iterative seamless splicing method and device based on edge generation model |
CN116912345A (en) * | 2023-07-12 | 2023-10-20 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110097086B (en) * | 2019-04-03 | 2023-07-18 | 平安科技(深圳)有限公司 | Image generation model training method, image generation method, device, equipment and storage medium |
CN110516201B (en) * | 2019-08-20 | 2023-03-28 | Oppo广东移动通信有限公司 | Image processing method, image processing device, electronic equipment and storage medium |
CN110620884B (en) * | 2019-09-19 | 2022-04-22 | 平安科技(深圳)有限公司 | Expression-driven-based virtual video synthesis method and device and storage medium |
CN111080512B (en) * | 2019-12-13 | 2023-08-15 | 咪咕动漫有限公司 | Cartoon image generation method and device, electronic equipment and storage medium |
CN113139893B (en) * | 2020-01-20 | 2023-10-03 | 北京达佳互联信息技术有限公司 | Image translation model construction method and device and image translation method and device |
CN111589156A (en) * | 2020-05-20 | 2020-08-28 | 北京字节跳动网络技术有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN111899154A (en) * | 2020-06-24 | 2020-11-06 | 广州梦映动漫网络科技有限公司 | Cartoon video generation method, cartoon generation device, cartoon generation equipment and cartoon generation medium |
CN112101204B (en) * | 2020-09-14 | 2024-01-23 | 北京百度网讯科技有限公司 | Training method, image processing method, device and equipment for generating type countermeasure network |
CN112132208B (en) * | 2020-09-18 | 2023-07-14 | 北京奇艺世纪科技有限公司 | Image conversion model generation method and device, electronic equipment and storage medium |
CN114067052A (en) * | 2021-11-16 | 2022-02-18 | 百果园技术(新加坡)有限公司 | Cartoon model construction method, device, equipment, storage medium and program product |
CN114359449A (en) * | 2022-01-13 | 2022-04-15 | 北京大橘大栗文化传媒有限公司 | Face digital asset manufacturing method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364029A (en) * | 2018-03-19 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating model |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN110097086A (en) * | 2019-04-03 | 2019-08-06 | 平安科技(深圳)有限公司 | Image generates model training method, image generating method, device, equipment and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7171042B2 (en) * | 2000-12-04 | 2007-01-30 | Intel Corporation | System and method for classification of images and videos |
KR101049928B1 (en) * | 2011-02-21 | 2011-07-15 | (주)올라웍스 | Method, terminal and computer-readable recording medium for generating panoramic images |
US10891485B2 (en) * | 2017-05-16 | 2021-01-12 | Google Llc | Image archival based on image categories |
US10268928B2 (en) * | 2017-06-07 | 2019-04-23 | Adobe Inc. | Combined structure and style network |
CN107330956B (en) * | 2017-07-03 | 2020-08-07 | 广东工业大学 | Cartoon hand drawing unsupervised coloring method and device |
CN107423701B (en) * | 2017-07-17 | 2020-09-01 | 智慧眼科技股份有限公司 | Face unsupervised feature learning method and device based on generative confrontation network |
CN108596267B (en) * | 2018-05-03 | 2020-08-28 | Oppo广东移动通信有限公司 | Image reconstruction method, terminal equipment and computer readable storage medium |
CN109087380B (en) * | 2018-08-02 | 2023-10-20 | 咪咕文化科技有限公司 | Cartoon drawing generation method, device and storage medium |
CN109376582B (en) * | 2018-09-04 | 2022-07-29 | 电子科技大学 | Interactive face cartoon method based on generation of confrontation network |
- 2019-04-03: CN application CN201910267519.9A, patent CN110097086B (en), status: Active
- 2019-08-28: WO application PCT/CN2019/103142, publication WO2020199478A1 (en), status: Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364029A (en) * | 2018-03-19 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating model |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN110097086A (en) * | 2019-04-03 | 2019-08-06 | 平安科技(深圳)有限公司 | Image generates model training method, image generating method, device, equipment and storage medium |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529058A (en) * | 2020-12-03 | 2021-03-19 | 北京百度网讯科技有限公司 | Image generation model training method and device and image generation method and device |
CN113989441A (en) * | 2021-11-16 | 2022-01-28 | 北京航空航天大学 | Three-dimensional cartoon model automatic generation method and system based on single face image |
CN113989441B (en) * | 2021-11-16 | 2024-05-24 | 北京航空航天大学 | Automatic three-dimensional cartoon model generation method and system based on single face image |
CN114758029A (en) * | 2022-04-25 | 2022-07-15 | 杭州小影创新科技股份有限公司 | Cartoon special-effect image color changing method and system |
CN116862766A (en) * | 2023-06-28 | 2023-10-10 | 北京金阳普泰石油技术股份有限公司 | Intelligent mapping and iterative seamless splicing method and device based on edge generation model |
CN116912345A (en) * | 2023-07-12 | 2023-10-20 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
CN116912345B (en) * | 2023-07-12 | 2024-04-26 | 天翼爱音乐文化科技有限公司 | Portrait cartoon processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110097086A (en) | 2019-08-06 |
CN110097086B (en) | 2023-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020199478A1 (en) | Method for training image generation model, image generation method, device and apparatus, and storage medium | |
US11481869B2 (en) | Cross-domain image translation | |
CN109493350B (en) | Portrait segmentation method and device | |
WO2020119527A1 (en) | Human action recognition method and apparatus, and terminal device and storage medium | |
CN106803055B (en) | Face identification method and device | |
CN109829448B (en) | Face recognition method, face recognition device and storage medium | |
CN111144242B (en) | Three-dimensional target detection method, device and terminal | |
US20200151849A1 (en) | Visual style transfer of images | |
WO2019011249A1 (en) | Method, apparatus, and device for determining pose of object in image, and storage medium | |
EP3204888A1 (en) | Spatial pyramid pooling networks for image processing | |
CN109583509B (en) | Data generation method and device and electronic equipment | |
CN110852349A (en) | Image processing method, detection method, related equipment and storage medium | |
WO2018082308A1 (en) | Image processing method and terminal | |
CN111666905B (en) | Model training method, pedestrian attribute identification method and related device | |
CN113112518B (en) | Feature extractor generation method and device based on spliced image and computer equipment | |
CN107784288A (en) | A kind of iteration positioning formula method for detecting human face based on deep neural network | |
CN112308866A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
WO2010043954A1 (en) | Method, apparatus and computer program product for providing pattern detection with unknown noise levels | |
CN112862807A (en) | Data processing method and device based on hair image | |
WO2019209751A1 (en) | Superpixel merging | |
TWI711004B (en) | Picture processing method and device | |
US9940718B2 (en) | Apparatus and method for extracting peak image from continuously photographed images | |
CN111639537A (en) | Face action unit identification method and device, electronic equipment and storage medium | |
CN112884884A (en) | Candidate region generation method and system | |
CN108038864B (en) | Method and system for extracting animal target image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19923585; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19923585; Country of ref document: EP; Kind code of ref document: A1 |