CN117036711A - Weak supervision semantic segmentation method based on attention adjustment - Google Patents

Weak supervision semantic segmentation method based on attention adjustment

Info

Publication number
CN117036711A
CN202311064941.7A · CN202311064941A · CN117036711A
Authority
CN
China
Prior art keywords
attention
block
class
semantic segmentation
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311064941.7A
Other languages
Chinese (zh)
Inventor
苏京峰
李军侠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202311064941.7A
Publication of CN117036711A
Legal status: Pending (current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a weak supervision semantic segmentation method based on attention adjustment, which explores the application of the Transformer to the weakly supervised semantic segmentation task. Transformer-based methods optimize the class activation map using attention, but the optimized class activation map still suffers from incomplete activation because part of the class-to-block attention is erroneous. To solve this problem, the invention provides a novel weakly supervised semantic segmentation framework in which an attention adjustment strategy is designed: the class-to-block attention is adjusted according to the block-to-block attention, and the adjusted attention can activate more of the target region. Compared with the latest methods, the proposed method achieves the best results on the PASCAL VOC 2012 and MS COCO 2014 datasets.

Description

Weak supervision semantic segmentation method based on attention adjustment
Technical Field
The invention belongs to the technical field of image segmentation, and particularly relates to a weak supervision semantic segmentation method based on attention adjustment.
Background
Semantic segmentation is one of the fundamental and challenging tasks in the computer vision field. Its goal is to classify each pixel in an image and assign it to a specific semantic class. Semantic segmentation is widely applied in many fields, such as image recognition, autonomous driving, medical image analysis, scene understanding and video analysis, and helps computers better understand the content of images, thereby enabling automated scene understanding and decision making. In recent years, thanks to the vigorous development of deep learning methods, semantic segmentation has made remarkable progress, and fully supervised semantic segmentation models in particular are widely applied with excellent performance. However, training a fully supervised semantic segmentation model usually requires large-scale pixel-level annotation, which is difficult, time-consuming and labor-intensive to obtain. To address this problem, many works have turned to weakly supervised semantic segmentation, in which the segmentation network is trained with weak labels such as bounding-box labels, point labels, scribble labels or image-level labels. Image-level labels are the most convenient to acquire and are the most widely studied form of weak supervision for semantic segmentation.
Although image-level annotations are very convenient to acquire, they do not provide sufficient location supervision: they only indicate which object classes are contained in an image, not where those objects are located. The development of class activation maps (CAMs) provides an efficient way to obtain location information from image-level labels alone. For weakly supervised semantic segmentation with image-level labels, most existing approaches follow this procedure: 1) train a convolutional neural network (CNN) with image-level labels and generate class activation maps from it to obtain seed regions; 2) expand the seed regions under certain constraints to obtain pseudo labels; 3) train a fully supervised semantic segmentation network using the pseudo labels as ground truth. However, class activation maps generated by convolutional neural networks tend to activate only local, discriminative regions while ignoring the complete object extent, resulting in incomplete activation. Research has shown that this is caused by an inherent characteristic of convolutional neural networks: the convolution operation can only capture short-range feature dependencies and cannot explore global feature relations, so the activated object region is too small, the quality of the generated pseudo labels suffers, and an ideal weakly supervised segmentation result is ultimately difficult to obtain.
At present, transform has enjoyed tremendous success in many computer vision tasks, mainly due to its own attention mechanisms. The transducer's attention mechanism can model global feature relationships and overcome the above-described drawbacks of convolutional neural networks. Some researchers have started weak-supervision semantic segmentation studies using transformers, which typically use a Transformer structure to extract image features and generate class activation graphs, and then use attention to optimize the class activation graphs to obtain a more complete class activation graph. Although the existing weak supervision semantic segmentation method based on the Transformer generally uses attention to optimize the class activation graph, the class activation graph still cannot completely activate the object region after being subjected to attention optimization due to errors between the attention middle classification generated by the Transformer and the attention between the blocks.
Disclosure of Invention
The invention aims to solve the problem that the target region cannot be completely activated in weakly supervised semantic segmentation, and provides a weak supervision semantic segmentation method based on attention adjustment.
The technical scheme adopted by the invention to achieve this purpose is as follows: a weak supervision semantic segmentation method based on attention adjustment, comprising the following steps:
step 1, data preparation: acquiring an annotated image dataset, and dividing the dataset into a training set, a validation set and a test set;
step 2, data preprocessing: applying random horizontal flipping and color jittering to the image, normalizing the image, performing random cropping, and taking the cropped image as the input of the weak supervision semantic segmentation model;
step 3, building the model: constructing the weak supervision semantic segmentation model with DeiT-S pre-trained on ImageNet as the backbone of the model;
step 4, model training: optimizing the weak supervision semantic segmentation model with an Adam optimizer using multi-label cross-entropy as the loss function, training the model on the training set for a set number of epochs, and generating class activation maps with the trained model;
step 5, assigning a class to each pixel position according to the values of the class activation map to generate pixel-level pseudo labels, and then training the semantic segmentation network DeepLab V2 with the pixel-level pseudo labels; the pictures in the validation set and the test set are input into the trained model to obtain the final segmentation maps.
Further, the model building in step 3 comprises:
step 3.1, constructing a weak supervision semantic segmentation framework based on attention adjustment, dividing the preprocessed image into N non-overlapping blocks, constructing N block tokens through linear mapping, and concatenating C class tokens with the N block tokens to obtain the input tokens of the framework;
step 3.2, feeding the input tokens into the Transformer coding layers of the framework to obtain output tokens; then extracting the last N block tokens from the output tokens to form the output block tokens Tp_out, and applying reshape and convolution operations to Tp_out to obtain the initial class activation map Original-CAM;
step 3.3, when the input tokens pass through the Transformer coding layer, the attention module computes attention over the input tokens:

Attention = Softmax(Q·K^T / √d_k)

wherein Q and K respectively represent the Query matrix and Key matrix obtained by linear projection of the input tokens in the Transformer coding layer, T represents matrix transposition, and d_k represents the scaling factor;
step 3.4, dividing the attention into class-to-block attention A_c2p and block-to-block attention A_p2p, then adjusting the class-to-block attention A_c2p according to the block-to-block attention A_p2p;
step 3.5, using the class-to-block attention A_c2p and the block-to-block attention A_p2p to optimize the initial class activation map.
Further, the class-to-block attention A_c2p and the block-to-block attention A_p2p are expressed as follows:
A_c2p = Attention[1:C, C+1:C+N]
A_p2p = Attention[C+1:C+N, C+1:C+N]
The attention between class c and block i is adjusted as follows:
firstly, all blocks are sorted in descending order of their block-to-block attention to block i, and the top p% of the sorted blocks are selected;
then, the attention between class c and the selected blocks is taken out and averaged to obtain the attention adjustment factor between class c and block i:

r(c, i) = (1/S) · Σ_{j∈U} A_c2p(c, j)

wherein r(c, i) represents the attention adjustment factor between class c and block i; c ∈ {1, 2, …, C}, where C is the total number of dataset classes; i and j denote blocks, i ∈ {1, 2, …, N}, j ∈ U, where U is the set of the top p% of blocks with the greatest attention to block i, and S is the number of blocks in U; A_c2p(c, j) represents the attention between class c and block j;
the attention adjustment factor r(c, i) is then added to the attention between class c and block i:
A_c2p(c, i) = A_c2p(c, i) + α · r(c, i)
wherein A_c2p(c, i) represents the attention between class c and block i, and α represents the attention adjustment coefficient.
Further, using the class-to-block attention A_c2p and the block-to-block attention A_p2p to optimize the initial class activation map in step 3.5 comprises:
multiplying the initial class activation map Original-CAM by the class-to-block attention to obtain a preliminarily optimized adjusted class activation map;
and then further optimizing by matrix multiplication between the block-to-block attention and the adjusted class activation map to obtain the final class activation map.
Further, the model training process in step 4 is as follows:
step 4.1, setting the hyperparameters of the weak supervision semantic segmentation model: the number of training epochs, the initial learning rate and the training batch size batch_size, with Adam as the optimizer and multi-label cross-entropy as the loss function;
step 4.2, training the weak supervision semantic segmentation model for multiple rounds, and saving the parameters of the round with the highest training mIoU value;
step 4.3, after the weak supervision semantic segmentation model is trained, loading the saved best parameters into the model, inputting the training set data into the model, and generating complete class activation maps with the trained model.
The beneficial effects are that: compared with the prior art, the technical scheme of the invention has the following beneficial technical effects:
The invention mainly solves the incomplete-activation problem of the class activation map in weakly supervised semantic segmentation. A simple and effective weakly supervised semantic segmentation framework is provided with the Transformer as the basic network structure. In this framework, an attention adjustment strategy is first designed: the class-to-block attention is adjusted according to the block-to-block attention, which effectively reduces the probability of erroneous associations between classes and blocks. The class activation map is then optimized with the adjusted attention, so that the target region in the resulting class activation map is activated more completely and accurately, better solving the incomplete-activation problem.
Drawings
FIG. 1 is a diagram of the overall weakly supervised semantic segmentation framework based on attention adjustment.
Fig. 2 is an example of segmentation results on the PASCAL VOC 2012 validation set.
Fig. 3 is an example of segmentation results on the MS COCO 2014 validation set.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
The invention discloses a weak supervision semantic segmentation method based on attention adjustment, which provides a novel Transformer-based framework for the weakly supervised semantic segmentation task under image-level annotation. The overall structure of the framework is shown in FIG. 1 and mainly comprises three parts: 1) feature extraction with a Transformer and generation of the initial class activation map; 2) the attention adjustment module, which adjusts the class-to-block attention according to the block-to-block attention and effectively improves the accuracy of the class-to-block attention; 3) optimization of the class activation map with the attention, yielding a more complete and accurate class activation map. The method comprises the following steps:
step 1: data preparation.
In the present invention, the PASCAL VOC 2012 dataset and the MS COCO 2014 dataset are used. The PASCAL VOC 2012 dataset has 21 categories, comprising 20 object classes and one background class; the MS COCO 2014 dataset has 81 categories, comprising 80 object classes and one background class. The PASCAL VOC 2012 dataset is divided into three parts: a training set (1464 images), a validation set (1449 images) and a test set (1456 images); the training set is typically augmented with additional data to 10582 images. The MS COCO 2014 dataset is divided into two parts: a training set (82081 images) and a validation set (40137 images).
Step 2: and (5) preprocessing data.
Random horizontal flipping and color jittering are applied to the image, with the brightness, contrast and saturation values set to 0.3. The image is normalized with transforms.Normalize at a size of 256×256, then randomly cropped to 224×224 with transforms.RandomCrop. The cropped image is input into the model.
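For concreteness, this pipeline can be sketched with torchvision as follows; the resize step and the ImageNet normalization statistics are assumptions, since the patent does not state them:

```python
import torchvision.transforms as transforms

# Minimal sketch of the step-2 preprocessing. The ImageNet normalization
# statistics are an assumption; the patent does not give mean/std values.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),              # bring images to 256x256
    transforms.RandomHorizontalFlip(p=0.5),     # random horizontal flipping
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # assumed ImageNet stats
                         std=[0.229, 0.224, 0.225]),
    transforms.RandomCrop(224),                 # random crop to the 224x224 model input
])
```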
Step 3: and (5) building a model.
Step 3.1: and (3) constructing a weak supervision semantic segmentation framework based on attention fusion, segmenting the image preprocessed in the step (2) into N non-overlapping blocks, constructing N block tokens through linear mapping, and splicing C class tokens and N block tokens to obtain an input token of the framework.
Step 3.2: the input token is input to a Transfomer encoding layer in the framework to obtain an output token. The last N block tokens are then extracted from the output tokens to form an output block token Tp_out, which is subjected to a reorganization (Reshape) and convolution (Conv) operation to obtain an initial class activation map Original-CAM.
Original-CAM=Conv(Reshape(Tp_out))
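The following sketch illustrates this computation; the 14×14 token grid and the convolution kernel size are assumptions:

```python
import torch
import torch.nn as nn

def make_original_cam(tp_out, conv, h=14, w=14):
    """Original-CAM = Conv(Reshape(Tp_out)).

    tp_out: [B, N, D] output block tokens, with N = h * w.
    conv:   maps the D feature channels to C class channels.
    """
    B, N, D = tp_out.shape
    feat = tp_out.transpose(1, 2).reshape(B, D, h, w)   # Reshape to a feature map
    return conv(feat)                                   # [B, C, h, w] class activation map

conv_head = nn.Conv2d(384, 20, kernel_size=3, padding=1)  # kernel size assumed
cam = make_original_cam(torch.randn(2, 196, 384), conv_head)
```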
Step 3.3: when the input token passes through the Transfomer coding layer, the Attention module calculates the Attention of the input token to generate Attention, the shape is [ C+N, C+N ], and the calculation formula is as follows:
wherein Q, K represents a matrix array and a Key matrix obtained by linear projection of an input token when the input token passes through a transducer coding layer, T represents matrix transposition, and d k Representing the scaling factor.
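A minimal single-head sketch of this computation is given below (DeiT-S actually uses multi-head attention; the projection weights w_q and w_k are illustrative):

```python
import torch
import torch.nn.functional as F

def compute_attention(tokens, w_q, w_k):
    """Attention = Softmax(Q K^T / sqrt(d_k)), computed over all C+N tokens."""
    Q = tokens @ w_q                          # [B, C+N, d_k] Query matrix
    K = tokens @ w_k                          # [B, C+N, d_k] Key matrix
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5
    return F.softmax(scores, dim=-1)          # [B, C+N, C+N], as in step 3.3

# Example with DeiT-S-like dimensions (D = 384 assumed):
tokens = torch.randn(1, 20 + 196, 384)
w_q, w_k = torch.randn(384, 64), torch.randn(384, 64)
attn = compute_attention(tokens, w_q, w_k)    # [1, 216, 216]
```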
Step 3.4: attention can be further divided into class-to-block Attention A c2p Sum block-to-block attention a p2p Wherein A is c2p =Attention[1:C,C+1:C+N],A p2p =Attention[C+1:C+N,C+1:C+N]. Then pass the attention A from block to block p2p To pay attention between classes and blocksForce A c2p And adjusting. If the attention between the class c and the block i is to be adjusted, firstly, sorting the blocks according to the order of the attention values from big to small according to the attention between the blocks, then selecting some blocks which are ranked 30% before sorting, and then calculating the attention between the blocks to obtain the attention adjustment factor between the class c and the block i:
wherein r (c, i) represents A c2p Attention regulator between class C and block i, C e {1,2, …, C } represents the total number of data set classes, i, j represents a block, i e {1,2, …, N }, j e U, U represents a set of blocks of greater attention to block i, S represents the number of blocks in U. Attention adjustment factor r (c, i) is then added to the attention between class c and block i to adjust:
A c2p (c,i)=A c2p (c,i)+α*r(c,i)
wherein A is c2p (c, i) represents the attention between class c and block i, and α represents the attention regulator coefficient.
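The sketch below puts step 3.4 together; the function name is hypothetical, the top-30% selection follows the text above, and the default value of α is an assumption since the patent does not specify it:

```python
import torch

def adjust_class_to_block_attention(attention, num_classes, p=0.30, alpha=1.0):
    """Adjust A_c2p using A_p2p, following step 3.4.

    attention: [C+N, C+N] attention matrix from the Transformer coding layer.
    p:         fraction of blocks selected (the patent uses the top 30%).
    alpha:     attention adjustment coefficient (default value assumed).
    """
    C = num_classes
    A_c2p = attention[:C, C:].clone()        # class-to-block attention, [C, N]
    A_p2p = attention[C:, C:]                # block-to-block attention, [N, N]
    N = A_p2p.shape[0]
    S = max(1, int(N * p))                   # number of selected blocks per block i

    # For each block i, U = indices of the top p% blocks by attention to i.
    topk = A_p2p.topk(S, dim=-1).indices     # [N, S]

    # r(c, i) = (1/S) * sum_{j in U_i} A_c2p(c, j), computed for all c and i at once.
    r = A_c2p[:, topk].mean(dim=-1)          # [C, N] adjustment factors

    # A_c2p(c, i) = A_c2p(c, i) + alpha * r(c, i)
    return A_c2p + alpha * r
```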
Step 3.5: using class-to-block attention A c2p Sum block-to-block attention a p2p To optimize the initial class activation map. The initial class activation diagram initial-CAM is multiplied by class-to-block attention to obtain a preliminary optimized adjustment class activation diagram, and then the adjustment class activation diagram is further optimized by matrix multiplication between the block-to-block attention and the adjustment class activation diagram to obtain a final class activation diagram.
Step 4: and (5) model training.
Step 4.1: setting relevant super parameters of a weak supervision semantic segmentation model, setting the model training frequency Epoch to 60, setting the model training batch batch_size to 64, setting an optimizer used during training to be an Adam optimizer, wherein the loss function is multi-label cross entropy loss, and setting the initial learning rate to 5e-4.
Step 4.2: and carrying out multi-round training on the weak supervision semantic segmentation model, and storing parameters corresponding to the best round of training result (the highest training mIoU value) by observing the training result.
Step 4.3: after the weak supervision semantic segmentation model is trained, the stored best parameters are loaded into the model, then training set data are input into the model, and the trained model can generate a complete class activation diagram.
Step 5: and (3) assigning a class to each pixel position according to the value of the class activation graph to generate a pixel-level pseudo tag, and then training the existing semantic segmentation network deep V2 by using the pixel-level pseudo tag. The pictures in the verification set and the test set are input into the trained model to obtain a final segmentation map, as shown in fig. 2 and 3, the second column is a real segmentation map, the third column is a prediction segmentation map of the invention, and the model prediction segmentation map of the invention is found to be very close to the real segmentation map.

Claims (5)

1. A weak supervision semantic segmentation method based on attention adjustment, characterized by comprising the following steps:
step 1, data preparation: acquiring an annotated image dataset, and dividing the dataset into a training set, a validation set and a test set;
step 2, data preprocessing: applying random horizontal flipping and color jittering to the image, normalizing the image, performing random cropping, and taking the cropped image as the input of the weak supervision semantic segmentation model;
step 3, building the model: constructing the weak supervision semantic segmentation model with DeiT-S pre-trained on ImageNet as the backbone of the model;
step 4, model training: optimizing the weak supervision semantic segmentation model with an Adam optimizer using multi-label cross-entropy as the loss function, training the model on the training set for a set number of epochs, and generating class activation maps with the trained model;
step 5, assigning a class to each pixel position according to the values of the class activation map to generate pixel-level pseudo labels, and then training the semantic segmentation network DeepLab V2 with the pixel-level pseudo labels; the pictures in the validation set and the test set are input into the trained model to obtain the final segmentation maps.
2. The attention-adjustment-based weak supervision semantic segmentation method according to claim 1, wherein the model building in step 3 comprises:
step 3.1, constructing a weak supervision semantic segmentation framework based on attention adjustment, dividing the preprocessed image into N non-overlapping blocks, constructing N block tokens through linear mapping, and concatenating C class tokens with the N block tokens to obtain the input tokens of the framework;
step 3.2, feeding the input tokens into the Transformer coding layers of the framework to obtain output tokens; then extracting the last N block tokens from the output tokens to form the output block tokens Tp_out, and applying reshape and convolution operations to Tp_out to obtain the initial class activation map Original-CAM;
step 3.3, when the input tokens pass through the Transformer coding layer, the attention module computes attention over the input tokens:

Attention = Softmax(Q·K^T / √d_k)

wherein Q and K respectively represent the Query matrix and Key matrix obtained by linear projection of the input tokens in the Transformer coding layer, T represents matrix transposition, and d_k represents the scaling factor;
step 3.4, dividing the attention into class-to-block attention A_c2p and block-to-block attention A_p2p, then adjusting the class-to-block attention A_c2p according to the block-to-block attention A_p2p;
step 3.5, using the class-to-block attention A_c2p and the block-to-block attention A_p2p to optimize the initial class activation map.
3. The attention-adjustment-based weak supervision semantic segmentation method according to claim 2, wherein the class-to-block attention A_c2p and the block-to-block attention A_p2p are expressed as follows:
A_c2p = Attention[1:C, C+1:C+N]
A_p2p = Attention[C+1:C+N, C+1:C+N]
and the attention between class c and block i is adjusted as follows:
firstly, all blocks are sorted in descending order of their block-to-block attention to block i, and the top p% of the sorted blocks are selected;
then, the attention between class c and the selected blocks is taken out and averaged to obtain the attention adjustment factor between class c and block i:

r(c, i) = (1/S) · Σ_{j∈U} A_c2p(c, j)

wherein r(c, i) represents the attention adjustment factor between class c and block i; c ∈ {1, 2, …, C}, where C is the total number of dataset classes; i and j denote blocks, i ∈ {1, 2, …, N}, j ∈ U, where U is the set of the top p% of blocks with the greatest attention to block i, and S is the number of blocks in U; A_c2p(c, j) represents the attention between class c and block j;
the attention adjustment factor r(c, i) is then added to the attention between class c and block i:
A_c2p(c, i) = A_c2p(c, i) + α · r(c, i)
wherein A_c2p(c, i) represents the attention between class c and block i, and α represents the attention adjustment coefficient.
4. The attention-adjustment-based weak supervision semantic segmentation method according to claim 2, wherein using the class-to-block attention A_c2p and the block-to-block attention A_p2p to optimize the initial class activation map in step 3.5 comprises:
multiplying the initial class activation map Original-CAM by the class-to-block attention to obtain a preliminarily optimized adjusted class activation map;
and then further optimizing by matrix multiplication between the block-to-block attention and the adjusted class activation map to obtain the final class activation map.
5. The attention-adjustment-based weak supervision semantic segmentation method according to any one of claims 1 to 4, wherein the model training process in step 4 is as follows:
step 4.1, setting the hyperparameters of the weak supervision semantic segmentation model: the number of training epochs, the initial learning rate and the training batch size batch_size, with Adam as the optimizer and multi-label cross-entropy as the loss function;
step 4.2, training the weak supervision semantic segmentation model for multiple rounds, and saving the parameters of the round with the highest training mIoU value;
step 4.3, after the weak supervision semantic segmentation model is trained, loading the saved best parameters into the model, inputting the training set data into the model, and generating complete class activation maps with the trained model.
CN202311064941.7A 2023-08-23 2023-08-23 Weak supervision semantic segmentation method based on attention adjustment Pending CN117036711A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311064941.7A CN117036711A (en) 2023-08-23 2023-08-23 Weak supervision semantic segmentation method based on attention adjustment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311064941.7A CN117036711A (en) 2023-08-23 2023-08-23 Weak supervision semantic segmentation method based on attention adjustment

Publications (1)

Publication Number Publication Date
CN117036711A (en) 2023-11-10

Family

ID=88641034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311064941.7A Pending CN117036711A (en) 2023-08-23 2023-08-23 Weak supervision semantic segmentation method based on attention adjustment

Country Status (1)

Country Link
CN (1) CN117036711A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117593517A (en) * 2024-01-19 2024-02-23 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network
CN117593517B (en) * 2024-01-19 2024-04-16 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination