CN118506407B - Light pedestrian re-recognition method and system based on random color discarding and attention - Google Patents
- Publication number
- CN118506407B (application CN202410950010.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- attention
- pedestrian
- recognition
- lagt
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a lightweight pedestrian re-recognition method and system based on random color discarding and attention, relating to the technical field of pedestrian re-recognition. The method receives image data and preprocesses it to obtain preprocessed image data; inputs the preprocessed image data into a pre-built OSNet embedded with a cascaded self-attention module and extracts features to obtain image features; classifies the image features through a fully connected layer, mapping them onto the corresponding class labels to obtain classified image features; calculates a label-smoothed identity loss from the classified image features and optimizes a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized model; and acquires the test set of a pedestrian re-recognition dataset and inputs it into the optimized lightweight pedestrian re-recognition network model to obtain the lightweight pedestrian re-recognition result.
Description
Technical Field
The invention relates to the technical field of pedestrian re-identification, and in particular to a lightweight pedestrian re-identification method and system based on random color discarding and attention.
Background
With the rapid development of deep learning, the field of pedestrian re-recognition has made remarkable progress. The residual network (ResNet) has been applied to pedestrian re-recognition with notable results: by introducing residual connections, ResNet alleviates the vanishing-gradient problem in deep network training and makes the network easier to optimize.
The robustness and generalization of prior-art recognition methods are limited, so data enhancement strategies such as color augmentation are widely applied to improve them. Color, as an important distinguishing characteristic of pedestrians, enriches the expression of pedestrian features and plays an important role in complex scenes such as illumination changes; a pedestrian re-identification system that integrates color information therefore has clear advantages over traditional methods in complex environments. In some cases, however, the color bias induced by color features limits the model's ability to make correct predictions in two main ways: first, color differences between images of the same pedestrian increase the possibility of false recognition; second, color bias weakens the feature differences between images of different pedestrians and reduces the discriminability of the system. To date, few studies have addressed reducing color bias to improve model robustness.
Disclosure of Invention
To solve the above-mentioned shortcomings in the background art, an object of the present invention is to provide a lightweight pedestrian re-recognition method and system based on random color discarding and attention.
In a first aspect, the object of the present invention is achieved by the following technical solution: a lightweight pedestrian re-identification method based on random color discarding and attention, comprising the following steps:
Receiving image data, and preprocessing the image data to obtain preprocessed image data;
inputting the preprocessed image data into an OSNet which is built in advance and embedded with a cascaded self-attention module, extracting features, and outputting image features;
classifying the image features through a fully connected layer, and mapping them onto corresponding class labels to obtain classified image features;
calculating a label-smoothed identity loss using the classified image features, and optimizing a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
and acquiring a test set of the pedestrian re-recognition dataset and inputting it into the optimized lightweight pedestrian re-recognition network model to output the lightweight pedestrian re-recognition result.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the preprocessing of the image data comprises the following steps:
data enhancement: executing random flipping, random erasing and an LAGT-based random color discarding strategy to finally obtain the preprocessed image data.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the LAGT-based random color discarding strategy grays the image using an aggregate grayscale transformation (AGT), computed as:
Gray(i, j) = λ · (R(i, j) + G(i, j) + B(i, j)), 1 ≤ i ≤ H, 1 ≤ j ≤ W
where R, G and B denote the red, green and blue color channels; R(i, j), G(i, j) and B(i, j) denote the pixel values at position (i, j) of the red, green and blue channels, respectively; the weighting coefficient λ is a constant balance factor, taken as 1/3 so that the three channels are weighted uniformly; H is the image height and W is the image width.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the implementation process of LAGT is as follows:
during data loading, a random identity sampler randomly selects P identities and K pictures for each selected pedestrian, so the training batch size is b = P × K; the batch is expressed as the set X = {(x_i, y_i)}, i = 1, …, b, where x_i represents the i-th image of the training batch and y_i represents the sample label of the i-th image. LAGT converts the original image into a gray image, randomly selects a rectangular area from the original image, and substitutes the gray values of the corresponding area of the gray image into the original image. Given an original pedestrian picture x, an aggregate grayscale transformation is performed with probability p, and the corresponding AGT image is defined as:
x_gray = AGT(x);
the area of the original image x is:
S = H × W
where H is the image height and W is the image width;
the area of the AGT rectangle is:
S_agt = rand(s_l, s_h) × S
where s_l and s_h are the minimum and maximum values of the ratio of the AGT rectangle area to the original image area;
the aspect ratio r_agt of the AGT rectangle, its height H_agt and its width W_agt are:
r_agt = rand(r_1, r_2), H_agt = sqrt(S_agt × r_agt), W_agt = sqrt(S_agt / r_agt)
where r_1 and r_2 are the minimum and maximum values of the aspect ratio of the grayscale-transformation rectangle;
a point P = (x_p, y_p) is randomly initialized in the original image x, satisfying the following conditions:
x_p + W_agt ≤ W, y_p + H_agt ≤ H
where H is the image height and W is the image width;
the selected LAGT region is:
rect = (x_p, y_p, x_p + W_agt, y_p + H_agt);
for each x_i, the LAGT region rect is selected as above, and the LAGT algorithm can finally be expressed as:
x_i' = Φ(x_i, AGT(x_i), rect) with probability p, and x_i' = x_i otherwise,
where Φ(·) assigns the pixels in the corresponding rectangle of the AGT image to the original image x_i, and x_i' is the LAGT-transformed sample.
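For reference, the LAGT strategy described above can be sketched in PyTorch as follows; the function name, the retry loop, and the default sampling bounds s_l, s_h, r_1, r_2 are illustrative assumptions (the patent fixes only the application probability, 0.6 in its experiments), not part of the claimed method:
```python
import math
import random
import torch

def lagt(img: torch.Tensor, p: float = 0.6,
         s_l: float = 0.02, s_h: float = 0.4,
         r_1: float = 0.3, r_2: float = 3.3) -> torch.Tensor:
    """LAGT sketch: img is a (3, H, W) float RGB tensor. With probability p,
    a random rectangle is replaced by its aggregate-grayscale version."""
    if random.random() > p:
        return img
    _, h, w = img.shape
    # Aggregate grayscale transformation (AGT): uniform channel weights.
    gray = img.mean(dim=0, keepdim=True).expand_as(img)
    for _ in range(100):  # retry until the sampled rectangle fits the image
        area = random.uniform(s_l, s_h) * h * w
        ratio = random.uniform(r_1, r_2)
        h_agt = int(round(math.sqrt(area * ratio)))
        w_agt = int(round(math.sqrt(area / ratio)))
        if h_agt < h and w_agt < w:
            y = random.randint(0, h - h_agt)
            x = random.randint(0, w - w_agt)
            out = img.clone()
            out[:, y:y + h_agt, x:x + w_agt] = gray[:, y:y + h_agt, x:x + w_agt]
            return out
    return img
```
In a data pipeline, such a routine would be applied per image alongside the random flipping and random erasing named in the data-enhancement step.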
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the embedded cascaded self-attention module includes a spatial self-attention module (SSAM) and a channel self-attention module (CSAM).
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the intermediate feature map extracted by SSAM from the preprocessed image data is F ∈ R^(C×H×W), where C is the number of feature channels and H × W is the size of the intermediate feature map; 1 × 1 convolution operations are performed on F to obtain A, B, D ∈ R^((C/r)×H×W), where r is the channel-reduction ratio; after reshaping A, B and D to R^(N×(C/r)) with N = H × W, the spatial self-attention affinity matrix S ∈ R^(N×N) is obtained, whose entries s_ji are computed as follows:
s_ji = exp(A_i · B_j) / Σ_{i=1}^{N} exp(A_i · B_j)
where s_ji represents the attention weight of the i-th spatial position on the j-th position. S is multiplied with D to embed the attention weights, and the result is superimposed pixel-wise on the original features to obtain the spatially self-attention-weighted feature map E_s:
E_s = α · Σ_{i=1}^{N} (s_ji · D_i) + F
where α is the hyperparameter adjusting the effect of SSAM; the spatially self-attention-weighted feature map E_s is then processed by CSAM.
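A minimal PyTorch sketch of SSAM consistent with the formulas above is given below; the channel-reduction ratio r = 8, reducing only the query and key branches, and treating α as a fixed constructor argument are assumptions, since the patent only states that α is a hyperparameter (set to 1 in its experiments):
```python
import torch
import torch.nn as nn

class SSAM(nn.Module):
    """Spatial self-attention module sketch following the description above."""
    def __init__(self, channels: int, reduction: int = 8, alpha: float = 1.0):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)  # A
        self.key = nn.Conv2d(channels, channels // reduction, 1)    # B
        self.value = nn.Conv2d(channels, channels, 1)               # D
        self.alpha = alpha  # hyperparameter adjusting SSAM's effect

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f.shape
        n = h * w
        q = self.query(f).view(b, -1, n).permute(0, 2, 1)  # (b, n, c/r)
        k = self.key(f).view(b, -1, n)                     # (b, c/r, n)
        s = torch.softmax(torch.bmm(q, k), dim=-1)         # (b, n, n) affinity S
        v = self.value(f).view(b, c, n)                    # (b, c, n)
        out = torch.bmm(v, s.permute(0, 2, 1)).view(b, c, h, w)
        return self.alpha * out + f                        # E_s = alpha * (S . D) + F
```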
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: for the input spatially self-attention-weighted feature map E_s, CSAM reshapes it to R^(C×N) and obtains the channel self-attention affinity matrix X ∈ R^(C×C), whose entries x_ji are computed as follows:
x_ji = exp(E_i · E_j) / Σ_{i=1}^{C} exp(E_i · E_j)
where x_ji represents the attention weight of channel i on channel j; for X, a matrix M of the same size is initialized whose values all equal the maximum of X, and the new channel self-attention affinity matrix is X' = M − X; X' is multiplied with E_s to embed the attention weights, and the result is superimposed on the corresponding position pixels of E_s to obtain the channel self-attention-weighted feature map E_c:
E_c = β · Σ_{i=1}^{C} (x'_ji · E_i) + E_s
where β is the hyperparameter adjusting the effect of CSAM;
the channel self-attention-weighted feature map E_c yields the image features after passing through the OSNet backbone network.
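Likewise, a CSAM sketch consistent with the description is shown below; applying the max-subtraction to the raw channel affinities before the softmax is an implementation assumption (a common variant of channel attention), and β mirrors the fixed hyperparameter above:
```python
import torch
import torch.nn as nn

class CSAM(nn.Module):
    """Channel self-attention module sketch with max-subtraction,
    reconstructed from the description above."""
    def __init__(self, beta: float = 1.0):
        super().__init__()
        self.beta = beta  # hyperparameter adjusting CSAM's effect

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        b, c, h, w = e.shape
        flat = e.view(b, c, -1)                          # (b, c, n)
        energy = torch.bmm(flat, flat.permute(0, 2, 1))  # (b, c, c) channel affinity
        # Subtract the maximum so every channel is encouraged to contribute
        # complementary information (robust to outlier activations).
        energy = energy.max(dim=-1, keepdim=True).values.expand_as(energy) - energy
        attn = torch.softmax(energy, dim=-1)
        out = torch.bmm(attn, flat).view(b, c, h, w)
        return self.beta * out + e                       # E_c = beta * (X' . E) + E_s
```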
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the process of calculating the label-smoothed identity loss L_ID using the classified image features:
L_ID = Σ_{k=1}^{N} −q_k · log(p_k)
where k indexes the pedestrian categories, N represents the number of pedestrian identities in the training set, y is the ground-truth label, p_k is the logit predicted by the network for class k, and q_k is the smoothed label.
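A sketch of the label-smoothed identity loss is given below; the smoothing constant ε = 0.1 is a conventional choice assumed here, as the patent does not state its value:
```python
import torch
import torch.nn.functional as F

def identity_loss(logits: torch.Tensor, targets: torch.Tensor,
                  epsilon: float = 0.1) -> torch.Tensor:
    """Label-smoothed identity (cross-entropy) loss sketch.
    logits: (b, N) classification scores; targets: (b,) class indices."""
    n_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    # Smoothed targets: 1 - eps + eps/N on the true class, eps/N elsewhere.
    smooth = torch.full_like(log_probs, epsilon / n_classes)
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - epsilon + epsilon / n_classes)
    return (-smooth * log_probs).sum(dim=1).mean()
```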
In a second aspect, to achieve the above object, the present invention discloses a lightweight pedestrian re-recognition system based on random color discarding and attention, comprising:
The image processing module is used for receiving the image data, preprocessing the image data and obtaining preprocessed image data;
the feature extraction module is used for inputting the preprocessed image data into the pre-built OSNet embedded with the cascaded self-attention module, extracting features, and outputting image features;
the image classification module is used for classifying the image features through the fully connected layer, mapping them onto corresponding category labels, and obtaining classified image features;
the model training module is used for calculating a label-smoothed identity loss using the classified image features, and optimizing a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
The pedestrian re-recognition module is used for acquiring a test set of the pedestrian re-recognition dataset and inputting it into the optimized lightweight pedestrian re-recognition network model to output the lightweight pedestrian re-recognition result.
A terminal device comprising a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor, when loading and executing the computer program, employs the lightweight pedestrian re-recognition method based on random color discarding and attention described above.
The invention has the beneficial effects that:
the invention adopts OSNet as the backbone network of the model, greatly reducing the parameter count;
for pedestrians wearing clothes of the same color, the LAGT algorithm effectively suppresses the negative influence of color bias on the recognition effect, encourages the model to discover and attend to feature information unrelated to color, balances the weights the neural network assigns to color and non-color features, and improves the recognition effect over the Baseline;
for pedestrian pictures with complex backgrounds and occlusion, the cascaded self-attention module effectively aggregates pedestrian features and suppresses irrelevant background, so that the extracted features are finer and more discriminative, improving the recognition effect over the Baseline.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; it will be apparent to those skilled in the art that other drawings can be obtained from these drawings without inventive effort;
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is an overall frame diagram of a lightweight pedestrian re-recognition network based on random color discard and self-attention in accordance with the present invention;
FIG. 3 is an original sample and LAGT sample example of the present invention;
FIG. 4 is SSAM employed in the present invention;
FIG. 5 is a CSAM used in the present invention;
FIG. 6 is a search result of a training model using RGB images and original grayscale transformed images on a Market-1501 dataset;
FIG. 7 is an example of a pedestrian image in a green background on a dataset;
FIG. 8 is a search result of a green background pedestrian image on a Market-1501 dataset using an RGB image and an original gray scale transformed image training model;
FIG. 9 is a search result of a gray image training model using RGB images and different gray transforms on a Market-1501 dataset;
FIG. 10 is a visual comparison of pedestrian re-identification under color deviation in accordance with the present invention;
FIG. 11 is a visual comparison of pedestrian re-recognition under a complex background and occlusion in accordance with the present invention;
FIG. 12 is a schematic diagram of the system of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
the following description is made of the relevant terms related to the embodiments of the present application:
Grayscale transformation refers to changing the gray value of each pixel in a source image point by point according to a certain transformation relationship and target condition. Its purpose is to improve image quality and make the displayed image clearer. Grayscale transformation is a basic and direct spatial-domain method in image enhancement, and is also an important component of image digitization and display software.
The self-attention mechanism is a special attention mechanism that allows a model, when processing a sequence, to take into account the relationship of each element to all other elements. Such a mechanism helps the model better understand the context information in the sequence and thus process sequence data more accurately. (Sequence data is data whose elements exist in a particular order: each element has a specific position, and the ordering relationship between positions has an important influence on the meaning and processing of the data.)
As shown in fig. 1, the lightweight pedestrian re-recognition method based on random color discarding and attention comprises the following steps:
Receiving image data, and preprocessing the image data to obtain preprocessed image data;
The process of preprocessing the image data includes:
data enhancement: performing random flipping, random erasing and an LAGT-based random color discarding strategy to finally obtain the preprocessed image data;
The LAGT-based random color discarding strategy grays the image using an aggregate grayscale transformation (AGT), computed as:
Gray(i, j) = λ · (R(i, j) + G(i, j) + B(i, j)), 1 ≤ i ≤ H, 1 ≤ j ≤ W
where R, G and B denote the red, green and blue color channels; R(i, j), G(i, j) and B(i, j) denote the pixel values at position (i, j) of the red, green and blue channels, respectively; and the weighting coefficient λ is a constant balance factor, taken as 1/3.
The LAGT algorithm is implemented as follows: during data loading, a random identity sampler randomly selects P identities and K pictures for each selected pedestrian, so the training batch size is b = P × K. The batch is expressed as the set X = {(x_i, y_i)}, i = 1, …, b, where x_i represents the i-th image of the training batch and y_i represents its sample label. LAGT converts the original image into a gray image, randomly selects a rectangular area from the original image, and substitutes the gray values of the corresponding area of the gray image into the original image. Given an original pedestrian picture x, an aggregate grayscale transformation is performed with probability p; the corresponding AGT image is defined as:
x_gray = AGT(x).
The area of the original image x is:
S = H × W
where H is the image height and W is the image width.
The area of the AGT rectangle is:
S_agt = rand(s_l, s_h) × S
where s_l and s_h are the minimum and maximum values of the ratio of the AGT rectangle area to the original image area.
The aspect ratio r_agt of the AGT rectangle, its height H_agt and its width W_agt are:
r_agt = rand(r_1, r_2), H_agt = sqrt(S_agt × r_agt), W_agt = sqrt(S_agt / r_agt)
where r_1 and r_2 are the minimum and maximum values of the aspect ratio of the grayscale-transformation rectangle.
A point P = (x_p, y_p) is randomly initialized in the original image x, satisfying the following conditions:
x_p + W_agt ≤ W, y_p + H_agt ≤ H
where H is the image height and W is the image width.
The selected LAGT region is:
rect = (x_p, y_p, x_p + W_agt, y_p + H_agt).
For each x_i, the LAGT region rect is selected as above, and the LAGT algorithm can finally be expressed as:
x_i' = Φ(x_i, AGT(x_i), rect) with probability p, and x_i' = x_i otherwise,
where Φ(·) assigns the pixels in the corresponding rectangle of the AGT image to the original image x_i, and x_i' is the LAGT-transformed sample. The original samples and LAGT samples are shown in fig. 3.
Inputting the preprocessed image data into an OSNet which is built in advance and embedded with a cascaded self-attention module, extracting features, and outputting image features;
The embedded cascade self-attention module comprises SSAM and CSAM, as shown in figures 4 and 5 respectively;
SSAM can help the network aggregate semantically related features in space and highlight individual features, making the network more focused on details of pedestrians.
The intermediate feature map extracted by the neural network is F ∈ R^(C×H×W), where C is the number of feature channels and H × W is the size of the intermediate feature map. 1 × 1 convolution operations are performed on F to obtain A, B, D ∈ R^((C/r)×H×W); the channel dimension reduction extracts a more refined and abstract feature representation during the inference of the attention matrix, lowering the computational complexity without affecting network performance, where r is the reduction ratio. After the reshape operation on A, B and D to R^(N×(C/r)) with N = H × W, the spatial self-attention affinity matrix S ∈ R^(N×N) is obtained, whose entries s_ji are computed as follows:
s_ji = exp(A_i · B_j) / Σ_{i=1}^{N} exp(A_i · B_j)
where s_ji represents the attention weight of the i-th spatial position on the j-th position. S is multiplied with D to embed the attention weights, and the result is superimposed pixel-wise on the original features to obtain the spatially self-attention-weighted feature map E_s:
E_s = α · Σ_{i=1}^{N} (s_ji · D_i) + F
where α is the hyperparameter that adjusts the effect of SSAM.
The CSAM can help the network learn the association information among the channels, and further extract the effective characteristic representation among different channels in the image.
For the input spatially self-attention-weighted feature map E_s, after a reshape operation to R^(C×N), the channel self-attention affinity matrix X ∈ R^(C×C) is obtained, whose entries x_ji are computed as follows:
x_ji = exp(E_i · E_j) / Σ_{i=1}^{C} exp(E_i · E_j)
where x_ji represents the attention weight of channel i on channel j. For X, using a normalization method, a matrix M of the same size is initialized whose values all equal the maximum of X; the new channel self-attention affinity matrix is X' = M − X. This effectively avoids the influence of noise or outliers on the maximum activation value; by subtracting each channel's affinity from the maximum activation, the other channels are encouraged to provide complementary information, which increases the model's perceptual diversity across feature channels and improves its robustness. X' is multiplied with E_s to embed the attention weights, and the result is superimposed on the corresponding position pixels of E_s to obtain the channel self-attention-weighted feature map E_c:
E_c = β · Σ_{i=1}^{C} (x'_ji · E_i) + E_s
where β is the hyperparameter that adjusts the effect of CSAM.
After passing through the OSNet backbone network with the cascaded self-attention module, a feature map of size b × 512 × 16 × 8 is output, where b is the batch size, 512 the number of channels, 16 the feature map height, and 8 the feature map width.
The image features are classified through the fully connected layer and mapped onto the corresponding class labels to obtain classified image features;
the resulting classification score has size b × N, where N is the number of categories;
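As an illustration, the classification head can be sketched as global average pooling followed by a fully connected layer; the pooling choice is an assumption (the patent does not state it), and 751 classes corresponds to Market-1501's training identities:
```python
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Pooling + fully connected classifier sketch: maps the
    b x 512 x 16 x 8 backbone feature map to b x N class scores."""
    def __init__(self, in_channels: int = 512, num_classes: int = 751):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        x = self.pool(feat).flatten(1)  # (b, 512)
        return self.fc(x)               # (b, N) classification scores
```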
A label-smoothed identity loss is calculated using the classified image features, and a pre-established lightweight pedestrian re-recognition network model is optimized through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
the calculation process of the label-smoothed identity loss L_ID using the classified image features is:
L_ID = Σ_{k=1}^{N} −q_k · log(p_k)
where k indexes the pedestrian categories, N represents the number of pedestrian identities in the training set, y is the ground-truth label, and p_k is the logit predicted by the network for class k.
The smoothed label q_k is constructed as:
q_k = 1 − ((N − 1)/N) · ε if k = y, and q_k = ε/N otherwise,
where ε is a small constant that encourages the model not to over-trust the training set, increasing its generalization ability.
A pedestrian re-recognition dataset is acquired and input into the optimized lightweight pedestrian re-recognition network model, which outputs the lightweight pedestrian re-recognition result.
Tests were performed on the Market-1501 and DukeMTMC-reID pedestrian re-identification datasets. The AMSGrad optimizer was adopted with an initial learning rate of 0.0015, updated with a cosine annealing strategy. The batch size and weight decay were 64 and 5e-4, respectively. Training ran for 250 epochs, the LAGT probability was set to 0.6, the hyperparameters α and β controlling the influence of the SSAM and CSAM modules were set to 1, the label-smoothed ID loss was used for supervision, and pedestrian matching used the cosine distance.
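The training configuration above can be reproduced with the following PyTorch sketch; `model`, `train_loader` and `identity_loss` are assumed to be defined elsewhere, and AMSGrad is enabled via the amsgrad flag of torch.optim.Adam:
```python
import torch

# Assumes: `model` is the OSNet-based network, `train_loader` yields
# (images, labels) batches of size 64, `identity_loss` is defined above.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0015,
                             weight_decay=5e-4, amsgrad=True)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=250)

for epoch in range(250):
    for images, labels in train_loader:       # LAGT applied with p = 0.6
        logits = model(images)
        loss = identity_loss(logits, labels)  # label-smoothed ID loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```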
Specifically, the following examples are provided to further illustrate the present invention:
The experimental environment parameters are shown in table 1.
TABLE 1
In the experiments, OSNet pre-trained on ImageNet was selected as the Baseline. To verify the effect of LAGT and SAM, alone and in combination, on the network model, ablation experiments were performed on the Market-1501 and DukeMTMC-reID datasets; the results are shown in Table 2. As can be seen from Table 2, LAGT and SAM each bring some improvement over the Baseline. When both modules are added to the Baseline, LAGT weakens the negative impact of color bias on model recognition while SAM makes the model extract finer features in space and channels, so the network achieves its best recognition performance: Rank-1 and mAP on the Market-1501 dataset reach 95.5% and 87.7%, respectively, an improvement of 0.7% and 1.0% over the Baseline, and Rank-1 and mAP on the DukeMTMC-reID dataset reach 89.2% and 77.2%, respectively, an improvement of 0.8% and 0.5% over the Baseline. In addition, the network is lightweight, with only 2.8M parameters, which facilitates application and deployment.
TABLE 2
To verify the effectiveness of the present invention, the Market-1501 and DukeMTMC-reID datasets are used to compare it with advanced pedestrian re-recognition methods of recent years: PCB+RPP, which is based on local feature extraction, and HA-CNN, Mancs, AANet and IANet, which are based on attention mechanisms. In addition, some other well-performing methods such as BDB and BoT were chosen for comparison; the statistics are shown in Table 3.
As can be seen from Table 3, the Rank-1 index of the invention reaches 95.5% and 89.2% on the two datasets, and the mAP index reaches 87.7% and 77.2%, respectively. Although the invention focuses only on global feature extraction, compared with the PCB+RPP network with local feature extraction, Rank-1 and mAP increase by 2.4% and 6.7% on Market-1501 and by 6.3% and 9.7% on DukeMTMC-reID, respectively. Compared with the complex attribute-attention network AANet, Rank-1 and mAP improve by 1.6% and 4.3% on Market-1501 and by 1.5% and 2.9% on DukeMTMC-reID, respectively. Compared with the strong-baseline BoT method, which also focuses only on global features, the invention has the advantage of a small parameter count, with Rank-1 and mAP improved by 1.0% and 1.8% on Market-1501 and by 2.8% and 0.8% on DukeMTMC-reID, respectively.
TABLE 3 Table 3
Table 4 shows the comparison of the parameter counts of the proposed network and mainstream networks. The comparison shows that adding the LAGT and SAM modules on top of OSNet effectively improves the recognition performance of the model with only a small increase in parameters and computation time. Compared with ResNet networks, the model has fewer parameters, consumes fewer computing resources, shortens training time, adapts to tasks more quickly, and is more efficient in practical applications. Compared with OSNet, the invention improves pedestrian recognition accuracy and generalization while increasing the parameter count by only 0.6M, demonstrating that the designed network is well suited to lightweight deployment.
TABLE 4 Table 4
The dataset contains various complex and varying color deviations, which leave the model insufficiently robust to color changes. To address this issue, it is important to balance the weights between color features and other key discriminative features. Fig. 6 shows the retrieval results of models trained on the Market-1501 dataset using RGB images and gray images, respectively; it can be seen that retrieval is affected when a color deviation exists between the query image and the gallery image, and that the retrieval of such samples improves to a certain extent once color information is ignored.
Based on this problem, training images with local gray regions are generated through random color discarding (RCD) to improve the model's robustness to color deviation. The key step of the RCD strategy is to replace a local color region of the image with its corresponding local gray region; the grayscale conversion used is:
Gray(i, j) = 0.299 · R(i, j) + 0.587 · G(i, j) + 0.114 · B(i, j)
where R, G and B respectively represent the red, green and blue channels. As the formula shows, to reflect the sensitivity of the human eye to different colors, the red, green and blue channels are given different conversion weights. This conversion strategy matches human visual perception, but presents the following problems in pedestrian re-identification tasks based on deep-learning networks:
(1) Unlike human vision's varying sensitivity to colors, a network model has the same discriminative capability for the information in all three color channels, so a color-weight distribution strategy based on human visual perception has inherent limitations.
(2) In pedestrian datasets, according to statistics on people's clothing habits, few pedestrians wear green, and green features often appear in the background of pictures. As shown in fig. 7, trees and lawns serve as green background information in pedestrian images; when the largest conversion weight is assigned to the green channel, the model's attention to the background information increases while its generalization to the pedestrian is neglected, producing recognition bias.
For this problem, corresponding experimental verification was performed, as shown in fig. 8. It can be seen that when the deviation between the background color features and the pedestrian color features of a picture is large, particularly for green-background pictures, the large conversion weight of the green channel under the original grayscale transformation strategy increases the model's generalization to the background while neglecting the pedestrian, introducing bias and reducing the recognition effect.
Based on the above analysis, the invention designs a color discarding strategy more suitable for a learning model on the basis of RCD by changing the original grayscale transformation. The new transformation method should satisfy the following three requirements:
(1) reduce the network model's differential treatment of the three color channels, weakening the negative influence of color deviation on recognition while avoiding additional bias;
(2) after color conversion, retain the stable structure and texture characteristics of the original image, reducing distortion;
(3) the improved grayscale conversion method should not change the learning strategy, so that over-fitting during training is avoided and computing resources are saved, achieving the goals of lightness and effectiveness.
Based on the grayscale-transformation principle analyzed above, a weighted-average method is adopted to optimize the grayscale transformation strategy by carefully balancing the conversion weights assigned to the red, green and blue channels. For a pedestrian image of size H × W, the aggregate grayscale transform (AGT) formula is as follows:
Gray(i, j) = λ · (R(i, j) + G(i, j) + B(i, j)), 1 ≤ i ≤ H, 1 ≤ j ≤ W
where R, G and B represent the red, green and blue color channels, and R(i, j), G(i, j) and B(i, j) represent the pixel values at position (i, j) of the red, green and blue channels, respectively. The weight coefficient λ is a constant balance factor, selected as λ = 1/3. In the above formula, uniform weights are assigned to the three channels in the conversion from an RGB image to a gray image.
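For comparison, the two grayscale conversions discussed above can be sketched as follows; this is minimal illustrative code operating on a (3, H, W) float RGB tensor, not the patented implementation:
```python
import torch

def weighted_gray(img: torch.Tensor) -> torch.Tensor:
    """Original human-vision grayscale: Gray = 0.299R + 0.587G + 0.114B."""
    w = torch.tensor([0.299, 0.587, 0.114], device=img.device)
    return (img * w.view(3, 1, 1)).sum(dim=0)

def aggregate_gray(img: torch.Tensor) -> torch.Tensor:
    """Aggregate grayscale transform (AGT): uniform weight 1/3 per channel."""
    return img.mean(dim=0)
```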
To preliminarily verify the effectiveness of the improvement, fig. 9 shows the retrieval results on Market-1501 of models trained on the RGB input, on gray images from the original grayscale conversion, and on gray images from the average (aggregate) grayscale conversion, respectively. The results show that AGT eliminates color deviation without introducing extra bias, maintains the stable structural and texture characteristics of the RGB image, and yields a recognition effect superior to the original grayscale transformation.
Embodiment two: in a second aspect, as shown in fig. 12, in order to achieve the above object, the present invention discloses a lightweight pedestrian re-recognition system based on random color discarding and attention, comprising:
The image processing module 11 is configured to receive image data, and perform preprocessing on the image data to obtain preprocessed image data;
the feature extraction module 12 is configured to input the preprocessed image data into the pre-built OSNet embedded with the cascaded self-attention module, extract features, and output image features;
the image classification module 13 is configured to classify the image features through the fully connected layer and map them onto corresponding category labels to obtain classified image features;
the model training module 14 is configured to calculate a label-smoothed identity loss using the classified image features, and to optimize a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
the pedestrian re-recognition module 15 is configured to acquire a test set of the pedestrian re-recognition dataset and input it into the optimized lightweight pedestrian re-recognition network model to output the lightweight pedestrian re-recognition result.
Based on the same inventive concept, the present invention also provides a computer apparatus comprising one or more processors and a memory for storing one or more computer programs; a program includes program instructions, and the processor is configured to execute the program instructions stored in the memory. The processor may be a central processing unit (CPU), or another general-purpose processor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc.; it is the computational and control core of the terminal for implementing one or more instructions, in particular for loading and executing one or more instructions within a computer storage medium to implement the methods described above.
It should be further noted that, based on the same inventive concept, the present invention also provides a computer storage medium having a computer program stored thereon which, when executed by a processor, performs the above method. The storage medium may take the form of any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features, and advantages of the present disclosure. It will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above; the foregoing embodiments and description merely illustrate the principles of the disclosure, and various changes and modifications may be made without departing from the spirit and scope of the disclosure, which is defined by the appended claims.
Claims (4)
1. A lightweight pedestrian re-identification method based on random color discarding and attention, the method comprising the steps of:
Receiving image data, and preprocessing the image data to obtain preprocessed image data;
The preprocessing of the image data comprises the following steps:
data enhancement: performing random flipping, random erasing and an LAGT-based random color discarding strategy to finally obtain preprocessed image data;
the LAGT-based random color discarding strategy grays the image using an aggregate grayscale transformation, computed as:
Gray(i, j) = λ · (R(i, j) + G(i, j) + B(i, j)), 1 ≤ i ≤ H, 1 ≤ j ≤ W
where R, G and B denote the red, green and blue color channels; R(i, j), G(i, j) and B(i, j) denote the pixel values at position (i, j) of the red, green and blue channels, respectively; the weighting coefficient λ is a constant balance factor taken as 1/3; H is the image height and W is the image width;
the implementation process of LAGT is as follows:
during data loading, a random identity sampler randomly selects P identities and K pictures for each selected pedestrian, so the training batch size is b = P × K; the batch is expressed as the set X = {(x_i, y_i)}, i = 1, …, b, where x_i represents the i-th image of the training batch and y_i represents the sample label of the i-th image; LAGT converts the original image into a gray image, randomly selects a rectangular area from the original image, and substitutes the gray values of the corresponding area of the gray image into the original image; given an original pedestrian picture x, an aggregate grayscale transformation is performed with probability p, and the corresponding AGT image is defined as:
x_gray = AGT(x);
the area of the original image x is:
S = H × W
where H is the image height and W is the image width;
the area of the AGT rectangle is:
S_agt = rand(s_l, s_h) × S
where s_l and s_h are the minimum and maximum values of the ratio of the AGT rectangle area to the original image area;
the aspect ratio r_agt of the AGT rectangle, its height H_agt and its width W_agt are:
r_agt = rand(r_1, r_2), H_agt = sqrt(S_agt × r_agt), W_agt = sqrt(S_agt / r_agt)
where r_1 and r_2 are the minimum and maximum values of the aspect ratio of the grayscale-transformation rectangle;
a point P = (x_p, y_p) is randomly initialized in the original image x, satisfying the following conditions:
x_p + W_agt ≤ W, y_p + H_agt ≤ H
where H is the image height and W is the image width;
the selected LAGT region is:
rect = (x_p, y_p, x_p + W_agt, y_p + H_agt);
for each x_i, the LAGT region rect is selected as above, and the LAGT algorithm can finally be expressed as:
x_i' = Φ(x_i, AGT(x_i), rect) with probability p, and x_i' = x_i otherwise,
where Φ(·) assigns the pixels in the corresponding rectangle of the AGT image to the original image x_i, and x_i' is the LAGT-transformed sample;
inputting the preprocessed image data into an OSNet which is built in advance and embedded with a cascaded self-attention module, extracting features, and outputting image features;
the embedded cascaded self-attention module includes a spatial self-attention module SSAM and a channel self-attention module CSAM;
the intermediate feature map extracted by SSAM from the preprocessed image data is F ∈ R^(C×H×W), where C is the number of feature channels and H × W is the size of the intermediate feature map; 1 × 1 convolution operations are performed on F to obtain A, B, D ∈ R^((C/r)×H×W), where r is the channel-reduction ratio; after reshaping A, B and D to R^(N×(C/r)) with N = H × W, the spatial self-attention affinity matrix S ∈ R^(N×N) is obtained, whose entries s_ji are computed as follows:
s_ji = exp(A_i · B_j) / Σ_{i=1}^{N} exp(A_i · B_j)
where s_ji represents the attention weight of the i-th spatial position on the j-th position; S is multiplied with D to embed the attention weights, and the result is superimposed pixel-wise on the original features to obtain the spatially self-attention-weighted feature map E_s:
E_s = α · Σ_{i=1}^{N} (s_ji · D_i) + F
where α is the hyperparameter adjusting the effect of SSAM; the spatially self-attention-weighted feature map E_s is then processed by CSAM;
for the input spatially self-attention-weighted feature map E_s, CSAM reshapes it to R^(C×N) and obtains the channel self-attention affinity matrix X ∈ R^(C×C), whose entries x_ji are computed as follows:
x_ji = exp(E_i · E_j) / Σ_{i=1}^{C} exp(E_i · E_j)
where x_ji represents the attention weight of channel i on channel j; for X, a matrix M of the same size is initialized whose values all equal the maximum of X, and the new channel self-attention affinity matrix is X' = M − X; X' is multiplied with E_s to embed the attention weights, and the result is superimposed on the corresponding position pixels of E_s to obtain the channel self-attention-weighted feature map E_c:
E_c = β · Σ_{i=1}^{C} (x'_ji · E_i) + E_s
where β is the hyperparameter adjusting the effect of CSAM;
the channel self-attention-weighted feature map E_c yields the image features after passing through the OSNet backbone network;
classifying the image features through the fully connected layer, and mapping them onto corresponding class labels to obtain classified image features;
calculating a label-smoothed identity loss using the classified image features, and optimizing a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
and acquiring a test set of the pedestrian re-recognition dataset and inputting it into the optimized lightweight pedestrian re-recognition network model to output the lightweight pedestrian re-recognition result.
2. The lightweight pedestrian re-identification method based on random color discarding and attention as in claim 1, wherein the calculation process of the label-smoothed identity loss L_ID using the classified image features is:
L_ID = Σ_{k=1}^{N} −q_k · log(p_k)
where k indexes the pedestrian categories, N represents the number of pedestrian identities in the training set, y is the ground-truth label, p_k is the logit predicted by the network for class k, and q_k is the smoothed label.
3. A lightweight pedestrian re-identification system based on random color discarding and attention, comprising:
The image processing module is used for receiving the image data, preprocessing the image data and obtaining preprocessed image data;
The preprocessing of the image data comprises the following steps:
data enhancement: performing random flipping, random erasing and an LAGT-based random color discarding strategy to finally obtain preprocessed image data;
the LAGT-based random color discarding strategy grays the image using an aggregate grayscale transformation, computed as:
Gray(i, j) = λ · (R(i, j) + G(i, j) + B(i, j)), 1 ≤ i ≤ H, 1 ≤ j ≤ W
where R, G and B denote the red, green and blue color channels; R(i, j), G(i, j) and B(i, j) denote the pixel values at position (i, j) of the red, green and blue channels, respectively; the weighting coefficient λ is a constant balance factor taken as 1/3; H is the image height and W is the image width;
the implementation process of LAGT is as follows:
during data loading, a random identity sampler randomly selects P identities and K pictures for each selected pedestrian, so the training batch size is b = P × K; the batch is expressed as the set X = {(x_i, y_i)}, i = 1, …, b, where x_i represents the i-th image of the training batch and y_i represents the sample label of the i-th image; LAGT converts the original image into a gray image, randomly selects a rectangular area from the original image, and substitutes the gray values of the corresponding area of the gray image into the original image; given an original pedestrian picture x, an aggregate grayscale transformation is performed with probability p, and the corresponding AGT image is defined as:
x_gray = AGT(x);
the area of the original image x is:
S = H × W
where H is the image height and W is the image width;
the area of the AGT rectangle is:
S_agt = rand(s_l, s_h) × S
where s_l and s_h are the minimum and maximum values of the ratio of the AGT rectangle area to the original image area;
the aspect ratio r_agt of the AGT rectangle, its height H_agt and its width W_agt are:
r_agt = rand(r_1, r_2), H_agt = sqrt(S_agt × r_agt), W_agt = sqrt(S_agt / r_agt)
where r_1 and r_2 are the minimum and maximum values of the aspect ratio of the grayscale-transformation rectangle;
a point P = (x_p, y_p) is randomly initialized in the original image x, satisfying the following conditions:
x_p + W_agt ≤ W, y_p + H_agt ≤ H
where H is the image height and W is the image width;
the selected LAGT region is:
rect = (x_p, y_p, x_p + W_agt, y_p + H_agt);
for each x_i, the LAGT region rect is selected as above, and the LAGT algorithm can finally be expressed as:
x_i' = Φ(x_i, AGT(x_i), rect) with probability p, and x_i' = x_i otherwise,
where Φ(·) assigns the pixels in the corresponding rectangle of the AGT image to the original image x_i, and x_i' is the LAGT-transformed sample;
the feature extraction module is used for inputting the preprocessed image data into the pre-built OSNet embedded with the cascaded self-attention module, extracting features, and outputting image features;
the embedded cascaded self-attention module includes a spatial self-attention module SSAM and a channel self-attention module CSAM;
the intermediate feature map extracted by SSAM from the preprocessed image data is F ∈ R^(C×H×W), where C is the number of feature channels and H × W is the size of the intermediate feature map; 1 × 1 convolution operations are performed on F to obtain A, B, D ∈ R^((C/r)×H×W), where r is the channel-reduction ratio; after reshaping A, B and D to R^(N×(C/r)) with N = H × W, the spatial self-attention affinity matrix S ∈ R^(N×N) is obtained, whose entries s_ji are computed as follows:
s_ji = exp(A_i · B_j) / Σ_{i=1}^{N} exp(A_i · B_j)
where s_ji represents the attention weight of the i-th spatial position on the j-th position; S is multiplied with D to embed the attention weights, and the result is superimposed pixel-wise on the original features to obtain the spatially self-attention-weighted feature map E_s:
E_s = α · Σ_{i=1}^{N} (s_ji · D_i) + F
where α is the hyperparameter adjusting the effect of SSAM; the spatially self-attention-weighted feature map E_s is then processed by CSAM;
for the input spatially self-attention-weighted feature map E_s, CSAM reshapes it to R^(C×N) and obtains the channel self-attention affinity matrix X ∈ R^(C×C), whose entries x_ji are computed as follows:
x_ji = exp(E_i · E_j) / Σ_{i=1}^{C} exp(E_i · E_j)
where x_ji represents the attention weight of channel i on channel j; for X, a matrix M of the same size is initialized whose values all equal the maximum of X, and the new channel self-attention affinity matrix is X' = M − X; X' is multiplied with E_s to embed the attention weights, and the result is superimposed on the corresponding position pixels of E_s to obtain the channel self-attention-weighted feature map E_c:
E_c = β · Σ_{i=1}^{C} (x'_ji · E_i) + E_s
where β is the hyperparameter adjusting the effect of CSAM;
the channel self-attention-weighted feature map E_c yields the image features after passing through the OSNet backbone network;
The image classification module is used for classifying the image features through the fully connected layer and mapping them onto corresponding category labels to obtain classified image features;
the model training module is used for calculating a label-smoothed identity loss using the classified image features, and optimizing a pre-established lightweight pedestrian re-recognition network model through back-propagated gradient updates to obtain an optimized lightweight pedestrian re-recognition network model;
the pedestrian re-recognition module is used for acquiring a test set of the pedestrian re-recognition dataset and inputting it into the optimized lightweight pedestrian re-recognition network model to output the lightweight pedestrian re-recognition result.
4. A terminal device comprising a memory, a processor, and a computer program stored in the memory and capable of running on the processor, characterized in that the processor, when loading and executing the computer program, employs the lightweight pedestrian re-recognition method based on random color discarding and attention as claimed in any one of claims 1 to 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410950010.5A CN118506407B (en) | 2024-07-16 | 2024-07-16 | Light pedestrian re-recognition method and system based on random color discarding and attention |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410950010.5A CN118506407B (en) | 2024-07-16 | 2024-07-16 | Light pedestrian re-recognition method and system based on random color discarding and attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118506407A CN118506407A (en) | 2024-08-16 |
CN118506407B true CN118506407B (en) | 2024-09-13 |
Family
ID=92229531
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410950010.5A Active CN118506407B (en) | 2024-07-16 | 2024-07-16 | Light pedestrian re-recognition method and system based on random color discarding and attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118506407B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117078967A (en) * | 2023-09-04 | 2023-11-17 | 石家庄铁道大学 | Efficient and lightweight multi-scale pedestrian re-identification method |
CN118115947A (en) * | 2024-03-07 | 2024-05-31 | 四川大学 | Cross-mode pedestrian re-identification method based on random color conversion and multi-scale feature fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273872B (en) * | 2017-07-13 | 2020-05-05 | 北京大学深圳研究生院 | Depth discrimination network model method for re-identification of pedestrians in image or video |
- 2024-07-16: Application CN202410950010.5A filed; granted as patent CN118506407B (status: active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117078967A (en) * | 2023-09-04 | 2023-11-17 | 石家庄铁道大学 | Efficient and lightweight multi-scale pedestrian re-identification method |
CN118115947A (en) * | 2024-03-07 | 2024-05-31 | 四川大学 | Cross-mode pedestrian re-identification method based on random color conversion and multi-scale feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN118506407A (en) | 2024-08-16 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |