CN114842051B - Unmanned aerial vehicle tracking model migration learning method based on depth attribution map - Google Patents
Unmanned aerial vehicle tracking model migration learning method based on depth attribution map
- Publication number: CN114842051B (application CN202210473138.8A)
- Authority
- CN
- China
- Prior art keywords
- layer
- model
- aerial vehicle
- unmanned aerial
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to an unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, belonging to the technical field of migration learning. The method comprises: collecting detection data; selecting a deep neural network pre-training model; constructing a forward propagation path and collecting the output features of each convolution layer of the pre-training model; calculating the similarity between different data points within the same feature and constructing an edge similarity sequence; constructing a node attribution value sequence; for each convolution layer in turn, calculating the cosine similarity between its node attribution values and those of the last layer, and the Spearman correlation coefficient between its edge similarities and those of the last layer; constructing a depth attribution map similarity function; obtaining the correlation coefficient of each convolution layer, setting a threshold, and screening out the correlation coefficients that exceed the threshold, the corresponding convolution layer serving as the critical point for model parameter fine tuning so that only the parameters after that layer are trained. The method is simple to operate, has a short training period, and does not require a large amount of image data.
Description
Technical Field
The invention relates to an unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, and belongs to the technical field of migration learning.
Background
Currently, mainstream target tracking algorithms are mainly designed to track arbitrary targets in a video sequence, and tracking a specific target depends chiefly on the generalization ability of the tracking algorithm, so applying such algorithms in a concrete practical scene often fails to give a satisfactory tracking result. In addition, deep learning usually requires a large number of labeled training samples whose data distribution matches that of the test samples, yet collecting a data set for a specific tracking scene is difficult and time-consuming, and training a deep neural network from scratch on such a data set easily leads to overfitting, so the resulting model has little practical value. Transfer learning theory provides an important method and path for solving this problem, but most work in the field still adopts the traditional model-based pre-training and fine-tuning approach; although this can yield a migrated model with good performance, selecting the fine-tuning critical point of the model requires a large number of experiments, which is time-consuming, occupies substantial computing resources, and is therefore not well suited to solving practical problems.
The literature (Maqsood M, Nazir F, Khan U, et al. Transfer learning assisted classification and detection of Alzheimer's disease stages using 3D MRI scans [J]. Sensors, 2019, 19(11): 2645.) proposes a method for detecting and classifying the stages of Alzheimer's disease with the aid of transfer learning. The method adopts an AlexNet network trained on the large image data set ImageNet, replaces the last three fully connected layers of the network with a softmax layer, a fully connected layer and an output layer, then trains the network on an Alzheimer's disease medical image data set using the pre-training and fine-tuning method of transfer learning, and finally realizes the detection of Alzheimer's disease. The method adopts pre-training and fine-tuning transfer learning, but the medical images must be strictly annotated, so it has considerable limitations.
The literature (Rathi D. Optimization of Transfer Learning for Sign Language Recognition Targeting Mobile Platform [J]. arXiv preprint arXiv:1805.06618, 2018.) proposes an American sign language recognition algorithm for mobile platforms, whose pre-training models are a MobileNet model and an Inception V model trained on ImageNet; the pre-training models are trained on Sign Language MNIST by transfer learning and deployed on the mobile platform. The method still uses only the traditional transfer learning approach, and the efficiency of transfer learning is not improved.
The literature (Nguyen D, Nguyen K, Sridharan S, et al. Meta transfer learning for facial emotion recognition [C] // 2018 24th International Conference on Pattern Recognition (ICPR). IEEE, 2018: 3543-3548.) proposes an algorithm for automatic facial expression recognition, which trains a system capable of automatically recognizing facial expressions on the SAVEE and ENTERFACE data sets using the meta transfer learning method PathNet, and overcomes the loss of prior knowledge in multiple cross-domain transfers caused by the scarcity of facial expression data sets and by the pre-training and fine-tuning transfer method. The method uses meta transfer learning and is not an efficient transfer learning method.
Disclosure of Invention
In view of these problems, the invention provides an unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, which can obtain the critical point for model migration fine tuning in a simple way, thereby improving the efficiency of migration learning.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
an unmanned aerial vehicle tracking model migration learning method based on a depth attribution map comprises the following steps:
S1: collecting image data containing a target unmanned aerial vehicle model as detection data D p, detection data D p={x1,x2,…xa,xn and n unmanned aerial vehicle image data;
S2: using a tracking model SiamRPN ++ which is trained by using the universal tracking data set as a deep neural network pre-training model m 1;
S3: constructing a forward propagation path, inputting detection data D p acquired in the step S1 into a deep neural network pre-training model m 1 in the step S2, calculating an output characteristic F k 1 of a convolution layer every time an image in the detection data D p passes through the convolution layer, and storing a result; through n convolution layers of the deep neural network pre-training model m 1, constructing a knowledge pool omega containing n output features, wherein omega= { F 1 1,F2 1,…,Fk 1,Fn 1 };
S4: calculating the similarity of the same output characteristic F k 1 between every two image data points in the detection data D p by using cosine similarity to obtain the similarity of edges
S5: constructing a reverse propagation path, inputting the detection data D p into a deep neural network pre-training model m 1, and calculating the attribution value of the input data x a for the characteristic output F k 1(xa) by using a gradient input modeThe node attribution value for obtaining the output characteristic of the layer is
S6: constructing a node attribution value sequence, and calculating the similarity of each layer of convolution and the last layer of convolution characteristic embedded space node attribution value in the deep neural network pretraining model m 1 by using cosine similarity
S7: constructing a similarity sequence of the feature embedding space according to the arrangement sequence of the convolution layers, and calculating the edge similarity of the feature embedding space of each layer of convolution and the final layer of convolutionAnd (3) withIs obtained as the correlation coefficient
S8: constructing a depth attribution map, wherein the similarity function is as followsThen according to the similarity function, the correlation coefficient r k of the depth attribution map of each layer convolution and the last layer convolution is obtained;
S9: setting a correlation coefficient threshold r set, comparing the correlation coefficient r k obtained by each calculation with r set, and reserving if the correlation coefficient is larger than or equal to the threshold, otherwise discarding;
s10: and taking the k layer where r k is positioned as a model parameter fine tuning critical point.
The technical scheme of the invention is further improved as follows: the detection data D_p in step S1 are randomly sampled from a self-made multi-rotor unmanned aerial vehicle data set; the unmanned aerial vehicle contained in this data set is the DJI drone model Mavic; the attributes of the self-made multi-rotor unmanned aerial vehicle data set include fast motion, background clutter, similar-object interference, deformation, occlusion, motion blur, illumination change, scale change and out-of-view, its scenes include cities, crowds, schools and beaches, and it also covers different viewing angles, different relative distances to the unmanned aerial vehicle and different flight attitudes.
The technical scheme of the invention is further improved as follows: the SiamRPN++ model m_1 in step S2 is a pre-training model trained for tracking tasks on a tracking data set comprising ILSVRC2015-DET, ILSVRC2015-VID, COCO2017 and YouTube-BoundingBoxes; the SiamRPN++ model comprises a feature extraction network ResNet-50 and a region proposal network RPN.
The technical scheme of the invention is further improved as follows: in step S3, the SiamRPN++ model m_1 is composed of a plurality of nonlinear primitive functions; when the forward propagation path is constructed, the output features F_k^1 of different convolution layers are selected according to the convolution layer structure of the model; each layer of the SiamRPN++ pre-training model contains several convolutions, and the output feature of the last convolution of each layer is obtained and retained as that layer's output.
The technical scheme of the invention is further improved as follows: in step S4, cosine similarity is used to calculate, over all image data points in the detection data, the similarity with respect to the output feature F_k^1, which can be expressed by the edges e_pq^k.
Here e_pq^k denotes the edge between the p-th node and the q-th node and expresses, through cosine similarity, the similarity between the features of the two nodes in the feature space F_k^1; the specific calculation is:
e_pq^k = ( F_k^1(x_p) · F_k^1(x_q) ) / ( ‖F_k^1(x_p)‖ · ‖F_k^1(x_q)‖ )
The technical scheme of the invention is further improved as follows: the specific way of calculating the attribution value of an input data node for the output feature with the gradient×input method in step S5 is:
For the pre-training model m_1, given one input data x_a ∈ D_p, the attribution value a_i^k of the i-th element of x_a for F_k^1(x_a) is calculated as:
a_i^k = x_{a,i} · ∂F_k^1(x_a) / ∂x_{a,i}
The technical scheme of the invention is further improved as follows: in step S7, the Spearman correlation coefficient is used to calculate the correlation coefficient between the edge similarities of each convolution layer's feature embedding space and those of the last convolution layer for the detection data, in the following way:
s_k^edge = 1 − 6 Σ_i d_i² / ( N (N² − 1) )
where d_i denotes the difference in rank of the i-th element between the edge similarity sequence of layer k and that of the last layer, and N is the length of the sequences.
By adopting the technical scheme, the invention has the following technical effects:
1) The migration fine tuning critical point of the unmanned aerial vehicle tracking model can be quickly found;
2) The required image data volume is small, the calculation time is short, and the calculation cost is reduced;
3) The training time is short, and the efficiency of the whole transfer learning process is improved.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of the structure of the SiamRPN++ model used by the present invention.
Detailed Description
The invention is further described in detail below with reference to the attached drawings and specific examples:
An unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, as shown in FIG. 1, comprises the following steps:
S1: by means of random sampling, 200 pieces of unmanned aerial vehicle data containing different attributes and different backgrounds are collected in Mavic multi-rotor unmanned aerial vehicle to serve as detection data D p, the detection data is represented as D p={x1,x2,…xa,xn, and n pieces of unmanned aerial vehicle image data are contained.
The detection data D_p are randomly sampled from the self-made multi-rotor unmanned aerial vehicle data set, and the DJI drone model contained in this data set is the Mavic. The data set covers multiple attributes such as fast motion, background clutter, similar-object interference, deformation, occlusion, motion blur, illumination change, scale change and out-of-view, multiple scenes such as cities, crowds, schools and beaches, as well as different viewing angles, relative distances to the unmanned aerial vehicle and flight attitudes, so the actual scenes the unmanned aerial vehicle tracker may need to handle are fully considered. When the detection data are collected by random sampling, attributes, backgrounds and other factors are fully taken into account to ensure the richness of the detection data.
S2: the completed tracking model SiamRPN ++ trained on the generic tracking dataset ILSVRC-DET, ILSVRC2015-VID, COCO2017, YOUTUBE-BoundingBoxes will be utilized as the deep neural network pre-training model m 1.
The SiamRPN++ model m_1 is a pre-training model trained for tracking tasks on a tracking data set comprising ILSVRC2015-DET, ILSVRC2015-VID, COCO2017 and YouTube-BoundingBoxes; the SiamRPN++ model comprises a feature extraction network ResNet-50 and a region proposal network RPN.
S3: constructing a forward propagation path, inputting detection data D p into a selected deep neural network pre-training model m 1, calculating an output characteristic F k 1 of a convolution layer every time an image in the detection data D p passes through the convolution layer, and storing a result; through n convolution layers of the pre-training model m 1, a knowledge pool Ω, Ω= { F 1 1,F2 1,…,Fk 1,Fn 1 } containing n output features F k 1 is constructed.
The SiamRPN++ model m_1 consists of a plurality of nonlinear primitive functions. When the forward propagation path is constructed, the output features F_k^1 of different convolution layers are selected according to the convolution layer structure of the model; each layer of the SiamRPN++ pre-training model contains several convolutions, and the output feature of the last convolution of each layer is obtained and retained as that layer's output.
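The patent does not provide source code; a minimal sketch of how these per-layer output features could be collected is given below, using forward hooks on a torchvision ResNet-50 as a stand-in for the SiamRPN++ backbone. The module names layer1 to layer4, the input size and the batch of random probe images are assumptions of this sketch, not anything specified in the text.

```python
# Sketch only: collect the output of the last convolutional block of each
# backbone stage via forward hooks. A torchvision ResNet-50 stands in for the
# SiamRPN++ feature extraction network described above.
import torch
import torchvision

model = torchvision.models.resnet50(weights=None).eval()

knowledge_pool = {}  # plays the role of Omega = {F_1^1, ..., F_k^1, ...}

def make_hook(name):
    def hook(module, inputs, output):
        # keep one feature vector per image by flattening the spatial dimensions
        knowledge_pool[name] = output.detach().flatten(start_dim=1)
    return hook

for name, stage in [("layer1", model.layer1), ("layer2", model.layer2),
                    ("layer3", model.layer3), ("layer4", model.layer4)]:
    stage.register_forward_hook(make_hook(name))

probe_batch = torch.randn(8, 3, 224, 224)  # stands in for the detection data D_p
with torch.no_grad():
    model(probe_batch)

for k, feat in knowledge_pool.items():
    print(k, tuple(feat.shape))
```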
S4: calculating the similarity of the same output characteristic F k 1 between every two image data points in the detection data D p by using cosine similarity to obtain the similarity of edgesCan be expressed as:
In the formula, Representing the edges of the p-th node and the q-th node, and expressing the similarity of the edges between the two nodes by using cosine similarity.
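As an illustration of this edge computation, the sketch below builds the full n×n cosine-similarity matrix for one feature space; the number of probe images and the feature dimension are placeholder values, not taken from the patent.

```python
# Sketch: pairwise cosine similarity between the features of all probe images in
# one embedding space F_k^1, i.e. the edge weights e_pq^k of that layer's graph.
import torch
import torch.nn.functional as F

def edge_similarity(features: torch.Tensor) -> torch.Tensor:
    """features: an (n, d) matrix with one row per image x_a in D_p."""
    normed = F.normalize(features, dim=1)  # divide each row by its L2 norm
    return normed @ normed.T               # (n, n) matrix of cosine similarities

edges_k = edge_similarity(torch.randn(200, 2048))  # 200 probe images, 2048-d features
```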
S5: constructing a reverse propagation path, inputting the detection data D p into a selected deep neural network pre-training model m 1, and calculating the attribution value of the input data x a aiming at the characteristic output F k 1(xa) by using a gradient input modeThe node attribution value for obtaining the output characteristic of the layer is
Specifically, the attribution value of an input data node for the output feature is calculated with the gradient×input method as follows:
For the pre-training model m_1, given one input data x_a ∈ D_p, the attribution value a_i^k of the i-th element of x_a for F_k^1(x_a) is calculated as:
a_i^k = x_{a,i} · ∂F_k^1(x_a) / ∂x_{a,i}
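A sketch of the gradient×input attribution follows. Because F_k^1(x_a) is a feature map rather than a scalar, the sketch sums it before back-propagating; that reduction, like the torchvision ResNet-50 stand-in and the random input, is an assumption of this example rather than something stated in the text.

```python
# Sketch: gradient x input attribution of an input image for one layer's feature
# output, a_i = x_i * dF_k(x)/dx_i. The feature map is summed to a scalar before
# backprop (an assumption of this sketch).
import torch
import torchvision

model = torchvision.models.resnet50(weights=None).eval()

def grad_times_input(net, layer_module, x):
    captured = {}
    handle = layer_module.register_forward_hook(
        lambda module, inputs, output: captured.update(feat=output))
    x = x.clone().requires_grad_(True)
    net(x)
    handle.remove()
    captured["feat"].sum().backward()  # scalar surrogate for F_k^1(x_a)
    return (x * x.grad).detach()       # element-wise gradient x input

attribution = grad_times_input(model, model.layer3, torch.randn(1, 3, 224, 224))
print(attribution.shape)               # same shape as the input image
```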
S6: constructing a node attribution value sequence, and calculating the similarity of each layer of convolution and the last layer of convolution characteristic embedded space node attribution value in the pre-training model m 1 by using cosine similarity
S7: constructing a similarity sequence of the feature embedding space according to the arrangement sequence of the convolution layers, and calculating the edge similarity of the feature embedding space of each layer of convolution and the final layer of convolutionAnd (3) withIs obtained as the correlation coefficientThe calculation method is as follows:
wherein d i represents AndIs the difference in the ith element order.
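A short sketch of the Spearman computation is given below, using scipy; flattening the two edge-similarity matrices into sequences before ranking is how this example interprets the step, and the random matrices are placeholders.

```python
# Sketch: Spearman rank correlation between the flattened edge-similarity matrix
# of layer k and that of the last layer (the RPN output in the example below).
import numpy as np
from scipy.stats import spearmanr

def edge_correlation(edges_k: np.ndarray, edges_last: np.ndarray) -> float:
    rho, _ = spearmanr(edges_k.ravel(), edges_last.ravel())
    return float(rho)

print(edge_correlation(np.random.rand(200, 200), np.random.rand(200, 200)))
```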
S8: constructing a depth attribution map, and constructing a similarity function expressed as: And solving a correlation coefficient r k of the depth attribution map of each layer convolution and the last layer convolution according to the similarity function.
S9: setting a correlation coefficient threshold r set, comparing the correlation coefficient r k obtained by each calculation with r set, reserving if the correlation coefficient is larger than or equal to the threshold, otherwise discarding, and sorting the reserved correlation coefficient according to the increment of the value, wherein r k can be selected as a final result.
S10: and taking the k layer where r k is positioned as a model parameter fine tuning critical point.
Example 1
An unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, as shown in FIG. 1, comprises the following steps:
S1: by means of random sampling, 200 pieces of unmanned aerial vehicle data containing different attributes and different backgrounds are collected in Mavic multi-rotor unmanned aerial vehicle to serve as detection data D p, the detection data is represented as D p={x1,x2,…xa,xn, and n pieces of unmanned aerial vehicle image data are contained.
The detection data are randomly sampled from the self-made multi-rotor unmanned aerial vehicle data set, which contains the DJI drone model Mavic. The data set covers multiple attributes such as fast motion, background clutter, similar-object interference, deformation, occlusion, motion blur, illumination change, scale change and out-of-view, multiple scenes such as cities, crowds, schools and beaches, as well as different viewing angles, relative distances to the unmanned aerial vehicle and flight attitudes, so the actual scenes the unmanned aerial vehicle tracker may need to handle are fully considered. When the detection data are collected by random sampling, attributes, backgrounds and other factors are fully taken into account to ensure the richness of the detection data.
S2: Referring to FIG. 2, the invention takes the SiamRPN++ model as the pre-training model m_1 for the unmanned aerial vehicle tracking task, where the SiamRPN++ model consists of a feature extraction network ResNet-50 and a region proposal network RPN. The ResNet-50 network is built from residual modules and adopts the Bottleneck structure; it can be divided into 5 convolution layers, where layer 1 contains 1 convolution, layer 2 contains 3 convolutions, layer 3 contains 4 convolutions, layer 4 contains 6 convolutions and layer 5 contains 3 convolutions.
S3: The output feature of the last convolution of each convolution layer is adopted as F_k^1, the output feature knowledge pool is constructed as Ω = {F_1^1, F_2^1, F_3^1, F_4^1, F_5^1}, and the final output feature of the RPN part is taken as F_e^1.
S4: Cosine similarity is used to calculate the similarity of the detection data D_p's features in each of layers 1 to 5, and the edge similarity matrices E_1 to E_5 are constructed; the edge similarity E_e of the RPN part is calculated in the same way.
S5: The gradient×input method is used to calculate the node attribution values of the detection data's output features in each of layers 1 to 5, and the node attribution value sequences A_1 to A_5 are constructed; the node attribution values A_e of the RPN part are calculated in the same way.
S6: Cosine similarity is used to calculate the similarity s_k^node between the node attribution values of each convolution layer's output feature and the node attribution values of the RPN part's output feature, and the results are retained.
S7: The Spearman correlation coefficient is used to calculate the correlation coefficient s_k^edge between each layer's output-feature edge similarities and the RPN output-feature edge similarities in the edge similarity sequence, and the results are retained.
S8: The depth attribution map is constructed from s_k^node and s_k^edge, and the similarity function is calculated to obtain the correlation coefficient r_k.
S9: Depending on the amount of training data, λ is set differently: with λ = 1, 10% of the unmanned aerial vehicle data are used for transfer learning, and with λ = 0.01, 1% of the data are used; the corresponding correlation coefficients r_k are then obtained.
TABLE 1 depth attribution map correlation coefficient
Table 1 lists the depth attribution map correlation coefficient between each convolution layer and the RPN output feature. The correlation coefficient is found to increase essentially layer by layer, and apart from layer 1 the correlation coefficients of the other layers do not differ markedly; this is because the ResNet-50 network is a deep neural network and can already learn strongly correlated features in its shallower layers. A threshold can therefore be set on the correlation coefficient to select the critical point for model migration fine tuning.
S10: When λ = 1, the threshold r_set = 1 is selected, and the correlation coefficients of layers 2, 3, 4 and 5 are all larger than the threshold, so the corresponding k-th layer can be selected as the critical point for model parameter fine tuning; similarly, when λ = 0.01, the threshold r_set = 5 is selected, and the critical point for fine tuning can likewise be chosen from layers 2, 3, 4 and 5.
Claims (6)
1. An unmanned aerial vehicle tracking model migration learning method based on a depth attribution map, characterized by comprising the following steps:
S1: collecting image data containing a target unmanned aerial vehicle model as detection data D p, detection data D p={x1,x2,…xa,xn and n unmanned aerial vehicle image data;
S2: using a tracking model SiamRPN ++ which is trained by using the universal tracking data set as a deep neural network pre-training model m 1;
s3: constructing a forward propagation path, inputting detection data D p acquired in the step S1 into a deep neural network pre-training model m 1 in the step S2, and calculating the output characteristics of a convolution layer once when an image in the detection data D p passes through the convolution layer And saving the result; through n convolution layers of the deep neural network pre-training model m 1, a knowledge pool omega containing n output features is constructed,
S4: calculating the similarity of the same output characteristic F k 1 between every two image data points in the detection data D p by using cosine similarity to obtain the similarity of edges
S5: constructing a reverse propagation path, inputting the detection data D p into a deep neural network pre-training model m 1, and calculating the characteristic output of input data x a by using a gradient input modeIs the value of the cause of (2)The node attribution value for obtaining the output characteristic of the layer isThe specific way of calculating the attribution value of the input data node to the output characteristic by using the gradient input mode is as follows:
For a deep neural network pre-training model m 1, one input data is given Then calculate the ith element pair in x a Is the value of the cause of (2)The calculation method is as follows:
S6: constructing a node attribution value sequence, and calculating the similarity of each layer of convolution and the last layer of convolution characteristic embedded space node attribution value in the deep neural network pretraining model m 1 by using cosine similarity
S7: constructing a similarity sequence of the feature embedding space according to the arrangement sequence of the convolution layers, and calculating the edge similarity of the feature embedding space of each layer of convolution and the final layer of convolutionAnd (3) withIs obtained as the correlation coefficient
S8: constructing a depth attribution map, wherein the similarity function is as followsThen according to the similarity function, the correlation coefficient r k of the depth attribution map of each layer convolution and the last layer convolution is obtained;
S9: setting a correlation coefficient threshold r set, comparing the correlation coefficient r k obtained by each calculation with r set, and reserving if the correlation coefficient is larger than or equal to the threshold, otherwise discarding;
s10: and taking the k layer where r k is positioned as a model parameter fine tuning critical point.
2. The unmanned aerial vehicle tracking model migration learning method based on the depth attribution map according to claim 1, characterized in that: the detection data D_p in step S1 are randomly sampled from a self-made multi-rotor unmanned aerial vehicle data set; the unmanned aerial vehicle contained in this data set is the DJI drone model Mavic; the attributes of the self-made multi-rotor unmanned aerial vehicle data set include fast motion, background clutter, similar-object interference, deformation, occlusion, motion blur, illumination change, scale change and out-of-view, its scenes include cities, crowds, schools and beaches, and it also covers different viewing angles, different relative distances to the unmanned aerial vehicle and different flight attitudes.
3. The unmanned aerial vehicle tracking model migration learning method based on the depth attribution map according to claim 1, characterized in that: the SiamRPN++ model m_1 in step S2 is a pre-training model trained for tracking tasks on a tracking data set comprising ILSVRC2015-DET, ILSVRC2015-VID, COCO2017 and YouTube-BoundingBoxes; the SiamRPN++ model comprises a feature extraction network ResNet-50 and a region proposal network RPN.
4. The unmanned aerial vehicle tracking model migration learning method based on the depth attribution map according to claim 1, characterized in that: in step S3, the SiamRPN++ model m_1 is composed of a plurality of nonlinear primitive functions; when the forward propagation path is constructed, the output features F_k^1 of different convolution layers are selected according to the convolution layer structure of the model; each layer of the SiamRPN++ model contains several convolutions, and the output feature of the last convolution of each layer is obtained and retained as that layer's output.
5. The unmanned aerial vehicle tracking model migration learning method based on the depth attribution map according to claim 1, characterized in that: in step S4, cosine similarity is used to calculate, over all image data points in the detection data, the similarity with respect to the output feature F_k^1, which can be expressed by the edges e_pq^k, where e_pq^k denotes the edge between the p-th node and the q-th node and expresses, through cosine similarity, the similarity between the features of the two nodes in the feature space F_k^1; the specific calculation is:
e_pq^k = ( F_k^1(x_p) · F_k^1(x_q) ) / ( ‖F_k^1(x_p)‖ · ‖F_k^1(x_q)‖ )
6. The unmanned aerial vehicle tracking model migration learning method based on the depth attribution map according to claim 1, characterized in that: in step S7, the Spearman correlation coefficient is used to calculate the correlation coefficient between the edge similarities of each convolution layer's feature embedding space and those of the last convolution layer for the detection data, in the following way:
s_k^edge = 1 − 6 Σ_i d_i² / ( N (N² − 1) )
where d_i denotes the difference in rank of the i-th element between the edge similarity sequence of layer k and that of the last layer, and N is the length of the sequences.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210473138.8A CN114842051B (en) | 2022-04-29 | 2022-04-29 | Unmanned aerial vehicle tracking model migration learning method based on depth attribution map |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210473138.8A CN114842051B (en) | 2022-04-29 | 2022-04-29 | Unmanned aerial vehicle tracking model migration learning method based on depth attribution map |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114842051A CN114842051A (en) | 2022-08-02 |
CN114842051B (en) | 2024-11-08
Family
ID=82566971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210473138.8A Active CN114842051B (en) | 2022-04-29 | 2022-04-29 | Unmanned aerial vehicle tracking model migration learning method based on depth attribution map |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114842051B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796232A (en) * | 2019-10-12 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Attribute prediction model training method, attribute prediction method and electronic equipment |
CN111091179A (en) * | 2019-12-03 | 2020-05-01 | 浙江大学 | Heterogeneous depth model mobility measurement method based on attribution graph |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273872B (en) * | 2017-07-13 | 2020-05-05 | 北京大学深圳研究生院 | Depth discrimination network model method for re-identification of pedestrians in image or video |
CN114329029B (en) * | 2021-10-28 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Object retrieval method, device, equipment and computer storage medium |
-
2022
- 2022-04-29 CN CN202210473138.8A patent/CN114842051B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796232A (en) * | 2019-10-12 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Attribute prediction model training method, attribute prediction method and electronic equipment |
CN111091179A (en) * | 2019-12-03 | 2020-05-01 | 浙江大学 | Heterogeneous depth model mobility measurement method based on attribution graph |
Also Published As
Publication number | Publication date |
---|---|
CN114842051A (en) | 2022-08-02 |
Similar Documents
Publication | Title |
---|---|
Wang et al. | RSNet: The search for remote sensing deep neural networks in recognition tasks |
CN110443818B | Graffiti-based weak supervision semantic segmentation method and system |
CN110956185B | Method for detecting image salient object |
CN106909924B | Remote sensing image rapid retrieval method based on depth significance |
CN111126360B | Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model |
CN113065558A | Lightweight small target detection method combined with attention mechanism |
CN108764006B | SAR image target detection method based on deep reinforcement learning |
CN108052966A | Remote sensing images scene based on convolutional neural networks automatically extracts and sorting technique |
CN106815323B | Cross-domain visual retrieval method based on significance detection |
CN107025440A | A kind of remote sensing images method for extracting roads based on new convolutional neural networks |
CN113807188B | Unmanned aerial vehicle target tracking method based on anchor frame matching and Siamese network |
CN111582091B | Pedestrian recognition method based on multi-branch convolutional neural network |
CN113240697B | Lettuce multispectral image foreground segmentation method |
CN111027627A | Vibration information terrain classification and identification method based on multilayer perceptron |
CN115049841A | Depth unsupervised multistep anti-domain self-adaptive high-resolution SAR image surface feature extraction method |
Han et al. | Research on remote sensing image target recognition based on deep convolution neural network |
CN114495170A | Pedestrian re-identification method and system based on local self-attention inhibition |
Jiang et al. | Arbitrary-shaped building boundary-aware detection with pixel aggregation network |
Kampffmeyer et al. | Dense dilated convolutions merging network for semantic mapping of remote sensing images |
Wang et al. | Detecting occluded and dense trees in urban terrestrial views with a high-quality tree detection dataset |
Sjahputera et al. | Clustering of detected changes in high-resolution satellite imagery using a stabilized competitive agglomeration algorithm |
CN108830172A | Aircraft remote sensing images detection method based on depth residual error network and SV coding |
Wang et al. | Big Map R-CNN for object detection in large-scale remote sensing images |
CN114842051B | Unmanned aerial vehicle tracking model migration learning method based on depth attribution map |
Yin et al. | M2F2-RCNN: Multi-functional faster RCNN based on multi-scale feature fusion for region search in remote sensing images |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |