CN117437557A - Hyperspectral image classification method based on double-channel feature enhancement - Google Patents
Hyperspectral image classification method based on double-channel feature enhancement
- Publication number
- CN117437557A (application number CN202210812607.4A)
- Authority
- CN
- China
- Prior art keywords
- feature
- channel
- hyperspectral image
- feature map
- spatial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/17—Terrestrial scenes taken from planes or by drones
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/58—Extraction of image or video features relating to hyperspectral data
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/64—Three-dimensional objects
Abstract
The application discloses a hyperspectral image classification method based on dual-channel feature enhancement (DCFE). Aiming at the problem of how to more fully extract and utilize the spatial and spectral information of a hyperspectral image when training samples are limited, the invention designs two channels that capture spectral and spatial features respectively, using a three-dimensional convolution as the feature extractor in each channel. The feature map in the spectral channel then undergoes a dimension-reduction operation and is fused with the feature map of the spatial channel. Finally, the feature map integrating the spectral and spatial features is input into an attention module, which realizes feature enhancement by raising the attention paid to important information and reducing the interference of useless information. Experimental results on four hyperspectral datasets show that the method achieves good classification performance.
Description
Technical Field
The invention relates to the field of remote sensing images, in particular to a hyperspectral image classification method based on double-channel feature enhancement.
Background
A hyperspectral image (Hyperspectral Image, HSI), also called a hyperspectral remote sensing image, is a three-dimensional image captured by an air- or space-borne platform carrying a hyperspectral imager. It consists of two spatial dimensions and one spectral dimension, the latter comprising tens to hundreds of spectral bands, which gives hyperspectral images broad application prospects in fields such as land-cover analysis, water monitoring, anomaly detection, and change detection. Hyperspectral classification techniques that are both fast and highly accurate therefore promise substantial benefits to society.
The goal of hyperspectral image classification is to assign a class label to each pixel in the image based on sample characteristics. Early research proposed methods such as the Support Vector Machine (SVM), Sparse Representation Classification (SRC), and Multinomial Logistic Regression (MLR). These methods, however, use only the information of the spectral dimension, while a hyperspectral image carries high spatial correlation alongside its rich spectral characteristics. Feature extraction is therefore incomplete, and it is difficult to learn a highly accurate classifier when samples are few.
Deep learning excels at extracting nonlinear, hierarchical features and has achieved major breakthroughs in image classification, natural language processing, object detection, and related fields. Hyperspectral image classification is a typical classification task and has been deeply influenced by deep learning. Chen et al. proposed extracting high-order features of hyperspectral image data with a stacked autoencoder (SAE) and obtaining classification results with logistic regression. Makantasis et al. proposed extracting spatial and spectral features jointly using randomized principal component analysis (R-PCA). Chen et al. proposed a classification method based on a Deep Belief Network (DBN) and a Restricted Boltzmann Machine (RBM). Zhao et al. applied a CNN as a feature extractor for hyperspectral image classification. Zhang et al. proposed a diverse-region convolutional neural network (DRCNN) that uses different image blocks within the target pixel's neighborhood as CNN inputs, effectively enhancing the input data. Lee et al. proposed a Contextual Deep Convolutional Neural Network (CDCNN) with a deeper and wider network. He et al. proposed a lightweight fused CNN, 3D-2D-1D CNN, which effectively improves processing speed while maintaining high classification accuracy. Although these methods are effective, their extraction and utilization of the spatial and spectral information of hyperspectral images remain insufficient, so better classification cannot be obtained when training samples are limited.
To address the insufficient extraction and utilization of spatial and spectral information in hyperspectral image classification, a hyperspectral image classification method based on dual-channel feature enhancement (DCFE) is proposed. The method comprises a spectral channel and a spatial channel; within each channel, a multi-branch 3D convolutional neural network captures the spectral or spatial features, the output features of the two channels undergo feature fusion, and the features in the fused feature map are enhanced through Coordinate Attention (CA). Finally, the classification result is obtained through a fully connected layer (FC).
Disclosure of Invention
The hyperspectral image classification method based on dual-channel feature enhancement extracts the features of the two dimensions of a hyperspectral image through two separate channels, uses attention to assign weights so that useful features are enhanced and useless features are suppressed, and thereby extracts and utilizes the spatial and spectral information more fully, achieving a better classification effect.
To achieve the above object, the present application provides the following solutions:
A hyperspectral image classification method based on dual-channel feature enhancement comprises the following steps:
S1: preprocessing the original image (pixel-point cropping and padding) and dividing the data.
S2: extracting the spectral features and spatial features of the hyperspectral image through 3D convolution in two channels.
S3: reducing the dimension of the spectral feature map extracted in S2 and fusing it with the extracted spatial feature map.
S4: inputting the feature map fusing the spectral and spatial features obtained in S3 into an attention module to realize feature enhancement.
S5: classifying the hyperspectral image using the dual-channel feature-enhancement method.
Preferably, the raw hyperspectral image data are preprocessed before being input into the DCFE model: the image is first cut into 11×11×Band cubes (Band being the spectral dimension of the hyperspectral image) centered on each class-labeled pixel, with a padding operation applied to edge pixels.
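For illustration only, the following NumPy sketch performs the cropping-and-padding step just described; the function name and the assumption that class labels are 1-based with 0 as background are ours, not the patent's.

```python
import numpy as np

def extract_patches(hsi, labels, patch=11):
    """Cut patch x patch x Band cubes centered on each labeled pixel,
    zero-padding the spatial border so edge pixels get full-size cubes."""
    half = patch // 2
    padded = np.pad(hsi, ((half, half), (half, half), (0, 0)), mode="constant")
    cubes, targets = [], []
    for r, c in zip(*np.nonzero(labels)):            # labeled pixels only
        cubes.append(padded[r:r + patch, c:c + patch, :])
        targets.append(labels[r, c] - 1)             # assume labels start at 1
    return np.asarray(cubes, dtype=np.float32), np.asarray(targets)
```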
Preferably, the spectral and spatial feature extraction method is as follows: in the spectral channel, the 11×11×Band image sample is first convolved by a convolution layer into an 11×11×97 feature map and then input into a spectral block consisting of 3 multi-branch blocks, each composed of a convolution layer, a batch normalization layer, and a Mish activation function. In the spatial channel, the 11×11×200 image sample is first convolved by a convolution layer into an 11×11×1 feature map and then input into a spatial block likewise consisting of 3 multi-branch blocks, each composed of a convolution layer, a batch normalization layer, and a Mish activation function.
The network that extracts the spectral and spatial features within each channel is composed of several multi-branch blocks. In each multi-branch block, the input passes through a normalization layer and an activation layer and then enters a convolution layer with m convolution kernels of size a×a×d for the 3D convolution operation; the feature map output by the convolution layer is added element-wise to the feature maps of the two residual branches, and the result serves as the input of the next multi-branch block.
Preferably, the dimension-reduction method for the spectral feature map is as follows: in the spectral channel, the output spectral feature map has size 11×11×97; a convolution layer with a kernel of size 1×1×97 performs the dimension-reduction operation and outputs a feature map matching the dimensions of the spatial channel; finally, the spatial feature map and the spectral feature map are fused along the third dimension.
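A quick PyTorch shape check of this dimension-reduction and fusion step (a sketch; tensors follow PyTorch's N×C×D×H×W layout with the spectral dimension as depth, and the 24-kernel width is taken from the Fig. 5 description):

```python
import torch
import torch.nn as nn

# A kernel spanning the full spectral depth (1x1x97 in the text's
# H x W x Band notation) collapses the 11x11x97 spectral map to 11x11x1
# so it matches the spatial channel, after which the maps are concatenated.
reduce = nn.Conv3d(24, 24, kernel_size=(97, 1, 1))   # depth-first kernel order
spec = torch.randn(16, 24, 97, 11, 11)               # spectral-channel output
spat = torch.randn(16, 24, 1, 11, 11)                # spatial-channel output
fused = torch.cat([reduce(spec), spat], dim=1)       # -> 16 x 48 x 1 x 11 x 11
```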
Preferably, the feature enhancement method for the feature map is as follows: unlike the attention modules used in other methods, which can only establish one-directional relationships over the feature map, the feature map fusing the spectral and spatial features is input into the CA module. CA embeds the spatial information of the feature map into channel attention through a spatial-information encoding scheme, so the relationships between the channels of the feature map can be established using spatial information.
In CA, to preserve spatial information, global pooling is first decomposed into two one-dimensional feature-encoding operations: each channel is encoded along the horizontal and vertical coordinates using two pooling kernels of sizes (H, 1) and (1, W), respectively, generating a pair of direction-aware feature maps. The generated feature maps are then fused and input into a convolution layer with C/r convolution kernels of size 1×1 for reduction and activation, producing an output of size C/r×1×(H+W). This output is split along the spatial dimension, each part undergoes its own convolution operation, and both are converted into tensors with the same number of channels as the original input. Finally, the two tensors are used as attention weights and multiplied with the original input to realize feature enhancement.
Preferably, the training method of the hyperspectral image classification method based on the dual-channel feature enhancement is as follows:
all experiments of the invention were run on a system of Intel (R) Xeon (R) 4208CPU @ 2.10GHz processor, nvidia GeForce RTX 2060Ti graphics card, all classifiers were implemented using pyrerch, batch size was set to 16, optimizer using RMSprop, learning rate initial value of 0.00008, adjusting learning rate using cosine annealing, loss function using cross entropy loss function. In order to verify the feasibility of the method, the invention performs a comparison test on four public data sets with other five hyperspectral image classification methods.
The beneficial effects of this application are:
1) The invention proposes a multi-branch structure combined with 3D convolution to extract spectral and spatial features separately. This not only extracts and utilizes the spectral and spatial information of the hyperspectral image more fully, but also alleviates the gradient-vanishing problem in deep networks, accelerates network training and convergence, prevents overfitting, and improves the quality of the results.
2) The invention proposes a method of embedding spatial information into channel attention, which captures cross-channel information as well as spatial information. In this way, useful information in the feature map fused from the two channels is enhanced and redundant information is suppressed.
3) The dual-channel feature enhancement network of the invention achieves state-of-the-art classification accuracy on four datasets with limited training samples.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the following brief description is given to the accompanying drawings:
FIG. 1 is a diagram of a 3D-CNN structure incorporating batch normalization
FIG. 2 is a Coordinate Attention architecture diagram
FIG. 3 is a diagram of a multi-branch structure
FIG. 4 is a diagram of a multi-branch block structure
Fig. 5 is a diagram of DCFE network architecture
FIG. 6 is a diagram of classification results for IP datasets
FIG. 7 is a diagram of classification results for UP data sets
FIG. 8 is a diagram of classification results of SV data sets
FIG. 9 is a diagram of classification results of BS data sets
Fig. 10 is a flow chart of DCFE network
Detailed Description
The following description of the invention is illustrative in nature and is not to be construed as limiting the scope of the invention.
FIG. 1 is a diagram of the 3D-CNN structure with batch normalization. In the convolution process, the 3D-CNN performs the convolution operation along the width, height, and channel directions, so the spectral and spatial information in a hyperspectral image sample can be extracted directly. Thus, 3D-CNN is used here as the basic structure of the DCFE method. In addition, a batch normalization (BN) layer is added to each 3D-CNN layer to improve numerical stability.
As shown in fig. 1, the input consists of $n_k$ feature maps of size $p_k \times p_k \times b_k$, which pass through $n_{k+1}$ convolution kernels of size $a_{k+1} \times a_{k+1} \times d_{k+1}$ to generate $n_{k+1}$ feature maps of size $p_{k+1} \times p_{k+1} \times b_{k+1}$. The $i$-th output of the batch-normalized $(k+1)$-th 3D-CNN layer is expressed as:

$$\hat{X}_k^j = \frac{X_k^j - E(X_k^j)}{\sqrt{Var(X_k^j)}} \quad (1)$$

$$X_{k+1}^i = R\Big(\sum_{j=1}^{n_k} \hat{X}_k^j * W_{k+1}^{i,j} + b_{k+1}^i\Big) \quad (2)$$

where $\hat{X}_k^j$ is the $j$-th input feature map of the $(k+1)$-th layer, $X_k^j$ is the final output of the $k$-th layer, $E(\cdot)$ and $Var(\cdot)$ are the expectation and variance functions, $W_{k+1}^{i,j}$ and $b_{k+1}^i$ are the weight and bias of the $(k+1)$-th layer, $*$ denotes the 3D convolution operation, and $R(\cdot)$ is a nonlinear activation function in the network.
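In PyTorch terms, one such batch-normalized 3D convolution layer can be sketched as follows (an illustration, not the patent's verbatim implementation; the Mish activation matches the activation chosen elsewhere in the model):

```python
import torch.nn as nn

class ConvBNMish(nn.Module):
    """3D convolution + batch normalization + Mish: the basic 3D-CNN unit
    of Fig. 1 (a sketch under stated assumptions)."""
    def __init__(self, in_ch, out_ch, kernel):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel)
        self.bn = nn.BatchNorm3d(out_ch)   # normalization of eq. (1)
        self.act = nn.Mish()               # R(.) of eq. (2)

    def forward(self, x):                  # x: N x C x D x H x W
        return self.act(self.bn(self.conv(x)))
```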
Fig. 2 is a Coordinate Attention architecture diagram. As shown in fig. 2, to preserve spatial information, global pooling is decomposed into two one-dimensional feature-encoding operations. Specifically, given an input $X$ of size $C \times H \times W$, each channel is encoded along the horizontal and vertical coordinates using two pooling kernels of sizes $(H, 1)$ and $(1, W)$, respectively. The output of the $c$-th channel at height $h$ and the output of the $c$-th channel at width $w$ can thus be expressed as:

$$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i) \quad (3)$$

$$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w) \quad (4)$$

where $x_c$ denotes the input. The two transforms aggregate features along the two spatial directions, generating a pair of direction-aware feature maps. The feature maps produced by Eqs. (3) and (4) are fused and then input into a convolution layer with $C/r$ convolution kernels of size $1 \times 1$, whose output can be expressed as:

$$f = \delta(F_1([z^h, z^w])) \quad (5)$$

where $f \in \mathbb{R}^{C/r \times 1 \times (H+W)}$, $[\cdot,\cdot]$ denotes the concatenation operation along the spatial dimension, $\delta$ is a nonlinear activation function, and $r$ is the reduction rate used to control the number of channels. Then $f$ is split along the spatial dimension into $f^h \in \mathbb{R}^{C/r \times 1 \times H}$ and $f^w \in \mathbb{R}^{C/r \times 1 \times W}$; each undergoes its own convolution operation and is converted to the same number of channels as the input $X$, yielding the outputs $g^h$ and $g^w$:

$$g^h = \sigma(F_h(f^h)) \quad (6)$$

$$g^w = \sigma(F_w(f^w)) \quad (7)$$

where $F_h$ and $F_w$ denote convolution operations and $\sigma$ denotes the Sigmoid activation function. The outputs $g^h$ and $g^w$ are then expanded and used as attention weights, and the final output can be expressed as:

$$y_c(i,j) = x_c(i,j) \times g_c^h(i) \times g_c^w(j) \quad (8)$$

where $x_c$ denotes the input and $g_c^h$ and $g_c^w$ denote the attention weights.
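The following PyTorch sketch implements Eqs. (3)–(8); the choice of Mish for δ and the floor of 8 on the reduced channel width are our assumptions, since the text only describes δ as a nonlinear activation function:

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate Attention block of Fig. 2 (a sketch)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 8)   # C/r, floored (assumption)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Mish()                  # delta in eq. (5); assumed
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):                                     # x: N x C x H x W
        n, c, h, w = x.shape
        z_h = x.mean(dim=3, keepdim=True)                     # eq. (3): N x C x H x 1
        z_w = x.mean(dim=2, keepdim=True)                     # eq. (4): N x C x 1 x W
        y = torch.cat([z_h, z_w.transpose(2, 3)], dim=2)      # concat along space
        y = self.act(self.bn(self.conv1(y)))                  # eq. (5)
        f_h, f_w = y.split([h, w], dim=2)                     # split f^h, f^w
        g_h = torch.sigmoid(self.conv_h(f_h))                 # eq. (6): N x C x H x 1
        g_w = torch.sigmoid(self.conv_w(f_w.transpose(2, 3))) # eq. (7): N x C x 1 x W
        return x * g_h * g_w                                  # eq. (8), broadcast
```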
Fig. 3 is a multi-branch structure diagram. As shown in fig. 3, $F$ represents a hidden layer containing a convolution layer, a normalization layer, and an activation layer; $G$ represents a convolution layer with a kernel size of 1×1; and the identity branch passes the input directly to the next layer. It is the presence of these branches that makes the extracted features finer and alleviates the gradient-vanishing problem in deep networks. The output of the $(i-1)$-th multi-branch block in the figure can be expressed as:

$$X_i = F_{i-1}(X_{i-1}) + G_{i-1}(X_{i-1}) + X_{i-1} \quad (9)$$

where $F(\cdot)$ represents the convolution, normalization, and activation operations and $G(\cdot)$ represents the convolution operation. The network that extracts the spectral and spatial features within each channel is composed of several such multi-branch blocks.
Fig. 4 is a diagram of the multi-branch block structure. As shown in fig. 4, assuming n feature maps of size p×p×b are input, they pass through the normalization layer and the activation layer and then enter a convolution layer with m convolution kernels of size a×a×d for the 3D convolution operation; the feature map after the convolution layer is added element-wise to the feature maps of the two residual branches, and the result is taken as the input of the next multi-branch block, as sketched below.
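A sketch of one multi-branch block under the constraint m = n, which the element-wise sum of Eq. (9) requires; the 3×3×3 kernel and 'same' padding are illustrative assumptions:

```python
import torch.nn as nn

class MultiBranchBlock(nn.Module):
    """Multi-branch block of Fig. 4: pre-activation branch F (BN + Mish +
    3D conv), pointwise branch G, and an identity branch, summed per eq. (9)."""
    def __init__(self, channels, kernel=(3, 3, 3)):
        super().__init__()
        pad = tuple(k // 2 for k in kernel)        # 'same' padding (assumed)
        self.f = nn.Sequential(
            nn.BatchNorm3d(channels),
            nn.Mish(),
            nn.Conv3d(channels, channels, kernel, padding=pad),
        )
        self.g = nn.Conv3d(channels, channels, 1)  # pointwise branch G

    def forward(self, x):
        return self.f(x) + self.g(x) + x           # eq. (9)
```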
Fig. 5 is the DCFE network architecture diagram. As shown in fig. 5, the DCFE model architecture is described taking the Indian Pines (IP) dataset as an example. Indian Pines contains 145×145 pixels, each with 200 spectral bands; that is, the Indian Pines dataset has size 145×145×200. 10249 pixels carry class labels and the remaining pixels are background. Each input sample is of size 11×11×200, and the number of convolution kernels in each convolution layer is fixed at 24.
In the spectral channel, the 11×11×200 image sample is first convolved by a convolution layer into an 11×11×97 feature map and then input into the spectral block, which consists of 3 multi-branch blocks, each composed of a convolution layer, a batch normalization layer, and a Mish activation function.
In the spatial channel, the 11×11×200 image sample is first convolved by a convolution layer into an 11×11×1 feature map and then input into the spatial block, which likewise consists of 3 multi-branch blocks, each composed of a convolution layer, a batch normalization layer, and a Mish activation function.
The output feature map of the spectral channel is fused with the feature map of the spatial channel and input into the attention module; the computed attention weights are applied to the fused feature map, a pooling operation yields a 1×1×48 feature map, and the classification result is finally obtained through the fully connected layer.
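Putting the pieces together, the Fig. 5 pipeline for an 11×11×200 Indian Pines sample can be sketched as follows, reusing `ConvBNMish`, `MultiBranchBlock`, and `CoordinateAttention` from the sketches above; the stem kernel depths (104 and 200) are our assumptions, chosen so the spectral stem yields depth 97 and the spatial stem depth 1:

```python
import torch
import torch.nn as nn

class DCFE(nn.Module):
    """End-to-end sketch of the DCFE pipeline (Fig. 5), not the patent's
    verbatim network. Input: N x 1 x 200 x 11 x 11 sample cubes."""
    def __init__(self, num_classes=16, bands=200):
        super().__init__()
        self.spec_stem = ConvBNMish(1, 24, (bands - 96, 1, 1))  # depth 200 -> 97
        self.spat_stem = ConvBNMish(1, 24, (bands, 1, 1))       # depth 200 -> 1
        self.spec_block = nn.Sequential(*[MultiBranchBlock(24) for _ in range(3)])
        self.spat_block = nn.Sequential(*[MultiBranchBlock(24) for _ in range(3)])
        self.reduce = nn.Conv3d(24, 24, (97, 1, 1))             # depth 97 -> 1
        self.ca = CoordinateAttention(48)
        self.fc = nn.Linear(48, num_classes)

    def forward(self, x):
        spec = self.reduce(self.spec_block(self.spec_stem(x)))  # N x 24 x 1 x 11 x 11
        spat = self.spat_block(self.spat_stem(x))               # N x 24 x 1 x 11 x 11
        fused = torch.cat([spec, spat], dim=1).squeeze(2)       # N x 48 x 11 x 11
        y = self.ca(fused)                                      # feature enhancement
        y = y.mean(dim=(2, 3))                                  # pool to N x 48
        return self.fc(y)                                       # classification
```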
In a network structure, a suitable activation function accelerates back-propagation and convergence. In the DCFE model, normalization and activation operations follow every convolution operation, and the selected activation function is the Mish function, a self-regularized non-monotonic activation function, rather than the traditional ReLU. Mish is computed as:

mish(x) = x × tanh(ln(1 + e^x))   (10)

where x is the input of the activation function. ReLU is a piecewise linear function that maps all negative inputs to zero, even when those inputs contain useful information. In contrast, the Mish function maps negative inputs to small negative outputs, so some useful information is retained.
Fig. 6 is a classification result diagram of an IP dataset.
Fig. 7 is a classification result diagram of the UP dataset.
Fig. 8 is a classification result diagram of SV data sets.
Fig. 9 is a classification result diagram of a BS data set.
Claims (5)
1. A hyperspectral image classification method based on dual-channel feature enhancement, characterized by comprising the following steps:
S1: preprocessing the original image (pixel-point cropping and padding) and dividing the data.
S2: extracting the spectral features and spatial features of the hyperspectral image.
S3: reducing the dimension of the spectral feature map extracted in S2 and fusing it with the extracted spatial feature map.
S4: inputting the feature map fusing the spectral and spatial features obtained in S3 into an attention module to realize feature enhancement, and finally obtaining the classification result through a fully connected layer.
2. The hyperspectral image classification method based on dual-channel feature enhancement according to claim 1, wherein the process of S1 is:
The original hyperspectral image data are loaded, the class-labeled pixels in the image are cropped with a specified crop size of 11×11, and the edge pixels are padded. The cropped samples are divided into a training set, a verification set, and a test set: the training set and its labels are used to update the network parameters, the verification set and its labels are used to monitor the temporary models generated during training, and the test set is used to evaluate the optimal model.
3. The hyperspectral image classification method based on dual-channel feature enhancement according to claim 1, wherein the process of S2 is:
First, convolution operations are performed on the hyperspectral image sample with two different 3D convolutions, and the two resulting feature maps are input into the spectral channel and the spatial channel, respectively; the feature extractor in each channel consists of three layers of convolutions. In the feature extraction process, if the input consists of $n_k$ feature maps of size $p_k \times p_k \times b_k$, it passes through $n_{k+1}$ convolution kernels of size $a_{k+1} \times a_{k+1} \times d_{k+1}$ to generate $n_{k+1}$ feature maps of size $p_{k+1} \times p_{k+1} \times b_{k+1}$; the $i$-th output of the batch-normalized $(k+1)$-th 3D-CNN layer is given by Eqs. (1) and (2):

$$\hat{X}_k^j = \frac{X_k^j - E(X_k^j)}{\sqrt{Var(X_k^j)}} \quad (1)$$

$$X_{k+1}^i = R\Big(\sum_{j=1}^{n_k} \hat{X}_k^j * W_{k+1}^{i,j} + b_{k+1}^i\Big) \quad (2)$$

where $\hat{X}_k^j$ is the $j$-th input feature map of the $(k+1)$-th layer, $X_k^j$ is the final output of the $k$-th layer, $E(\cdot)$ and $Var(\cdot)$ are the expectation and variance functions, $W_{k+1}^{i,j}$ and $b_{k+1}^i$ are the weight and bias of the $(k+1)$-th layer, $*$ denotes the 3D convolution operation, and $R(\cdot)$ is a nonlinear activation function in the network. Finally, the spectral feature map and the spatial feature map are obtained through the convolution operations in the two channels.
4. The hyperspectral image classification method based on dual-channel feature enhancement according to claim 1, wherein the process of S3 is:
A convolution operation is performed on the extracted spectral feature map to reduce its dimension, and it is then fused with the spatial feature map to obtain a feature map fusing the spectral and spatial features.
5. The hyperspectral image classification method based on dual-channel feature enhancement according to claim 1, wherein the process of S4 is:
First, to preserve spatial information, global pooling is decomposed into two one-dimensional feature-encoding operations: given an input $X$ of size $C \times H \times W$, each channel is encoded along the horizontal and vertical coordinates using two pooling kernels of sizes $(H, 1)$ and $(1, W)$, respectively. The output of the $c$-th channel at height $h$ and the output of the $c$-th channel at width $w$ can thus be expressed as:

$$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i) \quad (3)$$

$$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w) \quad (4)$$

where $x_c$ denotes the input. The two transforms aggregate features along the two spatial directions, generating a pair of direction-aware feature maps. The feature maps produced by Eqs. (3) and (4) are fused and input into a convolution layer with $C/r$ convolution kernels of size $1 \times 1$, whose output can be expressed as:

$$f = \delta(F_1([z^h, z^w])) \quad (5)$$

where $f \in \mathbb{R}^{C/r \times 1 \times (H+W)}$, $[\cdot,\cdot]$ denotes the concatenation operation along the spatial dimension, $\delta$ is a nonlinear activation function, and $r$ is the reduction rate used to control the number of channels. Then $f$ is split along the spatial dimension into $f^h \in \mathbb{R}^{C/r \times 1 \times H}$ and $f^w \in \mathbb{R}^{C/r \times 1 \times W}$; each undergoes its own convolution operation and is converted to the same number of channels as the input $X$, yielding the outputs $g^h$ and $g^w$:

$$g^h = \sigma(F_h(f^h)) \quad (6)$$

$$g^w = \sigma(F_w(f^w)) \quad (7)$$

where $F_h$ and $F_w$ denote convolution operations and $\sigma$ denotes the Sigmoid activation function. The outputs $g^h$ and $g^w$ are then expanded and used as attention weights, and the final output can be expressed as:

$$y_c(i,j) = x_c(i,j) \times g_c^h(i) \times g_c^w(j) \quad (8)$$

where $x_c$ denotes the input and $g_c^h$ and $g_c^w$ denote the attention weights. Finally, the feature map enhanced by the attention module is input into the fully connected layer to obtain the classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210812607.4A CN117437557A (en) | 2022-07-12 | 2022-07-12 | Hyperspectral image classification method based on double-channel feature enhancement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210812607.4A CN117437557A (en) | 2022-07-12 | 2022-07-12 | Hyperspectral image classification method based on double-channel feature enhancement |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117437557A true CN117437557A (en) | 2024-01-23 |
Family
ID=89555825
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210812607.4A Pending CN117437557A (en) | 2022-07-12 | 2022-07-12 | Hyperspectral image classification method based on double-channel feature enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117437557A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113743429A (en) * | 2020-05-28 | 2021-12-03 | 中国人民解放军战略支援部队信息工程大学 | Hyperspectral image classification method and device |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |