CN112819096A - Method for constructing fossil image classification model based on composite convolutional neural network - Google Patents
- Publication number
- CN112819096A (application CN202110219351.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- fossil
- fossil image
- constructing
- classification model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/24—Classification techniques
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/253—Fusion techniques of extracted features
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
Abstract
The invention discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, comprising the following steps. S1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model. S2: performing feature extraction on the original fossil image with the fossil image feature extraction model to obtain a depth feature map, and on the gradient image to obtain a primary visual feature map; fusing the two maps and processing the result sequentially through a global average pooling layer and a fully connected layer to obtain image category probability values; and constructing a preliminary fossil image classification model according to these values. S3: training the preliminary fossil image classification model. The method extracts the depth features of the original fossil images and the primary visual features of the corresponding gradient images separately, and further improves the accuracy of the fossil image classification task through feature fusion.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for constructing a fossil image classification model based on a composite convolutional neural network and to a fossil image classification method.
Background
Conventionally, microfossils are classified manually. Because microfossils are too small to be found or distinguished by the naked eye, classification must be performed by observation under a microscope. As archaeological excavation and related work progress, new types of microfossils are continually discovered, so the total number of sample types keeps growing and the efficiency of manual sorting and selection becomes ever lower. The whole process is tedious, the probability of error increases over time, and long periods of high-intensity observation seriously harm observers' eyesight and general physical health.
With the development of artificial intelligence, research on deep learning has made great progress; deep learning plays an increasingly important role in daily work and has achieved ever more notable results in image classification tasks. It can learn from a small amount of sample data and automatically acquire the features most suitable for classification through a network model, without manual feature selection, while achieving high classification accuracy. Classifying microfossil images with a convolutional neural network saves a large amount of human labor, guarantees both high classification accuracy and high working efficiency, and provides a useful reference for archaeological work.
Existing research techniques mainly comprise image-feature-based methods and deep-learning-based methods. Traditional fossil image classification methods based on image features require complex feature extraction and feature optimization steps and have high algorithmic complexity. Existing deep-learning-based fossil image models extract features only from the original image, do not enhance primary visual features from the perspective of image gradient change, and use deep networks that easily overfit the fossil image classification task.
Disclosure of Invention
The invention aims to provide a method for constructing a fossil image classification model based on a composite convolutional neural network, addressing the problems of high algorithmic complexity, slow training, susceptibility to overfitting, and low detection precision in the fossil image classification task.
To accomplish this task, the invention adopts the following technical scheme:
a method for constructing a fossil image classification model based on a composite convolutional neural network comprises the following steps:
S1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
S2: performing feature extraction on the original fossil image with the fossil image feature extraction model to obtain a depth feature map, and on the gradient image to obtain a primary visual feature map; fusing the depth feature map with the primary visual feature map and processing the result sequentially through a global average pooling layer and a fully connected layer to obtain image category probability values; and constructing a preliminary fossil image classification model according to the image category probability values;
S3: training the preliminary fossil image classification model.
Optionally, S2 further comprises:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: passing the feature vector through the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a preliminary fossil image classification model according to the image category probability values.
Optionally, the original fossil image is processed with a Canny operator to obtain the gradient image;
the classification in the fully connected layer is performed by a Softmax classifier.
Optionally, S3 further comprises:
S31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% as training set images and 10% as verification set images;
S34: inputting the original and preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with mini-batch gradient descent to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed verification set images; if the current value is smaller than the best value obtained so far, updating the saved weights of the fossil image classification network, and otherwise keeping the previously saved weights.
Optionally, S31 further comprises:
constructing a target loss function L, expressed as:
L = -θ·α·(1-b)^γ·log(b) - (1-θ)·log(b)
where α ∈ [0,1] is a weight factor for adjusting and balancing the importance of samples of different classes; γ > 0 is a modulation factor for adjusting the weight of easily classified samples; θ is a hyperparameter for adjusting the weights of the two loss terms; and b ∈ [0,1] denotes the model's estimated probability for the true label of the sample.
Optionally, the method for constructing the fossil image feature extraction model comprises:
S11: establishing a stackable composite convolution residual block, and setting the number of base channels and the dilation rate of the dilated (hole) convolution in the composite convolution residual block;
S12: stacking composite convolution residual blocks to form neural architectures of different depths, the depth being determined by the image data set of the task at hand;
the composite convolution residual block established in S11 consists of:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, where the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
Optionally, the number of base channels increases linearly with the depth of the neural architecture.
Optionally, the fossil image feature extraction model has a structure shown in table 1:
TABLE 1 fossil image feature extraction model network architecture
A method of fossil image classification, comprising: preprocessing a fossil image to be classified to obtain a gradient image;
inputting the fossil image to be classified and the corresponding gradient image into the fossil image classification model constructed by the above method for constructing a fossil image classification model based on a composite convolutional neural network, the model predicting the category of the fossil image to obtain the category to which it belongs.
A computer-readable storage medium storing computer instructions for causing a computer to execute the method for constructing a composite convolutional neural network-based fossil image classification model according to the present invention.
Compared with the prior art, the invention has the following technical characteristics:
(1) In constructing the feature extraction network, the invention provides a stackable composite convolution residual block for stacking neural architectures of different depths, the specific depth being determined by the image data set of the task. The block combines conventional convolution with dilated convolution: the former captures continuous structural dependencies and the latter captures structural dependencies over longer distances, increasing the receptive field without increasing the number of parameters.
(2) In constructing the fossil classification model, the depth features of the original fossil image and the primary visual features of the fossil gradient image, such as structural components and edge texture information, are extracted separately, and fusing these features further improves the accuracy of the fossil image classification task.
(3) Because the classification loss functions used by existing fossil image classification models cannot adequately handle class imbalance or alleviate overfitting, the invention constructs a new target loss function that reduces the loss contribution of easily classified samples while making the model focus on hard-to-classify samples, effectively improving the accuracy of the fossil image classification model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow chart of a method for constructing a fossil image classification model based on a composite convolutional neural network according to the present invention;
FIG. 2 is a flowchart of a method for constructing a fossil image feature extraction model according to the present invention;
FIG. 3 is a diagram illustrating the structure of a composite convolution residual block according to the present invention;
fig. 4 is a schematic diagram of the fossil image feature extraction model extracting depth features from an original fossil image according to the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
With reference to fig. 1, 2 and 3, the method for constructing a fossil image classification model based on a composite convolutional neural network of the present invention includes the following steps:
S1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model. The original fossil image used by the invention is in JPEG format; the gradient image refers to an image computed from the grayscale version of the original fossil image with the Canny operator.
S2: performing feature extraction on the original fossil image with the fossil image feature extraction model to obtain depth features, and on the gradient image to obtain primary visual features; fusing the depth features with the primary visual features and processing the result sequentially through a global average pooling layer and a fully connected layer to obtain image category probability values; and constructing a preliminary fossil image classification model according to the image category probability values. The primary visual features are features such as color, edge texture, and structural relationships extracted from the gradient image corresponding to the original fossil image; the depth features are abstract semantic features extracted from the original fossil image. The depth features extracted at each stage of the fossil image feature extraction model are visualized in fig. 4.
S3: training the preliminary fossil image classification model. The training process includes training the fossil image feature extraction model, which is embedded in the fossil image classification model, and together they form the final fossil image classification model.
In an embodiment of the present disclosure, S2 further comprises:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: passing the feature vector through the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a preliminary fossil image classification model according to the image category probability values.
Specifically, the original fossil image is processed with a Canny operator to obtain the gradient image; the classification in the fully connected layer is performed by a Softmax classifier.
In an embodiment of the present disclosure, S3 further comprises:
S31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% as training set images and 10% as verification set images;
S34: inputting the original and preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with mini-batch gradient descent to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed verification set images; if the current value is smaller than the best value obtained so far, updating the saved weights of the fossil image classification network, and otherwise keeping the previously saved weights.
Specifically, the target loss function L is expressed as:
L = -θ·α·(1-b)^γ·log(b) - (1-θ)·log(b)
where α ∈ [0,1] is a weight factor for adjusting and balancing the importance of samples of different classes; γ > 0 is a modulation factor for adjusting the weight of easily classified samples; θ is a hyperparameter for adjusting the weights of the two loss terms; and b ∈ [0,1] denotes the model's estimated probability for the true label of the sample.
With reference to fig. 2, in an embodiment of the present disclosure, a method for constructing a fossil image feature extraction model includes:
S11: establishing a stackable composite convolution residual block, and setting the number of base channels and the dilation rate of the dilated convolution in the composite convolution residual block; for example, the dilation rate set in the present invention is 2.
S12: stacking the composite convolution residual blocks to form neural architectures of different depths, the depth being determined by the image data set of the task; because input images in different data sets have different resolutions, the depth of the network must be reset for each data set. For example, the neural architecture of the fossil image feature extraction model formed in the present invention is: input image → convolutional layer → max pooling layer → a stack of 8 composite convolution residual blocks → output feature map.
The composite convolution residual block established in S11 consists of:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, where the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
In embodiments of the present disclosure, the number of base channels increases linearly as the neural architecture deepens.
In an embodiment of the present disclosure, the fossil image feature extraction model has a structure shown in table 1.
Example 1:
the embodiment discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, which comprises the following steps:
Step 1, preprocessing the original fossil image: the gradient of the original fossil image is computed with the Canny operator; edge texture regions of the image yield large gradient values, while smoother regions yield small gradient values, producing the final gradient image;
the method specifically comprises the following steps:
Step 1.1, converting the color fossil image in RGB format into a grayscale image, according to the formula:
f(x,y) = 0.299R + 0.587G + 0.114B
where f(x, y) is the grayscale image generated from the original image, x and y are the coordinates of an image pixel, and R, G, B are the values of the red, green, and blue channels.
Step 1.2, in order to reduce the extraction result of the gradient image by noise as much as possible, a gaussian filter is used to denoise a gray level image f (x, y), the step is called as a smooth image, a selected gaussian function is set as G (x, y), and the smoothed image is set as H (x, y), then:
H(x,y)=f(x,y)*G(x,y)
where σ represents the standard deviation of a two-dimensional gaussian function, affecting the quality of the gaussian filtering. "x" is an operator that represents a convolution.
Step 1.3, calculating the amplitude and direction of the gradient by using the finite difference of the first-order partial derivatives, wherein the gradient of the image is defined as the change degree of the gray value of the pixel in the computer vision field, and the calculation of the change degree can be described as calculating the partial derivatives of the corresponding pixel along the x-axis direction and the y-axis direction in the micro-integration, then:
since the image can be regarded as a discrete matrix, the above differential function is rewritten into a discrete differential operator, and then a one-dimensional gaussian smoothing is combined on the basis to obtain a cable operator, which is also called a first-order differential operator, and the formula is as follows:
on the basis of the derivation, the calculation process for solving the image gradient is mathematically abstracted to pass the image to be processed through Sx,SyThe Sobel operators in the two directions carry out filtering calculation to obtain gradient graphs G in the two corresponding directionsx,GyThereby it is convenientThe gradient G and direction θ of the pixel point can be determined:
wherein G is gradient strength, theta represents gradient direction, and arctan is arctangent direction;
Step 1.4, the above steps yield gradient edges composed of many pixels, but this edge information is inaccurate and the edges are thick. Non-maximum suppression is therefore required to obtain accurate, single-pixel-wide edges.
The method specifically comprises the following steps:
Step 1.4.1, comparing the gradient intensity of the current pixel with the two neighboring pixels along the positive and negative gradient directions;
Step 1.4.2, if the gradient intensity of the current pixel is the largest of the three, keeping the pixel as an edge point; otherwise suppressing it;
Step 1.5, performing edge detection and linking on the result of step 1.4. Ideally, edge detection would process only the set of pixels lying on an edge, but in practice noise is always present and breaks edges apart, so the edge pixel set cannot completely and effectively describe the edge features; threshold judgment is therefore introduced, and the pixels on an edge are judged against a suitably chosen threshold range.
The method specifically comprises the following steps:
Step 1.5.1, if the gradient intensity of a pixel on the edge is greater than the threshold maximum, recording the pixel as an edge point;
Step 1.5.2, if the gradient intensity of a pixel on the edge is smaller than the threshold minimum, recording the pixel as a non-edge point;
Step 1.5.3, if the pixel lies between the threshold minimum and maximum, checking whether it is 8-connected to a previously marked edge point; if so, marking it as an edge point, and otherwise as a non-edge point;
Step 1.5.4, traversing all pixels on the edge, thereby connecting unclosed edge points into a contour and obtaining the gradient image.
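The whole of step 1 maps onto OpenCV's built-in Canny implementation. The sketch below is a minimal illustration of the preprocessing, assuming OpenCV is available; the kernel size, σ, and hysteresis thresholds are illustrative assumptions rather than values taken from the patent.

```python
import cv2

def to_gradient_image(path, low_thresh=50, high_thresh=150):
    """Steps 1.1-1.5: grayscale conversion, Gaussian smoothing, then Canny
    (Sobel gradients, non-maximum suppression, double-threshold linking)."""
    image = cv2.imread(path)                               # fossil image in JPEG
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)         # step 1.1: 0.299R + 0.587G + 0.114B
    smoothed = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.4)  # step 1.2: denoise (assumed kernel/sigma)
    # steps 1.3-1.5 happen inside cv2.Canny: Sobel filtering, non-maximum
    # suppression, hysteresis thresholding with 8-connected edge linking
    return cv2.Canny(smoothed, low_thresh, high_thresh)    # assumed thresholds
```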
Step 2, performing feature extraction with the fossil image feature extraction model based on the composite convolutional neural network, and constructing the fossil image classification model on that basis.
The method specifically comprises the following steps:
step 2.1, constructing a fossil image feature extraction model based on a composite convolutional neural network;
the method specifically comprises the following steps:
Step 2.1.1, establishing a stackable composite convolution residual block for stacking into neural architectures of different depths, the specific depth being determined by the image data set of the task; because input images in different data sets have different resolutions, the depth of the network must be reset for each data set.
Further, as shown in fig. 3, the first layer of the main path of the composite convolution residual block is a 1 × 1 convolution kernel that reduces the computational parameters of the model. The 3 × 3 convolution operation of the middle layer combines conventional convolution and dilated convolution; the former captures continuous structural dependencies and the latter captures structural dependencies over longer distances, increasing the receptive field without increasing the number of parameters. The third layer uses a 1 × 1 convolution kernel to restore the number of feature maps so that the input can be added to the output, ensuring model accuracy while reducing computational parameters.
The calculation process can be defined as follows. In the first stage:
F_{l,i} = ReLU(W_l * p)
where p is the input feature map, W_l is the weight of the convolution kernel, ReLU is the activation function, and F_{l,i} is the output feature map of layer l, i.e. the output of the first stage of the composite convolution residual block, containing i feature maps.
In the second stage, half of the feature maps of F_{l,i} are input to the conventional convolution, whose output F_{l+1_conv,j} contains j feature maps; the other half are input to the dilated convolution, whose output F_{l+1_dconv,k} contains k feature maps.
In the third stage, the output feature maps of the conventional convolution and the dilated convolution are concatenated along the channel dimension, giving F_{l+2,j+k} with j + k feature maps in total. The output feature map q is obtained after the final 1 × 1 convolution operation and the skip connection:
q = ReLU(W_{l+3} * F_{l+2,j+k}) + W_s * p
where W_{l+3} is the weight of the final 1 × 1 convolution and W_s is the projection applied to the input p on the shortcut.
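The three stages above can be made concrete with a short PyTorch sketch. This is a minimal illustration, assuming details the patent text leaves ambiguous: the even channel split, the padding choices, and an identity shortcut in place of the projection W_s.

```python
import torch
import torch.nn as nn

class CompositeConvResidualBlock(nn.Module):
    """1x1 reduce -> split 3x3 conventional / 3x3 dilated conv -> concat
    -> 1x1 restore -> skip connection, as described in step 2.1.1."""
    def __init__(self, in_channels, base_channels, dilation=2):
        super().__init__()
        half = base_channels // 2
        self.reduce = nn.Conv2d(in_channels, base_channels, kernel_size=1)
        # middle stage: half the maps go through a conventional convolution...
        self.conv = nn.Conv2d(half, half, kernel_size=3, padding=1)
        # ...and the other half through a dilated (hole) convolution, rate 2
        self.dconv = nn.Conv2d(half, half, kernel_size=3,
                               padding=dilation, dilation=dilation)
        self.restore = nn.Conv2d(base_channels, in_channels, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, p):
        f = self.relu(self.reduce(p))             # F_{l,i}
        a, b = torch.chunk(f, 2, dim=1)           # split along the channel dimension
        f_conv = self.relu(self.conv(a))          # continuous structural features
        f_dconv = self.relu(self.dconv(b))        # longer-range dependencies
        fused = torch.cat([f_conv, f_dconv], 1)   # F_{l+2, j+k}
        out = self.relu(self.restore(fused))      # final 1x1 convolution
        return out + p  # skip connection; identity shortcut assumed for W_s
```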
Step 2.1.2, setting the number of basic channels and the hole convolution expansion rate of the composite convolution residual block; then stacking the composite convolution residual blocks to determine a final feature extraction model, as shown in table 1;
the number of base channels per complex convolutional residual block in step 2.1.2 increases linearly as the network gets deeper;
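Reusing the CompositeConvResidualBlock class sketched above, the stacking of step 2.1.2 might look as follows. Since Table 1 is not reproduced in this text, the stem configuration and the linear channel schedule are assumptions, and inter-stage downsampling/projection is omitted for brevity.

```python
import torch.nn as nn

def build_feature_extractor(num_blocks=8, channels=64):
    """Input image -> conv -> max pool -> stack of 8 composite blocks
    -> output feature map; internal base channels grow linearly with depth."""
    layers = [
        nn.Conv2d(3, channels, kernel_size=7, stride=2, padding=3),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    ]
    for depth in range(1, num_blocks + 1):
        # assumed schedule: base channel count grows linearly with block depth
        layers.append(CompositeConvResidualBlock(channels, base_channels=64 * depth))
    return nn.Sequential(*layers)
```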
step 2.2, constructing a fossil image classification model;
the method specifically comprises the following steps:
Step 2.2.1, feeding the original fossil image and the gradient image obtained by the preprocessing of step 1 into the fossil image feature extraction model based on the composite convolutional neural network constructed in step 2.1, extracting 1024 feature maps of 7 × 7 pixels containing depth features and 1024 feature maps of 7 × 7 pixels containing primary visual features;
Step 2.2.2, fusing the depth feature maps and the primary visual feature maps along the channel dimension to obtain 2048 feature maps of 7 × 7 pixels;
Step 2.2.3, feeding the fused feature maps of step 2.2.2 into a global average pooling layer, which performs a global average pooling operation on each input feature map, i.e. computes the average of all pixels of each feature map and outputs one value; the 2048 feature maps thus yield 2048 values, which form a 2048-dimensional vector called the feature vector;
Step 2.2.4, finally, mapping the distributed feature representation obtained by global average pooling to the sample label space through the fully connected layer: the pooled output is convolved with C convolution kernels of size 1 × 1 × 2048, i.e. length 1, width 1, and 2048 channels, where C is the number of fossil image categories; the probabilities of the individual categories are then obtained through a Softmax classifier, giving the category to which the fossil image belongs.
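A sketch of steps 2.2.2 through 2.2.4, assuming the two 1024-channel, 7 × 7 feature stacks described in step 2.2.1 as inputs; the module name and its wiring are illustrative.

```python
import torch
import torch.nn as nn

class FossilClassifierHead(nn.Module):
    """Channel-wise fusion -> global average pooling -> 1x1 conv FC -> Softmax."""
    def __init__(self, num_classes, fused_channels=2048):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)  # step 2.2.3: average each 7x7 map to one value
        # step 2.2.4: C kernels of size 1 x 1 x 2048 act as the fully connected layer
        self.fc = nn.Conv2d(fused_channels, num_classes, kernel_size=1)

    def forward(self, depth_feats, visual_feats):
        # step 2.2.2: fuse the two 1024-channel maps along the channel dimension
        fused = torch.cat([depth_feats, visual_feats], dim=1)  # (N, 2048, 7, 7)
        logits = self.fc(self.gap(fused)).flatten(1)           # (N, C)
        return torch.softmax(logits, dim=1)                    # class probabilities
```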
Step 3, training the constructed fossil image classification model;
step 3.1, constructing a target loss function;
the method specifically comprises the following steps:
Step 3.1.1, in existing research, the cross entropy (CE) loss function is widely used in the training of fossil image classification models:
CE = -Σ_{m=1}^{C} a_m · log(â_m)
where a is the true label of the sample (one-hot coded), â is the model's prediction for the sample, and C is the number of fossil image classes. For ease of notation, the formula is rewritten as:
CE(b) = -log(b)
where b ∈ [0,1] denotes the model's estimated probability for the true label of the sample. The CE loss reflects the difference between the predicted probability distribution and the true probability distribution; during training, we want the predicted probability to be as close as possible to the true probability, so the training goal of the model is to minimize the loss function. However, the CE loss function can neither solve the problem of unbalanced numbers of samples per class nor alleviate overfitting, so it is chosen for improvement.
Step 3.1.2, constructing an objective loss function L, proposing to use a loss function improved based on focal loss as the objective function, and improving the CE loss function, which can be expressed as:
L=-6α(1-b)γlog(b)-(1-θ)log(b)
wherein, alpha is ∈ [0,1]]Is a weighting factor for adjusting and balancing the importance, Count, of the samples of different classesmIndicates the number of samples of the m-th class, γ>0 is a modulation factor for adjusting the weight of the easily classified samples, and θ is a hyperparameter for adjusting the weights of the two loss functions.
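A sketch of this objective in PyTorch, assuming the model outputs per-class probabilities as in step 2.2.4. The default hyperparameter values and the batch-mean reduction are assumptions; the patent derives α from the per-class counts Count_m, whose exact formula is not reproduced here.

```python
import torch

def fossil_loss(probs, targets, alpha=0.25, gamma=2.0, theta=0.5, eps=1e-7):
    """L = -theta * alpha * (1 - b)^gamma * log(b) - (1 - theta) * log(b),
    where b is the model's estimated probability for the true class."""
    b = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp(eps, 1.0)
    focal_term = -alpha * (1.0 - b) ** gamma * torch.log(b)  # down-weights easy samples
    ce_term = -torch.log(b)                                  # plain cross entropy term
    return (theta * focal_term + (1.0 - theta) * ce_term).mean()
```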
Step 3.2, initializing parameters in the fossil image feature extraction model as initial parameter values by using a ResNet50 deep residual error network trained in advance on an Imagenet large-scale data set, and initializing network parameters of a full connection layer according to normal distribution;
Step 3.3, randomly selecting 80% of the images in the data set as the training set, 10% as the verification set, and 10% as the test set; preprocessing the images in the training set and the verification set according to step 1 to obtain the preprocessed training set images and preprocessed verification set images;
Step 3.4, feeding the preprocessed training set images, comprising the original fossil images and the corresponding gradient images, into the constructed fossil image classification model, minimizing the target loss function value with a mini-batch gradient descent training method, and thereby adjusting the parameters of all layers in the network to obtain the weights of the trained composite convolutional neural network;
In this embodiment, the training batch size is set to 64, the parameter update momentum to 0.9, the learning rate to 0.001, and the number of iterations (epochs) to 500; the network weights are trained with the back propagation algorithm and mini-batch stochastic gradient descent so that the loss value is minimized.
Step 3.6, after each epoch of training, feeding back the loss value on the preprocessed verification set images: the verification set images are input into the currently trained fossil image classification model and their loss value is computed; if the current loss is smaller than the best result so far, the saved model weights are updated, and otherwise the previously saved weights are kept. After 500 iterations, training terminates and the best model is saved.
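The training and validation procedure of steps 3.4 through 3.6 might be sketched as follows, assuming `model`, `train_loader`, and `val_loader` already exist and that `fossil_loss` is the objective defined above; only the hyperparameters (batch size 64 in the loaders, momentum 0.9, learning rate 0.001, 500 epochs) come from this embodiment.

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

best_val_loss = float("inf")
for epoch in range(500):                                  # 500 iterations (epochs)
    model.train()
    for images, gradients, labels in train_loader:        # batches of 64
        optimizer.zero_grad()
        loss = fossil_loss(model(images, gradients), labels)
        loss.backward()                                   # back propagation
        optimizer.step()                                  # mini-batch gradient descent

    # step 3.6: feed back the validation loss after each epoch
    model.eval()
    with torch.no_grad():
        val_loss = sum(fossil_loss(model(i, g), l).item()
                       for i, g, l in val_loader) / len(val_loader)
    if val_loss < best_val_loss:                          # keep only the best weights
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_fossil_model.pth")
```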
Step 4, classifying fossil images with the fossil image classification model based on the composite convolutional neural network.
Step 4.1, preprocessing a fossil image to be detected according to the step 1 to obtain a gradient image;
and 4.2, inputting the fossil image to be detected and the corresponding gradient image into the fossil image classification model obtained by training in the step 3 to predict the classification of the fossil image to obtain the final classification to which the fossil image belongs.
Example 2:
The data set in this embodiment, provided by the Institute of Geology of Northwest University, comprises 2354 fossil images: 1392 Yunnanocephalus, 852 frigoria, 85 trilobites, and 25 Wutingaspis. Classification accuracy, with values in [0,1], is used as the evaluation index of model performance; the higher the value, the better the model.
TABLE 2 comparison of the results between the different methods
As can be seen from the results in table 2, the performance of the invention on this data set is higher than that of the compared fossil image classification models. To further prove that the innovations proposed in the invention benefit the final result, this example compares the effects of four variants, as follows:
N1: only one sub-network; the input is the original fossil image, and the whole network is trained end to end with a cross entropy loss function.
N2: only one sub-network; the input is the original fossil image, and the whole network is trained end to end with the loss function L proposed by the invention.
N3: only one sub-network; the input is the gradient image corresponding to the original fossil image, and the whole network is trained end to end with the loss function L proposed by the invention.
N4: two sub-networks whose inputs are the original fossil image and its corresponding gradient image, respectively; the features extracted by the two sub-networks are concatenated and fused, and the whole network is trained end to end with the loss function L proposed by the invention.
TABLE 3 Results of the ablation experiment
As can be seen from the results in table 3, the innovations provided by the invention benefit the final result, further improving the performance of the fossil image classification model.
The preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that, in the foregoing embodiments, various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various combinations that are possible in the present disclosure are not described again.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.
Claims (10)
1. A method for constructing a fossil image classification model based on a composite convolutional neural network is characterized by comprising the following steps:
S1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
S2: performing feature extraction on the original fossil image with the fossil image feature extraction model to obtain a depth feature map, and on the gradient image to obtain a primary visual feature map; fusing the depth feature map with the primary visual feature map and processing the result sequentially through a global average pooling layer and a fully connected layer to obtain image category probability values; and constructing a preliminary fossil image classification model according to the image category probability values;
S3: training the preliminary fossil image classification model.
2. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1, wherein S2 further comprises:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: passing the feature vector through the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a preliminary fossil image classification model according to the image category probability values.
3. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1 or 2, wherein a Canny operator is used to process the original fossil image to obtain the gradient image;
the classification in the fully connected layer is performed by a Softmax classifier.
4. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1 or 2, wherein S3 further comprises:
S31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% as training set images and 10% as verification set images;
S34: inputting the original and preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with mini-batch gradient descent to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed verification set images; if the current value is smaller than the best value obtained so far, updating the saved weights of the fossil image classification network, and otherwise keeping the previously saved weights.
5. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 4, wherein S31 further comprises:
constructing a target loss function L, expressed as:
L = -θ·α·(1-b)^γ·log(b) - (1-θ)·log(b)
where α ∈ [0,1] is a weight factor; γ > 0 is a modulation factor; θ is a hyperparameter; and b ∈ [0,1] denotes the estimated probability.
6. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1 or 2, wherein the method for constructing the fossil image feature extraction model comprises:
S11: establishing a stackable composite convolution residual block, and setting the number of base channels and the dilation rate of the dilated convolution in the composite convolution residual block;
S12: stacking composite convolution residual blocks to form neural architectures of different depths, the depth being determined by the image data set of the task;
the composite convolution residual block established in S11 consists of:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, where the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
7. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 6, wherein the number of base channels increases linearly as the neural architecture deepens.
9. A method of classifying fossil images, comprising: preprocessing a fossil image to be classified to obtain a gradient image;
inputting the fossil image to be classified and the corresponding gradient image into the fossil image classification model constructed by the method for constructing a fossil image classification model based on a composite convolutional neural network according to any one of claims 1 to 8, the model predicting the category of the fossil image to obtain the category to which it belongs.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the method of constructing a composite convolutional neural network-based fossil image classification model according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110219351.1A CN112819096B (en) | 2021-02-26 | 2021-02-26 | Construction method of fossil image classification model based on composite convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110219351.1A CN112819096B (en) | 2021-02-26 | 2021-02-26 | Construction method of fossil image classification model based on composite convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112819096A true CN112819096A (en) | 2021-05-18 |
CN112819096B CN112819096B (en) | 2024-01-19 |
Family
ID=75864159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110219351.1A Active CN112819096B (en) | 2021-02-26 | 2021-02-26 | Construction method of fossil image classification model based on composite convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112819096B (en) |
Application events
- 2021-02-26: application CN202110219351.1A filed; granted as CN112819096B (status: active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020215236A1 (en) * | 2019-04-24 | 2020-10-29 | 哈尔滨工业大学(深圳) | Image semantic segmentation method and system |
CN111160389A (en) * | 2019-12-02 | 2020-05-15 | 东北石油大学 | Lithology identification method based on fusion of VGG |
CN111612066A (en) * | 2020-05-21 | 2020-09-01 | 成都理工大学 | Remote sensing image classification method based on depth fusion convolutional neural network |
AU2020103901A4 (en) * | 2020-12-04 | 2021-02-11 | Chongqing Normal University | Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field |
Non-Patent Citations (2)
Title |
---|
Cheng Guojian; Guo Wenhui; Fan Pengzhao: "Rock image classification based on convolutional neural networks", Journal of Xi'an Shiyou University (Natural Science Edition), no. 04 *
Lu Guojun; Chen Lifang: "Remote sensing image scene classification based on deep convolutional neural networks", Journal of Taiyuan Normal University (Natural Science Edition), no. 01 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113537026A (en) * | 2021-07-09 | 2021-10-22 | 上海智臻智能网络科技股份有限公司 | Primitive detection method, device, equipment and medium in building plan |
CN113610061A (en) * | 2021-09-30 | 2021-11-05 | 国网浙江省电力有限公司电力科学研究院 | Method and system for identifying unstressed conducting wire based on target detection and residual error network |
CN115818166A (en) * | 2022-11-15 | 2023-03-21 | 华能伊敏煤电有限责任公司 | Unattended automatic control method and system for wheel hopper continuous system |
CN115818166B (en) * | 2022-11-15 | 2023-09-26 | 华能伊敏煤电有限责任公司 | Unmanned automatic control method and system for continuous system of wheel bucket |
CN116258658A (en) * | 2023-05-11 | 2023-06-13 | 齐鲁工业大学(山东省科学院) | Swin transducer-based image fusion method |
CN116258658B (en) * | 2023-05-11 | 2023-07-28 | 齐鲁工业大学(山东省科学院) | Swin transducer-based image fusion method |
Also Published As
Publication number | Publication date |
---|---|
CN112819096B (en) | 2024-01-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112819096B (en) | Construction method of fossil image classification model based on composite convolutional neural network | |
CN108875935B (en) | Natural image target material visual characteristic mapping method based on generation countermeasure network | |
CN107862668A (en) | A kind of cultural relic images restored method based on GNN | |
CN109146944B (en) | Visual depth estimation method based on depth separable convolutional neural network | |
CN112489164B (en) | Image coloring method based on improved depth separable convolutional neural network | |
CN108764298B (en) | Electric power image environment influence identification method based on single classifier | |
CN107516103B (en) | Image classification method and system | |
CN114092697B (en) | Building facade semantic segmentation method with attention fused with global and local depth features | |
CN109118504B (en) | Image edge detection method, device and equipment based on neural network | |
CN111046917B (en) | Object-based enhanced target detection method based on deep neural network | |
CN112837344A (en) | Target tracking method for generating twin network based on conditional confrontation | |
CN110059728A (en) | RGB-D image vision conspicuousness detection method based on attention model | |
CN110009700B (en) | Convolutional neural network visual depth estimation method based on RGB (red, green and blue) graph and gradient graph | |
CN111339862B (en) | Remote sensing scene classification method and device based on channel attention mechanism | |
CN109460815A (en) | A kind of monocular depth estimation method | |
CN109872326B (en) | Contour detection method based on deep reinforced network jump connection | |
CN112991371B (en) | Automatic image coloring method and system based on coloring overflow constraint | |
CN109448039B (en) | Monocular vision depth estimation method based on deep convolutional neural network | |
CN115937552A (en) | Image matching method based on fusion of manual features and depth features | |
CN111882555B (en) | Deep learning-based netting detection method, device, equipment and storage medium | |
Borbon et al. | Coral health identification using image classification and convolutional neural networks | |
CN109508639A (en) | Road scene semantic segmentation method based on multiple dimensioned convolutional neural networks with holes | |
CN111259923A (en) | Multi-target detection method based on improved three-dimensional R-CNN algorithm | |
CN115222754A (en) | Mirror image segmentation method based on knowledge distillation and antagonistic learning | |
CN110796716B (en) | Image coloring method based on multiple residual error network and regularized transfer learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |