CN112819096A - Method for constructing fossil image classification model based on composite convolutional neural network - Google Patents

Method for constructing fossil image classification model based on composite convolutional neural network

Info

Publication number
CN112819096A
CN112819096A (application CN202110219351.1A; granted as CN112819096B)
Authority
CN
China
Prior art keywords
image
fossil
fossil image
constructing
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110219351.1A
Other languages
Chinese (zh)
Other versions
CN112819096B (en)
Inventor
张蕾
王晓宇
罗杰
卜起荣
冯筠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern University
Original Assignee
Northwestern University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern University
Priority to CN202110219351.1A
Publication of CN112819096A
Application granted
Publication of CN112819096B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, which comprises the following steps. S1: process an original fossil image to obtain a gradient image, and construct a fossil image feature extraction model. S2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain a depth feature map and on the gradient image to obtain a primary visual feature map; the depth feature map and the primary visual feature map are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, and a primary fossil image classification model is constructed according to these probability values. S3: train the primary fossil image classification model. The method extracts the depth features of the original fossil image and the primary visual features of the corresponding gradient image separately, and further improves the accuracy of the fossil image classification task through feature fusion.

Description

Method for constructing fossil image classification model based on composite convolutional neural network
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for constructing a fossil image classification model based on a composite convolutional neural network and to a corresponding image classification method.
Background
Microfossils have traditionally been classified manually; because microfossils are too small to be found or distinguished by the naked eye, classification is carried out manually by observation under a microscope. As archaeological excavation and related work progress, new types of microfossil are continually discovered, so the total number of sample categories keeps increasing and manual sorting and selection become ever less efficient. The whole process is tedious, the probability of error grows with time, and prolonged high-intensity observation does serious harm to the eyesight and general health of the observers.
With the development of artificial intelligence, research on deep learning has made great progress; deep learning plays an increasingly important role in daily work and has achieved ever more striking results in image classification tasks. It can learn from a small amount of sample data and automatically acquire the features most suitable for classification through the network model, with no need to select suitable features manually, while obtaining high classification accuracy. Classifying microfossil images with a convolutional neural network therefore saves a large amount of human effort, guarantees high classification accuracy and high working efficiency, and provides a useful reference for archaeological work.
Existing research techniques mainly comprise methods based on image features and methods based on deep learning. Traditional fossil image classification methods based on image features require the complex steps of feature extraction and feature optimization and have high algorithmic complexity. Existing deep-learning fossil image models extract features only from the original image, do not enhance the primary visual features from the perspective of image gradient change, and use deep networks that make the fossil image classification task prone to overfitting.
Disclosure of Invention
The invention aims to provide a method for constructing a fossil image classification model based on a composite convolutional neural network, addressing the problems of high algorithmic complexity, slow training, susceptibility to overfitting and low detection accuracy in the fossil image classification task.
To accomplish this task, the invention adopts the following technical scheme:
a method for constructing a fossil image classification model based on a composite convolutional neural network comprises the following steps:
s1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
S2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain a depth feature map and on the gradient image to obtain a primary visual feature map; the depth feature map and the primary visual feature map are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, and a primary fossil image classification model is constructed according to the image category probability values;
s3: training a primary fossil image classification model.
Optionally, S2 further comprises:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: subjecting the feature vector to the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a primary fossil image classification model according to the image category probability values.
Optionally, the original fossil image is processed with a Canny operator to obtain the gradient image;
the classification of the fully connected layer is performed by a Softmax classifier.
Optionally, S3 further comprises:
S31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer according to a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% of the original fossil images as training set images and 10% of the original fossil images as validation set images;
S34: inputting the original training set images and the preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with a batch gradient descent training method to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed validation set images; if this value is smaller than the best value obtained so far by the trained fossil image classification network, updating the weights of the fossil image classification network, and otherwise keeping the saved weights of the fossil image classification network.
Optionally, S31 further comprises:
constructing the target loss function L, expressed as:
L = -θα(1-b)^γ log(b) - (1-θ) log(b)
wherein α ∈ [0,1] is a weight factor used to adjust and balance the importance of samples of different classes; γ > 0 is a modulation factor used to adjust the weight of easily classified samples; θ is a hyperparameter used to adjust the weights of the two loss terms; and b ∈ [0,1] represents the model's estimated probability for the true label of the sample.
Optionally, the method for constructing the fossil image feature extraction model includes:
S1: establishing a stackable composite convolution residual block, and setting the number of base channels and the dilation rate of the dilated convolution of the composite convolution residual block;
S2: stacking the composite convolution residual blocks to form neural architectures of different depths, wherein the depth of the neural architecture is determined by the image data set of the task at hand;
the composite convolution residual block established in S1 comprises:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, wherein the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
Optionally, the number of base channels increases linearly with the depth of the neural architecture.
Optionally, the fossil image feature extraction model has a structure shown in table 1:
TABLE 1 fossil image feature extraction model network architecture
(The table is reproduced as an image in the original publication.)
A method of fossil image classification, comprising: preprocessing a fossil image to be detected to obtain a gradient image;
inputting the fossil image to be detected and the corresponding gradient image into the fossil image classification model constructed by the above method for constructing a fossil image classification model based on a composite convolutional neural network, and predicting its class to obtain the category of the fossil image.
A computer-readable storage medium storing computer instructions for causing a computer to execute the method for constructing a composite convolutional neural network-based fossil image classification model according to the present invention.
Compared with the prior art, the invention has the following technical characteristics:
(1) In constructing the feature extraction network, the invention proposes a stackable composite convolution residual block for stacking into neural architectures of different depths, the specific depth being determined by the image data set of each task. The module combines conventional convolution with dilated convolution: the former captures continuous structural dependencies, while the latter captures structural dependencies over longer distances, so the receptive field is increased without increasing the number of parameters.
(2) In constructing the fossil classification model, the depth features of the original fossil image and the primary visual features of the fossil gradient image, such as information on image structural components and edge textures, are extracted separately, and the accuracy of the fossil image classification task is further improved by fusing these features.
(3) Because the classification loss functions used by existing fossil image classification models cannot adequately handle the unbalanced number of samples per class or alleviate overfitting, the invention constructs a new target loss function that makes the model pay more attention to samples that are hard to classify while reducing the loss contribution of samples that are easy to classify, effectively improving the accuracy of the fossil image classification model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow chart of a method for constructing a fossil image classification model based on a composite convolutional neural network according to the present invention;
FIG. 2 is a flowchart of a method for constructing a fossil image feature extraction model according to the present invention;
FIG. 3 is a diagram illustrating the structure of a composite convolution residual block according to the present invention;
fig. 4 is a schematic diagram of the fossil image feature extraction model extracting depth features from an original fossil image according to the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
With reference to fig. 1, 2 and 3, the method for constructing a fossil image classification model based on a composite convolutional neural network of the present invention includes the following steps:
S1: process an original fossil image to obtain a gradient image, and construct a fossil image feature extraction model. The original fossil images used by the invention are in JPEG format; the gradient image referred to in the invention is an image calculated from the gray-scale image of the original fossil image with the Canny operator.
S2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain depth features and on the gradient image to obtain primary visual features; the depth features and the primary visual features are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, and a primary fossil image classification model is constructed according to these probability values. The primary visual features referred to in the invention are features such as color, edge texture and structural relationships extracted from the gradient image corresponding to the original fossil image; the depth features are abstract semantic features extracted from the original fossil image. The depth features extracted at each stage of the fossil image feature extraction model are visualized in fig. 4.
S3: train the primary fossil image classification model. The training process includes the training of the fossil image feature extraction model, which is embedded in the fossil image classification model so that together they form the final fossil image classification model.
In an embodiment of the present disclosure, S2 further includes:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: subjecting the feature vector to the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a primary fossil image classification model according to the image category probability values.
Specifically, the original fossil image is processed with the Canny operator to obtain the gradient image, and the classification of the fully connected layer is performed by a Softmax classifier.
In an embodiment of the present disclosure, S3 further includes:
S31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer according to a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% of the original fossil images as training set images and 10% of the original fossil images as validation set images;
S34: inputting the original training set images and the preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with a batch gradient descent training method to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed validation set images; if this value is smaller than the best value obtained so far by the trained fossil image classification network, updating the weights of the fossil image classification network, and otherwise keeping the saved weights of the fossil image classification network.
Specifically, the target loss function L is expressed as:
L = -θα(1-b)^γ log(b) - (1-θ) log(b)
wherein α ∈ [0,1] is a weight factor used to adjust and balance the importance of samples of different classes; γ > 0 is a modulation factor used to adjust the weight of easily classified samples; θ is a hyperparameter used to adjust the weights of the two loss terms; and b ∈ [0,1] represents the model's estimated probability for the true label of the sample.
With reference to fig. 2, in an embodiment of the present disclosure, a method for constructing a fossil image feature extraction model includes:
S11: establish a stackable composite convolution residual block, and set the number of base channels and the dilation rate of the dilated convolution of the composite convolution residual block; for example, the dilation rate set in the present invention is 2.
S12: stack the composite convolution residual blocks to form neural architectures of different depths, the depth of the neural architecture being determined by the image data set of the task at hand; because the resolution of the input images differs between image data sets, the depth of the network needs to be reset for each data set. For example, the neural architecture of the fossil image feature extraction model formed in the present invention is: input image → convolutional layer → max pooling layer → 8 stacked composite convolution residual blocks → output feature map.
The composite convolution residual block established in S11 comprises:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, wherein the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
In embodiments of the present disclosure, the number of base channels increases linearly with the depth of the neural architecture.
In an embodiment of the present disclosure, the fossil image feature extraction model has a structure shown in table 1.
Example 1:
the embodiment discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, which comprises the following steps:
Step 1: preprocess the original fossil image. Use the Canny operator to calculate the gradient of the original fossil image: edge-texture regions of the image yield large gradient values and smoother regions yield small gradient values, giving the final gradient image.
the method specifically comprises the following steps:
Step 1.1: convert the RGB color fossil image into a gray-scale image according to the formula:
f(x,y) = 0.299R + 0.587G + 0.114B
where f(x,y) is the gray-scale image generated from the original image, x and y are the coordinates of the image pixel, and R, G, B are the values of the red, green and blue channels.
Step 1.2, in order to reduce the extraction result of the gradient image by noise as much as possible, a gaussian filter is used to denoise a gray level image f (x, y), the step is called as a smooth image, a selected gaussian function is set as G (x, y), and the smoothed image is set as H (x, y), then:
Figure BDA0002953975290000071
H(x,y)=f(x,y)*G(x,y)
where σ represents the standard deviation of a two-dimensional gaussian function, affecting the quality of the gaussian filtering. "x" is an operator that represents a convolution.
Step 1.3, calculating the amplitude and direction of the gradient by using the finite difference of the first-order partial derivatives, wherein the gradient of the image is defined as the change degree of the gray value of the pixel in the computer vision field, and the calculation of the change degree can be described as calculating the partial derivatives of the corresponding pixel along the x-axis direction and the y-axis direction in the micro-integration, then:
Figure BDA0002953975290000072
Figure BDA0002953975290000073
since the image can be regarded as a discrete matrix, the above differential function is rewritten into a discrete differential operator, and then a one-dimensional gaussian smoothing is combined on the basis to obtain a cable operator, which is also called a first-order differential operator, and the formula is as follows:
Figure BDA0002953975290000074
on the basis of the derivation, the calculation process for solving the image gradient is mathematically abstracted to pass the image to be processed through Sx,SyThe Sobel operators in the two directions carry out filtering calculation to obtain gradient graphs G in the two corresponding directionsx,GyThereby it is convenientThe gradient G and direction θ of the pixel point can be determined:
Figure BDA0002953975290000075
Figure BDA0002953975290000076
wherein G is gradient strength, theta represents gradient direction, and arctan is arctangent direction;
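As a concrete illustration of step 1.3, the sketch below computes the gradient strength G and direction θ with OpenCV's Sobel filters; the use of cv2 and NumPy, the function name gradient_magnitude_direction and the kernel size are implementation assumptions of this sketch, not details fixed by the invention.

    import cv2
    import numpy as np

    def gradient_magnitude_direction(smoothed):
        """Gradient strength G and direction theta of a smoothed gray image."""
        gx = cv2.Sobel(smoothed, cv2.CV_64F, 1, 0, ksize=3)  # filter with S_x
        gy = cv2.Sobel(smoothed, cv2.CV_64F, 0, 1, ksize=3)  # filter with S_y
        g = np.sqrt(gx ** 2 + gy ** 2)   # G = sqrt(G_x^2 + G_y^2)
        theta = np.arctan2(gy, gx)       # theta = arctan(G_y / G_x)
        return g, theta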
Step 1.4: the above steps yield gradient edges composed of many pixels, but this edge information is inaccurate and the edges are thick, so non-maximum suppression is required to obtain accurate, single-pixel-wide edge information.
The method specifically comprises the following steps:
Step 1.4.1: compare the gradient strength of the current pixel with that of the two pixels along the positive and negative gradient directions;
Step 1.4.2: if the gradient strength of the current pixel is the largest of the three, keep the pixel as an edge point; otherwise suppress it;
Step 1.5: perform edge detection and linking on the basis of step 1.4. Ideally, edge detection would process only the set of pixels lying on an edge, but in practice noise is always present and breaks edges apart, so the pixel set on an edge cannot completely and effectively describe the edge features. Threshold judgement is therefore introduced: pixels on an edge are judged against a suitably chosen threshold range.
The method specifically comprises the following steps:
Step 1.5.1: if the gradient strength of a pixel on the edge is greater than the maximum of the threshold, record the pixel as an edge point;
Step 1.5.2: if the gradient strength of a pixel on the edge is smaller than the minimum of the threshold, record the pixel as a non-edge point;
Step 1.5.3: if the pixel lies between the maximum and the minimum of the threshold, check whether it is 8-connected to a previously marked edge point; if so, mark it as an edge point, otherwise mark it as a non-edge point;
Step 1.5.4: traverse all pixels on the edges, connecting the unclosed edge points into contours, and obtain the gradient image.
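Steps 1.1 to 1.5 together form the Canny preprocessing pipeline. A compact sketch using OpenCV built-ins follows: cv2.Canny internally performs the Sobel filtering, non-maximum suppression and double-threshold edge linking of steps 1.3 to 1.5. The kernel size, σ and the two thresholds are illustrative values, not ones fixed by the invention.

    import cv2

    def fossil_gradient_image(bgr_image, low_thresh=50, high_thresh=150):
        """Turn an original fossil image into the gradient image used as input."""
        gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)     # step 1.1: f(x, y)
        smoothed = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.4)  # step 1.2: H(x, y)
        return cv2.Canny(smoothed, low_thresh, high_thresh)    # steps 1.3 to 1.5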
Step 2: perform feature extraction with the fossil image feature extraction model based on the composite convolutional neural network, and construct the fossil image classification model on this basis.
The method specifically comprises the following steps:
step 2.1, constructing a fossil image feature extraction model based on a composite convolutional neural network;
the method specifically comprises the following steps:
Step 2.1.1: establish a stackable composite convolution residual block used to build neural architectures of different depths by stacking; the specific depth is determined by the image data set of the task at hand. Because the resolution of the input images differs between image data sets, the depth of the network needs to be reset for each data set.
Further, as shown in fig. 3, the first layer of the main path of the composite convolution residual block is a 1 × 1 convolution kernel used to reduce the computational parameters of the model. The 3 × 3 convolution operation of the middle layer combines conventional convolution with dilated convolution: the former captures continuous structural dependencies, while the latter captures structural dependencies over longer distances, increasing the receptive field without increasing the number of parameters. The third layer uses a 1 × 1 convolution kernel to restore the number of feature maps so that the input can be added to the output, ensuring model accuracy while reducing computational parameters.
The calculation process can be defined as follows. In the first stage,
F_{l,i} = ReLU(W_l * p)
where p represents the input feature map, W_l is the weight of the convolution kernel, ReLU is the activation function, and F_{l,i} is the output feature map of model layer l, i.e. the output of the first stage of the composite convolution residual block, containing i feature maps.
In the second stage, half of the feature maps of F_{l,i} are input into the conventional convolution, whose output F_{l+1_conv,j} contains j feature maps; the other half are input into the dilated convolution, whose output F_{l+1_dconv,k} contains k feature maps:
F_{l+1_conv,j} = ReLU(W_{l+1} * F_{l,i/2})
F_{l+1_dconv,k} = ReLU(W'_{l+1} * F'_{l,i/2})
In the third stage, the output feature maps of the conventional convolution and the dilated convolution are concatenated along the channel dimension, and the result F_{l+2,j+k} contains j+k feature maps in total; the output feature map q is obtained after the final 1 × 1 convolution operation and the jump connection:
F_{l+2,j+k} = Concat(F_{l+1_conv,j}, F_{l+1_dconv,k})
q = ReLU(W_{l+3} * F_{l+2,j+k}) + W_s p
where W_s is the weight of the shortcut projection applied to the input p.
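The three stages above can be sketched as a PyTorch module as follows. The convolutions, the channel split, the dilation rate of 2 and the shortcut projection W_s come from the text; the exact placement of the activations, the optional stride and the assumption that the number of base channels is even are implementation choices of this sketch, and the invention itself does not prescribe PyTorch or these identifiers.

    import torch
    import torch.nn as nn

    class CompositeResidualBlock(nn.Module):
        """Stackable composite convolution residual block (three stages)."""

        def __init__(self, in_channels, base_channels, out_channels=None,
                     stride=1, dilation=2):
            super().__init__()
            out_channels = out_channels or in_channels
            half = base_channels // 2          # base_channels assumed even
            # Stage 1 (W_l): 1 x 1 convolution reduces the computational parameters.
            self.reduce = nn.Conv2d(in_channels, base_channels, 1, stride=stride)
            # Stage 2: half the maps through a conventional 3 x 3 convolution,
            # the other half through a 3 x 3 dilated convolution (dilation rate 2).
            self.conv = nn.Conv2d(half, half, 3, padding=1)
            self.dconv = nn.Conv2d(half, half, 3, padding=dilation, dilation=dilation)
            # Stage 3 (W_{l+3}): 1 x 1 convolution restores the number of feature maps;
            # a 1 x 1 shortcut (W_s) projects the input so it can be added to the output.
            self.restore = nn.Conv2d(base_channels, out_channels, 1)
            self.shortcut = nn.Conv2d(in_channels, out_channels, 1, stride=stride)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, p):
            f = self.relu(self.reduce(p))                 # F_{l,i}
            f1, f2 = torch.chunk(f, 2, dim=1)             # split the i maps in half
            f_conv = self.relu(self.conv(f1))             # F_{l+1_conv,j}
            f_dconv = self.relu(self.dconv(f2))           # F_{l+1_dconv,k}
            fused = torch.cat([f_conv, f_dconv], dim=1)   # F_{l+2,j+k}
            return self.relu(self.restore(fused)) + self.shortcut(p)  # q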
Step 2.1.2, setting the number of basic channels and the hole convolution expansion rate of the composite convolution residual block; then stacking the composite convolution residual blocks to determine a final feature extraction model, as shown in table 1;
the number of base channels per complex convolutional residual block in step 2.1.2 increases linearly as the network gets deeper;
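One possible assembly of the architecture stated in step 2.1 (input image → convolutional layer → max pooling layer → 8 stacked composite convolution residual blocks) is sketched below. The stem sizes, the output-channel schedule and the downsampling positions are illustrative assumptions, since the actual values are fixed in Table 1, which survives only as an image; this particular schedule maps a 224 × 224 input to 1024 feature maps of 7 × 7 pixels, consistent with step 2.2.1.

    import torch.nn as nn

    def build_feature_extractor(num_blocks=8):
        """Stem followed by 8 stacked composite convolution residual blocks."""
        layers = [
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        ]
        in_ch = 64
        for i in range(num_blocks):
            base = 64 * (i + 1)                  # base channels grow linearly with depth
            out_ch = 128 * (i + 1)               # illustrative output-channel schedule
            stride = 2 if i in (1, 3, 5) else 1  # occasional downsampling (assumption)
            layers.append(CompositeResidualBlock(in_ch, base, out_ch, stride))
            in_ch = out_ch
        return nn.Sequential(*layers)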
step 2.2, constructing a fossil image classification model;
the method specifically comprises the following steps:
Step 2.2.1: feed the original fossil image and the gradient image obtained by the preprocessing of step 1 separately into the fossil image feature extraction model based on the composite convolutional neural network constructed in step 2.1, extracting 1024 feature maps of 7 × 7 pixels containing depth features and 1024 feature maps of 7 × 7 pixels containing primary visual features;
Step 2.2.2: fuse the depth feature maps and the primary visual feature maps along the channel dimension to obtain 2048 feature maps of size 7 × 7 pixels;
Step 2.2.3: send the feature maps fused in step 2.2.2 into a global average pooling layer, which performs a global average pooling operation on each input feature map, i.e. computes the average of all pixels of each feature map and outputs one value; the 2048 feature maps thus yield 2048 values, which form a 2048-dimensional vector called the feature vector;
Step 2.2.4: finally, map the distributed feature representation obtained by global average pooling to the sample label space through the fully connected layer, i.e. convolve the output of the global average pooling with C convolution kernels of size 1 × 1 × 2048 (length 1, width 1, 2048 channels); the Softmax classifier then yields the probability of each category and hence the category to which the fossil image belongs, where C is the number of fossil image categories.
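Steps 2.2.2 to 2.2.4 can be sketched as the following head module, assuming the two extractor streams each emit 1024-channel maps as in step 2.2.1; the class name FusionClassifier is an invented identifier of this sketch.

    import torch
    import torch.nn as nn

    class FusionClassifier(nn.Module):
        def __init__(self, num_classes, channels=2048):
            super().__init__()
            self.gap = nn.AdaptiveAvgPool2d(1)               # global average pooling
            self.fc = nn.Conv2d(channels, num_classes, 1)    # C kernels of 1 x 1 x 2048

        def forward(self, depth_feats, visual_feats):
            fused = torch.cat([depth_feats, visual_feats], dim=1)  # step 2.2.2
            vec = self.gap(fused)                 # step 2.2.3: 2048-d feature vector
            logits = self.fc(vec).flatten(1)      # step 2.2.4: FC via 1 x 1 convolution
            return torch.softmax(logits, dim=1)   # image category probability values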
Step 3, training the constructed fossil image classification model;
step 3.1, constructing a target loss function;
the method specifically comprises the following steps:
Step 3.1.1: in existing research work, the cross-entropy (CE) loss function is widely used in the training of fossil image classification models:
CE = -Σ_{c=1}^{C} a_c log(b_c)
where a is the true label of the sample (one-hot encoded), b_c is the model's predicted probability for class c, and C is the number of fossil image classes. For ease of notation the formula is rewritten as
CE(b) = -log(b)
where b ∈ [0,1] represents the model's estimated probability for the true label of the sample. The CE loss reflects the difference between the predicted probability distribution and the true probability distribution; during training we want the two to be as close as possible, so the training goal of the model is to minimize the loss function. However, the CE loss can neither handle an unbalanced number of samples per class nor alleviate overfitting, so it is improved as follows.
Step 3.1.2, constructing an objective loss function L, proposing to use a loss function improved based on focal loss as the objective function, and improving the CE loss function, which can be expressed as:
L=-6α(1-b)γlog(b)-(1-θ)log(b)
Figure BDA0002953975290000104
wherein, alpha is ∈ [0,1]]Is a weighting factor for adjusting and balancing the importance, Count, of the samples of different classesmIndicates the number of samples of the m-th class, γ>0 is a modulation factor for adjusting the weight of the easily classified samples, and θ is a hyperparameter for adjusting the weights of the two loss functions.
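A minimal PyTorch sketch of this loss follows: b is gathered as the predicted probability of each sample's true class, and the two terms of L are summed directly. The inverse-frequency construction of α shown in the usage comment is an assumption, since the exact formula for α survives only as an image.

    import torch

    def objective_loss(probs, targets, alpha, gamma=2.0, theta=0.5):
        """probs: (N, C) softmax outputs; targets: (N,) class indices; alpha: (C,)."""
        b = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-12)
        focal_term = -theta * alpha[targets] * (1.0 - b) ** gamma * torch.log(b)
        ce_term = -(1.0 - theta) * torch.log(b)
        return (focal_term + ce_term).mean()

    # Illustrative alpha from the class counts of Example 2 (an assumption):
    # counts = torch.tensor([1392.0, 852.0, 85.0, 25.0])
    # alpha = 1.0 - counts / counts.sum()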
Step 3.2, initializing parameters in the fossil image feature extraction model as initial parameter values by using a ResNet50 deep residual error network trained in advance on an Imagenet large-scale data set, and initializing network parameters of a full connection layer according to normal distribution;
Step 3.3: randomly select 80% of the images in the data set as the training set, 10% as the validation set and 10% as the test set; preprocess the images of the training set and validation set according to step 1 to obtain the preprocessed training set and validation set images;
Step 3.4: send the preprocessed training set images, comprising the original fossil images and their corresponding gradient images, into the constructed fossil image classification model, and minimize the target loss function value with a batch gradient descent training method, adjusting the parameters of all layers in the network to obtain the weights of the trained composite convolutional neural network.
In this embodiment the batch size of training is set to 64, the parameter update momentum to 0.9, the learning rate to 0.001 and the number of iterations (epochs) to 500; the network weights are trained with the back-propagation algorithm to obtain the weight training result, and mini-batch stochastic gradient descent drives the loss value to a minimum.
Step 3.5: after each epoch of training, feed back the loss value of the weight training result on the preprocessed validation set images: input the validation set images into the currently trained fossil image classification model and calculate their loss value; if the current loss value is smaller than that of the stored weight training result, update the model weights, otherwise keep the previously stored weight training result. After 500 iterations of training, the training of the model terminates and the optimal model result is saved.
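The schedule of steps 3.4 and 3.5 can be sketched as the training loop below, under the hyperparameters stated above (batch size 64, momentum 0.9, learning rate 0.001, 500 epochs). The model is assumed to be a wrapper taking the original image and the gradient image together; the data loaders and the α tensor are assumed to be prepared elsewhere, and objective_loss is the sketch from step 3.1.2.

    import copy
    import torch

    def train(model, train_loader, val_loader, alpha, epochs=500):
        optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
        best_loss, best_state = float("inf"), None
        for _ in range(epochs):
            model.train()
            for image, gradient, target in train_loader:
                probs = model(image, gradient)
                loss = objective_loss(probs, target, alpha)
                optimizer.zero_grad()
                loss.backward()                  # back-propagation
                optimizer.step()                 # mini-batch gradient descent step
            model.eval()
            with torch.no_grad():                # feed back the validation loss
                val_loss = sum(objective_loss(model(i, g), t, alpha).item()
                               for i, g, t in val_loader) / len(val_loader)
            if val_loss < best_loss:             # keep the best weights so far
                best_loss = val_loss
                best_state = copy.deepcopy(model.state_dict())
        model.load_state_dict(best_state)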
Step 4: classify fossil images with the fossil image classification model based on the composite convolutional neural network.
Step 4.1: preprocess the fossil image to be detected according to step 1 to obtain its gradient image;
Step 4.2: input the fossil image to be detected and its corresponding gradient image into the fossil image classification model trained in step 3 to predict its class and obtain the final category to which the fossil image belongs.
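Steps 4.1 and 4.2 amount to the inference sketch below. fossil_gradient_image is the helper sketched in step 1; replicating the single-channel gradient image to three channels so both streams can share the extractor input format, and the omission of normalization, are assumptions of this sketch.

    import cv2
    import torch

    def classify_fossil(model, bgr_image, class_names):
        gradient = fossil_gradient_image(bgr_image)            # step 4.1
        to_tensor = lambda a: torch.from_numpy(a).float().unsqueeze(0)
        rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
        image_t = to_tensor(rgb).permute(0, 3, 1, 2)           # (1, 3, H, W)
        grad_t = to_tensor(gradient).unsqueeze(1).repeat(1, 3, 1, 1)
        with torch.no_grad():
            probs = model(image_t, grad_t)                     # step 4.2
        return class_names[int(probs.argmax(dim=1))]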
Example 2:
The data set in this embodiment was provided by the geological institute of Northwestern University and comprises 2354 fossil images: 1392 of Yunnan cephalosporins, 852 of frigoria, 85 of trefoil osmidges and 25 of wuding worms. Classification accuracy, with values in [0,1], is taken as the evaluation index of model performance; the higher the value, the better the model performs.
TABLE 2 Comparison of the results between the different methods
(The table is reproduced as an image in the original publication.)
As can be seen from the results in Table 2, the performance of the present invention on this data set is higher than that of the compared fossil image classification models. To further demonstrate that the innovations proposed in the invention benefit the final result, this example compares the effects of four different configurations, as follows:
N1: contains only one sub-network, whose input is the original fossil image; the whole network is trained end-to-end with the cross-entropy loss function.
N2: contains only one sub-network, whose input is the original fossil image; the whole network is trained end-to-end with the loss function L proposed by the invention.
N3: contains only one sub-network, whose input is the gradient image corresponding to the original fossil image; the whole network is trained end-to-end with the loss function L proposed by the invention.
N4: contains two sub-networks, whose inputs are the original fossil image and its corresponding gradient image respectively; the features extracted by the two sub-networks are concatenated and fused, and the whole network is trained end-to-end with the loss function L proposed by the invention.
TABLE 3 Contrast effect of the ablation experiment
(The table is reproduced as an image in the original publication.)
As can be seen from the results in Table 3, each innovation provided by the present invention benefits the final result, further improving the performance of the fossil image classification model.
The preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that, in the foregoing embodiments, various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various combinations that are possible in the present disclosure are not described again.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.

Claims (10)

1. A method for constructing a fossil image classification model based on a composite convolutional neural network is characterized by comprising the following steps:
s1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
S2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain a depth feature map and on the gradient image to obtain a primary visual feature map; the depth feature map and the primary visual feature map are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, and a primary fossil image classification model is constructed according to the image category probability values;
s3: training a primary fossil image classification model.
2. The method for constructing a composite convolutional neural network-based fossil image classification model as claimed in claim 1, wherein the S2 further comprises:
S21: fusing the depth feature map and the primary visual feature map along the channel dimension to obtain a fused feature map;
S22: inputting the fused feature map into a global average pooling layer to output a feature vector;
S23: subjecting the feature vector to the convolution and classification of the fully connected layer to obtain image category probability values, and constructing a primary fossil image classification model according to the image category probability values.
3. The method for constructing the fossil image classification model based on the composite convolutional neural network as claimed in claim 1 or 2, wherein a Canny operator is utilized to process an original fossil image to obtain a gradient image;
the classification of the fully connected layer is performed by a Softmax classifier.
4. The method for constructing a fossil image classification model based on a composite convolutional neural network as claimed in claim 1 or 2, wherein the S3 further comprises:
s31: constructing a target loss function;
S32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer according to a normal distribution;
S33: for the original fossil images in the data set to be processed, randomly selecting 80% of the original fossil images as training set images and 10% of the original fossil images as validation set images;
S34: inputting the original training set images and the preprocessed training set images into the fossil image classification model, and minimizing the target loss function value with a batch gradient descent training method to obtain the weights of the trained fossil image classification network;
S35: feeding back the target loss function value on the preprocessed validation set images; if this value is smaller than the best value obtained so far by the trained fossil image classification network, updating the weights of the fossil image classification network, and otherwise keeping the saved weights of the fossil image classification network.
5. The method for constructing a composite convolutional neural network-based fossil image classification model as claimed in claim 4, wherein the S31 further comprises:
constructing an objective loss function L, expressed as:
L = -θα(1-b)^γ log(b) - (1-θ) log(b)
wherein α ∈ [0,1] is a weight factor; γ > 0 is a modulation factor; θ is a hyperparameter; and b ∈ [0,1] represents the estimated probability.
6. The method for constructing the fossil image classification model based on the composite convolutional neural network as claimed in claim 1 or 2, wherein the method for constructing the fossil image feature extraction model comprises the following steps:
S1: establishing a stackable composite convolution residual block, and setting the number of base channels and the dilation rate of the dilated convolution of the composite convolution residual block;
S2: stacking the composite convolution residual blocks to form neural architectures of different depths, wherein the depth of the neural architecture is determined by the image data set of the task at hand;
the composite convolution residual block established in S1 comprises:
a first layer: a 1 × 1 convolution kernel that reduces the computational parameters of the model;
a second layer: a 3 × 3 convolution operation combining conventional convolution and dilated convolution, wherein the conventional convolution captures continuous structural dependencies and the dilated convolution captures structural dependencies over longer distances;
a third layer: a 1 × 1 convolution kernel that restores the number of feature maps.
7. The method for constructing a fossil image classification model based on a composite convolutional neural network as claimed in claim 6, wherein the number of base channels increases linearly with the depth of the neural architecture.
8. The method for constructing a fossil image classification model based on a composite convolutional neural network as claimed in claim 1 or 2, wherein the fossil image feature extraction model has a structure shown in table 1:
TABLE 1 fossil image feature extraction model network architecture
(Table 1 is reproduced as an image in the original publication.)
9. A method of classifying fossil images, comprising: preprocessing a fossil image to be detected to obtain a gradient image;
inputting the fossil image to be detected and the corresponding gradient image into the fossil image classification model constructed by the method for constructing a fossil image classification model based on a composite convolutional neural network according to any one of claims 1 to 8, and predicting its class to obtain the category of the fossil image.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the method of constructing a composite convolutional neural network-based fossil image classification model according to any one of claims 1 to 8.
CN202110219351.1A 2021-02-26 2021-02-26 Construction method of fossil image classification model based on composite convolutional neural network Active CN112819096B (en)

Priority Applications (1)

Application Number: CN202110219351.1A (granted as CN112819096B) · Priority Date: 2021-02-26 · Filing Date: 2021-02-26 · Title: Construction method of fossil image classification model based on composite convolutional neural network

Applications Claiming Priority (1)

Application Number: CN202110219351.1A (granted as CN112819096B) · Priority Date: 2021-02-26 · Filing Date: 2021-02-26 · Title: Construction method of fossil image classification model based on composite convolutional neural network

Publications (2)

Publication Number Publication Date
CN112819096A true CN112819096A (en) 2021-05-18
CN112819096B CN112819096B (en) 2024-01-19

Family

ID=75864159

Family Applications (1)

Application Number: CN202110219351.1A (Active, granted as CN112819096B) · Priority Date: 2021-02-26 · Filing Date: 2021-02-26 · Title: Construction method of fossil image classification model based on composite convolutional neural network

Country Status (1)

Country Link
CN (1) CN112819096B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537026A (en) * 2021-07-09 2021-10-22 上海智臻智能网络科技股份有限公司 Primitive detection method, device, equipment and medium in building plan
CN113610061A (en) * 2021-09-30 2021-11-05 国网浙江省电力有限公司电力科学研究院 Method and system for identifying unstressed conducting wire based on target detection and residual error network
CN115818166A (en) * 2022-11-15 2023-03-21 华能伊敏煤电有限责任公司 Unattended automatic control method and system for wheel hopper continuous system
CN116258658A (en) * 2023-05-11 2023-06-13 齐鲁工业大学(山东省科学院) Swin transducer-based image fusion method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160389A (en) * 2019-12-02 2020-05-15 东北石油大学 Lithology identification method based on fusion of VGG
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
CN111160389A (en) * 2019-12-02 2020-05-15 东北石油大学 Lithology identification method based on fusion of VGG
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHENG GUOJIAN; GUO WENHUI; FAN PENGZHAO: "Rock image classification based on convolutional neural networks", Journal of Xi'an Shiyou University (Natural Science Edition), no. 04 *
LU GUOJUN; CHEN LIFANG: "Remote sensing image scene classification based on deep convolutional neural networks", Journal of Taiyuan Normal University (Natural Science Edition), no. 01 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537026A (en) * 2021-07-09 2021-10-22 上海智臻智能网络科技股份有限公司 Primitive detection method, device, equipment and medium in building plan
CN113610061A (en) * 2021-09-30 2021-11-05 国网浙江省电力有限公司电力科学研究院 Method and system for identifying unstressed conducting wire based on target detection and residual error network
CN115818166A (en) * 2022-11-15 2023-03-21 华能伊敏煤电有限责任公司 Unattended automatic control method and system for wheel hopper continuous system
CN115818166B (en) * 2022-11-15 2023-09-26 华能伊敏煤电有限责任公司 Unmanned automatic control method and system for continuous system of wheel bucket
CN116258658A (en) * 2023-05-11 2023-06-13 齐鲁工业大学(山东省科学院) Swin transducer-based image fusion method
CN116258658B (en) * 2023-05-11 2023-07-28 齐鲁工业大学(山东省科学院) Swin transducer-based image fusion method

Also Published As

Publication number Publication date
CN112819096B (en) 2024-01-19

Similar Documents

Publication Publication Date Title
CN112819096B (en) Construction method of fossil image classification model based on composite convolutional neural network
CN108875935B (en) Natural image target material visual characteristic mapping method based on generation countermeasure network
CN107862668A (en) A kind of cultural relic images restored method based on GNN
CN109146944B (en) Visual depth estimation method based on depth separable convolutional neural network
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
CN108764298B (en) Electric power image environment influence identification method based on single classifier
CN107516103B (en) Image classification method and system
CN114092697B (en) Building facade semantic segmentation method with attention fused with global and local depth features
CN109118504B (en) Image edge detection method, device and equipment based on neural network
CN111046917B (en) Object-based enhanced target detection method based on deep neural network
CN112837344A (en) Target tracking method for generating twin network based on conditional confrontation
CN110059728A (en) RGB-D image vision conspicuousness detection method based on attention model
CN110009700B (en) Convolutional neural network visual depth estimation method based on RGB (red, green and blue) graph and gradient graph
CN111339862B (en) Remote sensing scene classification method and device based on channel attention mechanism
CN109460815A (en) A kind of monocular depth estimation method
CN109872326B (en) Contour detection method based on deep reinforced network jump connection
CN112991371B (en) Automatic image coloring method and system based on coloring overflow constraint
CN109448039B (en) Monocular vision depth estimation method based on deep convolutional neural network
CN115937552A (en) Image matching method based on fusion of manual features and depth features
CN111882555B (en) Deep learning-based netting detection method, device, equipment and storage medium
Borbon et al. Coral health identification using image classification and convolutional neural networks
CN109508639A (en) Road scene semantic segmentation method based on multiple dimensioned convolutional neural networks with holes
CN111259923A (en) Multi-target detection method based on improved three-dimensional R-CNN algorithm
CN115222754A (en) Mirror image segmentation method based on knowledge distillation and antagonistic learning
CN110796716B (en) Image coloring method based on multiple residual error network and regularized transfer learning

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant