CN114595725B - Electroencephalogram signal classification method based on addition network and supervised contrast learning - Google Patents

Info

Publication number
CN114595725B
Authority
CN
China
Prior art keywords
addition
layer
convolution
mth
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210253209.3A
Other languages
Chinese (zh)
Other versions
CN114595725A (en)
Inventor
李畅
赵禹阊
宋仁成
刘羽
成娟
陈勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202210253209.3A
Publication of CN114595725A
Application granted
Publication of CN114595725B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing
    • G06F2218/04Denoising
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/24Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316Modalities, i.e. specific diagnostic methods
    • A61B5/369Electroencephalography [EEG]
    • A61B5/372Analysis of electroencephalograms
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • A61B5/7264Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • A61B5/7264Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Psychiatry (AREA)
  • Veterinary Medicine (AREA)
  • Pathology (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Fuzzy Systems (AREA)
  • Physiology (AREA)
  • Psychology (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention discloses an electroencephalogram (EEG) signal classification method based on an addition network and supervised contrastive learning, which comprises the following steps: 1, performing data selection and slicing preprocessing on the original EEG data; 2, establishing an addition network classification model; 3, designing a hybrid loss function and establishing the optimization target of the classification model; and 4, training the network on the input data and completing EEG signal classification with the trained optimal model. The invention replaces multiplication operations with additions, which greatly reduces computational complexity and energy consumption, and uses a loss function that mixes the supervised contrastive loss with the cross-entropy loss. Signal classification is completed automatically, without manual feature extraction or signal processing of the original EEG, and classification accuracy can be significantly improved, thereby increasing the application value of EEG signals in fields such as medical treatment.

Description

Electroencephalogram signal classification method based on addition network and supervised contrast learning
Technical Field
The invention relates to the field of electroencephalogram signal classification, in particular to a method for automatically classifying and predicting original electroencephalogram data of a subject through a deep learning method.
Background
The brain controls human behavior, emotion and other physiological activities, and the electrical activity of the cerebral cortex carries rich information, which may reflect a person's emotions, motor imagery and diseases. With the development of brain-computer interfaces and intelligent medical treatment, electroencephalogram signals are widely applied in fields such as affective computing, motor imagery and medical health. If the information in electroencephalogram signals can be fully mined and different electroencephalogram signals can be accurately classified, their use value in fields such as medical treatment can be increased.
An electroencephalogram (EEG) is a recording of the electrical activity of the cerebral cortex, which can be acquired with portable equipment and can reveal various information related to brain electrical function. Intracranial EEG signals are acquired through electrodes implanted inside the skull, while scalp EEG signals are acquired through electrodes placed on the scalp surface. Intracranial EEG is suitable for long-term implantable monitoring systems and generally has a high signal-to-noise ratio; scalp EEG requires no implantation and is noninvasive for the patient, so scalp EEG is more common in practical use. Studies of subjects' EEG data show that some activities associated with brain electrical signals begin to show signs a few minutes to a few hours before onset, so the associated activities can be predicted and classified by capturing information in the EEG signal. However, analysis of EEG signals often requires a great deal of expertise and expert experience and is time-consuming and labor-intensive; furthermore, EEG signals are continuous in time and the subject produces them at all times, so a system capable of automatically predicting and classifying EEG signals is needed.
In traditional prediction and classification algorithms based on EEG signals, researchers usually denoise the EEG signals, extract relevant features, and classify the obtained features with a classifier to obtain a prediction. Common features include Hjorth parameters, statistical moments, accumulated energy, autoregressive coefficients and Lyapunov exponents. Commonly used classifiers include support vector machines and Bayesian classifiers. However, extracting these features also requires rich expert experience, and the classification result depends largely on the extracted features, which can lead to poor generalization; moreover, traditional classifiers offer limited room for improving the classification performance on EEG signals.
In recent years, deep learning methods have been widely applied in the field of brain-computer interfaces. They can automatically learn suitable features from the input, learn feature extraction and classification jointly, and obtain more accurate predictions in EEG signal classification tasks. However, deep learning methods often involve significant computational and hardware costs, which is a disadvantage for clinical deployment, mobile applications and implantable devices. Past approaches have focused mainly on designing a feature preprocessing procedure or a particular network architecture. Feature preprocessing generally converts the original EEG data into various forms of features and includes operations such as filtering and denoising; although cleaner data can be obtained, some important information may be lost at the same time. A special network structure works well in specific situations, but its performance drops markedly in complex and diverse environments. Both kinds of methods ignore the inherent associations among the data.
Disclosure of Invention
The invention provides an electroencephalogram signal classification method based on addition network and supervised contrast learning to overcome the defects of the prior art, so that the electroencephalogram signal classification can be automatically realized under the environment with low energy consumption, low time delay and friendly hardware, and the electroencephalogram signal classification accuracy can be remarkably improved, thereby increasing the application value of the electroencephalogram signal in the fields of medical treatment and the like.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention discloses an electroencephalogram signal classification method based on addition network and supervised contrast learning, which is characterized by comprising the following steps of:
step 1, acquiring an electroencephalogram data set with labeling information, and performing channel data selection and sample segmentation preprocessing on the original electroencephalogram signals in the data set to obtain N electroencephalogram signal samples, each of duration T, which form a training sample set denoted $X=\{X_1,X_2,\ldots,X_n,\ldots,X_N\}$, wherein $X_n\in\mathbb{R}^{W\times H}$ represents the nth electroencephalogram signal sample, H represents the number of channels of the electroencephalogram signal, $W=T\times s$ represents the number of sampling points, and s represents the sampling rate of the electroencephalogram signal; the label corresponding to the nth sample $X_n$ is denoted $Y_n$, and the label set corresponding to the training sample set X is denoted $Y=\{Y_1,Y_2,\ldots,Y_n,\ldots,Y_N\}$;
step 2, establishing an addition network model, which comprises: a one-dimensional convolution layer, M addition convolution modules, an adaptive pooling layer, a projection layer and a classification layer;
the addition convolution module consists of an addition convolution layer and an addition convolution residual layer; the addition convolution kernel of the mth addition convolution module has size $h_m$ and stride $w_m$; a ReLU activation function and a batch normalization operation are applied between the addition convolution layers and between the addition convolution residual layers; only the first addition convolution module is followed by a max pooling operation; m = 1, 2, ..., M;
step 2.1, initializing model parameters:
initializing weights of all convolution layers by using a device_unique_initialization;
step 2.2, inputting the nth electroencephalogram signal sample $X_n\in\mathbb{R}^{W\times H}$ into the addition network model, where the one-dimensional convolution layer performs temporal feature extraction and data dimension reduction to obtain the nth one-dimensional convolution feature sequence, whose x-th element is the x-th feature map output by the one-dimensional convolution layer, and $C_0$ represents the number of feature maps in the nth one-dimensional convolution feature sequence;
step 2.3, processing of the addition convolution module:
step 2.3.1, when m=1, taking the nth one-dimensional convolution feature sequence as the input of the mth addition convolution module and denoting it as the feature sequence of the mth addition convolution module, whose x-th element is its x-th feature map, and $C_m$ represents the number of feature maps in the feature sequence of the mth addition convolution module;
step 2.3.2, processing the feature sequence of the mth addition convolution module with the mth addition convolution layer to obtain the mth addition convolution layer feature sequence, whose x-th element is the x-th feature map output by the mth addition convolution layer;
the feature sequence of the mth addition convolution module is also processed by the mth addition convolution residual layer to obtain the mth addition convolution residual layer feature sequence, whose x-th element is the x-th feature map output by the mth addition convolution residual layer;
step 2.3.3, adding the feature sequence output by the mth addition convolution layer to the feature sequence output by the mth addition convolution residual layer to obtain the mth fusion feature sequence, whose x-th element is its x-th feature map;
step 2.3.4, judging whether m=1 holds; if so, performing a max pooling operation on the mth fusion feature sequence to obtain the feature sequence of the (m+1)th addition convolution module; otherwise, taking the mth fusion feature sequence directly as the feature sequence of the (m+1)th addition convolution module; $C_{m+1}$ represents the number of feature maps in the feature sequence of the (m+1)th addition convolution module;
step 2.3.5, assigning m+1 to m and judging whether m > M holds; if so, proceeding to step 2.3.6; otherwise, returning to step 2.3.2;
step 2.3.6, processing the feature sequence of the Mth addition convolution module with the adaptive pooling layer to obtain the adaptive pooling layer feature vector, wherein $C_M$ represents the number of feature maps in the feature sequence of the Mth addition convolution module, and R represents the number of feature values in the feature vector output by the adaptive pooling layer;
step 2.4, processing the projection layer and the classification layer;
step 2.4.1, the fully connected projection layer projects the feature vector output by the adaptive pooling layer into the feature space to obtain the projection layer feature vector, whose r-th element is its r-th feature value;
at the same time, the adaptive pooling layer feature vector is processed by the fully connected classification layer to obtain the probabilities that the nth sample belongs to the different categories, $p_n=\{p_{n,1},p_{n,2},\ldots,p_{n,a},\ldots,p_{n,k}\}$, wherein $p_{n,a}$ represents the probability, output by the classification layer, that the nth electroencephalogram signal sample $X_n$ belongs to category a, and k represents the number of categories;
step 3, randomly extracting several samples from the electroencephalogram signal samples of the training sample set X to form a batch of data denoted $x=\{x_1,x_2,\ldots,x_i,\ldots,x_m\}$, with the corresponding labels denoted $y=\{y_1,y_2,\ldots,y_i,\ldots,y_m\}$, wherein $x_i$ represents the ith electroencephalogram signal sample in the batch, $y_i$ represents the label of $x_i$, and m represents the number of samples in the batch;
after the batch of data is processed according to step 2.2 to step 2.4, the projection layer outputs one feature vector for each electroencephalogram signal sample in the batch, and the classification layer outputs the probabilities $P=\{p_1,p_2,\ldots,p_i,\ldots,p_m\}$, wherein $p_i$ represents the probability, output by the classification layer, for the ith electroencephalogram signal sample in the batch;
establishing a hybrid loss function L by using formulas (1)-(3):
$L=\alpha L_{sup}+(1-\alpha)L_{cross\text{-}entropy}$ (1)
In formulas (1)-(3), α is a parameter that adjusts the weights of the two losses, $L_{sup}$ denotes the supervised contrastive loss, and $L_{cross\text{-}entropy}$ denotes the cross-entropy loss; the remaining quantities are the number of samples in a batch that have the same label as the ith electroencephalogram signal sample; an indicator that takes the value 1 when the condition $y_j=y_i$ with $j\neq i$ is satisfied and 0 otherwise; the feature vectors, output by the projection layer, of the jth and the tth electroencephalogram signal samples in a batch; and τ, a hyper-parameter controlling the training smoothness;
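Formulas (2) and (3) are not reproduced in this text, so the following is only a minimal sketch of one widely used formulation that is consistent with the symbol definitions above: a supervised contrastive loss over length-normalized projection vectors with temperature τ, a mean cross-entropy loss, and their weighted combination as in formula (1). The function names, the default values `alpha=0.5` and `tau=0.1`, and the choice to skip anchors that have no same-label partner in the batch are assumptions made for illustration, not values taken from the patent.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


def supervised_contrastive_loss(features, labels, tau=0.1):
    """Assumed form of L_sup: average over anchors i of the mean negative
    log-probability of each same-label sample j among all samples t != i."""
    z = [l2_normalize(f) for f in features]
    m = len(z)
    total = 0.0
    for i in range(m):
        positives = [j for j in range(m) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # assumption: an anchor without positives contributes 0
        sims = {t: sum(a * b for a, b in zip(z[i], z[t])) / tau
                for t in range(m) if t != i}
        top = max(sims.values())  # subtract the maximum for numerical stability
        log_denominator = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[j] - log_denominator for j in positives) / len(positives)
    return total / m


def cross_entropy_loss(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y] + eps) for p, y in zip(probabilities, labels)) / len(labels)


def hybrid_loss(features, probabilities, labels, alpha=0.5, tau=0.1):
    """Formula (1): L = alpha * L_sup + (1 - alpha) * L_cross-entropy."""
    return (alpha * supervised_contrastive_loss(features, labels, tau)
            + (1 - alpha) * cross_entropy_loss(probabilities, labels))
```

With α = 1 the model is trained only to cluster the projection vectors by label, and with α = 0 it reduces to ordinary cross-entropy training; intermediate values trade the two objectives off.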
step 4, training the addition network model on the training sample set X with an Adam optimizer, calculating the hybrid loss function L, and adjusting the learning rate during training with an adaptive learning rate method until the validation loss no longer decreases or the maximum number of training epochs is reached, thereby obtaining the trained addition network model, which is used for classifying electroencephalogram signals.
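The stopping rule in step 4 can be sketched as follows. This is not the patent's training code: the Adam update itself is hidden behind a hypothetical `run_epoch(epoch, lr)` callback that is assumed to train for one epoch and return the validation loss, and the patience values and the halving factor for the learning rate are illustrative assumptions standing in for the unspecified adaptive learning rate method.

```python
def train_with_early_stopping(run_epoch, max_epochs=100, patience=5,
                              lr=1e-3, lr_factor=0.5, lr_patience=2):
    """Run epochs until the validation loss stops improving or max_epochs is hit.

    run_epoch(epoch, lr) is a hypothetical callback returning the validation loss.
    """
    best_loss = float("inf")
    stale_epochs = 0  # consecutive epochs without improvement
    history = []      # (epoch, learning rate used, validation loss)
    for epoch in range(max_epochs):
        val_loss = run_epoch(epoch, lr)
        history.append((epoch, lr, val_loss))
        if val_loss < best_loss:
            best_loss = val_loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs % lr_patience == 0:
                lr *= lr_factor  # assumed plateau rule: shrink the learning rate
            if stale_epochs >= patience:
                break            # validation loss no longer decreases
    return best_loss, history
```

In practice the model weights that produced `best_loss` would be saved and restored after the loop, so that the "trained optimal model" is the one with the lowest validation loss.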
Compared with the prior art, the invention has the following beneficial effects:
1. The invention provides an addition network model based on deep learning and uses an addition network for electroencephalogram signal classification for the first time. It replaces costly multiplication operations with cheap additions, achieving lower computational and hardware cost while maintaining the same accuracy.
2. The invention uses the supervised contrastive loss for electroencephalogram signal classification for the first time. The contrastive loss fully exploits the inherent connections among the data, pulling data of the same class together while pushing data of different classes apart; combined with the cross-entropy loss, it improves the classification performance on electroencephalogram signals.
3. The invention is an end-to-end model. It requires no manual denoising or feature preprocessing of the original EEG signals in advance and learns directly from the original EEG data, which better fits the data-driven nature of deep learning; a great deal of expert experience and expertise is therefore not needed, and better generalization is obtained.
Drawings
FIG. 1 is a schematic diagram of a network architecture according to the present invention;
FIG. 2 is a schematic diagram of a conventional convolution calculation;
FIG. 3 is a schematic diagram of an additive convolution calculation of the present invention;
FIG. 4 is a schematic diagram of the comparative learning of the present invention;
FIG. 5 is a graph comparing the AUC results of electroencephalogram signal classification on the CHB-MIT database;
FIG. 6 is a graph comparing the sensitivity results of electroencephalogram signal classification on the CHB-MIT database;
FIG. 7 is a graph comparing the FPR results of electroencephalogram signal classification on the CHB-MIT database.
Detailed Description
In this embodiment, an electroencephalogram signal classification method based on an addition network and supervised contrastive learning mainly uses the addition network to classify electroencephalogram signals. The addition network replaces the multiplications in convolution with additions by changing the similarity measure used in the convolution process, which reduces the computational cost, while supervised contrastive learning draws samples of the same class close together and pushes samples of different classes apart, so that an accurate classification is finally achieved. As shown in fig. 1, the method specifically comprises the following steps:
step 1, acquiring an electroencephalogram data set with labeling information, and performing channel data selection and sample segmentation preprocessing on the original electroencephalogram signals in the data set to obtain N electroencephalogram signal samples, each of duration T, which form a training sample set denoted $X=\{X_1,X_2,\ldots,X_n,\ldots,X_N\}$, wherein $X_n\in\mathbb{R}^{W\times H}$ represents the nth electroencephalogram signal sample, H represents the number of channels of the electroencephalogram signal, $W=T\times s$ represents the number of sampling points, and s represents the sampling rate of the electroencephalogram signal; the label corresponding to the nth sample $X_n$ is denoted $Y_n$, and the label set corresponding to the training sample set X is denoted $Y=\{Y_1,Y_2,\ldots,Y_n,\ldots,Y_N\}$; this embodiment uses the public epilepsy electroencephalogram data sets CHB-MIT and Kaggle;
step 2, establishing an addition network model, which comprises: a one-dimensional convolution layer, M addition convolution modules, an adaptive pooling layer, a projection layer and a classification layer;
the addition convolution module consists of an addition convolution layer and an addition convolution residual layer; the addition convolution kernel of the mth addition convolution module has size $h_m$ and stride $w_m$; a ReLU activation function and a batch normalization operation are applied between the addition convolution layers and between the addition convolution residual layers; only the first addition convolution module is followed by a max pooling operation; m = 1, 2, ..., M; this embodiment sets M = 3, with $h_1=11\times1$, $w_1=1\times1$; $h_2=5\times5$, $w_2=2\times2$; $h_3=5\times5$, $w_3=2\times2$;
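The preprocessing in step 1 can be illustrated with a short sketch. It assumes the raw recording is held as one list of values per channel and cuts it into non-overlapping windows of W = T×s sampling points; the function name, the data layout and the non-overlapping windowing are assumptions for illustration, since the text does not state how consecutive segments are positioned.

```python
def segment_recording(recording, selected_channels, duration_s, sampling_rate):
    """Cut a multi-channel recording into samples of shape W x H.

    recording: list of per-channel value lists (hypothetical layout).
    selected_channels: indices of the H channels to keep.
    Returns a list of samples; each sample is a list of W rows with H values.
    """
    window = int(duration_s * sampling_rate)  # W = T x s sampling points
    n_points = min(len(recording[ch]) for ch in selected_channels)
    samples = []
    # Assumption: windows do not overlap; a trailing partial window is dropped.
    for start in range(0, n_points - window + 1, window):
        sample = [[recording[ch][t] for ch in selected_channels]
                  for t in range(start, start + window)]
        samples.append(sample)
    return samples
```

For example, a 10-point recording sampled at 2 Hz and cut into 2-second segments yields two samples of 4 sampling points each, and the last 2 points are discarded.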
Step 2.1, initializing model parameters:
initializing weights of all convolution layers by using a device_unique_initialization;
step 2.2, inputting the nth electroencephalogram signal sample $X_n\in\mathbb{R}^{W\times H}$ into the addition network model, where the one-dimensional convolution layer first performs temporal feature extraction and data dimension reduction to obtain the nth one-dimensional convolution feature sequence, whose x-th element is the x-th feature map output by the one-dimensional convolution layer, and $C_0$ represents the number of feature maps in the nth one-dimensional convolution feature sequence; because the original electroencephalogram signal is used and therefore contains noise, the one-dimensional convolution serves to denoise the signal while also reducing the data dimension; in the experiments the convolution kernel size is 21×1, the stride is 1, and the max pooling size is 8×1;
step 2.3, processing of an addition convolution module:
step 2.3.1, when m=1, taking the nth one-dimensional convolution feature sequence as the input of the mth addition convolution module and denoting it as the feature sequence of the mth addition convolution module, whose x-th element is its x-th feature map, and $C_m$ represents the number of feature maps in the feature sequence of the mth addition convolution module;
step 2.3.2, processing the feature sequence of the mth addition convolution module with the mth addition convolution layer to obtain the mth addition convolution layer feature sequence, whose x-th element is the x-th feature map output by the mth addition convolution layer;
the feature sequence of the mth addition convolution module is also processed by the mth addition convolution residual layer to obtain the mth addition convolution residual layer feature sequence, whose x-th element is the x-th feature map output by the mth addition convolution residual layer; conventional convolution measures similarity by calculating the inner product between the feature map and the filter, whereas addition convolution measures similarity by calculating the L1 distance between the feature map and the filter; assume that the filter of a certain layer of the network has size h×w, with $c_{in}$ input channels and $c_{out}$ output channels, and that the input feature map has size H×W; the output feature O is calculated as in equation (1):
in equation (1), $1\le a\le H$, $1\le b\le W$, $1\le c\le c_{out}$, and the larger the output feature O, the higher the similarity between the two, as shown in fig. 2; the addition convolution calculates the L1 distance by changing the multiplication to a subtraction (in a computer, subtraction is carried out as addition of the complement), as in equation (2):
taking the negative of the L1 distance as the similarity measure, the larger the output feature O, the smaller the L1 distance and the higher the similarity between the two, as shown in fig. 3; since the values calculated by conventional convolution may be positive or negative, whereas the values calculated by addition convolution are never positive, the batch normalization used with conventional convolution is applied to the output so that conventional activation functions can be used effectively.
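Equations (1) and (2) are not reproduced in this text. Under the description above they would take roughly the forms $O(a,b,c)=\sum_{i}\sum_{j}\sum_{k}X(a+i,b+j,k)\,F(i,j,k,c)$ for conventional convolution and $O(a,b,c)=-\sum_{i}\sum_{j}\sum_{k}\left|X(a+i,b+j,k)-F(i,j,k,c)\right|$ for addition convolution, where the names X for the input feature map and F for the filter are assumptions, as are the exact index offsets. The sketch below implements both with one loop that differs only in the per-element similarity; it uses stride 1, no padding and zero-based indices, all of which are simplifying assumptions.

```python
def conv_output(x, f, similarity):
    """Slide filter f over feature map x and sum an element-wise similarity.

    x: nested lists of shape H x W x c_in.
    f: nested lists of shape h x w x c_in x c_out.
    Returns nested lists of shape (H-h+1) x (W-w+1) x c_out.
    """
    H, W, c_in = len(x), len(x[0]), len(x[0][0])
    h, w, c_out = len(f), len(f[0]), len(f[0][0][0])
    out = []
    for a in range(H - h + 1):
        row = []
        for b in range(W - w + 1):
            row.append([
                sum(similarity(x[a + i][b + j][k], f[i][j][k][c])
                    for i in range(h) for j in range(w) for k in range(c_in))
                for c in range(c_out)
            ])
        out.append(row)
    return out


def multiply(u, v):
    """Conventional convolution: inner-product similarity."""
    return u * v


def negative_l1(u, v):
    """Addition convolution: negated L1 distance, so the result is never positive."""
    return -abs(u - v)
```

Calling `conv_output(x, f, multiply)` gives the conventional result, while `conv_output(x, f, negative_l1)` uses only subtractions, absolute values and additions; its output reaches its maximum of 0 only where the input patch equals the filter exactly.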
step 2.3.3, adding the feature sequence output by the mth addition convolution layer to the feature sequence output by the mth addition convolution residual layer to obtain the mth fusion feature sequence, whose x-th element is its x-th feature map;
step 2.3.4, judging whether m=1 holds; if so, performing a max pooling operation on the mth fusion feature sequence to obtain the feature sequence of the (m+1)th addition convolution module; otherwise, taking the mth fusion feature sequence directly as the feature sequence of the (m+1)th addition convolution module; $C_{m+1}$ represents the number of feature maps in the feature sequence of the (m+1)th addition convolution module;
step 2.3.5, assigning m+1 to m and judging whether m > M holds; if so, proceeding to step 2.3.6; otherwise, returning to step 2.3.2;
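The control flow of steps 2.3.1 to 2.3.5 can be summarized in a few lines. In this sketch each layer is passed in as a plain callable and features are flat lists of numbers, which is a deliberate simplification; the function and parameter names are hypothetical, and the ReLU and batch normalization operations mentioned in step 2 are omitted.

```python
def run_addition_modules(features, modules, max_pool):
    """Pass features through M addition convolution modules in sequence.

    modules: list of (adder_layer, residual_layer) callables, one pair per module.
    max_pool: callable applied only after the first module (m = 1).
    """
    for m, (adder_layer, residual_layer) in enumerate(modules, start=1):
        main = adder_layer(features)         # step 2.3.2: addition convolution layer
        shortcut = residual_layer(features)  # step 2.3.2: residual layer, same input
        fused = [u + v for u, v in zip(main, shortcut)]    # step 2.3.3: fusion
        features = max_pool(fused) if m == 1 else fused    # step 2.3.4
    return features  # input to the adaptive pooling layer (step 2.3.6)
```

Both branches of a module read the same input, so the residual branch acts as a shortcut path whose output is summed with the main branch before the next module.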
step 2.3.6, processing the feature sequence of the Mth addition convolution module with the adaptive pooling layer to obtain the adaptive pooling layer feature vector, wherein $C_M$ represents the number of feature maps in the feature sequence of the Mth addition convolution module, and R represents the number of feature values in the feature vector output by the adaptive pooling layer;
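An adaptive pooling layer produces an output of fixed length regardless of the input length, which is what lets the fully connected layers that follow accept a feature vector of R values. The text does not say whether the pooling is average or max pooling; the sketch below assumes average pooling over one dimension and uses a floor/ceil rule for the bin boundaries, both of which are assumptions.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average-pool a sequence down to exactly output_size values."""
    n = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size                          # floor(i * n / output_size)
        end = ((i + 1) * n + output_size - 1) // output_size    # ceil((i + 1) * n / output_size)
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled
```

When the input length is not a multiple of the output size, neighbouring bins overlap slightly so that every input value is covered by at least one bin.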
Step 2.4, processing by the projection layer and the classification layer;
Step 2.4.1, the fully connected projection layer projects the feature vector output by the adaptive pooling layer into the feature space to obtain the projection layer feature vector, whose r-th element is its r-th feature value;
The adaptive pooling layer feature vector is simultaneously processed by the fully connected classification layer to obtain the probabilities $p_n = \{p_{n,1}, p_{n,2}, \ldots, p_{n,a}, \ldots, p_{n,k}\}$ that the n-th sample belongs to the different categories, where $p_{n,a}$ denotes the probability, output by the classification layer, that the n-th electroencephalogram signal sample $X_n$ belongs to class $a$, and $k$ denotes the number of categories. The projection layer projects the feature vectors into a feature space, and normalization scales their lengths to 1 so that all feature vectors fall on one hypersphere. The contrastive loss then pulls projection features of the same class together and pushes feature vectors of different classes apart, as shown in fig. 4;
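The two output heads can be illustrated with a short sketch. It assumes that the normalization is an L2 normalization and that the classification layer turns its scores into probabilities with a softmax; the source names neither operation explicitly, and the helper names `l2_normalize` and `softmax` are hypothetical.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a projection vector to unit length so that it lies on the hypersphere."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def softmax(scores):
    """Assumed classification output: convert k class scores into k probabilities."""
    top = max(scores)                                # subtract the max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


z = l2_normalize([3.0, 4.0])        # projection head output after normalization
p = softmax([2.0, 0.0])             # classification head output for k = 2 classes
print(z, p)
```

After normalization the dot product of two projection vectors equals their cosine similarity, which is the quantity a contrastive loss compares.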
Step 3, several samples are randomly drawn from the electroencephalogram signal samples of the training sample set X to form a batch of data, denoted $x = \{x_1, x_2, \ldots, x_i, \ldots, x_m\}$; the corresponding labels are denoted $y = \{y_1, y_2, \ldots, y_i, \ldots, y_m\}$, where $x_i$ denotes the i-th electroencephalogram signal sample in the batch, $y_i$ denotes the label of $x_i$, and $m$ denotes the number of samples in the batch;
After the batch of data is processed according to steps 2.2 to 2.4, the projection layer outputs one feature vector for each sample, namely the feature vector of the i-th electroencephalogram signal sample in the batch, and the classification layer outputs the probabilities $P = \{p_1, p_2, \ldots, p_i, \ldots, p_m\}$, where $p_i$ denotes the probability output by the classification layer for the i-th electroencephalogram signal sample in the batch;
The hybrid loss function $L$ is established using formulas (3) to (5):
$$L = \alpha L_{sup} + (1 - \alpha) L_{cross\text{-}entropy} \quad (3)$$
In formulas (3) to (5), $\alpha$ is a parameter that adjusts the weights of the two loss terms, $L_{sup}$ denotes the supervised contrastive loss, and $L_{cross\text{-}entropy}$ denotes the cross-entropy loss. The remaining quantities are: the number of samples in the batch that have the same label as the i-th electroencephalogram signal sample; an indicator that takes the value 1 when $y_j = y_i$ and $j \ne i$, and 0 otherwise; the feature vector of the j-th electroencephalogram signal sample in the batch output by the projection layer; $\tau$, a hyperparameter controlling the smoothness of training; and the feature vector of the t-th electroencephalogram signal sample in the batch output by the projection layer;
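A minimal sketch of a hybrid loss of this shape is given below. Formulas (4) and (5) are not legible in this text, so the supervised contrastive term follows the commonly used form (each anchor is compared with its same-label samples, normalized over all other samples in the batch) and the cross-entropy term is the standard mean negative log-probability; both are assumptions, as are the function names and the default values of `alpha` and `tau`.

```python
import math


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def supervised_contrastive_loss(z, y, tau=0.1):
    """z: unit-length projection vectors, y: integer labels (assumed standard form)."""
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and y[j] == y[i]]
        if not positives:
            continue                                  # anchor without same-label samples
        denom = sum(math.exp(_dot(z[i], z[t]) / tau) for t in range(n) if t != i)
        log_probs = [_dot(z[i], z[j]) / tau - math.log(denom) for j in positives]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy_loss(probs, y):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[label]) for p, label in zip(probs, y)) / len(y)


def hybrid_loss(z, probs, y, alpha=0.5, tau=0.1):
    """Weighted sum of the two terms, as in formula (3)."""
    return (alpha * supervised_contrastive_loss(z, y, tau)
            + (1 - alpha) * cross_entropy_loss(probs, y))


# Illustrative batch: two classes, two samples each (assumed values).
labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
uniform_probs = [[0.5, 0.5]] * 4
print(hybrid_loss(clustered, uniform_probs, labels),
      hybrid_loss(mixed, uniform_probs, labels))
```

Embeddings that cluster by label give a much smaller contrastive term than embeddings that mix labels, which is the behaviour the projection head is trained toward.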
Step 4, based on the training sample set X, the addition network model is trained with an Adam optimizer, the hybrid loss function $L$ is calculated, and the learning rate is adjusted during training by an adaptive learning rate method until the validation loss no longer decreases or the maximum number of training iterations is reached, yielding the trained addition network model used to classify electroencephalogram signals. Because the similarity calculation is changed, the gradient back-propagation function changes as well, which leads to large differences in gradient magnitude between layers. To let the network converge and learn a better model, the invention uses an adaptive learning rate method that adjusts the learning rate of each layer according to its number of parameters, as given by formulas (6) and (7):
$$\Delta F_l = \eta \times \theta_l \times \Delta L(F_l) \quad (6)$$
In formulas (6) and (7), $\eta$ denotes the global learning rate, $\theta_l$ is the local learning rate of the l-th layer of the network, and $\Delta L(F_l)$ is the gradient of the filter of the l-th layer; $\lambda$ denotes a hyperparameter controlling the magnitude of the local learning rate, and $z$ denotes the number of parameters of the l-th layer filter $F_l$;
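The effect of a layer-wise learning rate can be sketched as follows. Formula (7) is not legible in this text, so the sketch assumes the form used in the published addition-network literature, $\theta_l = \lambda \sqrt{z} / \lVert \Delta L(F_l) \rVert_2$; that form, the function names, and the sample gradients are assumptions rather than the patent's stated definition.

```python
import math


def local_learning_rate(grad, lam=1.0):
    """Assumed formula (7): theta_l = lam * sqrt(z) / ||grad||_2, z = parameter count."""
    z = len(grad)
    norm = math.sqrt(sum(g * g for g in grad))
    return lam * math.sqrt(z) / norm if norm > 0 else 0.0


def filter_update(grad, eta, lam=1.0):
    """Formula (6): delta F_l = eta * theta_l * gradient of the layer's filter."""
    theta = local_learning_rate(grad, lam)
    return [eta * theta * g for g in grad]


def l2_norm(vector):
    return math.sqrt(sum(v * v for v in vector))


# Two layers whose gradients differ by two orders of magnitude (assumed values).
small_grad = [3.0, 4.0]
large_grad = [300.0, 400.0]
print(l2_norm(filter_update(small_grad, eta=0.1)),
      l2_norm(filter_update(large_grad, eta=0.1)))
```

Under this assumption every layer receives an update of norm $\eta \lambda \sqrt{z}$ regardless of how large or small its raw gradient is, which is one way to compensate for the differing gradient magnitudes described above.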
Specifically, the addition network with supervised contrastive learning (SCL-AddNets) is compared with several advanced deep learning methods for electroencephalogram classification: a one-dimensional convolutional neural network (1D+CNN), a deep convolutional neural network with a multi-layer perceptron (DCNN+MLP), a deep convolutional neural network with a bidirectional long short-term memory network (DCNN+Bi-LSTM), and a residual network (ResCNN). The performance indicators on the CHB-MIT and Kaggle databases are as follows:
Table 1. Average performance of different methods for classifying electroencephalogram signals on the CHB-MIT database

| Method | Sensitivity (%) | AUC | FPR (/h) |
|---|---|---|---|
| 1D+CNN | 88.7 | 0.881 | 0.172 |
| DCNN+MLP | 87.8 | 0.861 | 0.208 |
| ResCNN | 89.9 | 0.911 | 0.140 |
| SCL-AddNets | 94.9 | 0.942 | 0.077 |

Table 2. Average performance of different methods for classifying electroencephalogram signals on the Kaggle database

| Method | Sensitivity (%) | AUC | FPR (/h) |
|---|---|---|---|
| 1D+CNN | 80.9 | 0.808 | 0.134 |
| DCNN+MLP | 82.9 | 0.811 | 0.156 |
| ResCNN | 81.2 | 0.829 | 0.161 |
| SCL-AddNets | 89.1 | 0.831 | 0.120 |

Table 3. Comparison of computational complexity and parameter count for different methods

| Method | Parameters (×10^6) | Multiplications | Additions | Energy consumption (mJ) | Delay |
|---|---|---|---|---|---|
| 1D+CNN | 1.07 | 0.80×10^9 | 0.80×10^9 | 3.68 | 4.80 |
| DCNN+MLP | 0.43 | 0.45×10^9 | 0.45×10^9 | 2.07 | 2.70 |
| ResCNN | 0.12 | 0.31×10^9 | 0.31×10^9 | 1.43 | 1.86 |
| SCL-AddNets | 0.12 | 7.57×10^6 | 0.54×10^9 | 0.51 | 1.11 |

The leave-one-out cross-validation results for the 19 subjects are shown in figures 5, 6 and 7.
Analysis of results:
The experimental results in Tables 1 and 2 show that, compared with the other deep learning methods for electroencephalogram signal classification (1D+CNN, DCNN+MLP and ResCNN), SCL-AddNets improves on every reported index: on both databases it predicts the pre-seizure period more accurately while reducing the number of false alarms in the inter-seizure period. Table 3 shows that SCL-AddNets replaces a large number of multiplications with additions, which greatly reduces both energy consumption and delay. In addition, figs. 5, 6 and 7 show that the model improves clearly for most subjects. Since the type, region and signal distribution of the electroencephalogram signals differ from subject to subject, this indicates that the method discriminates well and generalizes strongly across subjects.
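The energy and delay columns of Table 3 can be checked with a small worked calculation. The per-operation costs below are not stated in the source; they are assumed values (3.7 pJ and 4 cycles per multiplication, 0.9 pJ and 2 cycles per addition) that happen to reproduce the table, and the delay is assumed to be expressed in units of 10^9 cycles, a unit the table does not give.

```python
# Assumed per-operation costs (not stated in the source).
MULT_PJ, ADD_PJ = 3.7, 0.9          # energy per operation, picojoules
MULT_CYCLES, ADD_CYCLES = 4, 2      # latency per operation, cycles

# (multiplications, additions) copied from Table 3.
table3 = {
    "1D+CNN": (0.80e9, 0.80e9),
    "DCNN+MLP": (0.45e9, 0.45e9),
    "ResCNN": (0.31e9, 0.31e9),
    "SCL-AddNets": (7.57e6, 0.54e9),
}


def energy_mj(mults, adds):
    """Total energy in millijoules: picojoules -> joules -> millijoules."""
    return (mults * MULT_PJ + adds * ADD_PJ) * 1e-12 * 1e3


def delay_gcycles(mults, adds):
    """Total latency in units of 10^9 cycles (assumed unit)."""
    return (mults * MULT_CYCLES + adds * ADD_CYCLES) / 1e9


for name, (mults, adds) in table3.items():
    print(f"{name}: {energy_mj(mults, adds):.2f} mJ, {delay_gcycles(mults, adds):.2f}")
```

Under these assumptions the saving comes almost entirely from the multiplication count: SCL-AddNets performs more additions than ResCNN, yet its far smaller number of multiplications dominates both totals.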
In summary, the invention makes full use of the rich information contained in the raw EEG signal, reduces computational cost by using an addition network while maintaining classification accuracy, and incorporates supervised contrastive learning so that samples of the same class are drawn together and samples of different classes are pushed apart, thereby achieving more accurate electroencephalogram signal classification. In binary classification tests on the public CHB-MIT and Kaggle data sets, the method classifies pre-seizure electroencephalogram data more quickly and accurately while reducing the number of false alarms in the inter-seizure class, outperforming most conventional deep learning methods.

Claims (1)

1. An electroencephalogram signal classification method based on an addition network and supervised contrast learning, characterized by comprising the following steps:
Step 1, acquiring an electroencephalogram data set with labeling information, and preprocessing the original electroencephalogram signals in the data set by channel selection and sample segmentation to obtain N electroencephalogram signal samples of duration T, which form a training sample set denoted $X = \{X_1, X_2, \ldots, X_n, \ldots, X_N\}$, where $X_n \in \mathbb{R}^{W \times H}$ denotes the n-th electroencephalogram signal sample, $H$ denotes the number of channels of the electroencephalogram signal, $W = T \times s$ denotes the number of sampling points, and $s$ denotes the sampling rate of the electroencephalogram signal; the label corresponding to the n-th electroencephalogram signal sample $X_n$ is denoted $Y_n$, and the label set corresponding to the training sample set X is denoted $Y = \{Y_1, Y_2, \ldots, Y_n, \ldots, Y_N\}$;
Step 2, establishing an addition network model comprising: a one-dimensional convolution layer, M addition convolution modules, an adaptive pooling layer, a projection layer and a classification layer;
the addition convolution module consists of an addition convolution layer and an addition convolution residual layer; the addition convolution kernel of the m-th addition convolution module has size $h_m$ and stride $w_m$; a ReLU activation function and a batch normalization operation are used between the addition convolution layers and between the addition convolution residual layers; only the first addition convolution module is provided with a max pooling operation; $m = 1, 2, \ldots, M$;
Step 2.1, initializing model parameters:
initializing weights of all convolution layers by using a device_unique_initialization;
Step 2.2, the n-th electroencephalogram signal sample $X_n \in \mathbb{R}^{W \times H}$ is input into the addition network model, and after temporal feature extraction and data dimension reduction by the one-dimensional convolution layer the n-th one-dimensional convolution feature sequence is obtained, where $x$ indexes the feature maps of the n-th one-dimensional convolution feature sequence output by the one-dimensional convolution layer and $C_0$ denotes the number of feature maps in it;
Step 2.3, processing by the addition convolution modules:
Step 2.3.1, when m = 1, the n-th one-dimensional convolution feature sequence is taken as the input of the m-th addition convolution module and denoted as the feature sequence of the m-th addition convolution module, where $x$ indexes the feature maps of that feature sequence and $C_m$ denotes the number of feature maps in it;
Step 2.3.2, the feature sequence of the m-th addition convolution module is processed by the m-th addition convolution layer to obtain the feature sequence of the m-th addition convolution layer, whose x-th element is the x-th feature map output by the m-th addition convolution layer;
the feature sequence of the m-th addition convolution module is likewise processed by the m-th addition convolution residual layer to obtain the feature sequence of the m-th addition convolution residual layer, whose x-th element is the x-th feature map output by the m-th addition convolution residual layer;
Step 2.3.3, the feature sequence output by the m-th addition convolution layer and the feature sequence output by the m-th addition convolution residual layer are added to obtain the m-th fusion feature sequence, whose x-th element is the x-th fused feature map;
Step 2.3.4, determining whether m = 1 holds; if so, a max pooling operation is applied to the m-th fusion feature sequence to obtain the feature sequence of the (m+1)-th addition convolution module; otherwise, the m-th fusion feature sequence is taken as the feature sequence of the (m+1)-th addition convolution module; where $x$ indexes the feature maps of the feature sequence of the (m+1)-th addition convolution module and $C_{m+1}$ denotes the number of feature maps in it;
Step 2.3.5, assigning m+1 to m and determining whether $m > M$ holds; if so, going to step 2.3.6; otherwise, returning to step 2.3.2;
Step 2.3.6, the feature sequence of the M-th addition convolution module is processed by the adaptive pooling layer to obtain a feature vector, where $x$ indexes the feature maps of that feature sequence, $C_M$ denotes the number of feature maps in it, and $R$ denotes the number of feature values in the feature vector output by the adaptive pooling layer;
Step 2.4, processing by the projection layer and the classification layer;
Step 2.4.1, the fully connected projection layer projects the feature vector output by the adaptive pooling layer into the feature space to obtain the projection layer feature vector, whose r-th element is its r-th feature value;
the adaptive pooling layer feature vector is simultaneously processed by the fully connected classification layer to obtain the probabilities $p_n = \{p_{n,1}, p_{n,2}, \ldots, p_{n,a}, \ldots, p_{n,k}\}$ that the n-th sample belongs to the different categories, where $p_{n,a}$ denotes the probability, output by the classification layer, that the n-th electroencephalogram signal sample $X_n$ belongs to class $a$, and $k$ denotes the number of categories;
Step 3, several samples are randomly drawn from the electroencephalogram signal samples of the training sample set X to form a batch of data, denoted $x = \{x_1, x_2, \ldots, x_i, \ldots, x_m\}$; the corresponding labels are denoted $y = \{y_1, y_2, \ldots, y_i, \ldots, y_m\}$, where $x_i$ denotes the i-th electroencephalogram signal sample in the batch, $y_i$ denotes the label of $x_i$, and $m$ denotes the number of samples in the batch;
after the batch of data is processed according to steps 2.2 to 2.4, the projection layer outputs one feature vector for each sample, namely the feature vector of the i-th electroencephalogram signal sample in the batch, and the classification layer outputs the probabilities $P = \{p_1, p_2, \ldots, p_i, \ldots, p_m\}$, where $p_i$ denotes the probability output by the classification layer for the i-th electroencephalogram signal sample in the batch;
establishing the hybrid loss function $L$ using formulas (1) to (3):
$$L = \alpha L_{sup} + (1 - \alpha) L_{cross\text{-}entropy} \quad (1)$$
In formulas (1) to (3), $\alpha$ is a parameter that adjusts the weights of the two loss terms, $L_{sup}$ denotes the supervised contrastive loss, and $L_{cross\text{-}entropy}$ denotes the cross-entropy loss; the remaining quantities are: the number of samples in the batch that have the same label as the i-th electroencephalogram signal sample; an indicator that takes the value 1 when $y_j = y_i$ and $j \ne i$, and 0 otherwise; the feature vector of the j-th electroencephalogram signal sample in the batch output by the projection layer; $\tau$, a hyperparameter controlling the smoothness of training; and the feature vector of the t-th electroencephalogram signal sample in the batch output by the projection layer;
Step 4, based on the training sample set X, training the addition network model with an Adam optimizer, calculating the hybrid loss function $L$, and adjusting the learning rate during training by an adaptive learning rate method until the validation loss no longer decreases or the maximum number of training iterations is reached, thereby obtaining the trained addition network model used to classify electroencephalogram signals.
CN202210253209.3A 2022-03-15 2022-03-15 Electroencephalogram signal classification method based on addition network and supervised contrast learning Active CN114595725B (en)

Publications (2)

Publication Number Publication Date
CN114595725A CN114595725A (en) 2022-06-07
CN114595725B true CN114595725B (en) 2024-02-20
