CN109841229A - A neonatal cry recognition method based on dynamic time warping - Google Patents
A neonatal cry recognition method based on dynamic time warping
- Publication number: CN109841229A (application CN201910134910.1A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Landscapes: Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
The invention belongs to the field of biometric identification technology, and specifically relates to a neonatal cry recognition method based on dynamic time warping. The method comprises: collecting neonatal cry sounds under three conditions and labeling them as hunger, pain, and unknown-cause crying; preprocessing the cries, including downsampling, pre-emphasis, framing and windowing, and endpoint detection; extracting features from the preprocessed cry signals, namely short-time energy, fundamental frequency, and Mel-frequency cepstral coefficients (MFCCs) of orders 0 to 12; performing one-way analysis of variance on the feature parameters to screen out the features that differ significantly across the three cry classes; selecting a reference template for each of the three classes using the dynamic time warping (DTW) algorithm; and performing three two-class tasks, matching with the DTW algorithm and computing the recognition rate. The method has low computational complexity, which speeds up computation, and because the DTW algorithm does not require two samples to be of equal length, the step of normalizing sample lengths in preprocessing is eliminated.
Description
Technical field
The present invention belongs to the field of biometric identification technology, and in particular relates to a neonatal cry recognition method.
Background art
Newborns have no capacity for verbal expression; their only way of expressing needs and emotions is crying. Excessive crying is one of the problems mothers complain about most, leaving them anxious and helpless, and it is also one of the important causes of postpartum depression. Excessive crying likewise brings a heavy workload and much trouble to clinical and child health care departments.
At present, most research on neonatal crying has focused on analyzing the differences between cries under different conditions, mainly starting from visual differences such as spectrograms, with the purpose of classification being to compare the discriminative power of different features. Most approaches use a classifier for recognition, which requires a large dataset and occupies considerable computer memory.
Summary of the invention
The purpose of the present invention is to provide a neonatal cry recognition method that requires little data, is efficient, and has low computational complexity.
The neonatal cry recognition method provided by the present invention is based on the dynamic time warping technique: the acquired signals are preprocessed and then classified directly. The specific steps are as follows:
Step 1: Using a ZOOM-H4N portable recorder, record cry sounds under three conditions: about 10-20 minutes before feeding, during injection, and 5-10 minutes after feeding, and label them as hunger, pain, and unknown-cause crying;
Step 2: Preprocess the cry data, including downsampling, pre-emphasis, framing and windowing, and endpoint detection; then extract features from the preprocessed cry signal, namely short-time energy, fundamental frequency, and Mel-frequency cepstral coefficients of orders 0 to 12;
Step 3: Perform one-way analysis of variance on the obtained feature parameters (Guo Ping. Application of one-way analysis of variance in mathematical statistics [J]. Journal of Changchun University, 2014(10): 1370-1373) and screen out the features with significant differences among the three cry classes;
Step 4: Using the dynamic time warping (DTW) algorithm (Anil K. Jain, Friederike D. Griess, Scott D. Connell. On-line signature verification. Pattern Recognition, 2002, (35): 2963-2972), select a reference template for each of the three classes;
Step 5: Perform three two-class tasks, carry out matching and recognition with the dynamic time warping algorithm, and compute the recognition rate.
In step 1 of the present invention, when collecting cries the recorder is placed 5-10 centimeters from the newborn's mouth, so as to avoid excessive noise.
In step 2 of the present invention, the cry data are preprocessed by downsampling the original sampling frequency to 8000 Hz-16000 Hz; the pre-emphasis coefficient is 0.95-0.98, the frame length is 10-30 ms, and the frame shift is 5-10 ms. Endpoint detection uses the double-threshold method, and the fundamental frequency is extracted with the cepstrum method.
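The preprocessing chain just described (pre-emphasis, framing with a window, and double-threshold endpoint detection) can be sketched in Python. This is an illustrative sketch rather than the patent's implementation: the function names are hypothetical, the endpoint detector is a much-simplified single-pass stand-in for a full energy/zero-crossing double-threshold detector, and the parameter values are taken from the ranges given in the text.

```python
import numpy as np

def preprocess(signal, fs=8000, alpha=0.97, frame_ms=25, shift_ms=10):
    """Pre-emphasis, framing, and Hamming windowing.

    alpha (0.95-0.98), frame length (10-30 ms), and frame shift
    (5-10 ms) follow the ranges stated in the text; the exact
    values used here are illustrative defaults.
    """
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    frame_len = int(fs * frame_ms / 1000)
    shift = int(fs * shift_ms / 1000)
    n_frames = 1 + (len(emphasized) - frame_len) // shift
    frames = np.stack([emphasized[i * shift: i * shift + frame_len]
                       for i in range(n_frames)])
    # Apply a Hamming window to every frame
    return frames * np.hamming(frame_len)

def double_threshold_endpoints(frames, hi=0.25, lo=0.05):
    """Simplified double-threshold detector: frames above the high
    energy threshold are voiced; voiced regions are then extended
    forward while energy stays above the low threshold."""
    energy = np.sum(frames ** 2, axis=1)
    e_hi, e_lo = hi * energy.max(), lo * energy.max()
    voiced = energy > e_hi
    for i in range(1, len(voiced)):
        if not voiced[i] and voiced[i - 1] and energy[i] > e_lo:
            voiced[i] = True
    return voiced
```

With an 8000 Hz signal, a 25 ms frame is 200 samples and a 10 ms shift is 80 samples, so one second of audio yields 98 frames.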
In step 3 of the present invention, the one-way analysis of variance is applicable to pairwise analysis of three or more classes. The p-value boundary is set to a value in 0.01-0.05; when the p-value is less than the set boundary, the difference is significant. The specific boundary depends on the specific situation; in the embodiment, it is set to 0.05.
In step 4 of the present invention, the purpose of the dynamic time warping algorithm is to measure the similarity between two templates; the similarity is expressed as the distance measure between the two templates along the optimal path.
The present invention has the advantage that effective classification features are screened out by one-way analysis of variance, making the subsequent classification more efficient. The dynamic time warping algorithm is undemanding in terms of data volume, that is, classification is possible with a small dataset, and its computational complexity is low, which speeds up computation. Moreover, the dynamic time warping algorithm does not require two samples to be of equal length, eliminating the step of normalizing sample lengths in preprocessing.
Brief description of the drawings
Fig. 1 is a flowchart of the neonatal cry recognition method based on dynamic time warping according to the present invention.
Figs. 2-4 are examples of original cry signals (10 s excerpts) under the three conditions in a preferred embodiment of the present invention; Fig. 2 is hunger crying, Fig. 3 is pain crying, and Fig. 4 is unknown-cause crying.
Figs. 5-10 are box plots of the six significantly different features screened out by the present invention; Fig. 5 is short-time energy, Fig. 6 is fundamental frequency, Fig. 7 is 0th-order MFCC, Fig. 8 is 1st-order MFCC, Fig. 9 is 6th-order MFCC, and Fig. 10 is 10th-order MFCC.
Specific embodiment
The present invention is described in detail below with reference to the drawings and a preferred embodiment.
The present invention provides a neonatal cry recognition method based on dynamic time warping; a schematic flowchart of the method is shown in Fig. 1. The method is realized by the following steps:
Step 1: The ZOOM-H4N portable recorder is placed five to ten centimeters from the newborn's mouth so as to avoid excessive noise. The recorder's sampling frequency is set to 44.1 kHz with a bit depth of 16 bits. Cry sounds are recorded under three conditions: 15 minutes before feeding, during injection, and 10 minutes after feeding, and labeled as hunger, pain, and unknown-cause crying. Examples of the original signals (10 s excerpts) collected under the three conditions in this embodiment are shown in Figs. 2-4.
Step 2: The collected cries are first downsampled to reduce the data volume; the frequency after downsampling is 8000 Hz, the pre-emphasis coefficient is 0.98, the frame length is 25 ms, and the frame shift is 10 ms. Endpoint detection uses the double-threshold method. Features are then extracted from the preprocessed cry signals, namely short-time energy, fundamental frequency, and Mel-frequency cepstral coefficients of orders 0 to 12. The fundamental frequency is extracted with the cepstrum method. The steps for extracting the MFCCs are:
(1) Apply the discrete Fourier transform (DFT) to the preprocessed signal to obtain the short-time spectrum of each speech frame;
(2) Weight and filter the magnitude of the short-time spectrum with a Mel filter bank;
(3) Take the logarithm of all the output values of the Mel filter bank;
(4) Apply the discrete cosine transform (DCT) to the logarithmic values to obtain the Mel-frequency cepstral coefficients (MFCCs).
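The four MFCC steps above can be sketched for a single windowed frame as follows. This is a minimal illustrative implementation, not the patent's code: the filter count (26) and FFT size (512) are conventional choices that the text does not specify, and the function names are hypothetical.

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, fs=8000, n_filters=26, n_coeffs=13, n_fft=512):
    """MFCCs of one windowed frame, following the four steps in the
    text: DFT -> Mel filter bank -> log -> DCT.  Orders 0-12 (13
    coefficients) match the features extracted in the patent."""
    # (1) short-time power spectrum via the DFT
    spectrum = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    # (2) triangular Mel filter bank applied to the spectrum
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    energies = np.maximum(fbank @ spectrum, 1e-10)  # avoid log(0)
    # (3) logarithm of the filter-bank outputs
    log_e = np.log(energies)
    # (4) DCT decorrelates the log energies into cepstral coefficients
    return dct(log_e, type=2, norm='ortho')[:n_coeffs]
```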
Step 3: One-way analysis of variance is performed on the obtained feature parameters to screen out the features with significant differences among the three cry classes. One-way analysis of variance is applicable to pairwise analysis of three or more classes; the p-value boundary is set to 0.05, and a difference is significant when the p-value is less than 0.05. Six features (short-time energy, fundamental frequency, 0th-order MFCC, 1st-order MFCC, 6th-order MFCC, and 10th-order MFCC) are screened out as significantly different; their distributions are shown in Figs. 5-10.
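The feature screening step can be sketched with `scipy.stats.f_oneway`. This is an illustrative sketch under the assumption that each cry sample has been reduced to one feature vector (e.g. per-sample means of short-time energy, F0, and each MFCC order); the patent does not state that exact reduction, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import f_oneway

def screen_features(hungry, pain, unknown, p_boundary=0.05):
    """One-way ANOVA per feature across the three cry classes.

    Each argument is an (n_samples, n_features) array of per-sample
    feature values.  Returns the indices of features whose p-value
    falls below the boundary, which the text sets in 0.01-0.05."""
    keep = []
    for j in range(hungry.shape[1]):
        _, p = f_oneway(hungry[:, j], pain[:, j], unknown[:, j])
        if p < p_boundary:
            keep.append(j)
    return keep
```

On synthetic data where only the first feature differs across classes, only that feature should reliably survive the screening.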
Step 4: A reference template is selected for each of the three classes with the dynamic time warping algorithm. The purpose of the dynamic time warping algorithm is to measure the similarity between two templates; the similarity refers to the distance measure between the two templates along the optimal path, with the formula:
D = Σ_{n=1}^{N} d(n, T(n))
where T(n) is the time warping function mapping frame n of the test template onto the reference template (chosen so as to minimize D), N is the number of frames of the test template, and d(n, T(n)) is the Euclidean distance between the two frame vectors.
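The accumulated-distance formula above can be realized with the standard dynamic-programming recurrence. This is a minimal sketch of classical DTW, not the patent's specific implementation (the text does not give local path constraints, so the common symmetric step pattern is assumed):

```python
import numpy as np

def dtw_distance(test, ref):
    """DTW distance between two feature sequences of shape
    (frames, dims).  Accumulates the Euclidean frame distance
    along the optimal warping path, so sequences of different
    lengths can be compared directly, as the text notes."""
    n, m = len(test), len(ref)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(test[i - 1] - ref[j - 1])
            # symmetric step pattern: insertion, deletion, match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because warping absorbs tempo differences, a sequence compared with a stretched copy of itself yields a distance of zero.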
The steps for choosing a reference template are:
(1) Compute the distance measure D pairwise among the m samples of each class (e.g. m = 30), i.e. m(m-1)/2 = 435 computations; the pair of samples with the minimum distance D is selected as the candidate reference pair and labeled A and B;
(2) A and B each accumulate their distances to the other samples of the class:
D_A = Σ_{i≠A} D(A, i), D_B = Σ_{i≠B} D(B, i);
(3) Compare D_A with D_B; the candidate with the smaller value is chosen as the reference template.
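The three selection steps above can be sketched as follows, operating on a precomputed matrix of pairwise DTW distances (an assumption made here to keep the sketch self-contained; the function name is hypothetical):

```python
import numpy as np
from itertools import combinations

def choose_reference_template(dist):
    """Select one reference template from a class, following the
    three steps in the text.  `dist` is a symmetric (m, m) matrix
    of pairwise DTW distances among the m samples of the class
    (m(m-1)/2 = 435 distinct pairs for m = 30)."""
    m = dist.shape[0]
    # (1) the closest pair (A, B) becomes the candidate pair
    pairs = list(combinations(range(m), 2))
    a, b = min(pairs, key=lambda p: dist[p[0], p[1]])
    # (2) each candidate accumulates its distance to all other samples
    d_a = sum(dist[a, k] for k in range(m) if k != a)
    d_b = sum(dist[b, k] for k in range(m) if k != b)
    # (3) the smaller cumulative distance wins
    return a if d_a <= d_b else b
```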
Step 5: Three two-class tasks are performed, with matching and recognition by the dynamic time warping algorithm. In each two-class task, the distance between the test sample and each reference template is computed, and the recognition rate is calculated.
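The decision rule in step 5 reduces to a nearest-reference-template comparison; the sketch below illustrates it. The helper names and the `recognition_rate` function are illustrative additions, not from the patent text; each pairwise two-class task is simply the arg-min restricted to two classes.

```python
def classify(dist_to_templates):
    """Nearest-reference-template decision for one test sample.

    `dist_to_templates` maps a class label to the DTW distance
    between the test sample and that class's reference template;
    the predicted label is the one with the smallest distance."""
    return min(dist_to_templates, key=dist_to_templates.get)

def recognition_rate(predictions, labels):
    """Fraction of correctly recognised samples."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)
```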
In this embodiment, 90 samples are collected in total from 72 full-term newborns, with 30 samples under each cause.
It is verified that, in this embodiment, the recognition rate for pain crying is the highest, reaching 93.1%. This also verifies that the six screened features are effective for classifying neonatal cries. The recognition results are given in Table 1.
It should be pointed out that the above embodiment is merely illustrative of the present invention; the implementation of each step may vary, and various modifications to these embodiments will be apparent to those skilled in the art. Therefore, all equivalents and improvements made on the basis of the general principles and spirit of the present invention shall fall within the protection scope of the present invention.
Table 1
Claims (8)
1. A neonatal cry recognition method based on dynamic time warping, characterized in that the specific steps are as follows:
Step 1: Using a ZOOM-H4N portable recorder, record cry sounds under three conditions: 10-20 minutes before feeding, during injection, and 5-10 minutes after feeding, and label them as hunger, pain, and unknown-cause crying;
Step 2: Preprocess the cry data, including downsampling, pre-emphasis, framing and windowing, and endpoint detection; extract features from the preprocessed cry signal, namely short-time energy, fundamental frequency, and Mel-frequency cepstral coefficients of orders 0 to 12;
Step 3: Perform one-way analysis of variance on the obtained feature parameters and screen out the features with significant differences among the three cry classes;
Step 4: Select a reference template for each of the three classes using the dynamic time warping algorithm;
Step 5: Perform three two-class tasks, carry out matching and recognition with the dynamic time warping algorithm, and compute the recognition rate.
2. The neonatal cry recognition method according to claim 1, characterized in that when collecting cries in step 1, the recorder is placed 5-10 centimeters from the newborn's mouth, so as to avoid excessive noise.
3. The neonatal cry recognition method according to claim 1 or 2, characterized in that the cry data in step 2 are preprocessed by downsampling the original frequency to 8000 Hz-16000 Hz, with a pre-emphasis coefficient of 0.95-0.98, a frame length of 10-30 ms, and a frame shift of 5-10 ms; endpoint detection uses the double-threshold method, and the fundamental frequency is extracted with the cepstrum method.
4. The neonatal cry recognition method according to claim 3, characterized in that the method for extracting the fundamental frequency in step 2 is the cepstrum method.
5. The neonatal cry recognition method according to claim 4, characterized in that the one-way analysis of variance in step 3 is applicable to pairwise analysis of three or more classes, wherein the p-value boundary is set to a value in 0.01-0.05, and a difference is significant when the p-value is less than the set boundary.
6. The neonatal cry recognition method according to claim 5, characterized in that the purpose of the dynamic time warping algorithm in step 4 is to measure the similarity between two templates.
7. The neonatal cry recognition method according to claim 6, characterized in that the similarity in step 4 refers to the distance measure between the two templates along the optimal path, with the formula:
D = Σ_{n=1}^{N} d(n, T(n))
where T(n) is the time warping function mapping frame n of the test template onto the reference template, N is the number of frames of the test template, and d(n, T(n)) is the Euclidean distance between the two frame vectors.
8. The neonatal cry recognition method according to claim 7, characterized in that the distance measure between two frame vectors x and y is expressed by the Euclidean distance, with the formula:
d(x, y) = sqrt( Σ_{i=1}^{k} (x_i - y_i)^2 )
where k is the dimension of the frame vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910134910.1A CN109841229A (en) | 2019-02-24 | 2019-02-24 | A kind of Neonate Cry recognition methods based on dynamic time warping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109841229A true CN109841229A (en) | 2019-06-04 |
Family
ID=66884847
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910134910.1A Pending CN109841229A (en) | 2019-02-24 | 2019-02-24 | A kind of Neonate Cry recognition methods based on dynamic time warping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109841229A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110619893A (en) * | 2019-09-02 | 2019-12-27 | 合肥工业大学 | Time-frequency feature extraction and artificial intelligence emotion monitoring method of voice signal |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1564245A (en) * | 2004-04-20 | 2005-01-12 | 上海上悦通讯技术有限公司 | Stunt method and device for baby's crying |
WO2006125268A1 (en) * | 2005-05-25 | 2006-11-30 | Craig Matthew Erskine-Smith | Dental brush |
CN101436405A (en) * | 2008-12-25 | 2009-05-20 | 北京中星微电子有限公司 | Method and system for recognizing speaking people |
CN101807396A (en) * | 2010-04-02 | 2010-08-18 | 陕西师范大学 | Device and method for automatically recording crying of babies |
CN102283836A (en) * | 2011-06-27 | 2011-12-21 | 苏州大学附属第一医院 | Application of sorafenib in treatment of early brain injury (EBI) after subarachnoid hemorrhage (SAH) |
CN102302491A (en) * | 2011-06-27 | 2012-01-04 | 苏州大学附属第一医院 | Application of sorafenib to treatment of cerebral vasospasm (CVS) after subarachnoid hemorrhage (SAH) |
CN102509547A (en) * | 2011-12-29 | 2012-06-20 | 辽宁工业大学 | Method and system for voiceprint recognition based on vector quantization based |
CN102881284A (en) * | 2012-09-03 | 2013-01-16 | 江苏大学 | Unspecific human voice and emotion recognition method and system |
CN108717497A (en) * | 2018-05-23 | 2018-10-30 | 大连海事大学 | Imitative stichopus japonicus place of production discrimination method based on PCA-SVM |
- 2019-02-24: CN CN201910134910.1A patent/CN109841229A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190604 |