Summary of the invention:
The main object of the present invention is to propose a human action recognition method based on depth motion maps with fuzzy-boundary slicing, which captures the temporal information of motion and performs classification with the robust probabilistic collaborative representation classifier R-ProCRC (Robust Probabilistic Collaborative Representation based Classifier) [5], thereby improving recognition accuracy.
To achieve the above object, the present invention provides the following technical solution, which comprises a training stage and a test stage.
The training stage of the human action recognition method based on depth motion maps with fuzzy-boundary slicing is as follows:
Step 1: a training set {(X^(k), Y^(k))}_{k∈[1,M]} of depth map samples of human action video sequences is given, wherein X^(k) = {x_i^(k)}_{i∈[1,N_k]} denotes the depth map sequence of the k-th training sample, x_i^(k) is the original depth image of the i-th frame in the k-th sample, and N_k is the total number of frames of the k-th sample; Y^(k) denotes the action class of the k-th training sample; M denotes the number of samples in the training set;
Step 2: each video sequence training sample X^(k) in the training set is divided directly along the time axis into DIV equal-length time slices, each of length L = ⌊N_k/DIV⌋; the time slices after division are denoted {S_j^(k)}_{j∈[1,DIV]};
Step 3: a suitable fuzzy parameter α is selected, and fuzzy slicing is applied to each time slice; the fuzzy time slices are denoted {FS_j^(k)}_{j∈[1,DIV]}; to avoid index overflow, no fuzzy extension toward earlier frames is applied to the first time slice, and no fuzzy extension toward later frames is applied to the last time slice;
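The slicing of Steps 2 and 3 can be sketched as follows. This is a minimal illustration in Python under our own reading of the text (all names are ours, not the patent's): each slice is extended by ⌊αL⌋ frames into its neighbours, except at the two sequence ends, and the last slice absorbs any remainder frames.

```python
def fuzzy_slices(n_frames, div, alpha):
    """Return (start, end) frame-index pairs (end exclusive) for `div`
    equal-length time slices whose boundaries are blurred by fraction
    `alpha`. The first slice gets no extension toward earlier frames
    and the last none toward later frames, avoiding index overflow."""
    length = n_frames // div          # slice length, rounded down
    ext = int(alpha * length)         # number of shared boundary frames
    slices = []
    for j in range(div):
        start = j * length
        # last slice takes the remainder frames of the sequence
        end = (j + 1) * length if j < div - 1 else n_frames
        if j > 0:
            start -= ext              # fuzzy extension toward earlier frames
        if j < div - 1:
            end += ext                # fuzzy extension toward later frames
        slices.append((start, end))
    return slices
```

For example, a 30-frame sequence with DIV = 3 and α = 0.8 gives slices (0, 18), (2, 28), (12, 30): adjacent slices overlap, so boundary information is shared.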
Step 4: for each fuzzy time slice FS_j^(k), the depth motion maps DMM_{j,v}^(k) in three different projection directions are computed, where v ∈ {f, s, t} indexes the three projection views (three directions) of the video sequence, namely the front, side and top views; thus the depth motion map set {DMM_{j,v}^(k)}_{j∈[1,DIV],v∈{f,s,t}} corresponding to every training sample X^(k) is obtained;
Step 5: the depth motion maps DMM_{j,v}^(k) obtained in Step 4 are resized to a common size using bicubic interpolation, and these depth motion maps are normalized to the range 0 to 1;
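The normalization of Step 5 can be sketched as follows; a minimal Python illustration of a linear rescaling into [0, 1] (the resize to the fixed per-view size would be done beforehand, e.g. with a bicubic resize from an image library; the function name is ours):

```python
import numpy as np

def normalize01(dmm):
    """Scale one depth motion map linearly into the range [0, 1]."""
    d = np.asarray(dmm, dtype=float)
    lo, hi = d.min(), d.max()
    if hi == lo:                      # constant map: avoid division by zero
        return np.zeros_like(d)
    return (d - lo) / (hi - lo)
```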
Step 6: the set of normalized depth motion maps corresponding to any training sample X^(k) is vectorized, and the vectorized motion maps are concatenated in sequence, completing the feature construction for training sample X^(k); this feature is denoted H^(k), so the feature set of all samples is {H^(k)}_{k∈[1,M]};
Step 7: the output features {H^(k)}_{k∈[1,M]} of all samples obtained in Step 6 are reduced in dimension by PCA, and the dimension-reduced features of all samples are saved.
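Steps 6 and 7 can be sketched as follows; a minimal Python illustration assuming the normalized DMMs of each sample are already available as equally sized 2-D arrays. The function names and the plain SVD-based PCA are our own sketch, not the patent's exact implementation:

```python
import numpy as np

def build_features(dmm_sets):
    """Step 6: vectorize every normalized DMM of a sample and chain the
    vectors into one feature per sample. `dmm_sets` holds one list of
    2-D arrays per sample (DIV slices x 3 views)."""
    return np.stack([np.concatenate([np.asarray(d).ravel() for d in dmms])
                     for dmms in dmm_sets])

def pca_reduce(H, n_components):
    """Step 7: plain PCA via SVD on the centered feature matrix,
    keeping `n_components` dimensions."""
    mean = H.mean(axis=0)
    _, _, Vt = np.linalg.svd(H - mean, full_matrices=False)
    comps = Vt[:n_components]         # principal directions
    return (H - mean) @ comps.T, mean, comps
```

The same `mean` and `comps` would be reused to project a test feature, matching the requirement that the test-stage PCA be identical to the training stage.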
The test stage of the human action recognition method based on depth motion maps with fuzzy-boundary slicing is as follows:
Step 1: a test sample (TestX, TestY) of the depth map sequence of a human action video is given, wherein TestX = {x_i}_{i∈[1,N_T]} denotes the depth map sequence of the test sample, x_i is the original depth image of the i-th frame in the test sample, and N_T is the total number of frames of the test sample; TestY denotes the action class of the test sample;
Step 2: the test sample TestX is divided directly along the time axis into DIV equal-length time slices (in the same manner as the training stage), each of length L = ⌊N_T/DIV⌋; the time slices after division are denoted {S_j}_{j∈[1,DIV]};
Step 3: using the fuzzy parameter α of the training stage, fuzzy slicing is applied to each time slice; the fuzzy time slices are denoted {FS_j}_{j∈[1,DIV]}; to avoid index overflow, no fuzzy extension toward earlier frames is applied to the first time slice, and no fuzzy extension toward later frames is applied to the last time slice;
Step 4: for each fuzzy time slice FS_j, the depth motion maps DMM_{j,v} in three different projection directions are computed, where v ∈ {f, s, t} indexes the three projection views (three directions) of the video sequence, namely the front, side and top views; thus the depth motion map set {DMM_{j,v}}_{j∈[1,DIV],v∈{f,s,t}} corresponding to the test sample TestX is obtained;
Step 5: the test sample depth motion maps {DMM_{j,v}}_{j∈[1,DIV],v∈{f,s,t}} obtained in Step 4 are resized with bicubic interpolation to the same size as in the training stage, and, following the training-stage normalization method, these depth motion maps are normalized to the range 0 to 1;
Step 6: the set of normalized depth motion maps corresponding to the test sample TestX is vectorized, and the vectorized motion maps are concatenated in sequence, completing the feature construction for TestX; this feature is denoted H_T;
Step 7: the output feature H_T of the test sample TestX obtained in Step 6 is reduced in dimension by PCA, yielding the dimension-reduced feature; the PCA dimension reduction is identical to that of the training stage;
Step 8: the dimension-reduced output feature is then fed into the R-ProCRC [5] classifier, obtaining the classification output PridY;
Step 9: PridY is compared with TestY; if PridY = TestY the recognition is correct, otherwise it is wrong.
Compared with the prior art, the invention has the following advantages:
1. This method performs human action recognition on depth data. Compared with traditional color video data, depth data allows efficient segmentation of the human body while preserving its shape and structure, which helps improve classification accuracy.
2. The traditional depth motion map (DMM) feature extraction method projects the entire video onto a single DMM and thereby loses temporal information. The proposed human action recognition method based on depth motion maps with fuzzy-boundary slicing slices the depth map sequence along the time dimension and thus effectively captures how the temporal features evolve.
3. To address the temporal variability of human actions, the proposed method controls the boundaries between slices with the fuzzy parameter α so that adjacent slices share information, further improving the robustness of the features against temporal differences; used together with the R-ProCRC classifier, recognition accuracy is significantly improved.
Specific embodiment
To better illustrate the object, specific steps and features of the present invention, the invention is described in further detail below with reference to the accompanying drawings, taking the MSR Action3D dataset as an example.
In the human action recognition method based on depth motion maps with fuzzy-boundary slicing proposed by the present invention, the feature extraction flow is shown in Figure 1. A sample is first divided into equal-length slices, and the fuzziness of the slice boundaries is determined by the parameter α; the depth motion map DMM of the video sequence within each slice is then computed; the DMMs of all samples are resized to a common size using bicubic interpolation and normalized; and the subsequence features are concatenated after vectorization, completing the construction of the output feature of a training sample.
The human action recognition method based on depth motion maps with fuzzy-boundary slicing proposed by the present invention comprises a training stage and a test stage.
In the above technical solution, in the equal-length time slicing of the video sequence in Step 2 of the training stage, the number of slices DIV is chosen as the optimum for the specific human action dataset; taking the MSR Action3D dataset as an example, DIV = 3.
In the above technical solution, in the equal-length time slicing of the video sequence in Step 2 of the training stage, each time slice has length L = ⌊N_k/DIV⌋, i.e. the slice length is rounded down; if the last time slice falls short of this length, it is taken at its actual length.
In the above technical solution, in the fuzzy slicing of the time slices in Step 3 of the training stage, the fuzzy parameter α is chosen as the optimum for the specific human action dataset; taking the MSR Action3D dataset as an example, α = 0.8.
In the above technical solution, in the fuzzy slicing of the time slices in Step 3 of the training stage, no fuzzy extension toward earlier frames is applied to the first time slice of each sample, and no fuzzy extension toward later frames is applied to the last time slice of each sample.
In the above technical solution, Steps 2 and 3 of the training stage together complete the fuzzy slicing of the video sequence. As shown in Figure 2, the fuzzy parameter α controls the boundaries between slices so that adjacent slices share information, further improving the robustness of the features against temporal differences in human actions.
In the above technical solution, in Step 4 of the training stage the depth motion map DMM_{j,v}^(k) of each fuzzy time slice FS_j^(k) in the three projection directions is computed by superimposing absolute differences of video frames. Specifically: the video frames within the same time slice are projected in the three view directions to obtain the front, side and top views of each frame; the projections of consecutive frames at the same view are then subtracted, and the absolute values of the differences are superimposed, so that the trajectory of the action is preserved. With L = ⌊N_k/DIV⌋, the formulas are:

DMM_{j,v}^(k) = Σ_{i=(j−1−α)L+1}^{(j+α)L−1} | p_v^{(k),i+1} − p_v^{(k),i} |,   j ∈ (2, …, DIV−1),

DMM_{1,v}^(k) = Σ_{i=1}^{(1+α)L−1} | p_v^{(k),i+1} − p_v^{(k),i} |,

DMM_{DIV,v}^(k) = Σ_{i=(DIV−1−α)L+1}^{N_k−1} | p_v^{(k),i+1} − p_v^{(k),i} |,

where DMM_{1,v}^(k) and DMM_{DIV,v}^(k) correspond respectively to the first and the last time slice of the k-th sample, v ∈ {f, s, t} indexes the three projection views (three directions) of the video sequence, namely the front, side and top views, p_v^{(k),i} denotes the v-direction projection of the i-th depth frame in the k-th sample, j ∈ (2, …, DIV−1), and α ∈ (0, 1). As shown in Figure 3, the DMM of each projection direction effectively preserves the trajectory information of a single fuzzy slice of the human action sequence in the three projection directions.
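The absolute-difference superposition above can be sketched as follows; a minimal Python illustration assuming each frame has already been projected to one of the three views. The binary projection helper is our own simplified construction for illustration, not the patent's exact projection:

```python
import numpy as np

def depth_motion_map(frames):
    """Superimpose absolute differences of consecutive projected frames:
    DMM = sum_i |p^(i+1) - p^(i)|."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)

def three_views(depth, depth_bins):
    """Simplified binary projections of one depth frame onto the front
    (f), side (s) and top (t) planes; `depth_bins` quantizes depth."""
    h, w = depth.shape
    front = (depth > 0).astype(float)
    side = np.zeros((h, depth_bins))
    top = np.zeros((depth_bins, w))
    ys, xs = np.nonzero(depth)
    ds = np.clip(depth[ys, xs].astype(int), 0, depth_bins - 1)
    side[ys, ds] = 1.0                # project onto the (row, depth) plane
    top[ds, xs] = 1.0                 # project onto the (depth, column) plane
    return {"f": front, "s": side, "t": top}
```

A fuzzy-slice DMM would then be `depth_motion_map([three_views(x, B)[v] for x in slice_frames])` for each view v.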
In the above technical solution, in Step 5 of the training stage the depth motion maps DMM_{j,v}^(k) are resized to a common size using bicubic interpolation; the front, side and top view sizes used in this patent are defined as 50 × 25, 50 × 40 and 40 × 20 respectively. In an actual implementation, different interpolation methods can be chosen to scale the depth motion maps; the criterion is to minimize the loss of image information.
In the above technical solution, in Step 7 of the training stage the output features {H^(k)}_{k∈[1,M]} of all samples computed in Step 6 are reduced in dimension by PCA. The feature dimensionality after reduction may depend on the number of training samples; in the embodiment of this patent, if the number of training samples is M, the final feature dimensionality is (M − 20) × 1.
In the above technical solution, Steps 2 to 6 of the test stage use the same feature construction method and parameters as the training stage.
In the above technical solution, in Step 7 of the test stage the output feature H_T of the test sample TestX computed in Step 6 is reduced in dimension by PCA; the dimension after reduction is (M − 20) × 1, where M is the number of training samples.
In the above technical solution, in Step 8 of the test stage the dimension-reduced feature obtained in Step 7 is classified with the R-ProCRC classifier [5]. The specific method is:
(1) The optimal coding vector α̂ is computed as

α̂ = argmin_α { ‖W^{1/2}(H·α − H_T)‖₂² + λ‖α‖₂² + (γ/‖C‖) Σ_{c∈C} ‖H·α − H_c·α_c‖₂² },

where H is the matrix of dimension-reduced training features, H_T is the test sample feature, H_c is the set of all input feature vectors belonging to class c (c ∈ C), α_c is the part of the coding vector α associated with class c, ‖C‖ is the total number of classes, and λ and γ are parameters between 0 and 1. The construction of H_c is as follows: H_c is first initialized as a zero matrix of the same size as the dictionary H, and the training features of class c are then assigned into H_c at their relative positions in H. W is a diagonal weight matrix whose i-th diagonal element is determined by the residual between all elements of the i-th row of H (weighted by α) and the i-th value of the test sample feature vector, so that unreliable feature dimensions are down-weighted;
(2) The probability that the test sample output feature H_T belongs to class c is estimated as

P(c | H_T) ∝ exp{ −( ‖W^{1/2}(H·α̂ − H_T)‖₂² + λ‖α̂‖₂² + (γ/‖C‖)·‖H·α̂ − H_c·α̂_c‖₂² ) }.

Since ‖W^{1/2}(H·α̂ − H_T)‖₂² + λ‖α̂‖₂² is identical for all classes, the above can be simplified to

PridY = argmin_{c∈C} ‖H·α̂ − H_c·α̂_c‖₂²,

which yields the class to which the feature H_T belongs.
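The decision rule of Step 8 can be sketched as follows; a minimal Python illustration of the plain (non-robust) ProCRC closed-form solution, with the weight matrix W taken as the identity for simplicity (the robust variant of [5] additionally reweights the residuals). All names are ours:

```python
import numpy as np

def procrc_classify(X, labels, y, lam=1e-2, gamma=1e-2):
    """X: d x n matrix of training features (columns = samples);
    labels: length-n class labels; y: length-d test feature.
    Solves a = argmin ||X a - y||^2 + lam ||a||^2
                     + (gamma/K) sum_c ||X a - X_c a_c||^2
    in closed form, then assigns argmin_c ||X a - X_c a_c||^2."""
    classes = sorted(set(labels))
    K, n = len(classes), X.shape[1]
    A = X.T @ X + lam * np.eye(n)
    masks = {}
    for c in classes:
        m = np.array([lbl == c for lbl in labels], dtype=float)
        D = X * (1.0 - m)             # columns of class c zeroed out
        A += (gamma / K) * (D.T @ D)  # accumulate the class-wise term
        masks[c] = m
    a = np.linalg.solve(A, X.T @ y)   # closed-form collaborative code
    recon = X @ a
    resid = {c: float(np.linalg.norm(recon - X @ (a * masks[c])))
             for c in classes}
    return min(resid, key=resid.get)  # class with the smallest residual
```

Here `X * masks[c]` plays the role of H_c above: the dictionary with all columns outside class c set to zero.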
To verify the effectiveness of the invention, experiments were carried out successively on the well-known human action depth databases MSR Action3D and Action Pair. Table 1 gives the characteristics of the two human action depth databases.
Table 1: characteristics of the depth databases
As shown in Table 2, in the experiments the MSR Action3D database is divided into three fixed subsets. Each subset is evaluated with three experimental protocols: Test One uses the first demonstration of each subject as the training set and the rest as the test set; Test Two uses the first two demonstrations of each subject as the training set and the rest as the test set; Cross Test uses all video sequences of subjects 1, 3, 5, 7 and 9 as the training set and the rest as the test set. The experimental results are shown in Table 3; the action recognition accuracy of the invention is in most cases better than that of the traditional DMM method.
Table 2: MSR Action3D database subsets
Table 3: comparison on the MSR Action3D database subsets
Table 4 shows the recognition rate of the invention on the Action Pair database and its comparison with DMM. Since the Action Pair database contains many pairs of actions whose sequences are opposite to each other, such as "pick up" and "put down", or "stand up" and "sit down", it is very sensitive to temporal information. Traditional DMM achieves only a 50.6% recognition rate, while the recognition rate of the invention reaches 97.2%.
Table 4: recognition rates of different algorithms on the Action Pair database
Since the invention performs human action recognition on depth data, compared with traditional color video data the depth data enables fast and accurate segmentation of the human body while preserving its shape and structure, which is conducive to improving accuracy; at the same time, the depth motion map (DMM) processing is more robust than skeleton points estimated by skeleton tracking techniques. The traditional DMM feature extraction method projects the entire video onto a single DMM and loses temporal information; the proposed method slices the existing DMM computation over time and controls the boundaries between slices with the fuzzy parameter α, so that adjacent slices share information and the DMMs capture temporal information better; used together with the R-ProCRC [5] classifier, excellent recognition accuracy is obtained.
The specific embodiments of the invention have been described above in detail with reference to the accompanying drawings, but the invention is not limited to the above embodiments; various changes can also be made within the knowledge of a person skilled in the art without departing from the concept of the invention.
Bibliography
[1] Bian W, Tao D, Rui Y. Cross-domain human action recognition [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2012, 42(2): 298-307.
[2] Niebles J C, Wang H, Li F F. Unsupervised learning of human action categories using spatial-temporal words [J]. International Journal of Computer Vision, 2008, 79(3): 299-318.
[3] Wang J, Liu Z, Wu Y, et al. Mining actionlet ensemble for action recognition with depth cameras [C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2012: 1290-1297.
[4] Chen C, Liu K, Kehtarnavaz N. Real-time human action recognition based on depth motion maps [J]. Journal of Real-Time Image Processing, 2013: 1-9.
[5] Cai S, Zhang L, et al. A probabilistic collaborative representation based approach for pattern classification [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[6] Xu H, Chen E, Liang C, et al. Spatio-temporal pyramid model based on depth maps for action recognition [C]// IEEE International Workshop on Multimedia Signal Processing (MMSP), 2015.