CN112668607A

CN112668607A - Multi-label learning method for recognizing tactile attributes of target object

Info

Publication number: CN112668607A
Application number: CN202011401667.4A
Authority: CN
Inventors: 易正琨; 吴新宇; 伍汉诚; 周贞宁; 米婷婷; 方森林
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2020-12-04
Filing date: 2020-12-04
Publication date: 2021-04-16
Anticipated expiration: 2040-12-04
Also published as: CN112668607B

Abstract

The invention discloses a multi-label learning method for identifying tactile attributes of a target object. The method comprises the following steps: acquiring multi-modal tactile data generated in the interaction process with a target object; extracting tactile features from the multi-modal tactile data to construct a training set, wherein the training set is used for representing the corresponding relation between tactile adjectives describing an interaction process and the extracted tactile features; and training the multi-label classifier by using the training set to obtain a recognition model for predicting the tactile attribute of the target object in real time. The method can be used for mining the potential relation between different tactile adjectives, and improving the classification speed and the classification accuracy by designing reasonable statistical characteristics and utilizing the relation among labels.

Description

Multi-label learning method for recognizing tactile attributes of target object

Technical Field

The invention relates to the technical field of robots, in particular to a multi-label learning method for identifying tactile attributes of a target object.

Background

Haptic perception complements visual and auditory perception and plays a crucial role in autonomous robots. The use of machine learning methods to improve the haptic perception of robots is receiving increasing attention. However, an object is often described by more than one tactile adjective, so tactile understanding of multiple adjectives can be formulated as a multi-label classification problem. Most existing methods use very complex features and convert the multi-label haptic problem into a multi-class classification problem, so the correlation between multiple tactile adjectives cannot be fully exploited.

Intelligent robot technology has spread to fields such as medical treatment, the service industry, the military, agriculture, and industry, and in practical applications the ability of a robot to identify the material and type of an object is particularly important. How to improve the accuracy with which a robot classifies target objects has become a common problem in the field of machine learning. When the tactile attributes of an object are identified through conventional machine learning, visual or tactile perception is generally used to perform feature analysis on the object and then classify it. However, for objects with similar features, the object category cannot be identified from a single modality alone, and multiple modalities may need to be combined to improve recognition.

In the prior art, patent CN102945371B proposes a classification method based on a multi-label flexible support vector machine. The method first defines a novel distance metric in the multi-label space for measuring the distance between points in that space under a specific classification target. It then defines a neighborhood for each point in the multi-label space under the specific classification target, where the neighborhood of a point comprises the several points closest to it under the novel distance metric. Finally, it combines the neighborhood information of each sample point in the multi-label space and performs multi-label classification training with the novel multi-label flexible support vector machine classifier. The information contained in the multi-label space is used to improve the classification precision of the discriminative classifier in multi-label classification and to reduce the influence of noisy labels on classification.

Patent application CN111340061A provides an unsupervised feature selection method and system based on multi-label learning, including: extracting features of each acquired data sample to obtain a feature data set, learning a binary multi-label matrix and a feature selection matrix for the feature data set, and constructing an unsupervised feature selection objective function based on multi-label learning; solving this objective function with a discrete optimization method based on the augmented Lagrange multiplier method to obtain the feature selection matrix; and ranking the feature selection matrix to determine the selected target features. The method learns multiple labels and performs feature selection simultaneously for semantic guidance, and applies a binary constraint in spectral embedding to obtain multiple labels that guide the final feature selection process. In addition, a dynamic sample similarity graph is constructed adaptively to capture the data structure, which enhances the discriminative capability of the multiple labels.

Although the above prior-art schemes use multi-label classification, they only consider outputting a label set of a certain dimension at prediction time. From the perspective of the prediction algorithm, the classifiers used in the existing schemes do not consider label correlation or imbalanced sample distribution, and for an instance with multiple labels the correlation between different labels is not well explored. In terms of feature extraction, the features used in the existing schemes are all self-designed complex features whose reliability is not high and which greatly increase complexity and computation.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a multi-label learning method for identifying the tactile attributes of a target object. It aims to mine the potential relations between different adjectives, to mitigate the poor performance of traditional binary classifiers on tactile labels that have few positive examples, and to improve classification speed and accuracy by designing reasonable statistical features and exploiting the relations among labels.

According to a first aspect of the present invention, a multi-label learning method for target object haptic attribute identification is provided. The method comprises the following steps:

acquiring multi-modal tactile data generated in the interaction process with a target object;

extracting tactile features from the multi-modal tactile data to construct a training set, wherein the training set is used for representing the corresponding relation between tactile adjectives describing an interaction process and the extracted tactile features;

training a multi-label classifier with the training set to obtain a recognition model for predicting haptic properties in real time.

According to a second aspect of the present invention, a target object haptic property identification method is provided. The method comprises the following steps:

collecting in real time multi-modal tactile data generated in the process of interacting with a target object, and extracting the corresponding tactile features;

and inputting the extracted tactile features into the recognition model provided by the invention to obtain a recognition result.

Compared with the prior art, the proposed statistical features yield a simple model and fast training. In addition, to optimize classification accuracy, decision-level fusion is adopted and the classification results of two feature selection methods are integrated, so that a higher classification accuracy is achieved.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a flow diagram of a multi-label learning method for haptic attribute identification of a target object according to one embodiment of the present invention;

FIG. 2 is a process diagram of a multi-label learning method for haptic attribute identification of a target object, according to one embodiment of the present invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

In brief, the invention relates to a target object classification method based on multi-label learning, whose technical scheme is as follows. Tactile data of a target object are acquired, for example data in multiple modalities such as pressure change, temperature change, and sensor electrode distribution, and are preprocessed for feature extraction. Simple statistical features are extracted from the multi-modal tactile data, and the data are divided into a training set and a test set according to the distribution of positive and negative examples of each tactile adjective. The extracted features are classified with classifiers such as a support vector machine (SVM), K nearest neighbor (KNN), and adaptive K-value multi-label K nearest neighbor (AML-KNN), so that the performance of different classification methods on the data is obtained and the classification accuracy is improved.

Specifically, referring to FIG. 1 and FIG. 2, the multi-label learning method for identifying tactile attributes of a target object according to the present invention includes the following steps:

Step S110: acquiring multi-modal tactile data generated in the process of interacting with the target object.

For example, the multi-modal haptic data includes pressure signals, temperature signals, and electrode distribution signals generated during interaction of the target object with the BIOTAC sensor.

To obtain tactile data in multiple modalities, the data are collected, for example, with a BIOTAC sensor. The BIOTAC sensor, mounted on a PR2 robot, interacts with 60 common everyday items, and the interaction process is divided into pressing, holding, slow sliding, fast sliding, and so on. During each interaction the sensor generates five types of signals: low-frequency fluid pressure (PDC), high-frequency fluid vibration (PAC), core temperature (TDC), core temperature change (TAC), and 19 electrode impedances distributed over the BIOTAC fingertip.

Furthermore, the acquired data signals are preprocessed. Because the sensor generates noise when the interaction process switches or the interaction direction changes, only the signals generated by effective physical interaction with the object are extracted: the signal of each interaction process is truncated so as to retain the effective signal that reflects physical interaction with the target object.

Specifically, for 60 common everyday items, each item was sampled 10 times by two BIOTAC sensors mounted on Barrett Hand manipulators, yielding a total of 600 sets of experimental data. Because some adjectives have few positive examples, a carefully designed training/test split is preferred. For a given adjective, the positively labeled objects and the negatively labeled objects are each divided in a 9:1 ratio, which ensures that both the training set and the test set contain positively and negatively labeled objects. In addition, each object appears in only one of the training set and the test set, so that the classifier learns to classify adjectives rather than individual objects.
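The object-level split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the invention: the function name `split_objects`, the `labels_by_object` mapping, the random seed, and the rounding rule used to realize the 9:1 ratio are all hypothetical.

```python
import random


def split_objects(labels_by_object, adjective, test_fraction=0.1, seed=0):
    """Split object ids into train/test lists for one adjective.

    labels_by_object maps an object id to the set of adjectives that
    describe it (hypothetical data layout). Positive and negative objects
    are split separately so both sets contain both kinds of object, and
    every object lands in exactly one set.
    """
    rng = random.Random(seed)
    train_ids, test_ids = [], []
    for want_positive in (True, False):
        group = sorted(
            obj for obj, adjectives in labels_by_object.items()
            if (adjective in adjectives) == want_positive
        )
        rng.shuffle(group)
        # Assumption: hold out about 10% of each group, but at least one
        # object whenever the group has two or more members.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test_ids.extend(group[:n_test])
        train_ids.extend(group[n_test:])
    return sorted(train_ids), sorted(test_ids)
```

All trials recorded for an object would then follow that object into the training set or the test set, which keeps the same object from appearing on both sides of the split.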

It should be noted that the multi-modal data acquisition tool used in the present invention can be replaced by other tools, such as a vision camera, a sound recorder, and various universal manipulators.

Step S120: extracting tactile features from the multi-modal tactile data to construct a training set, wherein the training set represents the correspondence between the tactile adjectives describing the interaction process and the extracted tactile features.

For example, for the acquired tactile sequences, 15 statistical features are computed from each of the PAC, PDC, TAC, and TDC signals: maximum, minimum, mean, peak value, absolute mean, root mean square value, variance, standard deviation, square root amplitude, kurtosis, skewness, form factor, peak factor, pulse factor, and margin factor. For the 19 electrode signals, principal component analysis is used to extract the two most important principal components in order to reduce the dimensionality of the electrode features. These two components can represent the electrode signal with an accuracy of 95%. A sixth-order polynomial is then fitted to each component over time, and the six coefficients of each polynomial serve as the electrode features.
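As an illustration of the 15 statistical features for one of the PAC, PDC, TAC, or TDC channels, a plain-Python sketch is given below. The text names the features but does not define them, so the formulas here follow common signal-processing conventions and are assumptions: the peak is taken as the largest absolute value, the variance is the population variance, and the form, peak, pulse, and margin factors are ratios of the peak or root mean square value to the absolute mean, root mean square value, or square root amplitude. The function names are hypothetical, and the electrode features (principal components and polynomial coefficients) are not covered.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def statistical_features(signal):
    """Compute 15 statistical features of one non-empty tactile channel."""
    n = len(signal)
    magnitudes = [abs(v) for v in signal]
    mean = sum(signal) / n
    abs_mean = sum(magnitudes) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    variance = sum((v - mean) ** 2 for v in signal) / n  # population variance (assumption)
    std = math.sqrt(variance)
    peak = max(magnitudes)  # assumption: peak = largest absolute value
    sqrt_amplitude = (sum(math.sqrt(m) for m in magnitudes) / n) ** 2
    third_moment = sum((v - mean) ** 3 for v in signal) / n
    fourth_moment = sum((v - mean) ** 4 for v in signal) / n
    return {
        "maximum": max(signal),
        "minimum": min(signal),
        "mean": mean,
        "peak": peak,
        "absolute_mean": abs_mean,
        "root_mean_square": rms,
        "variance": variance,
        "standard_deviation": std,
        "square_root_amplitude": sqrt_amplitude,
        "kurtosis": _ratio(fourth_moment, variance ** 2),
        "skewness": _ratio(third_moment, std ** 3),
        "form_factor": _ratio(rms, abs_mean),
        "peak_factor": _ratio(peak, rms),
        "pulse_factor": _ratio(peak, abs_mean),
        "margin_factor": _ratio(peak, sqrt_amplitude),
    }
```

Applying `statistical_features` to each of the four channels would give 60 scalar features per interaction, to which the electrode features are appended before normalization.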

Step S130: training the multi-label classifier with the training set to obtain a recognition model for predicting the tactile attributes of the target object in real time.

The selected features are assembled and normalized, and a multi-label classifier is then used to perform multi-label classification of the experimental objects. By mining the correlation among labels, the multi-label classifier better understands the tactile adjectives of the experimental objects. The multi-label classifier is, for example, an ML-KNN (multi-label K nearest neighbor) classifier.

The algorithm of the ML-KNN (multi-label K nearest neighbor) classifier is as follows:

1) Given an instance $x$ with label set $Y$, consider the $k$ nearest neighbors of $x$. Let $N_x$ denote the set of these $k$ neighbors in the training set, and let $\vec{y}_x$ denote the label vector of instance $x$: for the $j$-th adjective label, $\vec{y}_x(j) = 1$ if $j \in Y$, and $\vec{y}_x(j) = 0$ otherwise. A counting equation is then given, expressed as:

$$C_x(j) = \sum_{a \in N_x} \vec{y}_a(j)$$

where $C_x(j)$ is the number of neighbors of instance $x$ that contain label $j$. For an unknown instance $x$, let $H_1^j$ denote the event that instance $x$ has label $j$, and $H_0^j$ the event that instance $x$ does not have label $j$.

According to Bayesian theory, the label classification vector is given by the following formula:

$$\vec{y}_x(j) = \arg\max_{b \in \{0,1\}} P\left(H_b^j \mid E_{C_x(j)}^j\right)$$

where $E_{C_x(j)}^j$ denotes the event that, of the $k$ neighbors of instance $x$, exactly $C_x(j)$ neighbors belong to label $j$. The above formula can be rewritten according to Bayesian theory to obtain the predicted label.

2) The predicted label for any unknown instance is:

$$\vec{y}_x(j) = \arg\max_{b \in \{0,1\}} P\left(H_b^j\right) P\left(E_{C_x(j)}^j \mid H_b^j\right)$$

As shown in the above equation, once the prior probability $P(H_b^j)$ and the posterior probability $P(E_{C_x(j)}^j \mid H_b^j)$ are obtained from the data in the training set, it can be determined whether instance $x$ belongs to label $j$.
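The ML-KNN procedure described above (a per-label prior combined with neighbor-count probabilities through Bayes' rule) can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the implementation used in the invention: Euclidean distance, the Laplace smoothing parameter `s`, and the names `mlknn_fit` and `mlknn_predict` are not specified in the text and are hypothetical.

```python
import math


def _neighbors(X, query, k, skip=None):
    """Indices of the k training points closest to query (Euclidean, assumption)."""
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: math.dist(X[i], query))
    return order[:k]


def mlknn_fit(X, Y, k, s=1.0):
    """Estimate priors and neighbor-count probabilities; requires k <= len(X) - 1.

    X: list of feature tuples, Y: list of 0/1 label vectors.
    s: Laplace smoothing constant (assumption, not stated in the text).
    """
    n, q = len(X), len(Y[0])
    prior = [(s + sum(y[j] for y in Y)) / (2 * s + n) for j in range(q)]
    # has[j][c]: training points WITH label j whose k neighbors include
    # exactly c points carrying label j; lacks[j][c]: same for points WITHOUT j.
    has = [[0] * (k + 1) for _ in range(q)]
    lacks = [[0] * (k + 1) for _ in range(q)]
    for i in range(n):
        nbrs = _neighbors(X, X[i], k, skip=i)
        for j in range(q):
            count = sum(Y[a][j] for a in nbrs)
            (has if Y[i][j] else lacks)[j][count] += 1

    def smooth(table):
        return [[(s + row[c]) / (s * (k + 1) + sum(row)) for c in range(k + 1)]
                for row in table]

    return {"X": X, "Y": Y, "k": k, "prior": prior,
            "p_count_given_label": smooth(has),
            "p_count_given_no_label": smooth(lacks)}


def mlknn_predict(model, x):
    """Return the 0/1 label vector maximizing prior times count probability."""
    nbrs = _neighbors(model["X"], x, model["k"])
    labels = []
    for j, p_label in enumerate(model["prior"]):
        count = sum(model["Y"][a][j] for a in nbrs)
        with_label = p_label * model["p_count_given_label"][j][count]
        without_label = (1 - p_label) * model["p_count_given_no_label"][j][count]
        labels.append(1 if with_label > without_label else 0)
    return labels
```

Each label is decided independently from its own neighbor count, so an instance can receive several tactile adjectives at once, or none.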

To further optimize the classifier, the invention adopts AML-KNN (adaptive multi-label K nearest neighbor classifier), which builds on the original ML-KNN classifier, uses the neighborhood information of a test point, and alleviates the problem of selecting a single global k by introducing a point-specific k value. One approach is to find, for each training point $x_i \in P$, the set of k values that classify it correctly, and to use a supervised learner to model these k values as a function of the corresponding training vector. This function can then be applied to a test point to determine the k value appropriate for it.

For example, the AML-KNN algorithm first finds a set of successful k values $K_{x_i}$ for each instance point. One possible approach is to exhaust all possible values of k; the exhaustive set may be $K = \{1, 2, \ldots, k_{max}\}$, where $k_{max}$ depends on n, the number of sample points in the training set. $k_{max}$ is selected according to the following principles:

1) $k_{max}$ cannot be too large or too small, so that the structure of neighboring points is preserved.

2) $k_{max}$ should be data dependent.

After the appropriate set of k values has been determined for each instance point $y_i$, all of this information is weight-fitted using an MLP (multi-layer perceptron). For an instance point in the test set, the trained MLP receives the features of the test point and outputs a k value $k_{y_i}$ that is feasible with high probability.
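The first stage of the adaptive scheme, collecting the successful k values of every training point, can be sketched as below. Several details are assumptions made only for illustration: a k value counts as successful when a per-label majority vote of the k nearest neighbors reproduces the point's label vector, the default `k_max` is the ceiling of the square root of n, distance is Euclidean, and the name `successful_k_sets` is hypothetical. The second stage, fitting an MLP that maps features to a k value, is not shown.

```python
import math


def successful_k_sets(X, Y, k_max=None):
    """For each training point, list the k values in 1..k_max that classify it correctly."""
    n = len(X)
    if k_max is None:
        # Assumption: a data-dependent default; the text only requires that
        # k_max be neither too large nor too small.
        k_max = min(n - 1, math.ceil(math.sqrt(n)))
    q = len(Y[0])
    result = []
    for i in range(n):
        # Neighbors of point i sorted by distance, excluding the point itself.
        order = sorted((a for a in range(n) if a != i),
                       key=lambda a: math.dist(X[a], X[i]))
        good = []
        for k in range(1, k_max + 1):
            nbrs = order[:k]
            # Assumption: strict per-label majority vote among the k neighbors.
            vote = [1 if 2 * sum(Y[a][j] for a in nbrs) > k else 0
                    for j in range(q)]
            if vote == list(Y[i]):
                good.append(k)
        result.append(good)
    return result
```

Points in clean regions tend to succeed for every k, whereas points near conflicting labels succeed only for some k values or for none, and it is this variation that a supervised learner would be trained to predict.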

To further improve classification accuracy, the classifiers are fused; the results of two or more classifiers can be fused. Specifically, the selected features are classified, with the training set, the test set, and the actual labels as inputs and the prediction accuracy and predicted labels as outputs. For example, three classifiers are used for classification: KNN (K-nearest neighbor), SVM (support vector machine), and the ML-KNN algorithm (FIG. 2 illustrates only two classifiers). Decision-level fusion is performed on the classification results of the three basic classifiers to optimize the classification accuracy. With $\gamma_c$ denoting the weight of the c-th classifier, the decision-level fused prediction label $Y^*$ can be calculated by the following formula:

$$Y_j^* = \begin{cases} 1, & \sum_{c=1}^{C} \gamma_c P_{cj} \geq \tau_j \\ 0, & \text{otherwise} \end{cases}$$

where j = 1, 2, …, q denotes the label index, c = 1, 2, …, C denotes the classifier index, and, for the j-th label, $P_{cj}$ represents the probability that classifier c produces a positive label and the classification is accurate.

$\tau_j$ is the threshold for the j-th label, determined by the difficulty of label classification.
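A decision-level fusion step of this kind can be sketched as follows, assuming that a label is predicted positive when the weighted sum of the classifiers' probabilities for that label reaches the label's threshold. The function name `fuse_decisions` and the example weights and thresholds are hypothetical; the text does not state how the weights or thresholds are chosen beyond tying the threshold to the difficulty of each label.

```python
def fuse_decisions(probabilities, weights, thresholds):
    """Fuse per-classifier label probabilities into one 0/1 label vector.

    probabilities[c][j]: probability from classifier c that label j is positive.
    weights[c]: weight of classifier c.
    thresholds[j]: decision threshold for label j.
    """
    fused = []
    for j, threshold in enumerate(thresholds):
        # Weighted vote of all classifiers for label j (assumed fusion rule).
        score = sum(w * p[j] for w, p in zip(weights, probabilities))
        fused.append(1 if score >= threshold else 0)
    return fused


# Hypothetical example: three classifiers, two adjective labels.
example = fuse_decisions(
    probabilities=[[0.9, 0.2], [0.6, 0.4], [0.3, 0.8]],
    weights=[0.5, 0.3, 0.2],
    thresholds=[0.5, 0.5],
)
```

Giving a larger weight to the classifier that performs best on a label lets the fused decision follow that classifier while still allowing the others to overrule it when they agree.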

After training, real-time tactile attribute recognition can be performed. For example, following steps S110 and S120, multi-modal data generated while interacting with the target object are collected, the corresponding tactile features are extracted and input to each trained classifier, and the classification results are fused to obtain a recognition result.

Further, the verification was performed on public data and PHAC-2 using the simple statistical features and multi-label classification method employed by the present invention. The experimental result shows that compared with the prior art, the method improves the classification accuracy and the classification efficiency.

In summary, the advantages of the present invention are mainly embodied in the following aspects:

1) A multi-modal fusion approach is proposed for recognizing the target object. First, multi-modal characteristics of the experimental article or material, such as pressure, temperature, and electrode distribution, are collected to obtain a data set. The data from the multiple modalities are then fused and classified, which gives better classification than a single modality.

2) In the feature extraction stage, the invention uses simple statistical features, for example kurtosis, mean, and variance, as the feature data set, which simplifies the computation, reduces the complexity of the subsequently trained models, and improves classification efficiency.

3) The invention trains and predicts on the data with a multi-label classification method. Compared with a traditional single-label classifier, the proposed multi-label algorithm can fully mine the correlation among labels, and the proposed adaptive K method addresses the problem of choosing a single global K value.

4) The invention optimizes the classification accuracy, provides a method for realizing the fusion of a plurality of classifiers at a decision level, integrates the advantages that a plurality of basic classifiers have different performances on different labels, and further improves the classification accuracy.

5) The method provided by the invention is a novel target object identification method, and the used characteristics and the classifier are independent of any tactile data acquisition mode.

The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.

The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

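1. A multi-label learning method for target object haptic attribute identification, comprising the steps of:

acquiring multi-modal tactile data generated in the interaction process with a target object;

extracting tactile features from the multi-modal tactile data to construct a training set, wherein the training set is used for representing the corresponding relation between tactile adjectives describing an interaction process and the extracted tactile features;

training a multi-label classifier with the training set to obtain a recognition model for predicting haptic properties in real time.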

2. The method of claim 1, wherein the multi-modal haptic data comprises: low frequency fluid pressure, high frequency fluid vibration, core temperature, core temperature change, and a plurality of electrode signals.

3. The method of claim 2, wherein for low frequency fluid pressure, high frequency fluid vibration, core temperature, and core temperature variation data, the extracted haptic features are one or more of a maximum, a minimum, a mean, a peak, an absolute mean, a root mean square value, a variance, a standard deviation, a square root amplitude, a kurtosis, a skewness, a form factor, a peak factor, a pulse factor, a margin factor; for the plurality of electrode signals, two most important principal components are extracted using principal component analysis for representing the electrode signals, and a sixth order polynomial is matched to each component over time, each polynomial having six coefficients, which are taken as the extracted haptic features.

4. The method of claim 1, wherein the training a multi-label classifier further comprises: and constructing a test set for verifying the classification effect, and enabling the training set and the test set to have objects with positive and negative marks.

5. The method of claim 1, wherein the haptic features in the training set are normalized features and the multi-modal haptic data is denoised data.

6. The method of claim 1, wherein the multi-label classifier is trained according to the following steps:

training a plurality of multi-label classifiers based on the training set;

performing decision-level fusion on the classification results of the multiple multi-label classifiers to obtain a prediction label, which is expressed as:

$$Y_j^* = \begin{cases} 1, & \sum_{c=1}^{C} \gamma_c P_{cj} \geq \tau_j \\ 0, & \text{otherwise} \end{cases}$$

where j = 1, 2, …, q denotes the label index, c = 1, 2, …, C denotes the classifier index, and, for the j-th label, $P_{cj}$ represents the probability that classifier c produces a positive label and the classification is accurate, $\tau_j$ is the threshold for the j-th label, determined by the difficulty of label classification, and $\gamma_c$ represents the weight of the c-th classifier.

7. The method of claim 6, wherein the multi-label classifiers comprise a K-nearest neighbor classifier, a support vector machine and an adaptive multi-label K-nearest neighbor classifier, the adaptive multi-label K-nearest neighbor classifier searches for a successful K-value set for each instance point, and after determining a suitable K-value set for each instance point, all information is weight-fitted by using a multi-layer perceptron, and feasible K values are determined according to probability values.

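8. A target object haptic attribute identification method, comprising:

collecting in real time multi-modal tactile data generated in the process of interacting with a target object, and extracting the corresponding tactile features;

inputting the extracted tactile features into a recognition model constructed according to the method of any one of claims 1 to 7, and obtaining a recognition result.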

9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.

10. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the steps of the method of any of claims 1 to 8 are implemented when the processor executes the program.