CN115079818B - Hand capturing method and system

Info

Publication number
CN115079818B
Authority
CN
China
Prior art keywords
gesture
current frame
image
probability
skeleton
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210497950.4A
Other languages
Chinese (zh)
Other versions
CN115079818A (en)
Inventor
赵天奇
李志豪
巴君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Juli Dimension Technology Co ltd
Original Assignee
Beijing Juli Dimension Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Juli Dimension Technology Co ltd
Priority to CN202210497950.4A
Publication of CN115079818A
Application granted
Publication of CN115079818B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/014 Hand-worn input/output arrangements, e.g. data gloves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application disclose a hand capturing method and system. The method includes: predicting the probabilities of different gesture semantics from the gesture image of the current frame; inputting the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic to obtain the image-module matching probability and skeleton rotation value of each gesture semantic; multiplying the image-module matching probability of each gesture semantic by the probability of that gesture semantic to obtain the fusion probability of each gesture semantic; normalizing the fusion probabilities of all gesture semantics, and obtaining the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values; and outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand's skeleton motion according to that rotation value. While capture accuracy is preserved, the efficiency of the whole capture process is significantly improved.

Description

Hand capturing method and system
Technical Field
Embodiments of the present application relate to the technical field of data processing, and in particular to a hand capturing method and system.
Background
The traditional way of controlling a virtual human's hand gestures is to capture them with motion-capture gloves. However, such gloves are expensive, and a dedicated pair must be customized for each consumer's hand shape, which is a major obstacle for ordinary consumers wanting to experience virtual-human technology. In addition, when a motion-capture performer wears the gloves for a long time, their constraint easily makes the performer's hands uncomfortable, degrading the virtual-human control experience.
Disclosure of Invention
Therefore, embodiments of the present application provide a hand capturing method and system that reduce the cost of traditional hand-capture schemes and eliminate the wearing-comfort problem: a performer can achieve high-precision hand capture with nothing more than a camera, without purchasing or wearing traditional hand-capture equipment, and the efficiency of the whole capture process is significantly improved while capture accuracy is preserved.
In order to achieve the above object, the embodiment of the present application provides the following technical solutions:
according to a first aspect of an embodiment of the present application, there is provided a hand capturing method, the method including:
collecting a gesture image of the current frame;
predicting the probabilities of different gesture semantics from the gesture image of the current frame;
inputting the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, and obtaining the image-module matching probability and skeleton rotation value of each gesture semantic;
multiplying the image-module matching probability of each gesture semantic by the probability of that gesture semantic to obtain the fusion probability of each gesture semantic;
normalizing the fusion probabilities of all gesture semantics, and obtaining the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values;
and outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to that skeleton rotation value.
Optionally, after obtaining the skeleton rotation distribution function of the current frame image from the processed fusion probabilities and the corresponding skeleton rotation values, and before outputting the skeleton rotation value according to the skeleton rotation distribution function of the current frame image, the method further includes:
acquiring the hand-capture pose distribution function and the motion description quantity of the previous frame image;
estimating a skeleton rotation value from the motion description quantity of the previous frame image;
and fusing, based on Kalman filtering, the hand-capture pose distribution function and estimated skeleton rotation value of the previous frame image with the skeleton rotation distribution function of the current frame image, to obtain the fused skeleton rotation distribution function of the current frame image.
Optionally, obtaining the skeleton rotation distribution function of the current frame image from the processed fusion probabilities and the corresponding skeleton rotation values includes:
solving a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
Optionally, outputting the skeleton rotation value according to the skeleton rotation distribution function of the current frame image includes:
computing the mean of the skeleton rotation distribution function of the current frame image, and outputting that mean as the skeleton rotation value.
Optionally, after predicting the probabilities of different gesture semantics from the gesture image of the current frame, the method further includes:
screening out the gesture semantics whose probabilities satisfy a set probability threshold.
According to a second aspect of an embodiment of the present application, there is provided a hand capture system, the system comprising:
The data acquisition module is used for collecting a gesture image of the current frame;
The classification prediction module is used for predicting the probabilities of different gesture semantics from the gesture image of the current frame;
The skeleton rotation value calculation module is used for inputting the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, to obtain the image-module matching probability and skeleton rotation value of each gesture semantic;
The fusion probability calculation module is used for multiplying the image-module matching probability of each gesture semantic by the probability of that gesture semantic, to obtain the fusion probability of each gesture semantic;
The skeleton rotation distribution module is used for normalizing the fusion probabilities of all gesture semantics, and obtaining the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values;
The skeleton rotation value output module is used for outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to that skeleton rotation value.
Optionally, the system further comprises:
the data acquisition module is further used for acquiring the hand-capture pose distribution function and the motion description quantity of the previous frame image;
the skeleton rotation value calculation module is further used for estimating a skeleton rotation value from the motion description quantity of the previous frame image;
and the fusion module is used for fusing, based on Kalman filtering, the hand-capture pose distribution function and skeleton rotation value of the previous frame image with the skeleton rotation distribution function of the current frame image, to obtain the fused skeleton rotation distribution function of the current frame image.
Optionally, the bone rotation distribution module is specifically configured to:
solving a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
According to a third aspect of an embodiment of the present application, there is provided an electronic apparatus including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the computer program to perform the method of the first aspect.
According to a fourth aspect of embodiments of the present application, there is provided a computer readable storage medium having stored thereon computer readable instructions executable by a processor to implement the method of the first aspect described above.
In summary, the embodiments of the present application provide a hand capturing method and system that: collect a gesture image of the current frame; predict the probabilities of different gesture semantics from the gesture image of the current frame; input the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic to obtain the image-module matching probability and skeleton rotation value of each gesture semantic; multiply the image-module matching probability of each gesture semantic by the probability of that gesture semantic to obtain the fusion probability of each gesture semantic; normalize the fusion probabilities of all gesture semantics, and obtain the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values; and output a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to that skeleton rotation value. While capture accuracy is preserved, the efficiency of the whole capture process is significantly improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only, and that other implementations can be derived from them without inventive effort.
The structures, proportions, and sizes shown in the present specification are provided only for purposes of illustration and description and are not intended to limit the scope of the invention, which is defined by the claims; any structural modification, change in proportion, or adjustment of size that does not affect the efficacy or purpose achievable by the invention shall fall within its scope.
Fig. 1 is a schematic flow chart of a hand capturing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a hand capture embodiment provided by an embodiment of the present application;
FIG. 3 is a block diagram of a hand capture system according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 5 shows a schematic diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
Other advantages and benefits of the present invention will become apparent to those skilled in the art from the following detailed description, which describes, by way of illustration, some but not all embodiments of the invention. All other embodiments obtained by those skilled in the art from the disclosed embodiments without inventive effort fall within the scope of the invention.
Fig. 1 shows a hand capturing method according to an embodiment of the present application, where the method includes:
Step 101: collecting a gesture image of a current frame;
Step 102: predicting the probabilities of different gesture semantics according to the gesture image of the current frame;
Step 103: inputting the gesture image of the current frame into a gesture capturing neural network corresponding to each gesture semantic, and obtaining the image module matching probability and skeleton rotation value of each gesture semantic;
step 104: multiplying the image module matching probability and the skeleton rotation value of each gesture semantic with the probability of each gesture semantic respectively to obtain the fusion probability of each gesture semantic;
step 105: normalizing the fusion probability of all gesture semantics, and obtaining a skeleton rotation distribution function of the current frame image according to the processed fusion probability and the corresponding skeleton rotation value;
Step 106: and outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image so as to drive the virtual hand skeleton motion according to the frame skeleton rotation value.
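The following minimal sketch wires steps 101-106 together for a single frame; every helper name (recognizer, capture_modules, skeleton.apply) is a hypothetical stand-in under assumed interfaces, not the patent's actual implementation:

```python
import numpy as np

# Hypothetical per-frame pipeline for steps 101-106; all interfaces assumed.
def process_frame(frame, recognizer, capture_modules, skeleton):
    sem_probs = recognizer(frame)                                    # step 102
    top = sorted(sem_probs, key=sem_probs.get, reverse=True)[:3]
    match, rots = zip(*(capture_modules[s](frame) for s in top))     # step 103
    fused = np.array([sem_probs[s] for s in top]) * np.array(match)  # step 104
    fused /= fused.sum()                                             # step 105
    rotation = sum(w * np.asarray(r) for w, r in zip(fused, rots))   # mean, step 106
    skeleton.apply(rotation)                                         # drive the virtual hand
    return rotation
```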
In a possible implementation manner, after predicting the probabilities of different gesture semantics from the gesture image of the current frame in step 102, the method further includes:
screening out the gesture semantics whose probabilities satisfy a set probability threshold, as sketched below.
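A minimal sketch of this screening step, assuming the distribution is held in a dict; the 0.05 threshold is an assumed example value, not one the patent specifies:

```python
# Keep only the semantics whose predicted probability clears the threshold.
def screen(sem_probs: dict, threshold: float = 0.05) -> dict:
    return {s: p for s, p in sem_probs.items() if p >= threshold}
```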
In a possible implementation manner, after obtaining the skeleton rotation distribution function of the current frame image from the processed fusion probabilities and the corresponding skeleton rotation values in step 105, and before outputting the skeleton rotation value according to the skeleton rotation distribution function of the current frame image in step 106, the method further includes:
acquiring the hand-capture pose distribution function and the motion description quantity of the previous frame image; estimating a skeleton rotation value from the motion description quantity of the previous frame image; and fusing, based on Kalman filtering, the hand-capture pose distribution function and estimated skeleton rotation value of the previous frame image with the skeleton rotation distribution function of the current frame image, to obtain the fused skeleton rotation distribution function of the current frame image.
In a possible implementation manner, obtaining the skeleton rotation distribution function of the current frame image from the processed fusion probabilities and the corresponding skeleton rotation values in step 105 includes:
solving a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
In a possible implementation manner, outputting the skeleton rotation value according to the skeleton rotation distribution function of the current frame image in step 106 includes:
computing the mean of the skeleton rotation distribution function of the current frame image, and outputting that mean as the skeleton rotation value.
The hand capturing method according to the embodiment of the present application is described in detail below with reference to fig. 2.
In the first aspect, a gesture motion image of the person in front of the camera is acquired.
An ordinary RGB camera is placed in front of the performer, who may stand or sit. The camera should be positioned so that the fingers and palms of both hands remain within the image even when the performer's arms are fully extended. Once the camera is in place, the performer can begin making gestures. Gesture actions include, but are not limited to: palm open, OK sign, thumb down, thumb up, phone call, the numbers 1-9, finger heart, fingers spread, fingers together, pointing at the camera, and so on.
In the second aspect, the gesture type in each collected RGB frame is recognized, and the probabilities of the different gesture-action semantics of the person in front of the camera are predicted. The gesture semantics cover roughly 20 action types: palm open, OK sign, thumb down, thumb up, phone call, the numbers 1-9, finger heart, fingers spread, fingers together, pointing at the camera, etc. The result of this step is a probability distribution over the gesture semantics of the current frame, for example: palm open 0.9, fingers together 0.05, OK sign 0.05, and so on across the 20 semantics.
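As an illustration of the kind of distribution this step produces, the following sketch applies a softmax to the raw scores of an assumed classifier; the class list, raw scores, and function names are hypothetical, since the patent only requires that a network output per-semantic probabilities:

```python
import numpy as np

# Hypothetical illustration of the semantic-classification step.
GESTURE_SEMANTICS = ["palm_open", "fingers_together", "ok_sign"]  # ... up to 20

def predict_semantics(logits: np.ndarray) -> dict:
    """Softmax over raw classifier scores -> per-semantic probabilities."""
    z = logits - logits.max()            # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum()
    return dict(zip(GESTURE_SEMANTICS, p))

# Assumed raw scores for the current frame reproduce the example above:
print(predict_semantics(np.array([2.9, 0.0, 0.0])))
# -> approximately {'palm_open': 0.90, 'fingers_together': 0.05, 'ok_sign': 0.05}
```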
By incorporating gesture-recognition semantics, each gesture action is split into distinct semantics, so that each gesture action has its own prediction path and fusion path, which improves the accuracy and robustness of the whole hand-capture system.
In the third aspect, based on the result of the gesture recognition module, each recognized gesture is mapped to a motion semantic, and the gesture-capture module corresponding to the motion semantics obtained in the second aspect is selected. The data collected in the first aspect is input into the gesture-capture module, which outputs a gesture-capture result.
The embodiment of the present application selects the three gesture types with the highest predicted probabilities, selects the three corresponding gesture-capture modules, and obtains a capture result from each. A gesture-capture result consists of a skeleton rotation value and a matching value between the input image and the module (the image-module matching probability). During the subsequent fusion, the larger the matching value, the larger the proportion of that module's skeleton rotation value in the final result.
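A minimal sketch of this selection step under an assumed interface: capture_modules maps each semantic to a per-semantic network that, given the frame, returns an image-module matching probability and a vector of skeleton rotation values. These names and signatures are illustrative only:

```python
# Assumed interface: capture_modules[sem](frame) -> (match_prob, rotations).
def run_top3_capture(semantic_probs: dict, frame, capture_modules: dict):
    top3 = sorted(semantic_probs, key=semantic_probs.get, reverse=True)[:3]
    results = []
    for sem in top3:
        match_prob, rotations = capture_modules[sem](frame)  # per-semantic net
        results.append((sem, semantic_probs[sem], match_prob, rotations))
    return results
```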
In the fourth aspect, the hand skeleton joint rotation values output by the three different modules are fused, using the image-module matching values predicted by the capture modules together with the gesture probability values output by the gesture recognition module. Specifically:
The image-module matching probability predicted by the gesture-capture module corresponding to each of the three output skeleton rotation values is multiplied by the gesture probability value output by the gesture recognition module, yielding three new probabilities; these three probabilities are then normalized so that they sum to one. This step guarantees the soundness of the probabilities used during fusion: a hand-joint rotation value that both the gesture recognition module and the gesture-capture module predict with high probability receives the highest proportion in the fusion.
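A worked sketch of this product-and-normalize step, reusing the example semantic probabilities from the second aspect; the matching values are assumed for illustration:

```python
import numpy as np

# Semantic probabilities from the recognition module (example above) and
# assumed image-module matching values from the three capture modules.
semantic_prob = np.array([0.90, 0.05, 0.05])
match_prob    = np.array([0.80, 0.50, 0.30])   # assumed values

fused = semantic_prob * match_prob             # element-wise product
fused /= fused.sum()                           # normalize to sum to one
print(fused)                                   # -> approx. [0.947, 0.033, 0.020]
```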
In the fifth aspect, a maximum-likelihood solve based on a Gaussian mixture model is carried out over the fusion probabilities and the corresponding skeleton rotation values obtained in the previous step, yielding the skeleton rotation distribution function of the current frame image, from which the mean and variance of the current observation distribution are computed.
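A minimal sketch of these observation statistics, treating each module's rotation value as one mixture component weighted by its normalized fusion probability; the fixed per-component variance is an assumption, since the patent does not state how component spreads are obtained:

```python
import numpy as np

# Mixture mean and variance of the observation distribution for one joint angle.
def observation_stats(weights, rotations, sigma2=1e-3):
    w = np.asarray(weights)       # normalized fusion probabilities
    mu = np.asarray(rotations)    # one rotation value per component (radians)
    mean = np.sum(w * mu)                          # E[X]
    var = np.sum(w * (sigma2 + mu**2)) - mean**2   # E[X^2] - E[X]^2
    return mean, var

mean, var = observation_stats([0.947, 0.033, 0.020], [0.52, 0.48, 0.61])
```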
In the sixth aspect, Kalman filtering is used to fuse the hand-capture pose distribution function of the previous frame image with the Gaussian-mixture pose distribution obtained for the current frame: a skeleton rotation value is predicted from the previous frame's motion description quantity, the distribution of the resulting skeleton rotation value is obtained, and the mean of that distribution is taken as the final output of the system.
Combining the current hand-capture result with the capture results before the current moment improves the temporal smoothness of the overall capture effect, dynamically adjusts the fusion parameters of the capture process, and automatically removes jitter caused by noise. By combining the observation distribution and the prediction distribution, the Kalman-filtering method smooths the overall hand-capture effect in the time dimension and strengthens the interference resistance of the whole hand-capture system.
Because the system must deliver the same real-time performance as traditional capture gloves, its input is only a single frame at a time, so the captured gesture motion tends to jitter temporally when the gesture type changes. A Kalman-filter-based fusion smoothing module is therefore proposed to dynamically adjust the fusion smoothing coefficients in step with the gesture-type transition process. This module fuses and smooths gesture actions over time, increasing the smoothness of virtual-character hand capture and its adaptability to different scenes.
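The following is a minimal per-joint sketch of this fusion, under the assumptions that the motion description quantity is a simple angular velocity and that the process noise is a fixed scalar; none of these modeling choices are specified by the patent:

```python
# Scalar Kalman update for one joint angle; all noise parameters assumed.
def kalman_fuse(prev_mean, prev_var, velocity, dt, obs_mean, obs_var, q=1e-4):
    # Predict from the previous frame's pose plus its motion description.
    pred_mean = prev_mean + velocity * dt
    pred_var = prev_var + q                  # q: assumed process noise
    # Update with the current frame's observation distribution (fifth aspect).
    gain = pred_var / (pred_var + obs_var)   # Kalman gain
    mean = pred_mean + gain * (obs_mean - pred_mean)
    var = (1.0 - gain) * pred_var
    return mean, var                         # `mean` drives the virtual skeleton
```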
In summary, the embodiment of the present application provides a hand capturing method that: collects a gesture image of the current frame; predicts the probabilities of different gesture semantics and screens out those satisfying a set probability threshold; inputs the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, obtaining the image-module matching probability and skeleton rotation value of each gesture semantic; multiplies the image-module matching probability of each gesture semantic by the probability of that gesture semantic to obtain the fusion probability of each gesture semantic; normalizes the fusion probabilities of all gesture semantics, and obtains the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values; and outputs a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to that skeleton rotation value. While capture accuracy is preserved, the efficiency of the whole capture process is significantly improved.
Based on the same technical concept, the embodiment of the application further provides a hand capturing system, as shown in fig. 3, the system includes:
the data acquisition module 301 is configured to acquire a current frame gesture image;
the classification prediction module 302 is configured to screen probabilities of gesture semantics meeting the conditions according to a set probability threshold;
The skeleton rotation value calculation module 303 is configured to input the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, to obtain the image-module matching probability and skeleton rotation value of each gesture semantic;
The fusion probability calculation module 304 is configured to multiply the image-module matching probability of each gesture semantic by the probability of that gesture semantic, to obtain the fusion probability of each gesture semantic;
The skeleton rotation distribution module 305 is configured to normalize the fusion probabilities of all gesture semantics, and to obtain the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values;
The skeleton rotation value output module 306 is configured to output a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to that skeleton rotation value.
In one possible embodiment, the system further comprises: the data acquisition module 301, further configured to acquire the hand-capture pose distribution function and the motion description quantity of the previous frame image;
the skeleton rotation value calculation module 303, further configured to estimate a skeleton rotation value from the motion description quantity of the previous frame image;
and a fusion module, configured to fuse, based on Kalman filtering, the hand-capture pose distribution function and skeleton rotation value of the previous frame image with the skeleton rotation distribution function of the current frame image, to obtain the fused skeleton rotation distribution function of the current frame image.
In one possible embodiment, the bone rotation distribution module 305 is specifically configured to:
solve a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
The embodiment of the application also provides electronic equipment corresponding to the method provided by the embodiment. Referring to fig. 4, a schematic diagram of an electronic device according to some embodiments of the present application is shown. The electronic device 20 may include: a processor 200, a memory 201, a bus 202 and a communication interface 203, the processor 200, the communication interface 203 and the memory 201 being connected by the bus 202; the memory 201 stores a computer program executable on the processor 200, and the processor 200 executes the method according to any of the foregoing embodiments of the present application when the computer program is executed.
The memory 201 may include a high-speed random access memory (RAM), and may further include non-volatile memory, such as at least one disk memory. The communication connection between the system network element and at least one other network element is implemented through at least one communication interface 203 (which may be wired or wireless), and may use the internet, a wide area network, a local network, a metropolitan area network, etc.
Bus 202 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. The memory 201 is configured to store a program, and the processor 200 executes the program after receiving an execution instruction, and the method disclosed in any of the foregoing embodiments of the present application may be applied to the processor 200 or implemented by the processor 200.
The processor 200 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 200 or by instructions in the form of software. The processor 200 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware decoding processor, or in a combination of hardware and software modules within a decoding processor. The software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory 201, and the processor 200 reads the information in the memory 201 and, in combination with its hardware, performs the steps of the above method.
Since the electronic device provided by this embodiment of the application arises from the same inventive concept as the method provided by the embodiments of the application, it has the same beneficial effects as the method it adopts, runs, or implements.
The present application further provides a computer-readable storage medium corresponding to the method provided by the foregoing embodiments. Referring to fig. 5, the computer-readable storage medium is shown as an optical disc 30, on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program performs the method provided by any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
Since the computer-readable storage medium provided by the above embodiments of the present application arises from the same inventive concept as the method provided by the embodiments of the present application, it has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.
It should be noted that:
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may also be used with the teachings herein. The required structure for the construction of such devices is apparent from the description above. In addition, the present application is not directed to any particular programming language. It will be appreciated that the teachings of the present application described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the various inventive aspects. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Various component embodiments of the application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components of the apparatus according to an embodiment of the present application may be implemented in practice using a microprocessor or digital signal processor (DSP). The present application can also be implemented as an apparatus or device program (e.g., a computer program and a computer program product) for performing part or all of the methods described herein. Such a program embodying the present application may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. A hand capture method, the method comprising:
collecting a gesture image of the current frame;
predicting the probabilities of different gesture semantics from the gesture image of the current frame;
inputting the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, and obtaining the image-module matching probability and skeleton rotation value of each gesture semantic;
multiplying the image-module matching probability of each gesture semantic by the probability of that gesture semantic to obtain the fusion probability of each gesture semantic;
normalizing the fusion probabilities of all gesture semantics, and obtaining the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values;
and outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to the skeleton rotation value.
2. The method of claim 1, wherein obtaining the skeleton rotation distribution function of the current frame image from the processed fusion probabilities and the corresponding skeleton rotation values comprises:
solving a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
3. The method of claim 1, wherein outputting the skeleton rotation value according to the skeleton rotation distribution function of the current frame image comprises:
computing the mean of the skeleton rotation distribution function of the current frame image, and outputting that mean as the skeleton rotation value.
4. The method of claim 1, wherein after predicting the probabilities of different gesture semantics from the gesture image of the current frame, the method further comprises:
screening out the gesture semantics whose probabilities satisfy a set probability threshold.
5. A hand capture system, the system comprising:
The data acquisition module is used for collecting a gesture image of the current frame;
The classification prediction module is used for predicting the probabilities of different gesture semantics from the gesture image of the current frame;
The skeleton rotation value calculation module is used for inputting the gesture image of the current frame into the gesture-capture neural network corresponding to each gesture semantic, to obtain the image-module matching probability and skeleton rotation value of each gesture semantic;
The fusion probability calculation module is used for multiplying the image-module matching probability of each gesture semantic by the probability of that gesture semantic, to obtain the fusion probability of each gesture semantic;
The skeleton rotation distribution module is used for normalizing the fusion probabilities of all gesture semantics, and obtaining the skeleton rotation distribution function of the current frame image from the normalized fusion probabilities and the corresponding skeleton rotation values;
And the skeleton rotation value output module is used for outputting a skeleton rotation value according to the skeleton rotation distribution function of the current frame image, so as to drive the virtual hand skeleton motion according to the skeleton rotation value.
6. The system of claim 5, wherein the skeleton rotation distribution module is configured to:
solve a maximum likelihood function over all the processed fusion probabilities and the corresponding skeleton rotation values to obtain the skeleton rotation distribution function of the current frame image.
7. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when running the computer program, implements the method according to any one of claims 1-4.
8. A computer readable storage medium having stored thereon computer readable instructions executable by a processor to implement the method of any of claims 1-4.
CN202210497950.4A 2022-05-07 2022-05-07 Hand capturing method and system Active CN115079818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210497950.4A CN115079818B (en) 2022-05-07 2022-05-07 Hand capturing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210497950.4A CN115079818B (en) 2022-05-07 2022-05-07 Hand capturing method and system

Publications (2)

Publication Number Publication Date
CN115079818A CN115079818A (en) 2022-09-20
CN115079818B 2024-07-16

Family

ID=83247308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210497950.4A Active CN115079818B (en) 2022-05-07 2022-05-07 Hand capturing method and system

Country Status (1)

Country Link
CN (1) CN115079818B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116719416B (en) * 2023-08-07 2023-12-15 海马云(天津)信息技术有限公司 Gesture motion correction method and device for virtual digital person, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104965592A (en) * 2015-07-08 2015-10-07 苏州思必驰信息科技有限公司 Voice and gesture recognition based multimodal non-touch human-machine interaction method and system
CN108196679A (en) * 2018-01-23 2018-06-22 河北中科恒运软件科技股份有限公司 Gesture-capture and grain table method and system based on video flowing

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140083848A (en) * 2012-12-26 2014-07-04 전남대학교산학협력단 Gesture Recognition Method and Apparatus Using Sensor Data
US9946354B2 (en) * 2014-08-29 2018-04-17 Microsoft Technology Licensing, Llc Gesture processing using a domain-specific gesture language
CN105205454A (en) * 2015-08-27 2015-12-30 深圳市国华识别科技开发有限公司 System and method for capturing target object automatically
EP3467707B1 (en) * 2017-10-07 2024-03-13 Tata Consultancy Services Limited System and method for deep learning based hand gesture recognition in first person view
CN108846378A (en) * 2018-07-03 2018-11-20 百度在线网络技术(北京)有限公司 Sign Language Recognition processing method and processing device
CN108921942B (en) * 2018-07-11 2022-08-02 北京聚力维度科技有限公司 Method and device for 2D (two-dimensional) conversion of image into 3D (three-dimensional)
US11756291B2 (en) * 2018-12-18 2023-09-12 Slyce Acquisition Inc. Scene and user-input context aided visual search
CN110084192B (en) * 2019-04-26 2023-09-26 南京大学 Rapid dynamic gesture recognition system and method based on target detection
CN111160114B (en) * 2019-12-10 2024-03-19 深圳数联天下智能科技有限公司 Gesture recognition method, gesture recognition device, gesture recognition equipment and computer-readable storage medium
US11295120B2 (en) * 2020-05-06 2022-04-05 Nec Corporation Of America Hand gesture habit forming
CN113792651B (en) * 2021-09-13 2024-04-05 广州广电运通金融电子股份有限公司 Gesture interaction method, device and medium integrating gesture recognition and fingertip positioning
CN114217792A (en) * 2021-11-29 2022-03-22 上海瑞家信息技术有限公司 Page loading method, equipment and device and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104965592A (en) * 2015-07-08 2015-10-07 苏州思必驰信息科技有限公司 Voice and gesture recognition based multimodal non-touch human-machine interaction method and system
CN108196679A (en) * 2018-01-23 2018-06-22 河北中科恒运软件科技股份有限公司 Gesture-capture and grain table method and system based on video flowing

Also Published As

Publication number Publication date
CN115079818A (en) 2022-09-20

Similar Documents

Publication Publication Date Title
CN109871781B (en) Dynamic gesture recognition method and system based on multi-mode 3D convolutional neural network
CN108875732B (en) Model training and instance segmentation method, device and system and storage medium
TWI714834B (en) Human face live detection method, device and electronic equipment
CN108010031B (en) Portrait segmentation method and mobile terminal
CN107392842B (en) Image stylization processing method and device, computing equipment and computer storage medium
CN111583097A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
WO2023280148A1 (en) Blood vessel segmentation method and apparatus, and electronic device and readable medium
CN109785246B (en) Noise reduction method, device and equipment for non-local mean filtering
CN109840883B (en) Method and device for training object recognition neural network and computing equipment
CN107277615B (en) Live broadcast stylization processing method and device, computing device and storage medium
CN111028006B (en) Service delivery auxiliary method, service delivery method and related device
CN111723687A (en) Human body action recognition method and device based on neural network
CN111008935B (en) Face image enhancement method, device, system and storage medium
CN108399599B (en) Image processing method and device and electronic equipment
CN108921131B (en) Method and device for generating face detection model and three-dimensional face image
CN112602319B (en) Focusing device, method and related equipment
CN115079818B (en) Hand capturing method and system
CN107959798B (en) Video data real-time processing method and device and computing equipment
CN107610046A (en) Background-blurring method, apparatus and system
CN111081266A (en) Training generation countermeasure network, and voice enhancement method and system
CN111597966B (en) Expression image recognition method, device and system
CN112184580A (en) Face image enhancement method, device, equipment and storage medium
WO2024011859A1 (en) Neural network-based face detection method and device
CN112561822B (en) Beautifying method and device, electronic equipment and storage medium
CN112949348A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant