
Hand Gesture Recognition for Contactless Human-Computer Interaction


Building a hand gesture recognition model and using it to identify hand gestures in real time to trigger actions on a computer

Table of Contents

  1. About the Project
  2. Prerequisites
  3. Setup
  4. Acknowledgments

About the Project

The COVID-19 pandemic has inevitably accelerated the adoption of a number of contactless Human-Computer Interaction (HCI) technologies, one of which is hand gesture control. Hand gesture-controlled applications are widely used across various industries, including healthcare, food services, entertainment, smartphones and automotive.

In this project, a hand gesture recognition model is trained to recognize static and dynamic hand gestures. The model is used to predict hand gestures in real time through a webcam. Depending on the hand gesture predicted, the corresponding keystroke (keyboard shortcut) is sent to trigger an action on the computer.

Dataset

The dataset used is a subset of the 20BN-Jester dataset from Kaggle. It is a large collection of labelled video clips of humans performing hand gestures in front of a camera.

The full dataset consists of 148,092 three-second video clips covering 27 classes of hand gestures, which in total account for more than 5 million frames.

In this project, 10 classes of hand gestures have been selected to train the hand gesture recognition model.
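
As a rough illustration of how such a subset can be carved out, the sketch below filters a Jester annotation file down to the selected classes. The file name jester-v1-train.csv, the semicolon-separated video_id;label layout, and the exact class-name spellings are assumptions and may not match the files shipped with the Kaggle download.

    import pandas as pd

    # Classes selected for this project (label spellings assumed to match the
    # Jester annotation files).
    SELECTED_CLASSES = [
        "Swiping Left", "Swiping Right", "Swiping Down", "Swiping Up",
        "Sliding Two Fingers Down", "Sliding Two Fingers Up",
        "Thumb Down", "Thumb Up", "Stop Sign", "No gesture",
    ]

    # Hypothetical annotation file: one "video_id;label" pair per line.
    labels = pd.read_csv("jester-v1-train.csv", sep=";", names=["video_id", "label"])

    # Keep only the clips whose label belongs to one of the selected classes.
    subset = labels[labels["label"].isin(SELECTED_CLASSES)]
    subset.to_csv("train_subset.csv", index=False)
    print(f"Kept {len(subset)} of {len(labels)} training clips")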

Example Usage

Any action on a computer can be triggered as long as it is linked to a keyboard shortcut. For simplicity, this project is configured to trigger actions on YouTube, which has its own built-in keyboard shortcuts.

The table below shows the hand gestures and the actions they trigger on YouTube.

Hand Gesture                Action
Swiping Left                Fast forward 10 seconds
Swiping Right               Rewind 10 seconds
Swiping Down                Previous video
Swiping Up                  Next video
Sliding Two Fingers Down    Decrease volume
Sliding Two Fingers Up      Increase volume
Thumb Down                  Mute / unmute
Thumb Up                    Enter / exit full screen
Stop Sign                   Play / Pause
No Gesture                  No action
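
A minimal sketch of how this mapping might look in code, assuming the predicted class labels are snake_case versions of the gesture names and that keystrokes are sent with pyautogui. The keys listed are YouTube's standard shortcuts (l/j to seek, Shift+N/Shift+P for next/previous video, arrow keys for volume, m, f, k); the keys actually used in this repository's code may differ.

    import pyautogui

    # Hypothetical mapping from predicted class label to the YouTube keyboard
    # shortcut that triggers the corresponding action.
    GESTURE_TO_KEYS = {
        "swiping_left":             ["l"],           # fast forward 10 seconds
        "swiping_right":            ["j"],           # rewind 10 seconds
        "swiping_down":             ["shift", "p"],  # previous video
        "swiping_up":               ["shift", "n"],  # next video
        "sliding_two_fingers_down": ["down"],        # decrease volume
        "sliding_two_fingers_up":   ["up"],          # increase volume
        "thumb_down":               ["m"],           # mute / unmute
        "thumb_up":                 ["f"],           # enter / exit full screen
        "stop_sign":                ["k"],           # play / pause
        "no_gesture":               None,            # no action
    }

    def send_keystroke(gesture: str) -> None:
        """Send the keyboard shortcut mapped to the predicted gesture, if any."""
        keys = GESTURE_TO_KEYS.get(gesture)
        if keys:
            pyautogui.hotkey(*keys)

Because pyautogui.hotkey presses and releases the listed keys in order, single-key shortcuts and modifier combinations can share one code path.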

Project Outline

  1. Data Exploration
    • Explore the class distribution of the training and validation data (see the class_distribution_train and class_distribution_validation plots).
  2. Data Extraction
    • Extract training and validation data of the selected classes from the dataset.
  3. Hyperparameter Tuning
    • Perform a grid search to determine the optimal values for the dropout rate and learning rate (a simplified sketch of this step and the model-building step follows this outline).
  4. Model Training
    • Build a 3D ResNet-101 model with the optimal hyperparameters.
    • Compile the model.
    • Train the model.
  5. Classification
    • Read frames from the webcam, predict the hand gestures in the frames using the model, and send the corresponding keystrokes to trigger actions on the computer (a minimal webcam-loop sketch also follows this outline).
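
The sketch below is a simplified stand-in for steps 3 and 4, assuming a TensorFlow/Keras workflow: a small 3D CNN (not the full 3D ResNet-101) is built from a dropout rate and learning rate, compiled with Adam and categorical cross-entropy, and evaluated over a hypothetical grid of candidate values. train_ds and val_ds are placeholders for whatever clip datasets the data extraction step produces.

    import itertools

    from tensorflow.keras import layers, models, optimizers

    def build_model(dropout_rate, learning_rate,
                    input_shape=(16, 112, 112, 3), num_classes=10):
        """Build and compile a small 3D CNN (stand-in for the 3D ResNet-101)."""
        model = models.Sequential([
            layers.Input(shape=input_shape),
            layers.Conv3D(32, kernel_size=3, padding="same", activation="relu"),
            layers.MaxPooling3D(pool_size=2),
            layers.Conv3D(64, kernel_size=3, padding="same", activation="relu"),
            layers.MaxPooling3D(pool_size=2),
            layers.GlobalAveragePooling3D(),
            layers.Dropout(dropout_rate),
            layers.Dense(num_classes, activation="softmax"),
        ])
        model.compile(
            optimizer=optimizers.Adam(learning_rate=learning_rate),
            loss="categorical_crossentropy",
            metrics=["accuracy"],
        )
        return model

    # Hypothetical grid search over dropout rate and learning rate;
    # train_ds / val_ds stand for the clip datasets from the extraction step.
    best = None
    for dropout_rate, learning_rate in itertools.product([0.3, 0.5], [1e-3, 1e-4]):
        model = build_model(dropout_rate, learning_rate)
        history = model.fit(train_ds, validation_data=val_ds, epochs=5, verbose=0)
        val_acc = max(history.history["val_accuracy"])
        if best is None or val_acc > best[0]:
            best = (val_acc, dropout_rate, learning_rate)
    print("Best (val_acc, dropout, lr):", best)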

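The following is a minimal sketch of step 5, assuming OpenCV for webcam capture, a saved Keras model, 16-frame clips at 112x112 resolution, and the GESTURE_TO_KEYS / send_keystroke mapping sketched under Example Usage. The model path, clip length, class ordering, and confidence threshold are all assumptions, not the repository's exact settings.

    import collections

    import cv2
    import numpy as np
    import tensorflow as tf

    # Class names in the order the model was trained on (assumed).
    CLASSES = [
        "swiping_left", "swiping_right", "swiping_down", "swiping_up",
        "sliding_two_fingers_down", "sliding_two_fingers_up",
        "thumb_down", "thumb_up", "stop_sign", "no_gesture",
    ]

    model = tf.keras.models.load_model("gesture_model.h5")  # hypothetical path
    frames = collections.deque(maxlen=16)                   # rolling clip buffer

    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Resize and normalise the frame, then append it to the rolling clip.
        frames.append(cv2.resize(frame, (112, 112)).astype(np.float32) / 255.0)
        if len(frames) == frames.maxlen:
            clip = np.expand_dims(np.stack(frames), axis=0)  # (1, 16, 112, 112, 3)
            probs = model.predict(clip, verbose=0)[0]
            gesture = CLASSES[int(np.argmax(probs))]
            if gesture != "no_gesture" and probs.max() > 0.8:
                send_keystroke(gesture)  # from the Example Usage sketch
                frames.clear()           # avoid re-triggering on the same clip
        cv2.imshow("Hand Gesture Recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
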
Prerequisites

  • Python 3.7.9 or above

Setup

pip install -r requirements.txt

Acknowledgments
