
This repository contains a transformer-based model for real-time American Sign Language (ASL) recognition. The model leverages transformer architecture to interpret ASL gestures and utilizes the Gemini-Pro LLM API for constructing sentences from recognized ASL signs.


DEV-D-GR8/SignSense


SignSense: Transformer-Based ASL Recognition Model

This repository contains a transformer-based model for real-time American Sign Language (ASL) recognition. The model interprets ASL gestures from live video and passes the recognized signs to the Gemini-Pro LLM API, which assembles them into sentences. The result is on-the-fly translation of ASL gestures into text, intended to make communication more accessible.

Project Overview

This project aims to provide real-time ASL recognition using a transformer-based deep learning model. The application captures live video, processes the frames to detect and recognize ASL gestures, and uses the Gemini-Pro LLM API to construct sentences from the recognized words. The goal is to make communication easier for people who use ASL.
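Between per-frame recognition and sentence construction, a pipeline like this usually needs a step that turns noisy frame-level predictions into a clean word list. The sketch below shows one way to do that with a sliding vote window. `SignBuffer`, its thresholds, and the idea that the recognizer yields one label per frame are all assumptions made for illustration, not details taken from this repository.

```python
from collections import deque
from typing import Deque, List, Optional


class SignBuffer:
    """Collects per-frame labels and accepts a word once it is stable.

    Hypothetical helper: names and thresholds are illustrative only.
    """

    def __init__(self, window: int = 15, min_votes: int = 10) -> None:
        self.window: Deque[str] = deque(maxlen=window)  # most recent labels
        self.min_votes = min_votes
        self.words: List[str] = []  # accepted words, in order

    def push(self, label: Optional[str]) -> Optional[str]:
        """Add one frame's label; return the word if it was just accepted."""
        if label is None:  # no confident prediction for this frame
            return None
        self.window.append(label)
        votes = sum(1 for item in self.window if item == label)
        is_repeat = bool(self.words) and self.words[-1] == label
        if votes >= self.min_votes and not is_repeat:
            self.words.append(label)
            self.window.clear()  # start fresh for the next sign
            return label
        return None
```

One consequence of this design is that the same word cannot be accepted twice in a row, so a held sign is not duplicated; a real application might instead reset on a pause between signs.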

Features

  • Real-time ASL Recognition: Detect and recognize ASL gestures in real-time.
  • Transformer Architecture: Uses a transformer model to classify gestures.
  • Sentence Construction: Integrates with Gemini-Pro LLM API to build sentences from recognized signs.
  • Live Video Input: Supports live video input for seamless ASL translation.
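The sentence-construction feature can be sketched as a prompt builder plus a single call to the Gemini API. This is a minimal sketch, assuming the `google-generativeai` Python SDK; the prompt wording, the function names, the `GEMINI_API_KEY` environment variable, and the `gemini-pro` model identifier are assumptions, not taken from the repository's code.

```python
import os
from typing import List


def build_prompt(words: List[str]) -> str:
    """Turn recognized sign words into an instruction for the LLM."""
    glosses = " ".join(word.strip().upper() for word in words if word.strip())
    return (
        "The following words were recognized from American Sign Language, "
        "in order. Rewrite them as one grammatical English sentence and "
        "return only that sentence.\n"
        f"Words: {glosses}"
    )


def construct_sentence(words: List[str], model_name: str = "gemini-pro") -> str:
    """Send the prompt to Gemini and return the generated sentence."""
    # Imported lazily so build_prompt works without the SDK installed.
    import google.generativeai as genai

    # GEMINI_API_KEY is an assumed variable name, not one defined by the repo.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_prompt(words))
    return response.text.strip()
```

Calling `construct_sentence(["me", "go", "store"])` would send a prompt ending in `Words: ME GO STORE`; the returned sentence depends on the model.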

Tech Stack

  • Python: Core programming language used for development.
  • TensorFlow: Deep learning framework for building and training the transformer model.
  • OpenCV: Library for real-time computer vision tasks, used for video capture and preprocessing.
  • MediaPipe: Framework for building multimodal machine learning pipelines, used for hand and gesture tracking.
  • Gemini-Pro LLM API: API for generating sentences from recognized ASL words.
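To illustrate how OpenCV and MediaPipe could fit together, the sketch below reads webcam frames, runs MediaPipe Hands on each one, and flattens the landmarks into a fixed-length vector that a sequence model could consume. It assumes the MediaPipe Hands solution and a two-hand, zero-padded layout; the repository may use a different MediaPipe solution or feature layout, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
POINTS_PER_HAND = 21  # MediaPipe Hands reports 21 landmarks per hand


def to_feature_vector(hands: Sequence[Sequence[Point]], max_hands: int = 2) -> List[float]:
    """Flatten up to max_hands hands into one zero-padded vector."""
    features: List[float] = []
    for hand in list(hands)[:max_hands]:
        for x, y, z in hand:
            features.extend((x, y, z))
    expected = max_hands * POINTS_PER_HAND * 3
    features.extend([0.0] * (expected - len(features)))  # pad missing hands
    return features


def capture_features(camera_index: int = 0, max_frames: int = 30) -> List[List[float]]:
    """Read webcam frames and return one feature vector per frame."""
    # Imported lazily so to_feature_vector works without these packages.
    import cv2
    import mediapipe as mp

    sequence: List[List[float]] = []
    cap = cv2.VideoCapture(camera_index)
    try:
        with mp.solutions.hands.Hands(max_num_hands=2) as hands:
            while len(sequence) < max_frames:
                ok, frame = cap.read()
                if not ok:
                    break
                # MediaPipe expects RGB input; OpenCV delivers BGR.
                results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                detected = [
                    [(lm.x, lm.y, lm.z) for lm in hand.landmark]
                    for hand in (results.multi_hand_landmarks or [])
                ]
                sequence.append(to_feature_vector(detected))
    finally:
        cap.release()
    return sequence
```

Each frame yields 126 values (2 hands × 21 landmarks × 3 coordinates), so a frame with no visible hands becomes an all-zero vector rather than a gap in the sequence.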

Setup and Testing

To run the project on your local machine, follow these steps:

  1. Clone the Repository:
    git clone https://github.com/DEV-D-GR8/SignSense.git
    cd SignSense
    
  2. Install Dependencies:
    pip install -r requirements.txt
    
  3. Run the Application:
    python app.py
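If `python app.py` fails on an import, a quick check like the one below can show which packages are missing from the environment. This is a sketch only: the module names are assumed from the Tech Stack list, and `requirements.txt` remains the authoritative dependency list.

```python
import importlib.util
from typing import Dict, Iterable

# Import names assumed from the Tech Stack list; requirements.txt is authoritative.
ASSUMED_MODULES = ("cv2", "mediapipe", "tensorflow")


def check_modules(names: Iterable[str] = ASSUMED_MODULES) -> Dict[str, bool]:
    """Report which top-level modules can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```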
    

Demo

For a visual demonstration, check out my YouTube video.

License

This project is licensed under the MIT License. See the LICENSE file for more information.
