Vision

Introduction

This project uses MiniCPM-V (i.e., OmniLMM-3B) to run image recognition on real-time camera feeds. The application analyzes the scenes captured by the camera and gives instant feedback on what it perceives. You can modify the prompt to see how the model responds to different inputs.
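The flow described above (camera frame in, prompt plus frame to the model, description out) can be sketched roughly as follows. This is a minimal sketch, not the code in `vision.py`: the names `describe_frames`, `camera_frames`, and `DEFAULT_PROMPT` are hypothetical, and `describe` is a placeholder callable standing in for the MiniCPM-V call, whose actual interface is not shown in this README. The camera helper assumes OpenCV (`cv2`) is installed.

```python
from typing import Any, Callable, Iterable, Iterator

# Hypothetical default prompt; change it to see how the model responds.
DEFAULT_PROMPT = "What do you see in this image?"


def describe_frames(
    frames: Iterable[Any],
    describe: Callable[[Any, str], str],
    prompt: str = DEFAULT_PROMPT,
) -> Iterator[str]:
    """Yield one description per frame.

    `describe` is a placeholder for the model call (frame, prompt) -> text.
    It is an assumption for illustration, not the MiniCPM-V API.
    """
    for frame in frames:
        yield describe(frame, prompt)


def camera_frames(camera_index: int = 0) -> Iterator[Any]:
    """Yield frames from a camera until it stops delivering or 'q' is pressed."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV

    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # camera unavailable or stream ended
            cv2.imshow("Vision", frame)
            yield frame
            # waitKey also lets the preview window refresh
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

With a real model wrapper in place of `describe`, the two pieces would combine as `for text in describe_frames(camera_frames(), describe): print(text)`. Keeping the model call behind a plain callable makes it easy to swap prompts or test the loop without a camera.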

How to use this repository

Download the MiniCPM-V model files from the model1 and model2 links and put them in the MiniCPM-V folder.
Install the requirements by running `pip install -r requirements.txt`.
To start the image recognition application, run the `run.sh` script with one of the following device options: `mps`, `cpu`, or `cuda`. For example:
```
./run.sh mps   # For running on Apple Silicon GPU
./run.sh cpu   # For running on CPU
./run.sh cuda  # For running on CUDA-enabled GPU
```
If the script is not executable, first run `chmod +x run.sh`.
Quit the application by pressing `q`.
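
How `run.sh` hands the device option to `vision.py` is not shown here. As a hedged illustration, a Python entry point could validate the three documented options with `argparse` like this; `parse_device` and `SUPPORTED_DEVICES` are hypothetical names, not taken from the repository.

```python
import argparse
from typing import List, Optional

# The three device options documented above.
SUPPORTED_DEVICES = ("mps", "cpu", "cuda")


def parse_device(argv: Optional[List[str]] = None) -> str:
    """Return the requested device, rejecting anything not in SUPPORTED_DEVICES.

    Hypothetical helper: the real script may read its arguments differently.
    When `argv` is None, argparse falls back to the process arguments.
    """
    parser = argparse.ArgumentParser(
        description="Run image recognition on a camera feed."
    )
    parser.add_argument(
        "device",
        choices=SUPPORTED_DEVICES,
        help="compute device: Apple Silicon GPU (mps), CPU, or CUDA GPU",
    )
    return parser.parse_args(argv).device
```

For example, `parse_device(["cuda"])` returns the string `"cuda"`, while an unsupported value such as `"tpu"` makes `argparse` print a usage message and exit.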

