SVLM

Speech Integrated Visual Language Model

Install

Install MPI

conda install -c conda-forge mpi4py mpich
pip install colorama

Install requirements

You may need to remove mpi4py (installed with conda above).

pip install -r assets/requirements/requirements.txt

# Rest of packages
pip install -r assets/requirements/requirements_custom.txt

# Custom operator for the deformable vision encoder
cd modeling/vision/encoder/ops && sh make.sh && cd ../../../../

# Run demo
pip install gradio
pip install timm
pip install nltk
pip install Pillow==9.5.0
pip install transformers
pip install kornia
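After installing, it can help to confirm that the demo packages are visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of the repository: the helper name `find_missing` and the `DEMO_MODULES` list are assumptions made for illustration, and the list uses import names (Pillow is imported as `PIL`).

```python
import importlib.util

# Import names for the demo packages listed above.
# Assumption: this list is illustrative; Pillow is imported as "PIL".
DEMO_MODULES = ["gradio", "timm", "nltk", "PIL", "transformers", "kornia"]


def find_missing(modules):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(DEMO_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All demo modules are importable.")
```

The check only locates the modules; it does not import them, so it will not catch a broken build of the custom operator or a version mismatch.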

Testing with LLaVA

pip install cog

Grounding LLaVA with SEEM

Set up LLaVA in eval mode so that it triggers the SEEM model for grounding.

The pipeline asks LLaVA for the objects in the image, then uses Named Entity Recognition to gather thing classes (or thing phrases) for instance segmentation.
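The step that turns LLaVA's answer into thing phrases can be sketched as follows. This is a simplified stand-in, not the repository's implementation: instead of a real entity tagger it splits the answer on commas, semicolons, and the word "and", then strips leading articles. The function name `extract_thing_phrases` and the `ARTICLES` set are hypothetical.

```python
import re

# Assumption: a small set of leading words to drop from each phrase.
ARTICLES = {"a", "an", "the", "some"}


def extract_thing_phrases(answer):
    """Split a free-text object list into de-duplicated, lower-cased phrases."""
    phrases = []
    # Split on commas, semicolons, and the standalone word "and".
    for piece in re.split(r"[,;]|\band\b", answer.lower()):
        words = piece.strip(" .!?\n\t").split()
        # Drop a leading article such as "a" or "the".
        if words and words[0] in ARTICLES:
            words = words[1:]
        phrase = " ".join(words)
        # Keep the first occurrence of each phrase, preserving order.
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    return phrases


if __name__ == "__main__":
    print(extract_thing_phrases("A dog, a red car, and the tree."))
```

The resulting phrases would then be passed to the segmentation model as text prompts. A heuristic like this breaks on full sentences ("I can see a dog sitting on a sofa"), which is where a proper tagger, for example from nltk, would be needed.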

