# multimodal-large-language-models

Here are 72 public repositories matching this topic.
VirtuTA is an AI teaching assistant that delivers quick, accurate responses to student queries directly on Piazza. Powered by agentic workflows, Google Gemini, and Langchain, it automates both conceptual and logistical course queries.
Updated Jun 25, 2024 · Jupyter Notebook
Giving RecurrentGemma sight.
Updated Jun 27, 2024 · Python
Multi-Modal Representational Learning for Social Media Popularity Prediction
Updated Jun 24, 2024 · Python
A framework streamlining training, fine-tuning, evaluation, and deployment of multimodal language models
Localized multimodal large language model integrated with Streamlit and Ollama for interactive text and image processing tasks.
Updated Jun 28, 2024 · Python
Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)
Updated Jun 2, 2024 · Python
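The retrieval half of a multimodal RAG pipeline like the one above can be sketched with a toy bag-of-words similarity over image captions (a minimal sketch; the captions, identifiers, and scoring below are hypothetical and not taken from the project — a real pipeline would embed images and text with a multimodal encoder such as CLIP):

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; stands in for a real encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical caption index standing in for an image/document store.
corpus = {
    "img1": "a chest ct scan showing the lungs",
    "img2": "a cat sitting on a red sofa",
    "img3": "a brain mri scan with contrast",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank stored captions by similarity to the query and return the top k;
    # the hits would then be handed to the language model as context.
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(corpus[doc])), reverse=True)
    return ranked[:k]

print(retrieve("ct scan of the lungs"))  # → ['img1']
```

The generation step (prompting an LLM with the retrieved context) is omitted, since that is where the compared language models would differ.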
A curated list of awesome image-captioning studies, aimed at annotating and reporting CT/MRI scans
Updated Jun 4, 2024 · Jupyter Notebook
Implementation of "Arcana: Improving Multi-modal Large Language Model through Boosting Vision Capabilities"
Updated Jun 7, 2024 · Python
A Streamlit-based AI assistant that generates custom Streamlit app code from user-provided images or text using the Google Gemini model.
Updated Jun 29, 2024 · Python
A voice assistant built on multimodal LLMs: a fine-tuned LLaVA-NeXT (Mistral 7B) plus PhoWhisper
Updated May 15, 2024 · Python
Composition of Multimodal Language Models From Scratch
Updated Jun 6, 2024 · Jupyter Notebook
Pressure-testing large video-language models (LVLMs): multimodal retrieval from an LVLM at arbitrary video lengths to measure accuracy
Updated Jun 21, 2024 · Python
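The needle-in-a-haystack style of pressure test behind the entry above can be mimicked with plain strings: plant a fact at a known relative depth in a long transcript and score whether it is recovered (purely illustrative; the filler text and the trivial substring "retriever" are made up, and a real test would query the video-language model itself):

```python
def build_haystack(needle: str, n_segments: int, depth: float) -> str:
    # Build a long "transcript" of filler segments and insert the needle
    # at a relative depth in [0, 1], as needle-in-a-haystack tests do.
    filler = [f"segment {i}: nothing notable happens." for i in range(n_segments)]
    filler.insert(int(depth * n_segments), needle)
    return " ".join(filler)

def mock_retriever(transcript: str, query: str) -> bool:
    # Stand-in for the model under test: succeeds iff the queried
    # phrase appears verbatim in its context window.
    return query in transcript

def accuracy_at_depths(needle: str, query: str, depths: list[float]) -> float:
    # Fraction of depths at which the query is recovered.
    hits = sum(
        mock_retriever(build_haystack(needle, n_segments=50, depth=d), query)
        for d in depths
    )
    return hits / len(depths)

acc = accuracy_at_depths(
    needle="the red car appears at minute 42.",
    query="red car",
    depths=[0.0, 0.25, 0.5, 0.75, 1.0],
)
print(acc)  # → 1.0 for this trivial substring retriever
```

With an actual LVLM in place of `mock_retriever`, accuracy typically varies with both video length and needle depth, which is what such pressure tests are designed to expose.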
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"
Official implementation for "MJ-BENCH: Is Your Multimodal Reward Model Really a Good Judge?"
Updated Jun 7, 2024 · Jupyter Notebook
An up-to-date, curated list of awesome state-of-the-art research on LVLM hallucinations: papers and resources
VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
Updated Jun 25, 2024 · Python
Advances in recent large vision language models (LVLMs)