multi-modal

Here are 320 public repositories matching this topic...

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Updated Oct 22, 2024
Python

activeloopai / deeplake

Database for AI. Store vectors, images, texts, videos, etc. Use with LLMs/LangChain. Store, query, version, and visualize any AI data. Stream data in real time to PyTorch/TensorFlow. https://activeloop.ai

Updated Nov 23, 2024
Python

modelscope / modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

python nlp science machine-learning deep-learning cv speech multi-modal

Updated Nov 22, 2024
Python

THUDM / CogVLM

A state-of-the-art-level open visual language model | multimodal pre-trained model

pretrained-models language-model multi-modal cross-modality visual-language-models

Updated May 29, 2024
Python

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

image-classification gpt multi-modal semantic-segmentation video-classification image-text-retrieval llm vision-language-model gpt-4v vit-6b vit-22b gpt-4o

Updated Nov 23, 2024
Python

lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch

deep-learning transformers artificial-intelligence multi-modal attention-mechanism text-to-image

Updated Feb 17, 2024
Python

modelscope / agentscope

Start building LLM-empowered multi-agent applications in an easier way.

agent drag-and-drop chatbot multi-agent multi-modal distributed-agents gpt-4 large-language-models llm llm-agent llama3 gpt-4o

Updated Nov 20, 2024
Python

marqo-ai / marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Updated Nov 22, 2024
Python

OFA-Sys / Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

nlp computer-vision deep-learning transformers pytorch chinese pretrained-models multi-modal clip coreml-models contrastive-loss vision-language multi-modal-learning image-text-retrieval vision-and-language-pre-training

Updated Aug 6, 2024
Python

valhalla / valhalla

Open Source Routing Engine for OpenStreetMap

directions openstreetmap routing astar traveling-salesman dijkstra routing-engine isochrones multi-modal tiled

Updated Nov 23, 2024
C++

THUDM / VisualGLM-6B

Chinese and English multimodal conversational language model

gpt multi-modal chatglm-6b

Updated Aug 23, 2024
Python

zjunlp / DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Updated Nov 21, 2024
Python

PKU-YuanGroup / Video-LLaVA

[EMNLP 2024] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

multi-modal instruction-tuning large-vision-language-model

Updated Sep 25, 2024
Python

modelscope / data-juicer

Making data higher-quality, juicier, and more digestible for any large models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Updated Nov 22, 2024
Python

docarray / docarray

Represent, send, store and search multimodal data

elasticsearch machine-learning deep-learning protobuf pytorch data-structures nearest-neighbor-search cross-modal multi-modal semantic-search multimodal nested-data weaviate dataclass pydantic fastapi neural-search qdrant docarray

Updated Nov 22, 2024
Python

VectorSpaceLab / OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

image image-generation multi-modal multi-task diffusion image-edit

Updated Nov 16, 2024
Jupyter Notebook

SciSharp / LLamaSharp

A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.

chatbot llama gpt multi-modal llm llava semantic-kernel llamacpp llama-cpp llama2 llama3

Updated Nov 22, 2024
C#

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

pretrained-models language-model multi-modal cogvlm

Updated Sep 3, 2024
Python

PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

moe multi-modal mixture-of-experts large-vision-language-model

Updated May 15, 2024
Python

dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

segmentation multi-modal llm large-language-model

Updated Jul 2, 2024
Python
