realsammyt
Stars

multimodal

51 repositories

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,372 132 Updated Jul 8, 2024

Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)

Python 3,711 313 Updated Jun 12, 2024

ReVersion: Diffusion-Based Relation Inversion from Images

Python 437 14 Updated Jul 7, 2024

This script automates video stylization tasks using Stable Diffusion and ControlNet.

Python 807 62 Updated Nov 21, 2023

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,513 240 Updated Mar 5, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,590 235 Updated Jun 4, 2024

An open-source framework for training large multimodal models.

Python 3,568 271 Updated May 25, 2024

Curated list of useful LLM / Analytics / Datascience resources

1,564 134 Updated May 30, 2024

This is the official repository for the LENS (Large Language Models Enhanced to See) system.

Jupyter Notebook 341 11 Updated Dec 9, 2023

🦜🔗 Build context-aware reasoning applications

Python 88,885 13,990 Updated Jul 10, 2024

A multimodal inference pipeline that integrates InstructBLIP with textgen-webui for Vicuna and related models.

Python 30 2 Updated Jul 14, 2023

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, B…

Python 404 30 Updated Apr 21, 2024

SciGraphQA

Jupyter Notebook 34 2 Updated Aug 8, 2023

Fast and Accurate ML in 3 Lines of Code

Python 7,427 881 Updated Jul 10, 2024

Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!

Python 3,562 236 Updated Jul 10, 2024

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…

Rust 3,592 186 Updated Jul 10, 2024

Meta-Transformer for Unified Multimodal Learning

Python 1,470 113 Updated Dec 5, 2023

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

Python 2,550 216 Updated Jun 26, 2024

🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI …

JavaScript 5,891 710 Updated Mar 15, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 4,306 331 Updated May 28, 2024

A collection of papers, code, projects, tutorials ... for Knowledge Graph and other NLP methods

63 13 Updated Jun 17, 2024

WavJourney: Compositional Audio Creation with LLMs

Python 510 45 Updated Sep 28, 2023

Python library for designing and training your own Diffusion Models with PyTorch.

Python 264 10 Updated Jun 21, 2024

PALLAIDIUM - a generative AI movie studio integrated in the Blender video editor.

Python 861 69 Updated Jul 10, 2024

Magick is a cutting-edge toolkit for a new kind of AI builder. Make Magick with us!

TypeScript 652 108 Updated Jul 10, 2024

ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.

Python 3,799 415 Updated Jul 10, 2024

Official implementation of SEED-LLaMA (ICLR 2024).

Python 518 29 Updated Apr 11, 2024

A framework to enable multimodal models to operate a computer.

Python 8,217 1,096 Updated Jul 9, 2024

Label Studio is a multi-type data labeling and annotation tool with standardized output format

JavaScript 17,400 2,162 Updated Jul 10, 2024

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activelo…

Python 7,877 607 Updated Jul 10, 2024