Orchestrate zero-shot computer vision models

HTML 346 6 Updated Jul 25, 2024
Go 3 Updated Jul 17, 2024

Example applications, microservices, and code samples for the Internet Computer

JavaScript 515 331 Updated Jul 29, 2024

Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

68 1 Updated Jul 16, 2024

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python 105 2 Updated Jul 28, 2024

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

Python 30 4 Updated Jul 15, 2024

Survey on Data-centric Large Language Models

52 Updated Jul 8, 2024

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,596 110 Updated Jul 14, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,597 98 Updated Jul 27, 2024

Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"

Python 67 Updated Jul 5, 2024

Release builds of Hello VaM

240 3 Updated Jul 28, 2024

Evaluation code for Ref-L4, a new REC benchmark in the LMM era

Python 8 Updated Jul 11, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,607 99 Updated Jul 26, 2024

xMetaCene automation scripts

Python 9 Updated Jun 25, 2024

An Open-source Toolkit for LLM Development

Python 2,640 168 Updated May 24, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 5,818 614 Updated Jul 24, 2024

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Python 814 54 Updated Apr 22, 2024
Python 78 1 Updated Jun 17, 2024

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Python 100 3 Updated May 29, 2024

A Framework of Small-scale Large Multimodal Models

Python 518 47 Updated Jul 21, 2024

The official implementation of Hierarchical Semantic Decoding with Counting Assistance for Generalized Referring Expression Segmentation

14 Updated Jun 3, 2024

LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning

Python 50 5 Updated Apr 16, 2024

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 159 7 Updated Jun 26, 2024

We write your reusable computer vision tools. 💜

Python 18,172 1,398 Updated Jul 28, 2024

🐚 OpenDevin: Code Less, Make More

Python 29,128 3,369 Updated Jul 29, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,060 393 Updated Jul 17, 2024

✨✨Latest Advances on Multimodal Large Language Models

10,896 722 Updated Jul 25, 2024

A benchmark dataset for GRES and GREC [CVPR2023 Highlight]

Python 173 3 Updated Sep 4, 2023

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 700 35 Updated Jun 2, 2024

A fast, general-purpose inscription-minting tool for EVM-family blockchains (Ethereum, BSC, Matic, AVAX, OKX, etc.)

Rust 88 34 Updated Jan 14, 2024