FLUR.GG

JavaScript 29 21 Updated Sep 30, 2024

SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process

Python 151 10 Updated Jan 21, 2024

A monorepo for packages implementing CAT protocol

TypeScript 180 124 Updated Oct 7, 2024

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Python 222 13 Updated Aug 6, 2024

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

TypeScript 16,313 1,162 Updated Oct 7, 2024

A powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

Python 1,171 115 Updated Sep 10, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,475 140 Updated Oct 4, 2024

(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator

Python 77 Updated May 28, 2024

A description of different datasets

Python 5 Updated Aug 29, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,162 849 Updated Sep 13, 2024

Command line interface for ORE miners.

Rust 1,461 660 Updated Oct 6, 2024

Orchestrate zero-shot computer vision models

HTML 398 12 Updated Aug 20, 2024

Example applications, microservices, and code samples for the Internet Computer

JavaScript 535 350 Updated Oct 7, 2024

Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

81 2 Updated Jul 16, 2024

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python 116 2 Updated Aug 23, 2024

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

Python 34 4 Updated Jul 15, 2024

Survey on Data-centric Large Language Models

58 Updated Jul 8, 2024

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,785 125 Updated Sep 26, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,785 108 Updated Jul 29, 2024

Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"

Python 283 13 Updated Oct 7, 2024

Release builds of Hello VaM

278 5 Updated Jul 28, 2024

Evaluation code for Ref-L4, a new REC benchmark in the LMM era

Python 14 Updated Jul 11, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,710 112 Updated Sep 19, 2024

xMetaCene automation scripts

Python 13 Updated Jun 25, 2024

An Open-source Toolkit for LLM Development

Python 2,700 170 Updated May 24, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 6,449 663 Updated Aug 12, 2024

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Python 829 55 Updated Apr 22, 2024
Python 80 1 Updated Jun 17, 2024

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Python 108 3 Updated May 29, 2024