lemoner20

Starred repositories

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 1,406 70 Updated Sep 5, 2024

Natural Language Processing Tutorial for Deep Learning Researchers

Jupyter Notebook 14,036 3,902 Updated Feb 21, 2024

500 AI Machine learning Deep learning Computer vision NLP Projects with code

19,468 5,020 Updated Jul 26, 2024

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Python 1,965 205 Updated Aug 15, 2024

State-of-the-Art Text Embeddings

Python 14,780 2,432 Updated Aug 30, 2024

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Python 29,903 7,397 Updated Aug 22, 2024

Official Pytorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video”

Python 360 39 Updated Jul 5, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 2,827 203 Updated Jul 27, 2024

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,180 147 Updated Aug 23, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 1,990 187 Updated Apr 24, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,263 416 Updated Jul 30, 2024

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Python 358 13 Updated Apr 8, 2024

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Python 2,135 124 Updated Aug 29, 2024

Easily compute clip embeddings and build a clip retrieval system with them

Jupyter Notebook 2,342 208 Updated Apr 15, 2024

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Python 4,280 447 Updated Aug 6, 2024

Mixture-of-Experts for Large Vision-Language Models

Python 1,896 121 Updated May 15, 2024

Python bindings for llama.cpp

Python 7,632 916 Updated Sep 5, 2024

MiniSora: A community aims to explore the implementation path and future development direction of Sora.

Python 1,156 148 Updated Aug 14, 2024

A Gradio web UI for Large Language Models.

Python 39,369 5,176 Updated Sep 4, 2024

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

Go 87,971 6,854 Updated Sep 5, 2024

Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".

Python 1,674 174 Updated Sep 25, 2023

A family of lightweight multimodal models.

Python 869 66 Updated Sep 4, 2024

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

Python 28,043 5,576 Updated Sep 5, 2024

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,702 150 Updated Sep 3, 2024

HTML 3,401 2,024 Updated Jul 13, 2024

Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning"

Python 145 6 Updated Jul 6, 2023

Python 99 11 Updated Dec 23, 2022

Tracking and collecting papers/projects/others related to Segment Anything.

1,501 129 Updated Aug 16, 2024

Learning audio concepts from natural language supervision

Python 455 35 Updated May 27, 2024

Ongoing research training transformer models at scale

Python 9,846 2,227 Updated Sep 5, 2024