View ihollywhy's full-sized avatar

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 5,674 440 Updated Sep 19, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Python 2,726 174 Updated Aug 1, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 21,795 2,115 Updated Aug 9, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,868 150 Updated Sep 25, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,465 437 Updated Jul 30, 2024

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models l…

Jupyter Notebook 5,291 823 Updated Oct 4, 2024

PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

Python 973 91 Updated Feb 12, 2023

Official PyTorch Implementation of "GAN-Supervised Dense Visual Alignment" (CVPR 2022 Oral, Best Paper Finalist)

Python 1,011 120 Updated Oct 12, 2022
Python 7,098 549 Updated Aug 12, 2024

Rembg is a tool to remove image backgrounds

Python 16,512 1,847 Updated Oct 1, 2024

EfficientViT is a new family of vision models for efficient high-resolution vision.

Python 1,799 164 Updated Aug 9, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,823 1,055 Updated Aug 15, 2024

State-of-the-art 2D and 3D Face Analysis Project

Python 23,027 5,370 Updated Sep 30, 2024

Official PyTorch implementation of StyleGAN3

Python 6,383 1,123 Updated Sep 12, 2023

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,308 224 Updated Jun 14, 2024

Taming Transformers for High-Resolution Image Synthesis

Jupyter Notebook 5,735 1,139 Updated Jul 30, 2024

Mamba SSM architecture

Python 12,777 1,078 Updated Oct 7, 2024

CoTracker is a model for tracking any point (pixel) on a video.

Jupyter Notebook 2,738 196 Updated Sep 25, 2024

Generative Models by Stability AI

Python 24,292 2,702 Updated Sep 4, 2024

Focus on prompting and generating

Python 40,591 5,667 Updated Aug 21, 2024

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,123 210 Updated Sep 26, 2024

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 9,527 685 Updated Jul 25, 2024

Official implementations for paper: Anydoor: zero-shot object-level image customization

Python 3,949 359 Updated Apr 8, 2024

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Python 772 37 Updated Jun 27, 2024

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

1,160 58 Updated Mar 14, 2024

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)

Python 258 12 Updated Feb 23, 2024

A state-of-the-art-level open visual language model | multimodal pretrained model

Python 5,923 407 Updated May 29, 2024

An Open-source Toolkit for LLM Development

Python 2,700 170 Updated May 24, 2024

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,712 370 Updated Mar 14, 2024

VisionLLM Series

Python 868 23 Updated Sep 13, 2024