Stars
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Reverse image search using the cv_resnest101_general_recognition model and the Milvus database
Real-time face swap and one-click video deepfake with only a single image
Deep learning for image processing, including classification, object detection, etc.
Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
A curated list of awesome computer vision resources
A curated list of image inpainting and video inpainting papers and resources
[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images
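📚 Classic computer science and programming e-books, including the "big black books", covering computer fundamentals, C/C++, Java, Python, interview questions, architecture design, algorithm series, and more.
"C/C++ Study + Interview Guide": covers most of what a C++ programmer needs to know, from beginner, intermediate, and advanced topics to campus and experienced-hire recruiting. For C++ study and interview preparation, CppGuide is the first choice!
An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes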
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Easily compute clip embeddings and build a clip retrieval system with them
Effortless data labeling with AI support from Segment Anything and other awesome models.
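downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).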
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
An open source implementation of CLIP.
Meta-Transformer for Unified Multimodal Learning
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
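AISystem covers AI systems: full-stack low-level AI technologies including AI chips, AI compilers, and AI inference and training frameworks
Source code for the ICLR'19 paper 'A Closer Look at Few-shot Classification'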
[ICML 2023] A Closer Look at Few-shot Classification Again
Ready-to-use code and tutorial notebooks to boost your way into few-shot learning for image classification.
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥