- Nanjing University of Posts and Telecommunications
- Nanjing
Stars
Regular investing changes your fate: let time help you slowly become wealthy. https://onregularinvesting.com
A generative speech model for daily dialogue.
Code for "Neural 3D Scene Reconstruction with the Manhattan-world Assumption" CVPR 2022 Oral
[ICML 2021, Long Talk] Delving into Deep Imbalanced Regression
Official PyTorch implementation of SMIRK: 3D Facial Expressions through Analysis-by-Neural-Synthesis (CVPR 2024)
Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments
YOLOv10: Real-Time End-to-End Object Detection
A latent text-to-image diffusion model
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
YOLOX is a high-performance anchor-free YOLO that exceeds YOLOv3–v5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO support. Documentation: https://yolox.readthedocs.io/
End-to-End Object Detection with Transformers
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with LangChain integration for loading a local knowledge base for retrieval-augmented generation (RAG).
Implementation of Imagen, Google's text-to-image neural network, in PyTorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
An open source implementation of CLIP.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.
A quantitative investment backtesting framework: stocks, options, futures, factor investing, and portfolio analysis
A repository for pretraining from scratch plus SFT of a small-parameter Chinese LLaMA2; a single 24 GB GPU is enough to train a chat-llama2 with basic Chinese Q&A ability.
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
LAVIS - A One-stop Library for Language-Vision Intelligence
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection