Stars
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
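Chinese LLM capability leaderboard: currently covers 115 large models, including commercial models such as ChatGPT, GPT-4o, Baidu ERNIE Bot, Alibaba Tongyi Qianwen, iFlytek Spark, SenseTime SenseChat, and MiniMax, as well as open-source models such as Baichuan, Qwen2, GLM4, Yi, InternLM2, and Llama3, evaluated across multiple capability dimensions. It provides not only a ranked leaderboard of capability scores but also the raw outputs of every model.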
YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931), ECCV Workshops 2022
Open-source simulator for autonomous driving research.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
The world's simplest facial recognition API for Python and the command line
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
The devkit of the nuScenes dataset.
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
awesome-autonomous-driving
End-to-End Object Detection with Transformers
[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time
GPT4V-level open-source multi-modal model based on Llama3-8B
🦜🔗 Build context-aware reasoning applications
This is the official repository for Retrieval Augmented Visual Question Answering
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
State-of-the-art 2D and 3D Face Analysis Project
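A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.
SuperCLUE: A Comprehensive Benchmark for General-Purpose Foundation Models in Chinese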