Stars
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
NVMeVirt: A Versatile Software-defined Virtual NVMe Device
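AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks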
Large Language Model Text Generation Inference
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Context Manager to profile the forward and backward times of PyTorch's nn.Module
🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.
Large World Model -- Modeling Text and Video with Millions Context
Compare different hardware platforms via the Roofline Model for LLM inference tasks.
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
Computer Architecture, VLSI, Verilog code, Xilinx, IRSIM
Official repo for "On the Generalization Ability of Retrieval-Enhanced Transformers"
Graph Partitioning for Large-scale Graph Datasets
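Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; use the chat data to train a personal AI chat assistant
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…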
LlamaIndex is a data framework for your LLM applications
A high-throughput and memory-efficient inference and serving engine for LLMs
A tutorial of building an LSM-Tree storage engine in a week.