
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 56,239 5,770 Updated Aug 24, 2024

NVMeVirt: A Versatile Software-defined Virtual NVMe Device

C 185 54 Updated Aug 7, 2024
Python 40 3 Updated Jun 24, 2024

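AISystem mainly refers to AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks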

Jupyter Notebook 11,190 1,622 Updated Oct 26, 2024

Large Language Model Text Generation Inference

Python 9,101 1,070 Updated Nov 15, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 35,470 4,119 Updated Nov 15, 2024

Context Manager to profile the forward and backward times of PyTorch's nn.Module

Python 83 4 Updated Oct 10, 2023

🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.

Python 54 7 Updated Nov 6, 2024

Large World Model -- Modeling Text and Video with Millions Context

Python 7,150 552 Updated Oct 19, 2024

Compare different hardware platforms via the Roofline Model for LLM inference tasks.

Jupyter Notebook 74 4 Updated Mar 13, 2024
Python 10 3 Updated Jul 24, 2024

Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters

Python 104 4 Updated Sep 23, 2024

NeuPIMs Simulator

Jupyter Notebook 53 12 Updated Jun 19, 2024

ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference

C++ 67 11 Updated Nov 12, 2024

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

Python 58 8 Updated Oct 24, 2024

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 311 37 Updated Sep 11, 2024

Processing-In-Memory (PIM) Simulator

C++ 129 44 Updated Jul 9, 2024

Computer Architecture -VLSI -Verilog Codes-Xilinx-Irsim

Verilog 12 2 Updated May 8, 2021

Official repo for "On the Generalization Ability of Retrieval-Enhanced Transformers"

Python 35 5 Updated Jun 4, 2024

Graph Partitioning for Large-scale Graph Datasets

C++ 89 13 Updated Dec 14, 2021

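Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant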

Python 34,601 3,618 Updated Sep 23, 2024

WeChat chat history export and WeChat annual report generation! Record your 2023!

Python 122 13 Updated Jan 16, 2024

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 32,054 5,575 Updated Nov 9, 2024

LlamaIndex is a data framework for your LLM applications

Python 36,764 5,272 Updated Nov 15, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 30,271 4,587 Updated Nov 16, 2024

A tutorial of building an LSM-Tree storage engine in a week.

Rust 2,883 400 Updated Nov 13, 2024