Raphael-Hao
- Shanghai Jiao Tong University
- Shanghai
- raphael-hao.top
Starred repositories (sorted by recently starred)
- A minimal GPU design in Verilog to learn how GPUs work from the ground up
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
- Create LLM agents with long-term memory and custom tools 📚🦙
- Umami is a simple, fast, privacy-focused alternative to Google Analytics.
- Start building LLM-empowered multi-agent applications in an easier way.
- 🕸 A Node app for creating a Feed Reader in Notion.
- QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
- 📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
- Envision a world where EVERY student can read ALL the code of a teaching operating system.
- rFaaS: a high-performance FaaS platform with RDMA acceleration for low-latency invocations.
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- Generative Agents: Interactive Simulacra of Human Behavior
- CoreNet: A library for training deep neural networks
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.