TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 6,996 746 Updated May 31, 2024

triton-lang / triton

Development repository for the Triton language and compiler

C++ 11,552 1,354 Updated Jun 2, 2024

efficient / libcuckoo

A high-performance, concurrent hash table

C++ 1,554 269 Updated Apr 6, 2024

sql-machine-learning / sqlflow

Brings SQL and AI together.

Go 5,034 697 Updated Apr 18, 2024

karpathy / llama2.c

Inference Llama 2 in one file of pure C

C 16,475 1,891 Updated May 29, 2024

fmtlib / fmt

A modern formatting library

C++ 19,605 2,404 Updated Jun 2, 2024

wzh99 / relay-mlir

An MLIR-based toy DL compiler for TVM Relay.

C++ 43 5 Updated Oct 16, 2022

KEKE046 / mlir-tutorial

Hands-On Practical MLIR Tutorial

C++ 164 27 Updated Oct 20, 2023

facebookincubator / velox

A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.

C++ 3,201 1,062 Updated Jun 2, 2024

NVIDIA / FasterTransformer

Transformer-related optimization, including BERT, GPT

C++ 5,552 866 Updated Mar 27, 2024

bytedance / lightseq

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,112 325 Updated May 16, 2023

ztxz16 / fastllm

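A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices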

C++ 3,129 309 Updated Jun 1, 2024

THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Python 15,550 1,841 Updated Apr 11, 2024

THUDM / GLM

GLM (General Language Model)

Python 3,060 316 Updated Nov 3, 2023

li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & more LLMs

C++ 2,657 318 Updated Apr 29, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 11,377 1,003 Updated May 31, 2024

NVlabs / tiny-cuda-nn

Lightning fast C++/CUDA neural network framework

C++ 3,477 430 Updated May 21, 2024

ymcui / Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment

Python 17,677 1,818 Updated Apr 30, 2024

skeeto / w64devkit

Portable C and C++ Development Kit for x64 (and x86) Windows

C 2,548 176 Updated May 29, 2024

ggerganov / llama.cpp

LLM inference in C/C++

C++ 59,440 8,457 Updated Jun 2, 2024

yizhongw / self-instruct

Aligning pretrained language models with instruction data generated by themselves.

Python 3,862 458 Updated Mar 27, 2023

meta-llama / llama

Inference code for Llama models

Python 53,723 9,259 Updated May 15, 2024

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 127,009 25,151 Updated Jun 2, 2024

huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 7,166 838 Updated Jun 1, 2024

pytorch / pytorch

Tensors and dynamic neural networks in Python with strong GPU acceleration

Python 79,052 21,299 Updated Jun 2, 2024

James Jiang jiangzihao2009