AndySong20
36 results for source starred repositories

A C++ header-only HTTP/HTTPS server and client library

C++ 12,528 2,225 Updated Aug 10, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 509 36 Updated Aug 15, 2024

Reference implementations of MLPerf™ training benchmarks

Python 1,588 550 Updated Aug 14, 2024

A Python framework for high performance GPU simulation and graphics

Python 3,990 216 Updated Aug 16, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,929 863 Updated Aug 14, 2024

Vim configuration

Vim Script 4,887 1,809 Updated Nov 16, 2023

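Generates subtitles from video and audio and outputs SRT files. No third-party API is required; audio-to-text conversion runs locally. A Transformer-based framework for video subtitle generation. A GUI tool for generating subtitles from videos and generating SRT files.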

Python 760 154 Updated Feb 1, 2024

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 9,820 767 Updated May 19, 2024

CUDA Library Samples

Cuda 1,472 306 Updated Aug 15, 2024

Transformer related optimization, including BERT, GPT

C++ 5,717 882 Updated Mar 27, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 24,883 3,595 Updated Aug 17, 2024

Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.

Python 435 131 Updated Aug 16, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 11,256 2,339 Updated Aug 17, 2024

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 5,940 1,729 Updated Jul 26, 2024

Python 23 11 Updated Apr 15, 2023

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 1,747 287 Updated Aug 17, 2024

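AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks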

Jupyter Notebook 10,071 1,448 Updated Aug 15, 2024

Stable Diffusion web UI

Python 138,044 26,244 Updated Aug 13, 2024

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference

C++ 6,659 2,143 Updated Aug 17, 2024

oneAPI Math Kernel Library (oneMKL) Interfaces

C++ 600 154 Updated Aug 16, 2024

cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it

C++ 402 79 Updated Aug 16, 2024

Tensors and dynamic neural networks in Python with strong GPU acceleration

Python 81,375 21,844 Updated Aug 17, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,209 578 Updated Aug 16, 2024

Development repository for the Triton language and compiler

C++ 12,251 1,476 Updated Aug 17, 2024

Fast and memory-efficient exact attention

Python 12,962 1,169 Updated Aug 16, 2024

A High Performance Metadata System for Kubernetes

Go 758 79 Updated May 13, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,106 868 Updated Aug 16, 2024

A library for efficient similarity search and clustering of dense vectors.

C++ 30,021 3,521 Updated Aug 16, 2024

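The officially designated global GitHub for Runxue (润学): collects the aims, program, theory, and assorted real-world examples of "running"; addresses three major questions: why to run, where to run to, and how to run; and aims to become the core religion and core belief of the new Chinese people.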

31,361 2,601 Updated Jul 31, 2024