snowpeakz

Starred repositories

C++ builds C++

C++ 7 Updated Sep 27, 2024

ByteDance's Recommendation System

Python 868 124 Updated Nov 8, 2023

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.

Cuda 590 132 Updated Sep 23, 2024

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 2,597 405 Updated Sep 27, 2024

CUDA Core Compute Libraries

C++ 1,164 138 Updated Sep 27, 2024

C++ 470 84 Updated Sep 26, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,558 254 Updated Sep 23, 2024

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 128 25 Updated Aug 25, 2024

cuDF - GPU DataFrame Library

C++ 8,302 886 Updated Sep 27, 2024

A collection of compiler learning resources.

Python 2,074 324 Updated May 27, 2024

How to learn PyTorch and OneFlow

332 20 Updated Mar 22, 2024

How to optimize some algorithms in CUDA.

Cuda 1,465 121 Updated Sep 25, 2024

Fast and memory-efficient exact attention

Python 13,564 1,243 Updated Sep 27, 2024

Awesome-LLM: a curated list of Large Language Models

17,605 1,429 Updated Sep 23, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,428 916 Updated Sep 25, 2024

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021

Python 54 12 Updated Jul 21, 2021

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 6,808 2,933 Updated Sep 12, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 132,650 26,434 Updated Sep 27, 2024

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,173 328 Updated May 16, 2023

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Python 303 48 Updated Apr 11, 2023

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

C++ 452 34 Updated Mar 15, 2024

Transformer related optimization, including BERT, GPT

C++ 5,797 886 Updated Mar 27, 2024

Ongoing research training transformer models at scale

Python 10,116 2,278 Updated Sep 27, 2024

Inference code for Llama models

Python 55,659 9,495 Updated Aug 18, 2024

Training and serving large-scale neural networks with auto parallelization.

Python 3,051 353 Updated Dec 9, 2023

ImageBind: One Embedding Space to Bind Them All

Python 8,243 758 Updated Jul 31, 2024

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 6,161 1,777 Updated Jul 26, 2024

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

C 20,340 3,855 Updated Sep 8, 2024

Awesome resources for GPUs

467 47 Updated Jul 1, 2023

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 5,873 667 Updated Sep 6, 2024