  • Fudan University
  • Shanghai

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda 773 121 Updated Jul 29, 2023

How to optimize some algorithms in CUDA.

Cuda 1,282 106 Updated Jul 29, 2024

Dive into LLMs (《动手学大模型》): a series of hands-on programming tutorials

2,822 233 Updated Jul 3, 2024

Video+code lecture on building nanoGPT from scratch

Python 3,136 400 Updated Jul 26, 2024

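C++ learning resources, newly compiled in 2021, covering new features of C++ 11 / 14 / 17 / 20 / 23, introductory tutorials, recommended books, quality articles, study notes, teaching videos, and more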

C++ 4,579 980 Updated Jun 8, 2022

LLM101n: Let's build a Storyteller

26,086 1,390 Updated Jul 29, 2024

HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models

C++ 919 198 Updated Jul 21, 2024

This project aims to share the technical principles behind large models, along with hands-on experience.

HTML 8,197 800 Updated Jul 28, 2024

《简单粗暴 LaTeX》: the open-source repo for my published LaTeX book.

TeX 1,538 210 Updated Jan 22, 2021

Model deployment white paper (CUDA|ONNX|TensorRT|C++) 🚀🚀🚀

160 37 Updated Jul 11, 2024

Source code for the book 《自己动手写docker》 (Write Your Own Docker)

Go 1,946 556 Updated Dec 6, 2021

12 Weeks, 24 Lessons, AI for All!

Jupyter Notebook 33,477 5,516 Updated Jul 25, 2024

An amateur personal translation of 《C++ Templates The Complete Guide - second edition》

TeX 225 41 Updated Dec 28, 2022

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,708 395 Updated Jul 15, 2024

LLM inference in C/C++

C++ 62,746 8,994 Updated Jul 30, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 23,889 3,432 Updated Jul 30, 2024

An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large-model course series

Jupyter Notebook 10,838 1,297 Updated Jul 21, 2024

High-Performance C++ Fundamental Library

C++ 382 55 Updated Jul 27, 2024

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 5,815 661 Updated Jul 29, 2024

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

Python 2,686 325 Updated Jan 2, 2024

Ongoing research training transformer models at scale

Python 9,537 2,154 Updated Jul 30, 2024

Making large AI models cheaper, faster and more accessible

Python 38,417 4,316 Updated Jul 30, 2024

A Cloud Native Batch System (Project under CNCF)

Go 3,965 918 Updated Jul 30, 2024

Header-only C++/python library for fast approximate nearest neighbors

C++ 4,187 615 Updated Jul 27, 2024

简单粗暴 TensorFlow 2 | A Concise Handbook of TensorFlow 2 | A concise introductory guide to TensorFlow 2

Jupyter Notebook 3,938 844 Updated Mar 21, 2023

Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 1,713 203 Updated Jun 2, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 34,904 3,663 Updated Jul 28, 2024

《大语言模型》 (Large Language Models), by Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, and Wen Jirong

2,008 137 Updated Apr 22, 2024

Foundations of large models: a one-article overview of large-model fundamentals

2,150 196 Updated Jul 11, 2024

📚 Geek Time (极客时间) e-books

9,722 3,231 Updated Jan 26, 2023