- SiliconFlow
- Neverland
- https://mard1no.github.io/
Starred repositories
- SGLang is yet another fast serving framework for large language models and vision language models.
- Utilities intended for use with Llama models.
- QQQ is an innovative and hardware-optimized W4A8 quantization solution.
- AMD's C++ library for accelerating tensor primitives.
- Debug print operator for CUDA graph debugging.
- ms-swift: use PEFT or full-parameter training to fine-tune 300+ LLMs or 50+ MLLMs (Qwen2, GLM4v, InternLM2.5, Yi, Llama 3.1, LLaVA-Video, InternVL2, MiniCPM-V, DeepSeek, Baichuan2, Gemma2, Phi-3-Vision, ...).
- Integrate MS-AMP into nanoGPT (https://github.com/karpathy/nanoGPT).
- Ongoing research training transformer models at scale.
- fanshiqing/grouped_gemm (forked from tgale96/grouped_gemm): PyTorch bindings for CUTLASS grouped GEMM.
- Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory.
- Fast and memory-efficient exact attention.
- A cross-chip platform collection of operators and a unified neural network library.
- BizyAir: Comfy nodes that can run in any environment.
- A fast communication-overlapping library for tensor parallelism on GPUs.
- NVIDIA Math Libraries for the Python Ecosystem.
- Efficient operator implementations based on the Cambricon Machine Learning Unit (MLU).
- PaddleAPEX: Paddle Accuracy and Performance EXpansion pack.
- A large-scale simulation framework for LLM inference.
- YaRN: Efficient Context Window Extension of Large Language Models.
- Shared Middle-Layer for Triton Compilation.