muse-coder
  • Chongqing University
  • Chongqing

An implementation of sgemm_kernel optimized for the L1d cache.

Assembly 213 33 Updated Feb 26, 2024

A PyTorch implementation of Transformer in "Attention is All You Need"

Python 103 28 Updated Dec 6, 2020

Transformer: PyTorch Implementation of "Attention Is All You Need"

Python 2,536 394 Updated Apr 17, 2024

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Python 8,636 1,954 Updated Apr 16, 2024
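
The Transformer repositories above all implement the same core primitive. As a point of reference, the scaled dot-product attention from "Attention Is All You Need" can be sketched in plain NumPy (an illustrative sketch, not code from any of these repositories):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, as defined in "Attention Is All You Need"."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows are probability distributions
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 queries, head dim 8
K = rng.standard_normal((6, 8))   # 6 keys
V = rng.standard_normal((6, 8))   # 6 values
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

Each output row is a convex combination of the value rows, with weights given by the softmax over query-key similarities.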
Python 41 15 Updated Nov 18, 2019

Yinghan's Code Sample

Cuda 259 49 Updated Jul 25, 2022

NVDLA (An Opensource DL Accelerator Framework) implementation on FPGA.

Verilog 284 57 Updated Dec 27, 2023

This is the top-level repository for the Accel-Sim framework.

Python 271 105 Updated Jul 14, 2024
Python 2 8 Updated Oct 25, 2018

Open Source Specialized Computing Stack for Accelerating Deep Neural Networks.

Jupyter Notebook 196 73 Updated Apr 22, 2019

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 2,159 159 Updated Jul 16, 2024
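
The intuition behind AWQ's activation-aware quantization can be sketched in NumPy. This is a toy illustration of the core idea only, not the paper's algorithm: the scaling rule and the 0.25 exponent below are hypothetical stand-ins for AWQ's searched per-channel scaling factors.

```python
import numpy as np

def quantize_rows(w, bits=4):
    """Symmetric absmax quantization with one scale per row
    (per output channel when rows are output channels); returns dequantized w."""
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(w).max(axis=1, keepdims=True) / qmax
    step[step == 0] = 1.0                 # guard against all-zero rows
    return np.round(w / step) * step

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 16)).astype(np.float32)  # calibration activations
X[:, 3] *= 50.0                      # input channel 3 carries large activations
W = rng.standard_normal((16, 16)).astype(np.float32)   # (in, out) weight matrix

# AWQ's observation: weights tied to high-magnitude activation channels
# matter most. Scale those input channels of W up before quantizing and
# fold 1/s back in afterwards, so the salient weights keep more precision.
s = np.abs(X).mean(axis=0) ** 0.25   # hypothetical scaling rule, not AWQ's search
W_plain = quantize_rows(W.T).T                           # per-output-channel baseline
W_scaled = quantize_rows((W * s[:, None]).T).T / s[:, None]

err_plain = np.abs(X @ W - X @ W_plain).mean()
err_scaled = np.abs(X @ W - X @ W_scaled).mean()
print(err_plain, err_scaled)  # scaling lowers the output error on the salient channel
```

The point of the comparison is that at the same bit width, rescaling shifts quantization error away from the channels where the activations are large.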

This repository contains integer operators on GPUs for PyTorch.

Python 159 48 Updated Sep 29, 2023

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,188 4,020 Updated Jul 17, 2024

This project aims to share the technical principles behind large language models, along with hands-on practical experience.

HTML 8,050 785 Updated Jul 17, 2024

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 3,282 244 Updated Jul 23, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups at small-to-medium batch sizes of 16-32 tokens.

Python 469 34 Updated Jul 10, 2024

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,150 429 Updated Jul 17, 2024

The official repo of Qwen (通义千问), the chat and pretrained large language model proposed by Alibaba Cloud.

Python 12,763 1,030 Updated Jun 27, 2024

An easy-to-use PyTorch-to-TensorRT converter

Python 4,493 670 Updated Jun 17, 2024

🐩 🐩 🐩 TensorRT 2022 competition finals solution: TensorRT inference optimization for MST++, the first Transformer-based image reconstruction model

Python 130 19 Updated Jul 6, 2022
Python 548 50 Updated Jun 19, 2024

Accessible large language models via k-bit quantization for PyTorch.

Python 5,796 588 Updated Jul 23, 2024
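
The k-bit quantization idea behind libraries like the one above can be sketched generically: split the tensor into fixed-size blocks and store one float scale per block plus low-bit integer codes. This is an illustrative absmax scheme, not the library's actual API or storage format:

```python
import numpy as np

def quantize_blockwise(x, bits=8, block=64):
    """Blockwise absmax quantization: one float scale per block of
    `block` elements, plus signed integer codes of width `bits`."""
    qmax = 2 ** (bits - 1) - 1
    flat = x.reshape(-1, block)                       # size must divide evenly
    scales = np.abs(flat).max(axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0                         # avoid division by zero
    codes = np.round(flat / scales).astype(np.int8)
    return codes, scales

def dequantize_blockwise(codes, scales, shape):
    return (codes * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 64)).astype(np.float32)
codes, scales = quantize_blockwise(w)
w_hat = dequantize_blockwise(codes, scales, w.shape)
# per element, the reconstruction error is at most half a quantization step
print(np.abs(w - w_hat).max(), scales.max() / 2)
```

Blockwise scales localize the damage from outliers: a single large value inflates the quantization step only for its own block, not for the whole tensor.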

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 3,479 309 Updated Jul 23, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,228 2,452 Updated Jul 15, 2024

Inference code for Llama models

Python 54,354 9,332 Updated Jul 23, 2024

LLM training in simple, raw C/CUDA

Cuda 22,222 2,461 Updated Jul 23, 2024

Fast and memory-efficient exact attention

Python 12,559 1,120 Updated Jul 23, 2024

Transformer related optimization, including BERT, GPT

C++ 5,674 878 Updated Mar 27, 2024

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 6,730 501 Updated Jun 14, 2024