
BLAS-like Library Instantiation Software Framework

C 2,217 362 Updated Jul 17, 2024

Next generation BLAS implementation for ROCm platform

C++ 333 153 Updated Jul 29, 2024

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

Python 2,434 169 Updated Jul 25, 2024

Fast CUDA matrix multiplication from scratch

Cuda 373 47 Updated Dec 28, 2023

Step-by-step optimization of CUDA SGEMM

Cuda 187 32 Updated Mar 30, 2022

Source code for SIGMOD 2020 paper "Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination"

C++ 43 11 Updated Jul 17, 2020

[EMNLP 2023] Adapting Language Models to Compress Long Contexts

Python 253 17 Updated Feb 26, 2024

[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"

Python 1,112 149 Updated Jul 26, 2024

Jupyter Notebook 144 38 Updated Jun 29, 2020

Python Packaging User Guide

Python 1,394 823 Updated Jul 29, 2024

CMake for C++ Best Practices

CMake 1,026 111 Updated May 27, 2024

Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.

7,959 875 Updated Jul 11, 2024

An elegant \LaTeX\ résumé template. Mainland China mirror: https://gods.coding.net/p/resume/git

TeX 9,008 2,560 Updated Mar 15, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,707 395 Updated Jul 15, 2024

Dumpy: A Compact and Adaptive Index for Large Data Series Collections (SIGMOD'23)

C++ 8 2 Updated Dec 12, 2023

Open-source vector similarity search for Postgres

C 10,759 484 Updated Jul 27, 2024

LLM inference in C/C++

C++ 62,733 8,991 Updated Jul 30, 2024

Seed guided neural metric learning approach for calculating trajectory similarities

Python 49 13 Updated Jul 9, 2019

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,119 574 Updated Jul 26, 2024

Python 253 31 Updated Apr 2, 2024

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,095 531 Updated Jul 24, 2024

Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467

Python 254 23 Updated Aug 5, 2023

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Python 331 29 Updated Jul 26, 2024

Examples from Programming in Parallel with CUDA

Cuda 94 38 Updated Mar 17, 2023

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 35,951 4,419 Updated Jul 30, 2024

Running inference on the ZeroSCROLLS benchmark

Python 18 2 Updated Apr 18, 2024

Official repository for LongChat and LongEval

Python 501 29 Updated May 24, 2024

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 9,893 1,146 Updated Jul 26, 2024

StableLM: Stability AI Language Models

Jupyter Notebook 15,850 1,037 Updated Apr 8, 2024

A toolkit for machine learning from time series

Python 914 98 Updated Jul 29, 2024