Erland366's starred repositories
Sorted by: recently starred
🔍 A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
SOTA weight-only quantization algorithm for LLMs. The official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Named Tensors for Legible Deep Learning in JAX
The official PyTorch implementation of Google's Gemma models
anthonix / llm.c (forked from karpathy/llm.c): LLM training in simple, raw C/HIP for AMD GPUs
An OpenAI-compatible API for chat with image inputs and questions about those images, i.e. multimodal.
📚 Download the full collection of Paul Graham essays in EPUB, PDF & Markdown for easy reading.
The implementation of the paper "Token-wise Influential Training Data Retrieval for Large Language Models" (accepted at ACL 2024).
TORAX: Tokamak transport simulation in JAX
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
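As background for the weight-only quantization entries above, a minimal group-wise round-to-nearest (RTN) 4-bit quantizer can be sketched in a few lines of NumPy. This is a generic illustration of weight-only 4-bit quantization, not the AWQ algorithm itself (AWQ additionally rescales salient weight channels using activation statistics before quantizing), and the function names here are hypothetical:

```python
import numpy as np

def quantize_int4_groupwise(w, group_size=128):
    """Asymmetric round-to-nearest 4-bit quantization with per-group params.

    Generic weight-only RTN sketch for illustration only -- NOT the AWQ
    algorithm, which also applies activation-aware channel scaling.
    """
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0  # 16 representable levels: 0..15
    q = np.clip(np.round((groups - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    # Reconstruct approximate float weights from codes + per-group params
    return q.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale, w_min = quantize_int4_groupwise(w)
err = np.abs(dequantize(q, scale, w_min) - w.reshape(-1, 128)).max()
print(f"max abs reconstruction error: {err:.4f}")
```

The per-group min/max keeps the worst-case rounding error to about half a quantization step per group, which is why smaller group sizes trade memory overhead for accuracy.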
Experimental q[X]ora kernel development code
Scalable toolkit for efficient model alignment
Fast lexical search library implementing BM25 in Python using Scipy (on average 2x faster than Elasticsearch in a single-threaded setting)
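For context on the BM25 entry above: the classic Okapi BM25 score is a simple function of term frequency, document frequency, and document length. The sketch below is a plain-Python illustration of that formula, not the API of the library in question:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document against the query with classic Okapi BM25.

    docs: list of token lists. Returns one float score per document.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency: number of docs containing each term
    df = Counter()
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # term frequency saturation (k1) and length normalization (b)
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores

docs = [["fast", "lexical", "search"],
        ["slow", "dense", "search"],
        ["fast", "search", "engine"]]
scores = bm25_scores(["fast", "search"], docs)
```

Vectorizing the term-frequency and document-length terms into a sparse matrix (e.g. with Scipy) is what makes library implementations of this formula fast.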
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frameworks.
Proof-of-concept of global switching between numpy/jax/pytorch in a library.
LeiWang1999 / vllm-bitblas (forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs
MambaOut: Do We Really Need Mamba for Vision?
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Collective communications library with various primitives for multi-machine training.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization