Java 10 3 Updated Apr 29, 2024

Implementations of some methods for LLM KV cache sparsity

Python 21 2 Updated Jun 6, 2024

Android GPU Inspector

Go 944 138 Updated Oct 15, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,253 858 Updated Sep 13, 2024

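A practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…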

Python 64,813 8,009 Updated Oct 15, 2024

Distribute and run LLMs with a single file.

C++ 19,783 996 Updated Oct 14, 2024

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Python 2,117 380 Updated Oct 15, 2024

Multi-Candidate Speculative Decoding

Python 28 5 Updated Apr 22, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 1 Updated Mar 19, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,598 509 Updated Oct 4, 2024

FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…

TypeScript 17,381 4,666 Updated Oct 15, 2024

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

Python 797 80 Updated Sep 27, 2024

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 9,976 819 Updated Jun 10, 2024

Inference Llama 2 in one file of pure C

C 17,314 2,056 Updated Aug 6, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,922 409 Updated Sep 6, 2024

A collection of compiler learning resources.

Python 2,097 325 Updated May 27, 2024
Java 1 Updated Oct 26, 2023

TFLite Support is a toolkit that helps users develop ML and deploy TFLite models onto mobile / IoT devices.

C++ 373 126 Updated Oct 10, 2024

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation

C++ 4,758 540 Updated Sep 26, 2024

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 20,286 4,153 Updated Oct 15, 2024

Arm NEON optimization practice

C++ 387 103 Updated Dec 22, 2020

Apache NuttX is a mature, real-time embedded operating system (RTOS)

C 2,760 1,146 Updated Oct 15, 2024

Building core development fundamentals

C 6,266 1,010 Updated Aug 29, 2024

On-Device Training Under 256KB Memory [NeurIPS'22]

Python 1 Updated Dec 2, 2022

Models and examples built with TensorFlow

Python 2 Updated May 21, 2020

IEEE 802.11 a/g/p Transceiver

C++ 2 Updated Oct 22, 2020

software center for hnd/axhnd/axhnd.675x routers

Classic ASP 2 Updated Jan 7, 2021

GNU Radio – the Free and Open Software Radio Ecosystem

C++ 2 Updated Jan 7, 2021