Stars
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
PolyLUT is the first quantized neural network training methodology that maps a neuron to a LUT while using multivariate polynomial function learning to exploit the flexibility of the FPGA soft logic.
NN Training at the Edge with PyTorch, PYNQ and an Ultra96-V2 board
Allo: A Programming Model for Composable Accelerator Design
Run stable-diffusion-webui with Radeon RX 580 8GB on Ubuntu 22.04.2 LTS
VHDL module for running operations from memory, with the software also written in VHDL
Llama 2 inference using only Python and NumPy
This repo contains the code for the DIY Smart Watch project using M5StickC
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…
GPGPU-Sim with the Turing WMMA API enabled, and its benchmark results. Undergraduate study at Yonsei University.
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"