Gothamv
  • 11° 1' 23.48'' N, 77° 0' 20.82'' E


WIP

Python 70 1 Updated Aug 13, 2024

PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression

Python 195 45 Updated Sep 20, 2019

KAN (Kolmogorov–Arnold Networks) in the MLX framework

Python 6 2 Updated Aug 4, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 20,418 2,052 Updated Jul 18, 2024

A Machine Learning framework from scratch in Pure Mojo 🔥

Mojo 366 24 Updated Jul 19, 2024

All the handwritten notes 📝 and source code files 🖥️ used in my YouTube Videos on Machine Learning & Simulation (https://www.youtube.com/channel/UCh0P7KwJhuQ4vrzc3IRuw4Q)

Jupyter Notebook 797 176 Updated Jul 29, 2024

From the Transistor to the Web Browser, a rough outline for a 12 week course

5,111 426 Updated Oct 12, 2021

High Quality Resources on GPU Programming/Architecture

548 16 Updated Jul 26, 2024

Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.

Jupyter Notebook 5,283 1,334 Updated Jan 20, 2024

1.58 Bit LLM on Apple Silicon using MLX

Python 93 4 Updated May 10, 2024

A library for mechanistic interpretability of GPT-style language models

Python 1,324 263 Updated Aug 14, 2024

Materials for EACL2024 tutorial: Transformer-specific Interpretability

Jupyter Notebook 31 1 Updated Mar 26, 2024

Plot the vector graph of attention-based text visualisation

Python 363 58 Updated Apr 12, 2019

Video+code lecture on building nanoGPT from scratch

Python 63 10 Updated Jun 14, 2024

GPT-2 from scratch in MLX

Python 341 22 Updated Jun 12, 2024

Video+code lecture on building nanoGPT from scratch

Python 3,230 427 Updated Aug 13, 2024

1st Place Solution for LLM - Detect AI Generated Text Kaggle Competition

Python 137 24 Updated May 20, 2024

Cramming the training of a (BERT-type) language model into limited compute.

Python 1,278 100 Updated Jun 13, 2024

Llama 3 implementation, one matrix multiplication at a time

Jupyter Notebook 11,800 919 Updated May 23, 2024

Fast vector database made in NumPy

Python 734 37 Updated Apr 29, 2024

Build robust LLM applications with true composability 🔗

Python 407 28 Updated Jan 3, 2024

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,639 965 Updated Jul 10, 2024

Implementation of a Transformer, but completely in Triton

Python 233 13 Updated Apr 5, 2022

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Python 14,253 940 Updated Aug 14, 2024

Knowledge distillation papers

730 82 Updated Feb 10, 2023

Attention is all you need implementation

Jupyter Notebook 503 218 Updated Jun 8, 2024

Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…

Python 2,774 344 Updated May 8, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,310 139 Updated Jun 3, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 24,757 2,613 Updated Aug 14, 2024