Starred repositories
Trade autonomously on Polymarket using AI Agents
PC Software for BambuLab and other 3D printers
Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?
Empowering RAG with a memory-based data interface for all-purpose applications!
Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate"
Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead
YOLOv10 trained on DocLayNet dataset.
Memory optimized finetuning scripts for CogVideoX using TorchAO and DeepSpeed
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Allows adding ChatGPT as a search engine in Firefox.
Latte: Latent Diffusion Transformer for Video Generation.
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Label Studio is a multi-type data labeling and annotation tool with standardized output format
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.
Towards Robust Evaluation for Geospatial Foundation Models
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Real-time behaviour synthesis with MuJoCo, using Predictive Control
A scene exporter from Blender to the MuJoCo physics engine
Mean Average Precision for Object Detection
mean Average Precision - This code evaluates the performance of your neural net for object recognition.
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
Official code for "DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut", NeurIPS 2024
[CVPR 2024] On the Content Bias in Fréchet Video Distance
Notebooks for fine-tuning PaliGemma