@mit-han-lab

MIT HAN Lab

Efficient AI Computing. PI: Song Han

Pinned

  1. streaming-llm Public

    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks

    Python, 6.6k stars, 363 forks

  2. smoothquant Public

    [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

    Python, 1.2k stars, 144 forks

  3. llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python, 2.5k stars, 191 forks

  4. bevfusion Public archive

    [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

    Python, 2.3k stars, 417 forks

  5. once-for-all Public

    [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

    Python, 1.9k stars, 333 forks

  6. temporal-shift-module Public

    [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

    Python, 2.1k stars, 416 forks

Repositories

Showing 10 of 55 repositories
  • duo-attention Public

    DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

    Python, 321 stars, MIT license, 14 forks, 1 open issue, 0 open pull requests, updated Oct 31, 2024
  • efficientvit Public

    Efficient vision foundation models for high-resolution generation and perception.

    Python, 2,239 stars, Apache-2.0 license, 182 forks, 91 open issues, 0 open pull requests, updated Oct 29, 2024
  • vila-u Public

    VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

    Python, 100 stars, MIT license, 2 forks, 1 open issue, 0 open pull requests, updated Oct 24, 2024
  • Quest Public

    [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

    Cuda, 189 stars, 18 forks, 4 open issues, 1 open pull request, updated Oct 20, 2024
  • llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python, 2,484 stars, MIT license, 191 forks, 129 open issues, 6 open pull requests, updated Oct 16, 2024
  • hart Public

    HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

    Python, 275 stars, MIT license, 8 forks, 5 open issues, 1 open pull request, updated Oct 16, 2024
  • Block-Sparse-Attention Public

    A sparse attention kernel supporting mixed sparse patterns

    C++, 49 stars, BSD-3-Clause license, 0 forks, 2 open issues, 0 open pull requests, updated Oct 14, 2024
  • data-efficient-gans Public

    [NeurIPS 2020] Differentiable Augmentation for Data-Efficient GAN Training

    Python, 1,279 stars, BSD-2-Clause license, 175 forks, 27 open issues, 0 open pull requests, updated Sep 24, 2024
  • qserve Public

    QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

    Python, 425 stars, Apache-2.0 license, 22 forks, 26 open issues, 3 open pull requests, updated Sep 5, 2024
  • proxylessnas Public

    [ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

    C++, 1,422 stars, MIT license, 285 forks, 0 open issues, 2 open pull requests, updated Aug 30, 2024
