Jiacheng-Zhu-AIML

DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization

Python 27 6 Updated Jan 31, 2023

Official Implementation of SWAD (NeurIPS 2021)

Python 151 18 Updated Dec 10, 2022

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,672 87 Updated Jan 21, 2024

Toolkit for creating, sharing and using natural language prompts.

Python 2,629 345 Updated Oct 23, 2023

Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch

Python 270 23 Updated Jun 17, 2024

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Python 1,330 64 Updated Mar 8, 2024
Python 481 44 Updated Aug 15, 2024

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,614 50 Updated Aug 12, 2024

The official Meta Llama 3 GitHub site

Python 25,719 2,863 Updated Aug 12, 2024
Python 1,146 163 Updated Aug 21, 2024

[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"

Python 62 9 Updated Jun 6, 2024

Model Stock: All we need is just a few fine-tuned models

75 Updated Mar 29, 2024

Official code for "Large Language Models as Optimizers"

Python 362 38 Updated Aug 16, 2024

Code examples and resources for DBRX, a large language model developed by Databricks

Python 2,493 234 Updated May 1, 2024

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 11,584 783 Updated Aug 15, 2024
Jupyter Notebook 83 6 Updated Apr 27, 2024

Official repository of Evolutionary Optimization of Model Merging Recipes

Python 1,173 81 Updated Mar 30, 2024

Sphinx theme from Read the Docs

Sass 4,728 1,728 Updated Aug 20, 2024
Jupyter Notebook 101 11 Updated Sep 20, 2023

Grok open release

Python 49,374 8,328 Updated Aug 7, 2024
Python 4,076 509 Updated Mar 19, 2024

Official implementation for Sparse MetA-Tuning (SMAT)

Python 9 Updated Jun 29, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,321 139 Updated Jun 3, 2024

[NAACL 2024] Embodied Executable Policy Learning with Language-based Scene Summarization

Python 4 Updated Mar 13, 2024
Python 24 1 Updated Nov 23, 2023
Python 409 15 Updated Nov 2, 2023

Official PyTorch Implementation of "Learning to Learn with Generative Models of Neural Network Checkpoints"

Python 332 21 Updated Oct 3, 2022

Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answer based on user queries.

Python 134 9 Updated Feb 9, 2024