OLMoE: Open Mixture-of-Experts Language Models
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
BigCodeBench: Benchmarking Code Generation Towards AGI
Home of StarCoder: fine-tuning & inference!
Language models scale reliably with over-training and on downstream tasks
A Survey on Data Selection for Language Models
Generative Representational Instruction Tuning
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Evaluation of BLOOM on the HumanEval benchmark
A Scandinavian Benchmark for sentence embeddings
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
A framework for few-shot evaluation of language models.
Modeling, training, eval, and inference code for OLMo
Data and tools for generating and inspecting OLMo pre-training data.
A framework for the evaluation of autoregressive code generation language models.
Retrieval and Retrieval-augmented LLMs
🐙 OctoPack: Instruction Tuning Code Large Language Models
Scaling Data-Constrained Language Models
BLOOM+1: Adapting BLOOM model to support a new unseen language
Crosslingual Generalization through Multitask Finetuning