Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…

Python 1,178 108 Updated Apr 3, 2024

tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,171 4,017 Updated Jul 17, 2024

huggingface / text-clustering

Easily embed, cluster and semantically label text datasets

Python 404 26 Updated Mar 28, 2024

huggingface / cosmopedia

Python 358 31 Updated Jul 17, 2024

wdndev / llm_interview_note

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 1,559 182 Updated Jun 2, 2024

YuchuanTian / RethinkTinyLM

[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”

Python 109 6 Updated Jul 8, 2024

charent / Phi2-mini-Chinese

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 433 46 Updated Jul 11, 2024

Ledzy / BAdam

Python 172 12 Updated Jul 17, 2024

rasbt / LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 23,098 2,378 Updated Jul 19, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 12,445 1,107 Updated Jul 15, 2024

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,361 432 Updated May 3, 2024

deepseek-ai / DeepSeek-Coder-V2

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

1,390 68 Updated Jul 3, 2024

LiveBench / LiveBench

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Python 139 11 Updated Jul 18, 2024

princeton-nlp / LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Python 494 38 Updated Mar 4, 2024

EleutherAI / pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Jupyter Notebook 2,149 157 Updated Jul 12, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 34,899 5,369 Updated Jul 14, 2024

deepseek-ai / DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 923 41 Updated Jan 16, 2024

hadasah / btm

Python 66 6 Updated Apr 29, 2024

LilitYolyan / CutPaste

Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch

Python 111 25 Updated Apr 21, 2022

lukasruff / Classification-AD

Repository for the paper "Rethinking Assumptions in Anomaly Detection"

Python 35 8 Updated Oct 3, 2023

hendrycks / outlier-exposure

Deep Anomaly Detection with Outlier Exposure (ICLR 2019)

Python 537 106 Updated Oct 9, 2021

xuhongzuo / rosas

official implementation of RoSAS: Deep Semi-supervised Anomaly Detection with Contamination-resilient Continuous Supervision

Python 8 3 Updated Jul 18, 2023

OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Python 4,445 321 Updated Jul 16, 2024

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)

TypeScript 20,971 1,209 Updated Jul 18, 2024

arcee-ai / mergekit

Tools for merging pretrained large language models.

Python 4,122 359 Updated Jul 19, 2024

QwenLM / Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 6,324 359 Updated Jul 18, 2024

Starred topics

one-class-learning

sequence-prediction

knowledge-distillation

neural-architecture-search