Starred repositories
The Gretel Python Client allows you to interact with the Gretel REST API.
Aligning pretrained language models with instruction data generated by themselves.
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…
Code and documentation to train Stanford's Alpaca models, and generate the data.
Easily embed, cluster and semantically label text datasets
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
Phi2-Chinese-0.2B: Train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Fast and memory-efficient exact attention
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
LiveBench: A Challenging, Contamination-Free LLM Benchmark
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
The hub for EleutherAI's work on interpretability and learning dynamics
The simplest, fastest repository for training/finetuning medium-sized GPTs.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch
Repository for the paper "Rethinking Assumptions in Anomaly Detection"
Deep Anomaly Detection with Outlier Exposure (ICLR 2019)
Official implementation of RoSAS: Deep Semi-supervised Anomaly Detection with Contamination-resilient Continuous Supervision
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
Tools for merging pretrained large language models.
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.