Starred repositories
This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
GLM-4 series: Open Multilingual Multimodal Chat LMs
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
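Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Chinese-language repository for Llama3 and Llama3.1 (work in progress... interesting fine-tuned and modified weights from community members and vendors, plus tutorial videos and docs for training, inference, evaluation, and deployment)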
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Finetune Llama-3-8b on the MathInstruct dataset
A family of compressed models obtained via pruning and knowledge distillation
PaL: Program-Aided Language Models (ICML 2023)
Mix of Minimal Optimal Sets (MMOS) is a dataset with two advantages for math reasoning: higher performance and lower construction costs.
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
Train transformer language models with reinforcement learning.
A curated list of language modeling research for code and related datasets.
Code LLMs: data processing for pre-training, fine-tuning, and DPO; state-of-the-art industry processing pipeline
Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
OrangeX4 / latex2sympy
Forked from purdue-tlt/latex2sympy
Parse LaTeX math expressions
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
[ICML 2024] AlphaZero-like tree search can guide large language model decoding and training
Hackable and optimized Transformers building blocks, supporting a composable construction.
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capa…
This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting