Insights: hiyouga/LLaMA-Factory
Overview
4 Pull requests merged by 4 people
- update deepseek template (#4892, merged Jul 26, 2024)
- fix: Repair the issue where quantization failed after merging the adapter (#4950, merged Jul 26, 2024)
- Add ROCm support (#4970, merged Jul 26, 2024)
- Added the reference address for TRL PPO details (#4961, merged Jul 26, 2024)
1 Pull request opened by 1 person
- docs: add Japanese README (#4957, opened Jul 24, 2024)
43 Issues closed by 5 people
- Is multi-node training launched via accelerate launch & pdsh supported? (#4991, closed Jul 28, 2024)
- Out-of-memory error when loading datasets (#4981, closed Jul 27, 2024)
- When serving via vLLM's OpenAI-style API, the model field seems to accept any value regardless of the actual model without affecting API calls; is that right? (#4985, closed Jul 27, 2024)
- Error when trying to train llama3.1 (#4984, closed Jul 27, 2024)
- Learning rate stays at 0 during DPO/KTO training on NPU (#4980, closed Jul 27, 2024)
- Difference between vllm_maxlen and max_new_tokens (#4983, closed Jul 27, 2024)
- When deploying, should vLLM parameters such as vllm_maxlen and vllm_gpu_util go on the command line or in the YAML file? Could you give usage examples for these two parameters? (#4976, closed Jul 26, 2024)
- The Docker version does not work properly (#4975, closed Jul 26, 2024)
- Where exactly is the LoRA-related code? (#4972, closed Jul 26, 2024)
- Fine-tuning llama3.1 fails; the transformers library needs updating (#4962, closed Jul 25, 2024)
- What causes AttributeError: module 'torch' has no attribute 'float8_e4m3fn'? (#4958, closed Jul 25, 2024)
- Setting the dpo_beat parameter (#4967, closed Jul 25, 2024)
- Push model to Huggingface - credentials?? (#4968, closed Jul 25, 2024)
- Suggestion: add the InternLM2.5 model series (#4959, closed Jul 24, 2024)
- Error when fine-tuning glm4-9b and running llamafactory-cli train examples/train_lora/glm4_9b_lora_predict.yaml (#4955, closed Jul 24, 2024)
- What sequence length was the hardware-requirements section of the README tested on? (#4956, closed Jul 24, 2024)
- After merging the ChatGLM4 model, serving it through the official GLM4 open API fails on access (#4918, closed Jul 24, 2024)
- After quantizing a fine-tuned Qwen2-7B model, why is inference slower while VRAM usage shows no clear improvement? (#4920, closed Jul 24, 2024)
- Should I use both of CPT lora and SFT lora for inference? (#4930, closed Jul 24, 2024)
- Can open-source domain-specific models be added to the framework for PT/SFT training? How? (#4933, closed Jul 24, 2024)
- Error when training GLM4-9B-chat in the web UI; fine with a small dataset but fails once the dataset grows (#4928, closed Jul 24, 2024)
- PiSSA does not work correctly when use_unsloth: true (#4925, closed Jul 24, 2024)
- After extending the vocabulary during fine-tuning without exporting the model, adding new_special_tokens: "宋浩" (my newly introduced token) to the config file in the inference folder makes the model produce no output in chat (#4934, closed Jul 24, 2024)
- Can LoRA be used to train GPTQ- or AWQ-quantized models? Not enough resources to train the unquantized model (#4929, closed Jul 24, 2024)
- Can chat display the output tokens and token IDs? (#4932, closed Jul 24, 2024)
- P2P issue when starting the API (#4935, closed Jul 24, 2024)
- Question about learning-rate settings for Gemma2-9B SFT fine-tuning (#4937, closed Jul 24, 2024)
- Training loss is always 0 (#4936, closed Jul 24, 2024)
- LLaMA-Factory pretraining does not support DeepSpeed ZeRO stage 3 (#4938, closed Jul 24, 2024)
- Can Meta's new Meta-Llama-3.1-8B-Instruct model be supported? (#4942, closed Jul 24, 2024)
- Question about the meaning of Qwen2-7B-instruct training parameters (#4943, closed Jul 24, 2024)
- Default-parameter suggestion: consider lowering the learning rate for the SFT stage (#4944, closed Jul 24, 2024)
- Fine-tuning glm-4-9b-chat works, but inference errors out even though Transformers already meets the minimum version (#4945, closed Jul 24, 2024)
- Before any training, the loaded glm-4-9b model gives repetitive answers when tested (#4948, closed Jul 24, 2024)
- NPU out-of-memory when running a 70B model at 2-bit (#4951, closed Jul 24, 2024)
- Inference runs out of VRAM after LoRA-training Qwen2-72B-Instruct with the latest code (#4949, closed Jul 24, 2024)
- Chat results after loading the model in the web UI differ from responses after launching with api.py (#4926, closed Jul 23, 2024)
- Could joint training be supported? (#4921, closed Jul 22, 2024)
- Where is the LoRA implementation code? (#4922, closed Jul 22, 2024)
- GLM4-9B-chat model inference error (#4917, closed Jul 22, 2024)
18 Issues opened by 18 people
- Would you consider supporting scheduled sampling during training? (#4990, opened Jul 28, 2024)
- When will qwen-audio fine-tuning be supported? (#4989, opened Jul 28, 2024)
- llama 3.1-8b-chat reports parameter errors on both fine-tuning and chat loading (#4986, opened Jul 27, 2024)
- llama3.1 405B fp8 support (#4979, opened Jul 26, 2024)
- Error during single-GPU full-parameter fine-tuning of qwen2-1.5b (#4978, opened Jul 26, 2024)
- After fine-tuning, does a GLM model deployed with vLLM support function calling? (#4977, opened Jul 26, 2024)
- Can pytorch/pytorch be used as the PyTorch Docker image? (#4974, opened Jul 26, 2024)
- Model inference hangs indefinitely (#4971, opened Jul 26, 2024)
- Why do LoRA fine-tuning on all linear layers and full fine-tuning of qwen2 0.5b take roughly the same time? (#4966, opened Jul 25, 2024)
- API service deployed on Windows uses the CPU instead of the GPU for inference? (#4965, opened Jul 25, 2024)
- GPU memory usage keeps growing over long runs, eventually causing out-of-memory (#4964, opened Jul 25, 2024)
- Quantization hangs; many people report the same problem in Issues, but there is no solution yet (#4963, opened Jul 25, 2024)
- How to fine tune 405B (#4940, opened Jul 23, 2024)
- Cannot find any model weights when activating VLLM-based inference backend (#4931, opened Jul 23, 2024)
- Already fixed the deepseek-v2-lite float32 issue, but it still fails with no clear error message (#4924, opened Jul 22, 2024)
8 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Problems with vLLM multi-GPU inference (#4893, commented on Jul 23, 2024 • 0 new comments)
- Problems with full-parameter fine-tuning of qwen2-7b-instruct using DeepSpeed in LLaMA-Factory (#4898, commented on Jul 23, 2024 • 0 new comments)
- Enable Contamination-Free Packing Method During Pretraining (#4744, commented on Jul 24, 2024 • 0 new comments)
- Model answers are incoherent after merging LoRA weights, differing noticeably from before merging (not Issue #2505) (#4913, commented on Jul 24, 2024 • 0 new comments)
- qwen2 72b on 910B: inference fails with the merged weights produced after LoRA training (#4659, commented on Jul 24, 2024 • 0 new comments)
- Full-parameter incremental pretraining: dataset loading runs out of memory unexpectedly (#4915, commented on Jul 26, 2024 • 0 new comments)
- Question about model sharding in the latest code (#4912, commented on Jul 26, 2024 • 0 new comments)
- Support Several MLLM Models (#4136, commented on Jul 22, 2024 • 0 new comments)