[Help] train_loss of 1985 is far too large when fine-tuning a model with LLaMA-Factory #243
The console output log is attached as a txt file.
You could try adjusting the learning rate. A learning rate that is either too high or too low can lead to an abnormally large training loss.
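For reference, here is a minimal sketch of where the learning rate sits in a LLaMA-Factory LoRA SFT run. The model path, dataset name, and output directory are placeholders rather than the course's actual settings; the field names follow LLaMA-Factory's YAML config conventions, and the generated file would be passed to `llamafactory-cli train`.

```python
# A minimal sketch (placeholder paths/dataset, not the course's official settings)
# of a LLaMA-Factory LoRA SFT config, written out as the YAML file that
# `llamafactory-cli train sft_lora.yaml` would consume.
import yaml  # PyYAML

config = {
    "stage": "sft",
    "do_train": True,
    "model_name_or_path": "/root/models/your-base-model",  # placeholder
    "dataset": "your_dataset",                              # placeholder
    "template": "default",        # choose the template matching the base model
    "finetuning_type": "lora",
    "output_dir": "saves/lora/sft",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "learning_rate": 1.0e-4,      # the knob to tune: too high or too low both hurt convergence
    "num_train_epochs": 3.0,
    "lr_scheduler_type": "cosine",
    "logging_steps": 10,
    "plot_loss": True,            # ask LLaMA-Factory to save a loss-curve image
}

with open("sft_lora.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```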
Please keep all parameters identical to those used in class, run the fine-tuning again, and watch the loss.
My parameters are the same as the class settings and I followed the same steps, but no loss curve is drawn after fine-tuning finishes, and when I test the model it just produces gibberish.
You can follow up under that thread; quite a few students have run into this problem, and any further solution will be posted there: #273 (comment)
Same here. I've tried several times with the same result: no loss curve. I also tried changing the epoch count to 6 and the learning rate to 1e-4.
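While the missing loss curve is being investigated, the logged losses can be plotted by hand. The sketch below assumes the run's output directory contains the standard Hugging Face `trainer_state.json` that the Trainer writes alongside its checkpoints; the output path is a placeholder.

```python
# A minimal sketch for plotting the training loss yourself when no loss-curve
# image is saved. Assumes the standard Hugging Face trainer_state.json exists
# in the run's output_dir; the path below is a placeholder.
import json
import matplotlib.pyplot as plt

output_dir = "saves/lora/sft"  # placeholder: the output_dir of your run

with open(f"{output_dir}/trainer_state.json") as f:
    state = json.load(f)

# log_history holds one dict per logging step; training entries carry "loss".
steps = [e["step"] for e in state["log_history"] if "loss" in e]
losses = [e["loss"] for e in state["log_history"] if "loss" in e]

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.title("SFT training loss")
plt.savefig("training_loss.png")
```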
Please place LLaMA-Factory under the /root/ directory, i.e. /root/LLaMA-Factory, and then repeat the experiment.