[Help] Fine-tuning GLM-4-9B-Chat with LLaMA-Factory: loss drops straight to 0 #229
Reference: HswOAuth/llm_course#229
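Problem
I ran this experiment once during the lecture and it worked fine. Over the past two days, while organizing my notes, I ran it again. The downloaded base model runs without problems, but after fine-tuning with LLaMA-Factory the loss is already 0 by the second log print, and the model's answers are garbled. Fine-tuning Qwen-7B-Chat on the same dataset does not show this problem. Is this an issue with GLM-4-9B-Chat? Which fine-tuning parameters should I adjust? Any help would be appreciated.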



The fine-tuning parameters were left unchanged.
Training status
Logs
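You could try deleting the model and re-downloading the base model. Also, keep the parameters exactly the same as in class, then see whether fine-tuning works.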
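I have tried all of that. Teacher, this looks like a widespread problem: I see that other students cannot reproduce the class result either.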
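To report exactly where the loss collapses (the post above says it is already 0 by the second log print), a small script can scan the training log for the first record whose loss is 0 or non-finite. This is only a minimal sketch: it assumes the log is available as JSON Lines, one object per line, with a `loss` field and a step field. The key names `loss` and `current_steps`, and the sample records, are assumptions for illustration and may not match the actual log from this run.

```python
import json
import math


def first_degenerate_step(lines, loss_key="loss", step_key="current_steps"):
    """Return (step, loss) for the first record whose loss is 0 or
    non-finite (NaN/inf), or None if every logged loss looks healthy.

    `lines` is any iterable of JSON strings, e.g. an open log file.
    The key names are assumptions; adjust them to the real log format.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        value = record.get(loss_key)
        if value is None:
            continue  # record without a loss (e.g. an eval entry)
        loss = float(value)
        if loss == 0.0 or not math.isfinite(loss):
            return record.get(step_key), loss
    return None


# Hypothetical sample records, not taken from the actual run.
sample = [
    '{"current_steps": 5, "loss": 2.3141}',
    '{"current_steps": 10, "loss": 0.0}',
]
print(first_degenerate_step(sample))  # (10, 0.0)
```

With a real log file, the same function can be called as `first_degenerate_step(open(path))`, where `path` points to wherever the run wrote its JSON Lines log. Posting the step it returns, together with the unchanged parameters, makes it easier for others to compare runs.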