26-25.3.4 - LLM Training: Multi-Node Multi-GPU Fine-Tuning - Teacher Lin Xi - Execution Error #606

Open
opened 2025-03-07 11:44:08 +08:00 by liwd1977 · 1 comment

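When I reached step 6.1.1, the command failed with an error (see attachment). Could the teacher please take a look?
After seeing the error, I tried several things, including creating the /userhome/xtuner-workdir1 directory and fixing the --wordir flag in the command, which was missing the letter k (copying and pasting directly from the document drops this letter), but the error persists.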

Try copying the command from here:
cd /code/
NPROC_PER_NODE=1 xtuner train qwen1_5_0_5b_chat_full_alpaca_e3_copy.py --work-dir /userhome/xtuner-workdir1 --deepspeed deepspeed_zero3_offload
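
Since the flag lost a character when pasted, it may help to confirm the command text is intact before rerunning it. The following is only a minimal sketch: the helper name `has_work_dir_flag` is hypothetical and not part of xtuner, and it checks nothing beyond the spelling of `--work-dir`.

```shell
# Hypothetical helper (not part of xtuner): succeeds only if the command
# line contains the correctly spelled --work-dir flag.
has_work_dir_flag() {
    case " $1 " in
        *" --work-dir "*|*" --work-dir="*) return 0 ;;
        *) return 1 ;;
    esac
}

# The command from the reply above, held as text so it can be inspected
# without being executed.
cmd='NPROC_PER_NODE=1 xtuner train qwen1_5_0_5b_chat_full_alpaca_e3_copy.py --work-dir /userhome/xtuner-workdir1 --deepspeed deepspeed_zero3_offload'

if has_work_dir_flag "$cmd"; then
    echo "flag ok"
else
    echo "missing or misspelled --work-dir" >&2
fi
```

If the check fails, retype the flag by hand instead of pasting it again.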

Reference: HswOAuth/llm_course#606