LLM fine-tuning pitfall: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Today, while fine-tuning the Llama-3.1-8B-Instruct model with QLoRA and PEFT, I ran into a strange error. As soon as training started, the backward pass failed with:
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
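Why does this happen? With gradient checkpointing enabled and the 4-bit base model completely frozen, none of the tensors entering the checkpointed blocks require grad, so autograd produces a loss with no grad_fn and backward() has nothing to differentiate. The fix below, enable_input_require_grads(), registers a forward hook on the input embeddings that forces their output to require grad, restoring the gradient path to the LoRA weights. A stripped-down illustration of the same failure mode (my simplification, not code from the original run):

import torch
import torch.nn as nn

layer = nn.Linear(4, 4)
for p in layer.parameters():
    p.requires_grad_(False)   # mimic a fully frozen (quantized) base model

x = torch.randn(2, 4)         # the input does not require grad either
loss = layer(x).sum()         # loss ends up with requires_grad=False and no grad_fn
loss.backward()               # raises the exact RuntimeError shown above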
Solution:
Adding the following line before the get_peft_model call resolves it:
model.enable_input_require_grads()
model = get_peft_model(model, peft_config)
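Alternatively, peft ships a helper that applies the same fix as part of its standard preparation for quantized (k-bit) training. A sketch of the equivalent call, assuming the same model and peft_config objects as in the code below:

from peft import prepare_model_for_kbit_training

# Among other preparation steps, this makes the input embeddings emit
# grad-requiring outputs when gradient checkpointing is used, which is
# the same effect as enable_input_require_grads().
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)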
The key code for the whole pipeline:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization with double quantization, computing in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
# LoRA on the attention projections only; uncomment the longer list to
# also adapt the MLP projections
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj"],
    # target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)
training_args = SFTConfig(
    output_dir="./model_7b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,    # effective batch size of 4 per device
    learning_rate=2e-5,
    num_train_epochs=1,
    max_steps=-1,                     # -1 means run the full epoch
    lr_scheduler_type="cosine",
    warmup_steps=100,
    logging_steps=10,
    save_steps=50,
    save_total_limit=2,
    fp16=False,
    bf16=True,                        # matches bnb_4bit_compute_dtype
    gradient_checkpointing=True,      # the setting that triggers the error without the fix
    optim="paged_adamw_8bit",
    dataset_text_field="messages",
)
model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
tokenizer.padding_side = "right"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
# This single line fixes the error reported above
# ===================================
model.enable_input_require_grads()
# ===================================
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
model.config.use_cache = False           # the KV cache is incompatible with gradient checkpointing
model.gradient_checkpointing_enable()    # redundant with gradient_checkpointing=True above, but harmless
model.train()
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,     # assumes a dataset with a "messages" text field
    peft_config=peft_config,   # skipped when the model is already a PeftModel
    args=training_args,
)
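The snippet stops at constructing the trainer. For completeness, the remaining steps would look roughly like this (my addition, not part of the original post):

trainer.train()                    # run the fine-tuning loop
trainer.save_model("./model_7b")   # write the LoRA adapter weights to output_dir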