PEFT Library: A Toolkit for Parameter-Efficient Fine-Tuning
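---
title: "PEFT Library: A Toolkit for Parameter-Efficient Fine-Tuning"
description: "An overview of the parameter-efficient fine-tuning methods supported by the PEFT library, to help you choose the most suitable fine-tuning strategy"
tags: ["PEFT", "Parameter-Efficient Fine-Tuning", "Adapter", "Prefix Tuning"]
category: "llm"
icon: "🧠"
---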
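Introduction to PEFT
PEFT (Parameter-Efficient Fine-Tuning) is Hugging Face's library for parameter-efficient fine-tuning. It implements several fine-tuning methods behind one API: each method has its own config class, and the same get_peft_model call applies it to a base model, so switching techniques takes only small code changes.
The main methods covered here:
- LoRA: low-rank adaptation of selected weight matrices
- QLoRA: LoRA applied on top of a quantized (typically 4-bit) base model
- Prefix Tuning: trainable prefix vectors in every layer
- Prompt Tuning: trainable virtual tokens at the input embeddings
- Adapter: small trainable modules inserted into the network
- IA3: learned vectors that rescale inner activations
Installation and usage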
pip install peft
Basic usage pattern
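from peft import (
    LoraConfig,
    get_peft_model,
    TaskType,
    PeftModel
)
from transformers import AutoModelForCausalLM
# 1. Load the base model ("model_name" is a placeholder)
model = AutoModelForCausalLM.from_pretrained("model_name")
# 2. Configure the PEFT method
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"]
)
# 3. Wrap the base model; only the LoRA weights stay trainable
model = get_peft_model(model, config)
model.print_trainable_parameters()
# 4. After training, save the adapter weights (not the full model)
model.save_pretrained("peft_adapter")
# 5. To load later, start from a fresh copy of the base model
base_model = AutoModelForCausalLM.from_pretrained("model_name")
model = PeftModel.from_pretrained(base_model, "peft_adapter")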
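To make step 3 more concrete, the sketch below estimates how many parameters LoRA trains. It assumes every targeted module is a plain linear layer with a weight of shape (d_out, d_in), so LoRA adds a matrix A of shape (r, d_in) and a matrix B of shape (d_out, r). The helper name and the model dimensions (hidden size 4096, 32 layers) are illustrative assumptions, not values taken from PEFT or from a particular model.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A (r x d_in) plus B (d_out x r)."""
    return r * d_in + d_out * r


# Hypothetical model dimensions, chosen only for illustration.
hidden_size = 4096
num_layers = 32
targets_per_layer = 2  # q_proj and v_proj, as in the config above

per_module = lora_param_count(hidden_size, hidden_size, r=16)
total = per_module * targets_per_layer * num_layers
print(f"per module: {per_module:,}  total: {total:,}")
```

Doubling r doubles the count, which is why the rank is usually the first knob to adjust when trading capacity against memory.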
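LoRA configuration in detail
from peft import LoraConfig, TaskType
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,   # causal language modeling
    r=16,                           # rank of the low-rank update
    lora_alpha=32,                  # scaling factor; the update is scaled by lora_alpha / r
    lora_dropout=0.1,               # dropout applied on the LoRA branch
    target_modules=[                # modules to adapt (names follow LLaMA-style models)
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    bias="none",                    # which bias terms to train: "none", "all", or "lora_only"
    modules_to_save=None            # extra modules to train in full and save with the adapter
)
# Task types for different use cases
task_types = {
    "Causal LM": TaskType.CAUSAL_LM,
    "Sequence classification": TaskType.SEQ_CLS,
    "Token classification": TaskType.TOKEN_CLS,
    "Question answering": TaskType.QUESTION_ANS
}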
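How r and lora_alpha interact is easier to see in a small numeric sketch. The code below assumes the standard LoRA formulation, y = W x + (lora_alpha / r) * B (A x), with B initialized to zero so that the adapted layer starts out identical to the base layer. It is a pure-Python illustration with made-up 2x3 weights, not PEFT's implementation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, lora_alpha, r):
    """Base output plus the low-rank update, scaled by lora_alpha / r."""
    scale = lora_alpha / r
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]


W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen base weight, shape (d_out=2, d_in=3)
A = [[0.5, 0.5, 0.5]]                   # LoRA A, shape (r=1, d_in=3)
B_zero = [[0.0], [0.0]]                 # LoRA B at initialization, shape (d_out=2, r=1)
B_trained = [[1.0], [-1.0]]             # hypothetical B after some training
x = [1.0, 2.0, 3.0]

y_init = lora_forward(W, A, B_zero, x, lora_alpha=2, r=1)
y_trained = lora_forward(W, A, B_trained, x, lora_alpha=2, r=1)
print(y_init, y_trained)
```

The toy values use lora_alpha=2 and r=1, which gives the same scale of 2 as the r=16, lora_alpha=32 configuration above.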
Prefix Tuning
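Prefix Tuning prepends trainable prefix vectors to the attention keys and values of every layer:
from peft import PrefixTuningConfig, TaskType, get_peft_model
prefix_config = PrefixTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,      # number of virtual tokens
    prefix_projection=True,     # reparameterize the prefix through a small MLP
    encoder_hidden_size=256     # hidden size of that MLP (used when prefix_projection=True)
)
model = get_peft_model(model, prefix_config)
model.print_trainable_parameters()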
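As a rough guide to what print_trainable_parameters will report, the sketch below counts prefix parameters for the simplest case: no projection MLP, with one key vector and one value vector per virtual token per layer. The helper name and the model dimensions (12 layers, hidden size 768) are assumptions for illustration; with a projection MLP enabled the trainable count during training is larger.

```python
def prefix_param_count(num_virtual_tokens: int, num_layers: int, hidden_size: int) -> int:
    """Prefix parameters without reparameterization:
    each layer gets a key and a value vector for every virtual token."""
    return num_virtual_tokens * num_layers * 2 * hidden_size


# Hypothetical small model: 12 layers, hidden size 768.
count = prefix_param_count(num_virtual_tokens=20, num_layers=12, hidden_size=768)
print(f"{count:,} trainable prefix parameters")
```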
Prompt Tuning
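Prompt Tuning adds trainable parameters only at the input embedding layer, as virtual-token embeddings prepended to the input:
from peft import PromptTuningConfig, TaskType, get_peft_model
prompt_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=10,              # number of virtual tokens
    prompt_tuning_init="TEXT",          # initialize the virtual tokens from the text below
    prompt_tuning_init_text="Classify if this text is positive or negative:",
    tokenizer_name_or_path="model_name" # tokenizer used to encode the init text
)
model = get_peft_model(model, prompt_config)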
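The mechanism can be sketched without any library: the soft prompt is a small matrix of embeddings that is concatenated in front of the real token embeddings, and it is the only part that receives gradient updates. The function name and the tiny dimensions below are illustrative assumptions.

```python
def prepend_soft_prompt(soft_prompt, input_embeds):
    """Concatenate the virtual-token embeddings in front of the input embeddings
    along the sequence axis."""
    return soft_prompt + input_embeds


hidden_size = 4
num_virtual_tokens = 3
seq_len = 5

soft_prompt = [[0.1] * hidden_size for _ in range(num_virtual_tokens)]  # trainable
input_embeds = [[1.0] * hidden_size for _ in range(seq_len)]            # from the frozen embedding layer

extended = prepend_soft_prompt(soft_prompt, input_embeds)
trainable = num_virtual_tokens * hidden_size
print(len(extended), trainable)
```

Because the sequence grows by num_virtual_tokens, the attention mask and any labels have to be extended by the same amount; PEFT handles that bookkeeping internally.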
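Adapter methods
Adapter methods insert small trainable modules into the Transformer layers. As far as we can tell, PEFT itself does not export an AdapterConfig class; classic bottleneck adapters are offered by the separate adapters library from AdapterHub. The closest built-in method in PEFT is LLaMA-Adapter, configured through AdaptionPromptConfig:
from peft import AdaptionPromptConfig, TaskType, get_peft_model
# LLaMA-Adapter: supported only for LLaMA-style architectures
adapter_config = AdaptionPromptConfig(
    task_type=TaskType.CAUSAL_LM,
    adapter_len=10,       # number of adapter tokens inserted per layer
    adapter_layers=30     # number of top layers that receive the adapter (must not exceed the model's depth)
)
model = get_peft_model(model, adapter_config)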
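For readers who want the classic bottleneck adapter idea itself, here is a minimal pure-Python sketch: project down to a small dimension, apply a nonlinearity, project back up, and add the result to the input through a residual connection. Biases and LayerNorm are omitted for brevity, and all names and sizes are illustrative assumptions rather than any library's API.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, v) for v in vector]


def adapter_forward(x, W_down, W_up):
    """Bottleneck adapter: x + W_up * relu(W_down * x)."""
    hidden = relu(matvec(W_down, x))
    update = matvec(W_up, hidden)
    return [xi + ui for xi, ui in zip(x, update)]


def adapter_param_count(hidden_size: int, adapter_size: int) -> int:
    """Weights of the down and up projections (biases not counted)."""
    return 2 * hidden_size * adapter_size


x = [3.0, -1.0]
W_down = [[1.0, 1.0]]            # (adapter_size=1, hidden_size=2)
W_up_zero = [[0.0], [0.0]]       # zero-initialized up projection: adapter is the identity
W_up_trained = [[0.5], [0.5]]    # hypothetical trained values

print(adapter_forward(x, W_down, W_up_zero))
print(adapter_forward(x, W_down, W_up_trained))
print(adapter_param_count(hidden_size=768, adapter_size=64))
```

Starting the up projection at or near zero keeps the adapted model close to the base model at the beginning of training, which is the role a small initialization scale usually plays.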
IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations)
from peft import IA3Config, TaskType, get_peft_model
ia3_config = IA3Config(
    task_type=TaskType.CAUSAL_LM,
    # Module names assume a LLaMA-style model: keys, values, and the feed-forward output
    target_modules=["k_proj", "v_proj", "down_proj"],
    feedforward_modules=["down_proj"]   # must be a subset of target_modules
)
model = get_peft_model(model, ia3_config)
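IA3 has no rank or alpha setting because it learns only per-dimension scaling vectors. The sketch below shows the core operation, an elementwise rescaling that starts as the identity, together with a rough parameter count. The function names and the model dimensions (12 layers, hidden size 768, feed-forward size 3072) are assumptions for illustration.

```python
def ia3_scale(activations, scaling_vector):
    """Rescale each activation dimension by its learned factor."""
    return [a * s for a, s in zip(activations, scaling_vector)]


def ia3_param_count(num_layers: int, hidden_size: int, ff_size: int) -> int:
    """One vector each for keys and values, plus one for the feed-forward activations, per layer."""
    return num_layers * (hidden_size + hidden_size + ff_size)


keys = [0.5, -1.0, 2.0]
l_init = [1.0, 1.0, 1.0]      # initialized to ones: the layer is unchanged
l_trained = [2.0, 0.5, 0.5]   # hypothetical learned factors: amplify dim 0, inhibit dims 1 and 2

print(ia3_scale(keys, l_init))
print(ia3_scale(keys, l_trained))
print(ia3_param_count(num_layers=12, hidden_size=768, ff_size=3072))
```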
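Method comparison
# Rough trainable-parameter comparison. The figures are illustrative;
# actual counts depend on the base model and on which modules are targeted.
methods = {
    "LoRA (r=16)": {"params": "4M", "memory": "low"},
    "Prefix Tuning (20 tokens)": {"params": "2M", "memory": "low"},
    "Prompt Tuning (10 tokens)": {"params": "0.1M", "memory": "very low"},
    "Adapter (size=64)": {"params": "6M", "memory": "medium"},
    "IA3": {"params": "0.5M", "memory": "very low"}
}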
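Absolute counts like these are easier to judge as a share of the base model. The helper below computes that share, similar in spirit to the percentage that print_trainable_parameters reports. The 7-billion-parameter base model is a hypothetical choice for the example, not something stated in the comparison above.

```python
def trainable_percentage(trainable_params: int, total_params: int) -> float:
    """Share of parameters that receive gradient updates, in percent."""
    return 100.0 * trainable_params / total_params


# 4M trainable parameters on a hypothetical 7B-parameter base model.
pct = trainable_percentage(4_000_000, 7_000_000_000)
print(f"trainable: {pct:.4f}%")
```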
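Combining methods
# Some PEFT methods can be combined, but not every combination is supported
from peft import LoraConfig, get_peft_model
# Apply LoRA first
lora_model = get_peft_model(model, LoraConfig(r=16))
# Adding Prefix Tuning on top is experimental;
# check the PEFT documentation for the combinations it supports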
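Adapter management
from peft import PeftModel
from transformers import AutoModelForCausalLM
# Load the base model and attach the first adapter (registered under the name "default")
base_model = AutoModelForCausalLM.from_pretrained("model_name")
model = PeftModel.from_pretrained(base_model, "adapter1")
# Load a second adapter under its own name
model.load_adapter("adapter2", adapter_name="task2")
# Switch the active adapter
model.set_adapter("task2")
# Inspect the configs of all registered adapters
print(model.peft_config)
# Switch back, then delete the adapter that is no longer needed
model.set_adapter("default")
model.delete_adapter("task2")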
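The calls above follow a simple pattern: several named adapters share one frozen base model, and exactly one of them is active at a time. The toy class below illustrates that bookkeeping in plain Python. AdapterRegistry is a hypothetical name invented for this sketch and does not reflect how PEFT stores adapters internally.

```python
class AdapterRegistry:
    """Toy registry of named adapters with a single active adapter."""

    def __init__(self):
        self.adapters = {}
        self.active = None

    def load(self, name, weights):
        """Register an adapter; the first one loaded becomes active."""
        self.adapters[name] = weights
        if self.active is None:
            self.active = name

    def set_active(self, name):
        if name not in self.adapters:
            raise KeyError(f"unknown adapter: {name}")
        self.active = name

    def delete(self, name):
        """Remove an adapter; fall back to another one if it was active."""
        del self.adapters[name]
        if self.active == name:
            self.active = next(iter(self.adapters), None)


registry = AdapterRegistry()
registry.load("default", {"scale": 1.0})
registry.load("task2", {"scale": 0.5})
registry.set_active("task2")
print(registry.active, sorted(registry.adapters))
```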
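Training best practices
from transformers import TrainingArguments
# Parameter-efficient fine-tuning usually works with a higher learning rate than full fine-tuning
training_args = TrainingArguments(
    output_dir="outputs",             # placeholder path for checkpoints and logs
    learning_rate=3e-4,               # higher than typical full fine-tuning values
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    warmup_steps=100,
    weight_decay=0.01,
    logging_steps=10,
    save_steps=500,
    fp16=True                         # mixed precision; requires a GPU
)
# For a base model loaded in 8-bit or 4-bit, prepare it before calling get_peft_model
from peft import prepare_model_for_kbit_training
model = prepare_model_for_kbit_training(model)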
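Batch size, gradient accumulation, and warmup interact, so it helps to work out the effective numbers before training. The sketch below is an approximation under stated assumptions: a single device and a hypothetical dataset of 10,000 examples. The Trainer's exact step count can differ slightly depending on how it rounds the last partial batch.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation_steps * num_devices


def approx_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Approximate total optimizer steps, rounding each epoch up."""
    return math.ceil(num_examples / batch_size) * epochs


batch = effective_batch_size(per_device=8, accumulation_steps=4)
steps = approx_optimizer_steps(num_examples=10_000, batch_size=batch, epochs=3)
warmup_share = 100 / steps  # warmup_steps=100 from the arguments above
print(batch, steps, f"{warmup_share:.1%}")
```

With these numbers, 100 warmup steps cover roughly a tenth of training; on a much smaller dataset the same setting could consume most of the run.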
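Export and deployment
# Option 1: save only the adapter weights (recommended)
model.save_pretrained("peft_adapter")
# Option 2: save the adapter plus the embedding layers,
# for example after adding tokens and resizing the embeddings
model.save_pretrained("peft_only", save_embedding_layers=True)
# Option 3: merge the adapter into the base weights and export a standalone model
# (run this last, since merging modifies the wrapped base model;
# it applies to mergeable methods such as LoRA)
merged_model = model.merge_and_unload()
merged_model.save_pretrained("merged_model")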
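Merging works for LoRA because the low-rank update can be folded into the base weight once: W_merged = W + (lora_alpha / r) * B A. The pure-Python sketch below checks on toy matrices that the merged weight gives the same output as computing the base and LoRA branches separately. It assumes the standard LoRA formulation and is not PEFT's merge code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]


def merge_lora(W, A, B, lora_alpha, r):
    """Fold the scaled low-rank update into the base weight."""
    scale = lora_alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]


W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # base weight, shape (2, 3)
A = [[0.5, 0.5, 0.5]]                   # LoRA A, shape (r=1, 3)
B = [[1.0], [-1.0]]                     # LoRA B, shape (2, r=1)
x = [1.0, 2.0, 3.0]

merged = merge_lora(W, A, B, lora_alpha=2, r=1)
unmerged_output = [b + 2.0 * d for b, d in zip(matvec(W, x), matvec(B, matvec(A, x)))]
print(merged, matvec(merged, x), unmerged_output)
```

After merging, inference needs no extra matrix multiplications, which is the main reason to choose the merged export for deployment.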
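By offering a range of parameter-efficient fine-tuning methods behind one interface, PEFT lets developers choose the fine-tuning strategy that best fits their requirements and resource constraints.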