
Adapter

Adapter is a lightweight training component introduced in the paper Parameter-Efficient Transfer Learning for NLP. It is typically attached after the MLP block of a transformer layer.

AdapterConfig (
 dim: int. The dimension of the hidden_state in the MLP output, usually equal to the model's hidden_size.
 target_modules: Union[List[str], str]. The module keys of the MLP blocks. A str is matched as a full-match pattern; a List is matched against module-name suffixes.
 hidden_pos: Union[str, int]. The position of the hidden_state in the MLP output: pass an int index if the output is a tuple/list, or a str key if it is a dict.
 method_name: str. The forward method of the MLP block. Adapter patches this method and, after the call, feeds its hidden_state into the tuner. Defaults to 'forward'.
 adapter_length: int. The dimension of the adapter's intermediate (bottleneck) layer. Defaults to 128.
 act_layer: str. The activation operator. Defaults to 'gelu'.
)
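To make the roles of `dim`, `adapter_length`, and `act_layer` concrete, here is a minimal sketch of the adapter bottleneck from the paper. This mirrors the published structure (down-projection, activation, up-projection, residual), not Swift's actual internal implementation; the class name is hypothetical.

```python
import torch
import torch.nn as nn

class AdapterSketch(nn.Module):
    """Bottleneck adapter sketch: hidden -> down -> act -> up -> + residual."""

    def __init__(self, dim: int, adapter_length: int = 128):
        super().__init__()
        self.down = nn.Linear(dim, adapter_length)  # project dim -> adapter_length
        self.act = nn.GELU()                        # act_layer='gelu'
        self.up = nn.Linear(adapter_length, dim)    # project back to dim

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the original hidden_state intact,
        # so the adapter only learns a small additive correction.
        return hidden_state + self.up(self.act(self.down(hidden_state)))

adapter = AdapterSketch(dim=768)
x = torch.randn(2, 10, 768)
out = adapter(x)
print(out.shape)  # torch.Size([2, 10, 768]) — output shape matches the input
```

Because only `down` and `up` are trained, the number of new parameters is roughly `2 * dim * adapter_length`, which is small compared to the frozen base model.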

An example of using an adapter:

from modelscope import Model
from swift import Swift, AdapterConfig
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
adapter_config = AdapterConfig(
    dim=model.config.hidden_size,
    target_modules=['mlp'],
    method_name='forward',
    hidden_pos=0,
)
model = Swift.prepare_model(model, adapter_config)
# use model to do other things
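After `Swift.prepare_model`, only the adapter weights should require gradients. A generic PyTorch check (shown on a toy model rather than the ChatGLM model above, so it runs standalone; the helper name is ours, not a Swift API):

```python
import torch.nn as nn

def count_trainable(model: nn.Module) -> tuple:
    """Return (trainable, total) parameter counts for any nn.Module."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

# Toy stand-in: Linear(4, 4) has 16 weights + 4 biases = 20 parameters.
toy = nn.Linear(4, 4)
toy.weight.requires_grad_(False)  # freeze the "base" weights, as a tuner would
print(count_trainable(toy))  # (4, 20) — only the bias stays trainable
```

Running the same check on the prepared model is a quick way to confirm the adapter attached correctly and the base model stayed frozen.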