Adapter

Adapter是Parameter-Efficient Transfer Learning for NLP 论文提供的轻量级训练组件。一般添加到MLP结构之后生效。

AdapterConfig (
 dim: int MLP结构输出中hidden_state的dim，一般等于模型的hidden_size
 target_modules: Union[List[str], str] MLP结构的module_key，如果是str类型则进行full_match统配查找，如果是List，则进行末尾匹配
 hidden_pos: Union[str, int] MLP输出结构中hidden_state的位置，如果是tuple/list则传入int，如果是dict则传入str类型的key
 method_name: str MLP结构的前向方法，Adapter默认会patch到该方法上，在forward调用后使用其hidden_state输入tuner，默认是forward。
 adapter_length: int adapter结构中间层长度，默认为128
 act_layer: str 激活算子，默认为gelu
)

一个使用adapter的例子如下：

from modelscope import Model
from swift import Swift, LoRAConfig
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
adapter_config = AdapterConfig(
                dim=model.config.hidden_size,
                target_modules=['mlp']),
                method_name='forward',
                hidden_pos=0,
            )
model = Swift.prepare_model(model, adapter_config)
# use model to do other things

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4.adapter.md

4.adapter.md

Adapter

Files

4.adapter.md

Latest commit

History

4.adapter.md

File metadata and controls

Adapter