Fix/0802 (modelscope#1581)
tastelikefeet committed Aug 2, 2024
1 parent 00f6c5c commit fe7b847
Showing 2 changed files with 11 additions and 2 deletions.
8 changes: 8 additions & 0 deletions docs/source/LLM/自定义与拓展.md
@@ -5,6 +5,14 @@
- [Custom chat templates](#自定义对话模板)

## Custom Datasets
Video introduction to custom datasets:

<video width="600" height="400" controls>
<source src="https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/custom_dataset.mp4" type="video/mp4">
</video>

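The text version follows:

We support three ways to use a **custom dataset**.

1. [Recommended] Pass the dataset files directly on the command line with `--dataset xxx.json yyy.jsonl zzz.csv`. This is **the most convenient way to use a custom dataset**: it supports five dataset formats (via `SmartPreprocessor`; the supported formats are listed below), accepts both `dataset_id` and `dataset_path`, and does not require modifying `dataset_info.json`. This method suits users who are new to ms-swift; the other two methods suit developers who are extending ms-swift.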
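
As a rough illustration of the recommended method, the sketch below writes a small `.jsonl` file and builds the matching `--dataset` argument. The field names `query` and `response`, the sample rows, and the file name `train.jsonl` are assumptions made for this example; the text above does not list the five supported formats, so check them before relying on this layout.

```python
import json
from pathlib import Path


def write_jsonl(path, rows):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


# Hypothetical samples; "query" and "response" are assumed field names.
rows = [
    {"query": "What is 1 + 1?", "response": "2"},
    {"query": "Name a prime number.", "response": "7"},
]
write_jsonl("train.jsonl", rows)

# Only the --dataset flag comes from the text above; the file name is made up.
dataset_args = ["--dataset", "train.jsonl"]
print(" ".join(dataset_args))
```

Several files can be listed after the same flag, as in the `xxx.json yyy.jsonl zzz.csv` example above.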
5 changes: 3 additions & 2 deletions swift/llm/utils/template.py
@@ -1012,8 +1012,9 @@ def register_template(template_type: str, template: Template, *, exist_ok: bool

 register_template(
     TemplateType.default,
-    Template([], ['### Human:\n{{QUERY}}\n\n### Assistant:\n'], ['\n\n'], [['eos_token_id']], DEFAULT_SYSTEM,
-             ['{{SYSTEM}}\n\n']))
+    Template([], ['### Human:\n{{QUERY}}\n\n### Assistant:\n'], ['\n\n'], [['eos_token_id']],
+             DEFAULT_SYSTEM, ['{{SYSTEM}}\n\n'],
+             auto_add_bos=True))


# You can set the query as '' to serve as a template for pre-training.
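
To make the effect of the default template easier to see, here is a minimal string-level sketch of how its pieces might combine into a prompt. It does not use the ms-swift `Template` class: `render_default`, its parameters, and the literal `<s>` BOS string are hypothetical, and the reading that `auto_add_bos=True` prepends the tokenizer's BOS token is an assumption based on the parameter name (the real library presumably works on token ids rather than strings).

```python
from typing import List, Optional, Tuple

# Pieces copied from the register_template call in this hunk.
PROMPT = "### Human:\n{{QUERY}}\n\n### Assistant:\n"
CHAT_SEP = "\n\n"
SYSTEM_PREFIX = "{{SYSTEM}}\n\n"


def render_default(
    history: List[Tuple[str, str]],
    query: str,
    system: Optional[str] = None,
    bos: str = "<s>",       # placeholder BOS string, not a real tokenizer value
    add_bos: bool = True,   # stands in for auto_add_bos (assumed behavior)
) -> str:
    """Build an inference prompt from earlier turns plus the new query."""
    parts = []
    if add_bos:
        parts.append(bos)
    if system:
        parts.append(SYSTEM_PREFIX.replace("{{SYSTEM}}", system))
    for old_query, old_response in history:
        # Each finished turn is the filled prompt, the response, then the separator.
        parts.append(PROMPT.replace("{{QUERY}}", old_query))
        parts.append(old_response + CHAT_SEP)
    parts.append(PROMPT.replace("{{QUERY}}", query))
    return "".join(parts)


print(render_default([], "Hi", system="You are helpful."))
```

During training, the suffix `[['eos_token_id']]` suggests that an end-of-sequence token follows the final response; the sketch stops at the point where the model would begin generating.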