AutoNLP TrainerBase and Text Classification #3728

Merged
merged 76 commits into develop from autonlp_trainer
Jan 5, 2023

Conversation

Collaborator

@sijunhe sijunhe commented Nov 11, 2022

PR types

New features

PR changes

APIs

Description

AutoNLP TrainerBase and Text Classification

AI Studio example: https://aistudio.baidu.com/aistudio/projectdetail/4994688?contributionType=1

@sijunhe sijunhe self-assigned this Nov 11, 2022
auto_trainer.train(
train_ds,
dev_ds,
num_cpus=1,
Member

Can heterogeneous devices end up computing at the same time here? I ask because both num_cpus and num_gpus are exposed.
From the perspective of Paddle's own roadmap, more hardware will be supported in the future, such as Kunlun XPU, and this naming scheme will limit how easily new AI hardware can be plugged in.

Collaborator Author

Here, cpu means ordinary CPU cores and gpu refers specifically to CUDA devices. Future devices such as NPUs and XPUs can be constrained in a similar way: https://docs.ray.io/en/latest/tune/faq.html#how-do-i-set-resources

Do you have any suggestions for this part?

Collaborator Author

Ray is also open source under the Apache 2.0 license, so if we need customizations we can push them upstream. They did similar work for TPUs earlier: https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/gcp/tpu.yaml
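To make the resource discussion above concrete, here is a minimal sketch of a hardware-agnostic resource request. The helper `build_trial_resources` and the `"xpu"` resource name are hypothetical and not part of this PR. The dict shape (`cpu`, `gpu`, `custom_resources`) follows my reading of the Ray Tune FAQ linked above, where such a dict is passed to `tune.with_resources`; treat that as an assumption to verify against the Ray version in use.

```python
from typing import Dict, Optional, Union

Number = Union[int, float]


def build_trial_resources(
    num_cpus: Number = 1,
    num_gpus: Number = 0,
    custom_resources: Optional[Dict[str, Number]] = None,
) -> Dict[str, object]:
    """Build a per-trial resource request (hypothetical helper, not in the PR).

    CPU and GPU keep their dedicated keys; any other accelerator is passed
    as a named custom resource, so new hardware needs no new keyword argument.
    """
    requested = {"num_cpus": num_cpus, "num_gpus": num_gpus}
    requested.update(custom_resources or {})
    for name, amount in requested.items():
        if amount < 0:
            raise ValueError(f"{name} must be non-negative, got {amount}")

    resources: Dict[str, object] = {"cpu": num_cpus, "gpu": num_gpus}
    if custom_resources:
        resources["custom_resources"] = dict(custom_resources)
    return resources


# "xpu" is an assumed resource name; the cluster would have to advertise it.
xpu_request = build_trial_resources(num_cpus=2, custom_resources={"xpu": 1})
print(xpu_request)
```

With this shape, supporting another device type would only mean adding an entry to `custom_resources` rather than adding a `num_xpus` argument.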

time_budget_s=time_budget_s,
max_concurrent_trials=max_concurrent_trials,
)
tuner = tune.Tuner(
Member

This depends on how heavy the ray dependency is. If it is heavy, could it be kept outside the core and installed only when a user needs this feature, rather than being a default dependency?

Collaborator Author

It is already kept outside the core. See setup.py.
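One common way to keep a heavy package optional is to import it lazily and fail with an actionable message. The sketch below illustrates that pattern only; `require_optional` is a hypothetical helper, not code from this PR, and the standard-library `json` module stands in for an installed package so the example runs anywhere.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency lazily, with a clear error if it is absent."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Install it with: {install_hint}"
        )
    return importlib.import_module(module_name)


# 'json' stands in for an installed optional package.
json_module = require_optional("json", "n/a (standard library)")

# A deliberately missing package shows the error path.
try:
    require_optional("surely_not_installed_pkg", "pip install surely_not_installed_pkg")
except ImportError as err:
    message = str(err)
    print(message)
```

For AutoNLP, the install hint could point at `pip install -r paddlenlp/experimental/autonlp/requirements.txt`, the requirements file referenced elsewhere in this PR.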

@ZeyuChen
Member

#3724
GitHub has a Draft PR feature, so going forward there is no need to flag draft status in the PR title.

@sijunhe sijunhe marked this pull request as draft November 12, 2022 03:15
@sijunhe sijunhe changed the title from "[Draft, Do Not Review] AutoNLP TrainerBase and Text Classification" to "AutoNLP TrainerBase and Text Classification" Nov 12, 2022
@sijunhe sijunhe marked this pull request as ready for review November 14, 2022 12:11
@sijunhe sijunhe marked this pull request as draft November 15, 2022 02:15
@sijunhe sijunhe marked this pull request as ready for review January 3, 2023 11:58
@@ -1,4 +1,4 @@
-paddlepaddle>=2.3.0,<2.4.0
+paddlepaddle==2.4.0rc0
Collaborator Author

Dynamic-to-static conversion of the prompt model currently only works with 2.4.0 or later.

@@ -45,6 +45,7 @@ unit-test:
install:
pip install -r requirements-dev.txt
pip install -r requirements.txt
pip install -r paddlenlp/experimental/autonlp/requirements.txt
Collaborator Author

Strictly speaking, this should not be part of make install, because most development work does not need the autonlp dependencies. It is added for now only so that the unit tests can run.

Member

@ZeyuChen ZeyuChen left a comment

Please add cross-links in the docs and make the spelling of the AI Studio brand name consistent. Everything else looks fine.

train_dataset=train_ds,
eval_dataset=dev_ds,
label_column="labels",
text_column="sentence",
Collaborator

For text classification tasks whose input has two columns, is that supported here?

Collaborator Author

It is deliberately not supported. There will be a dedicated AutoTrainerForSemanticSearch later.


auto_trainer = AutoTrainerForTextClassification(
train_dataset=train_ds,
Collaborator

On the input side here, do we want to expose the datasets concept to users?

Collaborator Author

@sijunhe sijunhe Jan 4, 2023

Yes, that is the current design: users need to convert their data into the datasets format themselves. One reason is that the trainer already requires Datasets, so this connects naturally. The other is that the MVP should stay as lightweight as possible and leave out non-core features. If there is demand later, we can support conversion from pd.DataFrame and similar formats.
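As an illustration of the conversion that is left to users for now, here is a minimal sketch that turns column-oriented data into a map-style dataset. `ListDataset` and `columns_to_dataset` are hypothetical names, not part of this PR; the class is a plain-Python stand-in that implements only `__getitem__` and `__len__`, which is the interface a map-style `paddle.io.Dataset` subclass would also provide. The column names `sentence` and `labels` are taken from the snippet under review.

```python
from typing import Any, Dict, List, Sequence


class ListDataset:
    """Map-style dataset over a list of row dicts (illustrative stand-in)."""

    def __init__(
        self,
        rows: Sequence[Dict[str, Any]],
        text_column: str = "sentence",
        label_column: str = "labels",
    ) -> None:
        # Fail early if a row lacks one of the columns the trainer will read.
        for index, row in enumerate(rows):
            missing = {text_column, label_column} - row.keys()
            if missing:
                raise KeyError(f"row {index} is missing columns: {sorted(missing)}")
        self._rows: List[Dict[str, Any]] = list(rows)

    def __getitem__(self, index: int) -> Dict[str, Any]:
        return self._rows[index]

    def __len__(self) -> int:
        return len(self._rows)


def columns_to_dataset(columns: Dict[str, Sequence[Any]]) -> ListDataset:
    """Convert {column: values} data, e.g. DataFrame.to_dict("list"), into rows."""
    names = list(columns)
    rows = [dict(zip(names, values)) for values in zip(*columns.values())]
    return ListDataset(rows)


train_ds = columns_to_dataset({"sentence": ["great movie", "too slow"], "labels": [1, 0]})
print(len(train_ds), train_ds[1])
```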


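- num_models (int, required): number of model trials
- num_gpus (str, optional): number of GPUs used by the experiment. By default, this is set based on the detected GPUs.
- num_cpus (str, optional): number of CPUs used by the experiment. By default, this is set based on the detected vCPUs.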
Collaborator

vCPU -> CPU

Collaborator Author

@sijunhe sijunhe Jan 4, 2023

This really is vCPU. The original English term is virtual core, which comes from the underlying ray call.

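- num_gpus (str, optional): number of GPUs used by the experiment. By default, this is set based on the detected GPUs.
- num_cpus (str, optional): number of CPUs used by the experiment. By default, this is set based on the detected vCPUs.
- max_concurrent_trials (int, optional): maximum number of trials to run concurrently. Must be non-negative. If None or 0, no limit is applied. Defaults to None.
- time_budget_s: (int|float|datetime.timedelta, optional) global time budget in seconds; all model trials are stopped once it is exceeded.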
Collaborator

There is no need to accept this many types for the time limit; int and float would be enough.

Collaborator Author

This comes from the underlying ray configuration. Since they already support it, I am happy to expose it.
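For readers weighing the two positions, this is a small sketch of how the three accepted types could be normalized to seconds in one place. `time_budget_to_seconds` is a hypothetical helper for illustration; the PR itself passes the value through to ray, and how ray handles each type internally is not shown here.

```python
import datetime
from typing import Optional, Union

TimeBudget = Union[int, float, datetime.timedelta]


def time_budget_to_seconds(budget: Optional[TimeBudget]) -> Optional[float]:
    """Normalize an int, float, or timedelta budget to seconds (None means no limit)."""
    if budget is None:
        return None
    if isinstance(budget, datetime.timedelta):
        seconds = budget.total_seconds()
    elif isinstance(budget, (int, float)) and not isinstance(budget, bool):
        seconds = float(budget)
    else:
        raise TypeError(f"unsupported time budget type: {type(budget).__name__}")
    if seconds < 0:
        raise ValueError(f"time budget must be non-negative, got {seconds}")
    return seconds


print(time_budget_to_seconds(90))
print(time_budget_to_seconds(datetime.timedelta(minutes=2)))
```

Accepting `datetime.timedelta` costs one extra branch, which is the trade-off behind exposing all three types.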

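- hp_overrides: (dict[str, Any], optional): (advanced users only). Overrides the hyperparameters of every model candidate. For example, `{"TrainingArguments.max_steps":5}`.
- custom_model_candiates: (dict[str, Any], optional): (advanced users only). Runs the user-provided model candidates instead of PaddleNLP's default model candidates. See the `._model_candidates` attribute.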


Collaborator

I still have a question here. AutoNLP does not seem to have a single unified external interface; instead, the interface is unified per task. What is the reasoning behind this?

Collaborator Author

@sijunhe sijunhe Jan 4, 2023

There will be one later. The current MVP has only a single classification class, so it is not set up yet. Later we will support something similar to Taskflow, with an added task_type.

"""
model_result = self._get_model_result(trial_id=trial_id)
exported_model_path = os.path.join(model_result.log_dir, self.export_path)
shutil.copytree(exported_model_path, export_path, dirs_exist_ok=True)
Collaborator

For model export, exporting the static graph model could be included here.

Collaborator Author

For now this still outputs the dynamic graph model, to keep some flexibility. A later version will add a unified to_static stage.
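The copy step in the export code under review can be exercised in isolation. The sketch below is a standalone approximation, not the PR's method: `export_model`, the directory names, and the placeholder file are assumptions for illustration. It shows that `shutil.copytree(..., dirs_exist_ok=True)` (Python 3.8+) merges into an existing target directory instead of raising.

```python
import os
import shutil
import tempfile


def export_model(trial_log_dir: str, exported_subdir: str, export_path: str) -> str:
    """Copy a trial's exported model directory to export_path (hypothetical helper)."""
    source = os.path.join(trial_log_dir, exported_subdir)
    if not os.path.isdir(source):
        raise FileNotFoundError(f"no exported model found at {source}")
    # dirs_exist_ok=True lets the copy merge into a pre-existing target directory.
    shutil.copytree(source, export_path, dirs_exist_ok=True)
    return export_path


with tempfile.TemporaryDirectory() as workspace:
    # Fake trial output with a single placeholder weights file.
    log_dir = os.path.join(workspace, "trial_0")
    os.makedirs(os.path.join(log_dir, "exported_model"))
    with open(os.path.join(log_dir, "exported_model", "model_state.pdparams"), "w") as f:
        f.write("placeholder weights")

    # The target already exists and holds an unrelated file that must survive.
    target = os.path.join(workspace, "deploy")
    os.makedirs(target)
    with open(os.path.join(target, "README.txt"), "w") as f:
        f.write("kept")

    export_model(log_dir, "exported_model", target)
    exported_files = sorted(os.listdir(target))

print(exported_files)
```

Note that files with the same name in the target are overwritten by the copy, which matters if users export several trials to the same path.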

from hyperopt import hp
from paddle.io import Dataset
from scipy.special import expit as sigmoid
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
Collaborator

sklearn does not appear to be listed in the requirements.

Collaborator Author

paddlenlp already ships with sklearn; it is pulled in by seqeval.

)

@property
def _model_candidates(self) -> List[Dict[str, Any]]:
Collaborator

Are we only considering pretrained models for now?

Collaborator Author

Collaborator

@wawltor wawltor left a comment

LGTM

@sijunhe sijunhe merged commit 4afe186 into develop Jan 5, 2023
@sijunhe sijunhe deleted the autonlp_trainer branch January 5, 2023 04:27
3 participants