Model card issue#1125 #1129

EziOzoani · 2023-11-23T23:11:36Z

Updates to address [this issue](#1125):

- Addition of training regime in the annotated model card to keep this doc and the template in sync.
- Defined `training_regime`, along with examples.


HuggingFaceDocBuilderDev · 2023-11-23T23:15:33Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

pcuenca · 2023-11-24T09:04:56Z

docs/hub/model-card-annotated.md

@@ -158,6 +158,9 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas

 ## Training Procedure [optional]

+_When you want to know what hardware you'll need to fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
+
+_e.g A model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Amphere or higher hardware. Mixed pf16 requires at least 54GB of GPU memory._


Suggested change

-_e.g A model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Amphere or higher hardware. Mixed pf16 requires at least 54GB of GPU memory._
+_e.g A model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._

These numbers sound a bit high to me. In any case, they depend on a number of factors like optimizer choice. Should the recommended optimizer be a part of the training_regime data?
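
To make the optimizer dependence concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the PR: the per-parameter byte counts, the regime names, and the function `estimate_training_memory_gb` are all assumptions chosen for illustration, and the estimate covers model state only (weights, gradients, optimizer state), ignoring activations and framework overhead.

```python
# Rough estimate of training memory for model state only.
# All byte counts below are illustrative assumptions, not values from the PR.

# Assumed bytes per parameter for weights, gradients, and each optimizer
# state tensor under three training regimes.
REGIMES = {
    "fp32": {"weights": 4, "grads": 4, "state": 4},
    # Pure bf16: assumes weights, gradients and optimizer state all in bf16.
    "bf16": {"weights": 2, "grads": 2, "state": 2},
    # Mixed fp16: assumes an fp16 working copy plus fp32 master weights (2 + 4).
    "fp16_mixed": {"weights": 6, "grads": 4, "state": 4},
}

# Number of extra per-parameter state tensors each optimizer keeps.
OPTIMIZER_STATE_TENSORS = {
    "adam": 2,          # first and second moment estimates
    "sgd_momentum": 1,  # momentum buffer
    "sgd": 0,           # no extra state
}


def estimate_training_memory_gb(num_params, regime, optimizer="adam"):
    """Return an approximate memory need in GB (1 GB = 1e9 bytes)."""
    r = REGIMES[regime]
    bytes_per_param = (
        r["weights"]
        + r["grads"]
        + OPTIMIZER_STATE_TENSORS[optimizer] * r["state"]
    )
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    num_params = 3_000_000_000  # the 3B-parameter example from the diff
    for regime in REGIMES:
        for optimizer in OPTIMIZER_STATE_TENSORS:
            gb = estimate_training_memory_gb(num_params, regime, optimizer)
            print(f"{regime:>10} + {optimizer:<12}: {gb:.0f} GB")
```

Under these assumptions, an Adam-style optimizer gives 48 GB, 24 GB, and 54 GB for the three regimes, matching the figures in the diff, while plain SGD without momentum brings the fp32 case down to 24 GB. That is consistent with the point that the optimizer would need to be stated alongside `training_regime` for such numbers to be meaningful.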

julien-c · 2023-12-18T10:19:57Z

is this PR still in process, @EziOzoani?

EziOzoani added 2 commits November 23, 2023 23:08

Update model-card-annotated.md

4306a91

Updates to address [this issue](#1125) - Addition of training regime in the annotated model card to keep this doc and the template in sync. - Defined training_regime, along with examples

Update model-card-annotated.md

23ba992

Sentence rephrasing

EziOzoani requested review from pcuenca, osanseviero, davanstrien and meg-huggingface November 23, 2023 23:11

osanseviero requested a review from Wauplin November 24, 2023 07:50

pcuenca reviewed Nov 24, 2023

EziOzoani and others added 2 commits November 24, 2023 12:24

Update docs/hub/model-card-annotated.md

44c32a2

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Update docs/hub/model-card-annotated.md

66f5f60

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
