
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device. #1665

Open
Wolchenok57 opened this issue Oct 3, 2024 · 0 comments

Wolchenok57 commented Oct 3, 2024

A new, de-distilled version of FLUX.1 dev that may be easier to train (https://huggingface.co/nyanko7/flux-dev-de-distill ; https://huggingface.co/MinusZoneAI/flux-dev-de-distill-fp8) raises this error: NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.

I tested this on fluxgym, comfyui-flux-trainer, and the flux branch of sd-scripts; the error is identical in all three.
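The failing call is easy to reproduce in isolation: parameters on PyTorch's meta device have shapes and dtypes but no storage, so `.to()` has nothing to copy. A minimal sketch in plain PyTorch (nothing sd-scripts specific):

```python
import torch

# Parameters created on the "meta" device have shapes/dtypes but no storage.
layer = torch.nn.Linear(4, 4, device="meta")

try:
    layer.to("cpu")  # nothing to copy -> NotImplementedError
except NotImplementedError as e:
    print(e)  # "Cannot copy out of meta tensor; no data! ..."

# to_empty() instead allocates fresh, uninitialized storage on the target
# device; real values still have to be loaded afterwards (e.g. from a
# state dict).
layer = layer.to_empty(device="cpu")
print(layer.weight.device)  # cpu
```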
Full traceback with the launch command (my env folder is in the fluxgym folder; I don't have an infinite NVMe SSD):

(env) I:\neurostuff\kohya_ss(f1)>accelerate launch ^
More? --mixed_precision bf16 ^
More? --num_cpu_threads_per_process 1 ^
More? sd-scripts/flux_train_network.py ^
More? --pretrained_model_name_or_path "I:\neurostuff\ComfyUI\models\unet\FLUX1\consolidated_s6700_fp8.safetensors" ^
More? --clip_l "I:\neurostuff\ComfyUI\models\clip\clip_l.safetensors" ^
More? --t5xxl "I:\neurostuff\ComfyUI\models\clip\t5xxl_fp16.safetensors" ^
More? --ae "I:\neurostuff\ComfyUI\models\vae\FLUX1\ae.sft" ^
More? --cache_latents_to_disk ^
More? --save_model_as safetensors ^
More? --sdpa --persistent_data_loader_workers ^
More? --max_data_loader_n_workers 2 ^
More? --seed 42 ^
More? --gradient_checkpointing ^
More? --mixed_precision bf16 ^
More? --save_precision bf16 ^
More? --network_module networks.lora_flux ^
More? --network_dim 16 ^
More? --network_alpha 16.0 ^
More? --optimizer_type adafactor ^
More? --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" ^
More? --lr_scheduler constant_with_warmup ^
More? --max_grad_norm 0.0 ^
More? --learning_rate 8e-4 ^
More? --cache_text_encoder_outputs ^
More? --cache_text_encoder_outputs_to_disk ^
More? --fp8_base ^
More? --highvram ^
More? --max_train_epochs 1 ^
More? --save_every_n_epochs 1 ^
More? --dataset_config "I:\neurostuff\fluxgym\outputs\megafignya\dataset.toml" ^
More? --output_dir "I:\neurostuff\fluxgym\outputs\megafignya" ^
More? --output_name megafignya ^
More? --timestep_sampling shift ^
More? --discrete_flow_shift 3.1582 ^
More? --model_prediction_type raw ^
More? --guidance_scale 1 ^
More? --loss_type l2 ^
More? --apply_t5_attn_mask ^
More? --weighting_scheme logit_normal ^
More? --logit_mean 0.0 ^
More? --logit_std 1.0 ^
More? --mode_scale 1.29 ^
More? --sigmoid_scale 1.0 ^
More? --enable_bucket ^
More? --bucket_no_upscale ^
More? --min_bucket_reso 256 ^
More? --max_bucket_reso 1024
highvram is enabled / highvramが有効です
2024-10-03 18:20:21 WARNING cache_latents_to_disk is enabled, so cache_latents is also enabled / train_util.py:3895
cache_latents_to_diskが有効なため、cache_latentsを有効にします
2024-10-03 18:20:21 INFO t5xxl_max_token_length: 512 flux_train_network.py:144
I:\neurostuff\fluxgym\env\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
warnings.warn(
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
2024-10-03 18:20:22 INFO Loading dataset config from train_network.py:270
I:\neurostuff\fluxgym\outputs\megafignya\dataset.toml
INFO prepare images. train_util.py:1803
INFO get image size from name of cache files train_util.py:1741
100%|█████████████████████████████████████████████████████████████████████████████| 1148/1148 [00:09<00:00, 121.06it/s]
2024-10-03 18:20:32 INFO set image size from cache files: 1148/1148 train_util.py:1748
INFO found directory I:\neurostuff\DATASET[REDACTED] contains train_util.py:1750
1148 image files
INFO 1148 train images with repeating. train_util.py:1844
INFO 0 reg images. train_util.py:1847
WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:1852
INFO [Dataset 0] config_util.py:570
batch_size: 1
resolution: (512, 512)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "I:\neurostuff\DATASET[REDACTED]"
image_count: 1148
num_repeats: 1
shuffle_caption: False
keep_tokens: 1
keep_tokens_separator:
caption_separator: ,
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
alpha_mask: False,
is_reg: False
class_tokens: None
caption_extension: .txt
INFO [Dataset 0] config_util.py:576
INFO loading image sizes. train_util.py:876
100%|██████████████████████████████████████████████████████████████████████████████████████| 1148/1148 [00:00<?, ?it/s]
INFO make buckets train_util.py:882
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is train_util.py:899
set, because bucket reso is defined by image size automatically /
bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計
算されるため、min_bucket_resoとmax_bucket_resoは無視されます
INFO number of images (including repeats) / train_util.py:928
各bucketの画像枚数(繰り返し回数を含む)
INFO bucket 0: resolution (128, 512), count: 3 train_util.py:933
INFO bucket 1: resolution (192, 512), count: 1 train_util.py:933
INFO bucket 2: resolution (256, 512), count: 38 train_util.py:933
INFO bucket 3: resolution (320, 512), count: 310 train_util.py:933
INFO bucket 4: resolution (384, 512), count: 94 train_util.py:933
INFO bucket 5: resolution (448, 448), count: 1 train_util.py:933
INFO bucket 6: resolution (448, 512), count: 30 train_util.py:933
INFO bucket 7: resolution (512, 192), count: 3 train_util.py:933
INFO bucket 8: resolution (512, 256), count: 104 train_util.py:933
INFO bucket 9: resolution (512, 320), count: 118 train_util.py:933
INFO bucket 10: resolution (512, 384), count: 65 train_util.py:933
INFO bucket 11: resolution (512, 448), count: 37 train_util.py:933
INFO bucket 12: resolution (512, 512), count: 344 train_util.py:933
INFO mean ar error (without repeats): 0.06703154647887108 train_util.py:938
INFO network for CLIP-L only will be trained. T5XXL will not be trained flux_train_network.py:50
/ CLIP-Lのネットワークのみが学習されます。T5XXLは学習されません
INFO preparing accelerator train_network.py:335
accelerator device: cuda
INFO Building Flux model dev flux_utils.py:45
INFO Loading state dict from flux_utils.py:52
I:\neurostuff\ComfyUI\models\unet\FLUX1\consolidated_s6700_fp8.safetensors
INFO Loaded Flux: flux_utils.py:55
_IncompatibleKeys(missing_keys=['guidance_in.in_layer.weight',
'guidance_in.in_layer.bias', 'guidance_in.out_layer.weight',
'guidance_in.out_layer.bias'], unexpected_keys=[])
INFO Loaded fp8 FLUX model flux_train_network.py:80
INFO Building CLIP flux_utils.py:74
INFO Loading state dict from flux_utils.py:167
I:\neurostuff\ComfyUI\models\clip\clip_l.safetensors
INFO Loaded CLIP: flux_utils.py:170
INFO Loading state dict from flux_utils.py:213
I:\neurostuff\ComfyUI\models\clip\t5xxl_fp16.safetensors
INFO Loaded T5xxl: flux_utils.py:216
INFO Building AutoEncoder flux_utils.py:62
INFO Loading state dict from I:\neurostuff\ComfyUI\models\vae\FLUX1\ae.sft flux_utils.py:66
INFO Loaded AE: flux_utils.py:69
import network module: networks.lora_flux
INFO [Dataset 0] train_util.py:2326
INFO caching latents with caching strategy. train_util.py:984
INFO checking cache validity... train_util.py:994
100%|███████████████████████████████████████████████████████████████████████████| 1148/1148 [00:00<00:00, 12551.45it/s]
2024-10-03 18:20:33 INFO no latents to cache train_util.py:1034
INFO move vae and unet to cpu to save memory flux_train_network.py:187
Traceback (most recent call last):
File "I:\neurostuff\kohya_ss(f1)\sd-scripts\flux_train_network.py", line 446, in
trainer.train(args)
File "I:\neurostuff\kohya_ss(f1)\sd-scripts\train_network.py", line 392, in train
self.cache_text_encoder_outputs_if_needed(args, accelerator, unet, vae, text_encoders, train_dataset_group, weight_dtype)
File "I:\neurostuff\kohya_ss(f1)\sd-scripts\flux_train_network.py", line 191, in cache_text_encoder_outputs_if_needed
unet.to("cpu")
File "I:\neurostuff\fluxgym\env\lib\site-packages\torch\nn\modules\module.py", line 1340, in to
return self._apply(convert)
File "I:\neurostuff\fluxgym\env\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
module._apply(fn)
File "I:\neurostuff\fluxgym\env\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
module._apply(fn)
File "I:\neurostuff\fluxgym\env\lib\site-packages\torch\nn\modules\module.py", line 927, in _apply
param_applied = fn(param)
File "I:\neurostuff\fluxgym\env\lib\site-packages\torch\nn\modules\module.py", line 1333, in convert
raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
Traceback (most recent call last):
File "C:\Python\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "I:\neurostuff\fluxgym\env\Scripts\accelerate.exe_main
.py", line 7, in
File "I:\neurostuff\fluxgym\env\lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
args.func(args)
File "I:\neurostuff\fluxgym\env\lib\site-packages\accelerate\commands\launch.py", line 1106, in launch_command
simple_launcher(args)
File "I:\neurostuff\fluxgym\env\lib\site-packages\accelerate\commands\launch.py", line 704, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['I:\\neurostuff\\fluxgym\\env\\Scripts\\python.exe', 'sd-scripts/flux_train_network.py', '--pretrained_model_name_or_path', 'I:\\neurostuff\\ComfyUI\\models\\unet\\FLUX1\\consolidated_s6700_fp8.safetensors', '--clip_l', 'I:\\neurostuff\\ComfyUI\\models\\clip\\clip_l.safetensors', '--t5xxl', 'I:\\neurostuff\\ComfyUI\\models\\clip\\t5xxl_fp16.safetensors', '--ae', 'I:\\neurostuff\\ComfyUI\\models\\vae\\FLUX1\\ae.sft', '--cache_latents_to_disk', '--save_model_as', 'safetensors', '--sdpa', '--persistent_data_loader_workers', '--max_data_loader_n_workers', '2', '--seed', '42', '--gradient_checkpointing', '--mixed_precision', 'bf16', '--save_precision', 'bf16', '--network_module', 'networks.lora_flux', '--network_dim', '16', '--network_alpha', '16.0', '--optimizer_type', 'adafactor', '--optimizer_args', 'relative_step=False', 'scale_parameter=False', 'warmup_init=False', '--lr_scheduler', 'constant_with_warmup', '--max_grad_norm', '0.0', '--learning_rate', '8e-4', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--fp8_base', '--highvram', '--max_train_epochs', '1', '--save_every_n_epochs', '1', '--dataset_config', 'I:\\neurostuff\\fluxgym\\outputs\\megafignya\\dataset.toml', '--output_dir', 'I:\\neurostuff\\fluxgym\\outputs\\megafignya', '--output_name', 'megafignya', '--timestep_sampling', 'shift', '--discrete_flow_shift', '3.1582', '--model_prediction_type', 'raw', '--guidance_scale', '1', '--loss_type', 'l2', '--apply_t5_attn_mask', '--weighting_scheme', 'logit_normal', '--logit_mean', '0.0', '--logit_std', '1.0', '--mode_scale', '1.29', '--sigmoid_scale', '1.0', '--enable_bucket', '--bucket_no_upscale', '--min_bucket_reso', '256', '--max_bucket_reso', '1024']' returned non-zero exit status 1.

The default FLUX.1 dev fp8 model works, and my 16 GB RTX 4070 Ti SUPER handles training fine. But this new model produces the error above. Is there anything that can be done to modify the code, or is the model simply incompatible?
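For reference, the `_IncompatibleKeys(missing_keys=['guidance_in...'])` line in the log looks like the trigger: the de-distilled checkpoint ships without the guidance embedder, so if the Flux model is built on the meta device and the state dict is loaded non-strictly with `assign=True`, the `guidance_in.*` parameters never receive real storage, and the later `unet.to("cpu")` trips over them. A possible but untested workaround is to zero-fill whatever is still on meta right after the load; `materialize_meta_params` below is a hypothetical helper, not part of sd-scripts, and it assumes zero weights are acceptable for `guidance_in.*` given that the de-distilled model is not supposed to use guidance conditioning:

```python
import torch

def materialize_meta_params(model: torch.nn.Module, device: str = "cpu") -> list[str]:
    """Give real (zero-filled) storage to any parameter left on the meta
    device after load_state_dict(..., strict=False, assign=True)."""
    fixed = []
    for module_name, module in model.named_modules():
        for param_name, param in list(module.named_parameters(recurse=False)):
            if param.is_meta:
                zeros = torch.zeros(param.shape, dtype=param.dtype, device=device)
                setattr(module, param_name,
                        torch.nn.Parameter(zeros, requires_grad=param.requires_grad))
                fixed.append(f"{module_name}.{param_name}" if module_name else param_name)
    return fixed

# e.g. in flux_utils.py, right after the Flux state dict is loaded and
# before any model.to(...) call:
# print("materialized:", materialize_meta_params(model))
```

Whether a zeroed guidance embedder actually behaves sensibly during training is an assumption on my part; the alternative would be for sd-scripts to detect the missing keys and skip or `to_empty()` those modules explicitly.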
