
opensora-plan-v1.1: all videos generated by inference look like mosaic noise #668

Open
guyuliang7 opened this issue Sep 20, 2024 · 10 comments

@guyuliang7

mindone version: latest
MindSpore version: 2.3.1
CANN version: 8.0.RC2 (also tried the provided C18 CANN package)

examples/opensora_pku/scripts/text_condition/sample_video_221.sh
examples/opensora_pku/scripts/text_condition/sample_video_65.sh

@SamitHuang
Collaborator

Hi, could you share the inference logs? Were the weights loaded correctly?

@guyuliang7
Author

> Hi, could you share the inference logs? Were the weights loaded correctly?

/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/numpy/core/getlimits.py:499: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
Warning, cannot find compiled version of RoPE2D, using a slow version instead
Warning, cannot find compiled version of RoPE2D, using a slow version instead
Get New FA API!
[2024-09-24 10:35:31] INFO: Using jit_level: O0
[2024-09-24 10:35:31] INFO: vae init
[WARNING] CORE(234340,ffff9fc2a020,python):2024-09-24-10:35:31.667.049 [mindspore/core/utils/ms_context.cc:531] GetJitLevel] Set jit level to O2 for rank table startup method.
The config attributes {'in_channels': 3, 'out_channels': 3} were passed to CausalVAEModel, but are not expected and will be ignored. Please verify your config.json configuration file.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.0.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.0.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.11.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.11.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.2.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.3.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.3.num_batches_tracked from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.3.running_mean from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.3.running_var from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.3.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.5.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.6.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.6.num_batches_tracked from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.6.running_mean from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.6.running_var from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.6.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.8.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.9.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.9.num_batches_tracked from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.9.running_mean from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.9.running_var from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.discriminator.main.9.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.logvar from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.lin0.model.1.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.lin1.model.1.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.lin2.model.1.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.lin3.model.1.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.lin4.model.1.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice1.0.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice1.0.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice1.2.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice1.2.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice2.5.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice2.5.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice2.7.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice2.7.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.10.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.10.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.12.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.12.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.14.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice3.14.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.17.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.17.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.19.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.19.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.21.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice4.21.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.24.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.24.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.26.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.26.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.28.bias from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.net.slice5.28.weight from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.scaling_layer.scale from state_dict.
[2024-09-24 10:35:41] INFO: Deleting key loss.perceptual_loss.scaling_layer.shift from state_dict.
[2024-09-24 10:35:41] INFO: Restored from /data/nfs/gyl/work/Open-Sora-Plan-v1.1.0/vae/causalvae_488.ckpt
[2024-09-24 10:35:41] INFO: Use amp level O2 for causal 3D VAE with dtype=Float16, custom_fp32_cells: []
[2024-09-24 10:35:41] INFO: Number of prompts: 19
[2024-09-24 10:35:41] INFO: Number of generated samples for each prompt 1
[2024-09-24 10:35:41] INFO: loading annotations from ./sample_videos/prompt_list_65/dataset.csv ...
[2024-09-24 10:35:41] INFO: Num data samples: 19
[2024-09-24 10:35:45] INFO: Num batches: 19
[2024-09-24 10:35:47] INFO: Latte-65x512x512 init
The config attributes {'attention_mode': 'xformers'} were passed to LatteT2V, but are not expected and will be ignored. Please verify your config.json configuration file.
[2024-09-24 10:36:13] INFO: Restored from ckpt /data/nfs/gyl/work/Open-Sora-Plan-v1.1.0/65x512x512/LatteT2V-65x512x512.ckpt
[WARNING] ME(234340:281473362075680,MainProcess):2024-09-24-10:36:23.575.260 [mindspore/train/serialization.py:1560] For 'load_param_into_net', 2 parameters in the 'net' are not loaded, because they are not in the 'parameter_dict', please check whether the network structure is consistent when training and loading checkpoint.
[WARNING] ME(234340:281473362075680,MainProcess):2024-09-24-10:36:23.575.800 [mindspore/train/serialization.py:1564] ['temp_pos_embed', 'pos_embed.pos_embed'] are not loaded.
net param not load: ['temp_pos_embed', 'pos_embed.pos_embed'] 2
ckpt param not load: [] 0
[2024-09-24 10:36:23] INFO: Set mixed precision to O2 with dtype=fp16
[2024-09-24 10:36:23] INFO: T5 init
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
warnings.warn(
[2024-09-24 10:36:41] INFO: Load checkpoint from /data/nfs/gyl/work/opensora/t5-v1_1-xxl/model.ckpt.
[2024-09-24 10:37:34] WARNING: Checkpoint not loaded: ['shared.embedding_table']
[2024-09-24 10:37:34] INFO: Use amp level O2 for text encoder T5 with dtype=Float16
[2024-09-24 10:37:34] INFO: Key Settings:

MindSpore mode[GRAPH(0)/PYNATIVE(1)]: 0
Num of samples: 19
Num params: 1,305,797,131 (latte: 1,058,937,632, vae: 246,859,499)
Num trainable params: 0
Transformer dtype: Float16
VAE dtype: Float16
Text encoder dtype: Float16
Sampling steps 150
Sampling method: DDIM
CFG guidance scale: 7.5
Enable flash attention: True (Float16)

@guyuliang7
Author

0%| | 0/19 [00:00<?, ?it/s]start compile Ascend C operator MatMulV3. kernel name is te_matmulv3_12832c7f8fbacbaef0c1a296941370ee5565ee599a1f730337eed8987b8f777d_1
start compile Ascend C operator MatMulV3. kernel name is te_matmulv3_9387668753053eef8001047dc4129b9a32f86c0a75ccb6ae8407a5d692916f74_1 | 0/150 [00:00<?, ?it/s]
compile Ascend C operator: MatMulV3 success!
compile Ascend C operator: MatMulV3 success!
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 150/150 [04:14<00:00, 1.70s/it]###

@guyuliang7
Author

I hit the same problem with opensora: inference results are normal with opensora-1.0, but with opensora-1.1 and opensora-1.2 the outputs are all mosaic.

@CaitinZhao
Collaborator

[WARNING] CORE(234340,ffff9fc2a020,python):2024-09-24-10:35:31.667.049 [mindspore/core/utils/ms_context.cc:531] GetJitLevel] Set jit level to O2 for rank table startup method.

From the log above, it looks like execution did not stay at jit_level O0 but was switched to O2. Check whether the RANK_TABLE_FILE or MINDSPORE_HCCL_CONFIG_PATH environment variable is set, and try removing it.
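The check can be done from the shell before launching inference; a minimal sketch (the variable names come from the warning above, everything else is an assumption):

```shell
# List any rank-table variables currently set; print a note if none are.
env | grep -E 'RANK_TABLE_FILE|MINDSPORE_HCCL_CONFIG_PATH' \
  || echo "no rank-table variables set"

# Unset them so single-device inference keeps jit_level O0 instead of
# being forced to O2 by the rank-table startup path.
unset RANK_TABLE_FILE
unset MINDSPORE_HCCL_CONFIG_PATH
```

Note these must be unset in the same shell (or launch script) that starts the sampling run, since child processes inherit the environment.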

@guyuliang7
Author

[WARNING] CORE(234340,ffff9fc2a020,python):2024-09-24-10:35:31.667.049 [mindspore/core/utils/ms_context.cc:531] GetJitLevel] Set jit level to O2 for rank table startup method.

> From the log above, it looks like execution did not stay at jit_level O0 but was switched to O2. Check whether the RANK_TABLE_FILE or MINDSPORE_HCCL_CONFIG_PATH environment variable is set, and try removing it.

The environment variable RANK_TABLE_FILE=/user/config/nbstart_hccl.json was indeed set, but removing it did not improve the results.

@CaitinZhao
Collaborator

Is that log line still there?

@guyuliang7
Author

> Is that log line still there?

It is gone now.

[2024-09-24 14:27:08] INFO: Using jit_level: O0
[2024-09-24 14:27:08] INFO: vae init
The config attributes {'in_channels': 3, 'out_channels': 3} were passed to CausalVAEModel, but are not expected and will be ignored. Please verify your config.json configuration file.

@guyuliang7
Author

Why were these model weights not loaded?
[2024-09-24 15:23:22] INFO: Load checkpoint from /data/nfs/gyl/work/opensora/t5-v1_1-xxl/t5-v1_1-xxl.ckpt.
[2024-09-24 15:24:17] WARNING: Checkpoint not loaded: ['shared.embedding_table']
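For context, a warning like `Checkpoint not loaded: ['shared.embedding_table']` means the network declares a parameter name that the checkpoint file does not contain, which typically happens when the checkpoint was converted with a script that used different key names. A minimal illustration with plain dicts of parameter names (the renamed key here is hypothetical):

```python
# Parameter names the network expects (illustrative subset).
net_params = {"shared.embedding_table", "encoder.block.0.layer.0.weight"}

# An outdated conversion script might have written the embedding under a
# different key, so the expected name is absent from the checkpoint.
ckpt_params = {"shared.weight", "encoder.block.0.layer.0.weight"}

# Any network parameter missing from the checkpoint is reported as not loaded.
not_loaded = sorted(net_params - ckpt_params)
print("Checkpoint not loaded:", not_loaded)  # → ['shared.embedding_table']
```

This is why re-running the current conversion script (which writes the keys the network expects) makes the warning disappear.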

@wtomin
Collaborator

wtomin commented Oct 3, 2024

Hi, I switched to the container image you provided and ran scripts/text_condition/sample_video_65.sh; the generated video is normal, so I could not reproduce your problem.

However, comparing our inference logs, I found the following differences:

  1. The default sampling_method in mindone/opensora_pku is PNDM, while your log shows you are using DDIM;
  2. The default data type for the Transformer, VAE, and T5 text encoder is BF16, while your log shows all three models running in FP16;
  3. The T5 checkpoint produced by scripts/model_conversion/convert_all.sh is named t5-v1_1-xxl.ckpt, while your log shows you are loading model.ckpt. Note that the weight conversion script has been updated; your model.ckpt was most likely produced by the old conversion script.

Taken together, the problem is probably not your environment but your code version. The commit id I used is 2a7adcf0 (Sep 24). You can check out that commit, or update directly to the latest master. Also, please re-run the weight conversion script convert_all.sh.

Please follow up if the problem persists.
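The checkpoint-name check in point 3 above can be scripted before sampling; a small sketch (the filenames follow the logs in this thread, the helper name is hypothetical):

```shell
# Sanity-check a converted T5 checkpoint directory before sampling.
# t5-v1_1-xxl.ckpt is what the updated convert_all.sh produces;
# model.ckpt suggests an older conversion script was used.
check_t5_ckpt() {
  dir="$1"
  if [ -f "$dir/t5-v1_1-xxl.ckpt" ]; then
    echo "ok: t5-v1_1-xxl.ckpt found"
  elif [ -f "$dir/model.ckpt" ]; then
    echo "stale: model.ckpt found; re-run scripts/model_conversion/convert_all.sh"
  else
    echo "missing: no T5 checkpoint in $dir"
  fi
}
```

For example, `check_t5_ckpt /data/nfs/gyl/work/opensora/t5-v1_1-xxl` would report the stale model.ckpt seen in the logs above.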
