
Error first try SD3 directml RX580 #3689

Open
KillyTheNetTerminal opened this issue Jun 12, 2024 · 23 comments

Comments

@KillyTheNetTerminal

Error occurred when executing KSampler:

Expected all tensors to be on the same device, but found at least two devices, privateuseone:0 and cpu!

File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\nodes.py", line 1355, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\nodes.py", line 1325, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\sample.py", line 43, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 794, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 696, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 683, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 662, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 567, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comf\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\k_diffusion\sampling.py", line 137, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 291, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 649, in call
return self.predict_noise(*args, **kwargs)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 652, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 277, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\samplers.py", line 226, in calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\model_base.py", line 103, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comf\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comf\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\ldm\modules\diffusionmodules\mmdit.py", line 961, in forward
return super().forward(x, timesteps, context=context, y=y)
File "C:\Users\WarMa\OneDrive\Escritorio\SD\comfyuai\ComfyUI\comfy\ldm\modules\diffusionmodules\mmdit.py", line 937, in forward
x = self.x_embedder(x) + self.cropped_pos_embed(hw, device=x.device).to(dtype=x.dtype)
[screenshot]
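For context, the error means two tensors in a single operation live on different devices: under DirectML the GPU shows up as privateuseone:0, while something (here, apparently the positional embedding built in mmdit.py) is still on the CPU. A minimal sketch of the failure pattern and the usual fix; this is illustrative only, not the actual ComfyUI code:

```python
import torch

# Illustration: pick whatever GPU device is available; on AMD/DirectML this
# would come from torch_directml.device() and print as privateuseone:0.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(1, 16, 64, 64, device=device)  # latent on the GPU
pos_embed = torch.randn(1, 16, 64, 64)         # accidentally created on the CPU

# x + pos_embed  # would raise: Expected all tensors to be on the same device
y = x + pos_embed.to(device=x.device, dtype=x.dtype)  # explicit move fixes it
```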

@timesqueezer

timesqueezer commented Jun 12, 2024

Can confirm the same error on an RTX 3050 / Intel Core i7-11800H notebook. The only difference is this line:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

@KillyTheNetTerminal
Author

Exactly, because you have Nvidia and CUDA.

@KillyTheNetTerminal
Author

[screenshot]
Working on CPU, but slow as hell (i3-9100f).

@Cremesis

Same problem for me

Exception during processing!!! Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Traceback (most recent call last):
  File "C:\Users\<myuser>\Downloads\comfyui\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)

@jtyszkiew

Seems to work for me on:

Total VRAM 11980 MB, total RAM 64140 MB
pytorch version: 2.3.0+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4070 : cudaMallocAsync
VAE dtype: torch.bfloat16
Using pytorch cross attention

Nvidia + CUDA

@kuldp18

kuldp18 commented Jun 12, 2024

Can confirm the same error on an RTX 3050 / Intel Core i7-11800H notebook. The only difference is this line: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I have the same error too.

@15Litrov

15Litrov commented Jun 12, 2024

A fresh manual install with nightly PyTorch (I didn't test other versions) helped me overcome this problem.
1050 Ti 4 GB + 32 GB RAM

@kuldp18

kuldp18 commented Jun 12, 2024

A fresh manual install with nightly PyTorch (I didn't test other versions) helped me overcome this problem. 1050 Ti 4 GB + 32 GB RAM

Can we just update PyTorch in the current install? And how is 4 GB of VRAM handling SD3, by the way?

@15Litrov

A fresh manual install with nightly PyTorch (I didn't test other versions) helped me overcome this problem. 1050 Ti 4 GB + 32 GB RAM

Can we just update PyTorch in the current install? And how is 4 GB of VRAM handling SD3, by the way?

Maybe? I did not test.
About performance: 30 s/it for 1024x1024 with dualCLIP.
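For anyone checking which build their current install is actually using before or after an update, a quick sketch (the torch_directml part only applies to DirectML setups):

```python
import torch

print(torch.__version__)          # e.g. "2.3.0+cu121", or "2.4.0.dev..." on a nightly
print(torch.cuda.is_available())  # True on NVIDIA/CUDA builds

# On AMD/DirectML setups the device comes from the torch-directml package instead:
try:
    import torch_directml
    print(torch_directml.device())  # prints as privateuseone:0
except ImportError:
    pass
```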

@kuldp18

kuldp18 commented Jun 12, 2024

Guys, the issue is fixed, please update!

@AlexBenjarmin

Guys, the issue is fixed, please update!

Update ComfyUI?

@KillyTheNetTerminal
Author

Yes, it works now after updating ComfyUI (I used the Manager), but it's very slow per iteration. Is there a way to speed this up?

@KillyTheNetTerminal
Author

[screenshot]

@ltdrdata
Contributor

ltdrdata commented Jun 12, 2024

[screenshot]

You should not use dpmpp_2m with karras.
Just use euler with sgm_uniform.

karras is bad for SD3.

@kuldp18

kuldp18 commented Jun 12, 2024

[screenshot]

You should not use dpmpp_2m with karras. Just use euler with sgm_uniform.

karras is bad for SD3.

The official recommendation is DPM, though. Isn't euler too random for the SD3 architecture?

@ltdrdata
Contributor

[screenshot]

You should not use dpmpp_2m with karras. Just use euler with sgm_uniform.
karras is bad for SD3.

The official recommendation is DPM, though. Isn't euler too random for the SD3 architecture?

https://comfyanonymous.github.io/ComfyUI_examples/sd3/

The official example suggests euler with sgm_uniform.
In my tests, the dpmpp_2m sampler is OK, but the scheduler must be one of normal, simple, sgm_uniform, or ddim_uniform.
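For reference, here is how those settings might look in an API-format workflow, written out as a Python dict; steps and cfg are made-up illustrative values, and only sampler_name and scheduler reflect the advice above:

```python
# Hypothetical KSampler node from an API-format ComfyUI workflow.
ksampler_node = {
    "class_type": "KSampler",
    "inputs": {
        "sampler_name": "euler",     # suggested in the official SD3 example
        "scheduler": "sgm_uniform",  # avoid "karras" with SD3
        "steps": 28,                 # illustrative value
        "cfg": 4.5,                  # illustrative value
        # model / positive / negative / latent_image connections omitted
    },
}
```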

@KillyTheNetTerminal
Author

Same here, the image is still noisy.

@KillyTheNetTerminal
Author

[screenshot]

@ltdrdata
Contributor

[screenshot]

Try CPU mode.

@Wallboy

Wallboy commented Jun 13, 2024

Same issue here, just getting noisy generated images, on a 7900 XTX also running DirectML.

Perhaps SD3 is not working with AMD GPUs/DirectML yet.

@kopaser6463

Same issue. I narrowed it down a little: the variable named out in sampling_function in samplers.py ends up different on CPU vs. DirectML.
Here is the (convoluted) call path to it:
nodes.py -> samplers.py -> KSampler.sample -> sample (a different one) -> CFGGuider.sample -> CFGGuider.inner_sample (sampler.sample(self, sigmas, ...)) -> sampler = sampler_object(self.sampler, just a name) -> sampler_object -> ksampler -> KSAMPLER.sample(self, model_wrap, sigmas, extra_args, ...) -> model_k = KSamplerX0Inpaint(model_wrap, sigmas) ->
model_wrap is self in sampler.sample, so the CFGGuider() call returns self.predict_noise() -> sampling_function(model) -> cfg_function(model) -> out.
It is different on CPU vs. DirectML.
Why? I don't know.
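One way to chase that down is to temporarily print tensor devices at the point where out is computed; a sketch, assuming you edit your local copy of comfy/samplers.py, and debug_devices is a made-up helper name:

```python
import torch

def debug_devices(**tensors):
    # Print the device and dtype of every tensor passed in,
    # to spot a stray cpu tensor among privateuseone:0 ones.
    for name, t in tensors.items():
        if torch.is_tensor(t):
            print(f"{name}: device={t.device}, dtype={t.dtype}")

# Example (temporary, at the top of sampling_function):
# debug_devices(x=x, timestep=timestep)
```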

@Wallboy

Wallboy commented Jun 13, 2024

If anyone wants a working SD3 setup with AMD GPUs in the meantime, look up the ComfyUI ZLUDA fork and use that instead. It works great.

Just be warned that the first generation takes a while, as a bunch of databases get processed. It's similar to A1111 with ZLUDA, where you had the same wait time for your first generation after installing it.

@KillyTheNetTerminal
Author

I've never tried setting up ZLUDA for ComfyUI. Does it speed up generation compared to DirectML? How can I test it?
