Not working on Apple Silicon (CogVideoX Fun Sampler Implementation) #59
I probably left the fp8 fast mode on; check that and set it to disabled to see if it resolves this. What GPU are you using?
No, it's disabled. I'm on a MacBook, and the issue seems to be that autocast on this hardware isn't supported in any PyTorch release except nightly (as of a week ago), so the autocast to fp16 is breaking things. Oddly, when I went to nightly I started getting errors at prompt_embeds=positive.to(dtype).to(device): positive is a list, and lists don't have a .to method.
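For context, stock ComfyUI conditioning comes through as a nested list rather than a bare tensor, which is why that .to() call trips up. A rough sketch of what the node receives, assuming the usual conditioning structure:

```python
# ComfyUI conditioning is typically [[cond_tensor, {"pooled_output": ...}], ...],
# not a bare tensor, so .to() has to be applied to the tensor inside it.
if isinstance(positive, list):
    prompt_embeds = positive[0][0].to(dtype).to(device)
else:
    prompt_embeds = positive.to(dtype).to(device)
```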
HAHA, I had overlooked that CogVideo was using different text nodes than the stock ones. Swapped to those and now that passes. However, it now seems to break because something is hardcoded to use cuda instead of falling back to mps or cpu when cuda isn't available; I haven't tracked down where yet. I updated pipeline_cogvideox.py in the pipeline where you had a hardcoded torch.device("cuda") to …
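A device fallback along these lines is the general idea (just a sketch using the standard torch availability checks, not necessarily the exact change made):

```python
import torch

# Fall back from cuda to mps to cpu instead of hardcoding torch.device("cuda")
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
```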
Strange thing: in that inpainting file, if I throw in a print to see what device is before it tries to send the VAE to a device, the device is set to device = self._execution_device, and when I print it, it's "cuda:0".
Yeah, I'm not sure where that _execution_device is getting set. Even if I hardcode that instance of it to "mps" or "cpu", it seems it's somehow used elsewhere and it's still trying to force things onto cuda, which Macs don't have.
I think it defaults to cuda if it can't find it from accelerate... dunno why that wouldn't work. You can try just forcing the execution device to mps, though.
If you mean trying self._execution_device = "mps", that won't work; it's apparently not allowed: AttributeError: can't set attribute '_execution_device'. A bit of digging suggests that diffusers returns the device set in the model's _hf_hook, which is returning cuda:0.
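A quick way to see where that cuda:0 comes from is to print what the accelerate hooks report for each pipeline component (a debug sketch; attribute names assumed from accelerate's hook mechanism):

```python
# Print the execution device each accelerate hook reports; _execution_device
# falls back to one of these, which is where a stray cuda:0 can come from.
for name, module in pipe.components.items():
    hook = getattr(module, "_hf_hook", None)
    if hook is not None:
        print(name, getattr(hook, "execution_device", None))
```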
Potentially found the reason: I wasn't calling the …
Yep, that solved that issue. So now with 2.6.0-dev PyTorch (needed for autocast to work in the pipeline) it doesn't give the device error anymore. Great catch, optional properties are so easy to overlook in these codebases. So close to it running, lol, I can feel it! XD Now the hang is at ...
Get the feeling the dtype isn't being passed somewhere it needs to be for float16.
Diffusers 0.30.3 is required for the official I2V model only, not the "Fun" variant. Does that work for you, btw, or is this only an issue with the "Fun" models?
Cleared my folder and pulled the latest from the git repo, then tested the 2B models with the respective sampler. Got very similar errors, but slightly different. With standard 2B (first in the list)...
With Fun 2B...
I tried to resolve the above while running the 5B I2V model; it seems to be a deeper issue within the CogVideo diffusers model or in the MPS implementation of PyTorch (though I can't be sure).
At point 2, I tried forcing the precision to float32 for all tensors, and also forcing them to float16, before the call to: … Might try to set this up on a GPU instance somewhere using an Nvidia card ¯\_(ツ)_/¯
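For reference, forcing a single precision across the pipeline before sampling looks roughly like this (a sketch; component names assumed, with float32 being safest on MPS and float16 halving memory):

```python
# Cast the heavy components and the working latents to one dtype.
dtype = torch.float32  # or torch.float16 to trade stability for memory
pipe.transformer.to(dtype=dtype)
pipe.vae.to(dtype=dtype)
latents = latents.to(dtype=dtype)
```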
Well, float32 precision will likely OOM without any bugs or issues. In bf16 they show 16 GB (confirmed, as it can OOM even on a T4 Colab); they even mention in the Colabs that it can OOM on 16 GB of VRAM and memory. I imagine some of this in this Comfy extension is the tensor shuffling chewing up memory, but I definitely think it needs to run in fp16 to have a chance of running locally on 36 GB. Keep in mind that on Macs offloading doesn't do anything, since VRAM/RAM is unified; we'd have to switch to completely unloading extraneous stuff, not just shifting it to the CPU.
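Something like the ComfyUI helpers below would free memory outright rather than offloading (a sketch; these helpers exist in comfy.model_management, though whether they free enough in this case is untested):

```python
import comfy.model_management as mm

# On unified-memory Macs, actually unload other models and flush the cache
# instead of CPU-offloading, which just moves data within the same pool.
mm.unload_all_models()
mm.soft_empty_cache()
```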
I have a similar issue running the 5B I2V model on a MacBook Pro M3 Max (128 GB RAM, latest Sonoma).
Let me know if you'd prefer I open another issue or run a few tests given this machine's memory. The full error output is:
Can confirm this is an issue on M2 Max chips.
!!! Exception during processing !!! unsupported scalarType
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 519, in process
autocast_context = torch.autocast(mm.get_autocast_device(device)) if autocastcondition else nullcontext()
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 229, in init
dtype = torch.get_autocast_dtype(device_type)
RuntimeError: unsupported scalarType
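One possible workaround sketch: guard the autocast call so MPS either gets an explicit dtype or skips autocast entirely (an assumed wrapper, not the node's actual code):

```python
import torch
from contextlib import nullcontext

def make_autocast(device_type, enabled=True):
    # torch.get_autocast_dtype() has no default for mps on older PyTorch builds,
    # which is what raises "unsupported scalarType"; pass a dtype explicitly and
    # fall back to no autocast if the backend still refuses.
    if not enabled:
        return nullcontext()
    try:
        return torch.autocast(device_type, dtype=torch.float16)
    except RuntimeError:
        return nullcontext()
```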