Commit Graph

1438 Commits

Author SHA1 Message Date
comfyanonymous f4dac8ab6f Wan code small cleanup. 2025-02-27 07:22:42 -05:00
BiologicalExplosion 89253e9fe5
Support Cambricon MLU (#6964)
Co-authored-by: huzhan <huzhan@cambricon.com>
2025-02-26 20:45:13 -05:00
comfyanonymous 3ea3bc8546 Fix wan issues when prompt length is long. 2025-02-26 20:34:02 -05:00
comfyanonymous 0270a0b41c Reduce artifacts on Wan by doing the patch embedding in fp32. 2025-02-26 16:59:26 -05:00
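A minimal sketch of the fp32 patch-embedding idea in this commit, assuming a Conv3d-based patch embed as in typical video DiTs; the module and names are illustrative, not Wan's actual code:

```python
# Illustrative sketch: run the patch-embedding conv in float32 to avoid
# low-precision artifacts, then cast back to the network's compute dtype.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEmbedFP32(nn.Module):
    def __init__(self, in_channels, dim, patch_size):
        super().__init__()
        self.proj = nn.Conv3d(in_channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # Upcast input and weights to fp32 for the conv, return in the original dtype.
        out = F.conv3d(x.float(), self.proj.weight.float(),
                       self.proj.bias.float(), stride=self.proj.stride)
        return out.to(x.dtype)
```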
comfyanonymous c37f15f98e Add fast preview support for Wan models. 2025-02-26 08:56:23 -05:00
comfyanonymous 4bca7367f3 Don't try to use clip_fea on t2v model. 2025-02-26 08:38:09 -05:00
comfyanonymous b6fefe686b Better wan memory estimation. 2025-02-26 07:51:22 -05:00
comfyanonymous fa62287f1f More code reuse in wan.
Fix bug when changing the compute dtype on wan.
2025-02-26 05:22:29 -05:00
comfyanonymous 0844998db3 Slightly better wan i2v mask implementation. 2025-02-26 03:49:50 -05:00
comfyanonymous 4ced06b879 WIP support for Wan I2V model. 2025-02-26 01:49:43 -05:00
comfyanonymous cb06e9669b Wan seems to work with fp16. 2025-02-25 21:37:12 -05:00
comfyanonymous 9a66bb972d Make wan work with all latent resolutions.
Cleanup some code.
2025-02-25 19:56:04 -05:00
comfyanonymous ea0f939df3 Fix issue with wan and other attention implementations. 2025-02-25 19:13:39 -05:00
comfyanonymous f37551c1d2 Change wan rope implementation to the flux one.
Should be more compatible.
2025-02-25 19:11:14 -05:00
comfyanonymous 63023011b9 WIP support for Wan t2v model. 2025-02-25 17:20:35 -05:00
comfyanonymous f40076096e Cleanup some lumina te code. 2025-02-25 04:10:26 -05:00
comfyanonymous 96d891cb94 Speedup on some models by not upcasting bfloat16 to float32 on mac. 2025-02-24 05:41:32 -05:00
comfyanonymous ace899e71a Prioritize fp16 compute when using allow_fp16_accumulation 2025-02-23 04:45:54 -05:00
comfyanonymous aff16532d4 Remove some useless code. 2025-02-22 04:45:14 -05:00
comfyanonymous 072db3bea6 Assume the mac black image bug won't be fixed before v16. 2025-02-21 20:24:07 -05:00
comfyanonymous a6deca6d9a Latest mac still has the black image bug. 2025-02-21 20:14:30 -05:00
comfyanonymous 41c30e92e7 Let all model memory be offloaded on nvidia. 2025-02-21 06:32:21 -05:00
comfyanonymous 12da6ef581 Apparently directml supports fp16. 2025-02-20 09:30:24 -05:00
Silver c5be423d6b
Fix link pointing to non-existing docs (#6891)
* Fix link pointing to non-existing docs

The current link is pointing to a path that no longer exists.
I changed it to point to the correct path for custom node datatypes.

* Update node_typing.py
2025-02-20 07:07:07 -05:00
maedtb 5715be2ca9
Fix Hunyuan unet config detection for some models. (#6877)
The change to support 32 channel hunyuan models is missing the `key_prefix` on the key.

This addresses a complaint in the comments of acc152b674.
2025-02-19 07:14:45 -05:00
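The gist of that fix, sketched; the key name below is an assumption based on typical ComfyUI model-detection code, not necessarily the exact Hunyuan key:

```python
def detect_in_channels(state_dict, key_prefix=""):
    # The fix in miniature: include key_prefix when probing the state dict.
    # Checking the bare key silently misses checkpoints stored under a prefix.
    key = "{}img_in.proj.weight".format(key_prefix)  # illustrative key name
    if key in state_dict:
        return state_dict[key].shape[1]  # e.g. distinguishes 16 vs 32 channel variants
    return None
```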
bymyself afc85cdeb6
Add Load Image Output node (#6790)
* add LoadImageOutput node

* add route for input/output/temp files

* update node_typing.py

* use literal type for image_folder field

* mark node as beta
2025-02-18 17:53:01 -05:00
Jukka Seppänen acc152b674
Support loading and using SkyReels-V1-Hunyuan-I2V (#6862)
* Support SkyReels-V1-Hunyuan-I2V

* VAE scaling

* Fix T2V

oops

* Proper latent scaling
2025-02-18 17:06:54 -05:00
comfyanonymous b07258cef2 Fix typo.
Let me know if this slows things down on 2000 series and below.
2025-02-18 07:28:33 -05:00
comfyanonymous 31e54b7052 Improve AMD arch detection. 2025-02-17 04:53:40 -05:00
comfyanonymous 8c0bae50c3 bf16 manual cast works on old AMD. 2025-02-17 04:42:40 -05:00
comfyanonymous 530412cb9d Refactor torch version checks to be more future proof. 2025-02-17 04:36:45 -05:00
comfyanonymous e2919d38b4 Disable bf16 on AMD GPUs that don't support it. 2025-02-16 05:46:10 -05:00
comfyanonymous 1cd6cd6080 Disable pytorch attention in VAE for AMD. 2025-02-14 05:42:14 -05:00
comfyanonymous d7b4bf21a2 Auto enable mem efficient attention on gfx1100 on pytorch nightly 2.7
I'm not sure which arches are supported yet. If you see improvements in
memory usage while using --use-pytorch-cross-attention on your AMD GPU, let
me know and I will add it to the list.
2025-02-14 04:18:14 -05:00
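A hedged sketch of what arch-gated enabling can look like on ROCm builds of PyTorch, which expose gcnArchName on device properties; the supported-arch list here is an assumption:

```python
import torch

SUPPORTED_ARCHES = ("gfx1100",)  # assumption: grows as users report results

def mem_efficient_attention_supported(device=0):
    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(device)
    arch = getattr(props, "gcnArchName", "")  # present on ROCm builds
    return any(arch.startswith(a) for a in SUPPORTED_ARCHES)
```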
comfyanonymous 019c7029ea Add a way to set a different compute dtype for the model at runtime.
Currently only works for diffusion models.
2025-02-13 20:34:03 -05:00
comfyanonymous 8773ccf74d Better memory estimation for ROCm devices that support mem efficient attention.
There is no way to check whether the card actually supports it, so it is
assumed that it does if you use --use-pytorch-cross-attention.
2025-02-13 08:32:36 -05:00
comfyanonymous 1d5d6586f3 Fix ruff. 2025-02-12 06:49:16 -05:00
zhoufan2956 35740259de
mix_ascend_bf16_infer_err (#6794) 2025-02-12 06:48:11 -05:00
comfyanonymous ab888e1e0b Add add_weight_wrapper function to model patcher.
Functions can now easily be added to wrap/modify model weights.
2025-02-12 05:55:35 -05:00
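A minimal sketch of the weight-wrapper pattern this describes; the method name mirrors the commit message, but the signature and class are illustrative, not the real ModelPatcher API:

```python
class WeightWrapperRegistry:
    """Toy registry: registered functions transform a weight before it is used."""
    def __init__(self):
        self.wrappers = {}  # weight name -> list of callables

    def add_weight_wrapper(self, name, function):
        self.wrappers.setdefault(name, []).append(function)

    def get_weight(self, name, weight):
        # Apply every registered wrapper in order, e.g. for LoRA-style patches.
        for fn in self.wrappers.get(name, []):
            weight = fn(weight)
        return weight
```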
comfyanonymous d9f0fcdb0c Cleanup. 2025-02-11 17:17:03 -05:00
HishamC b124256817
Fix for running via DirectML (#6542)
* Fix for running via DirectML

Fix the DirectML empty image generation issue with Flux1. Add a CPU fallback for unsupported paths. Verified the model works on AMD GPUs.

* fix formatting

* update causal mask calculation
2025-02-11 17:11:32 -05:00
comfyanonymous af4b7c91be Make --force-fp16 actually force the diffusion model to be fp16. 2025-02-11 08:33:09 -05:00
comfyanonymous 4027466c80 Make lumina model work with any latent resolution. 2025-02-10 00:24:20 -05:00
comfyanonymous 095d867147 Remove useless function. 2025-02-09 07:02:57 -05:00
Pam caeb27c3a5
res_multistep: Fix cfgpp and add ancestral samplers (#6731) 2025-02-08 19:39:58 -05:00
comfyanonymous 3d06e1c555 Make error more clear to user. 2025-02-08 18:57:24 -05:00
catboxanon 43a74c0de1
Allow FP16 accumulation with `--fast` (#6453)
Currently only applies to PyTorch nightly releases (>= 20250208).
2025-02-08 17:00:56 -05:00
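The underlying PyTorch toggle, guarded so it only takes effect on builds that expose it (nightly >= 20250208 per the commit); a sketch, not ComfyUI's exact code:

```python
import torch

# Opt-in fp16 accumulation in matmuls: faster on some GPUs, slightly less
# precise. Only exists on sufficiently new PyTorch builds, hence the guard.
if hasattr(torch.backends.cuda.matmul, "allow_fp16_accumulation"):
    torch.backends.cuda.matmul.allow_fp16_accumulation = True
```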
comfyanonymous 079eccc92a Don't compress HTTP responses by default.
Remove argument to disable it.

Add new --enable-compress-response-body argument to enable it.
2025-02-07 03:29:21 -05:00
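A minimal aiohttp middleware of the kind this describes, gated by an opt-in flag; aiohttp's enable_compression() is a real API, but the wiring here is illustrative:

```python
from aiohttp import web

def make_compress_middleware(enabled):
    @web.middleware
    async def compress_middleware(request, handler):
        response = await handler(request)
        # Compression is now opt-in: only compress when the flag is set.
        if enabled and isinstance(response, web.Response):
            response.enable_compression()
        return response
    return compress_middleware
```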
comfyanonymous 14880e6dba Remove some useless code. 2025-02-06 05:00:37 -05:00
comfyanonymous 37cd448529 Set the shift for Lumina back to 6. 2025-02-05 14:49:52 -05:00
comfyanonymous 94f21f9301 Upcasting rope to fp32 seems to make no difference in this model. 2025-02-05 04:32:47 -05:00
comfyanonymous 60653004e5 Use regular numbers for rope in lumina model. 2025-02-05 04:17:25 -05:00
comfyanonymous a57d635c5f Fix lumina 2 batches. 2025-02-04 21:48:11 -05:00
comfyanonymous 8ac2dddeed Lower the default shift of lumina to reduce artifacts. 2025-02-04 06:50:37 -05:00
comfyanonymous 3e880ac709 Fix on python 3.9 2025-02-04 04:20:56 -05:00
comfyanonymous e5ea112a90 Support Lumina 2 model. 2025-02-04 04:16:30 -05:00
comfyanonymous 44e19a28d3 Use maximum negative value instead of -inf for masks in text encoders.
This is probably more correct.
2025-02-02 09:46:00 -05:00
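The idea in miniature: mask with the dtype's most negative finite value rather than float('-inf'), which avoids inf-related NaNs in some attention kernels. A generic sketch:

```python
import torch

def mask_value(dtype):
    # Largest-magnitude negative finite value representable in this dtype.
    return torch.finfo(dtype).min

scores = torch.randn(1, 8, 8, dtype=torch.float16)
pad = torch.zeros(1, 8, 8, dtype=torch.bool)
pad[..., 6:] = True  # positions to mask out
scores = scores.masked_fill(pad, mask_value(scores.dtype))
# softmax still drives masked positions to ~0 probability, without any -inf.
probs = torch.softmax(scores.float(), dim=-1)
```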
Dr.Lt.Data 0a0df5f136
better guide message for sageattention (#6634) 2025-02-02 09:26:47 -05:00
KarryCharon 24d6871e47
add disable-compres-response-body cli args; add compress middleware; (#6672) 2025-02-02 09:24:55 -05:00
comfyanonymous 9e1d301129 Only use stable cascade lora format with cascade model. 2025-02-01 06:35:22 -05:00
comfyanonymous 8d8dc9a262 Allow batch of different sigmas when noise scaling. 2025-01-30 06:49:52 -05:00
filtered 222f48c0f2
Allow changing folder_paths.base_path via command line argument. (#6600)
* Reimpl. CLI arg directly inside folder_paths.

* Update tests to use CLI arg mocking.

* Revert last-minute refactor.

* Fix test state pollution.
2025-01-29 08:06:28 -05:00
comfyanonymous 13fd4d6e45 More friendly error messages for corrupted safetensors files. 2025-01-28 09:41:09 -05:00
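One common shape for such friendlier errors, assuming safetensors.torch.load_file as the loader; the wording and exception type are illustrative:

```python
import safetensors.torch

def load_safetensors_checked(path):
    try:
        return safetensors.torch.load_file(path)
    except Exception as e:
        # Surface the path and a likely cause instead of a bare parser error.
        raise RuntimeError(
            "Failed to load '{}': the file may be corrupted or only "
            "partially downloaded. Try re-downloading it. ({})".format(path, e)
        ) from e
```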
comfyanonymous 255edf2246 Lower minimum ratio of loaded weights on Nvidia. 2025-01-27 05:26:51 -05:00
comfyanonymous 67feb05299 Remove redundant code. 2025-01-25 19:04:53 -05:00
comfyanonymous 14ca5f5a10 Remove useless code. 2025-01-24 06:15:54 -05:00
comfyanonymous 96e2a45193 Remove useless code. 2025-01-23 05:56:23 -05:00
Chenlei Hu dfa2b6d129
Remove unused function lcm in conds.py (#6572) 2025-01-23 05:54:09 -05:00
comfyanonymous d6bbe8c40f Remove support for python 3.8. 2025-01-22 17:04:30 -05:00
chaObserv e857dd48b8
Add gradient estimation sampler (#6554) 2025-01-22 05:29:40 -05:00
comfyanonymous fb2ad645a3 Add FluxDisableGuidance node to disable using the guidance embed. 2025-01-20 14:50:24 -05:00
comfyanonymous d8a7a32779 Cleanup old TODO. 2025-01-20 03:44:13 -05:00
Sergii Dymchenko ebf038d4fa
Use `torch.special.expm1` (#6388)
* Use `torch.special.expm1`

This function provides greater precision than `exp(x) - 1` for small values of `x`.

Found with TorchFix https://github.com/pytorch-labs/torchfix/

* Use non-alias
2025-01-19 04:54:32 -05:00
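A quick demonstration of the precision claim in that PR:

```python
import torch

x = torch.tensor(1e-10, dtype=torch.float32)
naive = torch.exp(x) - 1          # 0.0: the +1 swamps the tiny exponent term
precise = torch.special.expm1(x)  # ~1e-10, computed without cancellation
print(naive.item(), precise.item())
```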
catboxanon b1a02131c9
Remove comfy.samplers self-import (#6506) 2025-01-18 17:49:51 -05:00
comfyanonymous 507199d9a8 Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
comfyanonymous 2f3ab40b62 Add warning when using old pytorch versions. 2025-01-17 18:47:27 -05:00
comfyanonymous 0aa2368e46 Fix some cosmos fp8 issues. 2025-01-16 17:45:37 -05:00
comfyanonymous cca96a85ae Fix cosmos VAE failing with videos longer than 121 frames. 2025-01-16 16:30:06 -05:00
comfyanonymous 31831e6ef1 Code refactor. 2025-01-16 07:23:54 -05:00
comfyanonymous 88ceb28e20 Tweak hunyuan memory usage factor. 2025-01-16 06:31:03 -05:00
comfyanonymous 23289a6a5c Clean up some debug lines. 2025-01-16 04:24:39 -05:00
comfyanonymous 9d8b6c1f46 More accurate memory estimation for cosmos and hunyuan video. 2025-01-16 03:48:40 -05:00
comfyanonymous 6320d05696 Slightly lower hunyuan video memory usage. 2025-01-16 00:23:01 -05:00
comfyanonymous 25683b5b02 Lower cosmos diffusion model memory usage. 2025-01-15 23:46:42 -05:00
comfyanonymous 4758fb64b9 Lower cosmos VAE memory usage by a bit. 2025-01-15 22:57:52 -05:00
comfyanonymous 008761166f Optimize first attention block in cosmos VAE. 2025-01-15 21:48:46 -05:00
comfyanonymous cba58fff0b Remove unsafe embedding load for very old pytorch. 2025-01-15 04:32:23 -05:00
comfyanonymous 2feb8d0b77 Force safe loading of files in torch format on pytorch 2.4+
If this breaks something for you, make an issue.
2025-01-15 03:50:27 -05:00
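The likely mechanism, sketched: with PyTorch 2.4+, weights_only loading is robust enough to use unconditionally, so pickle-based .pt/.ckpt checkpoints can no longer execute arbitrary code on load. Illustrative, not the repo's exact code:

```python
import torch

def load_torch_file(path):
    # weights_only=True restricts unpickling to tensors and plain containers,
    # closing the arbitrary-code-execution hole in pickle-based checkpoints.
    return torch.load(path, map_location="cpu", weights_only=True)
```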
Pam c78a45685d
Rewrite res_multistep sampler and implement res_multistep_cfg_pp sampler. (#6462) 2025-01-14 18:20:06 -05:00
comfyanonymous 3aaabb12d4 Implement Cosmos Image/Video to World (Video) diffusion models.
Use CosmosImageToVideoLatent to set the input image/video.
2025-01-14 05:14:10 -05:00
comfyanonymous 1f1c7b7b56 Remove useless code. 2025-01-13 03:52:37 -05:00
comfyanonymous 90f349f93d Add res_multistep sampler from the cosmos code.
This sampler should work with all models.
2025-01-12 03:10:07 -05:00
Jedrzej Kosinski 6c9bd11fa3
Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377)
* Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas'

* Cleaned up hooks.py, refactored Hook.should_register and add_hook_patches to use target_dict instead of target so that more information can be provided about the current execution environment if needed

* Refactor WrapperHook into TransformerOptionsHook, as there is no need to separate out Wrappers/Callbacks/Patches into different hook types (all affect transformer_options)

* Refactored HookGroup to also store a dictionary of hooks separated by hook_type, modified necessary code to no longer need to manually separate out hooks by hook_type

* In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch

* Refactored 'registered' to be HookGroup instead of a list of Hooks, made AddModelsHook operational and compliant with should_register result, moved TransformerOptionsHook handling out of ModelPatcher.register_all_hook_patches, support patches in TransformerOptionsHook properly by casting any patches/wrappers/hooks to proper device at sample time

* Made hook clone code sane, made clear ObjectPatchHook and SetInjectionsHook are not yet operational

* Fix performance of hooks when hooks are appended via Cond Pair Set Props nodes by properly caching between positive and negative conds, make hook_patches_backup behave as intended (in the case that something pre-registers WeightHooks on the ModelPatcher instead of registering it at sample time)

* Filter only registered hooks on self.conds in CFGGuider.sample

* Make hook_scope functional for TransformerOptionsHook

* Removed 4 whitespace lines to satisfy Ruff

* Add a get_injections function to ModelPatcher

* Made TransformerOptionsHook contribute to registered hooks properly, added some doc strings and removed a so-far unused variable

* Rename AddModelsHooks to AdditionalModelsHook, rename SetInjectionsHook to InjectionsHook (not yet implemented, but at least getting the naming figured out)

* Clean up a typehint
2025-01-11 12:20:23 -05:00
comfyanonymous ee8a7ab69d Fast latent preview for Cosmos. 2025-01-11 04:41:24 -05:00
comfyanonymous 2ff3104f70 WIP support for Nvidia Cosmos 7B and 14B text to world (video) models. 2025-01-10 09:14:16 -05:00
comfyanonymous 129d8908f7 Add argument to skip the output reshaping in the attention functions. 2025-01-10 06:27:37 -05:00
comfyanonymous ff838657fa Cleaner handling of attention mask in ltxv model code. 2025-01-09 07:12:03 -05:00
comfyanonymous 2307ff6746 Improve some logging messages. 2025-01-08 19:05:22 -05:00
comfyanonymous d0f3752e33 Properly calculate inner dim for t5 model.
This is required to support some different types of t5 models.
2025-01-07 17:33:03 -05:00
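The likely fix in miniature: T5's attention inner dimension is num_heads * d_kv, which only happens to equal d_model in the standard checkpoints, so variants break that assumption. Config names follow the usual T5 convention; this is a sketch, not the repo's code:

```python
def t5_inner_dim(num_heads, d_kv):
    # Correct: derive from head count and per-head key/value width.
    # Assuming inner_dim == d_model fails on non-standard T5 variants.
    return num_heads * d_kv
```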
comfyanonymous 4209edf48d Make a few more samplers deterministic. 2025-01-07 02:12:32 -05:00