[feat] IP Adapters (author @okotaku ) #5713
@marianbastiUNRN I think it is fine for now. Let me know if you're interested in working on this.
Great job! Let's merge 🚀
* add ip-adapter --------- Co-authored-by: okotaku <[email protected]> Co-authored-by: sayakpaul <[email protected]> Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: Patrick von Platen <[email protected]> Co-authored-by: Steven Liu <[email protected]>
I've been working on this for 2 weeks and now it's built in.... Thanks haha
Open a new issue for this. It's not ideal for users to comment on PRs after they have been merged.
This PR seems to break the positional arguments for We might want to clarify this in the release note for the next release.
hi @TonyLianLong
Hello, I'm just starting to program in Python and I still don't understand exactly how to do it correctly.
@yiyixuxu I'm interested in implementing this. Can you guide me through the necessary steps, please?
Hi @yiyixuxu, could you also provide an img2img IP-Adapter sample for SDXL? I always get an error when using SDXL. Thanks!
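Never mind, I figured it out: I need to pass the sd_models image encoder explicitly, like this:
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from transformers import CLIPVisionModelWithProjection

# Load the image encoder from the IP-Adapter checkpoint's models/image_encoder subfolder.
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    <IP-Adapter Model Path>,
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
# Hand that encoder to the SDXL img2img pipeline instead of relying on a default.
pipeline = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    <pretrain model path>,
    torch_dtype=torch.float16,
    image_encoder=image_encoder,
)
pipeline.to("cuda")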
Is it possible to load multiple images as references for the IP-Adapter?
Hey @thibaudart, hope you're doing well. We've just recently opened the Discussions tab on the Diffusers repo: https://github.com/huggingface/diffusers/discussions
of course
For ControlNet and IP-Adapter, I have a question about batched generation, e.g., batch_size = 4. I tried passing lists of images, prompts, generators, etc. into the pipeline, but it failed with an error: ValueError: So maybe batched generation is not supported in this project; I am not sure. Could anyone help me? Thanks.
It would be better to open a new issue for this, and you will need to provide a minimal reproducible example. Without one, all I can say is that the error message says it all: you are passing 4 images to the IP-Adapters but you have only loaded one IP-Adapter. The error probably lies in how you are passing the images for the batch.
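To illustrate the mismatch described above, here is a minimal sketch of a one-entry-per-adapter length check. The helper `check_ip_adapter_inputs` and its error text are hypothetical stand-ins written for this example; they are not the library's code, and the real validation may differ between versions.

```python
def check_ip_adapter_inputs(ip_adapter_image, num_loaded_adapters):
    """Hypothetical helper: expects one top-level entry per loaded IP-Adapter."""
    if not isinstance(ip_adapter_image, list):
        ip_adapter_image = [ip_adapter_image]
    if len(ip_adapter_image) != num_loaded_adapters:
        raise ValueError(
            f"Got {len(ip_adapter_image)} image entries "
            f"for {num_loaded_adapters} loaded IP-Adapter(s)."
        )
    return ip_adapter_image


# Placeholder strings stand in for real images.
batch = ["img0", "img1", "img2", "img3"]

# A flat list of 4 images is read as "4 adapters' worth of input" and is rejected.
try:
    check_ip_adapter_inputs(batch, num_loaded_adapters=1)
    error = None
except ValueError as exc:
    error = str(exc)

# Nesting the batch as a single entry passes the length check.
wrapped = check_ip_adapter_inputs([batch], num_loaded_adapters=1)
```

Passing the length check this way does not by itself mean each batch item gets its own reference image; that depends on how the pipeline interprets the nested list.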
Hello, Mr. asomoza. Thanks for your reply. With your help, I ran some tests, but they still failed, so I opened an issue with the details.
Dear asomoza, it seems that I have figured out my problem. I found that the IP-Adapter embedding step does not treat a batch of images as separate per-sample references; it handles all the images in one batch uniformly. The better way is therefore to embed the adapter images one by one and then concatenate the results with torch.cat. The concatenated embeddings can then be passed into the pipeline so that each image in the batch is generated with its own reference. The details can be seen in issue #7933. Thank you very much.
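A minimal sketch of the embed-one-by-one-then-concatenate idea, using only torch. The `encode_one` function, `EMBED_DIM`, and the tensor shapes are hypothetical stand-ins for a real image encoder, and the "negatives first, then positives" layout for classifier-free guidance is an assumption; check issue #7933 and the documentation for your diffusers version before relying on it.

```python
from __future__ import annotations

import torch

EMBED_DIM = 8  # hypothetical embedding width, chosen only for this demo


def encode_one(image: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical stand-in for an image encoder.

    Returns (negative_embeds, positive_embeds), each shaped
    (1, 1, EMBED_DIM): one sample, one reference image per sample.
    """
    positive = image.flatten()[:EMBED_DIM].reshape(1, 1, EMBED_DIM)
    negative = torch.zeros_like(positive)
    return negative, positive


def batch_ip_adapter_embeds(images: list[torch.Tensor]) -> torch.Tensor:
    """Encode each image on its own, then concatenate along the batch axis."""
    negatives, positives = zip(*(encode_one(img) for img in images))
    # Assumed layout: all negatives first, then all positives,
    # so the result can be split back into two halves with chunk(2).
    return torch.cat([*negatives, *positives], dim=0)


# Four dummy "images", each filled with a distinct constant.
images = [torch.full((3, 4, 4), float(i + 1)) for i in range(4)]
embeds = batch_ip_adapter_embeds(images)
print(embeds.shape)
```

Recent diffusers pipelines accept precomputed embeddings through the `ip_adapter_image_embeds` argument (a list with one tensor per loaded IP-Adapter), so a tensor built this way would be passed as a one-element list when a single adapter is loaded.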
The author of this PR is @okotaku.
The original PR is #4944.
This is a demo of an alternative design (alternative to #4944) that adds the image_projection layer to the UNet.
It works with SD and SDXL.
It works with text-to-image, image-to-image, and inpainting; see the text-to-image example below, and you can find examples for img2img here and inpaint here.
It works with LCM-LoRA out of the box.
It works with ControlNet.
It works with AnimateDiff.