@leisuzz
Contributor

@leisuzz leisuzz commented Dec 18, 2025

What does this PR do?

  1. I got the error:
    raise ValueError(f"Expected image_latents to be a list, got {type(image_latents)}.")
    (1) cond_model_input_list is passed to "_prepare_image_ids" as a list of the form [[1, cond_model_input[0], cond_model_input[1], cond_model_input[2]], ...]
    (2) Because "_prepare_image_ids" in the pipeline calls torch.cat(image_latent_ids, dim=0), its output causes a shape mismatch in the training step at model_input_ids = torch.cat([model_input_ids, cond_model_input_ids], dim=1): cond_model_input_ids.shape[0] is 1, but model_input_ids.shape[0] is the batch size. The cond_model_input_ids.view call reshapes the ids so that the batch dimensions match.
    With this change, training also works when the batch size is greater than 1.
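
The shape mismatch described in (2) can be reproduced with stand-in tensors. This is a minimal sketch, not the training script: the sizes are hypothetical, and the ids are built by hand to mimic the "concatenate along dim 0, leading dimension of 1" layout described above rather than by calling the real "_prepare_image_ids".

```python
import torch

batch_size, seq_len, id_dim = 2, 6, 4  # hypothetical sizes, for illustration only

# Ids for the noisy latents: one block of ids per batch element.
model_input_ids = torch.zeros(batch_size, seq_len, id_dim, dtype=torch.long)

# Stand-in for the condition ids (assumed layout): per-image ids concatenated
# along dim 0, then given a leading dimension of 1.
per_image_ids = [torch.full((seq_len, id_dim), i, dtype=torch.long) for i in range(batch_size)]
cond_model_input_ids = torch.cat(per_image_ids, dim=0).unsqueeze(0)  # (1, batch_size * seq_len, id_dim)

# Concatenating along dim=1 requires every other dimension to match,
# so a batch dimension of 1 vs. batch_size raises an error.
try:
    torch.cat([model_input_ids, cond_model_input_ids], dim=1)
    cat_failed = False
except RuntimeError:
    cat_failed = True

# Reshape so each batch element gets its own block of condition ids.
cond_model_input_ids = cond_model_input_ids.view(batch_size, -1, id_dim)
combined_ids = torch.cat([model_input_ids, cond_model_input_ids], dim=1)
print(cat_failed, tuple(combined_ids.shape))
```

With a batch size of 1 the first torch.cat would succeed, which is why the problem only shows up for larger batches.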

  2. When I only changed cond_model_input to a list, the training loss was abnormal (starting at ~1.7, which is too high). So I fixed the model prediction handling to match the pipeline, and the loss became reasonable (starting at ~0.4).
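
To make the shapes involved in this fix concrete, here is a small sketch with stand-in tensors. The sizes are hypothetical, packed_cond_model_input is an assumed name for the packed condition latents, and the transformer call is replaced by a placeholder that returns one output token per input token.

```python
import torch

# Hypothetical sizes, for illustration only.
batch_size, noisy_len, cond_len, channels, id_dim = 2, 6, 6, 8, 4

packed_noisy_model_input = torch.randn(batch_size, noisy_len, channels)
packed_cond_model_input = torch.randn(batch_size, cond_len, channels)  # assumed name

# Noisy and condition tokens are concatenated along the sequence dimension.
transformer_input = torch.cat([packed_noisy_model_input, packed_cond_model_input], dim=1)
model_input_ids = torch.zeros(batch_size, noisy_len + cond_len, id_dim, dtype=torch.long)

# Placeholder for the transformer output: same sequence length as its input.
model_pred = torch.zeros_like(transformer_input)

# Keep only the tokens that belong to the noisy latents, in both tensors.
noisy_seq_len = packed_noisy_model_input.size(1)
model_pred = model_pred[:, :noisy_seq_len, :]
model_input_ids = model_input_ids[:, :noisy_seq_len, :]
print(tuple(model_pred.shape), tuple(model_input_ids.shape))
```

After slicing, the prediction and the ids both cover the same noisy_seq_len tokens, so they can be unpacked together.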

With the code:

model_pred = model_pred[:, :noisy_seq_len, :]
model_input_ids = model_input_ids[:, :noisy_seq_len, :]

The training loss is:

Steps:   0%|          | 1/5000 [00:29<40:20:41, 29.05s/it]
Steps:   0%|          | 1/5000 [00:29<40:20:41, 29.05s/it, loss=0.328, lr=1e-5]
Steps:   0%|          | 2/5000 [01:00<42:12:33, 30.40s/it, loss=0.328, lr=1e-5]
Steps:   0%|          | 2/5000 [01:00<42:12:33, 30.40s/it, loss=0.835, lr=1e-5]
Steps:   0%|          | 3/5000 [01:29<41:20:34, 29.78s/it, loss=0.835, lr=1e-5]
Steps:   0%|          | 3/5000 [01:29<41:20:34, 29.78s/it, loss=0.254, lr=1e-5]
Steps:   0%|          | 4/5000 [01:58<40:54:41, 29.48s/it, loss=0.254, lr=1e-5]
Steps:   0%|          | 4/5000 [01:58<40:54:41, 29.48s/it, loss=0.405, lr=1e-5]
Steps:   0%|          | 5/5000 [02:27<40:43:31, 29.35s/it, loss=0.405, lr=1e-5]
Steps:   0%|          | 5/5000 [02:27<40:43:31, 29.35s/it, loss=1.03, lr=1e-5] 
Steps:   0%|          | 6/5000 [02:53<39:12:51, 28.27s/it, loss=1.03, lr=1e-5]
Steps:   0%|          | 6/5000 [02:53<39:12:51, 28.27s/it, loss=0.574, lr=1e-5]
Steps:   0%|          | 7/5000 [03:20<38:17:51, 27.61s/it, loss=0.574, lr=1e-5]
Steps:   0%|          | 7/5000 [03:20<38:17:51, 27.61s/it, loss=0.29, lr=1e-5] 
Steps:   0%|          | 8/5000 [03:49<38:54:26, 28.06s/it, loss=0.29, lr=1e-5]
Steps:   0%|          | 8/5000 [03:49<38:54:26, 28.06s/it, loss=0.393, lr=1e-5]

With the original code:

model_pred = model_pred[:, : packed_noisy_model_input.size(1) :]
model_pred = Flux2Pipeline._unpack_latents_with_ids(model_pred, model_input_ids)

The training loss is:

Steps:   0%|          | 1/5000 [00:46<64:57:32, 46.78s/it]
Steps:   0%|          | 1/5000 [00:46<64:57:32, 46.78s/it, loss=2.01, lr=1e-5]
Steps:   0%|          | 2/5000 [01:15<50:29:04, 36.36s/it, loss=2.01, lr=1e-5]
Steps:   0%|          | 2/5000 [01:15<50:29:04, 36.36s/it, loss=2.08, lr=1e-5]
Steps:   0%|          | 3/5000 [01:47<47:31:01, 34.23s/it, loss=2.08, lr=1e-5]
Steps:   0%|          | 3/5000 [01:47<47:31:01, 34.23s/it, loss=1.83, lr=1e-5]
Steps:   0%|          | 4/5000 [02:18<45:54:39, 33.08s/it, loss=1.83, lr=1e-5]
Steps:   0%|          | 4/5000 [02:18<45:54:39, 33.08s/it, loss=1.99, lr=1e-5]
Steps:   0%|          | 5/5000 [02:47<43:39:23, 31.46s/it, loss=1.99, lr=1e-5]
Steps:   0%|          | 5/5000 [02:47<43:39:23, 31.46s/it, loss=2.02, lr=1e-5]
Steps:   0%|          | 6/5000 [03:16<42:28:13, 30.62s/it, loss=2.02, lr=1e-5]
Steps:   0%|          | 6/5000 [03:16<42:28:13, 30.62s/it, loss=2.01, lr=1e-5]
Steps:   0%|          | 7/5000 [03:42<40:32:24, 29.23s/it, loss=2.01, lr=1e-5]
Steps:   0%|          | 7/5000 [03:42<40:32:24, 29.23s/it, loss=1.83, lr=1e-5]
Steps:   0%|          | 8/5000 [04:12<40:37:29, 29.30s/it, loss=1.83, lr=1e-5]
Steps:   0%|          | 8/5000 [04:12<40:37:29, 29.30s/it, loss=1.92, lr=1e-5]

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@leisuzz leisuzz changed the title Bugfix for dreambooth flux2 img2img2 Dec 18, 2025
@leisuzz
Contributor Author

leisuzz commented Dec 18, 2025

@sayakpaul Please take a look at this PR. Thank you for your help!

@sayakpaul
Member

Do you have a reproducer?

@leisuzz
Contributor Author

leisuzz commented Dec 18, 2025

@sayakpaul I've updated the result in the description, thanks :)
