
FLUX LoRA with CLIP text-encoder weights fails (empty rank -> IndexError) under transformers>=5 #13984

@christopher5106

Describe the bug

Loading a kohya-style FLUX LoRA that contains CLIP text-encoder weights (lora_te1_*) fails with IndexError: list index out of range when transformers>=5 is installed.

The text-encoder half of the LoRA is never loaded; pipe.load_lora_weights(...) raises before any adapter is injected.

Root cause

transformers>=5 flattened CLIPTextModel: the text_model. wrapper module was removed, so text_encoder.named_modules() now yields names like encoder.layers.0.self_attn.k_proj instead of text_model.encoder.layers.0.self_attn.k_proj.

The kohya→diffusers conversion (_convert_kohya_flux_lora_to_diffusers) still produces text-encoder keys prefixed with text_model., e.g. text_encoder.text_model.encoder.layers.0.self_attn.k_proj.lora_B.weight.

In diffusers/loaders/lora_base.py::_load_lora_into_text_encoder, the rank dict is built by matching text_encoder.named_modules() against the (converted, PEFT-format) state-dict keys:

for name, _ in text_encoder.named_modules():
    if name.endswith((".q_proj", ".k_proj", ".v_proj", ".out_proj", ".fc1", ".fc2")):
        rank_key = f"{name}.lora_B.weight"   # e.g. "encoder.layers.0.self_attn.k_proj.lora_B.weight"
        if rank_key in state_dict:           # but keys are "text_model.encoder.layers.0...."
            rank[rank_key] = state_dict[rank_key].shape[1]

Under transformers 5 the module names no longer carry the text_model. prefix while the state-dict keys still do, so nothing matches and rank stays empty. _create_lora_config → get_peft_kwargs then evaluates r = lora_alpha = list(rank_dict.values())[0], which raises IndexError.
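The mismatch can be reproduced without downloading any weights. The sketch below is a toy stand-in, not the diffusers code: build_rank is a hypothetical helper that mirrors the loop quoted above, tensor shapes are replaced by plain tuples, and the rank value 16 is made up for illustration. It assumes the state-dict keys have already had the leading text_encoder. stripped, as the comment in the quoted loop suggests.

```python
# Toy reproduction of the empty-rank condition; no torch or diffusers required.
SUFFIXES = (".q_proj", ".k_proj", ".v_proj", ".out_proj", ".fc1", ".fc2")


def build_rank(module_names, state_dict):
    """Hypothetical mirror of the matching loop, with tuples standing in for tensors."""
    rank = {}
    for name in module_names:
        if name.endswith(SUFFIXES):
            rank_key = f"{name}.lora_B.weight"
            if rank_key in state_dict:
                # lora_B has shape (out_features, rank), so index 1 is the LoRA rank.
                rank[rank_key] = state_dict[rank_key][1]
    return rank


# Converted LoRA key that still carries the "text_model." prefix (illustrative shape).
state_dict = {"text_model.encoder.layers.0.self_attn.k_proj.lora_B.weight": (768, 16)}

names_v4 = ["text_model.encoder.layers.0.self_attn.k_proj"]  # transformers 4.x naming
names_v5 = ["encoder.layers.0.self_attn.k_proj"]             # transformers>=5 naming

rank_v4 = build_rank(names_v4, state_dict)  # one match
rank_v5 = build_rank(names_v5, state_dict)  # no match, so the dict is empty

try:
    r = list(rank_v5.values())[0]  # same expression as in get_peft_kwargs
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__
```

With the 4.x-style names the key is found and the rank is recorded; with the 5.x-style names the lookup misses and indexing the empty list raises the IndexError shown in the traceback.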

Reproduction

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
# any kohya FLUX LoRA that includes CLIP text-encoder weights (lora_te1_*)
pipe.load_lora_weights("kohya_flux_lora_with_text_encoder.safetensors")

Minimal isolation of the empty-rank step:

from transformers import CLIPTextModel
te = CLIPTextModel.from_pretrained("black-forest-labs/FLUX.1-dev", subfolder="text_encoder")
print(hasattr(te, "text_model"))                       # False on transformers>=5 (was True on 4.x)
print([n for n, _ in te.named_modules()][:6])          # ['', 'embeddings', ..., 'encoder', 'encoder.layers']
# -> module names are now un-prefixed, but converted LoRA keys are still 'text_model.encoder....'

Traceback

File ".../diffusers/loaders/lora_base.py", line 390, in _load_lora_into_text_encoder
    lora_config = _create_lora_config(state_dict, network_alphas, metadata, rank, is_unet=False)
File ".../diffusers/utils/peft_utils.py", line 355, in _create_lora_config
    lora_config_kwargs = get_peft_kwargs(...)
File ".../diffusers/utils/peft_utils.py", line 158, in get_peft_kwargs
    r = lora_alpha = list(rank_dict.values())[0]
IndexError: list index out of range

Suggested fix

Reconcile the converted text-encoder key namespace with the model's named_modules() for transformers>=5 (drop/normalize the stale text_model. segment), or build rank in a prefix-tolerant way. A clearer error than a bare IndexError when rank ends up empty would also help.
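One possible shape for such a fix, as a minimal sketch rather than a patch against diffusers: normalize_text_encoder_keys and first_rank are hypothetical names, the state dict is modeled as a plain dict, and the only assumption about the model is the list of names returned by named_modules(). The prefix is dropped only when the model itself no longer uses it, so behavior under transformers 4.x would be unchanged.

```python
STALE_PREFIX = "text_model."


def normalize_text_encoder_keys(state_dict, module_names, stale_prefix=STALE_PREFIX):
    """Hypothetical helper: strip `stale_prefix` from keys if the model lacks it."""
    model_has_prefix = any(name.startswith(stale_prefix) for name in module_names)
    if model_has_prefix:
        # transformers 4.x layout: keys already line up with module names.
        return dict(state_dict)
    return {
        (key[len(stale_prefix):] if key.startswith(stale_prefix) else key): value
        for key, value in state_dict.items()
    }


def first_rank(rank_dict):
    """Hypothetical guard: fail with a descriptive message instead of IndexError."""
    if not rank_dict:
        raise ValueError(
            "No LoRA keys matched the text encoder's module names; "
            "check for a key-prefix mismatch (e.g. a stale 'text_model.' segment)."
        )
    return next(iter(rank_dict.values()))


# Illustrative inputs: a prefixed LoRA key and un-prefixed (transformers>=5) module names.
lora_keys = {"text_model.encoder.layers.0.self_attn.k_proj.lora_B.weight": (768, 16)}
names_v5 = ["", "encoder", "encoder.layers.0.self_attn.k_proj"]
names_v4 = ["", "text_model", "text_model.encoder.layers.0.self_attn.k_proj"]

fixed_v5 = normalize_text_encoder_keys(lora_keys, names_v5)
fixed_v4 = normalize_text_encoder_keys(lora_keys, names_v4)
```

Normalizing the keys once, before the rank dict is built, keeps the matching loop itself untouched; the guard covers any remaining case where nothing matches.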

System Info

  • diffusers 0.38.0
  • transformers 5.9.0
  • peft 0.19.1
  • torch 2.10–2.11 (+cu128), Python 3.11/3.12, Linux
  • Model: black-forest-labs/FLUX.1-dev

Who can help?

@sayakpaul @BenjaminBossan
