👁️ Vision Fine-tuning
Learn how to fine-tune vision/multimodal LLMs with Unsloth
Disabling Vision / Text-only fine-tuning
To fine-tune only the language layers (text-only), set `finetune_vision_layers = False` when calling `get_peft_model`:
```python
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers     = True,  # False if not finetuning vision layers
    finetune_language_layers   = True,  # False if not finetuning language layers
    finetune_attention_modules = True,  # False if not finetuning attention layers
    finetune_mlp_modules       = True,  # False if not finetuning MLP layers

    r = 16,           # The larger, the higher the accuracy, but might overfit
    lora_alpha = 16,  # Recommended alpha == r at least
    lora_dropout = 0,
    bias = "none",
    random_state = 3407,
    use_rslora = False,   # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
    target_modules = "all-linear",  # Optional now! Can specify a list if needed
    modules_to_save = [
        "lm_head",
        "embed_tokens",
    ],
)
```
Vision Data Collator
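A minimal sketch of wiring the vision collator into a training loop, following the pattern used in Unsloth's public notebooks (the `UnslothVisionDataCollator` import path and the TRL `SFTConfig` parameter names are assumptions; check your installed Unsloth and TRL versions):

```python
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

# The collator handles image preprocessing and padding for vision models,
# so the dataset can stay in the raw {"messages": [...]} conversation format.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    data_collator = UnslothVisionDataCollator(model, tokenizer),
    train_dataset = converted_dataset,  # conversation-format dataset
    args = SFTConfig(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 30,
        learning_rate = 2e-4,
        output_dir = "outputs",
        # Settings typically required for vision fine-tuning,
        # since the collator prepares the batches itself:
        remove_unused_columns = False,
        dataset_text_field = "",
        dataset_kwargs = {"skip_prepare_dataset": True},
        max_seq_length = 2048,
    ),
)
```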
Vision Fine-tuning Dataset
A vision fine-tuning dataset typically pairs each image with text, for example a table with `Image` and `Caption` columns.
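Such image/caption pairs must be converted into the multimodal conversation format before training. A minimal sketch, assuming one image and one caption per sample (the `convert_to_conversation` helper name, column names, and instruction text are illustrative):

```python
instruction = "Describe this image."  # illustrative prompt

def convert_to_conversation(sample):
    # Each message's content is a list of parts; images and text are
    # separate parts distinguished by a "type" field.
    conversation = [
        {"role": "user", "content": [
            {"type": "text",  "text": instruction},
            {"type": "image", "image": sample["Image"]},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": sample["Caption"]},
        ]},
    ]
    return {"messages": conversation}

# Apply to every sample, e.g.:
# converted_dataset = [convert_to_conversation(s) for s in dataset]
```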
Multi-image training
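To train on several images per example, include multiple image parts in a single user turn. A sketch using the same conversation format as above (the helper name is illustrative):

```python
def multi_image_message(images, question):
    # One user turn containing several images followed by the question text;
    # each image becomes its own content part.
    content = [{"type": "image", "image": img} for img in images]
    content.append({"type": "text", "text": question})
    return {"role": "user", "content": content}
```

Keep in mind that every extra image adds image tokens to the sequence, so multi-image examples may need a larger `max_seq_length`.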
🔎 Training on assistant responses only for vision models (VLMs)
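Unsloth's `train_on_responses_only` masks the loss on everything except the assistant's reply, so the model is not trained to reproduce the user prompt. A sketch applied to an existing trainer; the chat-template marker strings below are Llama-3-style assumptions and must match your model's actual template:

```python
from unsloth.chat_templates import train_on_responses_only

# instruction_part / response_part must exactly match the markers your
# model's chat template emits around user and assistant turns.
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|start_header_id|>user<|end_header_id|>\n\n",
    response_part = "<|start_header_id|>assistant<|end_header_id|>\n\n",
)
```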