@DrishtiShrrrma DrishtiShrrrma commented Oct 28, 2025

Refs DrishtiShrrrma#2

Summary

This PR updates the Transformers Version Recommendation so that LLaVA-Next models don't default to "latest". We pin to tested, working versions. This prevents a reproducible crash in which the image processor receives the literal `USER: <image>` placeholder text instead of an image.

What changed

  • Replace the bullet that says "transformers==latest for LLaVA-Next" with:
  • "Use transformers==4.48.0 (recommended) or transformers==4.46.0 for the LLaVA-Next series (e.g., llava-hf/llava-v1.6-vicuna-7b-hf)."
  • Keep transformers==latest for the other model families listed in the README.

Why
Newer transformers builds change image handling and, in our testing, cause LLaVA-Next evaluation to pass the literal prompt placeholder to the image processor, leading to:

```
ValueError: Incorrect image source. Must be a valid URL starting with `http://` or `https://`, a valid path to an image file, or a base64 encoded string. Got USER: <image>
```
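Since the crash only surfaces mid-evaluation, a fail-fast guard can surface the problem at startup instead. The sketch below is a hypothetical helper (not part of this repo, assuming any release newer than 4.48.0 is untested for LLaVA-Next, per the table below):

```python
# Hypothetical guard: raise a clear error at startup instead of hitting the
# opaque ValueError deep inside the image processor. Assumption: transformers
# releases newer than 4.48.0 regress LLaVA-Next image handling.
TESTED_GOOD = {"4.46.0", "4.48.0"}  # versions verified in this PR


def _parse(v: str) -> tuple[int, ...]:
    # Keep only the leading numeric components ("5.0.0.dev0" -> (5, 0, 0)).
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)


def check_llava_next_compat(installed: str) -> None:
    """Raise RuntimeError if `installed` is newer than the pinned versions."""
    if installed in TESTED_GOOD:
        return
    if _parse(installed) > _parse("4.48.0"):
        raise RuntimeError(
            f"transformers=={installed} is untested for LLaVA-Next; "
            "pin transformers==4.48.0 (or 4.46.0) to avoid the "
            "'Incorrect image source' ValueError."
        )


check_llava_next_compat("4.48.0")  # a tested-good pin passes silently
```

This trades a small maintenance burden (updating `TESTED_GOOD` as new releases are verified) for an actionable error message.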

Observations (tested)

| transformers | Result | Error / Notes |
| --- | --- | --- |
| 4.46.0 | ✅ Works | Stable across tested benchmarks |
| 4.48.0 | ✅ Works | Stable across tested benchmarks |
| 4.57.1 | ❌ Fails | `ValueError: Incorrect image source ... Got USER: <image>` |
| 5.0.0.dev0 | ❌ Fails | Same `ValueError` as 4.57.1 |

Benchmarks tested

CountBenchQA, MMBench_DEV_EN, MME, SEEDBench_IMG

Model tested

llava-hf/llava-v1.6-vicuna-7b-hf (key: llava_next_vicuna_7b)

Minimal repro

```bash
# Failing case (example)
pip install "transformers==4.57.1"
python run.py --data CountBenchQA --model llava_next_vicuna_7b --verbose
# -> ValueError: Incorrect image source ... Got USER: <image>

# Working case (example)
pip install "transformers==4.48.0"
python run.py --data CountBenchQA --model llava_next_vicuna_7b --verbose
# -> runs successfully
```

Environment

  • Platform: Google Colab (Python 3.12.12)
  • PyTorch: 2.8.0+cu126 | CUDA: 12.6
  • GPU: NVIDIA L4
docs(readme): pin transformers==4.48.0 (or 4.46.0) for LLaVA-Next