Fix Anthropic max_tokens issue causing slow execution (issue #3807) #3808
Summary
This PR fixes issue #3807 where Anthropic models were always reaching the max_tokens limit, causing extremely slow execution times.
Root Cause: In v1.0+, `max_tokens` defaulted to 4096 and was always passed to the Anthropic API, causing the model to generate up to 4096 tokens even for simple queries that should only need a few tokens (e.g., "Answer in one sentence: What is CrewAI?").

The Fix:

- Changed the `max_tokens` parameter from `int = 4096` to `int | None = None` (optional)
- In `_prepare_completion_params()`: when `max_tokens` is not set, a small dynamic default (1024 or 2048, depending on the model's context window) is used instead of the fixed 4096 (sketched below)
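For illustration, a minimal sketch of the dynamic-default logic described above (the helper name and the exact threshold comparison are assumptions for readability, not the PR's literal diff):

```python
def resolve_max_tokens(explicit_max_tokens: int | None, context_window: int) -> int:
    """Pick the max_tokens value sent to the Anthropic API.

    Anthropic's Messages API requires max_tokens, so when the caller does
    not set one we fall back to a small dynamic default instead of the old
    fixed 4096.
    """
    if explicit_max_tokens is not None:
        # Explicit user values pass through unchanged (backwards compatibility).
        return explicit_max_tokens
    # Dynamic default: a larger budget for large-context models.
    return 2048 if context_window > 100_000 else 1024
```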
Changes:

- `lib/crewai/src/crewai/llms/providers/anthropic/completion.py`: modified `__init__` and `_prepare_completion_params`
- `lib/crewai/tests/llms/anthropic/test_anthropic.py`: added 5 comprehensive tests covering all `max_tokens` scenarios

Review & Testing Checklist for Human
IMPORTANT: All tests in this PR use mocks. Real API testing is critical to validate the fix actually works.
- Test with a real Anthropic API key: create a simple crew/agent with an Anthropic LLM and verify that short queries complete quickly and no longer generate up to the max_tokens ceiling (see the sketch under "Suggested Test Plan" below).
- Verify dynamic max_tokens values are appropriate: the defaults (1024/2048) are somewhat arbitrary; consider whether these are the right values to ship.
- Check backwards compatibility: verify that users who explicitly set `max_tokens` (e.g., `LLM(model="anthropic/...", max_tokens=8192)`) still get their specified value passed through unchanged (a mock-style sketch follows this checklist).
- Review the context window threshold: the 100k threshold for deciding between 1024 and 2048 tokens is hardcoded. Verify this makes sense across the Claude model family.
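The PR's tests are mock-based; below is a minimal pytest-style sketch of the scenarios they need to cover, written against the hypothetical `resolve_max_tokens` helper sketched above rather than the PR's actual test code:

```python
# Assumes the resolve_max_tokens sketch from the summary above is in scope.

def test_explicit_max_tokens_passes_through_unchanged():
    # Backwards compatibility: user-specified values are never overridden.
    assert resolve_max_tokens(8192, context_window=200_000) == 8192

def test_dynamic_default_small_context_window():
    # At or below the 100k threshold, the smaller 1024 default applies.
    assert resolve_max_tokens(None, context_window=8_192) == 1024

def test_dynamic_default_large_context_window():
    # Above the 100k threshold, the larger 2048 default applies.
    assert resolve_max_tokens(None, context_window=200_000) == 2048
```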
Suggested Test Plan
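A minimal manual check with a real API key could look like the following (assuming the public `crewai.LLM` interface; the model id is illustrative):

```python
import time

from crewai import LLM

# With max_tokens unset, the provider should now choose a small dynamic
# default instead of the old fixed 4096, so this should return in seconds.
llm = LLM(model="anthropic/claude-3-5-sonnet-20241022")
start = time.time()
print(llm.call("Answer in one sentence: What is CrewAI?"))
print(f"elapsed: {time.time() - start:.1f}s")

# An explicitly set max_tokens must still be passed through unchanged.
llm_explicit = LLM(model="anthropic/claude-3-5-sonnet-20241022", max_tokens=8192)
print(llm_explicit.call("Answer in one sentence: What is CrewAI?"))
```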
Notes
Session Info: Requested by João (joao@crewai.com)
Devin Session: https://app.devin.ai/sessions/4a9ba7830b2f45f9b6ce79a9ff43cf14