
Conversation

@muzammilvd

This PR introduces a feature that allows users to toggle the "thinking" indicator when using Ollama models.

✨ What's New
Added a config value to enable or disable Ollama's "think" behavior when generating responses. See the example below:

import os

ollama_config = {
    "ollama_host": os.environ.get("OLLAMA_BASE_URL"),  # e.g. "http://localhost:11434"
    "model": os.environ.get("OLLAMA_MODEL"),
    "think": False,  # Default is True
}
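
For context, here is a minimal sketch of how this config could be wired into a Vanna client using the usual vanna mixin pattern (the ChromaDB vector store and the MyVanna class name are illustrative; any vector store works the same way):

from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore

class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)

# "think" is read in Ollama.__init__ along with the other config keys
vn = MyVanna(config=ollama_config)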

Updated relevant logic to respect this flag.

Maintained backward compatibility by keeping the default behavior unchanged (i.e., "thinking" remains enabled unless explicitly turned off).

📚 Why This Is Useful
Thinking models in Ollama (such as Qwen or DeepSeek) emit thinking content alongside the answer, which makes it difficult to parse JSON out of the response. With this flag the thinking can be turned off, so only the message content comes back from Ollama.
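
To illustrate the difference, a minimal sketch using the ollama Python client directly (assumes ollama >= 0.5.0, where chat() accepts a think keyword; the model name and prompt are placeholders):

import json
import ollama

client = ollama.Client(host="http://localhost:11434")  # adjust host as needed
response = client.chat(
    model="deepseek-r1",  # illustrative thinking model
    messages=[{"role": "user", "content": 'Reply with only this JSON: {"answer": 42}'}],
    stream=False,
    think=False,  # suppress thinking content so the reply is plain JSON
)
# Assuming the model complies, the content parses cleanly with no <think> blocks
data = json.loads(response["message"]["content"])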

🧪 Testing
Verified that the feature works correctly with the Ollama backend.
Tested both enabled and disabled states to ensure expected behavior.
No impact on non-Ollama models.

Requires ollama version >= 0.5.0.


@llamapreview bot left a comment


Auto Pull Request Review from LlamaPReview

1. Overview

1.1 Core Changes

  • Primary purpose and scope: Add configuration flag to toggle Ollama's "thinking" indicator in API responses
  • Key components modified: Ollama client integration module (src/vanna/ollama/ollama.py)
  • Cross-component impacts: No impact on non-Ollama models or other components
  • Business value alignment: Improves JSON response parsing for specific Ollama models (Qwen, DeepSeek) by eliminating extraneous thinking content

1.2 Technical Architecture

  • System design modifications: New configuration parameter added to Ollama client options
  • Component interaction changes: Modified Ollama API request format
  • Integration points impact: Only affects Ollama API integration points
  • Dependency changes: Requires Ollama version >= 0.5.0

2. Critical Findings

2.1 Must Fix (P0🔴)

No critical issues found

2.2 Should Fix (P1🟡)

No important improvements required

2.3 Consider (P2🟢)

Area: Documentation for new configuration parameter

  • Analysis Confidence: High
  • Improvement Opportunity: Adding comments would clarify the purpose and usage of the new think parameter for future maintainers

Area: Input validation for think parameter

  • Analysis Confidence: Medium
  • Improvement Opportunity: Explicit type checking would prevent configuration errors and improve robustness; a sketch follows below
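
For illustration, one hedged way such a check could look, mirroring the attribute and config names used in the PR (the validation itself is hypothetical, not part of the submitted code):

    # Hypothetical explicit type check (not in the submitted PR)
    think = config.get('think', True)
    if not isinstance(think, bool):
        raise TypeError(f"config['think'] must be a bool, got {type(think).__name__}")
    self.think = think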

2.4 Summary of Action Items

  • Add documentation comments for new parameter (immediate)
  • Consider adding input validation (next iteration)

3. Technical Analysis

3.1 Code Logic Analysis

📁 src/vanna/ollama/ollama.py - __init__

  • Submitted PR Code:
    self.keep_alive = config.get('keep_alive', None)
    self.ollama_options = config.get('options', {})
    self.num_ctx = self.ollama_options.get('num_ctx', 2048)
    self.think = config.get('think', True)
  • Analysis:
    • Clean implementation with backward-compatible default (True)
    • Missing documentation about parameter purpose and usage
    • No explicit type validation for boolean input
    • Cross-component impact: Only affects Ollama API interactions
  • LlamaPReview Suggested Improvements:
    self.keep_alive = config.get('keep_alive', None)
    self.ollama_options = config.get('options', {})
    self.num_ctx = self.ollama_options.get('num_ctx', 2048)
    
    # Controls display of "thinking" indicators in Ollama responses
    # When disabled, simplifies JSON parsing by removing intermediate thoughts
    # Default: True (backward compatible with previous behavior)
    self.think = config.get('think', True)
  • Improvement rationale:
    • Clarifies purpose and usage for maintainers
    • Documents backward compatibility guarantee
    • Prevents future misuse of parameter

📁 src/vanna/ollama/ollama.py - generate_sql

  • Submitted PR Code:
    response_dict = self.ollama_client.chat(model=self.model,
                                            messages=prompt,
                                            stream=False,
                                            options=self.ollama_options,
                                            keep_alive=self.keep_alive,
                                            think=self.think)
  • Analysis:
    • Correct implementation passing new parameter to Ollama API
    • Maintains existing parameter structure and ordering
    • No impact on other parameters or functionality
    • Business logic: Enables cleaner JSON parsing for specific models
  • LlamaPReview Suggested Improvements: No changes needed

3.2 Key Quality Aspects

  • Testing strategy and coverage: Manual verification was performed, but automated tests would strengthen validation (see the sketch after this list)
  • Documentation needs: Update configuration documentation to include new parameter
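
As a starting point for automated coverage, a minimal pytest-style sketch that exercises the config default in isolation (pure dict logic mirroring config.get('think', True); no live Ollama server is required, and the test names and model value are illustrative):

    def test_think_defaults_to_true():
        config = {"ollama_host": "http://localhost:11434", "model": "qwen3"}
        assert config.get("think", True) is True

    def test_think_can_be_disabled():
        config = {"ollama_host": "http://localhost:11434", "model": "qwen3", "think": False}
        assert config.get("think", True) is False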

4. Overall Evaluation

  • Technical assessment: Well-implemented feature with clean, focused changes
  • Business impact: Improves integration experience for specific Ollama models
  • Risk evaluation: Low risk due to backward-compatible default and localized changes
  • Notable positive aspects: Maintains backward compatibility, clear PR description
  • Implementation quality: Concise and effective modification to existing logic
  • Final recommendation: Approve (with optional documentation enhancement)

💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.

