
Conversation

@imenelydiaker (Collaborator) commented Oct 13, 2025

When deploying a VLLM server on a different node/GPU than the one running the agents, the endpoint base URL cannot be http://0.0.0.0:8000.

class VLLMChatModel(ChatModel):
    def __init__(
        self,
        model_name,
        api_key=None,
        temperature=0.5,
        max_tokens=100,
        n_retry_server=4,
        min_retry_wait_time=60,
    ):
        super().__init__(
            model_name=model_name,
            api_key=api_key,
            temperature=temperature,
            max_tokens=max_tokens,
            max_retry=n_retry_server,
            min_retry_wait_time=min_retry_wait_time,
            api_key_env_var="VLLM_API_KEY",
            client_class=OpenAI,
            client_args={"base_url": "http://0.0.0.0:8000/v1"},
        )

This PR introduces a new environment variable, VLLM_API_URL, which allows using a custom endpoint URL or falling back to the default local server.
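As a minimal usage sketch (the hostname, model name, and import path below are illustrative assumptions, not prescribed by this PR), the variable just needs to be set before the model is constructed:

import os

# Assumed address of the remote node serving VLLM; replace with your own.
os.environ["VLLM_API_URL"] = "http://gpu-node-1:8000/v1"

from agentlab.llm.chat_api import VLLMChatModel  # path per the file reviewed below

# With VLLM_API_URL unset, the client falls back to http://localhost:8000/v1.
model = VLLMChatModel(model_name="my-model")

In practice the variable would typically be exported in the shell before launching the agents; setting it in-process works here because the merged code reads it at instantiation time.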

Description by Korbit AI

What change is being made?

Allow configuring the VLLM endpoint URL via environment variable VLLM_API_URL, defaulting to http://localhost:8000/v1 if not set.

Why are these changes being made?

To enable configuring the VLLM backend URL without code changes, using a sensible default when the variable is not provided.



@korbit-ai (bot) left a comment


Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.

Category: Performance
Issue: Repeated environment variable lookup on model instantiation

Files scanned: src/agentlab/llm/chat_api.py


  api_key_env_var="VLLM_API_KEY",
  client_class=OpenAI,
- client_args={"base_url": "http://0.0.0.0:8000/v1"},
+ client_args={"base_url": os.getenv("VLLM_API_URL", "http://localhost:8000/v1")},


Repeated environment variable lookup on model instantiation (category: Performance)

What is the issue?

The os.getenv() call is executed on every VLLMChatModel instantiation, performing an unnecessary environment variable lookup each time.

Why this matters

This creates redundant system calls when multiple VLLMChatModel instances are created, as the environment variable is unlikely to change during program execution. The overhead becomes more significant in scenarios with frequent model instantiation.

Suggested change

Cache the environment variable lookup at module level or class level to avoid repeated os.getenv() calls:

# At module level
VLLM_BASE_URL = os.getenv("VLLM_API_URL", "http://localhost:8000/v1")

# Then in __init__:
client_args={"base_url": VLLM_BASE_URL}
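A hedged alternative, if the variable should remain settable after import: cache the lookup lazily rather than at import time (a sketch, not part of this PR; the helper name is hypothetical):

import os
from functools import lru_cache

@lru_cache(maxsize=1)
def _vllm_base_url() -> str:
    # Read VLLM_API_URL once on first use, then reuse the cached value.
    return os.getenv("VLLM_API_URL", "http://localhost:8000/v1")

# Then in __init__:
# client_args={"base_url": _vllm_base_url()}

The trade-off: a module-level constant fixes the URL at import time, so anything that sets VLLM_API_URL afterwards (as in the sketch above) would be ignored; the lazy variant defers the read to first instantiation while still avoiding repeated lookups.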



@imenelydiaker deleted the vllm-config branch October 14, 2025 01:03
@imenelydiaker restored the vllm-config branch October 14, 2025 01:03
@imenelydiaker reopened this Oct 14, 2025
@amanjaiswal73892 (Collaborator) left a comment


LGTM

@amanjaiswal73892 merged commit 9b6a33f into ServiceNow:main Oct 14, 2025
13 checks passed
