
Conversation

@sitammeur (Contributor) commented Aug 11, 2025

Summary by CodeRabbit

  • New Features

    • Added support for GPT-5 and Claude Opus 4.1 in the model list.
    • Updated fallback behavior: when a selected model isn’t available, defaults to Claude Opus 4.1 for Model 1.
  • Refactor

    • Enhanced routing so GPT models use a dedicated path, improving reliability and consistency of responses.
  • Style

    • Rebranded UI: page title and main header updated to “⚔️ CodeArena: Compare Codegen Models.”
coderabbitai bot commented Aug 11, 2025

Walkthrough

The UI title/header was updated. The default fallback for model1 changed to “Claude Opus 4.1.” The model catalog replaced “Claude Sonnet 4” and “GPT-4.1” with “Claude Opus 4.1” and “GPT-5.” Response routing now sends “GPT” models to OpenAI (OPENAI_API_KEY, max_completion_tokens) and others to OpenRouter (OPENROUTER_API_KEY, max_tokens).

Changes

  • UI branding & default fallback — code-model-comparison/app.py
    Updated page title/header text. Adjusted the default fallback for model1 to “Claude Opus 4.1” when it is unavailable; the model2 fallback is unchanged.
  • Model catalog & routing logic — code-model-comparison/model_service.py
    Updated AVAILABLE_MODELS (removed “Claude Sonnet 4” and “GPT-4.1”; added “Claude Opus 4.1” and “GPT-5”). get_model_response_async now routes models containing “GPT” to OpenAI with OPENAI_API_KEY and max_completion_tokens=2000; all other models go to OpenRouter with OPENROUTER_API_KEY and max_tokens=2000.
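
For orientation, the routing described above can be sketched as follows. This is an illustration of the described behavior, assuming LiteLLM's acompletion as the client; the dictionary contents mirror the walkthrough rather than the exact file.

import os
from litellm import acompletion

# Display names mapped to LiteLLM identifiers (illustrative subset).
AVAILABLE_MODELS = {
    "Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1",
    "GPT-5": "gpt-5",
}

async def get_model_response_async(model_name: str, prompt: str):
    """Route GPT models to OpenAI; send everything else through OpenRouter."""
    model_mapping = AVAILABLE_MODELS[model_name]
    messages = [{"role": "user", "content": prompt}]
    if "GPT" in model_name:
        # OpenAI path: OPENAI_API_KEY and max_completion_tokens
        return await acompletion(
            model=model_mapping,
            messages=messages,
            api_key=os.getenv("OPENAI_API_KEY"),
            max_completion_tokens=2000,
            stream=True,
        )
    # OpenRouter path: OPENROUTER_API_KEY and max_tokens
    return await acompletion(
        model=model_mapping,
        messages=messages,
        api_key=os.getenv("OPENROUTER_API_KEY"),
        max_tokens=2000,
        stream=True,
    )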

Sequence Diagram(s)

sequenceDiagram
  actor User
  participant App as App UI
  participant MS as ModelService
  participant OA as OpenAI API
  participant OR as OpenRouter API

  User->>App: Select models / submit prompt
  App->>MS: get_model_response_async(model_name, prompt)
  alt model_name contains "GPT"
    MS->>OA: acompletion(api_key=OPENAI_API_KEY, max_completion_tokens=2000)
    OA-->>MS: response stream
  else
    MS->>OR: acompletion(api_key=OPENROUTER_API_KEY, max_tokens=2000)
    OR-->>MS: response stream
  end
  MS-->>App: stream tokens / final text
  App-->>User: display comparison output
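
The final "MS-->>App: stream tokens" hop can be consumed by iterating over the LiteLLM stream. A minimal, hypothetical consumer (the function name is illustrative, not taken from the repository; it assumes get_model_response_async returns the LiteLLM async streaming response directly):

async def collect_stream(response) -> str:
    """Accumulate streamed delta content from a LiteLLM async streaming response."""
    parts = []
    async for chunk in response:
        delta = chunk.choices[0].delta
        if delta is not None and getattr(delta, "content", None):
            parts.append(delta.content)
    return "".join(parts)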

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • Add Code Eval with Opik #166: Modifies the same app default-model fallback and model_service AVAILABLE_MODELS plus routing/API-key logic, indicating a closely related update.

Poem

In CodeArena’s glow, I tap my paws,
New banners raised, with title applause.
GPT hops to OpenAI’s gate,
Others to OpenRouter’s fate.
Opus ascends where Sonnet stood—
A bunny reviews, “All understood!”
Now, compare away—hop-hop, it’s good! 🐇⚔️

@coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (2)
code-model-comparison/model_service.py (1)

75-84: Reflect correct API key environment variable in error messages
Adjust the exception handler in code-model-comparison/model_service.py (around lines 75–84) so that invalid/missing API key errors reference OPENAI_API_KEY when routing to OpenAI and OPENROUTER_API_KEY otherwise.

• File: code-model-comparison/model_service.py
Lines: 75–84

     except Exception as e:
         error_msg = f"Error generating response: {str(e)}"
-        if "api_key" in str(e).lower() or "authentication" in str(e).lower():
-            error_msg = "Error: Invalid or missing API key. Please check your OPENROUTER_API_KEY configuration."
+        if "api_key" in str(e).lower() or "authentication" in str(e).lower():
+            # Dynamically reference the correct env var based on model routing
+            env_var = "OPENAI_API_KEY" if "GPT" in model_name else "OPENROUTER_API_KEY"
+            error_msg = (
+                f"Error: Invalid or missing API key. "
+                f"Please check your {env_var} configuration."
+            )
         elif "quota" in str(e).lower() or "limit" in str(e).lower():
             error_msg = "Error: API quota exceeded or rate limit reached. Please try again later."
         elif "model" in str(e).lower():
             error_msg = f"Error: Model '{model_name}' is not available or has issues. Please try a different model."
code-model-comparison/app.py (1)

65-71: Broken logo URL (double protocol) in “Powered by” strip.

The second image src has a duplicated https://, so the Opik logo won’t render.

-        <img src="https://files.buildwithfern.com/https://opik.docs.buildwithfern.com/docs/opik/2025-08-01T07:08:31.326Z/img/logo-dark-mode.svg" width="100">
+        <img src="https://files.buildwithfern.com/opik.docs.buildwithfern.com/docs/opik/2025-08-01T07:08:31.326Z/img/logo-dark-mode.svg" width="100">
🧹 Nitpick comments (1)
code-model-comparison/app.py (1)

91-94: Unreachable fallback branch; simplify default selection.

Because you stop execution when all_models is empty (Lines 82–85), the else "Claude Opus 4.1" fallback on Line 92 is unreachable. Simplify to avoid config drift.

-if default_model1 not in all_models:
-    default_model1 = all_models[0] if all_models else "Claude Opus 4.1"
+if default_model1 not in all_models:
+    default_model1 = all_models[0]
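
For context, a sketch of why the else branch is unreachable, assuming the Streamlit guard described above (the import of AVAILABLE_MODELS and the variable names are illustrative, not the exact app code):

import streamlit as st
from model_service import AVAILABLE_MODELS  # assumed import for illustration

all_models = list(AVAILABLE_MODELS.keys())
if not all_models:
    st.error("No models are available.")
    st.stop()  # execution stops here when the catalog is empty

default_model1 = "Claude Opus 4.1"
if default_model1 not in all_models:
    default_model1 = all_models[0]  # all_models is guaranteed non-empty here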
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11a169b and 166a064.

📒 Files selected for processing (2)
  • code-model-comparison/app.py (3 hunks)
  • code-model-comparison/model_service.py (2 hunks)
🔇 Additional comments (2)
code-model-comparison/app.py (2)

13-13: Branding/title change LGTM.


62-62: Main header change LGTM.

Comment on lines +9 to 13
"Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1",
"Qwen3-Coder": "openrouter/qwen/qwen3-coder",
"Gemini 2.5 Flash": "openrouter/google/gemini-2.5-flash",
"GPT-4.1": "openrouter/openai/gpt-4.1",
"GPT-5": "gpt-5",
}

💡 Verification agent

🧩 Analysis chain

Model catalog updates look consistent, but verify identifiers exist upstream.

"Claude Opus 4.1" and "GPT-5" entries look fine. Ensure LiteLLM recognizes these identifiers and providers as configured.

Run web checks to confirm model IDs and parameter support:

If needed, I can adjust the mapping to provider-qualified IDs (e.g., openai/gpt-5) or add a fallback route.


🌐 Web query:

Does LiteLLM support the model id "gpt-5" with acompletion, and should we pass max_tokens or max_completion_tokens for streaming?

💡 Result:

Short answer: Yes — LiteLLM supports GPT‑5 model IDs (e.g. "openai/gpt-5" or "azure/gpt-5") for its completion APIs, and for GPT‑5 you should use max_completion_tokens to control the streamed completion length (LiteLLM will also accept max_tokens and map it to max_completion_tokens for GPT‑5, but passing max_completion_tokens is clearer/explicit). (docs.litellm.ai)

Quick notes and example:

  • Supported IDs: LiteLLM lists OpenAI/Azure GPT‑5 model IDs such as openai/gpt-5 and openai/gpt-5-chat (and provider-prefixed variants). (docs.litellm.ai)
  • Params: LiteLLM exposes both max_tokens and max_completion_tokens; for GPT‑5 it maps max_tokens → max_completion_tokens, and you can (and probably should) pass max_completion_tokens when streaming so you explicitly set the completion token cap. (docs.litellm.ai)

Example (Python):
from litellm import completion

resp = completion(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "Write a short summary of X."}],
    stream=True,
    max_completion_tokens=250,
)
for chunk in resp:
    print(chunk.choices[0].delta)

If you want, tell me which provider/model string you’re using (openai/…, azure/…, or bare "gpt-5") and I’ll give a tailored snippet and confirm the exact param mapping for your LiteLLM version.

Citations:


🌐 Web query:

Is "openrouter/anthropic/claude-opus-4.1" a valid model identifier on OpenRouter via LiteLLM?

💡 Result:

Short answer: No. The valid Anthropic Opus identifier on OpenRouter via LiteLLM uses the Claude 3 naming, e.g., openrouter/anthropic/claude-3-opus-20240229, not “claude-opus-4.1.”[1][3]

Details:

  • LiteLLM’s OpenRouter convention is openrouter/provider/model-name, and Anthropic’s Opus listing on OpenRouter uses the Claude 3 format such as claude-3-opus-20240229.[1]
  • LiteLLM docs state you can call any OpenRouter model by passing model="openrouter/<provider>/<model>", with examples including openrouter/anthropic/claude-2 and the Claude 3 variants, but there is no “claude-opus-4.1” pattern shown.[3]
  • A third-party page shows a string like openrouter/anthropic/claude-opus-4.1, but this appears inconsistent with OpenRouter’s established Claude 3 model identifiers and should not be relied upon without confirmation on the OpenRouter models list.[2]

Recommendation:

  • Use the confirmed identifier: openrouter/anthropic/claude-3-opus-20240229.[1]
  • If you need the latest Sonnet/Haiku/Opus variant, check the OpenRouter model list and plug it into LiteLLM as openrouter/anthropic/<model-id>.[3]

Citations:


Fix invalid LiteLLM model identifiers and use explicit max_completion_tokens

Please update the model mapping in code-model-comparison/model_service.py and streaming calls as follows:

  • In the AVAILABLE_MODELS mapping (lines 9–13):

    -    "Claude Opus 4.1": "openrouter/anthropic/claude-opus-4.1",
    +    "Claude Opus 4.1": "openrouter/anthropic/claude-3-opus-20240229",  # valid OpenRouter Anthropic Opus ID
    
    -    "GPT-5": "gpt-5",
    +    "GPT-5": "openai/gpt-5",  # provider-prefixed for LiteLLM compatibility
  • In your GPT-5 streaming completion branches (around lines 49–64):
    • Replace any usage of max_tokens with max_completion_tokens to explicitly set the streamed token cap (LiteLLM will map max_tokens → max_completion_tokens, but explicit is clearer).

    resp = completion(
      model="openai/gpt-5",
      messages=…,
      stream=True,
    - max_tokens=250
    + max_completion_tokens=250
    )

These changes ensure LiteLLM recognizes both model IDs and correctly applies the streaming token limit.

🤖 Prompt for AI Agents
In code-model-comparison/model_service.py lines 9–13 and the GPT-5 streaming
branches around lines 49–64, update the AVAILABLE_MODELS mapping so the GPT-5
entry uses the valid LiteLLM identifier ("openai/gpt-5") instead of "gpt-5", and
in all GPT-5 stream calls replace max_tokens with max_completion_tokens (e.g.,
max_completion_tokens=250) so the streamed token cap is set explicitly; keep the
rest of the call arguments the same.
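
Taken together, the mapping and the GPT branch would read roughly like this after applying the suggestion. This is a sketch of the suggested state, not the committed code; note that a later comment in this review argues for max_tokens in both branches instead, which LiteLLM maps to max_completion_tokens for GPT-5.

AVAILABLE_MODELS = {
    "Claude Opus 4.1": "openrouter/anthropic/claude-3-opus-20240229",
    "Qwen3-Coder": "openrouter/qwen/qwen3-coder",
    "Gemini 2.5 Flash": "openrouter/google/gemini-2.5-flash",
    "GPT-5": "openai/gpt-5",
}

# GPT branch with an explicit streamed-token cap
response = await acompletion(
    model=model_mapping,
    messages=messages,
    api_key=os.getenv("OPENAI_API_KEY"),
    max_completion_tokens=2000,
    stream=True,
)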
Comment on lines +49 to +64
if "GPT" in model_name:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENAI_API_KEY"),
max_completion_tokens=2000,
stream=True,
)
else:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENROUTER_API_KEY"),
max_tokens=2000,
stream=True,
)

⚠️ Potential issue

Use max_tokens for LiteLLM acompletion; align params across routes.

The GPT branch passes max_completion_tokens, which acompletion may not accept. Use max_tokens in both branches to avoid parameter mismatch and parity issues.

Apply this minimal diff:

-        if "GPT" in model_name:
-            response = await acompletion(
-                model=model_mapping,
-                messages=messages,
-                api_key=os.getenv("OPENAI_API_KEY"),
-                max_completion_tokens=2000,
-                stream=True,
-            )
+        if "GPT" in model_name:
+            response = await acompletion(
+                model=model_mapping,
+                messages=messages,
+                api_key=os.getenv("OPENAI_API_KEY"),
+                max_tokens=2000,
+                stream=True,
+            )

Optionally, dedupe branches by deriving api_key and calling acompletion once:

-        if "GPT" in model_name:
-            response = await acompletion(
-                model=model_mapping,
-                messages=messages,
-                api_key=os.getenv("OPENAI_API_KEY"),
-                max_tokens=2000,
-                stream=True,
-            )
-        else:
-            response = await acompletion(
-                model=model_mapping,
-                messages=messages,
-                api_key=os.getenv("OPENROUTER_API_KEY"),
-                max_tokens=2000,
-                stream=True,
-            )
+        is_gpt = "GPT" in model_name
+        response = await acompletion(
+            model=model_mapping,
+            messages=messages,
+            api_key=os.getenv("OPENAI_API_KEY") if is_gpt else os.getenv("OPENROUTER_API_KEY"),
+            max_tokens=2000,
+            stream=True,
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if "GPT" in model_name:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENAI_API_KEY"),
max_completion_tokens=2000,
stream=True,
)
else:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENROUTER_API_KEY"),
max_tokens=2000,
stream=True,
)
if "GPT" in model_name:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENAI_API_KEY"),
max_tokens=2000,
stream=True,
)
else:
response = await acompletion(
model=model_mapping,
messages=messages,
api_key=os.getenv("OPENROUTER_API_KEY"),
max_tokens=2000,
stream=True,
)
🤖 Prompt for AI Agents
In code-model-comparison/model_service.py around lines 49 to 64, the GPT branch
passes max_completion_tokens while the other branch uses max_tokens, causing
acompletion parameter mismatch; change the GPT branch to use max_tokens=2000
(matching the other branch) and align other parameter names, and optionally
simplify by setting api_key = os.getenv("OPENAI_API_KEY") if "GPT" in model_name
else os.getenv("OPENROUTER_API_KEY") and call acompletion once with the derived
api_key and shared args (model=model_mapping, messages=messages,
api_key=api_key, max_tokens=2000, stream=True).
