@seunggil1
Contributor

@seunggil1 seunggil1 commented Sep 2, 2025

Why are these changes needed?

1. OpenAIChatCompletionClient.count_tokens (from autogen_ext.models.openai) ignores several JSON Schema fields in tool definitions (e.g., anyOf, default, title). For each one it logs a "Not supported field ..." warning, and the pre-send estimate consistently differs from usage.prompt_tokens.

Autogen Tool description example

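from typing import Annotated, Literal, Optional

from autogen_core.tools import FunctionTool

def tool3(
    test1: Annotated[Optional[str], "example"] = None,
    test2: Literal["1", "2"] = "2",
) -> str:
    return str(test1) + str(test2)

# client and messages are assumed to be defined elsewhere.
tools = [FunctionTool(tool3, description="example tool 3")]
client.count_tokens(messages, tools=tools)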

Printed warning log

Not supported field anyOf
Not supported field default
Not supported field title
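
To see where these warnings come from, it helps to look at the parameter schema generated for tool3. The sketch below uses a hand-written approximation of that schema (the exact output depends on the schema generator and its version) and a hypothetical helper, unsupported_fields, that lists the keys a counter would skip if it only recognised type, description, and enum. That supported set is an assumption about the pre-change code, not a quote from it.

```python
# Hand-written approximation of the JSON Schema properties for tool3.
# The exact shape is an assumption; real output may differ by library version.
TOOL3_PROPERTIES = {
    "test1": {
        "anyOf": [{"type": "string"}, {"type": "null"}],
        "default": None,
        "description": "example",
        "title": "Test1",
    },
    "test2": {
        "enum": ["1", "2"],
        "type": "string",
        "default": "2",
        "title": "Test2",
    },
}

# Assumed set of fields handled before this change (hypothetical).
SUPPORTED_BEFORE = {"type", "description", "enum"}


def unsupported_fields(properties: dict, supported: set) -> list:
    """Return the sorted schema keys that are not in the supported set."""
    seen = set()
    for schema in properties.values():
        seen.update(key for key in schema if key not in supported)
    return sorted(seen)


for field in unsupported_fields(TOOL3_PROPERTIES, SUPPORTED_BEFORE):
    print(f"Not supported field {field}")
```

With these inputs the helper reports anyOf, default, and title, the same three fields named in the warning log.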

Changes

  • Handle anyOf, default, and title in count_tokens_openai:
elif field == "anyOf":
    tool_tokens -= 3
    for o in v["anyOf"]:
        tool_tokens += 3
        tool_tokens += len(encoding.encode(o["type"]))
elif field == "default":
    tool_tokens += 2
    tool_tokens += len(encoding.encode(json.dumps(v["default"])))
elif field == "title":
    tool_tokens += 2
    tool_tokens += len(encoding.encode(v["title"]))
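
The three branches can be exercised in isolation. The following is a minimal sketch, not the library code: extra_tokens is a hypothetical wrapper around the branches above, and encode is a stand-in tokenizer that splits on whitespace, used so the example runs without tiktoken. Counts from a real BPE encoding will differ.

```python
import json
from typing import Any, Callable, Dict, List


def extra_tokens(v: Dict[str, Any], encode: Callable[[str], List[Any]]) -> int:
    """Tokens contributed by anyOf, default, and title for one property schema."""
    tool_tokens = 0
    for field in v:
        if field == "anyOf":
            tool_tokens -= 3
            for o in v["anyOf"]:
                tool_tokens += 3
                tool_tokens += len(encode(o["type"]))
        elif field == "default":
            tool_tokens += 2
            # json.dumps turns None into "null" and "2" into '"2"'.
            tool_tokens += len(encode(json.dumps(v["default"])))
        elif field == "title":
            tool_tokens += 2
            tool_tokens += len(encode(v["title"]))
    return tool_tokens


# Stand-in tokenizer (assumption): one token per whitespace-separated word.
encode = str.split

test1_schema = {
    "anyOf": [{"type": "string"}, {"type": "null"}],
    "default": None,
    "title": "Test1",
}
print(extra_tokens(test1_schema, encode))  # 11 with this stand-in tokenizer
```

In this sketch the anyOf branch reads o["type"] directly, so an option without a type key (for example a $ref entry) would raise KeyError.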

Limitations

  • This change reduces, but does not eliminate, the discrepancy between estimated token counts and actual usage.
  • I don't currently know the exact logic behind the per-field token counts in the count_tokens_openai function, so the new branches mirror how the existing fields are handled.

2. In actual requests, Autogen omits the tools parameter when no tools are provided, but the current implementation adds 12 tokens unconditionally, which overcounts tool-less calls.

# Before: num_tokens += 12 (unconditional)
if oai_tools:  # After: add 12 only when tools are present
    num_tokens += 12
return num_tokens
  • Difference in results before / after modification
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Assumes an async context where ctx (not shown) provides a cancellation token.
messages = [UserMessage(content="What is the current time in Seoul?", source="user")]
model_client = OpenAIChatCompletionClient(model="gpt-4o")
token_estimate = model_client.count_tokens(messages=messages)

create_result = await model_client.create(
    messages=messages,
    cancellation_token=ctx.cancellation_token,
)
token_usage = create_result.usage.prompt_tokens

if token_usage != token_estimate:
    print(f"Token usage mismatch: estimated {token_estimate}, actual {token_usage}")
# before
Token usage mismatch: estimated 29, actual 16
# after
Token usage mismatch: estimated 17, actual 16
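
The before and after estimates differ by exactly the 12-token constant. A small sketch of that arithmetic, using a hypothetical helper add_tools_overhead and taking the base count of 17 from the "after" output above:

```python
TOOLS_OVERHEAD = 12  # constant from the snippet in this PR


def add_tools_overhead(num_tokens: int, oai_tools: list, conditional: bool) -> int:
    """Add the overhead always (old behaviour) or only when tools exist (new)."""
    if not conditional or oai_tools:
        num_tokens += TOOLS_OVERHEAD
    return num_tokens


base = 17  # message-only estimate reported after the change
before = add_tools_overhead(base, [], conditional=False)
after = add_tools_overhead(base, [], conditional=True)
print(before, after)  # 29 17
```

The remaining one-token gap against the actual usage of 16 is not explained by this constant.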

Related issue number

Closes #6980

Checks

@seunggil1
Contributor Author

@microsoft-github-policy-service agree

@ekzhu
Contributor

ekzhu commented Sep 16, 2025

@seunggil1 can you fix the build errors?

@ekzhu
Contributor

ekzhu commented Sep 18, 2025

@copilot can you create a PR to address the remaining issues based on this branch.

@codecov

codecov bot commented Sep 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.99%. Comparing base (fb03c1c) to head (cc2dc74).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6987      +/-   ##
==========================================
+ Coverage   80.96%   80.99%   +0.03%     
==========================================
  Files         238      238              
  Lines       18279    18291      +12     
==========================================
+ Hits        14799    14815      +16     
+ Misses       3480     3476       -4     
Flag Coverage Δ
unittests 80.99% <100.00%> (+0.03%) ⬆️

@ekzhu ekzhu enabled auto-merge (squash) September 18, 2025 03:58
@ekzhu ekzhu merged commit f76f92d into microsoft:main Sep 18, 2025
73 checks passed
@seunggil1 seunggil1 deleted the fix/6980 branch September 18, 2025 04:11