Fix not supported field warnings in count_tokens_openai #6987

seunggil1 · 2025-09-02T15:57:01Z

Why are these changes needed?

1. `OpenAIChatCompletionClient.count_tokens` (from autogen_ext.models.openai) ignores several JSON Schema fields in tool definitions (e.g., anyOf, default, title). It logs Not supported field ... warnings for them, and the pre-send estimate ends up consistently different from usage.prompt_tokens.

Autogen Tool description example

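from typing import Annotated, Literal, Optional

from autogen_core.tools import FunctionTool

def tool3(
        test1: Annotated[Optional[str], "example"] = None,
        test2: Literal["1", "2"] = "2"
) -> str:
    return str(test1) + str(test2)

tools = [FunctionTool(tool3, description="example tool 3")]
# client is an OpenAIChatCompletionClient and messages a list of LLM messages, both created elsewhere
client.count_tokens(messages, tools=tools)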

Printed warning log

Not supported field anyOf
Not supported field default
Not supported field title
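
For context, here is a rough sketch of why exactly these three names show up. It assumes the parameter schemas derived for tool3 look roughly like the pydantic-style dicts below (hand-written for illustration, not captured output) and that the counter previously recognized only type, description, and enum. `SUPPORTED_FIELDS` and `unsupported_fields` are hypothetical names, not part of AutoGen, and the sketch de-duplicates field names rather than reproducing the exact log.

```python
# Hand-written approximation of the JSON Schema properties for tool3.
# Assumption: the real schema is generated by pydantic and may differ in detail.
properties = {
    "test1": {
        "anyOf": [{"type": "string"}, {"type": "null"}],
        "default": None,
        "description": "example",
        "title": "Test1",
    },
    "test2": {
        "default": "2",
        "enum": ["1", "2"],
        "title": "Test2",
        "type": "string",
    },
}

# Assumption: the fields the counter handled before this change.
SUPPORTED_FIELDS = {"type", "description", "enum"}


def unsupported_fields(props: dict) -> list[str]:
    """Return the sorted, de-duplicated schema fields that would trigger a warning."""
    found = set()
    for schema in props.values():
        found.update(field for field in schema if field not in SUPPORTED_FIELDS)
    return sorted(found)


for name in unsupported_fields(properties):
    print(f"Not supported field {name}")
```

Under these assumptions, Optional[...] contributes anyOf, any parameter with a default value contributes default, and every parameter contributes title.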

Changes

Add anyOf, default, title in count_tokens_openai

elif field == "anyOf":
    tool_tokens -= 3
    for o in v["anyOf"]:
        tool_tokens += 3
        tool_tokens += len(encoding.encode(o["type"]))
elif field == "default":
    tool_tokens += 2
    tool_tokens += len(encoding.encode(json.dumps(v["default"])))
elif field == "title":
    tool_tokens += 2
    tool_tokens += len(encoding.encode(v["title"]))
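
To make the arithmetic of these branches easier to follow, here is a minimal, self-contained sketch. The constants (+2, +3, -3) are copied from the snippet above; `count_new_fields` is a hypothetical helper, and `toy_encode` is a whitespace-splitting stand-in for the real tiktoken encoding so the example runs without extra dependencies. Its results are therefore not real token counts.

```python
import json
from typing import Any, Callable


def toy_encode(text: str) -> list[str]:
    """Stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return text.split()


def count_new_fields(
    schema: dict[str, Any], encode: Callable[[str], list] = toy_encode
) -> int:
    """Sum the contribution of anyOf, default and title for one property schema."""
    tokens = 0
    for field in schema:
        if field == "anyOf":
            tokens -= 3
            for option in schema["anyOf"]:
                tokens += 3
                # .get() guards against options without a "type" key (e.g. a $ref);
                # the snippet above indexes o["type"] directly.
                tokens += len(encode(option.get("type", "")))
        elif field == "default":
            tokens += 2
            # json.dumps turns None into "null", "2" into '"2"', and so on.
            tokens += len(encode(json.dumps(schema["default"])))
        elif field == "title":
            tokens += 2
            tokens += len(encode(schema["title"]))
    return tokens


schema = {
    "anyOf": [{"type": "string"}, {"type": "null"}],
    "default": None,
    "title": "Test1",
}
print(count_new_fields(schema))  # 11 with the toy tokenizer
```

With the toy tokenizer, anyOf contributes -3 + 4 + 4 = 5, default contributes 2 + 1 = 3, and title contributes 2 + 1 = 3.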

Limitations

This change reduces, but does not eliminate, the discrepancy between estimated token counts and actual usage.
I don't currently know the exact logic behind the token counting in the count_tokens_openai function, so I modeled the new branches on the handling of the existing fields.
- The only reference I've found at this point is the token calculation guide in the OpenAI Cookbook: https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
- If there are official guidelines or recommended constants beyond the Cookbook, I’d appreciate pointers.

2. In actual requests, Autogen omits the tools parameter when no tools are provided. But the current implementation adds +12 unconditionally, which overcounts tokens for tool-less calls.

Per OpenAI’s cookbook heuristics for tool-call counting, func_end = 12 represents the wrapper overhead that applies only when a tools array exists.
- Reference: https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken

Changes

- Apply func_end (+12) only when tools exist in count_tokens_openai

# num_tokens += 12  # before
if oai_tools:  # changed
    num_tokens += 12
return num_tokens
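
The effect of the guard can be sketched in isolation. `FUNC_END` and `add_tool_wrapper` are hypothetical names used only for this illustration; the value 12 is the constant from the snippet above.

```python
from typing import Optional, Sequence

FUNC_END = 12  # wrapper overhead, applied only when a tools array is sent


def add_tool_wrapper(num_tokens: int, tools: Optional[Sequence[object]]) -> int:
    """Add the tools wrapper overhead only if at least one tool is present."""
    if tools:  # None and empty sequences both skip the overhead
        num_tokens += FUNC_END
    return num_tokens


print(add_tool_wrapper(17, []))        # no tools: count unchanged
print(add_tool_wrapper(17, ["tool"]))  # tools present: 12 added
```

A truthiness check covers both `None` and an empty list, so callers that pass `tools=[]` are treated the same as callers that omit tools entirely.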

Difference in results before / after modification

messages = [UserMessage(content="What is the current time in Seoul?", source="user")]
model_client = OpenAIChatCompletionClient(model="gpt-4o")
token_estimate = model_client.count_tokens(messages=messages)

create_result = await model_client.create(
    messages=messages,
    cancellation_token=ctx.cancellation_token,
)
token_usage = create_result.usage.prompt_tokens

if token_usage != token_estimate:
    print(f"Token usage mismatch: estimated {token_estimate}, actual {token_usage}")

# before
Token usage mismatch: estimated 29, actual 16
# after
Token usage mismatch: estimated 17, actual 16
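
As a quick check on these numbers, the following sketch works through the arithmetic using only the values printed above; the variable names are illustrative.

```python
FUNC_END = 12  # constant from the change above

actual = 16
before, after = 29, 17

overestimate_before = before - actual  # error before the fix
overestimate_after = after - actual    # error after the fix

print(overestimate_before)
print(overestimate_after)
print(before - after == FUNC_END)  # the drop equals the removed overhead
```

The estimate falls by exactly the 12-token wrapper overhead, which leaves a residual overestimate of one token for this tool-less request.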

Related issue number

Closes #6980

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

seunggil1 · 2025-09-02T16:08:38Z

@microsoft-github-policy-service agree

ekzhu · 2025-09-16T09:30:49Z

@seunggil1 can you fix the build errors?

ekzhu · 2025-09-18T00:43:47Z

@copilot can you create a PR to address the remaining issues based on this branch.

codecov · 2025-09-18T01:12:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.99%. Comparing base (fb03c1c) to head (cc2dc74).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6987      +/-   ##
==========================================
+ Coverage   80.96%   80.99%   +0.03%     
==========================================
  Files         238      238              
  Lines       18279    18291      +12     
==========================================
+ Hits        14799    14815      +16     
+ Misses       3480     3476       -4

Flag	Coverage Δ
unittests	`80.99% <100.00%> (+0.03%)`	⬆️

seunggil1 added 4 commits September 2, 2025 01:53

Fix not supported field warnings in count_tokens_openai

d40c0a2

Fix apply func_end(+12) only when tools exist in count_tokens_openai

c5cbf31

Fix test coverage for optional parameters in count_tokens_openai

f0589ec

Refactor count_tokens test cases in test_openai_chat_completion_clien…

c7db7e4

…t_count_tokens

Merge branch 'main' into fix/6980

d881d86

Merge branch 'main' into fix/6980

d4d0ab9

ekzhu added 2 commits September 17, 2025 17:57

Fix format and tests

d0795eb

fix type

f09bac3

ekzhu approved these changes Sep 18, 2025

Merge branch 'main' into fix/6980

cc2dc74

ekzhu enabled auto-merge (squash) September 18, 2025 03:58

ekzhu merged commit f76f92d into microsoft:main Sep 18, 2025
73 checks passed

seunggil1 deleted the fix/6980 branch September 18, 2025 04:11
