Sharp spike in premium request usage and recent tool changes in VSCode #187486
Unanswered
znorman-harris
asked this question in
Copilot Conversations
Replies: 1 comment 1 reply
If you’re monitoring quota in real time, it changes how you use it.
The behavior you describe is consistent with current orchestration design, not necessarily a billing anomaly.
Topic Area: Question
Copilot Feature Area: General
Body
Context: I have only been using the quick-access area in the lower-right of VSCode to see how many requests are made, and I'm monitoring it more throughout the day. This started about 2 weeks ago. I have been using Claude Sonnet 4.5 for the last few months.
Today alone, I have burned through 20% of my monthly quota with silly questions. I'm not sure if I'm just noticing quota usage more or if there was a sharp change from the updates in the last couple of days. I have a feeling it is a change. Here's what I'm seeing:
Simple MCP tool responses split into multiple requests
Ex: GitHub Copilot now "chunks" over tool responses that it used to read all at once. It feels like a change made with the intent to make request numbers go up, and hence more usage = more money spent.
Ex: MCP tool responses are now pretty-printed, saved to disk, and the LLM is forced to read them in chunks. Each chunk feels like it is being tracked as a "request", which artificially inflates "usage". This also feels crazy because pretty-printing MCP tool responses (which are JSON) adds a lot of overhead when the model reads 100 lines that are 70% whitespace.
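To put a rough number on the pretty-printing overhead described above, here is a minimal sketch that compares a compact JSON serialization with an indented one. The `sample` payload and the `whitespace_overhead` helper are hypothetical, made up for illustration; this is not an actual MCP tool response, and it counts characters rather than tokens, so it says nothing about how Copilot meters requests.

```python
import json


def whitespace_overhead(payload, indent=2):
    """Compare compact and pretty-printed JSON serializations of payload."""
    # Compact form: no spaces after separators, everything on one line.
    compact = json.dumps(payload, separators=(",", ":"))
    # Pretty form: newlines, indentation, and a space after each colon.
    pretty = json.dumps(payload, indent=indent)
    # The difference consists only of whitespace added by pretty-printing.
    added = len(pretty) - len(compact)
    return {
        "compact_chars": len(compact),
        "pretty_chars": len(pretty),
        "pretty_lines": pretty.count("\n") + 1,
        "added_fraction": added / len(pretty),
    }


# Hypothetical tool response: a list of small nested records.
sample = {
    "results": [
        {"id": i, "tags": ["a", "b"], "meta": {"ok": True, "score": i / 10}}
        for i in range(20)
    ]
}

stats = whitespace_overhead(sample)
print(stats)
```

The `added_fraction` value is the share of the pretty-printed text that is pure formatting whitespace. Deeper nesting and shorter values push it higher, so the real overhead depends on the shape of each tool response.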
Each MCP action or tool call by GitHub Copilot also seems to trigger a request:
Ex: I had GitHub Copilot update some JSON schema files. I asked it to find files, read a specific section, and update schemas in about 20 files.
This ended up taking a minute or two and used 10% of my monthly quota. From a personal perspective, I find it extremely difficult to justify using these tools (even for work) when doing it by hand would have taken me maybe 5-10 minutes longer.
If a simple request costs 1/10th of the monthly quota and only saves me 10 minutes, the cost/benefit ratio just isn't there. I'm now thinking "how much of my quota is this question going to cost" instead of just asking, like I was back in January.
I'm sharing my thoughts because this seems like a pretty negative change, and I'm wondering if others are seeing the same thing.
There is a similar discussion from a month ago, but I'm noticing this more in the last couple of weeks. Here's the other discussion link: https://github.com/orgs/community/discussions/184133