[WIP] Add aggregate_records DML tool to MCP server#3179

Draft
Copilot wants to merge 2 commits into main from copilot/add-aggregate-records-tool


Copilot AI commented Feb 28, 2026

  • Add AggregateRecords property to DmlToolsConfig (with UserProvided flag, constructor, FromBoolean, Default)
  • Update DmlToolsConfigConverter to read/write aggregate-records property
  • Add CLI option for runtime.mcp.dml-tools.aggregate-records.enabled in ConfigureOptions
  • Update ConfigGenerator.TryUpdateConfiguredMcpValues for aggregate-records
  • Create AggregateRecordsTool implementing IMcpTool (tool metadata + execute logic)
  • Add tests for AggregateRecordsTool (disabled at runtime/entity level, metadata schema)
  • Build and verify changes compile
  • Run code review and security scan
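
The checklist above places the new switch under `runtime.mcp.dml-tools`. As a rough illustration only, here is one way such a flag could be resolved from a parsed config. The helper name `is_tool_enabled`, the accepted value shapes (a boolean, or an object with an `enabled` key), and the caller-supplied `default` are all assumptions, not the PR's actual behavior.

```python
from typing import Any, Mapping


def is_tool_enabled(config: Mapping[str, Any], tool: str, default: bool) -> bool:
    """Resolve runtime.mcp.dml-tools.<tool> from a parsed config dict.

    Assumption: the value is a boolean or an object with an "enabled" key;
    anything missing falls back to `default`, which the caller chooses.
    """
    node: Any = config
    for key in ("runtime", "mcp", "dml-tools"):
        if not isinstance(node, Mapping) or key not in node:
            return default
        node = node[key]
    if isinstance(node, bool):
        # Assumed shorthand: "dml-tools": true/false toggles every tool.
        return node
    if not isinstance(node, Mapping) or tool not in node:
        return default
    value = node[tool]
    if isinstance(value, Mapping):
        return bool(value.get("enabled", default))
    return bool(value)


config = {"runtime": {"mcp": {"dml-tools": {"aggregate-records": False}}}}
print(is_tool_enabled(config, "aggregate-records", default=True))
```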
Original prompt

This section details the original issue you should resolve

<issue_title>[Enh]: add aggregate_records DML tool to MCP server</issue_title>
<issue_description>## What?

Allow models to answer: "How many products are there?" and "What is our most expensive product?"

Why?

These are among the most common information discovery questions, a primary model use case.

How?

Introduce a new tool: aggregate_records that reuses native GraphQL aggregation capabilities in DAB.

Schema

{
  "type": "object",
  "properties": {
    "entity": {
      "type": "string",
      "description": "Entity name with READ permission."
    },
    "function": {
      "type": "string",
      "enum": ["count", "avg", "sum", "min", "max"],
      "description": "Aggregation function to apply."
    },
    "field": {
      "type": "string",
      "description": "Field to aggregate. Use '*' for count."
    },
    "distinct": {
      "type": "boolean",
      "description": "Apply DISTINCT before aggregating.",
      "default": false
    },
    "filter": {
      "type": "string",
      "description": "OData filter applied before aggregating (WHERE). Example: 'unitPrice lt 10'",
      "default": ""
    },
    "groupby": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Fields to group by, e.g., ['category', 'region']. Grouped field values are included in the response.",
      "default": []
    },
    "orderby": {
      "type": "string",
      "enum": ["asc", "desc"],
      "description": "Sort aggregated results by the computed value. Only applies with groupby.",
      "default": "desc"
    },
    "having": {
      "type": "object",
      "description": "Filter applied after aggregating on the result (HAVING). Operators are AND-ed together.",
      "properties": {
        "eq":  { "type": "number", "description": "Aggregated value equals." },
        "neq": { "type": "number", "description": "Aggregated value not equals." },
        "gt":  { "type": "number", "description": "Aggregated value greater than." },
        "gte": { "type": "number", "description": "Aggregated value greater than or equal." },
        "lt":  { "type": "number", "description": "Aggregated value less than." },
        "lte": { "type": "number", "description": "Aggregated value less than or equal." },
        "in":  {
          "type": "array",
          "items": { "type": "number" },
          "description": "Aggregated value is in the given list."
        }
      }
    }
  },
  "required": ["entity", "function", "field"]
}
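
To make the constraints in the schema concrete, the following is a minimal hand-rolled check of tool arguments. It is a sketch, not the validation DAB performs: `validate_arguments` is a hypothetical helper, and the rule that `'*'` is only accepted with `count` is an assumption inferred from the `field` description.

```python
from typing import Any

AGGREGATE_FUNCTIONS = {"count", "avg", "sum", "min", "max"}
HAVING_OPERATORS = {"eq", "neq", "gt", "gte", "lt", "lte", "in"}


def validate_arguments(args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    errors: list[str] = []
    for name in ("entity", "function", "field"):
        value = args.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"'{name}' is required and must be a non-empty string")
    function = args.get("function")
    if isinstance(function, str) and function and function not in AGGREGATE_FUNCTIONS:
        errors.append(f"unsupported function '{function}'")
    # Assumption: '*' is only meaningful for count, per the field description.
    if args.get("field") == "*" and function != "count":
        errors.append("'*' is only valid with function 'count'")
    if args.get("orderby", "desc") not in ("asc", "desc"):
        errors.append("'orderby' must be 'asc' or 'desc'")
    having = args.get("having", {})
    if not isinstance(having, dict):
        errors.append("'having' must be an object")
    else:
        unknown = set(having) - HAVING_OPERATORS
        if unknown:
            errors.append(f"unknown having operators: {sorted(unknown)}")
    return errors


print(validate_arguments({"entity": "Product", "function": "count", "field": "*"}))
```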

Response Alias Convention

The aggregated value in the response is aliased as {function}_{field}, with one exception: for count with "*", the alias is count.
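
The alias rule can be written as a small function. This is only a sketch of the convention as stated; `response_alias` is a hypothetical name.

```python
def response_alias(function: str, field: str) -> str:
    """Name of the aggregated column in the response."""
    if function == "count" and field == "*":
        # Special case from the convention: COUNT(*) is aliased as plain "count".
        return "count"
    return f"{function}_{field}"


print(response_alias("avg", "unitPrice"))
```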

Examples

Q1: "How many products are there?"

{
  "entity": "Product",
  "function": "count",
  "field": "*"
}

SELECT COUNT(*) AS count
FROM Product;

Example output:

| count |
| --- |
| 77 |
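
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request, so the Q1 arguments would travel in an envelope like the one built here. The helper `build_tool_call` and the request id are illustrative; how the request is transported to the DAB MCP endpoint is left out.

```python
import json


def build_tool_call(arguments: dict, request_id: int = 1) -> str:
    """Wrap aggregate_records arguments in an MCP tools/call request."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,  # illustrative id; any unique value works
        "method": "tools/call",
        "params": {"name": "aggregate_records", "arguments": arguments},
    }
    return json.dumps(request)


payload = build_tool_call({"entity": "Product", "function": "count", "field": "*"})
print(payload)
```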

Q2: "What is the average price of products under $10?"

{
  "entity": "Product",
  "function": "avg",
  "field": "unitPrice",
  "filter": "unitPrice lt 10"
}

SELECT AVG(unitPrice) AS avg_unitPrice
FROM Product
WHERE unitPrice < 10;

Example output:

| avg_unitPrice |
| --- |
| 6.74 |

Q3: "Which categories have more than 20 products?"

{
  "entity": "Product",
  "function": "count",
  "field": "*",
  "groupby": ["categoryName"],
  "having": {
    "gt": 20
  }
}

SELECT categoryName, COUNT(*) AS count
FROM Product
GROUP BY categoryName
HAVING COUNT(*) > 20;

Example output:

| categoryName | count |
| --- | --- |
| Beverages | 24 |
| Condiments | 22 |
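
The semantics of Q3 (group, count, filter on the count, then sort by the computed value, descending by default) can be emulated in memory. This sketch only illustrates the intended behavior on made-up rows. It is not how DAB executes the query, and the list-of-dicts result shape is an assumption.

```python
from collections import Counter


def count_by_group(rows, group_field, greater_than, descending=True):
    """Emulate COUNT(*) ... GROUP BY group_field HAVING COUNT(*) > greater_than."""
    counts = Counter(row[group_field] for row in rows)
    kept = [
        {group_field: key, "count": n}
        for key, n in counts.items()
        if n > greater_than
    ]
    # orderby sorts on the aggregated value; "desc" is the schema default.
    return sorted(kept, key=lambda item: item["count"], reverse=descending)


# Hypothetical sample rows, not data from the issue.
rows = (
    [{"categoryName": "A"}] * 3
    + [{"categoryName": "B"}] * 1
    + [{"categoryName": "C"}] * 2
)
print(count_by_group(rows, "categoryName", greater_than=1))
```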

Q4: "For discontinued products, which categories have a unit-price total between $500 and $10,000?"

{
  "entity": "Product",
  "function": "sum",
  "field": "unitPrice",
  "filter": "discontinued eq true",
  "groupby": ["categoryName"],
  "having": {
    "gte": 500,
    "lte": 10000
  }
}

SELECT categoryName, SUM(unitPrice) AS sum_unitPrice
FROM Product
WHERE discontinued = 1
GROUP BY categoryName
HAVING SUM(unitPrice) >= 500
   AND SUM(unitPrice) <= 10000;

Example output:

| categoryName | sum_unitPrice |
| --- | --- |
| Seafood | 1834.50 |
| Produce | 742.00 |
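
Q4 shows several `having` operators being AND-ed together. A minimal sketch of that translation follows, mirroring the SQL used for illustration in these examples rather than DAB's actual query path. `build_having` is a hypothetical helper, and the `?` placeholders assume a driver with positional parameters.

```python
HAVING_SQL = {"eq": "=", "neq": "<>", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}


def build_having(aggregate_expr: str, having: dict) -> tuple[str, list]:
    """Turn a `having` object into a parameterized HAVING clause."""
    clauses: list[str] = []
    params: list = []
    for op, value in having.items():
        if op == "in":
            if not value:
                raise ValueError("'in' requires at least one value")
            placeholders = ", ".join("?" for _ in value)
            clauses.append(f"{aggregate_expr} IN ({placeholders})")
            params.extend(value)
        elif op in HAVING_SQL:
            clauses.append(f"{aggregate_expr} {HAVING_SQL[op]} ?")
            params.append(value)
        else:
            raise ValueError(f"unknown having operator: {op}")
    # All operators are combined with AND, as the schema description states.
    sql = ("HAVING " + " AND ".join(clauses)) if clauses else ""
    return sql, params


print(build_having("SUM(unitPrice)", {"gte": 500, "lte": 10000}))
```

Values are passed as parameters rather than interpolated, so only the aggregate expression itself needs to come from trusted, validated input.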

Q5: "How many distinct suppliers do we have?"

{
  "entity": "Product",
  "function": "count",
  "field": "supplierId",
  "distinct": true
}

SELECT COUNT(DISTINCT supplierId) AS count_supplierId
FROM Product;

Example output:

| count_supplierId |
| --- |
| 29 |
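
The way `function`, `field`, and `distinct` combine into the aggregate expression seen in the SQL illustrations can be sketched as below. `aggregate_expression` is a hypothetical helper. It assumes `field` has already been checked against the entity's known fields, and it rejects `distinct` with `'*'` because `COUNT(DISTINCT *)` is not valid SQL.

```python
AGGREGATE_FUNCTIONS = {"count", "avg", "sum", "min", "max"}


def aggregate_expression(function: str, field: str, distinct: bool = False) -> str:
    """Build the aggregate expression, e.g. COUNT(DISTINCT supplierId)."""
    if function not in AGGREGATE_FUNCTIONS:
        raise ValueError(f"unsupported function: {function}")
    if field == "*":
        if function != "count" or distinct:
            raise ValueError("'*' is only valid for a non-distinct count")
        return "COUNT(*)"
    # Assumption: `field` was validated upstream, so interpolation is safe here.
    prefix = "DISTINCT " if distinct else ""
    return f"{function.upper()}({prefix}{field})"


print(aggregate_expression("count", "supplierId", distinct=True))
```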

Q6: "Which categories have exactly 5 or 10 products?"

{
  "entity": "Product",
  "function": "count",
  "field": "*",
  "groupby": ["categoryName"],
  "having": {
    "in": [5, 10]
  }
}

SELECT c...

</details>



- Fixes Azure/data-api-builder#3178

Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>