gemini-cli-extensions/data-agent-kit-starter-pack

Data Agent Kit Starter Pack

Note

This extension is currently in beta (pre-v1.0), and may see breaking changes until the first stable release (v1.0).

This plugin provides a specialized suite of skills and MCP tools for data engineers and database practitioners working on Google Cloud. It acts as an expert assistant, allowing you to use natural language prompts in your preferred coding agent to architect complex data pipelines, transform data with dbt, write Spark and BigQuery SQL notebooks, and orchestrate end-to-end workflows across the Google Cloud data ecosystem (BigQuery, Spanner, BigLake, Dataproc, etc.).

Important

We Want Your Feedback! Please share your thoughts with us by opening an issue on GitHub. Your input is invaluable and helps us improve the project for everyone.

Why Use the Data Agent Kit Starter Pack?

  • Seamless Workflow: Bring Google Cloud data engineering expertise directly into your terminal or IDE via Gemini CLI, Claude Code, or Codex.
  • End-to-End Data Pipelines: Generate code that reads raw data from Cloud Storage, processes it with Spark or BigQuery, transforms it through a medallion architecture (bronze, silver, gold) using dbt, and exports it to serving layers like Spanner.
  • Ecosystem Integration: Work across boundaries—generate BigLake Iceberg catalog tables, train BigQuery ML models (XGBoost, KMEANS), and create interactive Streamlit dashboards or LookML models, all from natural language.
  • Workflow Orchestration: Automatically create and schedule orchestration pipelines that tie your notebooks and dbt models together into robust, scheduled jobs.

Prerequisites

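Ensure you have the following installed:

  • One of the supported coding agents: Gemini CLI (v0.6.0+, per Troubleshooting), Claude Code, or Codex.
  • The gcloud CLI, which the authentication steps under Configuration use.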

Getting Started

Installation

Choose the installation method for your preferred coding agent, and run the commands in your terminal.

Gemini CLI and Gemini Code Assist

Install the extension directly from GitHub:

gemini extensions install https://github.com/gemini-cli-extensions/data-agent-kit-starter-pack --ref 0.1.1

Claude Code

Run the claude command to start the agent, then follow these steps:

  1. Add the marketplace:

/plugin marketplace add https://github.com/gemini-cli-extensions/data-agent-kit-starter-pack#0.1.1

  2. Install the plugin:

/plugin install data-agent-kit-starter-pack@data-agent-kit-starter-pack-marketplace

Codex

  1. Run the installation script in your terminal:

macOS / Linux:

curl -sSL https://raw.githubusercontent.com/gemini-cli-extensions/data-agent-kit-starter-pack/0.1.1/codex-install.sh | bash

Windows:

irm https://raw.githubusercontent.com/gemini-cli-extensions/data-agent-kit-starter-pack/0.1.1/codex-install.ps1 | iex

  2. Install the plugin in Codex:

Start the Codex agent (codex), then run:

/plugins

Use the interactive options to install the plugin with the name Data Agent Kit Starter Pack.

Configuration

This extension brings a suite of specialized Skills and MCP toolboxes. While skills are ready to use upon installation, you must configure the MCP toolboxes and authenticate with Google Cloud for them to start successfully.

Note

If you use Gemini CLI, Claude Code, or Codex in your IDE (e.g., via VS Code extensions), they share the same underlying configuration and MCP servers as the CLI agents.

1. Authenticate with Google Cloud

The MCP toolboxes require an active authenticated session to interact with your resources. Run the following commands in your terminal:

gcloud auth login
gcloud auth application-default login
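
To confirm that the second command left credentials where client libraries look for them, a quick local check can help. The following is a minimal sketch, not part of this extension: it only looks for a credentials file named by GOOGLE_APPLICATION_CREDENTIALS or at the well-known gcloud location, and it assumes the default gcloud configuration directory (it does not handle CLOUDSDK_CONFIG overrides or credentials served by a metadata server). The function name find_adc_file is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Return the path of an Application Default Credentials file, or None."""
    # 1. An explicit credentials file named by GOOGLE_APPLICATION_CREDENTIALS.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # 2. The file written by `gcloud auth application-default login`
    #    (assumes the default gcloud configuration directory).
    if os.name == "nt" and env.get("APPDATA"):
        base = Path(env["APPDATA"]) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    well_known = base / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    found = find_adc_file(os.environ, Path.home())
    if found:
        print(f"ADC file found: {found}")
    else:
        print("No ADC file found; run: gcloud auth application-default login")
```

If the script reports no file, rerun gcloud auth application-default login before starting the agent.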

2. Update Agent Configuration

You must configure the MCP toolboxes in your agent's configuration files for them to start successfully. After updating, you must restart the agent.

To verify your configuration:

  • Run the /mcp command to check the status of available MCP servers.
  • Ask your agent "What skills are available?" to view the list of active skills.

Gemini CLI and Gemini Code Assist

Edit the configuration file: ~/.gemini/extensions/data-agent-kit-starter-pack/gemini-extension.json

Claude Code

Edit the configuration file: ~/.claude/plugins/cache/data-agent-kit-starter-pack-marketplace/data-agent-kit-starter-pack/0.1.1/.mcp.json

Codex

  1. Edit the configuration file: ~/.agents/plugins/data-agent-kit-starter-pack/.mcp.json

  2. Start the Codex agent, run the command below, and use the interactive options to uninstall and then reinstall the plugin with the name Data Agent Kit Starter Pack:

/plugins
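
Before editing one of these files, it can help to see which MCP servers it declares and which settings each one takes. The following is a minimal sketch under one assumption: that the file keeps its servers under a top-level "mcpServers" object whose entries may carry an "env" object. That is a common layout for agent MCP configuration, but this README does not confirm it for every file listed above. The function name list_mcp_servers is hypothetical.

```python
import json
import sys
from pathlib import Path
from typing import Dict, List


def list_mcp_servers(config_path: Path) -> Dict[str, List[str]]:
    """Map each MCP server name to the sorted names of its env settings."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumption: servers live under a top-level "mcpServers" object.
    servers = config.get("mcpServers", {})
    return {name: sorted(spec.get("env", {})) for name, spec in servers.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_mcp_servers.py <path-to-config-file>
    for name, env_keys in list_mcp_servers(Path(sys.argv[1]).expanduser()).items():
        print(f"{name}: {', '.join(env_keys) or '(no env settings)'}")
```

Pass the path of the configuration file for your agent as the only argument. An empty result means the file does not use the assumed layout, so inspect it by hand instead.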

Usage Examples

Interact with your coding agent using natural language prompts to perform complex data engineering tasks:

  • Data Ingestion & Processing:
    • "Create a Spark notebook that reads raw fraud transaction data from gs://fin-clearing-west1/raw, deduplicates records, and writes hourly partitions to a BigLake Iceberg catalog table."
    • "Create a BigQuery SQL notebook that drops an existing table and writes deduplicated transaction data from GCS."
  • Data Transformation (dbt):
    • "Create a dbt pipeline to transform bronze_transactions into silver and gold tables, standardizing timestamps and joining with identity tables."
  • Machine Learning & Serving:
    • "Train a robust XGBoost model using BigQuery ML on the gold_transactions table to identify potential fraud."
    • "Generate an inference notebook to batch-process new partitions and write flagged transactions into a Cloud Spanner table for high-availability access."
  • Analysis & Visualization:
    • "Generate a complete View for my BigQuery tables to show YoY revenue growth, then generate a LookML model and an interactive Streamlit dashboard prototype."
  • Orchestration:
    • "Create an orchestration pipeline that first runs the dedup notebook, then the dbt pipeline, and finally the model training and inference notebooks. Schedule it to run every Monday morning."
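
The orchestration prompt above describes a strict ordering of four steps. As a rough illustration of that dependency chain only, and not of the code the agent generates or of any particular orchestrator, the ordering can be sketched with the Python standard library (Python 3.9+). All task names here are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical task names; each task maps to the set of tasks it depends on.
PIPELINE: Dict[str, Set[str]] = {
    "dedup_notebook": set(),
    "dbt_pipeline": {"dedup_notebook"},
    "model_training": {"dbt_pipeline"},
    "inference": {"model_training"},
}


def run_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, task in enumerate(run_order(PIPELINE), start=1):
        print(f"{step}. {task}")
```

Because each task depends only on the one before it, there is a single valid order: dedup, then dbt, then training, then inference.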

Troubleshooting

If you use Gemini CLI, run gemini --debug to enable debug mode.

Common issues:

  • Plugin Not Found: Ensure you have restarted your agent (e.g., Gemini CLI or Codex) after installation.
  • Authentication Errors: Many GCP skills require an active authenticated session. Ensure you have run gcloud auth login and gcloud auth application-default login on your machine. See Set up Application Default Credentials for more information.
  • "failed to find default credentials: google: could not find default credentials.": Ensure Application Default Credentials (ADC) are available in your environment.
  • MCP Connection Issues: Set the values the MCP toolboxes need to connect, such as project and region, in the MCP server configuration.
  • "✖ Error during discovery for server: MCP error -32000: Connection closed": The connection could not be established. Ensure your configuration is correctly set in the agent's configuration file.
  • "✖ MCP ERROR: Error: spawn .../toolbox ENOENT": The Toolbox binary did not download correctly. Ensure you are using Gemini CLI v0.6.0+.
  • "cannot execute binary file": The Toolbox binary did not download correctly. Ensure the correct binary for your OS/Architecture has been downloaded.
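
The last two errors both concern the Toolbox binary. A small local check can help tell them apart. The following is a sketch with hypothetical function names: it reports whether a file exists at a path you supply and whether it is marked executable, and it prints the host OS and CPU architecture so you can compare them with the binary that was downloaded. The binary's location is not filled in here; take it from your own error message.

```python
import os
import platform
import sys
from pathlib import Path


def diagnose_binary(path: Path) -> str:
    """Classify a binary path as missing, not executable, or usable."""
    if not path.is_file():
        return "missing"  # matches the "spawn ... ENOENT" symptom
    if not os.access(path, os.X_OK):
        return "not executable"
    return "present and executable"


def host_platform() -> str:
    """Return the host OS and CPU architecture, e.g. for comparing builds."""
    return f"{platform.system()}/{platform.machine()}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_toolbox.py <path-to-toolbox-binary>
    print(f"binary: {diagnose_binary(Path(sys.argv[1]).expanduser())}")
    print(f"host:   {host_platform()}")
```

If the binary is present and executable but still fails with "cannot execute binary file", the likely cause is a build for a different OS or architecture than the host reported here.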

Security Reminder: Agent Environment Hardening

Your agent can execute tools and commands on your behalf. Protect your Google Cloud resources by enforcing the principle of least privilege across all CLIs, MCP servers, and other resources available to your agents.

For more detail, see the Google Cloud guidance on mitigating prompt injection attacks with Google Cloud MCP.
