brickster.ai
Projects

Trending GitHub projects.

Repos tagged topic:databricks — open-source tools, integrations, and accelerators built on or around Databricks. Excludes the official repos already covered by Releases.

dbeaver/dbeaver · 50.3k stars

dbeaver

Free universal database tool and SQL client

ai, database, databricks, db2
Java · 4.2k forks · pushed today
getredash/redash · 28.6k stars

redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

analytics, athena, bi, bigquery
Python · 4.6k forks · pushed today
cube-js/cube · 20.1k stars

cube

📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics

agentic-analytics, agents, ai, analytics
Rust · 2.0k forks · pushed today
Tencent/APIJSON · 18.4k stars

APIJSON

🏆 Real-time, no-code, powerful and secure ORM 🚀 that provides APIs and docs without backend coding; the frontend (client) can customize the data and structure of the returned JSON 🏆

baas, clickhouse, crud, databricks
Java · 2.3k forks · pushed 1w ago
tobymao/sqlglot · 9.3k stars

sqlglot

Python SQL Parser and Transpiler

bigquery, clickhouse, databricks, duckdb
Python · 1.2k forks · pushed today
growthbook/growthbook · 7.8k stars

growthbook

Open Source Feature Flags, Experimentation, and Product Analytics

ab-testing, abtest, abtesting, analytics
TypeScript · 756 forks · pushed today
microsoft/SynapseML · 5.2k stars

SynapseML

Simple and Distributed Machine Learning

ai, apache-spark, azure, big-data
Scala · 860 forks · pushed 2d ago
dotnet/spark · 2.1k stars

spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

analytics, apache-spark, azure, bigdata
C# · 329 forks · pushed 2w ago
Multiwoven/multiwoven · 1.7k stars

multiwoven

🔥🔥🔥 Open-source Reverse ETL: an alternative to Hightouch and Census.

bigquery, cdp, customer-data-platform, data-activation
Ruby · 88 forks · pushed 2d ago
databricks-solutions/ai-dev-kit · 1.6k stars

ai-dev-kit

Databricks Toolkit for Coding Agents provided by Field Engineering

agents, claude, cursor, databricks
Python · 344 forks · pushed 3d ago
getnao/nao · 1.2k stars

nao

👾 nao is an open source analytics agent. (1) Create context with nao-core cli, (2) deploy nao chat interface for everyone

agentic-analytics, analytics, analytics-engineering, bigquery
TypeScript · 165 forks · pushed today
zinggAI/zingg · 1.2k stars

zingg

Scalable master data management, identity resolution, entity resolution, and deduplication using ML

cdp, customer-data-platform, data-science, databricks
Java · 168 forks · pushed 1w ago
databricks/mlops-stacks · 685 stars

mlops-stacks

This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.

databricks, machine-learning, mlops
Python · 257 forks · pushed 1mo ago
AltimateAI/altimate-code · 616 stars

altimate-code

Open-source agentic data engineering harness for dbt, SQL, and cloud warehouses. 100+ tools, 10 warehouses, AI-powered.

agent, agentic-data-engineering, ai, analytics-engineering
TypeScript · 61 forks · pushed today
DataflareApp/dataflare · 570 stars

dataflare

Simple, easy-to-use database manager

bigquery, clickhouse, cloudflare-d1, cloudflare-r2
TypeScript · 35 forks · pushed 4d ago
databrickslabs/dbldatagen · 470 stars

dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines

data-generation, databricks, datagen, datageneration
Python · 97 forks · pushed today
dataflint/spark · 462 stars

spark

Drop-in replacement for Apache Spark UI

apache-spark, big-data, data-pipeline, data-pipelines
TypeScript · 54 forks · pushed 2d ago
databrickslabs/dbx · 460 stars

dbx

🧱 Databricks CLI eXtensions (aka dbx) is a CLI tool for development and advanced Databricks workflow management.

ci, cicd, databricks, databricks-api
Python · 129 forks · pushed 2mo ago
Fast-Editor/Lynkr · 445 stars

Lynkr

Streamline your workflow with Lynkr, a CLI tool that acts as an HTTP proxy for efficient code interactions using Claude Code CLI.

agents, ai, claude, claudecode
JavaScript · 44 forks · pushed yesterday
databrickslabs/dqx · 419 stars

dqx

Databricks framework to validate the data quality of PySpark DataFrames and tables

data-profiling, data-quality, data-quality-monitoring, databricks
Python · 119 forks · pushed today
DataflareApp/Dataflare · 383 stars

Dataflare

Fast. Simple. Database Manager.

bigquery, clickhouse, cloudflare-d1, cloudflare-r2
17 forks · pushed 1mo ago
databricks/terraform-databricks-examples · 331 stars

terraform-databricks-examples

Examples of using Terraform to deploy Databricks resources

aws, azure, databricks, databricks-module
HCL · 219 forks · pushed 6d ago
DataWithBaraa/databricks_bootcamp_2026 · 331 stars

databricks_bootcamp_2026

End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.

ai, apache-spark, data-analytics, data-engineering
Jupyter Notebook · 168 forks · pushed 4mo ago
adidas/lakehouse-engine · 288 stars

lakehouse-engine

The Lakehouse Engine is a configuration-driven Spark framework, written in Python, that serves as a scalable, distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.

big-data, configuration-driven, data-engineering, data-quality
Python · 50 forks · pushed 2mo ago
databrickslabs/dlt-meta · 259 stars

dlt-meta

Metadata-driven Spark Declarative Pipelines framework for bronze/silver pipelines

databricks, dlt, lakeflow-declarative-pipelines, meta-programming
Python · 125 forks · pushed 1w ago
databricks/databricks-sql-python · 229 stars

databricks-sql-python

Databricks SQL Connector for Python

databricks, dwh, python3, sql
Python · 146 forks · pushed today
OWOX/owox-data-marts · 222 stars

owox-data-marts

Open-Source Self-Service Analytics Platform

analytics, athena, bigquery, dashboard
TypeScript · 30 forks · pushed today
CartoDB/analytics-toolbox-core · 209 stars

analytics-toolbox-core

A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities

analytics-toolbox, bigquery, carto, databricks
JavaScript · 43 forks · pushed 2d ago
buremba/universql · 202 stars

universql

Pushdown compute from Snowflake to DuckDB running on your infrastructure

databricks, dbt, duckdb, proxy-server
Jupyter Notebook · 7 forks · pushed 7mo ago
jrlasak/databricks-code-practice · 201 stars

databricks-code-practice

Practice Databricks coding skills with hands-on exercises. Import into Databricks Free Edition, write code, run assertions, check pass/fail. Covers Delta Lake, Spark SQL, PySpark, Auto Loader, medallion architecture, window functions, and more.

auto-loader, coding-practice, data-engineering, databricks
Python · 111 forks · pushed 1mo ago
aloneguid/stowage · 191 stars

stowage

Bloat-free, no BS cloud storage SDK.

aws-s3, azure-storage, databricks, gcp-storage
C# · 22 forks · pushed 2mo ago
databricks-solutions/databricks-apps-cookbook · 173 stars

databricks-apps-cookbook

Ready-to-use code snippets for building interactive Databricks Apps.

databricks, databricks-apps, web-application
Python · 114 forks · pushed 2w ago
lamastex/scalable-data-science · 168 stars

scalable-data-science

Scalable Data Science: course sets in big data using Apache Spark on Databricks, covering the mathematical, statistical and computational foundations using SageMath.

apache-spark, data-science, databricks, scala
HTML · 93 forks · pushed 9mo ago
aehrc/VariantSpark · 147 stars

VariantSpark

Machine learning for genomic variants

association-studies, aws, bioinformatics, databricks
JavaScript · 48 forks · pushed 1mo ago