We’ve started putting together a tiny index of agentic data tools: Skills, MCPs, and CLIs organized across the data stack, covering Microsoft Fabric, Databricks, Snowflake, dbt, DuckDB, and more. Cloud providers too. 👉 GitHub: https://lnkd.in/etHizZPi #datasystems #aisystems
Mure Data’s Post
More Relevant Posts
-
SQL Analytics has always been one of my favourite SquaredUp features - it enables you to combine data from multiple sources and run queries over them as if they were SQL tables. SQL Analytics now ships with our SmartAssist AI assistant - so you don't even need to know any SQL to build queries - just describe the outcomes you need: 👉 "Show me my combined cloud spend across AWS and Azure." 👉 "Now format that to two decimal places." 👉 "Break it down by week and show it as a column chart." I have written a walkthrough in this blog article: https://lnkd.in/ettFY_Jq
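For readers who want to see what those three prompts amount to in plain SQL, here is a minimal sketch. It does not use SquaredUp or SmartAssist; it assumes two hypothetical tables, `aws_spend` and `azure_spend`, loaded into an in-memory SQLite database with made-up sample rows, and shows the combine, round, and group-by-week steps only.

```python
import sqlite3

# Hypothetical sample data: (day, cost) rows per cloud provider.
AWS_ROWS = [("2024-01-01", 10.123), ("2024-01-03", 5.5), ("2024-01-09", 7.25)]
AZURE_ROWS = [("2024-01-02", 4.004), ("2024-01-10", 2.75)]


def weekly_combined_spend(aws_rows, azure_rows):
    """Return (week, total) rows for combined spend, rounded to 2 decimals."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aws_spend (day TEXT, cost REAL)")
    conn.execute("CREATE TABLE azure_spend (day TEXT, cost REAL)")
    conn.executemany("INSERT INTO aws_spend VALUES (?, ?)", aws_rows)
    conn.executemany("INSERT INTO azure_spend VALUES (?, ?)", azure_rows)

    query = """
        SELECT strftime('%Y-%W', day) AS week,   -- break it down by week
               ROUND(SUM(cost), 2)    AS total   -- two decimal places
        FROM (
            SELECT day, cost FROM aws_spend      -- combine both sources
            UNION ALL
            SELECT day, cost FROM azure_spend
        )
        GROUP BY week
        ORDER BY week
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for week, total in weekly_combined_spend(AWS_ROWS, AZURE_ROWS):
        print(week, total)
```

The chart step is left out on purpose; the rows returned here are what a column chart would plot, one bar per week.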
-
Snowflake is not a database. It is a cloud data platform with elastic compute, credit-based pricing, and an expanding ecosystem of dbt, Snowpipe, Tasks, and Streams. So why do most observability tools still monitor it like it is Postgres? #PRIZM was built from the metadata up to understand how Snowflake actually works. Criticality-aware profiling means your CFO's revenue table gets deep checks. The forgotten staging table from 2019 gets nothing. Your credit bill stays predictable even as your catalog grows into tens of thousands of assets. We broke down the 7 layers you actually need to observe and the 10 capabilities to demand before you sign with any vendor. https://lnkd.in/djYSuUpj #DataObservability #Snowflake #DataQuality #AIReadyData #DataEngineering #PRIZM
-
Most Databricks teams can see their costs. Explaining them is a different problem entirely. Visibility is having the numbers — DBU reports, cloud invoices, usage dashboards. Most teams have all of that. Explainability is being able to answer the questions those numbers raise: → Which job drove last month's spike? → Which team owns the spend? → Did that cluster optimisation actually reduce total cost — or just shift it from the Databricks bill to the cloud bill? Neither billing system answers those questions on its own. And bridging them takes more than a notebook. Our latest article breaks down exactly why — the structural reason the two systems can't reconcile themselves, and why having the data is not the same as having the answer. Full article linked in the comments → #Databricks #FinOps #DataEngineering #CostObservability #CloudCost #DataPlatform #lumin8
-
I have built data platforms on AWS, Azure, and GCP. Here is the truth nobody puts in a blog post: The cloud does not matter as much as you think. I have seen beautiful AWS architectures that nobody trusted. I have seen a single Snowflake table on GCP that ran a $2B business unit's reporting. I have seen Azure Databricks pipelines that were masterpieces of engineering and took 8 months to deliver zero business value. The platform is never the problem. The problem is almost always one of these three things: Nobody agreed on what the data should mean before building the pipeline. The people consuming the data were never involved in designing it. The team optimized for technical elegance instead of business outcomes. I have made all three mistakes. On all three clouds. The engineers who consistently deliver are not the ones with the most certifications or the most impressive stack. They are the ones who spend the first week asking business questions instead of writing code. The cloud is just infrastructure. Judgment is the actual skill. Which mistake have you made most often? Be honest. #DataEngineering #AWS #Azure #GCP #BigData #TechCareers
-
The new Knowledge Catalog from Google Cloud looks very interesting and seems to attempt to solve the issue around context and agents. Much around AI these days seems to revolve around this particular issue, rather than around using model number 3.whatever. https://lnkd.in/gR2rHzK8 Getting charged "per DCU-hour" however ... even the documentation admits it's "an abstract billing unit"
-
We cut cloud costs by 80%!! 💸📉 The surprising part? The biggest saving wasn't from better code. 🙅‍♀️ It was from reading the bill properly. 📑🔍 We found: → Clusters running all night with no jobs on them. 🌙💤 → Data stored in CSV (on a distributed system, in 2022!). 📁🚫 → Jobs running one by one that could run in parallel. 🐢➡️🐎 Migrated to Databricks. Converted to Delta Lake. Fixed the cluster config. 🛠️✨ The Result: ✅ 80% cost reduction. ✅ 85% faster processing. Sometimes the best engineering is just paying attention. 🧠💡 What's the most surprising inefficiency you've found in a data platform? 👇 #Databricks #DataEngineering #CloudCost #BigData #ApacheSpark #DeltaLake #CloudOptimization #TorontoTech #TorontoDataEngineer
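The "one by one vs. in parallel" point can be illustrated with a small sketch. This is not the setup described in the post (which used Databricks); it is a generic Python example using the standard library's thread pool, and the three job functions are hypothetical stand-ins for pipeline steps that do not depend on each other.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobs_in_parallel(jobs, max_workers=4):
    """Run independent zero-argument callables concurrently.

    `jobs` maps a job name to a callable. Returns a dict of results by name.
    Only safe when no job depends on another job's output.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the jobs overlap in time.
        futures = {name: pool.submit(job) for name, job in jobs.items()}
        # result() blocks until that job finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for independent pipeline steps.
def load_orders():
    return "orders loaded"


def load_customers():
    return "customers loaded"


def load_products():
    return "products loaded"


if __name__ == "__main__":
    results = run_jobs_in_parallel(
        {
            "orders": load_orders,
            "customers": load_customers,
            "products": load_products,
        }
    )
    print(results)
```

Threads suit jobs that mostly wait on I/O or on a remote engine; steps that feed each other still have to run in order.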
-
If your team is dealing with fragmented data environments, runaway BigQuery costs, or pipelines that underperform at scale, this session is worth your time. On May 29, Jellyfish Training (Google's North American Cloud Training Partner of the Year) is running a full-day virtual session: 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗶𝗻𝗴 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗗𝗮𝘁𝗮 𝗪𝗼𝗿𝗸𝗹𝗼𝗮𝗱𝘀 𝗼𝗻 Google Cloud. Built for data architects and engineers, the day covers Knowledge Catalog (formerly Dataplex) and Data Mesh architecture, BigQuery workload management and pricing, Dataflow and batch pipeline tuning, and a FinOps module on budgets and alerting. Two hands-on labs are built into the schedule. Participants who complete the training earn a digital Credly "Enterprise Data Efficiency" badge. 📅 May 29, 2026 | 9:00 AM – 5:00 PM CDT | Virtual. Participation is limited to keep the experience hands-on and high-impact. 🔗 Register here ➡ https://lnkd.in/e_H6UAJu #GoogleCloud #DataEngineering