Snowflake's Cortex Code triaged three broken pipeline nodes and fixed them in a matter of minutes.
I wasn't lying when I said this is the biggest software unlock this year.
Check out the demo below where I walk through exactly how Cortex Code paired with Coalesce.io's MCP server is able to save me HOURS.
The workflow:
1. CoCo identifies the failures
2. Agents fix the broken nodes simultaneously
3. I review, test, and deploy
This is just one way to use this tool; more coming tomorrow! Follow me to stay up to date.
Repo: https://lnkd.in/eaBQZDMk
What's going on, guys? My name is Jared Robado. Today I'm going to show you something I'm coining "Agentic Data Engineering," and I think it's something we'll see a lot of in the next couple of months. It's super exciting, so let's jump right into it.

All right, let's start with the problem statement. One of the most frustrating things as a data engineer on a small data team like mine is having to debug failed tables. Right now we're in Coalesce, which is a data transformation tool I highly recommend, and here we have some instances of failed tables. This is the pain point: as a data engineer, if I didn't create the pipeline and somebody else on my team did, I have to gain the context of how that table was built. Instead, we're going to deploy an agent to figure out what's wrong.

Obviously this is a demo, so I've planted the errors on each of the tables. Just to walk you through them, so we can confirm the agent fixes them correctly: we have a NOT NULL violation on the department code column of the sample employees data-load table; we have a duplicate-row error because another business key column needs to be set; and we're doing a CAST on a non-numeric value. Those are the three errors our agent will remediate for us.

So here we are in Cortex Code, which in my opinion is a huge unlock for any data team. It's a product Snowflake released back in February, and I highly recommend everybody go check it out. I have my prompt written here: "Let's investigate the latest job run using the Coalesce job failure investigation skill," which is right here. (The entire repository and the skills are linked in the comments below.) So let's go ahead and run this. It has gone and found the latest job failure.
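The three seeded errors described above follow familiar SQL failure patterns. Here is a hedged sketch of what each one might look like; the table and column names are hypothetical stand-ins, not taken from the actual repo:

```sql
-- Illustrative versions of the three seeded failures (names are hypothetical).

-- 1. NOT NULL violation: a NULL department_code hits a non-nullable column.
INSERT INTO dim_employees (employee_id, department_code)
SELECT employee_id, department_code   -- NULL here fails the constraint
FROM raw_employees;

-- 2. Duplicate rows: the dimension's business key is missing a column,
--    so multiple source rows collapse onto the same key and violate
--    uniqueness on the merge.

-- 3. Conversion failure: a plain CAST raises an error on non-numeric text.
SELECT CAST(amount_text AS NUMBER) AS amount
FROM raw_events;                      -- errors on values like 'N/A'
```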
So after it ran, it tells us it found the three nodes, which, as we saw in Snowflake earlier, are the three nodes that failed. Right here you can see it's working in parallel: at the same time, it's fetching each workspace node, which is a really, really cool unlock with the Cortex command line. Now it has the full JSON configuration for each Coalesce node; it will analyze each one and figure out what it needs to add or remove in order to make those changes.

All right, here is the summary of what it did. I'm back in Coalesce just to show you again how cool and exciting this is. It added the event TS as the business key on the dimension table, added this COALESCE transformation to fix the department code having null values in a non-nullable field, and swapped in this TRY_CAST instead of CAST.

So just to close the loop on everything: the Cortex command line recommended the changes and used the Coalesce MCP to apply them. In Coalesce I've deployed these changes, and now I'm going to run the job again. Something that previously would have taken me anywhere from 30 minutes to two hours, or longer if I didn't build the pipeline myself and had to gather all the context, can now be done in a matter of minutes. Very exciting stuff. Shout out to the teams at Snowflake and Coalesce for bringing two products together, and I look forward to seeing what everybody else is building.
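The three fixes the agent applied can be sketched in Snowflake SQL. This is a minimal, hedged illustration; table and column names are assumptions, and in Coalesce the business-key change is made in the node's configuration rather than in raw SQL:

```sql
-- Hedged sketch of the three fixes (illustrative names, Snowflake SQL).

-- 1. NOT NULL fix: default missing department codes instead of failing.
SELECT COALESCE(department_code, 'UNKNOWN') AS department_code
FROM raw_employees;

-- 2. Duplicate-row fix: include event_ts in the dimension's business key
--    so rows that share the other key columns no longer collide.
--    (Configured on the Coalesce dimension node, not hand-written SQL.)

-- 3. Conversion fix: TRY_CAST returns NULL on bad input instead of erroring.
SELECT TRY_CAST(amount_text AS NUMBER) AS amount
FROM raw_events;
```

The TRY_CAST swap is the classic trade-off here: the pipeline stops failing, but non-numeric inputs silently become NULLs, so it's worth pairing with a downstream data-quality check.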
This is fun, we are doing something similar with #Frosty at Gyrus, an open-source agent for Snowflake that can analyze failures, plan fixes, and generate SQL changes automatically.
This is the kind of AI use case that sells itself. Triaging broken pipeline nodes and actually fixing them in minutes instead of hours of manual debugging. Data teams spend so much time on pipeline maintenance that anything that cuts that down has an immediate measurable ROI. No hype needed.
That’s a great example of where agentic data engineering is heading. Moving from alerting on failures to actually triaging and fixing pipelines in parallel can dramatically cut downtime. The real impact isn’t just speed; it’s shifting engineers from reactive firefighting to higher-value improvements.