"If you're not using AI all day, every day, writing any code that's not written by an AI, I don't know what you're doing." - Max Beauchemin on the current state of software development. He's coding 10 hours a day at 20x speed using Claude, built Claudette to manage the workflow, and wrote a sessionization pipeline with dbt plus a custom CLI called sup that gives Claude SQL access. The people who figure out agentic coding on their large dbt projects and Airflow DAG repos will have a higher productivity multiple than their peers. The shift is already here for those paying attention. Check out the entire episode at https://lnkd.in/gN-KgdBY and subscribe to the Data Renegades podcast wherever you get your podcasts. #AIEngineering #DataEngineering #AgenticWorkflow #ProductivityHacks #ClaudeAI
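The sessionization pipeline mentioned above is built in dbt; as a rough illustration of the idea only, here is a minimal Python sketch. The 30-minute inactivity gap and the event shape are common conventions, not details from the episode:

```python
from datetime import datetime, timedelta

# Assumption: a new session starts after 30 minutes of inactivity,
# a common sessionization rule (not necessarily Max's).
SESSION_GAP = timedelta(minutes=30)

def sessionize(timestamps):
    """Assign a session index to each timestamp for a single user."""
    sessions, current, prev = [], 0, None
    for ts in sorted(timestamps):
        if prev is not None and ts - prev > SESSION_GAP:
            current += 1  # gap exceeded: start a new session
        sessions.append(current)
        prev = ts
    return sessions
```

In dbt, the same logic is typically expressed in SQL with a window function comparing each event's timestamp to the previous one.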
Max Beauchemin on AI-driven coding and productivity
Data science is one of the most talked-about roles in tech, but who is a data scientist and what do they do? In this episode [https://lnkd.in/eukumVRC], I break down the real work of data scientists, explore how the role differs from that of data analysts, discuss where data scientists work, and examine how AI is shaping the future of the field. Whether you’re curious about data science, thinking about transitioning from analytics, or trying to cut through the hype, this episode gives you a clear and honest perspective on the role. I plan to record an episode every week, covering many data topics: data analysis, data science, data engineering, data annotation. I want to talk about it all: my personal journey, the job market, the skills required, and how they are evolving. Nothing is off the table as long as it has to do with data. I will also have guests on to share their experiences, so if you are interested in being a guest on the podcast, please send a DM or drop a message in the comments.
Day 16 of my RAG journey: Today, I laid the first real foundation for my RAG pipeline. We often think of fancy prompts or clever chat flows, but it all starts with data ingestion. Here's what I achieved today: → A batch file uploader for TXT, MD, and PDF → Automatic text extraction from these documents → Clean raw text stored in Snowflake tables No embeddings. No vectors. Just solid, usable text ready for the next steps. What did I learn? 1. RAG isn't a magic trick; it begins with clean data. 2. Handling different file formats is crucial. 3. If your data ingestion process is messy, retrieval will struggle later. 4. Snowflake serves as an excellent central store for RAG pipelines. The next steps are chunking, embedding, and retrieval. Building this pipeline step-by-step is illuminating the process behind RAG for me. Deep understanding comes from doing, not just reading or following others. So, whatever your journey, remember to focus on the fundamentals. What building block are you working on today? Share your experience and tag someone on a similar path. Snowflake Streamlit Chanin Nantasenamat 👍 Follow Sudeep Kumar ✅ for more on Data Engineering & AI ♻️ Repost to help your network learn along #30DaysOfAI #Streamlit #LLM #RAG #AIEngineering #DataEngineering #SudeepKumar10x #SnowflakeSquad
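The ingestion step above (batch upload of TXT/MD/PDF, text extraction, raw text into Snowflake) can be sketched in Python. This covers only the plain-text half: the post doesn't say which PDF library it uses, so PDFs are skipped here, and the Snowflake insert is omitted. Function names are illustrative assumptions:

```python
from pathlib import Path

# Plain-text formats we can read directly; PDF extraction would need a
# dedicated library (assumption: not specified in the post).
TEXT_SUFFIXES = {".txt", ".md"}

def supported(path):
    """True if this sketch can extract text from the file."""
    return Path(path).suffix.lower() in TEXT_SUFFIXES

def extract_text(path):
    """Return the raw text of a supported plain-text file."""
    p = Path(path)
    if not supported(p):
        raise ValueError(f"unsupported file type: {p.suffix}")
    return p.read_text(encoding="utf-8", errors="replace")

def ingest_batch(paths):
    """Yield (filename, raw_text) rows ready to load into a raw table."""
    for p in map(Path, paths):
        if supported(p):
            yield p.name, extract_text(p)
```

Keeping extraction separate from loading makes the "clean data first" lesson concrete: retrieval quality later depends on what these functions emit now.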
🚀 𝐃𝐀𝐘 13 𝐂𝐎𝐌𝐏𝐋𝐄𝐓𝐄𝐃 – 𝐃𝐚𝐭𝐚𝐛𝐫𝐢𝐜𝐤𝐬 14 𝐃𝐚𝐲𝐬 𝐀𝐈 𝐂𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞 📌 𝐓𝐨𝐩𝐢𝐜: 𝐌𝐨𝐝𝐞𝐥 𝐂𝐨𝐦𝐩𝐚𝐫𝐢𝐬𝐨𝐧 & 𝐅𝐞𝐚𝐭𝐮𝐫𝐞 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐖𝐡𝐚𝐭 𝐈 𝐥𝐞𝐚𝐫𝐧𝐞𝐝 𝐭𝐨𝐝𝐚𝐲 🔹 Training multiple ML models 🔹 Feature engineering using Spark 🔹 Building Spark ML Pipelines 🔹 Tracking and comparing experiments with MLflow 🔗 Notion Link: https://l1nq.com/yjENt ✨ 𝐊𝐞𝐲 𝐭𝐚𝐤𝐞𝐚𝐰𝐚𝐲: The best model is chosen through comparison and metrics, not assumptions. Codebasics | Databricks | Indian Data Club
Day 13/14 – Model Comparison & Feature Engineering 🚀 Today I worked on comparing multiple machine learning models using MLflow and building a Spark ML Pipeline in Databricks. What I did: • Trained Linear Regression, Decision Tree, and Random Forest models • Tracked metrics (R², RMSE) using MLflow • Compared model performance using MLflow visualizations • Built an end-to-end Spark ML pipeline • Selected the best model based on performance Key takeaway: Even with limited features, Random Forest performed better by capturing non-linear patterns. Databricks Codebasics Indian Data Club #Databricks #MLflow #MachineLearning #DataEngineering #LearningByDoing
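The selection step in the two Day 13 posts above (train Linear Regression, Decision Tree, and Random Forest; pick the winner by logged metrics, not assumptions) reduces to a comparison over MLflow-style run metrics. A minimal sketch with illustrative metric values, not the authors' actual numbers:

```python
# Metrics as MLflow might log them per run (values are made up for
# illustration; lower RMSE and higher R^2 are better).
runs = {
    "linear_regression": {"rmse": 4.2, "r2": 0.61},
    "decision_tree":     {"rmse": 3.8, "r2": 0.68},
    "random_forest":     {"rmse": 3.1, "r2": 0.77},
}

def best_model(runs, metric="rmse", lower_is_better=True):
    """Pick the run name with the best value for the given metric."""
    chooser = min if lower_is_better else max
    return chooser(runs, key=lambda name: runs[name][metric])
```

In a real workflow the `runs` dict would come from `mlflow.search_runs()` rather than being hard-coded.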
Observe CTO and co-founder Jacob Leverich goes beyond the usual talking points and shares how Observe actually built its platform on a data lake architecture for real-time observability at scale. On the Data Engineering Podcast, Jacob covers why open standards matter (OpenTelemetry for collection and Apache Iceberg as an open table format), and how AI-assisted workflows can work better when they’re grounded in a curated “context graph” organized around real-world use cases—helping support and engineering teams, not just SRE. He also lays out the practical design decisions required to make a data lake searchable and economically viable as telemetry grows: durable streaming ingest engineered for scale, curated columnar datasets to reduce scan/read amplification, and context-driven correlation across logs, metrics, and traces so teams can troubleshoot faster with fewer handoffs. Bottom line: Observe is engineered to consolidate telemetry, extend retention, and accelerate troubleshooting—without the usual cost curve. https://lnkd.in/gb-sCP5w #observability #opentelemetry #apacheiceberg #dataengineering #sre
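The "context-driven correlation across logs, metrics, and traces" Jacob describes can be illustrated, very loosely, as joining telemetry streams on a shared key so one lookup gathers everything related to an incident. The `trace_id` join key and field names are assumptions for illustration, not Observe's actual schema:

```python
def correlate(trace_id, logs, metrics, traces):
    """Gather all telemetry rows sharing one trace_id into a single view."""
    pick = lambda rows: [r for r in rows if r.get("trace_id") == trace_id]
    return {"logs": pick(logs), "metrics": pick(metrics), "traces": pick(traces)}
```

The point of the curated "context graph" is that this join is precomputed and organized around use cases, rather than re-derived by hand during every incident.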
Most observability stacks break at scale because telemetry gets too expensive to keep. Observe cofounder/CTO Jacob Leverich shares a rare, founder-level walkthrough of the design decisions required to deliver real-time observability on a data lake. In a crowded technical market, this kind of under-the-hood detail is the best way to understand a company's differentiation. I really enjoyed this podcast episode!
If you’re curious why Observe, Inc. is built on Snowflake, give this a listen. It just makes sense. The next generation of observability architectures needs to be data lake native.
The final part of my "Hands-on dbt with ClickHouse" series is live. As always, feedback, shares and likes are very welcome! A dbt project that needs a human to run it is not a pipeline. It is a script. In the final part of the series, we move from a manually triggered setup to a real pipeline. Scheduled runs, dependency-aware execution, and no human in the loop to remember to press a button. This article is about orchestration and about turning dbt into something that actually runs on its own. And as a bonus, it is not just about running jobs. We also cover notifications, so failures do not stay hidden in logs, but show up in Slack, where teams can react before broken data reaches dashboards. In the previous parts, we built the full foundation step by step: 1️⃣ Environment [https://lnkd.in/egn28jUf] 2️⃣ Staging [https://lnkd.in/ewryrVhh] 3️⃣ Intermediate models [https://lnkd.in/eZkAYPPE] 4️⃣ Configs [https://lnkd.in/edQmJery] 5️⃣ Seeds and Snapshots [https://lnkd.in/e53fkPBy] 6️⃣ Ad hoc Analyses [https://lnkd.in/ejRU_pyp] 7️⃣ Data Tests [https://lnkd.in/eT-Dt4X4] 8️⃣ Unit Tests [https://lnkd.in/dQvQcVyt] The final part ties everything together and shows how a dbt project moves from a local playground to a production-ready pipeline. Read it here 👉 [https://lnkd.in/ddA6-gXj] #dbt #clickhouse #airflow #dataengineering #analyticsengineering #learning
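The failure-notification idea from the article (surface broken runs in Slack instead of burying them in logs) can be sketched independently of any orchestrator. A hedged illustration, not the article's actual Airflow setup: the runner and notifier are injectable stand-ins, and the function names are assumptions:

```python
import subprocess

def run_dbt(command=("dbt", "build"), notify=print, runner=subprocess.run):
    """Run a dbt command; on failure, send a notification rather than
    letting the error hide in logs. `notify` would post to a Slack
    webhook in a real pipeline (assumption: swapped for print here)."""
    result = runner(command, capture_output=True, text=True)
    if result.returncode != 0:
        notify(f"dbt run failed: {result.stderr[-500:]}")  # last 500 chars
        return False
    return True
```

In Airflow the same effect is usually achieved with an `on_failure_callback` on the DAG or task, which is the approach an orchestrated setup would take.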
New to Data Science? Confused by all the buzz? We’ve created a free podcast series to explain what data science and machine learning really are — without jargon or hype. 5 short episodes on: • What data science is (and isn’t) • Why data quality matters • How real-world data becomes ML 🎧 Listen on Spotify: https://lnkd.in/dSeJyx9w
✅ 𝗗𝗮𝘆 11 𝗖𝗼𝗺𝗽𝗹𝗲𝘁𝗲𝗱 - 𝗗𝗮𝘁𝗮𝗯𝗿𝗶𝗰𝗸𝘀 𝟭𝟰 𝗗𝗮𝘆𝘀 𝗔𝗜 𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲 🎯 PHASE 3: ADVANCED ANALYTICS Today’s focus was on Statistical Analysis & ML Preparation, where I explored how data moves one step closer to machine learning readiness using Databricks. What I learned on Day 11: Descriptive statistics Hypothesis testing A/B test design Feature engineering 🔍 What I worked on Day 11: 1. Calculated statistical summaries 2. Tested hypotheses (weekday vs weekend) 3. Identified correlations 4. Engineered features for ML Sharing today’s progress below. Now I’m ready to visualize these insights on 𝐃𝐚𝐲 12 🚀 Thanks to Codebasics, Databricks, and Indian Data Club for this hands-on learning experience. #Day11 #14DayAIChallenge #DatabricksWithIDC #Databricks #DeltaLake #DataEngineering #Codebasics #IndianDataClub
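The weekday-vs-weekend hypothesis test from Day 11 typically comes down to a two-sample t statistic. A minimal standard-library sketch of Welch's form (which does not assume equal variances); the challenge itself likely uses Spark or scipy rather than this hand-rolled version:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples, e.g. a weekday
    metric vs a weekend metric. Positive means sample a's mean is higher."""
    va, vb = variance(a), variance(b)  # sample variances
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))
```

To turn the statistic into a p-value you would compare it against a t distribution with Welch-Satterthwaite degrees of freedom, which `scipy.stats.ttest_ind(a, b, equal_var=False)` handles in one call.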