Recce

Data Infrastructure and Analytics

Helping data teams preview, validate, and ship data changes with confidence.

About us

Recce helps modern data teams preview, validate, and ship data changes with confidence. By turning pull requests into structured, context-rich reviews, Recce makes it easy to spot meaningful changes, verify intent and impact, and reduce cognitive load for authors and reviewers alike. Curate reproducible checklists that compare data across environments — so you can catch what matters, skip what doesn’t, and align your team before merging. Accelerate development, cut down on manual QA, and bring visibility, verifiability, and velocity to your data workflows.

Website
https://datarecce.io
Industry
Data Infrastructure and Analytics
Company size
2-10 employees
Headquarters
San Francisco
Type
Privately Held
Specialties
dbt, Modern Data Stack, code review, Data Engineering, SQL, Data Lineage, Query Diff, Lineage Diff, and Data Model Diff


Updates

  • Can AI coding agents build something as intricate as Apache Arrow? Not yet, says co-creator Wes McKinney on the Data Renegades Podcast. "Arrow is a project that has the intricacy of a fine Swiss watch. There's a lot of very small details that were created very painstakingly over a long period of time." Wes McKinney uses AI agents daily. He runs parallel Claude Code sessions and shipped two new open source projects in the last month. But he draws a sharp line at core infrastructure: file formats, processing engines, metadata management. These require deliberation and architectural nuance that current agents lack. Data infrastructure remains one of AI's hardest frontiers. Listen wherever you catch your favorite podcasts. #ApacheArrow #datainfrastructure #AIlimits #dataengineering #DataRenegades

  • Wes McKinney dropped this thesis on the Data Renegades Podcast: AI has created radical accountability for every software vendor. Building software just got dramatically cheaper. One engineer with a Claude subscription can prototype a replacement for tools that entire teams used to tolerate. Customers no longer have to accept mediocre products because the cost of leaving has collapsed. Wes's message to vendors shipping broken tools: "This is bad. Why haven't you fixed this yet? If I was on your engineering team, I would have already fixed this. I would have done it like today with Claude Code." The flip side: a flood of AI-generated projects that only make sense to their creators. Hyper-personalized software with bad taste. The barrier to entry dropped, but the bar for credibility rose. "People are going to decide which companies to pay attention to on the basis of how credible the people are involved." Full episode linked in comments, or wherever you listen to podcasts. #AIcodingagent #radicalaccountability #agenticworkflows #dataengineering #DataRenegades

  • Data Debug brought three practitioners to Mux's office this past Tuesday to answer one question: how do you make AI actually reliable for data work? The talks told a complete story. Claire Gouze, CEO and founder of nao Labs (YC X25), benchmarked 21 AI analytics tools on text-to-SQL accuracy. The headline finding: going from no context to a cleaned data model jumped accuracy from 17% to 86%. Semantic layers alone? 4% correct. Context quality is everything. Our own Dori Wilson shared the AI skills framework she built to operationalize that context. Skills are markdown files that encode domain knowledge, workflows, and guardrails into AI coding tools. The system is structured as a self-improving loop, so every session compounds. She walked through a real aggregation bug Claude introduced, how a review skill caught it, and how the fix became a permanent rule the system enforces automatically. Kasia Rachuta (Lead Data Scientist) showed the breadth of what's possible today: analyzing CS tickets with Snowflake Cortex AI, fuzzy address matching that beat regex by 20%, automated Slack responses from documentation, and ETL doc generation. The practical filter: knowing when AI saves time versus when it's faster to write the code yourself. All three full talks are now on YouTube. See them here: https://lnkd.in/g6f_TxSP Data Debug SF runs monthly. If you're building with AI in data, this is the room to be in. #DataDebugSF #DataEngineering #AnalyticsEngineering #AI #dbt

  • The creator of pandas went full-time on an unpaid open source library at 26. No salary. A year of savings. A mouse-infested apartment in the East Village. "I would just wake up and write Python code. Take a break to eat and go to yoga. Then work until midnight or one o'clock in the morning, pretty much every day, seven days a week." Wes McKinney built the most-used Python data library in the world on founder hours before the term existed. Full conversation on Data Renegades. Link in comments. #pandasPython #opensource #dataengineering #DataRenegades

  • Recce reposted this

    Mattia Pavoni · Bauplan

    One question we got during our last webinar: "Is the underlying data warehouse Iceberg? What are the options?" Short answer: yes. Apache Iceberg is the open format handshake between Bauplan and the rest of your stack. Your data never moves — it stays in your storage layer, your existing source of truth. No migration, no lock-in. Turns out that matters quite a bit when AI agents are running dozens of experiments in parallel. In this webinar, we show exactly how that works end to end — with our friends at Recce: → An agent builds a user segmentation pipeline from scratch, in full branch isolation → A second agent adds bot detection to that same pipeline → Recce's review agent compares the branches, surfaces schema diffs + lineage impact, and generates an auditable merge report Zero production risk. Full human oversight. Structured workflow. Trusting AI agents with your data is one of the hardest unsolved problems in data engineering. This is how we're solving it. Full recording 👇 https://lnkd.in/g4FpT8Nc

  • What's the worst production failure you've seen? Scott Breitenother borrows a comedian's line to reframe the question: "When an escalator breaks, it just becomes stairs. When your data workload fails, it often just results in stale data." Early in his career, a failed pipeline meant panic. Page the team, drop everything, scramble to fix it. Over time, the real lesson was learning to separate the severity levels. A real error (wrong numbers going to an exec) is fundamentally different from a pipeline that didn't run and left the data three hours old instead of one. Most data failures fall into the second category. The dashboard is stale. The report is delayed. But the numbers, when they arrive, are correct. Understanding that distinction changes how teams build alerting, handle on-call rotations, and decide what actually deserves a 2am page. "I think we'll be OK." Listen to the full Data Renegades Podcast episode with Scott wherever you get your favorite podcasts. #DataEngineering #DataReliability #Analytics

  • "The data team can't afford to be two years behind. You need to be using AI to generate the code. You need to get that exoskeleton going now." Scott Breitenother on Data Renegades, on why the gap between data teams and engineering teams is no longer acceptable. Data teams have historically trailed engineers by about two years in adopting modern software practices. Git, CI/CD, testing frameworks. Every cycle, data was a step behind. Scott says this time the technology is moving too fast and the cost of falling behind is too high to wait. The answer is the full-stack data person who owns the entire pipeline and uses AI agents to move at engineering speed. Not five specialized roles. One person, full stack, augmented. Our own CL Kao asks if every data team should operate this way. Scott doesn't hedge: "Yes. We need to be making full stack data folks that can move faster and use their exoskeletons." The question isn't whether the shift is coming. It's whether your team is already behind. Listen to the full episode with Scott wherever you get your favorite podcasts. #DataEngineering #DataScience #AIAgents #Analytics

  • "These technologies, they're not robots replacing us. They're exoskeletons that make us better, faster, stronger." Scott Breitenother on Data Renegades Podcast on why the AI conversation keeps getting framed wrong. Every technology wave follows the same pattern. MPP databases didn't replace analysts. The modern data stack didn't replace data engineers. LLMs won't either. Each one is an exoskeleton that magnifies what humans already do well. Scott compares it to a stock analyst in the 1950s covering five stocks. By the 70s, maybe twenty. Now with AI, maybe ten thousand. The job isn't gone. The surface area is just bigger. For data teams, that means instead of telling one story, you tell twenty. The productivity bar rises, but nobody is getting replaced by the tool. "You might be put out of a job by someone who knows AI. But if you know the tools, you're not going anywhere." Listen to the full episode with Scott wherever you get your favorite podcasts. #DataEngineering #AI #Analytics #DataTeams

  • At Kilo, new engineers ship code on their first day. A working MVP feature by end of week. In production the next week. Scott Breitenother on Data Renegades Podcast on what happens when writing code is no longer the bottleneck. "We've tried to ruthlessly hunt down every unnecessary decision gate, every kind of comfort blanket that organizations put in to make people feel comfortable." The pattern he sees with every new hire: intimidation on day one, validation when given real ownership, then flight. Nobody wants to spend their time asking for permission. Once the gates come down, people move at what Kilo internally calls "kilo speed." The developers aren't working 24-hour days. They're just using agents and nobody is asking them to check in every five minutes. Listen to the full episode with Scott wherever you get your favorite podcasts. #AIAgents #Engineering #DataTeams #Analytics
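The "fuzzy address matching that beat regex by 20%" mentioned in the Data Debug recap above wasn't detailed in the talk summary. As a rough illustration of the general technique only (not Kasia's actual Snowflake implementation), a similarity-ratio matcher using Python's standard-library difflib might look like this; the normalization rules and the 0.85 threshold are assumptions for the sketch:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized addresses."""
    def norm(s: str) -> str:
        # Lowercase, drop commas, collapse whitespace -- a minimal normalizer.
        return " ".join(s.lower().replace(",", " ").split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def match_address(query: str, candidates: list[str], threshold: float = 0.85):
    """Return the closest candidate above the threshold, else None."""
    best = max(candidates, key=lambda c: similarity(query, c))
    return best if similarity(query, best) >= threshold else None

addrs = ["123 Main St, San Francisco, CA", "456 Market Street, SF, CA"]
print(match_address("123 Main Street San Francisco CA", addrs))
```

Unlike a regex, the ratio tolerates abbreviations ("St" vs "Street") and punctuation differences without enumerating every variant, which is the usual reason fuzzy matching wins on messy address data.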

Funding

Recce: 1 total round

Last Round

Pre-seed

US$ 4.0M

See more info on Crunchbase