DEV Community

# dataengineering

Posts

Building a Production-Ready Data Pipeline on AWS: A Hands-On Guide for Data Engineers

3 min read
Comparing Great Expectations and CsvPath Framework

8 min read
Financial Transaction Data Reconciler PayPal

5 min read
Introducing dremioframe - A Pythonic DataFrame Interface for Dremio

9 min read
Stifel Modern Data Platform

4 min read
Core Microsoft Fabric Concepts

3 min read
Implementing a CDC pipeline with Debezium

8 min read
LogInSight: A Lightweight CloudWatch Log Analytics Tool for Faster Debugging and Real-Time Insights

3 min read
Stop Waiting for the Cloud: Building a Hybrid SQL+Python Data Pipeline Locally with DuckDB

5 min read
Building Streaming Iceberg Tables for Real-Time Logistics Analytics

4 min read
Smart Invoice Analyzer — How I Automated Invoice Processing & Predicted Sales Using Machine Learning

2 min read
Building a Scalable Community Health Worker Analytics Platform: My Journey with dbt and Snowflake

4 min read
A Stranger In a New Town: CsvPath metadata fields

6 min read
dupl

1 min read
Apache Dev List Digest: Iceberg, Polaris, Arrow & Parquet (Nov 18–24, 2025)

5 min read
Embeddings and Vector Similarity: How Machines Understand Meaning

19 min read
From Raw to Refined: Data Pipeline Architecture at Scale

12 min read
Agent Cost Optimization: A Data Engineer's Guide

13 min read
INTRODUCTION TO DBT(Data Build Tool)

2 min read
Breaking Into Gaming Analytics: From 1 Billion Mobile Users to 5B Daily Events

6 min read
Building an Enterprise Patching Dashboard with AWS - A Complete Guide

5 min read
Taming the Data Beast: Build Pipelines That Bend, Not Break by Arvind Sundararajan

2 min read
Azure Data Solutions: Data Factory, Synapse, Data Lake & Databricks Integration

4 min read
Data Quality at Scale: Why Your Pipeline Needs More Than Green Checkmarks

8 min read
Introduction to the Confluent REST Proxy

4 min read