DEV Community

# dataengineering

Posts

"Shannon Was Right, But We Can Be Smarter: How ALEC Achieves 22x Compression on IoT Data"

Comments
3 min read
Apache Data Lakehouse Weekly: January 15–22, 2026

Comments
5 min read
Medallion Architecture 101: Building Data Pipelines That Don't Fall Apart

Comments
11 min read
My Friday "Sanity Savers" (Software, Data & DevOps edition) 🛠️

Comments 1
1 min read
Introduction to Linux for Data Engineers, Including Practical Use of Vi and Nano.

Comments
3 min read
Introducing `everyrow.io/dedupe`: An LLM-based approach to semantic deduplication

Comments
6 min read
Streaming Crypto Changes: A Practical Guide to Real-Time Data Pipelines with Debezium CDC

Comments
3 min read
Lakehouse? More Like a Lake + Warehouse Parking Lot

5
Comments
10 min read
Why AI Models Fail in Production — Even When Accuracy Looks High

Comments
1 min read
🕸️ I Just Deleted My Scraper Boilerplate: Meet the "One-Liner" Crawler

Comments
3 min read
From Splicing Fibers to Scaling Clouds: My Journey to the AWS Community

Comments
2 min read
Building Production-Grade Data Analytics Pipelines: A Real-World Case Study in Government Data

Comments
9 min read
Building an Automated Data Pipeline

Comments
2 min read
SQL - PostgreSQL: Execution Order

2
Comments
5 min read
The Three Phases of Data Pipelines

Comments
4 min read
Architecture of a 6TB Media Pipeline: Engineering Real-Time Content at Bharat Drone Shakti

Comments
6 min read
Why Columnar Storage Makes Analytics Faster

1
Comments
1 min read
Are Wide Tables Fast or Slow?

5
Comments
4 min read
HOW TO GIT IT

Comments
3 min read
Tableau + Databricks at Scale: A Technical Guide for Managing 10,000+ Databases

Comments
5 min read
Pipelines, ETL, and Warehouses: The DNA of Data Engineering

4
Comments
4 min read
How to Set Up GPG Keys for an Existing GitHub Account (Step-by-Step)

Comments
2 min read
Making AI Data Flows Visible: Building an Open-Source Tool to Understand SaaS & LLM Data Risk

1
Comments
3 min read
An Introduction to Git: Concepts, Commands, and Workflows

Comments
4 min read
Apache Iceberg & the Open Data Stack: Why the Lakehouse is Real in 2026

Comments
8 min read