DEV Community

# dataengineering

Posts

💥 Polars vs. Pandas: Why Your Next ETL Pipeline Should Run on Rust (Part 1/5)

Comments
2 min read
Building a Production-Ready Data Lake: PostgreSQL to S3 with AWS DMS, Glue, and Athena using CDK

Comments
8 min read
Synthetic Data for RAG: Safe Generation, Deduplication, and Drift-Aware Curation in 2025

Comments
10 min read
Building Distributed Systems with Ray—Just Like Running a Restaurant

Comments
7 min read
The State of Apache Iceberg v4 - October 2025 Edition

Comments
6 min read
Another Data Nerd Guide to re:Invent 2025

Comments
3 min read
TikTok Data Engineer Full 3-Round Interview

Comments
4 min read
How to Convert Excel to CSV in Python using Spire.XLS for Python

Comments
4 min read
The New Era for Data Engineers Brought by Snowflake's Autonomous Services 2

Comments
1 min read
Making JSON Compression Searchable — SEE (Schema-Aware Encoding)

Comments
2 min read
Big Data Processing (Hadoop, Spark)

2
Comments
5 min read
Building a clean Energy Data Pipeline for Africa (from raw CSVs to MongoDB)

Comments
1 min read
From APIs to Aquifers: A Developer's Guide to Smart Water Management Data

Comments
7 min read
Data in the Cloud: Understanding 6 Common Data Formats in Analytics

Comments
3 min read
A State-of-the-Art Architectural Guide to Building a Data Platform

Comments
6 min read
Python For Data Engineering

Comments
3 min read
Picking the Right Data Format for Your Workflow

Comments
3 min read
🔍 Understanding 6 Common Data Formats in Data Analytics (With Examples)

Comments
4 min read
Data in the Cloud: 6 Common Data Formats

Comments
3 min read
Collecting Africa’s Energy Insights:

5
Comments
4 min read
AI-Driven Data Engineering: Building Real-Time Intelligence Pipelines

Comments
4 min read
A Dive into Apache Iceberg™'s Metadata

Comments
4 min read
6 Common Data Formats in Data Analytics

Comments
2 min read
Optimizing Redshift in Practice: A Case Study with DISTKEY and SORTKEY

Comments
4 min read
Containerization for Data Engineering: A Practical Guide with Docker and Docker Compose

4
Comments
5 min read