DEV Community

# dataengineering

Posts

How to fuzzy-match 1M rows with dbt in under 10 minutes (2026 guide)

4 min read
Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained

3 min read
How to Compare Multiple CSV Files Quickly with Dataplotter

3 min read
How Linux is Used in Real-World Data Engineering

4 min read
10 Years of Blood Reports into One Graph: Building a Personal Medical Knowledge Base with Unstructured.io, Neo4j, and LlamaIndex

3 min read
Resource Monitoring for Data Pipelines

3 min read
SQLite, Go/Postgres, & Petabytes: Database Patterns for Builders

4 min read
How to Set Up a GitHub Profile README Using Markdown (Beginner Guide)

2 min read
Flowfile v0.8.0 — Your Flows Can Run Themselves Now

4 min read
Apache Data Lakehouse Weekly: March 20–27, 2026

7 min read
Frosty: 150+ AI Open Source Sub-Agents to Automate Snowflake

2 min read
When Synthetic Data Lies: A Hidden Correlation Problem I Didn’t Expect

3 min read
Building & Monitoring Data Backends: Tools, Architecture, and Observability

4 min read
Issues of Multi-GB Spreadsheets in Data Lakes

4 min read
Asset-Based Data Orchestration: Lessons from Building a Multi-State Social Data Platform

6 min read