Carlos A. Martinez

Posted on Jun 13, 2023

Azure Data Factory LAB Copy CSV

Azure Data Factory is a cloud-based ETL and data integration service that lets you create data-driven workflows for orchestrating data movement and transforming data at scale. With Azure Data Factory, you can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores. You can also build complex ETL processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database.

Lab ADF Copy Data from CSV to CSV

1 STEP

Create a resource group: LAB01_ADF_COPYING_DATA_FROM_CSV_TO_CSV

2 STEP

Create a storage account (blob storage)

Redundancy: Locally-redundant storage (LRS)

3 STEP

Create a storage account (data lake)

Enable hierarchical namespace
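
Storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits, so it can help to check a name before typing it into the portal. The sketch below is a small local check of that rule, tried on the two account names used later in this lab. It does not check global uniqueness, which only Azure can confirm, and the helper name is made up for illustration.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_STORAGE_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the storage account naming rule."""
    return _STORAGE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # The first two names come from this lab; the third is a deliberately bad example.
    for candidate in ("moviesblobst", "moviesdatalakee", "Movies_Blob"):
        print(candidate, is_valid_storage_account_name(candidate))
```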

4 STEP

Inside the blob storage account (moviesblobst), add a container named 'bankmovies'.

Then upload the CSV file.

5 STEP

Configure the data lake

Linked services

Linked services are much like connection strings, which define the connection information needed for the service to connect to external resources.

It is time to create a Data Factory. Here we go.

6 STEP

Name: LAB01ADF01

Launch studio

7 STEP

Create new Linked service to Blob Storage

Name: ls_blob_moviesblobst

Storage account name: moviesblobst
Then test the connection.
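
Behind the form, a linked service is stored as a JSON definition. As a rough sketch, the Python below builds a definition of type AzureBlobStorage for this step. It assumes account-key authentication through a connection string, which may not be the method you picked in the studio, and "<account-key>" is a placeholder, not a real value.

```python
import json


def blob_linked_service(name: str, account_name: str, account_key: str) -> dict:
    """Build a linked service definition of type AzureBlobStorage."""
    # Assumption: account-key authentication via a connection string.
    connection_string = (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"
    )
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobStorage",
            "typeProperties": {"connectionString": connection_string},
        },
    }


if __name__ == "__main__":
    # "<account-key>" is a placeholder; never commit a real key.
    definition = blob_linked_service(
        "ls_blob_moviesblobst", "moviesblobst", "<account-key>"
    )
    print(json.dumps(definition, indent=2))
```

This mirrors the "connection string" idea described above: the definition holds only what the service needs to reach the storage account.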

8 STEP

Create new Linked service to Data Lake

Name: ls_dl_moviesdatalakee
Storage account name: moviesdatalakee
Then test the connection.
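
The Data Lake linked service differs from the Blob Storage one mainly in its type (AzureBlobFS) and in pointing at the account's dfs endpoint. A minimal sketch of that definition follows. Credentials are left out on purpose, since the post does not say which authentication method was chosen, so this fragment is illustrative rather than complete.

```python
import json


def adls_gen2_linked_service(name: str, account_name: str) -> dict:
    """Build a linked service definition for Azure Data Lake Storage Gen2."""
    return {
        "name": name,
        "properties": {
            # Data Lake Storage Gen2 linked services use the AzureBlobFS type.
            "type": "AzureBlobFS",
            "typeProperties": {
                # Gen2 accounts are addressed through the dfs endpoint.
                "url": f"https://{account_name}.dfs.core.windows.net",
                # Credentials intentionally omitted (authentication method not
                # stated in the lab).
            },
        },
    }


if __name__ == "__main__":
    definition = adls_gen2_linked_service("ls_dl_moviesdatalakee", "moviesdatalakee")
    print(json.dumps(definition, indent=2))
```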

9 STEP

Create two datasets: a source and a sink

Dataset Blob Storage

Dataset > New dataset > Azure Blob Storage > DelimitedText (CSV)

Name: ds_movies_bank_row_bs
Linked service: ls_blob_moviesblobst

File path: once it is set, you can click Preview data to check the file.

Dataset Data Lake

Dataset > New dataset > Azure Data Lake Storage Gen2 > DelimitedText (CSV)

Name: ds_movies_bank_raw_dl
Linked service: ls_dl_moviesdatalakee

Validate all and publish all
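
The two datasets from this step can be sketched as JSON definitions built in Python. Both are DelimitedText datasets that reference a linked service by name; only the location block differs. The file name "movies.csv", the data lake file system "raw", and the firstRowAsHeader setting are assumptions for illustration, since the post does not state them.

```python
import json


def delimited_text_dataset(name: str, linked_service: str, location: dict) -> dict:
    """Build a DelimitedText (CSV) dataset definition."""
    return {
        "name": name,
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": location,
                "columnDelimiter": ",",
                "firstRowAsHeader": True,  # assumption: the CSV has a header row
            },
        },
    }


# Source dataset: the CSV uploaded to the 'bankmovies' container.
source_ds = delimited_text_dataset(
    "ds_movies_bank_row_bs",
    "ls_blob_moviesblobst",
    {
        "type": "AzureBlobStorageLocation",
        "container": "bankmovies",
        "fileName": "movies.csv",  # hypothetical file name
    },
)

# Sink dataset: a location in the data lake.
sink_ds = delimited_text_dataset(
    "ds_movies_bank_raw_dl",
    "ls_dl_moviesdatalakee",
    {
        "type": "AzureBlobFSLocation",
        "fileSystem": "raw",  # hypothetical file system (container) name
    },
)

if __name__ == "__main__":
    print(json.dumps([source_ds, sink_ds], indent=2))
```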

10 STEP

Create a new pipeline named pl_ingestion_movies_data

Activities: Move & transform > Copy data

Copy data movies

Source dataset: ds_movies_bank_row_bs
Sink dataset: ds_movies_bank_raw_dl
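
For reference, a pipeline with a single Copy activity can be sketched as the JSON definition below, built in Python. It wires the source and sink datasets by name. The store settings types are an assumption based on the connectors used in this lab (Blob Storage on the read side, Data Lake Storage Gen2 on the write side); the definition the studio generates may carry more properties.

```python
import json


def copy_pipeline(name: str, activity_name: str, source_dataset: str, sink_dataset: str) -> dict:
    """Build a pipeline definition containing one Copy activity."""
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": activity_name,
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    "typeProperties": {
                        # Read CSV from Blob Storage.
                        "source": {
                            "type": "DelimitedTextSource",
                            "storeSettings": {"type": "AzureBlobStorageReadSettings"},
                        },
                        # Write CSV to Data Lake Storage Gen2.
                        "sink": {
                            "type": "DelimitedTextSink",
                            "storeSettings": {"type": "AzureBlobFSWriteSettings"},
                        },
                    },
                }
            ]
        },
    }


pipeline = copy_pipeline(
    "pl_ingestion_movies_data",
    "Copy data movies",
    "ds_movies_bank_row_bs",
    "ds_movies_bank_raw_dl",
)

if __name__ == "__main__":
    print(json.dumps(pipeline, indent=2))
```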

Validate the pipeline
Run Debug

Go to the data lake and check that the CSV file is there with its data.
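
If you download both files, one way to confirm the copy is to compare their parsed rows rather than their raw bytes, because line endings can differ between the two files. The sketch below does that with the standard csv module. The sample rows are made up for illustration and are not the lab's actual data.

```python
import csv
import io


def csv_rows(text: str) -> list:
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def same_csv_content(source_text: str, sink_text: str) -> bool:
    """Return True if both CSV texts contain the same rows in the same order."""
    return csv_rows(source_text) == csv_rows(sink_text)


if __name__ == "__main__":
    # Hypothetical sample data; only the line endings differ.
    source = "title,year\r\nAlien,1979\r\n"
    sink = "title,year\nAlien,1979\n"
    print(same_csv_content(source, sink))
```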

Thanks for taking the time to read this post.

