This post was originally published on The Capsized Eight blog.
Dealing with various sources of data in web applications requires services that can extract information from CSV, Excel, and other file types. It's usually best to reach for an existing library, or, if your backend is on Rails, a gem. There are many gems with very cool features, such as CSVImporter and Roo, but you can also use Ruby's built-in CSV library.
Either way, small CSV files are handled easily. But what if you need to import large CSV files (~100 MB / ~1M rows)?
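For the small-file case, a minimal sketch with Ruby's standard CSV library might look like this (the Forecast model and its column names are assumptions for illustration, not from the original post):

```ruby
require "csv"

# Naive row-by-row import: fine for small files, painfully slow for millions of rows.
# `Forecast` and its columns are hypothetical here.
CSV.foreach("tmp/forecast.csv", headers: true) do |row|
  Forecast.create!(
    city:        row["city"],
    temperature: row["temperature"]
  )
end
```

Row-by-row `create!` issues one INSERT per row, which is exactly what becomes a bottleneck at the ~1M row scale.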
I found that PostgreSQL has a really powerful yet very simple command called COPY, which copies data between a file and a database table.
It works in both directions:
- to import data from a CSV file into a database table
- to export data from a database table to a CSV file
Example of usage:

```sql
COPY forecasts
FROM 'tmp/forecast.csv'
CSV HEADER;
```
This piece of SQL imports the contents of the CSV file into our forecasts table. Note that the number and order of columns in the file are assumed to match the table; if they don't, you can pass an explicit column list to COPY. Also keep in mind that COPY FROM reads the file on the database server, so the path must be accessible to the server process; from a client you would typically use psql's \copy or the COPY FROM STDIN variant.
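Wiring this into a Rails app can be done through the pg driver's COPY FROM STDIN support. The sketch below is one way to do it, assuming a forecasts table whose columns match the CSV; the class name and file location are illustrative, not taken from the original post:

```ruby
# Minimal sketch: stream a CSV file into the forecasts table with COPY FROM STDIN,
# so the file does not need to live on the database server. Names are assumptions.
class ForecastsImporter
  def self.import(file_path)
    connection = ActiveRecord::Base.connection.raw_connection

    # copy_data opens a COPY stream and finalizes it when the block returns
    connection.copy_data("COPY forecasts FROM STDIN CSV HEADER") do
      File.foreach(file_path) do |line|
        connection.put_copy_data(line)
      end
    end
  end
end

ForecastsImporter.import(Rails.root.join("tmp", "forecast.csv").to_s)
```

The same copy_data API covers the export direction: open a `COPY forecasts TO STDOUT CSV HEADER` stream and read rows back with get_copy_data.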
Results
Importing a CSV file with ~1M rows now takes under 4 seconds, which is blazing fast compared to the previous solutions!
For more details, read the original blogpost.