The average day of a programmer is often filled with boring tasks such as bulk operations. Today I share my tips/tricks about how I automate this kind of tasks easily with help of REPLs.
Usecase 1: I have long list of record ids and want only show/update them in a database table.
with open('id_list.txt') as f:
id_list = f.read().split('\n')
print(f"select * from my_table where id in {tuple(id_list)}")
tuple
data type represent in Python as comma seperated values between parantheses that perfectly fit my need of where in
query here.
Usecase 2: I have to migrate one table to another in PostgreSQL.
In this case I have to migrate the old database tables to new ones. Column names may different, some columns may removed or remapped and new table may have new columns that contains aggregated data from other columns/tables.
I created views/materialized views using old tables that map new ones. And write little sql script that return insert queries from output of materialized view as:
select format('insert into table_v2 (field_a, field_b, field_c, ...) values (%L, %L, %L,...);', field_a, field_b, field_c,..)
from table_v2_mw
%L
format string handles quotation and null values for us here.
This method is super useful for applying one shot complex changes and report them. You can attach exported list of insert/update queries to Jira.
Usecase 3: I hate spreadsheets.
Excel/CSV files are industry standard if you work with data. Sometimes I need to manipulate them and spreadsheet softwares usually sucks, even in simple tasks. Instead of fighting with bloated nested IFs and non-sense functions, I prefer convert them in CSV format and manipulate them in Python. Also, Python can handle really big CSV files with help of Pandas.
with open("some_file_revised.csv", "w+") as csv_out:
with open("some_file.csv") as csv_in:
csv_reader = csv.DictReader(csv_in)
csv_writer = csv.DictWriter(csv_out, fieldnames=csv_reader.fieldnames)
csv_writer.writeheader()
for row in csv_reader:
# Manipulate data here
row["foo"] = row["foo"][4:]
csv_writer.write(row)
Top comments (0)