Introduction
In 2023, a friend of mine asked me to write a program to collect the data from an NFT collection on the NFTrade website. He was interested in the following information: all the NFTs currently for sale in the collection, the current price of BNB (the cryptocurrency the NFTs are listed for sale in), and the price of each NFT converted to US dollars based on the current price of BNB. He also wanted all that data put into a spreadsheet that's easy to manipulate, sort, and filter.
I couldn't use my go-to JavaScript skills to fetch data via HTTP calls from the NFTrade site because there is no public-facing API, so instead, I taught myself to build a small web scraping script to visit the site and "scrape" that data from it.
Python seems to be a very popular language for such projects, so I went with it. As I worked on the script, the requirements got a bit more complex, and I learned a lot of useful things about Python as a result, which I've shared in a series of blog posts. This post will be the last in my series covering this web scraper.
In previous articles I:
- Scraped the NFT data off of NFTrade using the Selenium Python package,
- Filtered each NFT's raw data into a separate object in a list of NFTs,
- Narrowed the list down to just the NFTs for sale, fetched the current price of BNB, and added the list price in US dollars to each NFT.
And now, with all the data in one place, I needed to make it into a spreadsheet my friend could work with.
The Python csv.DictWriter class makes turning a list of Python dictionaries into a standard CSV file a straightforward task, which I'll demonstrate in this article.
NOTE: I am not normally a Python developer so my code examples may not be the most efficient or elegant Python code ever written, but they get the job done.
Sample Python data
I need to set the stage and show you the shape of my Python data before I show you the solution I ended up using. As I mentioned in the introduction, the data I wanted for each NFT included the following things:
- NFT ID
- NFT list price (in BNB)
- NFT rarity score (a number randomly assigned to each NFT in the collection)
- Current price of BNB in US dollars
- Cost of NFT in US dollars
- Cost of each rarity point assigned to the NFT in US dollars
What I ended up with is a list of objects (or "dictionaries" in Python parlance), because each object in the list is made up of key-value pairs. Here's a sample of what the list of data looks like.
Sample NFT data to be added to CSV file
[
{'bnb': 352.44, 'cost_per_rs': 28.2, 'id': 4, 'nft_price': '0.8', 'price_usd': 281.95, 'rs': 10},
{'bnb': 352.44, 'cost_per_rs': 77.54, 'id': 42, 'nft_price': '1.1', 'price_usd': 387.68, 'rs': 5},
{'bnb': 352.44, 'cost_per_rs': 98.68, 'id': 174, 'nft_price': '1.4', 'price_usd': 493.42, 'rs': 5},
{'bnb': 352.44, 'cost_per_rs': 29.68, 'id': 184, 'nft_price': '1.6', 'price_usd': 563.9, 'rs': 19},
{'bnb': 352.44, 'cost_per_rs': 46.99, 'id': 256, 'nft_price': '2', 'price_usd': 704.88, 'rs': 15},
# more NFT data
]
NOTE: If you're interested in seeing how I got this list of data, check out my previous blog post for a more in-depth explanation.
Ok, now that I've established what the data looks like we can move on to how I turned this list into a CSV, complete with a header row of each key in the dictionary object.
The csv.DictWriter module
Unlike core JavaScript, which tends to be pretty bare-bones and makes users install packages for most everything they want to do, the Python standard library is quite robust.
It comes with a csv module that implements classes to read and write tabular data in CSV format out of the box, as the CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases.
The csv module had two objects of particular interest to me:
- writer - a writer object responsible for converting the user's data into delimited strings on a given file-like object.
- DictWriter - an object like a writer, but it maps dictionaries to output rows. This object requires a fieldnames parameter, a sequence of keys that identifies the order in which values in the dictionary are passed to the writerow() method.
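To make the difference between the two concrete, here's a minimal sketch that is not part of my scraper. It writes the same two rows with csv.writer and with csv.DictWriter. The trimmed-down sample rows and the in-memory io.StringIO buffers are assumptions for illustration only.

```python
import csv
import io

# Hypothetical two-NFT sample, trimmed to three of the article's keys.
rows_as_dicts = [
    {"id": 4, "nft_price": "0.8", "rs": 10},
    {"id": 42, "nft_price": "1.1", "rs": 5},
]
fieldnames = ["id", "nft_price", "rs"]

# csv.writer expects each row as a sequence, so the caller has to
# pull the values out of every dictionary in the right order.
writer_buffer = io.StringIO(newline="")
plain_writer = csv.writer(writer_buffer)
plain_writer.writerow(fieldnames)
for row in rows_as_dicts:
    plain_writer.writerow([row[key] for key in fieldnames])

# csv.DictWriter accepts the dictionaries directly and uses
# fieldnames to decide the column order.
dict_buffer = io.StringIO(newline="")
dict_writer = csv.DictWriter(dict_buffer, fieldnames=fieldnames)
dict_writer.writeheader()
dict_writer.writerows(rows_as_dicts)

# Both approaches produce the same CSV text.
same_output = writer_buffer.getvalue() == dict_buffer.getvalue()
print(same_output)
```

The output is identical either way. DictWriter just saves you from unpacking each dictionary by hand.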
Since my Python data is formatted as a list of dictionaries, as shown in the previous section, it made the most sense for me to use the csv.DictWriter class.
For me, the fieldnames parameter would be composed of the keys of each dictionary: id, bnb, nft_price, etc., and that would guarantee that each dictionary of NFT info passed through the DictWriter object would match up the proper value with its corresponding key. And DictWriter also has two handy public methods I wanted:
- writeheader() - a method to write a row with the field names to the writer's file object (i.e. the top row in a CSV that typically has the column names listed for each column of data).
- writerows() - the method that writes all the elements in rows to the writer's file object (i.e. the list of NFT objects I wanted added to the CSV file).
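Here's a small sketch of how fieldnames controls the column order, and of what DictWriter does when a dictionary doesn't line up with it. The "owner" key, the missing "rs" value, and the "n/a" placeholder are made up for this example; my real data always had every key. The restval and extrasaction parameters are standard DictWriter options that my scraper didn't need.

```python
import csv
import io

# Hypothetical rows: the second is missing "rs", and the first has an
# extra "owner" key that is not part of the article's data.
rows = [
    {"rs": 10, "id": 4, "nft_price": "0.8", "owner": "someone"},
    {"id": 42, "nft_price": "1.1"},
]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(
    buffer,
    fieldnames=["id", "nft_price", "rs"],  # column order, regardless of key order in each dict
    restval="n/a",          # written when a row lacks one of the fieldnames
    extrasaction="ignore",  # skip keys not in fieldnames (the default, "raise", throws ValueError)
)
writer.writeheader()   # first row: the field names
writer.writerows(rows)  # one CSV row per dictionary

lines = buffer.getvalue().splitlines()
print(lines)
```

With the default settings, the extra "owner" key would raise a ValueError, so it's worth making sure every dictionary only contains keys listed in fieldnames.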
DictWriter was the right choice for my situation, so let's get to how I put it into practice to make my file next.
Format the Python data into a CSV and save it locally
To transform the list of Python dictionaries into a CSV using the csv.DictWriter class, I created a new function called download_csv(), and here is what the code looked like in my for_sale_scraper.py file.
# imports this method relies on (placed at the top of for_sale_scraper.py)
import csv
from datetime import datetime
from pprint import pprint

def download_csv(self, card_list):
    """Turn card list into CSV file."""
    date_time = datetime.now().strftime("%Y_%m_%d-%I_%M_%p_")
    # add date time to the front of the file name
    with open(date_time + 'NFTs_For_Sale.csv', 'w', encoding='utf8', newline="") as output_file:
        dict_writer = csv.DictWriter(output_file, fieldnames=['id', 'bnb', 'nft_price', 'price_usd', 'rs', 'cost_per_rs'])
        dict_writer.writeheader()
        dict_writer.writerows(card_list)
    pprint("CSV NFTs For Sale file generated!")
It will probably be easiest to go through this function line by line, so let's start from the top.
The function takes in a card_list argument, which is what I want written to each row in the CSV, and the first thing it does is create a variable named date_time.
date_time = datetime.now().strftime("%Y_%m_%d-%I_%M_%p_")
date_time is a formatted string representing the current date and time in the format of "Year_Month_Day-Hour_Minute_AM/PM_". When called, it will generate a string like "2024_04_11-03_30_PM_", which will be tacked on to the beginning of the CSV's file name so it's easy to sort and identify the files after they're generated.
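To see what that format string produces without waiting for a particular time of day, here's a short sketch using a fixed, made-up timestamp. The second format is an alternative I did not use in the scraper: it swaps the 12-hour clock (%I plus %p) for a 24-hour clock (%H), which keeps files from the same day in chronological order when sorted by name. Note that the text %p produces depends on the machine's locale.

```python
from datetime import datetime

# A fixed, made-up timestamp so the result is reproducible.
sample_moment = datetime(2024, 4, 11, 15, 30)

# The article's format: 12-hour clock (%I) plus an AM/PM marker (%p).
twelve_hour_prefix = sample_moment.strftime("%Y_%m_%d-%I_%M_%p_")

# An alternative (not what the scraper uses): 24-hour clock (%H).
twenty_four_hour_prefix = sample_moment.strftime("%Y_%m_%d-%H_%M_")

# Either prefix is simply concatenated with the base file name.
print(twelve_hour_prefix + "NFTs_For_Sale.csv")
print(twenty_four_hour_prefix + "NFTs_For_Sale.csv")
```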
On to the next line of the function:
with open(date_time + 'NFTs_For_Sale.csv', 'w', encoding='utf8', newline="") as output_file:
with open opens a file with the specified filename, which is the newly created date_time variable concatenated with the "NFTs_For_Sale.csv" string. The 'w' mode indicates the file will be opened for writing (creating it, or overwriting it if it already exists), the encoding='utf8' parameter specifies the character encoding to be used, and newline="" turns off Python's own newline translation so the csv module controls the line endings itself, which avoids extra blank rows showing up on platforms like Windows. The open file object is named output_file for reference later in the function.
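If you're curious what newline="" actually changes, this sketch writes a tiny CSV to a temporary directory and reads the raw bytes back. The file name and the single row are hypothetical. It relies on the csv module's default "excel" dialect, which ends each row with a carriage return and line feed.

```python
import csv
import os
import tempfile

# Write a tiny CSV into a temporary directory (hypothetical file name and data).
with tempfile.TemporaryDirectory() as temp_dir:
    csv_path = os.path.join(temp_dir, "newline_demo.csv")

    # Same open() arguments as download_csv() uses.
    with open(csv_path, "w", encoding="utf8", newline="") as output_file:
        writer = csv.DictWriter(output_file, fieldnames=["id", "nft_price"])
        writer.writeheader()
        writer.writerow({"id": 4, "nft_price": "0.8"})

    # Read the raw bytes back to see the exact line endings on disk.
    with open(csv_path, "rb") as raw_file:
        raw_bytes = raw_file.read()

print(raw_bytes)
```

Each row ends with exactly one \r\n, written by the csv module. Without newline="", Windows would translate the \n into a second \r\n sequence, and spreadsheet apps tend to show that as a blank row between every line.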
Then we come to the DictWriter.
dict_writer = csv.DictWriter(output_file, fieldnames=['id', 'bnb', 'nft_price', 'price_usd', 'rs', 'cost_per_rs'])
dict_writer.writeheader()
dict_writer.writerows(card_list)
It is DictWriter's time to shine, as a new instance of it is created and the output_file is passed to it, along with the fieldnames list, which includes all the dictionary keys as the headers for the CSV file. In the following two lines of code, DictWriter writes the header row to the CSV file with the names specified in fieldnames, and writes the list of dictionaries (the card_list passed to this function) to the CSV file. Each dictionary in the list represents a row in the CSV file.
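A quick way to check that each dictionary really becomes one row is to read the CSV back with csv.DictReader, the reading counterpart to DictWriter. This is a sketch rather than part of my scraper: it uses two rows from the sample data above and an in-memory io.StringIO buffer in place of a file on disk.

```python
import csv
import io

# Two rows copied from the sample data earlier in the article.
card_list = [
    {"bnb": 352.44, "cost_per_rs": 28.2, "id": 4, "nft_price": "0.8", "price_usd": 281.95, "rs": 10},
    {"bnb": 352.44, "cost_per_rs": 77.54, "id": 42, "nft_price": "1.1", "price_usd": 387.68, "rs": 5},
]
fieldnames = ["id", "bnb", "nft_price", "price_usd", "rs", "cost_per_rs"]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO(newline="")
dict_writer = csv.DictWriter(buffer, fieldnames=fieldnames)
dict_writer.writeheader()
dict_writer.writerows(card_list)

# Rewind and read the CSV back, one dictionary per data row.
buffer.seek(0)
rows_read = list(csv.DictReader(buffer))

print(len(rows_read))
print(rows_read[0])
```

One thing to keep in mind: CSV has no concept of data types, so every value comes back as a string when the file is read, even though the numbers went in as ints and floats.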
Job complete!
pprint("CSV NFTs For Sale file generated!")
Finally, a success message is printed out at the end of the function indicating the CSV file has been successfully generated, and the file should show up in the repo alongside all the other project files.
Note all the generated CSV files, neatly ordered in descending order by their filename dates.
And the download_csv() function is called from the main Python function, as in the code snippet below.
I've included the other functions called before download_csv so you can see how I divvied up the work this file was doing, but I have not included the details of those functions because they're outside the scope of this post.
for_sale_scraper.py
if __name__ == '__main__':
    scraper = ForSaleNFTScraper()
    cards = scraper.get_cards(max_card_count=200)
    card_data = []
    for card in cards:
        info = scraper.get_nft_data(card)
        card_data.append(info)
    # filter out any extra cards that aren't for sale
    cards_for_sale = scraper.filter_priced_cards(card_data)
    # add rarity scores to all cards in the list by merging them with the id_rs_list
    cards_with_rs = scraper.get_cards_rarity_score(cards_for_sale)
    # add the current bnb price, current usd price of cards and current usd price of each rs point
    cards_with_bnb_rs = scraper.add_pricing_to_cards(cards_with_rs)
    # generate csv file
    scraper.download_csv(cards_with_bnb_rs)
Now, my friend can generate CSVs of the NFT collection whenever he wants to, or he could even set up a cron job on his machine to run this script on a regular basis and then review the generated files at his convenience.
Conclusion
Building a web scraper to collect data from the NFTrade site was a good exercise for me to learn more about Python, and it was a great opportunity for me to share my learnings in a series of blog posts for anyone else looking to do something similar.
I scraped the data with Selenium Python, narrowed it down to the pieces I wanted, added some extra info like the cost of each NFT in US dollars, and then bundled all the data up into a downloadable CSV for easy scanning and sorting.
The extra nice thing about making it into a CSV is that the CSV reading and writing functions are part of the core Python library so I didn't even have to install any third party packages to make it work - it was actually quite straightforward in the end.
Check back in a few weeks — I’ll be writing more about JavaScript, React, IoT, or something else related to web development.
If you’d like to make sure you never miss an article I write, sign up for my newsletter here: https://paigeniedringhaus.substack.com
Thanks for reading. I hope learning to use Python's csv.DictWriter class to make your own CSVs from lists of data comes in handy in your own apps and projects.
References & Further Resources
- NFTrade website
- Python DictWriter documentation
- First blog post about scraping data from a lazy-loading website using Selenium Python
- Second blog post about limiting data searches to a particular element on a page instead of the whole page when using XPath
- Third blog post about filtering, merging, and updating lists of objects in Python