sc0v0ne

Posted on Jan 13 • Edited on Mar 3

Tutorial: Creating a Dataset of The Elder Scrolls: Skyrim Armor and Sending It to Kaggle Datasets

#python #jupyter #kaggle #pandas

The Elder Scrolls: Skyrim is one of the games in my top 10, so doing a project related to it is a nostalgic treat. A few days ago I was thinking about a new project and came up with an idea, but it needed data. I searched the internet and found some datasets, but none of them covered armor.

On The Elder Scrolls Wiki I found a page with tables of armor, so I needed to extract the information from those tables. In this simple tutorial, you will extract the tables, preprocess them, and finally add the result to Kaggle Datasets.

Extracting the Tabular Dataset

This step requires the Pandas library. Pandas has a large set of features for extracting, cleaning, and processing data, and much more. The wiki page contains several tables of armor data, so I use pandas.read_html to read them.

With pd.read_html I can:

Read an HTML page by passing its link to the function as a string.
Keep only the tables whose text matches a regular expression (the match parameter).
Extract the links inside the table cells (the extract_links parameter).

These are only some of the options; I recommend checking the documentation for the rest.

Finally, it returns every table it found on the page as a list of DataFrames.
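As a minimal sketch of the options listed above, the snippet below parses a small inline HTML page instead of the wiki, so it runs offline. The HTML, the item rows, and the /wiki/... paths are made up for illustration. `match` and `extract_links` are real `pd.read_html` parameters, but `extract_links` needs pandas 1.5 or newer, and an HTML parser such as lxml must be installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two tables; only the second one mentions "Boots".
html = """
<table>
  <tr><th>Name</th><th>Weight</th></tr>
  <tr><td><a href="/wiki/Hide_Helmet">Hide Helmet</a></td><td>2</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Weight</th></tr>
  <tr><td><a href="/wiki/Hide_Boots">Hide Boots</a></td><td>1</td></tr>
</table>
"""

# Without options, every <table> on the page comes back as a DataFrame.
all_tables = pd.read_html(StringIO(html))

# `match` keeps only tables containing text that matches the string or regex.
boots_tables = pd.read_html(StringIO(html), match="Boots")

# `extract_links="body"` turns each body cell into a (text, href) tuple.
linked = pd.read_html(StringIO(html), extract_links="body")[0]
name_text, name_href = linked.loc[0, "Name"]

print(len(all_tables), len(boots_tables))
print(name_text, name_href)
```

Filtering with `match` can be more robust than a fixed list position if the page gains or loses a table, but whether it suits the wiki page depends on text that is unique to each table.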

# Import Library Pandas
import pandas as pd

# Read HTML
tables_on_page = pd.read_html("https://elderscrolls.fandom.com/wiki/Armor_(Skyrim)")

# Number of tables found
len(tables_on_page) # output: 16

Each table holds one type of armor, so I gave each one the name I found on the website. I won't show every type here, to keep the post from getting repetitive, but the project link at the end lets you check them all out and test the code yourself.

headgear_light_armor = tables_on_page[4]
headgear_light_armor
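One way to keep the named tables together is a dictionary keyed by armor type. This is only a sketch: the list below is a stand-in for the real result of pd.read_html, and the `table_names` mapping is hypothetical. Only index 4 comes from this post; any other position has to be checked against the live page.

```python
import pandas as pd

# Stand-in for the list returned by pd.read_html; the real page returned 16 tables.
tables_on_page = [pd.DataFrame({"Armor": [f"item {i}"]}) for i in range(6)]

# Hypothetical index-to-name mapping. Only index 4 comes from the post.
table_names = {
    4: "Headgear Light Armor",
    5: "Headgear Heavy Armor",  # assumed position, for illustration only
}

# Copy each selected table so later edits do not touch the original list.
armor_tables = {
    name: tables_on_page[idx].copy() for idx, name in table_names.items()
}

for name, table in armor_tables.items():
    print(name, table.shape)
```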

Preprocessing

I did this step individually for each table, adding a new column that holds the type of armor as a class label. At the end I merge the tables into one. Since the tables all share the same columns, pd.concat is enough. Then I renamed some columns: because of how the links in their headers work, the header text was not returned, so they came back as 'Unnamed: 1', 'Unnamed: 2', and 'Unnamed: 3'. Next I dropped the Item ID column, because I didn't think it was necessary for what I'm creating. Finally, I saved the result as a .csv file.

# New Column
headgear_light_armor['type_armor'] = 'Headgear Light Armor'
headgear_light_armor

# Unify Datasets
df = pd.concat([gauntlets_heavy_armor, gauntlets_light_armor, boots_heavy_armor, boots_light_armor, cuirasses_heavy_armor, cuirasses_light_armor, headgear_heavy_armor, headgear_light_armor], axis=0)

# Rename Columns
df.rename(columns={'Unnamed: 1':'Armor','Unnamed: 2':'Encumbrance','Unnamed: 3':'Gold'}, inplace=True)

# Drop ID
df.drop(['Item ID'], axis=1, inplace=True)

# Save File
df.to_csv('dataset_armor_skyrim_1.csv', index=False)
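The steps above can be tried end to end on toy data. The sketch below assumes two tables with only the columns named in the rename and drop calls; the rows are made up and are not real game values. It also writes the CSV to an in-memory buffer rather than to a file, and checks that the columns match before concatenating.

```python
from io import StringIO

import pandas as pd

# Made-up rows; the column names come from the post's rename and drop calls.
columns = ["Item ID", "Unnamed: 1", "Unnamed: 2", "Unnamed: 3"]
headgear_light_armor = pd.DataFrame([["id-1", "Sample Helmet", 2, 25]], columns=columns)
boots_light_armor = pd.DataFrame([["id-2", "Sample Boots", 1, 10]], columns=columns)

# Label each table with its armor type before merging.
headgear_light_armor["type_armor"] = "Headgear Light Armor"
boots_light_armor["type_armor"] = "Boots Light Armor"

# pd.concat aligns by column name, so confirm the columns really match first.
assert list(headgear_light_armor.columns) == list(boots_light_armor.columns)

df = pd.concat([headgear_light_armor, boots_light_armor], axis=0, ignore_index=True)
df = df.rename(
    columns={"Unnamed: 1": "Armor", "Unnamed: 2": "Encumbrance", "Unnamed: 3": "Gold"}
)
df = df.drop(columns=["Item ID"])

# Write to an in-memory buffer here; the post writes to a .csv file instead.
buffer = StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Reassigning `df` instead of using `inplace=True` is a style choice; both produce the same table.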

Kaggle Datasets

When I was developing this notebook I wasn't planning to add it to Kaggle; I only wanted the resulting dataset for my project. Then I thought, why not share it? I could write a post explaining the process and publish the dataset so other users can use it in their notebooks, and make a contribution that way. So here we are: let's post this dataset.

If you don't know much about Kaggle, there's a post of mine where I talk a little about this incredible tool.

5 Tools to Start Working with Python 🤯☢️😱

sc0v0ne ・ Jul 14 '23

#python #jupyter #terminal #kaggle

After you have created your Kaggle account, you will land on the dashboard. Look for Datasets.

Then click on New Dataset.

Now pick the dataset file that the notebook created. You can drag it into the window or open the file browser to add it.

Next, give your dataset a title; you can add more files if you need to. Choose between the public and private options, then create the dataset. At this stage Kaggle uploads your file and then takes you to the dataset page.

Finally, your dataset will be available on Kaggle for you to use in your projects and, if it is public, for other users as well.
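I used the website for the upload, but the official Kaggle command-line tool offers another route. The sketch below only prepares the dataset-metadata.json file that the tool expects next to the data files; it does not call Kaggle. The username, slug, and license are placeholders, so replace them with your own values and pick a license that fits the source data.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical values: replace the username, slug, and license with your own.
metadata = {
    "title": "The Elder Scrolls Skyrim - Armor",
    "id": "your-username/skyrim-armor",
    "licenses": [{"name": "CC0-1.0"}],  # placeholder license
}

# The CLI expects the data files and dataset-metadata.json in the same folder.
dataset_dir = Path(tempfile.mkdtemp())
metadata_path = dataset_dir / "dataset-metadata.json"
metadata_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

# After copying dataset_armor_skyrim_1.csv into dataset_dir, the upload would be:
#   kaggle datasets create -p <dataset_dir>
print(metadata_path.read_text(encoding="utf-8"))
```

The upload command itself needs the kaggle package installed and an API token configured, which this sketch assumes you set up separately.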

The Elder Scrolls Skyrim - Armor

Repository

You can find the project via the link below.

ProjectsSc0v0ne / Dataset Armor - The Elder Scroll - Skyrim · GitLab

I created this project to show how to obtain a table from a website, process the data and then send it to Kaggle.

gitlab.com

About the author:

sc0v0ne

Machine learning, deep learning, and raw code. Presented clearly and with examples.

A little more about me...

I hold a bachelor's degree in Information Systems, and in college I came into contact with different technologies. Along the way, I took an Artificial Intelligence course, where I first encountered machine learning and Python. From then on, learning about this area became my passion. Today I work with machine learning and deep learning, developing communication software. I also created a blog where I write posts about the subjects I am studying and share them to help other users.

I'm currently learning TensorFlow and Computer Vision

Curiosity: I love coffee

My Latest Posts

My Super Powers as a Software Developer - 2024

sc0v0ne ・ Jan 6

#tools #softwaredevelopment #workstations #productivity

Tutorial: Docker - Communicate Between Containers

sc0v0ne ・ Dec 8 '23

#docker #containers #container

DEV Community
