KMNIST in PyTorch

#python #pytorch #kmnist #dataset

*Memos:

My post explains KMNIST.
My post explains MNIST().
My post explains EMNIST().
My post explains QMNIST().
My post explains MovingMNIST().
My post explains FashionMNIST().

KMNIST() can load the KMNIST dataset as shown below:

*Memos:

The 1st argument is root(Required-Type:str or pathlib.Path). *It can be an absolute or relative path.
The 2nd argument is train(Optional-Default:True-Type:bool). *If it's True, the training data(60,000 images) is used, while if it's False, the test data(10,000 images) is used.
The 3rd argument is transform(Optional-Default:None-Type:callable). *It's applied to each image when a sample is accessed.
The 4th argument is target_transform(Optional-Default:None-Type:callable). *It's applied to each label when a sample is accessed.
The 5th argument is download(Optional-Default:False-Type:bool): *Memos:
- If it's True and the dataset isn't downloaded yet, it's downloaded from the internet and extracted(unzipped) under root.
- If it's True and the dataset is already downloaded but not extracted, it's only extracted.
- If it's True and the dataset is already downloaded and extracted, nothing happens.
- It should be False if the dataset is already downloaded and extracted because that's faster.
- You can also manually download the dataset(train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz) from here and extract it to data/KMNIST/raw/.
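The download memos above suggest passing download=False once the files are already in place. Below is a minimal sketch of how that check could be done by hand. *The helper name kmnist_is_extracted() and the RAW_FILES tuple are hypothetical (not part of torchvision), and the extracted file names are assumed to be the archive names listed above without the .gz suffix:

```python
from pathlib import Path

# Assumption: the extracted files are named like the archives listed above, minus ".gz".
RAW_FILES = (
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
)

def kmnist_is_extracted(root="data"):
    """Hypothetical helper: True if all four extracted files exist in <root>/KMNIST/raw/."""
    raw_dir = Path(root) / "KMNIST" / "raw"
    return all((raw_dir / name).is_file() for name in RAW_FILES)

# Request a download only when the extracted files are missing.
download = not kmnist_is_extracted("data")
```

The resulting download value could then be passed as KMNIST(root="data", download=download).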

from torchvision.datasets import KMNIST

train_data = KMNIST(
    root="data"
)

train_data = KMNIST(
    root="data",
    train=True,
    transform=None,
    target_transform=None,
    download=False
)

test_data = KMNIST(
    root="data",
    train=False
)

len(train_data), len(test_data)
# (60000, 10000)

train_data
# Dataset KMNIST
#     Number of datapoints: 60000
#     Root location: data
#     Split: Train

train_data.root
# 'data'

train_data.train
# True

print(train_data.transform)
# None

print(train_data.target_transform)
# None

train_data.download  # This is the download() method, not the value of the download argument.
# <bound method MNIST.download of Dataset KMNIST
#     Number of datapoints: 60000
#     Root location: data
#     Split: Train>

len(train_data.classes), train_data.classes
# (10,
#  ['o', 'ki', 'su', 'tsu', 'na', 'ha', 'ma', 'ya', 're', 'wo'])

train_data[0]
# (<PIL.Image.Image image mode=L size=28x28>, 8)

train_data[1]
# (<PIL.Image.Image image mode=L size=28x28>, 7)

train_data[2]
# (<PIL.Image.Image image mode=L size=28x28>, 0)

train_data[3]
# (<PIL.Image.Image image mode=L size=28x28>, 1)

train_data[4]
# (<PIL.Image.Image image mode=L size=28x28>, 4)

import matplotlib.pyplot as plt

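def show_images(data, main_title=None):
    """Plot the first 10 (image, label) samples of a dataset in a 2x5 grid."""
    plt.figure(figsize=(10, 5))
    plt.suptitle(t=main_title, y=1.0, fontsize=14)
    # zip() stops after 10 samples because range(1, 11) is exhausted first.
    for i, (im, lab) in zip(range(1, 11), data):
        plt.subplot(2, 5, i)
        plt.imshow(X=im)
        plt.title(label=lab)  # The integer class label of the sample.
    plt.tight_layout()
    plt.show()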

show_images(data=train_data, main_title="train_data")
show_images(data=test_data, main_title="test_data")
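
The integer labels above index into train_data.classes, so a target_transform can turn them into something more readable. Below is a minimal sketch of two such callables. *The names CLASSES, label_to_name(), one_hot() and load_named_kmnist() are hypothetical, and the class list is copied from the train_data.classes output above rather than read from the dataset:

```python
# Class names copied from the train_data.classes output shown in this post.
CLASSES = ['o', 'ki', 'su', 'tsu', 'na', 'ha', 'ma', 'ya', 're', 'wo']

def label_to_name(label):
    """Map an integer label(0-9) to its class name."""
    return CLASSES[label]

def one_hot(label, num_classes=len(CLASSES)):
    """Map an integer label to a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def load_named_kmnist(root="data", train=True):
    """Hypothetical loader: returns (image, class name) samples instead of (image, int)."""
    # Imported lazily so the helpers above work without torchvision installed.
    from torchvision.datasets import KMNIST
    return KMNIST(root=root, train=train, target_transform=label_to_name)

# The labels of train_data[0] to train_data[4] shown above.
print([label_to_name(i) for i in [8, 7, 0, 1, 4]])
# ['re', 'ya', 'o', 'ki', 'na']
```

Any callable that takes the integer label works here, so one_hot could be passed as target_transform in the same way.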
