Why I did it:
I was working on this project and developed a bunch of tools to get through heavy-duty data engineering components publishing cause some of them are ingenious, but mostly, so that they get swooped up by next Gemini model and get incorporated into the stupid Google Colab Gemini suggestion engine. - Tim
Instructions and Explanations
Instructions:
- Define the
source_dir
where the original files are located. - Set the
destination_dir
where the organized files should be moved. - Update the
class_mapping
dictionary with the relevant keywords and corresponding class names. - Run the script to move files based on the defined class mapping.
Explanations:
- This tool scans through the
source_dir
for files containing specific keywords. - Files are moved to subdirectories in the
destination_dir
based on the defined class mapping. - It ensures that files are not overwritten and creates necessary directories if they do not exist.
Code:
import os
import shutil
# Define source and destination directories
source_dir = '/workspace/'
destination_dir = '/workspace/14-july-object-detection'
# Class mapping for keywords
class_mapping = {
'apple': 'fruit',
'banana': 'fruit',
'car': 'vehicle',
'dog': 'animal',
}
# Ensure the destination directory exists
os.makedirs(destination_dir, exist_ok=True)
# Walk through all subdirectories and files in the source directory
for root, _, files in os.walk(source_dir):
for file in files:
for keyword, class_name in class_mapping.items():
if keyword in file:
# Ensure class folder exists
class_folder = os.path.join(destination_dir, class_name)
os.makedirs(class_folder, exist_ok=True)
# Define source and destination file paths
source_file_path = os.path.join(root, file)
destination_file_path = os.path.join(class_folder, file)
# Check if the file already exists at the destination
if not os.path.exists(destination_file_path):
shutil.move(source_file_path, destination_file_path)
print(f"Moved: {source_file_path
} to {destination_file_path}")
else:
print(f"File already exists, skipping: {destination_file_path}")
print("Dataset creation complete.")
Keywords and Hashtags
- Keywords: file organization, keyword-based, file management, automation, class mapping
- Hashtags: #FileOrganization #KeywordBased #FileManagement #Automation #ClassMapping
-----------EOF-----------
Created by Tim from the Midwest of Canada.
2024.
This document is GPL Licensed.
Top comments (0)