Hi 🙂🖐
In this post, I will show you how to detect file changes in near real time with Python, using file monitoring and hash comparison.
```python
from time import sleep
from hashlib import md5
from os import listdir

src_dir = 'myFiles'
my_files = listdir(src_dir)

# create hash for files
def get_file_hash(filename):
    return md5(open(filename, 'rb').read()).hexdigest()

file_data = {}

def load_files():
    for filename in my_files:
        file_hash = get_file_hash(f'{src_dir}/{filename}')
        file_data[filename] = file_hash

load_files()

def check_changes():
    for filename in my_files:
        file_hash = get_file_hash(f'{src_dir}/{filename}')
        if file_data[filename] != file_hash:
            print(f'change detected in file {filename}')
    sleep(1)
    check_changes()

sleep(1)
check_changes()
```
This code is a Python script designed to monitor changes in files within a specified directory. Let's break it down step by step:
1. `from time import sleep`: This imports the `sleep` function from the `time` module. It is used to pause the execution of the script.
2. `from hashlib import md5`: This imports the `md5` hashing algorithm from the `hashlib` module. MD5 is used here to generate a hash of the file content.
3. `from os import listdir`: This imports the `listdir` function from the `os` module. It returns the names of all entries in a directory (files and subdirectories alike).
4. `src_dir = 'myFiles'`: This specifies the directory (`myFiles`) where the files to be monitored are located.
5. `my_files = listdir(src_dir)`: This lists the entries in the directory specified by `src_dir` and stores their names in the `my_files` list. The list is built once at startup, so files created later are not monitored.
6. `def get_file_hash(filename)`: This is a function that takes a filename as input, reads the file in binary mode, calculates its MD5 hash, and returns the hexadecimal representation of the hash.
7. `file_data = {}`: This initializes an empty dictionary, `file_data`, where each filename and its corresponding hash will be stored.
8. `def load_files()`: This function iterates through all files in `my_files`, calculates the hash of each file using `get_file_hash`, and stores the filename and its hash in the `file_data` dictionary.
9. `load_files()`: This call to `load_files` fills the `file_data` dictionary with the initial hashes of all files in the directory.
10. `def check_changes()`: This function is responsible for monitoring changes in the files. It iterates through all files in `my_files`, recalculates the hash of each file, and compares it with the hash stored in `file_data`. If the two differ, it prints a message naming the changed file. The stored hash is never updated, so once a file changes, the message is printed again on every later pass.
11. At the end of `check_changes`, the function calls `sleep(1)` to pause for 1 second and then calls itself recursively. This makes the script check the files roughly once per second. Each recursive call adds a stack frame, so a long run will eventually reach Python's recursion limit (1000 frames by default) and raise `RecursionError`.
12. The script ends with an initial `sleep(1)` followed by a call to `check_changes()` to start the monitoring process.
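
The recursive polling in steps 10 to 12 can also be written as a plain loop, which avoids deepening the call stack on every pass. Below is a minimal sketch of that idea, not the original script: the names `poll_once` and `watch`, and the `interval` and `max_polls` parameters, are hypothetical and introduced only for illustration. It also stores the new hash after reporting a change, so each edit is reported once.

```python
from hashlib import md5
from os import listdir
from os.path import isfile, join
from time import sleep


def get_file_hash(path):
    # Read the file in binary mode; the context manager closes it afterwards.
    with open(path, 'rb') as f:
        return md5(f.read()).hexdigest()


def poll_once(src_dir, file_data):
    """Compare current hashes with file_data and return the changed filenames."""
    changed = []
    for filename in list(file_data):
        file_hash = get_file_hash(join(src_dir, filename))
        if file_data[filename] != file_hash:
            changed.append(filename)
            # Store the new hash so the same edit is reported only once.
            file_data[filename] = file_hash
    return changed


def watch(src_dir, interval=1, max_polls=None):
    """Poll src_dir in a loop; max_polls=None (hypothetical) means run until interrupted."""
    file_data = {
        name: get_file_hash(join(src_dir, name))
        for name in listdir(src_dir)
        if isfile(join(src_dir, name))  # skip subdirectories
    }
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval)
        for filename in poll_once(src_dir, file_data):
            print(f'change detected in file {filename}')
        polls += 1
```

Calling `watch('myFiles')` would poll the directory once per second until the process is stopped, for example with Ctrl+C.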
In summary, this script continuously monitors the specified directory for changes in files by calculating their MD5 hashes and comparing them with previously stored hashes. If a change is detected, it prints a message indicating the change.
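
Because the script lists the directory only once, it cannot notice new files, and a file deleted during monitoring would make `open` raise `FileNotFoundError`. One way to cover those cases, sketched here under assumptions, is to hash the whole directory on each pass and compare two snapshots. The function names `take_snapshot` and `diff_snapshots` and the `chunk_size` parameter are hypothetical; the files are hashed in chunks so that large files are not read into memory all at once.

```python
from hashlib import md5
from os import listdir
from os.path import isfile, join


def take_snapshot(src_dir, chunk_size=65536):
    """Return {filename: MD5 hex digest} for the regular files in src_dir."""
    snapshot = {}
    for name in listdir(src_dir):
        path = join(src_dir, name)
        if not isfile(path):
            continue  # skip subdirectories, which open() cannot read
        digest = md5()
        with open(path, 'rb') as f:
            # Feed the file to the hash in fixed-size chunks.
            for chunk in iter(lambda: f.read(chunk_size), b''):
                digest.update(chunk)
        snapshot[name] = digest.hexdigest()
    return snapshot


def diff_snapshots(old, new):
    """Return (added, removed, modified) filename lists, each sorted."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return added, removed, modified
```

A monitoring loop could call `take_snapshot` on each pass, pass the previous and current results to `diff_snapshots`, and then keep the current snapshot as the new baseline.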