Scofield Idehen

Posted on • Originally published at blog.learnhub.africa

Build an Antivirus with Python (Beginners Guide)

If you have been hit by a virus attack before, you will understand how annoying it is to lose your files because the virus has corrupted them.

My first encounter with a virus attack was when all my apps stopped working, my laptop started malfunctioning, and my productivity slowed as I tried to figure out how to get my files back and restore my system to its previous state.

In this guide, we will build a personal antivirus ourselves, so we can be sure we are not downloading and installing a virus disguised as an antivirus.

Python is an excellent choice for developing an antivirus due to its simplicity, readability, and vast ecosystem of libraries. In this article, we'll guide you through building a basic antivirus using Python, even if you're a beginner.

Prerequisites

Before diving into the coding part, you'll need to have the following prerequisites:

  1. Python Installation: Make sure you have Python installed on your system. You can download the latest version from the official Python website.

  2. Basic Python Knowledge: While we'll try to explain everything in detail, it will be beneficial to have a basic understanding of Python syntax, data structures, and control flow statements.

  3. Pip: pip is the package installer for Python. It comes pre-installed with Python versions 3.4 and later. You'll need pip to install the required libraries for your antivirus project.

  4. Text Editor or IDE: You'll need a text editor or an Integrated Development Environment (IDE) to write and edit your Python code. Popular choices include Visual Studio Code, PyCharm, Sublime Text, and Atom.

For this guide, I will be using Visual Studio Code, which you can download from its official website.

Setting Up the Project

Let's start by creating a new folder for our project and setting up a virtual environment. A virtual environment is a self-contained directory tree that isolates the project's dependencies from other Python projects on your system.

You can learn more about how to set up a virtual environment from this guide.

  1. Open your terminal or command prompt and navigate to the desired location for your project.
  • Create a new folder for your project:
    mkdir antivirus_project
    cd antivirus_project
  • Create a virtual environment:
    python -m venv env

  • Activate the virtual environment:

    • On Windows:

    env\Scripts\activate

    • On macOS or Linux:

    source env/bin/activate

Your terminal should now show the name of your virtual environment in parentheses, indicating that the virtual environment is active.
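
If you would rather confirm this from Python itself, here is a minimal sketch. The function name in_virtual_environment is made up for this example; it relies on the fact that inside a venv, sys.prefix points at the environment while sys.base_prefix points at the base installation.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix is the environment directory, while
    # sys.base_prefix still refers to the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
```

Run it with the same python command you plan to use for the antivirus script.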

Installing Required Libraries

Our antivirus will utilize several Python libraries to perform various tasks such as file scanning, signature matching, and virus definition updates. Let's install the necessary libraries using pip:

  • Install the pyfiglet library:

    pip install pyfiglet

pyfiglet is a Python library that allows you to create ASCII art from text. This can be useful for enhancing the visual appeal of command-line interfaces or console output by generating stylized text banners. It's often used in scripts or applications to display headers, logos, or other textual decorations in an eye-catching way.

Example Use Case: If you're building a CLI tool and want to display a welcome message or logo in a stylized font, pyfiglet can generate this ASCII art.

  • Install the python-magic library:

    pip install python-magic

python-magic is a library that examines a file's content to identify the type of data contained in it. It uses the same underlying functionality as the Unix file command.

This is particularly useful when you need to handle files whose types aren't known in advance or need to verify file types for security or processing purposes.

Example Use Case: If your application processes user-uploaded files, python-magic can help ensure that the files are of the expected type, regardless of their extensions.
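
To get a feel for what "examining a file's content" means, here is a toy sketch that looks only at a file's first few bytes. It is not how python-magic is implemented (libmagic uses a much larger rule database); the MAGIC_NUMBERS table and the function names are assumptions made up for this illustration.

```python
# Toy content sniffing: map well-known leading "magic" bytes to a label.
# This small table is an illustrative subset, not libmagic's database.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP archive",
    b"\x7fELF": "ELF executable",
    b"MZ": "DOS/Windows executable",
}


def sniff_bytes(data: bytes) -> str:
    """Return a label for the data based on its leading bytes."""
    for signature, label in MAGIC_NUMBERS.items():
        if data.startswith(signature):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(16))


print(sniff_bytes(b"%PDF-1.7 example"))  # PDF document
print(sniff_bytes(b"plain text"))        # unknown
```

Notice that the file extension never enters the picture, which is why a renamed executable can still be recognised.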

  • Use the hashlib library for calculating file hashes. No pip command is needed for this one.

hashlib is part of the Python Standard Library and doesn't need to be installed separately. It provides secure hash functions such as SHA-256, which are essential for verifying the integrity and authenticity of data. Common use cases include generating checksums for files, password hashing, and data integrity verification.

Example Use Case: To ensure a file has not been altered, you can generate its hash and compare it with a known good hash.

If you run pip install hashlib anyway, pip will report an error. You can safely ignore it and move on to the next library.
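
The integrity check described in the use case can be sketched in a few lines. The helper names sha256_hex and matches_known_hash are hypothetical, chosen for this example only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_known_hash(data: bytes, known_good_hex: str) -> bool:
    """Compare the digest of data with a previously recorded digest."""
    return sha256_hex(data) == known_good_hex.lower()


original = b"hello world"
known_good = sha256_hex(original)  # recorded while the data was trusted

print(matches_known_hash(original, known_good))         # True
print(matches_known_hash(b"hello world!", known_good))  # False
```

Changing even a single byte produces a completely different digest, which is what makes hashes useful for spotting altered files.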

  • Install the requests library for making HTTP requests:

    pip install requests

requests is a popular and user-friendly library for making HTTP requests in Python. It simplifies sending HTTP/1.1 requests, such as GET and POST, to interact with web services and APIs. It's widely used for its ease of use, reliability, and well-designed API.

Example Use Case: If your application needs to communicate with a web API, requests can handle sending and receiving data.

These libraries will help us create a basic antivirus with features like ASCII banner display, file type identification, virus signature matching, and virus definition updates.

Creating the Antivirus Script

Now that the necessary libraries are installed, let's start writing the antivirus script. Create a new Python file in your project folder and name it antivirus.py.

Open the antivirus.py file in your text editor or IDE and import the required libraries.

    import os
    import hashlib
    import magic
    import pyfiglet
    import requests

Next, let's define some functions that our antivirus will use.

  • Display ASCII Banner

The display_banner function will use the pyfiglet library to create an ASCII banner for our antivirus:

    def display_banner():
        banner = pyfiglet.figlet_format("AntiVirus")
        print(banner)
  • Get File Hashes

The get_file_hashes function will calculate the SHA-256 hash of a given file using the hashlib library:

    def get_file_hashes(file_path):
        with open(file_path, 'rb') as file:
            file_data = file.read()
            sha256_hash = hashlib.sha256(file_data).hexdigest()
        return sha256_hash
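
The function above reads the whole file into memory, which is fine for small files but wasteful for very large ones. As a possible variation, here is a sketch that feeds the file to hashlib in chunks. The name get_file_hash_chunked and the 64 KiB default chunk size are assumptions for this example, not part of the script.

```python
import hashlib


def get_file_hash_chunked(file_path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it piece by piece."""
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as file:
        # iter() with a sentinel keeps calling file.read(chunk_size)
        # until it returns b"", which signals the end of the file.
        for chunk in iter(lambda: file.read(chunk_size), b""):
            sha256.update(chunk)
    return sha256.hexdigest()
```

The digest is identical to the one produced by hashing the whole file at once; only the memory usage differs.
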
  • Identify File Type

The identify_file_type function will use the python-magic library to determine the type of a given file:

    def identify_file_type(file_path):
        file_type = magic.from_file(file_path)
        return file_type

  • Check for Virus Signatures

The check_for_virus_signatures function will compare the file hash against a list of known virus signatures:

    def check_for_virus_signatures(file_path):
        file_hash = get_file_hashes(file_path)
        # Placeholder entries: replace these with real SHA-256 hex digests.
        virus_signatures = ['known_virus_hash_1', 'known_virus_hash_2', ...]

        # True if the file's hash matches one of the known signatures.
        return file_hash in virus_signatures

In this example, we'll use a hardcoded list of known virus hashes for simplicity. In a real-world scenario, you would fetch these virus signatures from an online database or a local virus definition file.
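
As a rough idea of the local-file approach, the sketch below loads signatures from a text file into a set. The file format (one hex digest per line, with # starting a comment) and the names load_signatures and is_known_signature are assumptions made for this example.

```python
from pathlib import Path


def load_signatures(definitions_path):
    """Load one hash per line, skipping blank lines and '#' comments."""
    signatures = set()
    text = Path(definitions_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            signatures.add(entry)
    return signatures


def is_known_signature(file_hash, signatures):
    """Check a hex digest against the loaded signature set."""
    return file_hash.lower() in signatures
```

A set is used instead of a list because membership checks stay fast even when the definition file grows to many thousands of entries.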

  • Update Virus Definitions

The update_virus_definitions function will simulate fetching the latest virus definitions from an online source using the requests library:

    def update_virus_definitions():
        """Fetch the latest definitions; return a list of hashes ([] on failure)."""
        try:
            # A timeout stops the program from hanging if the server never answers.
            response = requests.get('https://example.com/virus_definitions.txt', timeout=10)
            if response.status_code == 200:
                virus_definitions = response.text.splitlines()
                print("Virus definitions updated successfully.")
                return virus_definitions
            print("Failed to update virus definitions.")
        except requests.exceptions.RequestException as e:
            print(f"Error updating virus definitions: {e}")
        return []

In this example, we're using a placeholder URL (https://example.com/virus_definitions.txt). In a real-world scenario, you would replace this with the URL or file location where the virus definitions are stored.
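
Whatever source you end up using, it is worth checking the downloaded text before trusting it, since a server can just as easily return an HTML error page. The sketch below keeps only lines that look like SHA-256 hex digests. The name parse_definitions and the one-hash-per-line format are assumptions for this example.

```python
import re

# A SHA-256 digest written in hex is exactly 64 characters from 0-9 and a-f.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def parse_definitions(text):
    """Return the set of well-formed SHA-256 hex digests found in text."""
    valid = set()
    for line in text.splitlines():
        candidate = line.strip().lower()
        if SHA256_HEX.fullmatch(candidate):
            valid.add(candidate)
    return valid
```

You could pass response.text to a function like this instead of splitting on newlines directly, so that blank lines and stray markup never end up in the signature list.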

  • Scan File

The scan_file function will tie everything together and perform the actual file scanning process.

    def scan_file(file_path):
        file_type = identify_file_type(file_path)
        print(f"Scanning file: {file_path} ({file_type})")

        if check_for_virus_signatures(file_path):
            print(f"Virus detected in {file_path}!")
        else:
            print(f"{file_path} is clean.")
  • Main Function

Finally, let's create the main function, which will serve as the entry point for our antivirus.

    def main():
        display_banner()
        update_virus_definitions()

        while True:
            file_path = input("Enter the file path to scan (or 'q' to quit): ")
            if file_path.lower() == 'q':
                break
            if os.path.isfile(file_path):
                scan_file(file_path)
            else:
                print(f"Invalid file path: {file_path}")

    if __name__ == "__main__":
        main()

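In the main function, we first display the ASCII banner using the display_banner function. Then, we update the virus definitions by calling the update_virus_definitions function. Note that the list it returns is not used yet: check_for_virus_signatures still compares against its own hardcoded list.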

Next, we enter a loop where the user is prompted to enter a file path to scan. If the user enters 'q', the loop breaks, and the program exits. Otherwise, we check if the provided file path is valid using the os.path.isfile function. If the file path is valid, we call the scan_file function to scan the file for viruses.

Here is the complete antivirus.py script:

    import os
    import hashlib
    import magic
    import pyfiglet
    import requests

    def display_banner():
        banner = pyfiglet.figlet_format("AntiVirus")
        print(banner)

    def get_file_hashes(file_path):
        with open(file_path, 'rb') as file:
            file_data = file.read()
            sha256_hash = hashlib.sha256(file_data).hexdigest()
        return sha256_hash

    def identify_file_type(file_path):
        file_type = magic.from_file(file_path)
        return file_type

    def check_for_virus_signatures(file_path):
        file_hash = get_file_hashes(file_path)
        # Placeholder entries: replace these with real SHA-256 hex digests.
        virus_signatures = ['known_virus_hash_1', 'known_virus_hash_2', ...]

        # True if the file's hash matches one of the known signatures.
        return file_hash in virus_signatures

    def update_virus_definitions():
        """Fetch the latest definitions; return a list of hashes ([] on failure)."""
        try:
            # A timeout stops the program from hanging if the server never answers.
            response = requests.get('https://example.com/virus_definitions.txt', timeout=10)
            if response.status_code == 200:
                virus_definitions = response.text.splitlines()
                print("Virus definitions updated successfully.")
                return virus_definitions
            print("Failed to update virus definitions.")
        except requests.exceptions.RequestException as e:
            print(f"Error updating virus definitions: {e}")
        return []

    def scan_file(file_path):
        file_type = identify_file_type(file_path)
        print(f"Scanning file: {file_path} ({file_type})")

        if check_for_virus_signatures(file_path):
            print(f"Virus detected in {file_path}!")
        else:
            print(f"{file_path} is clean.")

    def main():
        display_banner()
        update_virus_definitions()

        while True:
            file_path = input("Enter the file path to scan (or 'q' to quit): ")
            if file_path.lower() == 'q':
                break
            if os.path.isfile(file_path):
                scan_file(file_path)
            else:
                print(f"Invalid file path: {file_path}")

    if __name__ == "__main__":
        main()
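
In the script, the definitions fetched at startup and the signature check are not connected to each other. One possible way to link them, sketched here with hypothetical helper names (file_sha256, is_infected, demo), is to pass the signatures into the check as a parameter. The demo uses a harmless temporary file whose own hash stands in for a "known virus" signature, so you can see a detection without handling real malware.

```python
import hashlib
import os
import tempfile


def file_sha256(file_path):
    """Return the SHA-256 hex digest of a file."""
    with open(file_path, "rb") as file:
        return hashlib.sha256(file.read()).hexdigest()


def is_infected(file_path, signatures):
    """Return True if the file's hash appears in the supplied signatures."""
    return file_sha256(file_path) in signatures


def demo():
    # Create a harmless stand-in for a malicious file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"harmless test payload")
        path = tmp.name
    try:
        # Pretend this set came from the definition update step.
        signatures = {file_sha256(path)}
        return is_infected(path, signatures), is_infected(path, set())
    finally:
        os.remove(path)


print(demo())  # (True, False)
```

With this shape, main could keep the result of update_virus_definitions in a variable and hand it to the scan, instead of relying on a list buried inside the check function.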

Running the Antivirus

Save the antivirus.py file and open your terminal or command prompt. Navigate to the project folder and run the following command to start the antivirus.

    python antivirus.py

Even if you followed every step correctly, you might get an error saying that libmagic is missing. python-magic is only a wrapper, so the libmagic library itself has to be installed on your system. Follow the steps below to sort it out.

For macOS:

  • Install libmagic with Homebrew:

    brew install libmagic

  • If Homebrew reports that libmagic is installed but not linked, link it:

    brew link libmagic --overwrite

This command creates symbolic links to the libmagic library under the Homebrew prefix so that python-magic can locate it. It does not change anything inside your virtual environment's site-packages directory.

For Windows:

  • You need to install the libmagic library manually by downloading the pre-compiled binaries.
  • After downloading the binaries, extract them to a directory of your choice.
  • Add the directory containing the magic1.dll file to your system's PATH environment variable.
  • Once the PATH variable is updated, open a new terminal so the change takes effect. You can then import the python-magic library without issues.

After installing and configuring libmagic correctly, you should be able to run python antivirus.py again without the error.

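Let's scan a document to see if our antivirus is working. The first step is to find the full path of the file.

In your terminal, cd into the Documents directory and run pwd to print its full path.

Copy that path, add the name of the file you want to scan to the end of it, and paste the result at the antivirus prompt. The script expects the path of a file, so a directory path on its own will be reported as invalid.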

Congratulations! You've successfully built a basic antivirus using Python. Of course, this is just a starting point, and there's plenty of room for improvement and additional features.

Potential Improvements and Additional Features

While our basic antivirus is functional, it's far from a complete solution.

Here are some potential improvements and additional features you could consider:

  • Real-Time Monitoring: Implement real-time monitoring capabilities to scan files as they are accessed, modified, or executed on the system.

  • Heuristic Analysis: Incorporate heuristic analysis techniques to detect unknown or obfuscated malware based on suspicious behavior or patterns.

  • Quarantine and Disinfection: Add functionality to quarantine or disinfect infected files instead of simply reporting them.

  • User Interface: Develop a graphical user interface (GUI) for a more user-friendly experience.

  • Scheduled Scans: Allow users to schedule regular system scans at specific intervals.

  • Cloud-Based Virus Definitions: Implement a system to fetch virus definitions from a cloud-based service or a centralized database.

  • Multi-Platform Support: Test and adapt the antivirus so that it works on multiple operating systems, such as Windows, macOS, and Linux.

  • Performance Optimization: Optimize the antivirus for better performance, especially when scanning large files or entire directories.

  • Logging and Reporting: Add logging and reporting features to keep track of scan results, quarantined files, and other relevant information.

  • Cloud-Based Scanning: Incorporate cloud-based scanning capabilities to offload resource-intensive tasks or leverage more robust analysis engines.

Remember, building a comprehensive antivirus solution is a complex task that requires extensive knowledge and experience in cybersecurity, malware analysis, and software development.

This tutorial serves as a starting point to help you understand the basic concepts and components involved in building an antivirus with Python.
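
To make two of the ideas above a little more concrete (scanning entire directories, and quarantining instead of only reporting), here is a minimal sketch using only the standard library. The function names, the .quarantined suffix, and the choice to skip unreadable files are all assumptions for this example, not a recommended design.

```python
import hashlib
import os
import shutil


def iter_files(root):
    """Yield the path of every file below root, including subfolders."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def scan_directory(root, signatures):
    """Return the paths of files whose SHA-256 digest is in signatures."""
    infected = []
    for path in iter_files(root):
        try:
            with open(path, "rb") as file:
                digest = hashlib.sha256(file.read()).hexdigest()
        except OSError:
            continue  # skip files that cannot be opened or read
        if digest in signatures:
            infected.append(path)
    return infected


def quarantine(path, quarantine_dir):
    """Move a file into quarantine_dir and return its new location."""
    os.makedirs(quarantine_dir, exist_ok=True)
    target = os.path.join(quarantine_dir, os.path.basename(path) + ".quarantined")
    shutil.move(path, target)
    return target
```

Keep the quarantine folder outside the directory being scanned, otherwise the moved files would simply be found again on the next scan.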

Conclusion

In this article, we've covered the steps required to build a basic antivirus using Python. We've learned how to set up the project, install necessary libraries, and create functions for file scanning, virus signature matching, and virus definition updates.

We've also discussed potential improvements and additional features that could be implemented to enhance the functionality of our antivirus. Building an antivirus is an excellent way to learn about cybersecurity, file analysis, and Python programming.

It's also a great exercise to understand the challenges of developing security solutions and the importance of keeping systems secure. Remember, this tutorial is meant to be a learning resource, and the antivirus we've built should not be considered a replacement for commercial antivirus solutions. Always use reputable and up-to-date security software to protect your systems from real-world threats.

Happy coding, and stay secure!
