Rémy 🤖

Posted on May 12, 2021

The Ultimate Python main()

#python #tutorial #learning

So you want to write a CLI utility in Python. The goal is to be able to write:

$ ./say_something.py -w "hello, world"
hello, world

And then have your script run. This isn't hard and you've probably done it already, but have you thought of all the use cases? Let's go step by step through everything you can do to make your main() bullet-proof. The first sections are very basic, and we'll build up to something less obvious as we progress.

Hello, world

Let's start with a basic script that just says "hello, world".

print("hello, world")

Now this is fairly basic but still won't work if you call it directly like ./say_something.py. In order for this to work, you need to add the famous shebang at the beginning.

#!/usr/bin/python3

print("hello, world")

Note: don't forget to run chmod a+x say_something.py to make the file executable.

However, what if you're using a virtualenv? By calling /usr/bin/python3 directly you're forcing the path to the Python interpreter. This can be desirable in some cases (for example, package managers tend to force this for safety), but if you're writing a distributable script it's simpler to get the interpreter from the environment.

#!/usr/bin/env python3

print("hello, world")

Note: we're calling python3 instead of python because on some systems python points to Python 2, and who wants to be using Python 2?

Importability

Something that you'll learn early on from manuals is that you want to make sure that if for some reason someone imports your module then there won't be any side-effect like a surprise print().

Because of that, you need to "protect" your code with this special idiom:

#!/usr/bin/env python3

if __name__ == "__main__":
    print("hello, world")

You'll note the use of the special variable __name__ whose value is going to be __main__ if you're in the "root" file that Python is running and the module name otherwise.
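
If you want to see the two values of __name__ for yourself, here's a minimal sketch. It writes a throwaway module to a temporary directory, imports it, and then runs it as a script. The names demo_module, SOURCE and observe_names are made up for this illustration and aren't part of say_something.py.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# A one-line module that simply records the value of __name__ it sees.
SOURCE = "seen_name = __name__\n"


def observe_names():
    """Return (__name__ when imported, __name__ when run as a script)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_module.py"
        path.write_text(SOURCE)

        # Case 1: the file is imported as a module named "demo_module".
        spec = importlib.util.spec_from_file_location("demo_module", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        imported_name = module.seen_name

        # Case 2: the same file is executed as the "root" script.
        script_globals = runpy.run_path(str(path), run_name="__main__")
        script_name = script_globals["seen_name"]

    return imported_name, script_name
```

Calling observe_names() returns ("demo_module", "__main__"), which is exactly the difference the idiom relies on.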

Parsing arguments

You'll notice that so far the program only says "hello, world" but doesn't parse the -w argument that we've described above. Parsing arguments in Python is not so different from C (for those who had the chance to learn it at school), except that, well, it's Python, so you get a few more goodies.

The main difference is that instead of receiving the arguments as an argument of main() you'll have to import them.

#!/usr/bin/env python3
from sys import argv, exit, stderr


def fail(msg: str):
    stderr.write(f"{msg}\n")
    exit(1)


if __name__ == "__main__":
    what = "hello, world"

    if len(argv) == 1:
        pass
    elif len(argv) == 3:
        if argv[1] != "-w":
            fail(f'Unrecognized argument "{argv[1]}"')
        what = argv[2]
    else:
        fail("Too many/few arguments")

    print(what)

Now, while this fits the bill, parsing arguments manually like this is oh so laborious. For a long time I was afraid to use argparse because the documentation is quite the beast, but in the end its basic use is very simple.

#!/usr/bin/env python3
from argparse import ArgumentParser


if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    args = parser.parse_args()

    print(args.what)

That way you leave all the hard work to ArgumentParser() and as a bonus you get an auto-generated help text when calling with -h as an argument.
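
To get a feel for what argparse gives you for free, here's a small sketch. The build_parser() helper, the prog name and the help string are my own additions for this illustration. Passing a list to parse_args() lets you try things out without touching the real command line.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    # "prog" and "help" are optional; they only make the help text nicer.
    parser = ArgumentParser(prog="say_something")
    parser.add_argument(
        "-w",
        "--what",
        default="hello, world",
        help="the text to print",
    )
    return parser


parser = build_parser()

# Short and long forms end up in the same attribute, named after the long flag.
short_form = parser.parse_args(["-w", "hi"])
long_form = parser.parse_args(["--what", "hi"])

# With no arguments at all, the default is used.
default_form = parser.parse_args([])

# This is the text that -h would print before exiting.
help_text = parser.format_help()
```

Here short_form.what and long_form.what are both "hi", default_form.what is "hello, world", and help_text lists the --what option along with the automatic -h/--help one.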

Outside call

All this is cool, but what if you want to call the CLI tool from another Python program? Experience has shown me that this is often much more pragmatic than designing one API for Python and a different one for the CLI (which is argument parsing).

The problem with what we've got here is that we can't reference our call from the outside world. To run, the code has to be in __main__, which is not necessarily what we want.

Fortunately there is an easy solution to this: create a C-style main() function.

#!/usr/bin/env python3
from argparse import ArgumentParser


def main():
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    args = parser.parse_args()

    print(args.what)


if __name__ == "__main__":
    main()

While this works, it doesn't let you pass custom arguments from an outside package. However, all you need to do is add the arguments as a parameter of your main function:

#!/usr/bin/env python3
from argparse import ArgumentParser
from typing import Sequence, Optional


def main(argv: Optional[Sequence[str]] = None):
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    args = parser.parse_args(argv)

    print(args.what)


if __name__ == "__main__":
    main()

The advantage of this is that by default it keeps working like before, except that this time you can pass custom arguments when calling it from the outside, for example:

>>> from say_something import main
>>> main(["-w", "hello, world"])
hello, world
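
The same trick makes the script easy to call from a test. Below is a minimal sketch, assuming you want the printed text back as a string. The capture_main() helper is hypothetical, and main() is repeated so that the snippet runs on its own.

```python
from argparse import ArgumentParser
from contextlib import redirect_stdout
from io import StringIO
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None):
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    # argv=None tells argparse to read sys.argv[1:], any list is used as-is.
    args = parser.parse_args(argv)

    print(args.what)


def capture_main(argv: Sequence[str]) -> str:
    """Run main() with the given arguments and return what it printed."""
    buffer = StringIO()
    with redirect_stdout(buffer):
        main(argv)
    return buffer.getvalue()
```

Note that main([]) and main() are not the same thing: an empty list means "no arguments at all", while None means "whatever is on the real command line". From a caller or a test you usually want to pass an explicit list.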

A bit of assistance

For good measure and clarity, let's move the argument parsing into a separate function.

#!/usr/bin/env python3
from argparse import ArgumentParser, Namespace
from typing import Sequence, Optional


def parse_args(argv: Optional[Sequence[str]] = None) -> Namespace:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)
    print(args.what)


if __name__ == "__main__":
    main()

But now you'll notice that the typing annotation isn't very helpful. I usually like to be a little bit more verbose to get more assistance from my IDE later on.

#!/usr/bin/env python3
from argparse import ArgumentParser
from typing import Sequence, Optional, NamedTuple


class Args(NamedTuple):
    what: str


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return Args(**parser.parse_args(argv).__dict__)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)
    print(args.what)


if __name__ == "__main__":
    main()
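
One thing to keep in mind with this pattern: the fields of Args have to match the attribute names that argparse produces, otherwise building the tuple fails with a TypeError. Here's a minimal sketch, assuming a hypothetical -n/--times option that is not part of the script above. It also uses vars(), which returns the same dictionary as __dict__.

```python
from argparse import ArgumentParser
from typing import NamedTuple, Optional, Sequence


class Args(NamedTuple):
    what: str
    times: int  # hypothetical extra field, for illustration only


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")
    # type=int makes argparse convert the value, so the annotation stays true.
    parser.add_argument("-n", "--times", type=int, default=1)

    # Every argparse attribute needs a field with the same name in Args.
    return Args(**vars(parser.parse_args(argv)))


args = parse_args(["-n", "3"])
```

Here args is Args(what="hello, world", times=3), and your IDE knows that args.times is an int.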

Handling signals

For the purpose of this demonstration, let's add a sleep() call to the code to simulate that it's doing something instead of printing the text right away.

Now what if the program receives a signal? There are two main things you want to handle:

SIGINT: sent when the user presses CTRL+C in their terminal, which raises a KeyboardInterrupt
SIGTERM: sent when the user kindly asks the program to die with a TERM signal. Unlike SIGKILL it can be handled, but we'll see that later

Let's start by handling CTRL+C:

#!/usr/bin/env python3
from argparse import ArgumentParser
from time import sleep
from typing import Sequence, Optional, NamedTuple
from sys import exit, stderr


class Args(NamedTuple):
    what: str


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return Args(**parser.parse_args(argv).__dict__)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)
    sleep(100)
    print(args.what)


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        stderr.write("ok, bye\n")
        exit(1)

As you can see, it's as simple as catching the KeyboardInterrupt exception. Note that we do this outside of main() because the famous hypothetical "caller module" will already have its own signal handling in place, so this is something we only need to set up when we're running on our own.

Next comes the handling of SIGTERM. By default the program will just stop, but we don't really want this. For example, if we're using the following pattern inside the code:

try:
    ...  # do something
finally:
    ...  # cleanup

We won't have a chance to clean up if no exception is raised. Fortunately for us, we can raise the exception ourselves.

#!/usr/bin/env python3
from argparse import ArgumentParser
from time import sleep
from typing import Sequence, Optional, NamedTuple
from sys import exit, stderr
from signal import signal, SIGTERM


class Args(NamedTuple):
    what: str


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return Args(**parser.parse_args(argv).__dict__)


def sigterm_handler(_, __):
    raise SystemExit(1)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)
    sleep(100)
    print(args.what)


if __name__ == "__main__":
    signal(SIGTERM, sigterm_handler)

    try:
        main()
    except KeyboardInterrupt:
        stderr.write("ok, bye\n")
        exit(1)

We're using signal() to register sigterm_handler() as our handler for SIGTERM. Under the hood, the signal handler is executed before the "next instruction" that the interpreter would otherwise have considered. This gives you a chance to raise an exception, which bubbles up through our __main__ block and exits the program with a return code of 1, while triggering all the finally blocks and context managers along the way.

As stated before, there is a sleep() in the middle of the function. This means that you can run the code from this section and then hit CTRL+C or send a SIGTERM to see what happens when you interrupt the program.
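
If you'd rather check this without juggling two terminals, here's a small sketch in which the process sends SIGTERM to itself with signal.raise_signal() (available since Python 3.8). The work_then_cleanup() and demo() functions and the events list are made up for the illustration, and the sketch assumes it runs in the main thread, since that's the only place where Python lets you install signal handlers.

```python
from signal import SIG_DFL, SIGTERM, raise_signal, signal
from time import sleep
from typing import List


def sigterm_handler(_, __):
    raise SystemExit(1)


def work_then_cleanup(events: List[str]) -> None:
    try:
        events.append("working")
        raise_signal(SIGTERM)  # stands in for `kill <pid>` from a terminal
        sleep(0.1)  # the handler runs before or during this call at the latest
        events.append("finished")  # never reached
    finally:
        events.append("cleanup")


def demo() -> List[str]:
    events: List[str] = []
    previous = signal(SIGTERM, sigterm_handler)
    try:
        work_then_cleanup(events)
    except SystemExit as e:
        # Caught here only so that the demo can report the exit code.
        events.append(f"exit code {e.code}")
    finally:
        # Put back whatever handler was installed before the demo.
        signal(SIGTERM, previous if previous is not None else SIG_DFL)
    return events
```

Calling demo() returns ["working", "cleanup", "exit code 1"]: the "finished" step is skipped, but the finally block still runs.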

Error Reporting

Sometimes (more often than you'd like) your program will fail, either through its own fault or because the inputs are incorrect. For example, if the user tries to open a file that doesn't exist, you might want to report it. And by "report it" I mean nicely, not with a harsh stack trace. Keep the stack trace for the cases that you didn't expect; that way you know that something is really wrong.

Let's imagine that we want to forbid the user from saying "ni". In order to report the error, we're going to create a specific error type for our program:

class SaySomethingError(Exception):
    pass

Then we're going to handle it in our __main__:

    except SaySomethingError as e:
        stderr.write(f"Error: {e}\n")
        exit(1)

And finally we're going to raise the error from main():

    if args.what == "ni":
        raise SaySomethingError('Saying "ni" is forbidden')

The overall code then becomes:

#!/usr/bin/env python3
from argparse import ArgumentParser
from typing import Sequence, Optional, NamedTuple
from sys import exit, stderr
from signal import signal, SIGTERM


class Args(NamedTuple):
    what: str


class SaySomethingError(Exception):
    pass


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return Args(**parser.parse_args(argv).__dict__)


def sigterm_handler(_, __):
    raise SystemExit(1)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)

    if args.what == "ni":
        raise SaySomethingError('Saying "ni" is forbidden')

    print(args.what)


if __name__ == "__main__":
    signal(SIGTERM, sigterm_handler)

    try:
        main()
    except KeyboardInterrupt:
        stderr.write("ok, bye\n")
        exit(1)
    except SaySomethingError as e:
        stderr.write(f"Error: {e}\n")
        exit(1)

With this very simple system, you can raise an error from your code no matter how deep you are. It will trigger all the cleanup functions from context managers and finally blocks. And finally it will be caught by the __main__ to display a proper error message.
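
To make the "no matter how deep you are" part concrete, here's a minimal sketch. The resource() context manager, the deep_check(), say() and run() functions and the log list are all invented for this example. Only SaySomethingError comes from the script above.

```python
from contextlib import contextmanager
from typing import Iterator, List


class SaySomethingError(Exception):
    pass


@contextmanager
def resource(log: List[str]) -> Iterator[None]:
    """Pretend to hold something that must be released (file, lock...)."""
    log.append("acquire")
    try:
        yield
    finally:
        log.append("release")


def deep_check(what: str) -> None:
    # Imagine this is buried several calls below main().
    if what == "ni":
        raise SaySomethingError('Saying "ni" is forbidden')


def say(what: str, log: List[str]) -> None:
    with resource(log):
        deep_check(what)
        log.append(f"said {what}")


def run(what: str) -> List[str]:
    """Plays the role of the __main__ block: one place catches the error."""
    log: List[str] = []
    try:
        say(what, log)
    except SaySomethingError as e:
        log.append(f"Error: {e}")
    return log
```

run("hi") returns ["acquire", "said hi", "release"], while run("ni") returns ["acquire", "release", 'Error: Saying "ni" is forbidden']. The resource is released before the error message is produced, without any of the intermediate functions knowing about error handling.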

Packaging

A good idea when providing a bin script inside a package is to make it callable with Python's -m option. For example, instead of writing pip, I usually write python3 -m pip to be sure that the pip I'm running is indeed the one from my virtual env. As a bonus, you don't need the environment's bin directory in your $PATH. You'll find that most famous and not-so-famous packages provide both ways of calling their binaries.

In order to do that, you need to put your script into a __main__.py file. Let's do this (these are UNIX commands, but Windows users should have no trouble translating):

mkdir say_something
touch say_something/__init__.py
mv say_something.py say_something/__main__.py

Now you can call your script with python3 -m say_something.

Note: we're creating an empty __init__.py file to tell Python that this directory is a proper package.

At this point it clearly becomes a matter of preference, but my personal favorite tool for packaging has become Poetry because it is so simple. Let's create a basic pyproject.toml for our project.

[tool.poetry]
name = "say_something"
version = "0.1.0"
description = ""
authors = []
license = "WTFPL"

[tool.poetry.dependencies]
python = "^3.8"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Let's try this out to see if it worked:

poetry run python -m say_something

This is good, but where did the command go? Have no fear, Poetry lets you declare your "bin" commands pretty easily. The only thing is that it needs a function to call directly, including all the signal-handling shenanigans. Let's move it all into a separate function then:

#!/usr/bin/env python3
from argparse import ArgumentParser
from typing import Sequence, Optional, NamedTuple
from sys import exit, stderr
from signal import signal, SIGTERM


class Args(NamedTuple):
    what: str


class SaySomethingError(Exception):
    pass


def parse_args(argv: Optional[Sequence[str]] = None) -> Args:
    parser = ArgumentParser()
    parser.add_argument("-w", "--what", default="hello, world")

    return Args(**parser.parse_args(argv).__dict__)


def sigterm_handler(_, __):
    raise SystemExit(1)


def main(argv: Optional[Sequence[str]] = None):
    args = parse_args(argv)

    if args.what == "ni":
        raise SaySomethingError('Saying "ni" is forbidden')

    print(args.what)


def __main__():
    signal(SIGTERM, sigterm_handler)

    try:
        main()
    except KeyboardInterrupt:
        stderr.write("ok, bye\n")
        exit(1)
    except SaySomethingError as e:
        stderr.write(f"Error: {e}\n")
        exit(1)


if __name__ == "__main__":
    __main__()

Now we can add the following to the pyproject.toml file:

[tool.poetry.scripts]
say_something = "say_something.__main__:__main__"

And finally, let's try this out:

poetry run say_something -w "hello, poetry"

This way when someone installs your package they will have the say_something command available.

Conclusion

The pattern presented in this article is one that I use all the time. Weirdly, I never made a proper template for it, but now that I've written this article I know where to come and copy it from. I hope that you found it useful!