DEV Community

Dead Simple Python: Project Structure and Imports

Jason C. McDonald on January 15, 2019

Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press. The worst part of tutorials is alwa...


Sandor Dargo • Sep 28 '20

Great article, Jason

I cannot figure out a problem, maybe you can share your ideas.

I have the following structure:

.
├── cmake_project_creator
│   ├── dependency.py
│   ├── directory_factory.py
│   ├── directory.py
│   ├── include_directory.py
│   ├── __init__.py
│   ├── project_creator.py
│   ├── source_directory.py
│   └── test_directory.py
├── examples
│   ├── dual.json
│   ├── nested_dual.json
│   └── single.json
├── README.md
└── tests
    ├── __init__.py
    ├── test_dependency.py
    ├── test_directory_factory.py
    ├── test_directory.py
    ├── test_include_directory.py
    ├── test_project_creator.py
    ├── test_source_directory.py
    └── test_test_directory.py

The main entry point is cmake_project_creator/project_creator.py asking for a couple of parameters.

If I invoke it from PyCharm, everything is fine.
The tests, run with nosetests --with-coverage --cover-erase, pass as well. But if I try to invoke cmake_project_creator/project_creator.py from the terminal, this is what I get:

sdargo@mymachine (master) ~/personal/dev/project_creator $ python cmake_project_creator/project_creator.py -c
Traceback (most recent call last):
  File "cmake_project_creator/project_creator.py", line 6, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'

Do you have any idea what the issue could be?

Jason C. McDonald • Sep 28 '20

Absolutely. Your package needs a dedicated entry point for any imports off cmake_project_creator to work.

Add __main__.py to cmake_project_creator/. Your __main__.py file should look something like this:

from . import project_creator

if __name__ == "__main__":
    project_creator.WHATEVER_YOUR_INITIAL_FUNCTION_IS

Then, you can invoke the package directly with...

python3 -m cmake_project_creator
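The difference between the two invocations comes down to where Python starts looking for top-level packages. Here is a minimal diagnostic sketch; the function name first_import_location is made up for illustration, and the paths in the comments assume the directory layout from the question above.

```python
import sys


def first_import_location():
    """Return the first directory Python searches for top-level imports."""
    # Run as `python cmake_project_creator/project_creator.py`, this is the
    # script's own directory (.../project_creator/cmake_project_creator),
    # which does not *contain* a cmake_project_creator package.
    # Run as `python -m cmake_project_creator` from the repository root,
    # it is the current directory (.../project_creator), which does.
    return sys.path[0] if sys.path else None
```

Calling this temporarily at the top of project_creator.py under both invocations should make the ModuleNotFoundError easier to reason about.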

Sandor Dargo • Sep 28 '20

Thanks a lot, Jason! This partly solved my problem.

Now I can run, for example, python3 -m cmake_project_creator -c, where -c is a parameter, and it works like a charm. But after adding the correct shebang and execution rights, I still cannot simply run ./cmake_project_creator/project_creator.py -c, as I get the same failure:

Traceback (most recent call last):
  File "./cmake_project_creator/project_creator.py", line 8, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'

Do I really have to manipulate sys.path for that?

Jason C. McDonald • Sep 28 '20 • Edited

As a rule, never ever ever ever EVER manipulate sys.path to solve Python import issues. It has some pretty serious and nasty side-effects that can break other programs.

You shouldn't invoke modules within a package like this. Instead, I'd recommend adding command-line argument support to your __main__.py, via argparse.

With __main__.py becoming the dedicated entry point, you should update it further to have a dedicated main() function, like this:

def main():
    # Do whatever....
    pass

if __name__ == '__main__':
    main()

The sole entry point to your package should be python3 -m cmake_project_creator, or an entry point script that invokes cmake_project_creator.main()
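As a rough sketch of what such a __main__.py could look like with argparse: the -c flag comes from the thread, but its meaning is never stated, so it is modelled here as a plain on/off switch. That choice, along with the build_parser helper, is an assumption made for illustration.

```python
import argparse


def build_parser():
    """Create the command-line parser for the package."""
    parser = argparse.ArgumentParser(prog="cmake_project_creator")
    # Assumption: "-c" is treated as a simple boolean flag.
    parser.add_argument("-c", action="store_true", help="hypothetical flag")
    return parser


def main(argv=None):
    """Entry point. With argv=None, argparse reads sys.argv[1:] itself."""
    args = build_parser().parse_args(argv)
    # Real work would start here, e.g. by calling into project_creator
    # and passing along args.c.
    return 0


# In __main__.py the file would end with the usual guard:
#     if __name__ == "__main__":
#         raise SystemExit(main())
```

Letting main() accept an optional argument list keeps it callable from python3 -m cmake_project_creator, from a wrapper script, and from tests alike.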

Sandor Dargo • Sep 28 '20

OK, thanks. Yes, I've already been using argparse to get the command-line arguments.
So one option is to use the -m option. The other way I managed to make it work is to add the repo root to the PYTHONPATH, which could be done by a setup.py, and it would most probably be OK to have it in a virtualenv.

Thanks once more!

Jason C. McDonald • Sep 28 '20

Well, like I said, changing the path is always wrong. Yes, even in a virtualenv, especially since you can't guarantee that it'll always be run in one by another user. So, you only have one option, being the one I described. But, shrug, I've said my piece.

Sandor Dargo • Sep 29 '20 • Edited

I got your point, and at the same time, in general, I don't believe in "having only one option". My problem with invoking a product with -m is twofold. One, it's not at all user-friendly. Two, it leaks an abstraction: the fact that the product is implemented as a module with that name.

Following your recommendation not to change any path variable, I found two ways to overcome this.
1) I wrap python3 -m cmake_project_creator in a shell script. That way users don't have to bother with -m, or even with prepending python3 to the module or script name. On the other hand, it's not very portable (what about Windows users, for example?), which might or might not be acceptable. In my case, it would be.
2) I managed to invoke the module with runpy.run_module("cmake_project_creator", run_name='__main__') from another Python script which, given a correct shebang, I can simply call as ./run.py <args>. To me this seems ideal: I keep the invocation (from a user's perspective) as simple and as portable as possible, and I encapsulate both the module name and the fact that the product is implemented as a module.
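The second option could be sketched roughly as follows. The launch helper and the argument handling around it are assumptions made for illustration; only the runpy.run_module call with run_name='__main__' comes from the comment above.

```python
import runpy
import sys


def launch(package, argv):
    """Run `package` roughly as `python3 -m package <argv>` would."""
    saved_argv = sys.argv
    # Present the arguments to the package the way -m would.
    sys.argv = [package, *argv]
    try:
        # alter_sys=True lets runpy set sys.argv[0] to the package's
        # __main__.py while it runs.
        return runpy.run_module(package, run_name="__main__", alter_sys=True)
    finally:
        sys.argv = saved_argv


# A run.py next to the package directory would then only need:
#     #!/usr/bin/env python3
#     launch("cmake_project_creator", sys.argv[1:])
```

Because run.py sits in the repository root, that directory is already the first import location when the script is executed, so the package is found without touching the path.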

PS: The product is going to be completely free. With the word "product" I only want to emphasize that it's meant to be used by people who might not even know what python -m is, or Python at all.

Jason C. McDonald • Sep 29 '20

That's why you have an entry point script, or even several, as I alluded to earlier. You can use your setup.py to provide those, and those scripts can even be named such as you describe. But editing the Python path is still always wrong, for technical reasons.

Python quite often is meant to have only one right way of doing something. The language is built that way.

As I haven't yet been able to write the article on setup.py, please read this article by Chris Warrick: chriswarrick.com/blog/2014/09/15/p...
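For orientation, setuptools declares entry point scripts as "name = package.module:function" strings under the console_scripts key of entry_points. The sketch below only illustrates how such a string maps to a callable; the helper name load_entry_point and the script name cmake-project-creator are hypothetical, and the real generated script does more than this.

```python
import importlib


def load_entry_point(spec):
    """Resolve a 'package.module:function' string to the function object."""
    module_name, _, func_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# In setup.py, the declaration might look like this (names are assumptions):
#     setup(
#         ...,
#         entry_points={
#             "console_scripts": [
#                 "cmake-project-creator = cmake_project_creator.__main__:main",
#             ],
#         },
#     )
```

After installation, the named command is placed on the user's PATH and calls main(), so nobody has to type -m or know the package name.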

Sandor Dargo • Sep 30 '20

Thanks for the recommendation. I'm definitely going to read it as that's pretty much my next thing to do, understand what I need to put in the setup.py. Thanks again!

Ashley Hoff • Sep 19 '19 • Edited

Hi!
Firstly great article. This has been one of the clearest examples of how it should be done. Thanks.

I am not sure whether this is an edge case, but I have a structure that looks like this:

generateandsend/
├── __init__.py
├── __main__.py
│
├── generatedata/
│   ├── __init__.py
│   └── generate_data.py
│ 
├── senddata/
│   ├── __init__.py
│   └── send_data.py
│ 
├── utilities/
│   ├── __init__.py
│   └── local_utilities.py
│ 
├── Readme.md
└── License.md

Both generate_data.py and send_data.py reference functions in local_utilities.py.

The issue I have is that, more often than not, I would be calling send_data.py or generate_data.py directly.

I know that if I call either of them specifically, I will need to add a reference to be able to import local_utilities.

Does this go against generally accepted practice? Would it be better to either separate them into different projects (I would like to keep all the code together) or use an argparser in __main__ and call the respective module using args?

Thanks
Ashley

Jason C. McDonald • Sep 24 '19 • Edited

Hi Ashley,

Sorry for the tremendous delay in reply. So, just to be clear, you're wanting to be able to call generate_data.py and send_data.py directly, and those are supposed to be able to import a module from elsewhere in the project?

If so, I would actually consider why you want to execute those modules directly. If you're simply wanting to be able to execute the two separately from the command-line, it may be worth fleshing out __main__.py to accept a command-line argument, so python3 -m generateandsend send or python3 -m generateandsend generate will execute what you want. That'll also be the easiest solution. That way, you're always executing the top-level package (generateandsend)
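A minimal sketch of that idea, using argparse subcommands in generateandsend/__main__.py. The generate() and send() functions are placeholders standing in for whatever generate_data.py and send_data.py actually expose; their names and return values are assumptions.

```python
import argparse


def generate():
    # Placeholder: would call into generatedata.generate_data.
    return "generate"


def send():
    # Placeholder: would call into senddata.send_data.
    return "send"


def build_parser():
    parser = argparse.ArgumentParser(prog="generateandsend")
    # required=True makes a bare `python3 -m generateandsend` an error
    # instead of silently doing nothing.
    commands = parser.add_subparsers(dest="command", required=True)
    commands.add_parser("generate").set_defaults(func=generate)
    commands.add_parser("send").set_defaults(func=send)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Each subcommand stored its handler under args.func.
    return args.func()
```

With this in place, python3 -m generateandsend send dispatches to send(), and both modules can import utilities through the top-level package.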

In fact, I'm not entirely sure off the top of my head how to get multiple projects to talk to one another within a shared directory! I know it has to do with PYTHONPATH, but I think that will necessitate more research on my part. ;)

Ashley Hoff • Sep 24 '19

Thanks for replying (& no problem on the delay - we all have a life to live!).

I have thought about this more and agree: why would I want to call them separately when a parameter will suffice? So, I have abandoned the idea and gone with the python3 -m generateandsend generate approach.

Cheers for the reply though. Appreciated.

Ashley Hoff • Sep 20 '19

I also have one more question - if I wanted to include an ini/configuration file in a resources folder, how would I import it?

Thanks

Jason C. McDonald • Sep 24 '19 • Edited

I like to put all such non-code files in a project subdirectory (not a package) called resources, and then use the pkg_resources package (which ships with setuptools) to access it.

For example, in my omission project, the module omission/game/content_loader.py needs to load the text file omission/resources/content/content.txt. I do that with...

import os

import pkg_resources

class ContentLoader(object):

    def __init__(self):
        """
        Open the file and load the contents in.
        """

        # ...

        # Resolve the file relative to this module's package.
        path = pkg_resources.resource_filename(
            __name__,
            os.path.join(os.pardir, 'resources', 'content', 'content.txt')
        )

        with open(path, 'rt', encoding='utf-8') as content_file:
            raw_content = content_file.read()

        # ...

Simple as that!
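Since this was written, setuptools has deprecated pkg_resources in favor of the standard library's importlib.resources. The following is a sketch of the same lookup with that module, not the method used in omission itself; it assumes Python 3.9 or newer, and the helper name read_text_resource is made up for illustration.

```python
from importlib import resources


def read_text_resource(package, *parts, encoding="utf-8"):
    """Return the text of a file shipped inside an importable package."""
    resource = resources.files(package)
    # Walk down one path component at a time; no os.path.join or
    # os.pardir is needed because the lookup starts at the package root.
    for part in parts:
        resource = resource / part
    return resource.read_text(encoding=encoding)


# For the layout described above, the call would presumably be:
#     read_text_resource("omission", "resources", "content", "content.txt")
```

Starting from the top-level package name rather than __name__ means the path does not depend on which submodule does the loading.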

P.S. If you find yourself needing to access files outside of your project directory, say, in the user's home directory, I recommend the package appdirs.

Ashley Hoff • Sep 24 '19

Again, thanks for the reply.

I've had a play with this one. Considering I am dealing with an ini file, it appears that configparser does what I want. This is the snippet I've come up with:

import configparser

def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    config.read('generateandsend/Resources/generateandsend.ini')
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

Is hard-coding the relative path in that way frowned upon?

Jason C. McDonald • Sep 24 '19 • Edited

Is hard-coding the relative path in that way frowned upon?

Most certainly, especially because you have to account for differences in path format between operating systems.

I'd recommend incorporating pkg_resources into your approach above.

import configparser
import os

import pkg_resources

def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    # Build the path relative to this module's package, not the working directory.
    path = pkg_resources.resource_filename(
        __name__,
        os.path.join(os.pardir, 'Resources', 'generateandsend.ini')
    )
    config.read(path)
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

I believe that will work? You'll have to check how config.read() handles an absolute path.
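On the open question: ConfigParser.read() accepts absolute paths (including pathlib.Path objects), but it silently skips files it cannot open and returns the list of files it actually parsed. A small sketch that makes a wrong path fail loudly; the helper names load_config and split_list are assumptions, and the section and key names follow the snippet above.

```python
import configparser
from pathlib import Path


def split_list(value):
    """Turn 'a, b, c' into ['a', 'b', 'c']."""
    return [item.strip() for item in value.split(",")]


def load_config(ini_path):
    """Parse ini_path, raising instead of returning an empty config."""
    config = configparser.ConfigParser(
        converters={"list": split_list}, allow_no_value=True
    )
    # read() returns the files it managed to parse; an empty list
    # means the path was wrong or unreadable.
    if not config.read(ini_path, encoding="utf-8"):
        raise FileNotFoundError(ini_path)
    return config


# One way to build the path relative to the calling module:
#     ini = Path(__file__).resolve().parent / "Resources" / "generateandsend.ini"
#     string_a = load_config(ini)["test"].get("StringA")
```

The converters argument also adds a getlist() method to each section, so comma-separated values can be read directly as lists.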

Ashley Hoff • Sep 24 '19

Beautiful. Thanks. I had to massage it a little and remove os.pardir, as it was giving me a false directory on my Windows machine (C:\tmp\generateandsend\..\Resources\generateandsend.ini).

The resultant path variable now looks like:

path = pkg_resources.resource_filename(__name__, os.path.join('Resources', 'dummy.ini'))

I just need to test this on my Linux box

Cheers again. Send the bill to...... 😉

mkaut • Dec 11 '19

Great introduction.
I have one question: you have tests inside the project directory, while this guide places both docs and tests into the git root. Are there any up- or down-sides to either of the choices?

Jason C. McDonald • Dec 11 '19

My method just makes the imports a lot easier. You'll notice that the guide you linked to requires some complex imports for the tests to work, whereas my approach requires nothing of the sort, since tests are part of the module.

I suppose if you absolutely don't want to ship tests as part of your finished product, that might justify the other approach. That said, I prefer to always ship tests in the project; it makes debugging on another system a lot more feasible.

mkaut • Dec 16 '19

Good point, thanks.

So, in your approach, how do you import, let's say game_item.py from test_game_item.py?
And does it then have to be run from a specific folder (omission-git, omission-git/omission/, or omission-git/omission/tests) or does it work from all the above?

Jason C. McDonald • Dec 16 '19

Within omission/tests/test_game_item.py, I would import that other module via...

import omission.game.game_item

I always run python -m omission or pytest omission from within omission-git.

rhymes • Jan 15 '19

Hi Jason, nice article!

Just a question: I've noticed you didn't talk about namespace packages. Is it because it might be outside the scope of a "dead simple" intro?

I'm mentioning it because I believe they are a simpler concept for a new developer, as in: folders are packages; if you need initialization code for such a package, add an __init__.py, otherwise you can totally ignore the file. I'm oversimplifying here, of course.

Thank you!

Jason C. McDonald • Jan 15 '19

That was something I actually didn't know about. Thanks for the link! It is probably more advanced than I want to go in the article series, but thanks for parking it in a comment anyhow. I'll look at this again later, and see if it might be worth adding to the guide after all. Thank you!

rhymes • Jan 15 '19

An example:

➜ tree
.
└── smart_door
    └── open.py

1 directory, 1 file

➜  cat smart_door/open.py
print("I have opened")

➜  python
Python 3.7.2 (default, Jan 13 2019, 22:54:07)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from smart_door import open
I have opened

You can read more about it here.

Grzegorz Krug • Sep 2 '19 • Edited

My tree example:

---src
init.py
main.py
------game
---------cards98.py

------reinforced
---------rl_agent.py

------supervised

Readme.md
License.md

I cannot reach the parent module from rl_agent.py.

I added some __init__.py files, but that does not solve it.
I have tried:
from game.cards98 import GameCards98
from src.game.cards98 import GameCards98

And all I get is ModuleNotFoundError: No module named 'src'
This works fine in PyCharm, but not in IDLE :/

Jason C. McDonald • Sep 2 '19 • Edited

This structure should work:

src/
├── __init__.py
├── __main__.py
│
├── game/
│   ├── __init__.py
│   └── cards98.py
│ 
├── reinforced/
│   ├── __init__.py
│   └── rl_agent.py
│ 
├── supervised/
│   └── __init__.py
│ 
├── Readme.md
└── License.md

You need __init__.py under each directory that you want to use as a package.

Then, from rl_agent.py, you should be able to use this import:

from src.game.cards98 import GameCards98

Grzegorz Krug • Sep 2 '19 • Edited

I know it should work, but it does not. I've got __init__.py everywhere and __main__.py at the top level.
Do I need to run it with the -m param? I am definitely missing something. I was running scripts from the top level to combine modules, but it can get messy sometimes :P

This is my repo: Github Cards98

Jason C. McDonald • Sep 2 '19

It shouldn't be messy. But, yes, you'd need to invoke your top-level package (not your top-level script.)

python3 -m src

By the by, I recommend renaming src to your project name, cards98, and then renaming the subpackage by the same name to something like game.

Grzegorz Krug • Sep 2 '19

Yes, now it is working: python -m cards98
Well... but only when invoking the top level in a console. It does not work for normal execution, like double-clicking __main__.py with the mouse.
This also makes debugging and testing harder, because I always have to change things in __main__.py. Where can I use it? I think it just complicates everything.

Thanks for help in understanding this

Jason C. McDonald • Sep 2 '19 • Edited

This is, to my knowledge, the official (and only) way to structure a Python project. There are two ways to create an executable file to start everything.

Option 1: Native Script

Many Python projects offer a Bash script (on UNIX-like systems) or a Windows .bat file that will run the python3 -m cards98 command. I see this a lot.

Option 2: Python Script

This is the method I'd recommend, as it's the most portable.

Outside of your top-level package, you can write a separate Python script, such as run.py or cards98.py, and then use that to execute your main function.

For example, in cards98/__main__.py, you can put this...

def main():
    # The logic for starting your application.
    pass

if __name__ == "__main__":
    main()

And then, outside of the cards98 package, create the file cards98.py, with the following:

#!/usr/bin/env python3
from cards98.__main__ import main
main()

To start your Python application, just double-click cards98.py.

P.S. Thanks for bringing up this situation. I realized I never addressed it in the book!

Kyle R. Conway • Jan 15 '19

Thank you so much for this article. Hard to overstate how helpful this is for someone who feels relatively competent at the language but completely inexperienced at building something sane looking or structured appropriately.

Nikita Sobolev • Jan 16 '19 • Edited

Thanks for this article! It is very useful for beginners.

I would like to suggest mentioning wemake-python-styleguide in one of the future articles. In my practice, it is very helpful for beginners, since it enforces insanely strict rules for structuring and cleaning up your code. That's what stimulates learning progress!

Anyway, great series. Waiting for the next articles.

Marc Hanisch • Jul 25 '19 • Edited

Such a great article, thank you very much. I've just dived into Python, having used multiple languages before. These are exactly the explanations Python newcomers need to get a better understanding of how things work in Python.

Tony • May 9 '19

Nice article!

While this is probably beyond the scope of this article, one useful addition for those that need to create packages frequently would be to look into using cookiecutter. It lets you create a "package template". While these templates can be simple, they can also include support for many dev tools such as docker, travis-ci, sphinx, doctests (via pytest/nose/etc), etc.

Once the cookiecutter template is ready, you run a quick wizard and it generates the project directory/files for you. There are also a bunch of templates already available, some of which are specialized for specific tasks (such as data analysis).

For more info:
cookiecutter.readthedocs.io/en/lat...
github.com/audreyr/cookiecutter

Johann Krauter • Jan 24 '22

Hi all,

I'm seeing some "cosmetic" buggy behaviour when importing type hints in my docstrings. I'm using Sphinx with the intersphinx extension to build documentation based on the type hints and docstrings of my code.
By using the "intersphinx" extension and the intersphinx_mapping setting, you can link to the Python, NumPy, and Matplotlib documentation from your own docs.

In the screenshot you see: the first parameter is of type np.ndarray ("import numpy as np") and has no reference. The second parameter has a reference to the Python documentation.

When I import NumPy as "import numpy" and write numpy.ndarray in the docstring, I get the documentation reference in the built HTML docs.

Can somebody explain why it does not work with the np.ndarray type hint?
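For reference, the intersphinx setup described above usually looks something like the conf.py excerpt below. This is a sketch of a typical configuration, not the poster's actual file. One plausible reading of the behaviour is that the downloaded inventories index objects by their full names, such as numpy.ndarray, so an alias like np.ndarray has no entry to match; that explanation is an assumption, not something confirmed in the thread.

```python
# conf.py (excerpt)

# Enable cross-project linking.
extensions = [
    "sphinx.ext.intersphinx",
]

# Each entry maps a short project label to the base URL of its built docs;
# None tells Sphinx to fetch objects.inv from that URL.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
    "matplotlib": ("https://matplotlib.org/stable/", None),
}
```

Writing the fully qualified numpy.ndarray in docstrings, as the poster already found, sidesteps the alias question entirely.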

Benjamin Ewert • May 28 '19

Thanks a lot for your article. This was exactly what I was looking for, since this topic is skipped by basically everyone else...

You said that the topic of importing app.py instead of using main.py is out of scope... Do you plan on writing something that picks this topic up?

I'm quite interested in the reasoning behind this approach. Do you have any reference, by chance?

Jason C. McDonald • May 28 '19

I don't think there are any formal guidelines on the topic, to be honest. My use of app.py has a lot to do with separation of concerns; I put my GUI startup code in app.py, and my non-GUI startup code in __main__.py. I can't really point to something that says this is "right" or "wrong"...it just works out pretty well for my project. It's something that has to be considered on a project-by-project basis, really.

Jay Ta'ala • Feb 5 '20 • Edited

Finally, a straightforward description that is really well explained. As a Java developer getting into Python, I've always been frustrated when asking Python devs where I am about a solid source code structure for Python projects, one that makes sense and won't lead to the apparent mess I've seen in lots of Python code. (They usually just say I'm being too much of an uptight "Java" dev... and that I should just create a .py file and start "hacking" away.)

Cheers,

Jay.

Jason C. McDonald • Feb 5 '20 • Edited

I certainly apologize on behalf of the Python community for your being treated like that! "Just start hacking" is, I believe, what someone says when they really don't know the answer, and are afraid you're going to find them out.

In my experience, the #python Freenode IRC room isn't like that most of the time. We have frequent conversations about proper Python project structure; most of this article came out of those conversations. Of course, as with any community, it depends on which people you encounter, but I would recommend checking that room out in general.

Christian Brintnall • Jan 16 '19

Nice article, you dropped a _ in this snippet:

import smart door
smart_door.open()
smart_door.close()

Just so you know!

Jason C. McDonald • Jan 16 '19

Thanks for catching that! I just went back and fixed it. :)

Luis Hernández • Jan 15 '19

That's an excellent explanation. I can understand the project structure better with this.

Thanks for your article, Jason. Looking forward to reading the following ones :D

Felix Coutinho • May 19 '23

Nice Article @codemouse92

Adriano Machado • Jul 8 '19

Nice article, Jason. Is the code shown in this article available somewhere? I'd like to check some of the inner details.

Jason C. McDonald • Jul 8 '19

Here you are, although I'll warn you that it's being heavily restructured at the moment.

mousepawmedia / omission

A game with a deceptively difficult premise: find the missing letter.

Omission

A game where you find what letter has been removed from a passage.

Content Notes

The content for this game has been derived from "Bartlett's Familiar Quotations".

For brevity, the source of each quote has been omitted - both title and author.

Sections previously italicized have been replaced with CAPS for easier display.

Passages have been trimmed and rearranged to be no more than 4-5 lines.

Authors

Jason C. McDonald
Jarek G. Thomas
Anne McDonald (Content)
Jane McArthur (Content)
tshirtman (omission.spec)

Thanks to the following:

tshirtman (Freenode/#kivy): Pyinstaller help
pabs (OFTC/#packaging): Debian packaging help

Dependencies

Python3
Kivy >= 1.10
appdirs >= 1.4.3

Installing

To install from source, see BUILDING.md.

Contributions

We do NOT accept pull requests through GitHub. If you would like to contribute code, please read our Contribution Guide.

All contributions are licensed to us under the MousePaw Media Terms of Development.

License

Omission is licensed…


Guillermo Chussir • Dec 8 '19

Excellent guide! Thanks!

Feruz Oripov • Jan 16 '19

Hi Jason, when will you publish the next part of the tutorial?

Jason C. McDonald • Jan 16 '19 • Edited

I don't have a specific schedule in mind, and I'm balancing a few things, but I hope to have the next one published later this week, or early next.

That said, I cannot promise any particular timeline beyond that. It all depends on how some other pieces of my life go. It might be really quick, or I might only post every other week. Can't say for sure.

Given how popular this series is, though, it's a very high priority of mine to update and finish.

Ezichi Ebere Ezichi • Jan 16 '19

Nice article!!!

Marlon Ugocioni Marcello • Jan 16 '19

This series has been awesome Jason, thanks!

Crocodile Forest • Jan 17 '19

I've just started learning to code. I love this series. Thank you for making this.

adir abargil • Oct 30 '19

I wonder where to place the venv directory in this project structure?

Jason C. McDonald • Oct 30 '19 • Edited

venv always belongs at the top-most level of the repository (make sure you untrack it via .gitignore!), or else outside of the repository altogether.