Dependency management

There are many dependency management tools for python: conda, poetry, pipenv, tox, etc. Each of them has its pros and cons, fans and opponents. After spending an awful lot of time researching them, resolving conflicts in the environments, duckduckgoing many cryptic errors, I feel that in the long run they are not really worth it. Currently, my favorite setup is the build-ins: pip + venv. For handling multiple versions of Python on the same machine, I found pyenv to work great.

The only time when pip goes wrong is when you use the wrong pip, and for example, pip install a package to a different virtual environment or use pip when you should have used pip3. The solution is trivial, just always call python -m pip instead of just pip, so that you use the “pip for the python interpreter I’m currently using”. You can also add an alias pip='python -m pip' and never worry about it again.

For virtual environments, venv “just works”, is lightweight and doesn’t need any additional dependencies, commands, or special formats for the configurations.

To make the environment variables consistent, .env files are great. .env has a trivial format. There are many tools to auto-load the files like python-dotenv and environs that can validate the settings as well.

For the best reliability and portability, I use docker.

Automation

There are many tools for automating repeatable tasks, like running tests, and deployments. I stick to Makefiles and Bash. Not that they are the best, but they are the most widely known and probably pre-installed on your machine. Makefile has its quirks as it was designed for compiling C code, so projects like just try to skip them by leaving only the good parts, but it’s not that mature yet, so I’m still hesitating.

Linting and formatting

There are many holy wars about linters and formatters. I don’t like wasting time on discussing formatting during code review, so I like to just black everything. Together with black, I use isort that nicely sorts the imports, that has black compatibility mode. My usual pyproject.toml config is something like:

[tool.black]
line-length = 120
skip-string-normalization = true
target-version = ['py37', 'py38', 'py39']
include = '\.pyi?$'
exclude = '''
/(
    \.git
  | \.mypy_cache
  | \.tox
  | \.venv
  | build
  | data
)/
'''

[tool.isort]
profile = "black"
line_length = 120

While I agree that linters are very helpful as an additional code validation layer, I don’t like linters that need overtly complex configuration. For me, flake8 is the sweet spot, as it mostly works out-of-the-box. Additionally, flake8 uses mccabe to check for code complexity issues. To configure it, I usually ignore the formatting warnings in setup.cfg, as sometimes they conflict with black.

[flake8]
max_line_length = 120
max_complexity = 10
ignore =
    # Formatting (fixed by black)
    E1,E2,E3,
    # line too long
    E501,
    # Whitespace warning
    W2,
    # Line break warning
    W5

Testing

Pytest is currently the gold standard when it comes to testing Python code, so there is probably no reason to use anything else. That doesn’t mean you should ignore the good old unittest, for example, it has great unittest.mock module. It is worth mentioning that Pytest has many good plugins, for example, it can run doctest tests, and pytest-cov produces test coverage reports. When running it from command-line, there are two useful flags: -x for failing fast at the first failed test, and --ff for running the tests that failed at the previous run before other tests, so you immediately know if you fixed the issue. The whole command then becomes:

python -m pytest -x --ff -v --color=yes --doctest-modules .

By the way, you should also consider always calling it with python -m pytest.

I also found mypy pretty useful for finding issues with the code (“this function should not return None”, “you assumed np.array but pass pd.Series”, etc), though I agree that it can sometimes be too picky.

There are also other code testing tools that I really like but haven’t chance to use yet: behave for Behavior Driven Development and semgrep that lets you write custom pattern-based tests for the code.

Other code issues

There are two more developer tools, that I found useful and I’d like to mention: pydeps finds and visualizes the dependencies between the modules in your code and is very helpful for tracking the parts of code that are too tightly coupled, and vulture can help with finding dead code.

Ready template

The template for this setup can be found here: https://github.com/twolodzko/base-python-project.