The unglamorous side of engineering: Testing
I recently had the pleasure of presenting a talk at PyBay 2023 in San Francisco. PyBay is the largest Python regional conference in the San Francisco Bay area. It also serves as the primary fundraiser for the Bay Area Python Association (BAPyA), a volunteer-run organization dedicated to building a stronger Python developer community. It was my second time speaking at the conference, and an even more welcoming and warm experience than my first.
While the topic of my ten-minute talk was pretty rudimentary, I wasn’t surprised by some of the more complex conversations it inspired; that was my intention. Entitled “Testing Strategies for Python,” the talk itself covered the basic concepts of software testing and demonstrated a couple of features of unittest’s TestCase class, but what I really hoped attendees would walk away with was a new approach to an old problem: testing as a boring chore.
As an engineer, I worked on a team responsible for improving the quality of code throughout the organization. We wanted to reduce the number of developer-introduced bugs that were causing incidents and costing the company money by building out a testing-as-a-service platform that enabled developers to more easily perform systems integration tests. What we discovered is that developers wanted to do the right thing, but they didn’t always know how to, where to start, or have the right tools. As a result, testing was an afterthought instead of a practice.
Besides the quantitative benefits of software testing, such as bug prevention, lower development costs, and improved performance, its most compelling benefit is better engineers. Testing forces us to ask ourselves, “What exactly is the expected behavior of this method or application?” When software becomes “difficult to test,” that is usually a good indicator of code smells and an opportunity to refactor a method or even the entire design of a system.
There are lots of different kinds of testing we can explore to answer lots of different kinds of questions, but the easiest win for beginner and advanced Python developers alike is unit testing, so it was a good place to start for my talk. (Which I hope to continue to expand upon!)
Testing different inputs with one setup: self.subTest()
(Quick note: The rest of this post assumes a basic familiarity with unit testing and Python’s built-in unit testing framework, unittest.)
After giving my talk at PyBay, I was delighted to learn from attendees that I had introduced them to a unittest tool: subTest(). I don’t think I will ever outgrow my impostor syndrome, so it is always validating when I can offer something useful to people I consider far more knowledgeable than I am. Because I received such a positive response, I thought I would use this as an opportunity to take a closer look at subTest() in unittest.
As of Python 3.4, unittest’s TestCase class is equipped with a context manager called subTest(). According to the documentation, “When there are very small differences among your tests, for instance some parameters, unittest allows you to distinguish them inside the body of a test method using the subTest() context manager.”
In other words, subTest() enables parameterization with a single test setup. This is appropriate for cases where you want to test multiple inputs against one resource, and it is especially useful when that resource or setup is expensive, such as a database query. Moreover, with subTest(), a failed assertion for one input does not halt the rest of the test method: the failure is recorded, and the remaining subtests still run.
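To make that concrete, here is a minimal, self-contained example (the TestEvenNumbers class and its values are invented for illustration and have nothing to do with the tutorial repo):

import unittest


class TestEvenNumbers(unittest.TestCase):
    def test_is_even(self):
        # Each value gets its own subtest; the failure for 3 is recorded,
        # but the loop still continues on and checks 4.
        for value in (0, 2, 3, 4):
            with self.subTest(value=value):
                self.assertEqual(value % 2, 0)


if __name__ == "__main__":
    unittest.main()

Running this reports the single failing subtest (value=3) while the other values still pass.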
If the advantages of subTest() are still a little unclear to you, don’t worry! The following tutorial hopes to provide further clarity.
Get your hands dirty with subTest()
A little bit of context and setup
You can find the code for this tutorial in this repo on the subtest-tutorial branch. While the repo contains a Flask application, the full web application is not necessary for this tutorial. We are only concerned with the Pug class in pug.py and, in particular, its check_for_puppy_dinner method. The accompanying tests are in test_pug.py. Feel free to comment out the imports for OpenAI and Python-dotenv.
The check_for_puppy_dinner method (inspired by Puppy Songs) gets the current time and compares it with the puppy dinner time supplied to an instance of the Pug class. If the current time equals puppy dinner time, the method returns a happy message; if the times do not match, it returns a sad message.
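The repo has the real implementation; purely as an illustration of the shape of that method, a sketch might look like the following. (The message strings, the "HH:MM" string format for puppy_dinner, and the staticmethod signature are my assumptions, chosen to line up with how the test below calls Pug.check_for_puppy_dinner().)

import datetime


class Pug:
    def __init__(self, name, puppy_dinner):
        self.name = name
        # Assumed here to be an "HH:MM" string, e.g. "17:00"
        self.puppy_dinner = puppy_dinner

    @staticmethod
    def check_for_puppy_dinner(puppy_dinner):
        # Compare the current time against puppy dinner time
        current_time = datetime.datetime.now().strftime("%H:%M")
        if current_time == puppy_dinner:
            return "It is time for puppy dinner! Yay!"
        return "It is not yet time for puppy dinner. :("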
In the accompanying test file, there are two versions of the test for check_for_puppy_dinner(): one version uses subTest() and the other does not. In both cases, there is a commented-out set of test data containing an intentional test failure. (I have done my best to indicate all of this in the comments.) Here is the loop at the heart of the subTest() version:
for data in test_data:
    with self.subTest(msg=data['case']):
        # Set return_value for mocked datetime.now() method
        mock_datetime.datetime.now.return_value = data['time']
        # Perform the test
        test_result = Pug.check_for_puppy_dinner(self.test_pug.puppy_dinner)
        # Assert the test result matches the expected result
        self.assertEqual(test_result, data['expected_result'],
                         msg="Test for check_for_puppy_dinner failed")
For this tutorial, the screenshots are taken from Visual Studio Code’s Testing interface; however, you are encouraged to use whatever tools you are already familiar with. For help configuring the Python plugin to run your tests in VS Code, check out the documentation.
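If you prefer the terminal, the standard unittest runner works just as well; when a subtest fails, each failing case is reported separately, tagged with the msg you passed to subTest(). A small sketch, assuming the tests live in a module named test_pug on the current path:

import unittest

# Load the tests from test_pug.py and run them with verbose output;
# this is equivalent to running "python -m unittest -v test_pug" in a shell.
suite = unittest.defaultTestLoader.loadTestsFromName("test_pug")
unittest.TextTestRunner(verbosity=2).run(suite)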
Run the tests
(For even more human readability, it might be easier to configure tests to View as Tree, which you can select by clicking the three dots in the upper right-hand corner of the Testing interface.)
- Run the tests by clicking the Run Test arrow icon for test_check_for_puppy_dinner(). The tests should pass.
- Now run the tests for test_check_for_puppy_dinner_without_subtest(). They should also pass.
While these two tests are essentially the same, the test using subTest() nests the different test cases beneath it and displays the different subtest messages.
This can be helpful when it comes to discerning test results, but the real benefit of subTest() is in the way it handles test failures.
Fail the tests
- In both test_check_for_puppy_dinner() and test_check_for_puppy_dinner_without_subtest(), uncomment the test data at the TODOs and comment out the original test data.
- Run both tests again. They should fail.
If you compare how the two tests handle the failure, you should see that even though test_check_for_puppy_dinner() shows a failure, it also indicates a passing test nested beneath it. Because the test uses the subTest() context manager, the failing case was recorded and the test still continued with the next set of data in the test data list; the data for which the test passed is indicated with a green checkmark.
When it comes to test_check_for_puppy_dinner_without_subtest(), however, the test simply indicates a failure with no further details.
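For contrast, the body of test_check_for_puppy_dinner_without_subtest() presumably runs the same loop without the context manager, something like the sketch below (the repo has the exact code). Here the first failing assertEqual raises immediately, so the remaining cases never run and never get reported:

for data in test_data:
    # No subTest() wrapper: the first failed assertion raises AssertionError,
    # ends the test method, and leaves the remaining cases unreported.
    mock_datetime.datetime.now.return_value = data['time']
    test_result = Pug.check_for_puppy_dinner(self.test_pug.puppy_dinner)
    self.assertEqual(test_result, data['expected_result'],
                     msg="Test for check_for_puppy_dinner failed")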
But what are we really testing?
While subTest() offers the practical benefits of more efficient test setup and more specific test results, I think there is a more conceptual advantage to subTest(). As Paul Ganssle explains in his excellent blog post on subtesting, “One thing that subtests do very well is to let you keep to the spirit of ‘one assertion per test’ when you'd like to explore multiple properties of the state of a system.” In other words, subTest() can be one way to respond to, “What exactly is the expected behavior of this method or application?”
In the case of check_for_puppy_dinner(), the expected behavior is to compare two times and return a message based on that evaluation. With subtesting, we are making one assertion: given two times to compare (current time and puppy dinner time), we should receive a specific outcome.
Have you used subTest() in your testing? Feel free to share examples below!
Happy testing!