Thursday, April 18, 2013

Unittesting Without Patching

Python has the power to override any attribute on any module or class, but just because you can doesn't mean you should. This is true in regular code, but just as true of unittests. Many testing libraries (mock, Twisted's trial, py.test) provide facilities for overriding some piece of global state; you can also do so manually. Occasionally these facilities prove invaluable, but often they are used unnecessarily. Better alternatives are available.

Before I explain why patching is problematic, let's look at an example. Consider the following module:

import os

def exit_with_result(function):
    result = function()
    if result:
        os._exit(0)
    else:
        os._exit(1)

On the face of it patching is necessary to test this example. The tests would look something like this:

import unittest
import os

from exitersketch import exit_with_result


class ExiterTests(unittest.TestCase):
    def setUp(self):
        self.exited = None
        self.originalExit = os._exit
        os._exit = self.fakeExit

    def fakeExit(self, code=0):
        self.exited = code

    def tearDown(self):
        os._exit = self.originalExit

    def test_exiter_success(self):
        exit_with_result(lambda: True)
        self.assertEqual(self.exited, 0)

    def test_exiter_failure(self):
        exit_with_result(lambda: False)
        self.assertEqual(self.exited, 1)


if __name__ == '__main__':
    unittest.main()

Having seen patching, and seen that it works as a testing technique, why should we avoid it?

  1. Patching is fragile. If the example above changed import os to from os import _exit, the patching would need to be modified. However, if you forgot to modify the patching, unexpected code will run. In this case, your test run will mysterious exit half way through. If the function you are attempting to patch is more destructive, worse things may happen: credit cards may get charged, data may get deleted, etc..
  2. Patching leads to unexpected behaviour. Because patching is a global change, the patched code may be called not just by the function being tested, but by code it is calling which happens to use the same patched code.
  3. Patching indicates bad design. Code code should be designed to be easily testable. Having to modify global state suggests that the code is not as modular as one might hope.

How to avoid patching? Parameterization, aka dependency injection. We refactor the code to accept the _exit function as a parameter. Notice the the public API has not changed:

import os

class _API(object):
    def __init__(self, exit):
        self.exit = exit

    def exit_with_result(self, function):
        result = function()
        if result:
            self.exit()
        else:
            self.exit(1)


_api = _API(os._exit)
exit_with_result = _api.exit_with_result

Our tests can now test both that _API.exit_with_result class has the correct behavior in general, and that the public exit_with_result is going to call os._exit in particular.

import unittest
import os

from exiter import _api, _API, exit_with_result


class ExiterTests(unittest.TestCase):
    def setUp(self):
        self.exited = None

    def fakeExit(self, code=0):
        self.exited = code

    def test_api(self):
        self.assertIsInstance(_api, _API)
        self.assertEqual(_api.exit, os._exit)
        self.assertEqual(exit_with_result, _api.exit_with_result)

    def test_exiter_success(self):
        _API(self.fakeExit).exit_with_result(lambda: True)
        self.assertEqual(self.exited, 0)

    def test_exiter_failure(self):
        _API(self.fakeExit).exit_with_result(lambda: False)
        self.assertEqual(self.exited, 1)


if __name__ == '__main__':
    unittest.main()

The same technique is useful when you are tempted to store some state in a module. Instead, store an instance of a class:

class _Counter(object):
    value = 0

    def increment(self):
        self.value += 1

    def value(self):
        return self.value


_counter = _Counter()
increment = _counter.increment
value = _counter.value

As I've demonstrated, patching can often be avoided by restructuring code to be more testable. The same Python features that make patching so easy also make avoiding patching just as easy. Given the choice, you should avoid changing global state when testing individual components.

9 comments:

  1. My issue with this: why should you write lots of additional code making the program more general and more difficult to understand just so you can test that code?

    ReplyDelete
  2. One day you will need to change your code, due to a bug, or a feature request, or your dependencies changing. Or someone else will have to. When that day comes, having the reassurance that changes don't break anything is a huge benefit. Doing manual testing of hundreds or even thousands of scenarios just isn't realistic.

    Without tests, you have know way of knowing if changes broke your code. And half the time you have no way of knowing if your code even worked from the start. Consider error cases: they don't happen that open, so casual manual testing is not likely to cover them. Without automated unit tests, you can't tell your error handling works.

    ReplyDelete
  3. There is also a way to do this that keeps the changes to the code to a minimum: just pass the function in as a new argument, with a default: "def exit_with_result(function, exit_function=os._exit): ..."

    Of course that is a small change to the public API, that may or may not be a greater problem than the benefit in your case. It's also not useable on functions that already do things with random keyword arguments, and so on. But it stays much closer to the original code.

    In general though I dislike making the code more complex to make testing easier. Making code more testable can improve it, but that's when you break up the code in small chunks to test them separately, that sort of thing.

    ReplyDelete
    Replies
    1. For isolated functions, where you only need to override one or two things, the style involving an extra argument is reasonable, yes. The class style is more useful if you need to either override many things, or you have a group of functions all of which use the same set of dependencies, e.g. a group of functions that all use sys.stdout.

      Delete
    2. I wrote a more detailed reply as a blog post (http://blog.futurefoundries.com/2013/04/unittesting-without-patching-followup.html).

      Delete
  4. While your point about the risks of patching destructive calls is valid, if you're going to decry the practice of using mocks in tests, at least decry a version which uses them properly. In your first example you are patching the wrong module - you shouldn't patch os._exit (with potentially non-local effects), you should patch the module under test so that *in that module only*, the reference "os._exit" resolves to your patched function.

    Most functions under test *aren't* destructive (so you'll get the expected test result failure), and by jumping straight to dependency injection in cases where you don't need it, you can end up adding a huge amount of complexity to your production code without adequate reason *and* give yourself additional code paths to test in the process. Dependency injection should be used only if there is a *production* related reason for adding it (and "this function is destructive, so we should use dependency injection rather than mocking to test it" is a valid reason).

    For those non-destructive cases, you can avoid most of the non-local effects without adding complexity to the production code by localising your mock operation to as narrow a target as possible.

    For instance (continuing with the destructive example though, since that's what the post is comparing): http://pastebin.com/CQq0A40C (since Blogger comments can't handle code)

    ReplyDelete
    Replies
    1. I like that idea, definitely better than patching globally, but I still think it's brittle; notice you're still vulnerable to running destructive code if the import changes to "from os import _exit" in the exitersketch.py module. I will write another followup post demonstrating your technique, and explain my reservations there.

      Delete
  5. Maybe the problem with the example, and the reason why it is hard to test, is that you should not call os._exit directly, because the underscore before exit is a convention to indentify a private function.
    If you used sys.exit, your test would only need to check that exit_with_result throws SystemtExit with the corresponding exit code.

    Yes, I know is just an example, but sometimes the solution for hard to test code is not dependency injection, (with its extra layer of complexity), but rethinking your solution.

    ReplyDelete
    Replies
    1. The point was to succinctly demonstrate calling a dangerous function; even if this example is less likely, the situation is real. I could just have well have called a charge_customer_creditcard() function, but this way the code is actually runnable. Sometimes you have no choice but to do either patching or dependency injection, which is the situation I am discussing.

      Delete