Saturday, October 19, 2013

Announcing Crochet v1.0: Use Twisted Anywhere!

It's been a busy 6 months since I first released Crochet, and now it's up to v1.0. Along the way I've expanded the documentation quite a bit and moved it to Sphinx, fixed a whole bunch of bug reports from users, added some new APIs and probably introduced some new bugs. What is Crochet, you ask?

Crochet is an MIT-licensed library that makes it easier for blocking or threaded applications like Flask or Django to use the Twisted networking framework. Crochet provides the following features:

  • Runs Twisted's reactor in a thread it manages.
  • The reactor shuts down automatically when the process' main thread finishes.
  • Hooks up Twisted's log system to the Python standard library logging framework. Unlike Twisted's built-in logging bridge, this includes support for blocking Handler instances.
  • A blocking API to eventual results (i.e. Deferred instances). This last feature can be used separately, so Crochet is also useful for normal Twisted applications that use threads.
You can download Crochet at: http://pypi.python.org/pypi/crochet

Documentation can be found on Read The Docs.

Bugs and feature requests should be filed at the project Github page.

Monday, June 10, 2013

Available for Python Consulting

Need someone to write high-quality, well-tested and robust Python code? If you need some Python or Twisted development done (or some other language, for that matter), I now have some free time in my schedule. You can reach me at itamar@futurefoundries.com.

Friday, May 24, 2013

Announcing Crochet 0.7: Easily use Twisted from threaded applications

Crochet is an MIT-licensed library that makes it easier for threaded applications like Flask or Django to use the Twisted networking framework. Features include:
  • Runs Twisted's reactor in a thread it manages.
  • Hooks up Twisted's log system to the Python standard library logging framework. Unlike Twisted's built-in logging bridge, this includes support for blocking logging.Handler instances.
  • Provides a blocking API to eventual results (i.e. Deferred instances).
This release includes better documentation and API improvements, as well as better error reporting.
You can see some examples, read the documentation, and download the package at:

https://pypi.python.org/pypi/crochet

For those of you who have seen Crochet before, I'd like to feature a new example. In the following code you can see how Twisted and Crochet allow you to download information in the background every few seconds and then cache it, so that the request handler for your web application is not slowed down retrieving the information:

"""
An example of scheduling time-based events in the background.

Download the latest EUR/USD exchange rate from Yahoo every 30
seconds in the background; the rendered Flask web page can use
the latest value without having to do the request itself.

Note that this example is for demonstration purposes only, and
is not actually used in the real world. You should not do this
in a real application without reading Yahoo's terms-of-service
and following them.
"""

from flask import Flask

from twisted.internet.task import LoopingCall
from twisted.web.client import getPage
from twisted.python import log

from crochet import run_in_reactor, setup
setup()


class ExchangeRate(object):
    """
    Download an exchange rate from Yahoo Finance using Twisted.
    """

    def __init__(self, name):
        self._value = None
        self._name = name

    # External API:
    def latest_value(self):
        """
        Return the latest exchange rate value.

        May be None if no value is available.
        """
        return self._value

    @run_in_reactor
    def start(self):
        """
        Start the background process.
        """
        self._lc = LoopingCall(self._download)
        # Run immediately, and then every 30 seconds:
        self._lc.start(30, now=True)

    def _download(self):
        """
        Do an actual download, runs in Twisted thread.
        """
        print("Downloading!")
        def parse(result):
            print("Got %r back from Yahoo." % (result,))
            values = result.strip().split(",")
            self._value = float(values[1])
        d = getPage(
            "http://download.finance.yahoo.com/d/quotes.csv?e=.csv&f=c4l1&s=%s=X"
            % (self._name,))
        d.addCallback(parse)
        d.addErrback(log.err)
        return d


# Start background download:
EURUSD = ExchangeRate("EURUSD")
EURUSD.start()


# Flask application:
app = Flask(__name__)

@app.route('/')
def index():
    rate = EURUSD.latest_value()
    if rate is None:
        rate = "unavailable, please refresh the page"
    return "Current EUR/USD exchange rate is %s." % (rate,)


if __name__ == '__main__':
    import sys, logging
    logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
    app.run()

Thursday, April 25, 2013

Unittesting With Localized Patching

In my previous two posts, I gave examples of alternatives to patching in unittests: class-based and function-based parameterization. Nick Coghlan pointed out that my example of patching was a little bit of a strawman argument - the most global (and therefore the most side-effecty) way of doing patching. Here's what he wrote:
While your point about the risks of patching destructive calls is valid, if you're going to decry the practice of using mocks in tests, at least decry a version which uses them properly. In your first example you are patching the wrong module - you shouldn't patch os._exit (with potentially non-local effects), you should patch the module under test so that *in that module only*, the reference "os._exit" resolves to your patched function.

Most functions under test *aren't* destructive (so you'll get the expected test result failure), and by jumping straight to dependency injection in cases where you don't need it, you can end up adding a huge amount of complexity to your production code without adequate reason *and* give yourself additional code paths to test in the process. Dependency injection should be used only if there is a *production* related reason for adding it (and "this function is destructive, so we should use dependency injection rather than mocking to test it" is a valid reason).

For those non-destructive cases, you can avoid most of the non-local effects without adding complexity to the production code by localising your mock operation to as narrow a target as possible.
The version of patching he suggests is definitely a lot better than my initial example, so let's take a look. First, the module we're going to test:
import os

def exit_with_result(function):
    result = function()
    if result:
        os._exit(0)
    else:
        os._exit(1)
And now the patch-based tests, based on example code from Nick Coghlan (any mistakes were added by me):

import unittest
# Note that we don't import os, because we're not touching it!

import exitersketch

class FakeOS:
    EXIT_NOT_CALLED = object()
    CALLED_WITH_DEFAULT = object()

    def __init__(self, module):
        self.module = module
        self.exit_code = self.EXIT_NOT_CALLED

    def _exit(self, code=CALLED_WITH_DEFAULT):
        self.exit_code = code

    def __getattr__(self, attr):
        return getattr(self.original_os, attr)

    def __enter__(self):
        self.original_os = self.module.os
        self.module.os = self
        return self

    def __exit__(self, *args):
        self.module.os = self.original_os


class ExiterTests(unittest.TestCase):

    def test_exiter_success(self):
        with FakeOS(exitersketch) as fake:
            exitersketch.exit_with_result(lambda: True)
        self.assertEqual(fake.exit_code, 0)

    def test_exiter_failure(self):
        with FakeOS(exitersketch) as fake:
            exitersketch.exit_with_result(lambda: False)
        self.assertEqual(fake.exit_code, 1)


if __name__ == '__main__':
    unittest.main()
This version is definitely a superior form of patching: only one module's view is impacted. Notice also the use of __getattr__ to ensure overriding exitersketch.os only overrides os._exit and not other parts of the os module. Nonetheless, it still suffers from the inherent problem of patching: it's overriding more state than necessary. Thus it's still possible to call destructive functions by mistake if you rearrange your imports. For non-destructive functions it's still possible to have a test unexpectedly call a patched function, albeit only from code in the same module rather than globally. If you are going to use patching, though, making the patching as local as possible is definitely the way to go.

Monday, April 22, 2013

Unittesting Without Patching: A Followup

I got a couple questions about my previous post asking why I didn't show the simpler, function-based style of parameterization. This style does make unittesting possible, and with less complexity than creating a new class:

import os

def exit_with_result(function, _exit=os._exit):
    result = function()
    if result:
        _exit(0)
    else:
        _exit(1)

The problem is that when you add arguments to a function, the parameterization leaks into your public API. This means that:
  • You need to document the fact that these extra arguments (e.g. _exit in the example above) should not be used.
  • *args and **kwargs can't be used at all.
  • Changing the function signature later on can be more difficult.
  • If you have large numbers of things you need to parameterize, the function definition gets pretty long and ugly.
In the class style, in contrast, the public API is not affected by the need to unittest.

What's more, you will often have a group of related functions using the same modules, functions or classes. By grouping them in a class, you can implement the parameterization hook once, rather than for every function. You can see an example of this in Crochet (specifically the Eventloop class), where the parameterized reactor is used by multiple functions. If the code you need to parameterize is already a method, setting the parameters in __init__ or as a class attribute is even more attractive, requiring only minimal additional complexity.

Update: If you go with this style of parameterization, you still need to assert that in the default case it actually calls the correct function (e.g. os._exit for exit_with_result). Probably the nicest way to do so is to use inspect.getcallargs.
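As a rough sketch of that check (the helper name default_exit_function is my own; inspect.getcallargs is in the standard library, although newer Python versions document it as deprecated in favor of inspect.signature), you can ask which arguments a default call would bind without actually making the call:

```python
import inspect
import os


def exit_with_result(function, _exit=os._exit):
    result = function()
    if result:
        _exit(0)
    else:
        _exit(1)


def default_exit_function():
    """Return whatever _exit is bound to when the caller omits it."""
    # getcallargs only binds arguments; it never calls exit_with_result,
    # so nothing exits here.
    bound = inspect.getcallargs(exit_with_result, lambda: True)
    return bound["_exit"]
```

A test can then assert that default_exit_function() is os._exit, and test the branching logic separately by passing in a fake.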

Thursday, April 18, 2013

Unittesting Without Patching

Python has the power to override any attribute on any module or class, but just because you can doesn't mean you should. This is true in regular code, but just as true of unittests. Many testing libraries (mock, Twisted's trial, py.test) provide facilities for overriding some piece of global state; you can also do so manually. Occasionally these facilities prove invaluable, but often they are used unnecessarily. Better alternatives are available.

Before I explain why patching is problematic, let's look at an example. Consider the following module:

import os

def exit_with_result(function):
    result = function()
    if result:
        os._exit(0)
    else:
        os._exit(1)

On the face of it patching is necessary to test this example. The tests would look something like this:

import unittest
import os

from exitersketch import exit_with_result


class ExiterTests(unittest.TestCase):
    def setUp(self):
        self.exited = None
        self.originalExit = os._exit
        os._exit = self.fakeExit

    def fakeExit(self, code=0):
        self.exited = code

    def tearDown(self):
        os._exit = self.originalExit

    def test_exiter_success(self):
        exit_with_result(lambda: True)
        self.assertEqual(self.exited, 0)

    def test_exiter_failure(self):
        exit_with_result(lambda: False)
        self.assertEqual(self.exited, 1)


if __name__ == '__main__':
    unittest.main()

Having seen patching, and seen that it works as a testing technique, why should we avoid it?

  1. Patching is fragile. If the example above changed import os to from os import _exit, the patching would need to be modified. However, if you forgot to modify the patching, unexpected code will run. In this case, your test run will mysteriously exit halfway through. If the function you are attempting to patch is more destructive, worse things may happen: credit cards may get charged, data may get deleted, etc.
  2. Patching leads to unexpected behaviour. Because patching is a global change, the patched code may be called not just by the function being tested, but by code it is calling which happens to use the same patched code.
  3. Patching indicates bad design. Code should be designed to be easily testable. Having to modify global state suggests that the code is not as modular as one might hope.
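The fragility in the first point can be reproduced safely with a harmless function. The sketch below is my own illustration, not code from the module above: it builds two throwaway modules from source text (the make_module helper and the module names are hypothetical), globally patches os.getcwd using the standard library's unittest.mock, and shows that the module which did from os import getcwd still runs the real function.

```python
import os
import types
from unittest import mock


def make_module(name, source):
    """Build a throwaway module object from source text."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# Style 1: looks up os.getcwd on the os module at call time.
via_module = make_module("via_module", """
import os
def where():
    return os.getcwd()
""")

# Style 2: holds its own direct reference to the real getcwd.
via_name = make_module("via_name", """
from os import getcwd
def where():
    return getcwd()
""")

real_cwd = os.getcwd()

# Globally replace os.getcwd for the duration of the block.
with mock.patch.object(os, "getcwd", return_value="/fake"):
    patched_result = via_module.where()    # sees the patch
    unpatched_result = via_name.where()    # silently runs the real function
```

Here patched_result comes from the fake, while unpatched_result is the real working directory. With os._exit in place of os.getcwd, that second case is the test run that exits halfway through.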

How to avoid patching? Parameterization, aka dependency injection. We refactor the code to accept the _exit function as a parameter. Notice that the public API has not changed:

import os

class _API(object):
    def __init__(self, exit):
        self.exit = exit

    def exit_with_result(self, function):
        result = function()
        if result:
            self.exit(0)
        else:
            self.exit(1)


_api = _API(os._exit)
exit_with_result = _api.exit_with_result

Our tests can now test both that the _API.exit_with_result method has the correct behavior in general, and that the public exit_with_result is going to call os._exit in particular.

import unittest
import os

from exiter import _api, _API, exit_with_result


class ExiterTests(unittest.TestCase):
    def setUp(self):
        self.exited = None

    def fakeExit(self, code=0):
        self.exited = code

    def test_api(self):
        self.assertIsInstance(_api, _API)
        self.assertEqual(_api.exit, os._exit)
        self.assertEqual(exit_with_result, _api.exit_with_result)

    def test_exiter_success(self):
        _API(self.fakeExit).exit_with_result(lambda: True)
        self.assertEqual(self.exited, 0)

    def test_exiter_failure(self):
        _API(self.fakeExit).exit_with_result(lambda: False)
        self.assertEqual(self.exited, 1)


if __name__ == '__main__':
    unittest.main()

The same technique is useful when you are tempted to store some state in a module. Instead, store an instance of a class:

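class _Counter(object):
    def __init__(self):
        # Stored under a different name than the value() method,
        # so the two cannot shadow each other.
        self._value = 0

    def increment(self):
        self._value += 1

    def value(self):
        return self._value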


_counter = _Counter()
increment = _counter.increment
value = _counter.value
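Because the state lives on an instance rather than in module globals, each test can build a fresh counter instead of resetting shared state. A minimal sketch, with the class restated so the snippet runs on its own (the _value attribute name and the test names are my own choices):

```python
import unittest


class _Counter(object):
    def __init__(self):
        # Named _value so it cannot clash with the value() method.
        self._value = 0

    def increment(self):
        self._value += 1

    def value(self):
        return self._value


class CounterTests(unittest.TestCase):
    def test_starts_at_zero(self):
        self.assertEqual(_Counter().value(), 0)

    def test_increment(self):
        counter = _Counter()   # fresh state, independent of other tests
        counter.increment()
        counter.increment()
        self.assertEqual(counter.value(), 2)
```

Neither test touches the module-level _counter, so they cannot interfere with each other or with code that uses the public functions.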

As I've demonstrated, patching can often be avoided by restructuring code to be more testable. The same Python features that make patching so easy also make avoiding patching just as easy. Given the choice, you should avoid changing global state when testing individual components.

Friday, April 12, 2013

SSH Into Your Python Server

Have you ever wanted to see what's going on inside your Python server? With Crochet and Twisted, you can add a Python prompt to your process that is accessible via SSH, allowing you to poke around in the internals of your running program. Here's an example session to a Flask server:
$ ssh admin@localhost -p 5022
admin@localhost's password: ******
>>> app
<flask.app.Flask object at 0x28a96d0>
>>> app.url_map
Map([<Rule '/' (HEAD, OPTIONS, GET) -> index>,
 <Rule '/static/<filename>' (HEAD, OPTIONS, GET) -> static>])
>>> from twisted.internet import reactor
>>> reactor._selectables
{9: <SSHServerTransport #0 on 5022>, 3: <<class 'twisted.internet.tcp.Port'> of twisted.conch.manhole_ssh.ConchFactory on 5022>, 6: <twisted.internet.posixbase._UnixWaker object at 0x28a2510>}
The code to start the SSH server has quite a lot of boilerplate, so I filed a ticket to provide a utility function. If you're using the system Twisted, you may need to install Twisted's Conch package, e.g. apt-get install python-twisted-conch on Ubuntu.
import logging

from flask import Flask
from crochet import setup, in_reactor
setup()

# Web server:
app = Flask(__name__)

@app.route('/')
def index():
    return "Welcome to my boring web server!"


@in_reactor
def start_ssh_server(reactor, port, username, password,
                     namespace):
    """
    Start an SSH server on the given port, exposing a Python
    prompt with the given namespace.
    """
    from twisted.conch.insults import insults
    from twisted.conch import manhole, manhole_ssh
    from twisted.cred.checkers import (
        InMemoryUsernamePasswordDatabaseDontUse as MemoryDB)
    from twisted.cred.portal import Portal

    sshRealm = manhole_ssh.TerminalRealm()
    def chainedProtocolFactory():
        return insults.ServerProtocol(manhole.Manhole,
                                      namespace)
    sshRealm.chainedProtocolFactory = chainedProtocolFactory

    portal = Portal(
        sshRealm, [MemoryDB(**{username: password})])
    reactor.listenTCP(port, manhole_ssh.ConchFactory(portal),
                      interface="127.0.0.1")


if __name__ == '__main__':
    import sys
    logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
    start_ssh_server(
        5022, "admin", "secret", {"app": app}).wait()
    app.run()

Wednesday, April 10, 2013

Crochet: Background Operations for Threaded Applications

In my previous post I showed Crochet doing a blocking call against a Twisted API. In this example, you can see how Twisted and Crochet allow you to run an operation in the background. An HTTP request for a new user starts a download in the background, and a reference is stored in the user's session. Every time the user reloads the page, a check is made to see if the download is finished, and if it is done it is displayed. You can also see the stash()/retrieve_result() API in use, which allows temporarily storing results under a key suitable for serialization in a session object.

import logging
from flask import Flask, session, escape
from crochet import setup, in_reactor, retrieve_result, TimeoutError
setup()

app = Flask(__name__)


@in_reactor
def download_page(reactor, url):
    """
    Download a page.
    """
    from twisted.web.client import getPage
    return getPage(url)


@app.route('/')
def index():
    if 'download' not in session:
        # @in_reactor functions return a DeferredResult:
        result = download_page('http://google.com')
        session['download'] = result.stash()
        return "Starting download, refresh to track progress."

    # retrieval is a one-time operation:
    result = retrieve_result(session.pop('download'))
    try:
        download = result.wait(timeout=0.1)
        return "Downloaded: " + escape(download)
    except TimeoutError:
        session['download'] = result.stash()
        return "Download in progress..."


if __name__ == '__main__':
    import os, sys
    logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
    app.secret_key = os.urandom(24)
    app.run()

Tuesday, April 9, 2013

Presenting Crochet: Use Twisted as a library

Twisted is an event-driven framework; by default it expects to run the reactor event loop in your main thread to drive your application. If however you're writing a Django or Flask application you may want to use Twisted as just another library. Unless you choose to use Twisted as a WSGI container, this requires you to run the reactor in a thread. Today I am happy to announce Crochet, which makes using Twisted even easier in this situation.

Here's an example program that uses Crochet to call Twisted from a normal, blocking command-line tool:

from __future__ import print_function

from crochet import setup, in_reactor
setup()


@in_reactor
def mx(reactor, domain):
    """
    Return list of MX domains for a given domain.
    """
    from twisted.names.client import lookupMailExchange
    def got_records(result):
        hosts, authorities, additional = result
        return [str(record.name) for record in additional]
    d = lookupMailExchange(domain)
    d.addCallback(got_records)
    return d


def main(domain):
    print("Mail servers for %s:" % (domain,))
    for mailserver in mx(domain).wait():
        print(mailserver)


if __name__ == '__main__':
    import sys
    main(sys.argv[1])
When we run it on the command line, output looks like this:
$ python mxquery.py gmail.com
Mail servers for gmail.com:
alt2.gmail-smtp-in.l.google.com
alt2.gmail-smtp-in.l.google.com
alt3.gmail-smtp-in.l.google.com
alt3.gmail-smtp-in.l.google.com
alt4.gmail-smtp-in.l.google.com
alt4.gmail-smtp-in.l.google.com
alt1.gmail-smtp-in.l.google.com
gmail-smtp-in.l.google.com
gmail-smtp-in.l.google.com
The library provides much more functionality, but that's the gist of it: it runs and stops the Twisted reactor for you, and wraps asynchronous results in a blocking API. If you'd like to try out Crochet, or learn more about its other features, visit Crochet's PyPI page.

Tuesday, February 12, 2013

2-day Twisted class in San Francisco: Early Bird ends Friday

If you want to build reliable, well-tested network applications in Python, Twisted may be the tool you need. In this two-day class, taking place on March 11th and 12th (right before PyCon), we will cover the basic principles and core APIs of Twisted. Early bird pricing will save you $100, and ends in just three more days.

Covered material will include:
  • Understanding Event Loops: we'll re-implement Twisted's core APIs step-by-step (reactor, transport, protocol), explaining the why and how of event-driven networking.
  • TCP Clients and Servers.
  • Scheduling Timed Events.
  • Deferreds: the motivation and uses of Twisted's result callback abstraction. 
  • Producers and Consumers: dealing with large amounts of data.
  • Unit Testing: how and why to test your networking code.
  • A large, self-paced exercise, implementing an HTTP server and client from scratch using pre-written unit tests as guidance, and our help as needed. (These last two points will also be presented at PyCon, at the Twisted testing tutorial.)

To learn more and sign up for the class visit our Eventbrite page.


About us:

Jean-Paul Calderone has consulted for Fortune 500 companies, startups and research institutions. He has taught Twisted tutorials at PyCon, Fluendo SA in Barcelona, and Rackspace Inc. Jean-Paul has been one of the core Twisted maintainers since 2002, and is the maintainer of pyOpenSSL.

Itamar Turner-Trauring spent many years working on distributed applications as part of ITA Software and then Google's airline reservation system, coding in Python (often using Twisted), C++ and a little bit of Common Lisp. Itamar has also worked on projects ranging from a reliable multicast messaging system with congestion control, a prototype-based configuration language, to a multimedia kiosk for a museum. Itamar has been one of the core Twisted maintainers since 2001.

Wednesday, February 6, 2013

Mock Assurances

I recently tried the mock library; it's quite useful, and in general using it was a pleasant experience... until things turned scary. While refactoring some code and corresponding tests I hit a point where a test should have been failing, and yet was nonetheless passing. A little investigation led me to the problem, and that's when I got really nervous.

Here's a rather silly example of two unit tests using mock.
import unittest
import mock

class C:
    def function(self, x):
        pass

class Tests(unittest.TestCase):
    def test_positive(self):
        C2 = mock.Mock(spec=C)
        obj = C2()
        obj.function(1)
        obj.function.assert_called_once_with(1)

    def test_negative(self):
        C2 = mock.Mock(spec=C)
        obj = C2()
        obj.function(1)
        obj.function.assert_not_called()


if __name__ == '__main__':
    unittest.main()
One would expect test_negative to fail, but in fact both tests pass:
$ python mocktests.py 
..
------------------------------------
Ran 2 tests in 0.001s

OK
Oops.

If you've used mock before, you probably know what's going on. The mock library creates new attributes on demand if they don't exist. Thus:
>>> import mock
>>> obj = mock.Mock()
>>> obj.assert_called_once_with
<bound method Mock.assert_called_once_with of <mock.Mock object at 0x7f6e1c18a990>>
>>> obj.assert_not_called
<Mock name='mock.assert_not_called' id='140110894509008'>
Once again, oops. I had invented a new assertion method assert_not_called which doesn't actually exist, and mock had happily created a new object for me. My test was completely broken. My mistake, and therefore my fault. And in fact the mock documentation does mention the possibility of this happening, deep within the bowels of the API documentation: "Because mocks auto-create attributes on demand, and allow you to call them with arbitrary arguments, if you misspell one of these assert methods then your assertion is gone." An appropriate fix is provided, the mock.create_autospec API. It would have been better to include this suggestion in the intro documentation; better yet would be preventing the problem in the first place.

By their very nature, test assertions do nothing silently in the expected case. It's thus quite dangerous to have a library specifically intended for testing where typos create calls that are supposed to be assertions but actually assert nothing, silently. In general, I prefer tools that don't assume I'm perfect. If I never made mistakes I wouldn't need to write tests in the first place, at least for code where I was the only maintainer.

What's more, this will also be a problem if new assertion methods are ever added to future versions of mock. Imagine developer A writes tests using a new assertion only available in mock v1.1; when she runs the tests, they work correctly. Developer B is working on the same code base, but forgot to upgrade and is using mock v1.0 that lacks the new assertion. When he runs the tests they are not actually testing what they seem to be testing. Oops.

The basic design flaw here is having a public API on objects that also create arbitrary attributes on demand. The whole public API of mock.Mock (assertions, attributes, etc.) should be exposed as module-level functions, so that typos or misremembering the API will result in a nice informative AttributeError. Until that happens, I will stick to mock.create_autospec, and avoid mock.Mock or mock.MagicMock.
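To illustrate, here is a small sketch of how mock.create_autospec behaves. It assumes the standard-library unittest.mock (the same library as bundled with Python 3) rather than the separately installed mock package, and the raises helper is my own.

```python
from unittest import mock


class C(object):
    def function(self, x):
        pass


C2 = mock.create_autospec(C)
obj = C2()
obj.function(1)
obj.function.assert_called_once_with(1)   # a real assertion, and it passes


def raises(exception_type, callable_, *args):
    """Return True if calling callable_(*args) raises exception_type."""
    try:
        callable_(*args)
    except exception_type:
        return True
    return False


# A method that C does not define is rejected instead of being created:
unknown_method_rejected = raises(AttributeError, getattr, obj, "functoin")
# A misspelled assertion is rejected too:
typo_rejected = raises(
    AttributeError, getattr, obj.function, "assert_caled_once_with")
# Calls are checked against the real signature of C.function:
bad_call_rejected = raises(TypeError, obj.function)
```

All three flags end up True: the autospecced object only exposes what C actually has, so a typo fails loudly rather than silently asserting nothing.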

Tuesday, January 29, 2013

Test-Driven Development with Twisted: A PyCon 2013 Tutorial

Testing network applications is hard: the order of events is unpredictable, the passage of time is important, and the sources of errors are many. At PyCon this year I will be teaching a three-hour tutorial on test-driven development with Twisted, demonstrating how to build well-tested network applications.

(As a reminder, I'm also teaching a two-day Twisted class in San Francisco with Jean-Paul Calderone, on the Monday and Tuesday before PyCon; early bird pricing expires Feb 15th. The material in the testing tutorial is also included in the longer class.)

In the lecture part of the tutorial I will cover:
  • Testing network protocols in a deterministic manner (no need for actual TCP connections).
  • Testing the passage of time (no need to wait 2 hours in your test to prove that a timeout is hit).
  • Twisted's testing infrastructure for running the reactor and handling Deferreds.
Once that is done, students will begin a hands-on lab, implementing an HTTP server from scratch. I will provide pre-written unit tests, and students will write code to make these tests pass, with help from me (and perhaps an assistant or two depending on number of students).

The lessons here are implicit in the design of the tests, and the design of the server as shaped by the tests. If anything these lessons are more important than understanding Twisted's testing APIs:
  • The scoping of tests into small units of work.
  • Separation of concerns - parsing/generating bytes vs. business logic.
  • Design patterns for Deferred APIs.
  • Building robust network applications, including dealing with bad input and timeouts.
  • Separation of library code and application configuration.
Students who finish early can move on to a more difficult exercise, implementing both the tests and logic for an HTTP client, but benefiting from the ability to ask for in-person help.

You can sign up at the PyCon website as part of registration, or read more on the tutorial's PyCon web page.

Monday, January 28, 2013

Deferred Cancellation, part 3: Timeouts

Let's send an email! Here are the steps involved in what from the outside looks like a simple function call (and this is of course a very high-level view):
  1. Look up the IP of our SMTP server's domain using DNS. This may involve a series of UDP messages to one or more servers, which may do further work on our behalf.
  2. Establish a TCP connection with the server.
  3. Exchange a series of commands with the server over the TCP connection, some of which may involve arbitrarily complex processing on the server-side.
Obviously a lot can go wrong here, from communication problems to hardware failures to software bugs. The fact it usually works is an impressive engineering feat. For our purposes the interesting point is that failure can take an arbitrary amount of time.

The necessary and obvious solution is a timeout: if enough time has passed without getting a response, abort the operation and consider it to have failed. We may end up sending duplicate emails if we retry, but that is a business logic decision tied to the specifics of an application, so not something I'll be talking about. Now, we could have timeouts on each step of the process (DNS lookup, TCP connection, each command), and in fact may want timeouts for each of these. But from the point of view of the email sending API, the time it takes to do the underlying steps is irrelevant, except perhaps for debugging or performance: if we want to send an email within 5 seconds, we want it to take 5 seconds, and don't care which step happens to be the slow one.

This is where Deferred cancellation comes in. We want to make sure each step along the way has a cancellation function registered, if possible, but that's not strictly necessary. Our code looks something like this:
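from twisted.internet import reactor
from twisted.internet.endpoints import TCP4ClientEndpoint

def sendmail(fromAddr, toAddr, data, smtphost, smtpport=25):
    # SMTPFactory and its protocol's send() method stand in for your
    # own SMTP client code; they are not defined here.
    endpoint = TCP4ClientEndpoint(reactor, smtphost, smtpport)
    d = endpoint.connect(SMTPFactory())
    def gotProtocol(smtpProtocol):
        return smtpProtocol.send(fromAddr, toAddr, data)
    d.addCallback(gotProtocol)
    return d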
The wonderful thing about Deferreds is that, as with results, cancellation also gets automatically chained. Thus if we call cancel() on the result of sendmail(), it will cancel the Deferred connecting to the server if that's where we are in the process, or the Deferred returned from SMTPProtocol.send() if that's what we're waiting for. So if we want to time out sending an email after 5 seconds... all we have to do is cancel the Deferred returned by the sendmail() function after 5 seconds if we haven't gotten a result! The following utility function, soon to be part of Twisted (ticket #5786), does just that:
from twisted.internet import reactor

def timeoutDeferred(deferred, timeout):
    delayedCall = reactor.callLater(timeout, deferred.cancel)
    def gotResult(result):
        if delayedCall.active():
            delayedCall.cancel()
        return result
    deferred.addBoth(gotResult)
And now, we can send an email with a timeout of our choice, e.g. 5 seconds:
sent = sendmail("from@example.net", "to@example.net",
                "An email message.", "smtp.example.net")
timeoutDeferred(sent, 5)
The nice thing about this API is that it doesn't require adding extra timeout arguments to every function. Instead, the highest-level caller just adds a timeout. And underlying callers (e.g. the TCP connect, its underlying DNS lookup, etc.) can have their own, more limited timeouts as well.
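The same cancel-after-a-delay shape can be tried without Twisted installed. The following is only an analogy, not the Twisted API: it swaps in the standard library's concurrent.futures.Future for a Deferred and threading.Timer for reactor.callLater, and the timeout_future and outcome names are hypothetical. Unlike Deferreds, these futures do not chain cancellation for you.

```python
import threading
from concurrent.futures import CancelledError, Future


def timeout_future(future, timeout):
    """Cancel `future` if it has no result after `timeout` seconds."""
    timer = threading.Timer(timeout, future.cancel)
    timer.daemon = True
    # Once the future finishes (or is cancelled), the timer is not needed.
    future.add_done_callback(lambda _: timer.cancel())
    timer.start()


def outcome(future, wait=1.0):
    """Return the future's result, or 'timed out' if it was cancelled."""
    try:
        return future.result(timeout=wait)
    except CancelledError:
        return "timed out"


slow = Future()              # never completed, so the timer cancels it
timeout_future(slow, 0.05)

fast = Future()
timeout_future(fast, 5)
fast.set_result("sent")      # finishes first, so the timer is cancelled
```

As with timeoutDeferred, the caller adds the timeout from the outside; the operation producing the result needs no timeout argument of its own.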

To summarize: supporting Deferred cancellation is a great way to make the integration points of your library code more useful, by allowing users of your code both ad-hoc and timeout-driven cancellation of your operations. And as the user of a Twisted library, timeouts can be easily added to any Deferred-returning API, in particular those that explicitly support cancellation for you.

Sunday, January 13, 2013

Update: Early bird discount for our Twisted class

If you pay for our class before February 15th, you'll get $100 off; sign up for our two-day Twisted class in San Francisco now at http://futurefoundries.eventbrite.com.

Tuesday, January 8, 2013

2-day Twisted Class in San Francisco

Interested in learning the fundamentals of Twisted and event-driven networking with Python? If you live in the Bay Area or SF, or are visiting for PyCon, you can join Jean-Paul Calderone and me for a two-day intro to Twisted in San Francisco.
Location: San Francisco, exact site TBA.
Dates: March 11-12, the Monday and Tuesday before PyCon.
Cost: $650 early bird, or $750 after Feb 15.

Sign up now!

Covered Material

Combining a lecture with plenty of hands-on exercises, covered topics will include:
  • Understanding Event Loops: we'll re-implement Twisted's core APIs step-by-step (reactor, transport, protocol), explaining the why and how of event-driven networking.
  • TCP Clients and Servers.
  • Scheduling Timed Events.
  • Deferreds: the motivation and uses of Twisted's result callback abstraction.
  • Producers and Consumers: dealing with large amounts of data.
  • Unit Testing: how to test your networking code.
  • A large, self-paced exercise, implementing an HTTP server and client from scratch using pre-written unit tests as guidance, and our help as needed.

Daily Schedule (Tentative)

9:30-12:30: Lecture and exercises.
12:30-13:30: Lunch break.
13:30-16:30: Lecture and exercises.
16:30-17:30: Extended exercise time, and in-depth Q&A.