__del__or weakref callbacks. Once that code finishes running control is returned to the original location in the code.
Unfortunately, due to a bug in Python, Queue.put() can deadlock in the following situation:
- As part of calling Queue.put(), a thread acquires the Queue's lock. This lock does not support being acquired more than once by the same thread.
- GC or a signal handler interrupts the function call.
- If the GC or signal handler code then also does Queue.put(), it will try to acquire the lock again... and since it's already locked it blocks waiting for the lock to be released.
- Since the signal handler/GC code is now blocked, control is never returned to original code, so lock is never released there.
- The thread is now deadlocked and will never recover.
Unfortunately there was no way to prevent the Queue.put() in GC; the Queue accepts log messages, and this is a GC-caused logging message coming out of code that is not under Crochet control's.
The obvious short-term solution is to reimplement a simplified Queue using Python's RLock, which allows the same thread to acquire the lock multiples times. But... RLock isn't reentrancy safe either due to another bug in Python! I could wrap OS-specific reentrant lock implementations, but that's a bigger project than I want to start.
The solution I ended up with (suggested by Jean-Paul Calderone I believe) was giving up on using threading primitives to communicate across threads. Instead I used the self-pipe trick. That is, the thread uses select() (or poll() or epoll()) to wait on one end of the pipe; to wake the thread up and tell it to check for new messages to process we simply write a byte to the other end of the pipe. Since Crochet uses Twisted, I had a pre-written event loop that already implemented self-pipe waking, and now the logging thread runs another Twisted reactor in addition to the regular reactor thread.
As far as I can tell this works, but it feels a bit like overkill. I'd welcome suggestions for other solutions.