Thread locals are an interesting idea: a way to give each thread its own storage, useful for global state that you don't want to share between threads.
The obvious caveat is that threadlocals are still effectively global (for the current thread), and like all global state, should be treated with great suspicion. If you can do without one, you're probably better off.
Let's look at the official Python docs for threading.local. Here is the entire entry:
class threading.local
A class that represents thread-local data. Thread-local data are data whose values are thread specific. To manage thread-local data, just create an instance of local (or a subclass) and store attributes on it:
mydata = threading.local()
mydata.x = 1
The instance’s values will be different for separate threads.
For more details and extensive examples, see the documentation string of the _threading_local module.
Okay, so let's play with that example and see what happens.
>>> import threading
>>>
>>> mydata = threading.local()
>>> mydata.x = 'hello'
>>> class Worker(threading.Thread):
...     def run(self):
...         mydata.x = self.name
...         print mydata.x
...
>>> w1, w2 = Worker(), Worker()
>>> w1.start(); w2.start(); w1.join(); w2.join()
Thread-1
Thread-2
>>> print mydata.x
hello
Cool! Each thread does indeed set and read its own data in the x attribute.
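Interleaved print output from threads can be hard to read, so here is a minimal sketch of the same experiment that records what each thread saw instead of printing it. It uses Python 3 syntax, and the names seen and work are made up for illustration; they are not part of the threading API.

```python
import threading

mydata = threading.local()
mydata.x = 'hello'          # set in the main thread only

seen = {}                   # hypothetical shared dict: label -> value that thread read back

def work(label):
    mydata.x = label        # each thread assigns its own x ...
    seen[label] = mydata.x  # ... and reads back only what it assigned itself

threads = [threading.Thread(target=work, args=(label,)) for label in ('a', 'b')]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)       # each label maps to itself
print(mydata.x)   # the main thread's value is untouched
```

Note that every worker assigns mydata.x before reading it, which turns out to matter a lot.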
Mutable threadlocals part 1: oops
Let's try the exact same example again with a different value for x. Something mutable. Say, a dict.
>>> import threading
>>> mydata = threading.local()
>>> mydata.x = {}
>>>
>>> class Worker(threading.Thread):
...     def run(self):
...         mydata.x['message'] = self.name
...         print mydata.x['message']
...
>>> w1, w2 = Worker(), Worker()
>>> w1.start(); w2.start(); w1.join(); w2.join()
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "<stdin>", line 3, in run
AttributeError: 'thread._local' object has no attribute 'x'
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "<stdin>", line 3, in run
AttributeError: 'thread._local' object has no attribute 'x'
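Um. Wat. So mydata.x exists if it's assigned an immutable value, and does not exist if it's assigned a mutable value?
Not quite. Mutability is a red herring. In the first example each worker assigned mydata.x itself before reading it. Here the workers never assign mydata.x at all: they only look it up in order to set a key on the dict, and that dict was assigned in the main thread. An attribute set on a local in one thread simply does not exist in any other thread, whatever its type.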
It's a bit irksome that this is glossed over in the published Python docs and lies there waiting to bite unsuspecting users. Remind me to submit a docs patch...
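One way to check whether mutability has anything to do with the error is to assign an immutable value in the main thread and then only read it from a worker. Here is a minimal sketch in Python 3 syntax; errors and read_only are hypothetical names chosen for this example.

```python
import threading

mydata = threading.local()
mydata.x = 'hello'            # immutable, but assigned in the main thread only

errors = []                   # hypothetical list recording what the worker ran into

def read_only():
    try:
        mydata.x              # read without assigning in this thread first
    except AttributeError as exc:
        errors.append(type(exc).__name__)

t = threading.Thread(target=read_only)
t.start()
t.join()

print(errors)                 # ['AttributeError']
print(mydata.x)               # hello
```

The worker hits the same AttributeError with a plain string, so what matters is which thread did the assignment, not the type of the value.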
Mutable threadlocals part 2: Subclass threading.local
But here's a better way, based on a close read of the docstring that was mentioned in the docs.
It works if you subclass threading.local and create your mutable object inside __init__.
You can see what happens if you do that by reading the _threading_local source: both __new__ and __getattribute__ are defined such that your __init__() gets run for each thread as needed, with locking around it.
The Worker class here is the same as above:
>>> import threading
>>> class MyData(threading.local):
...     def __init__(self):
...         self.x = {}
...
>>>
>>> mydata = MyData()
>>>
>>> class Worker(threading.Thread):
...     def run(self):
...         mydata.x['message'] = self.name
...         print mydata.x['message']
...
>>>
>>> w1, w2 = Worker(), Worker()
>>> w1.start(); w2.start(); w1.join(); w2.join()
Thread-1
Thread-2
>>> print mydata.x
{}
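To see the per-thread __init__() call happening, here is a rough sketch that logs which thread ran it. It is Python 3 syntax, and init_threads, work and the thread names w1 and w2 are invented for the illustration.

```python
import threading

init_threads = []   # hypothetical log of the threads that ran __init__

class MyData(threading.local):
    def __init__(self):
        init_threads.append(threading.current_thread().name)
        self.x = {}

mydata = MyData()   # __init__ runs once here, in the constructing thread

def work():
    # First access from this thread triggers __init__ again, giving a fresh dict
    mydata.x['message'] = threading.current_thread().name

threads = [threading.Thread(target=work, name=n) for n in ('w1', 'w2')]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(init_threads)  # three entries: the constructing thread, plus w1 and w2
print(mydata.x)      # {} because the workers wrote to their own dicts
```

The log ends up with one entry per thread that touched mydata, which matches the "for each thread as needed" behaviour described above.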
What if you wanted to create a mutable value explicitly per thread, instead of inside __init__()?
That's actually easier: create it in code that runs per thread instead of globally, and assign it there. No local subclass needed.
E.g. in this case, inside the run method:
>>> mydata = threading.local()
>>>
>>> class Worker(threading.Thread):
...     def run(self):
...         mydata.x = {'message': self.name}
...         print mydata.x['message']
...
>>> w1, w2 = Worker(), Worker()
>>> w1.start(); w2.start(); w1.join(); w2.join()
Thread-17
Thread-18
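If several functions need the per-thread value and you can't be sure which one runs first, one possible variation is to wrap the "assign it in the current thread" step in a small helper that creates the dict on first use. This is only a sketch in Python 3 syntax; get_scratch, _state and results are hypothetical names, not anything provided by the threading module.

```python
import threading

_state = threading.local()

def get_scratch():
    """Return the calling thread's dict, creating it on first use (hypothetical helper)."""
    try:
        return _state.scratch
    except AttributeError:
        # First call in this thread: the attribute doesn't exist here yet
        _state.scratch = {}
        return _state.scratch

results = {}   # hypothetical shared dict: label -> copy of that thread's scratch dict

def work(label):
    get_scratch()['message'] = label
    results[label] = dict(get_scratch())

threads = [threading.Thread(target=work, args=(label,)) for label in ('a', 'b')]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)        # each label maps to {'message': <that label>}
print(get_scratch())  # the calling thread gets its own, still empty, dict
```

It's still global state per thread, so the usual suspicion applies.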