This will work as long as you give it an initial empty Counter
to start with (otherwise it starts with 0
and complains)
1 Like
Ah thanks. Anyhow, the idea is to minimize the work done in the single-threaded “gather” phase at the end, by having each thread individually count in a lock-free way.
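A minimal sketch of that pattern (the chunks and the word-count task here are invented for illustration): each thread fills its own Counter lock-free, and only the final merge is single-threaded.

```python
from collections import Counter
from threading import Thread

def count_words(words, results):
    # Each thread builds its own Counter; no shared mutable
    # state during the counting phase.
    results.append(Counter(words))

chunks = [["a", "b", "a"], ["b", "c"], ["a"]]
results = []
threads = [Thread(target=count_words, args=(chunk, results))
           for chunk in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Single-threaded "gather" phase: merge the per-thread counts.
# sum() needs an empty Counter as its start value, otherwise it
# begins with the int 0 and complains.
total = sum(results, Counter())
print(total)
```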
1 Like
I don’t think that is true. If free threading is possible, the cat will be out of the bag: even developers who only care about single-threaded work will still be affected by threading issues. If a library starts a thread in the background for whatever reason, it can cause threading issues in my code even though I never signed up for threading problems.
Many libraries that have async-to-sync bridges spawn threads to simulate async tasks. Django, FastAPI, and SQLAlchemy are just a few off the top of my head. And then there are tools like IPython that start a couple of background threads for who knows what reasons.
Multithreading has a reputation for being hard. But really, I think it is considered hard because of the existence of free threading. Languages like Rust that don’t have free threading (or, to be more precise, have almost-free threading with some severe restrictions) have actually fared better at making multithreading much easier to use.
The arena-based threading with subinterpreters I mentioned earlier would be a similar sort of almost-free threading with forced discipline.
One way to think of arena-based threading is that it’s basically a dynamic/runtime borrow checker, enforcing acquisition of the arena locks before working with any objects owned by the arena. I think it could even be flexible enough to allow future experimentation with non-standard arenas that have different borrowing rules.
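To make the “runtime borrow checker” idea concrete, here is a hypothetical sketch. The `Arena` class, its methods, and the lock-per-arena rule are all invented for illustration; none of this is an existing CPython API.

```python
import threading

class Arena:
    """Hypothetical sketch of an arena as a runtime borrow checker:
    objects owned by the arena may only be touched while its lock
    is held."""

    def __init__(self):
        self._lock = threading.RLock()
        self._objects = {}

    def put(self, name, obj):
        with self._lock:
            self._objects[name] = obj

    def __enter__(self):
        # "Borrow" the arena: its owned objects become accessible.
        self._lock.acquire()
        return self._objects

    def __exit__(self, *exc):
        self._lock.release()

arena = Arena()
arena.put("counter", {"n": 0})

def work():
    for _ in range(1000):
        with arena as objs:
            objs["counter"]["n"] += 1

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with arena as objs:
    print(objs["counter"]["n"])  # 4000, no lost updates
```

The point of the design is that the *only* way to reach an arena-owned object is through the context manager, so the lock acquisition cannot be forgotten.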
2 Likes
There are language-level tools like golang’s race condition detector, thread sanitizer, etc, which take the common mistakes and test for them. It’s also possible someone could implement something like a borrow checker or thread safety heuristics on top of python’s type system, e.g. with passthrough types along the lines of Mutable / Immutable / Shared / Local, and auditing nonlocal variable access or object types passed into threads.
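As a sketch of what such a passthrough type could look like: the `Local` wrapper below is entirely hypothetical (not an existing typing construct), enforcing at runtime what a static auditor might check from annotations.

```python
import threading
from typing import Generic, TypeVar

T = TypeVar("T")

class Local(Generic[T]):
    """Hypothetical passthrough type: the wrapped value may only be
    used by the thread that created it.  A static auditor could flag
    violations at type-check time; here it is enforced at runtime."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._owner = threading.get_ident()

    def get(self) -> T:
        if threading.get_ident() != self._owner:
            raise RuntimeError("Local value accessed from another thread")
        return self._value

data = Local([1, 2, 3])
print(data.get())  # fine on the owning thread

def misuse():
    try:
        data.get()
    except RuntimeError as exc:
        print("rejected:", exc)

t = threading.Thread(target=misuse)
t.start()
t.join()
```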
This wouldn’t be the case with my proposal to make threads take a voluntary lock by default. In a sense, you could leave something like the GIL in place, but make it safe to release for specific threads while accessing python code/objects.
I don’t think anyone has demonstrated how this would happen, and I’d view it as a fundamental flaw in the implementation if it could.
I’m not saying it’s impossible, but I think it would be useful to have specific examples–even if only theoretical–before it’s considered a significant problem.
1 Like
I guess this is a bit too open-ended. I think the thread in question can only interact with your code unintentionally if you happen to share a resource with that thread in an unsafe way; furthermore, to be scary, it would need to be an unsafe way that isn’t possible today. Even with the GIL, another thread can already do a lot of things, like mess with your file descriptors, stdout, or signals. And threads sharing access to any variable is already inconsistent for non-atomic ops at the Python level.
2 Likes
Can you provide a (hypothetical?) example?
Yep, CPython can switch threads in the middle of these two operations. So, there’s a problem.
2 Likes
Again, of course. But I understood that @pf_moore made the very fine point that, due to the specialisations we are discussing here (e.g. BINARY_SUBSCR_DICT), and hence the GIL, things which are nominally not thread-safe are effectively so in current CPython, because they are specialised to a single native instruction. And this, I think, only needs an explicit specification of whether or not such operations are to be considered effectively “atomic”. Otherwise, yes, these are just undiscovered bugs, currently protected by a CPython implementation detail.
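You can see the relevant bytecode yourself with the dis module (the function name below is just an example):

```python
import dis
import io

def lookup(d, key):
    return d[key]

# d[key] compiles to a single bytecode instruction (BINARY_SUBSCR,
# or a BINARY_OP on the newest CPythons), which the specialising
# adaptive interpreter can further quicken to BINARY_SUBSCR_DICT.
# Under the GIL a thread switch cannot happen inside that one
# instruction, which is what makes the lookup effectively atomic
# today -- an implementation detail, not a guarantee.
buffer = io.StringIO()
dis.dis(lookup, file=buffer)
print(buffer.getvalue())
```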
… of a library starting a background thread. Not exactly a library, but idlelib.run has
sockthread = threading.Thread(target=manage_socket,
                              name='SockThread',
                              args=((LOCALHOST, port),))
as a result of which threading.active_count()
and threading.enumerate()
return more when running under IDLE. Someone once asked on Stack Overflow why the difference. (I have no idea whether no-gil will require any change to manage_socket
, or change the chance of user code having a problem.)
import threading

THREAD_COUNT = 3
BY_HOW_MUCH = 1_000_000

class Incrementor:
    def __init__(self):
        self.c = 0

def incr(incrementor, by_how_much):
    for i in range(by_how_much):
        incrementor.c += 1

incrementor = Incrementor()
threads = [
    threading.Thread(target=incr, args=(incrementor, BY_HOW_MUCH))
    for i in range(THREAD_COUNT)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(incrementor.c)
prints 3 million when run on 3.10. Does it mean you can rely on +=
being atomic when writing Python code? No! If you run it on 3.9 it prints between 1.5 and 2 million. Soon a Faster CPython team member can swoop in and change (not break!) it again.
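The bytecode shows why the race exists at all (a pared-down version of the example above):

```python
import dis
import io

class Incrementor:
    def __init__(self):
        self.c = 0

def incr(obj):
    obj.c += 1

# obj.c += 1 is several instructions: load the attribute, add one,
# store it back.  A thread switch between the load and the store
# drops an update, which is why the 3.9 run comes up short.
buffer = io.StringIO()
dis.dis(incr, file=buffer)
print(buffer.getvalue())
```

Whether a given CPython version happens to make the whole sequence uninterruptible is exactly the kind of implementation detail that can change between releases.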
BTW, if Java and .NET developers can have free threads, then so can we.
4 Likes
The word “can” here translates to (potentially) decades of work, which was the case for Java: