Immutability and Thread Safety
it depends on what the definition of "is" is
When I talk about immutable collections and immutable objects being thread-safe, what do I really mean by that?
Part of what makes traditional multithreaded programming so difficult is that you never have truly accurate information about the state of things. If you ask a list "are you empty?" and the list says "no", you can still get an empty list error when you ask for an item. Why? Because some other thread mutated the list by removing its items in between. In that kind of reality, adding an item to a list can appear to produce an empty list. Removing the last item from a list can appear to produce a list with 100 items on it. If you are thinking that sounds like a violation of causality, you are absolutely correct.
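To make the "are you empty?" problem concrete, here is a minimal Python sketch. The names (items, reader, remover) are made up for illustration, and the two Event objects exist only to force the unlucky interleaving every time; in a real program it would happen by chance, and rarely.

```python
import threading

items = ["wheel"]                  # shared, mutable list
checked = threading.Event()        # set once the reader has seen "not empty"
removed = threading.Event()        # set once the other thread has emptied the list
outcome = []

def reader():
    if items:                      # "are you empty?" -> "no"
        checked.set()
        removed.wait()             # the other thread runs in this gap
        try:
            outcome.append(items.pop())
        except IndexError:         # the answer we were given is already stale
            outcome.append("empty list error")

def remover():
    checked.wait()                 # wait until the reader has done its check
    items.clear()                  # mutate between the check and the use
    removed.set()

threads = [threading.Thread(target=reader), threading.Thread(target=remover)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(outcome)
```

The check and the use are each fine on their own. The bug lives in the gap between them, which is exactly the part you cannot see by reading reader() in isolation.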
The boring old world of single-threaded programming more closely resembles our intuitive understanding of how the real world works.
If I'm in Dallas and you're in New York and I start building a model airplane, I can know with certainty that you are not manipulating that same model airplane. Sure, I can mail it to you, and then you can make changes and send it back (message passing). But we can't both be in possession of it at the same time. Furthermore, you won't attempt to paint the wings red before I've glued them on because the actions we take are ordered. A happened, which caused B, which caused C. If I see the results of C, I know that A and B already occurred. If we think about our programs as an object, we can even reason about pre-emptive multitasking by imagining that the operating system walks into the room, takes the model away, then comes back later and returns it. It causes us no confusion when the wheel is missing - the operating system did something to it while it was not in our possession.
A multithreaded program is like a universe in which events are not globally ordered in time. Objects can be in multiple places simultaneously. Cause does not necessarily precede effect. In such a world, as you begin to paint the wings red, the tail of the plane appears out of nowhere. Sometimes a sticker appears under your brushstroke. Sure, in your local reference frame time appears to be marching forward, but events happen apparently at random due to actions in some other reference frame.
Is living in such a reality even possible? Perhaps we have a tag tied to the model airplane. When I am working on the model, I pencil my name on the tag. When you sit down to paint the wings, you first check the tag. If it is blank, you pencil your name in. If you see my name, then you wait. When I'm done, I erase my name and suddenly you see the tag is blank, pencil your name in, and proceed.
This system works perfectly fine; it just requires that every single human, animal, and physical process in the universe agree to abide by the tag rules. But what if your neighbor Bob shares the paintbrush with you?
You sit down to paint the wing, you pencil your name on the tag, then you reach for the brush but Bob's name is on the brush's tag. You see the bristles have paint on them and are moving back and forth. Now I sit down to work on the model but I see your name on the tag. It seems we have some lock contention. I am waiting on you, and you are waiting on Bob. Neither of us can get any work done until Bob is done.
If Bob now wants to paint the fuselage, he needs the model's tag, which has your name on it, and we have a deadlock. Bob is waiting on you and you are waiting on him. Proceed to wait until the heat death of the universe.
Of course, we can add a rule that you can't wait to put your name on a tag for more than an hour. Sure, we waste a lot of time, but at least we'll eventually give up. But if I happen to get the tag first and take the wings off to glue some extra pieces on them, then when you finally get your turn on the tag you'll discover that you can't work because the wings are gone, so you give up the tag. When you check again, you find the wings are back but Bob has the brush again. This is getting really annoying.
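The tags map fairly directly onto locks, and the one-hour rule onto a timed acquire. The sketch below is only an illustration: model_tag, brush_tag, and paint_wings are invented names, the hour is shrunk to a fraction of a second, and Bob is simulated by the main thread simply holding the brush lock.

```python
import threading

model_tag = threading.Lock()   # the tag tied to the model airplane
brush_tag = threading.Lock()   # the tag tied to the shared paintbrush

def paint_wings(timeout):
    """Take the model tag, then the brush tag; give up if the brush stays busy."""
    with model_tag:
        if not brush_tag.acquire(timeout=timeout):
            # Leaving the with-block erases our name from the model tag,
            # so nobody is left waiting on us forever.
            return "gave up"
        try:
            return "painted"
        finally:
            brush_tag.release()

brush_tag.acquire()                      # Bob already has the brush
first_try = paint_wings(timeout=0.05)    # we wait, then give up
brush_tag.release()                      # Bob finishes and erases his name
second_try = paint_wings(timeout=0.05)   # now both tags are free

print(first_try, second_try)
```

The timeout turns a potential deadlock into wasted time and a retry. It does nothing about the deeper problem: the code is only correct if every other piece of code that touches the model or the brush follows the same tag rules.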
We have another problem. If Bob wants to paint the wings, we have a race condition; whoever sits down first paints the wings their chosen color. Sure, you can paint over Bob's ugly green, but he might just paint over your red in turn.
We're not even addressing whose name is on your apartment tag, or the building elevator's tag, vehicle tags, manhole cover tags, how to put tags on food, and so on. Everyone spends more and more of their waking hours just checking tags; God forbid someone writes their name on a tag and goes to sleep! The only way to be certain that the lights stay on, your apartment is yours, etc. is to start from the highest level tag and work your way down. After all, if someone grabs the power substation tag and shuts it down you'll be painting in the dark. If someone grabs your apartment tag, you'll be homeless. Reasoning about the world requires everyone to cooperate, abide by the rules, and keep a huge list of tags in their head. If anyone makes a single mistake, the entire universe could end.
Immutability is different; nothing in the world appears to change unless we change it. That's a much simpler way to reason about things. Unlike the tag system, we aren't relying on everyone to follow the rules to the letter at every moment in time and we don't have to consider the state of the entire universe. In that system, we both have a copy of the model airplane. You paint the wings red while I get busy gluing. Ah, life is good!
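As a rough illustration of "we both have a copy", here is a sketch using a frozen dataclass. ModelAirplane and its fields are hypothetical names chosen to match the analogy; the point is only that a change produces a new value and leaves everyone else's value alone.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class ModelAirplane:
    wing_color: str = "unpainted"
    wings_glued: bool = False

original = ModelAirplane()

# Each of us "changes" the model by deriving a new value from the original.
yours = replace(original, wing_color="red")    # you paint the wings
mine = replace(original, wings_glued=True)     # I get busy gluing

# Nobody can reach into a model that someone else is holding.
try:
    original.wing_color = "green"
    in_place_edit_rejected = False
except FrozenInstanceError:
    in_place_edit_rejected = True

print(original, yours, mine, sep="\n")
```

Nothing here needs a tag. Whatever you do to your value, mine looks exactly as it did when I last checked.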
Of course there is no free lunch; we have two major problems in this world. First, we can end up with a lot of duplicated objects! One way to reduce that problem is to make sure that so long as only one person is using an object, we don't bother making a copy. Further, if multiple people are merely inspecting an object then they can all use the same copy. It is only when multiple people make changes simultaneously that we need to bother actually copying anything, even though logically we pretend that everything is copied.
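One common way to avoid real copies is structural sharing: a new version points at the unchanged parts of the old one instead of duplicating them. The sketch below uses a tiny immutable linked list; Node and push are illustrative names, not taken from any particular library.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    head: str
    tail: Optional["Node"]       # the rest of the list, shared and never copied

def push(lst, item):
    """Return a new list with item in front; the old list is untouched."""
    return Node(item, lst)

base = push(push(None, "fuselage"), "wings")

# Readers that only inspect can all hold the very same object.
reader_a = base
reader_b = base

# Writers each get a logically separate list that physically shares base.
yours = push(base, "red paint")
mine = push(base, "decals")

shared = yours.tail is mine.tail     # both versions reuse the same nodes
print(shared, base.head, yours.head, mine.head)
```

Each "copy" costs one new node. Because nothing reachable from base can ever change, sharing it between any number of readers and writers is safe.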
The second problem is what to do when we both finish our work on the model airplane. We've made changes and somehow need to merge them. As long as the changes are compatible, that's easy enough (take the wings off your model and put them on mine). But if they aren't, then we need some kind of policy to determine who wins. In that world, you may paint the wings, then get a text message saying "So sorry, your changes conflicted and were undone". When you look down, the wings are now painted black. Bummer, man: optimistic transaction commit failure.
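The "so sorry, your changes conflicted" message can be sketched as a compare-and-set on a single published reference. Python has no built-in atomic reference, so the Ref class below is a hypothetical stand-in that uses a lock internally; it is similar in spirit to compare-and-swap primitives such as .NET's Interlocked.CompareExchange. Model, Ref, and update are all illustrative names, and "first commit wins" is just one possible policy.

```python
import threading
from typing import NamedTuple

class Model(NamedTuple):
    wing_color: str
    decals: int

class Ref:
    """Hypothetical atomic reference: holds the currently published value."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Publish new only if nobody has published since we read expected.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

def update(ref, change):
    """Retry loop: re-read the published value and reapply until the commit sticks."""
    while True:
        current = ref.get()
        if ref.compare_and_set(current, change(current)):
            return

published = Ref(Model("unpainted", 0))

# We both take a snapshot of the same published model and work privately.
your_snapshot = published.get()
my_snapshot = published.get()

my_commit = published.compare_and_set(
    my_snapshot, my_snapshot._replace(wing_color="black"))   # I publish first
your_commit = published.compare_and_set(
    your_snapshot, your_snapshot._replace(wing_color="red")) # yours is now stale

# A compatible change can simply be reapplied on top of whatever is published.
update(published, lambda m: m._replace(decals=m.decals + 1))

print(my_commit, your_commit, published.get())
```

The failed commit costs you the paint job but nothing else: your red model was a private value, so discarding it is the whole rollback. The only shared, mutable thing left to reason about is the one reference being published.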
Immutability gives us a different kind of thread safety than what we are used to. The key is greatly narrowing the scope of the problem into one we can reason about much more clearly. Instead of having to keep the entire state of the universe in our heads (making sure to be extra careful with every single object we might possibly depend on or touch), we only need to consider what to do when we want to make changes visible to other people (when we publish them).
Of course, the facts that related changes start to become coordinated transactions and that rollbacks of in-memory mutations become trivial are just icing on the cake.
Part 1: Russell's Rules For Highly Concurrent Software
Part 2: Immutability Part 2
Part 3: ContextLocal
Part 4: On Rediscovering the Classic Coordinated Transaction Problem
Part 5: Interlocked.CompareExchange
This blog represents my own personal opinion and is not endorsed by my employer.