Russell's Rules for Highly Concurrent Software
mutable state is the devil's handiwork
This is the first in a series of posts I plan to write about immutability (edit: yay, follow-through!). In this post, I want to cover some of the reasons you'd want to use immutability and some of the awesome things that fall naturally out of an immutable design. Future posts will cover making the compiler generate immutable classes with builders and mutation accessors, how copy-on-write immutable lists work, and so on.
OK, let's make some assumptions:
- We need to load 10-50GB of data into memory
- Hundreds or thousands of users may work with this data interactively
- Tens or hundreds of thousands of automated requests modify this data continuously
- Real-time validation of changes and property calculations are happening and can, in theory, touch any or all of this data to produce a result. Demand-loading from the database would slow the system by several orders of magnitude; the data must live in memory.
- Exporting to external systems can take hours in some cases, and requires a consistent view of the data
When you start to talk about data at this scale, some of the traditional strategies are simply unworkable.
You can't use a simple lock since everything would grind to a halt while exporting, not to mention the contention among tens of thousands of requests.
Reader-Writer locks just shift the problem; start one two-hour export and no one can make any changes for the duration.
You can't copy the data; the amount of system memory required just goes out of control. Plus how do you deal with mutations? If you pile them up, then that copy of the data is essentially useless for any other readers (who presumably want to see the latest changes). If you do it cross-process, things get even worse, requiring synchronization events and introducing additional latency.
If we take a step back for a minute, what would the ideal situation be?
We'd want a system that let us take arbitrary snapshots of both the data and the metadata, allowing long-running read processes to operate off the snapshot, without interrupting anyone else.
Once we are doing snapshots, we'd want this system to support writable non-permanent snapshots to allow us to run what-if scenarios, pre-validate changes, and other similar operations. And of course we would also want to allow writable permanent snapshots, which could either be committed if all validation rules pass, or thrown away if anything fails.
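To make that a little more concrete, here's a rough sketch of what the shape of such a snapshot API might look like. None of these types exist in any library; the names and members are purely illustrative:

```csharp
// Purely illustrative placeholder entity.
public sealed class SomeEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Read-only snapshot: any number of long-running readers
// can hold one without blocking anybody else.
public interface ISnapshot
{
    SomeEntity Get(int id);

    // Start a writable (non-permanent) snapshot for what-if
    // scenarios, pre-validation, or staging a real change.
    IWritableSnapshot BeginChanges();
}

// Writable snapshot: private to the thread that asked for it.
public interface IWritableSnapshot
{
    void Put(SomeEntity entity);

    // Run validation and publish the change atomically...
    bool TryCommit();

    // ...or just throw it away.
    void Discard();
}
```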
This gets us part of the way there, but the natural next move is to make all the data structures immutable. There is just too much risk that someone makes a mistake and some piece of code mutates what is supposed to be a read-only snapshot, ruining all our hard work. Our design should be a pit of success, preventing us from doing the wrong thing if at all possible.
Once everything is immutable, there's no need for locking or blocking. All read-only threads can share the same global read-only snapshot. We'll employ lock-free algorithms (Interlocked.CompareExchange is magic). Any thread that wants to mutate data can do so by asking the read-only snapshot to vend a read-write builder that can be freely mutated. The thread can throw that mutated copy away, or ask our central repository to commit it atomically. (If writing to a database, you do have the classic coordinated-transactions problem, which I hope to cover in another post.)
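Here's a minimal sketch of what that central repository's commit path might look like. Repository&lt;TSnapshot&gt; and Update are hypothetical names (only Interlocked.CompareExchange and Volatile.Read are real APIs), and the snapshot type is assumed to be immutable:

```csharp
using System;
using System.Threading;

// Minimal sketch of lock-free snapshot publication. The snapshot
// itself is immutable; the only thing that ever changes is which
// snapshot the _current field points at.
public sealed class Repository<TSnapshot> where TSnapshot : class
{
    private TSnapshot _current;

    public Repository(TSnapshot initial) => _current = initial;

    // Readers just grab the current reference; no lock needed,
    // because the snapshot they get can never change underneath them.
    public TSnapshot Current => Volatile.Read(ref _current);

    // Apply a pure "build a new snapshot from the old one" function
    // and retry until our compare-and-swap wins.
    public TSnapshot Update(Func<TSnapshot, TSnapshot> mutate)
    {
        while (true)
        {
            TSnapshot original = Volatile.Read(ref _current);
            TSnapshot proposed = mutate(original);

            // Publish only if nobody else committed in the meantime.
            if (Interlocked.CompareExchange(ref _current, proposed, original) == original)
                return proposed;
        }
    }
}
```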
Immutability is like a virus. Once it starts touching some of your data structures, you quickly realize it has to infect all of them. You'll fight it, but eventually acceptance comes and you realize how much easier the rest of your code is and how easy it becomes to reason about correctness. Just by looking at a few small bits of code, I can be 100% certain that no changes in state leak outside my thread, and that all the changes to my in-memory objects and my database either commit or roll back in a coordinated way, but without having to remember how to undo everything I already did in memory if the database commit fails.
Great, so everything's perfect, right? Not quite. Simply copying data to create a snapshot is fine if your object is 100 KB. It doesn't work quite as well with a 50 GB dataset. We reasoned out a logical solution that seems like it ought to solve all our problems and yields a clean design that is hard to misuse. It would be a shame to let something as simple as the size of RAM prevent us from using it.
That's where copy-on-write strategies come in. The concept isn't new; various filesystems have been using it for years. (It's all filesystems all the way down.) In this instance, we'll make use of someone else's hard work and Install-Package Microsoft.Bcl.Immutable. These immutable lists, dictionaries, etc. use copy-on-write balanced binary trees under the covers. That means if I need to mutate a tree with a hundred million objects in it, I only create enough new nodes to reach the leaf being mutated. The vast majority of all snapshots will share the vast majority of their data. This is another wonderful thing we get by accepting that all our data structures need to be immutable.
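For example, here's what that structural sharing looks like with the ImmutableDictionary from the System.Collections.Immutable namespace that the package provides (the keys and values are made up for illustration):

```csharp
using System;
using System.Collections.Immutable;

class StructuralSharingDemo
{
    static void Main()
    {
        // Build a large immutable dictionary once, using a builder so the
        // initial load doesn't allocate a whole new tree per insert.
        var builder = ImmutableDictionary.CreateBuilder<int, string>();
        for (int i = 0; i < 1_000_000; i++)
            builder.Add(i, "value " + i);
        ImmutableDictionary<int, string> snapshot = builder.ToImmutable();

        // "Mutating" one entry returns a brand-new dictionary, but only the
        // handful of tree nodes on the path to that key are copied; every
        // other node is shared with the original snapshot.
        ImmutableDictionary<int, string> updated = snapshot.SetItem(42, "changed");

        Console.WriteLine(snapshot[42]); // value 42 (old snapshot untouched)
        Console.WriteLine(updated[42]);  // changed
    }
}
```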
Next time on immutability: Creating objects and builders by hand sucks, so let's make the compiler do it.
This blog represents my own personal opinion and is not endorsed by my employer.