The Standoff

The locking system from Part 37 eliminated race conditions. Velociraptors placed busy stones, worked on their fields, and removed the stones when done. For three weeks, the system ran without incident. Then, on a busy morning during the post-Hrijpa accounting rush, two velociraptors froze.

The Freeze

Plekva was running a job that required two steps: first, update the herd count on Bretchka's record (field A), then update the feed allocation on the same customer's supply record (field B). She locked field A — placed her busy stone — and began the count update.

Grontch was running a different job on the same customer: first, update the feed allocation (field B), then adjust the herd count to reflect animals sold (field A). He locked field B and began the feed update.

Plekva finished her count update and moved to step 2: update field B. She checked field B for a busy stone. There was one — Grontch's. She waited.

Grontch finished his feed update and moved to step 2: update field A. He checked field A for a busy stone. There was one — Plekva's. He waited.

Plekva was waiting for Grontch to release field B. Grontch was waiting for Plekva to release field A. Neither could proceed, and neither would give up the lock they already held: each tablet said to remove the busy stone only once the whole task was complete, and completing the task required the other field.

They stood there for forty minutes before Blortz noticed.

Blortz: They are both waiting for each other. Neither will ever move.

Glagalbagal: That cannot be right. They are following their instruction tablets.

Blortz: They are following their instruction tablets perfectly. That is the problem. Each tablet says "lock what you need, do your work, then unlock." But each tablet assumes it can acquire all the locks it needs. When two tablets need the same locks in different orders, they can trap each other.
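The standoff can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the function name `simulate_standoff`, the `patience` timeout, and the barrier that forces the unlucky timing are all invented for illustration and are not part of GlagalCloud. The timeout exists so the demonstration ends; the real velociraptors had no such limit.

```python
import threading

def simulate_standoff(patience=0.5):
    """Two workers lock the same two fields in opposite orders."""
    field_a = threading.Lock()          # herd count
    field_b = threading.Lock()          # feed allocation
    in_step = threading.Barrier(2)      # assumption: forces the unlucky timing
    outcome = {}

    def run_job(name, first, second):
        with first:                     # place a busy stone on the first field
            in_step.wait()              # both workers now hold their first lock
            # A real worker would wait forever; the timeout lets the demo end.
            got_second = second.acquire(timeout=patience)
            in_step.wait()              # keep the first lock until both attempts end
            if got_second:
                second.release()
            outcome[name] = "finished" if got_second else "stuck"

    workers = [
        threading.Thread(target=run_job, args=("Plekva", field_a, field_b)),
        threading.Thread(target=run_job, args=("Grontch", field_b, field_a)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return outcome

if __name__ == "__main__":
    print(simulate_standoff())
```

Both entries come back as "stuck": each worker did exactly what its instructions said, and neither could take its second lock.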

The Deadlock

Blortz drew the situation on the cave wall:

Plekva holds: A. Needs: B. Grontch holds: B. Needs: A.

This was a circular dependency. Plekva's progress depended on Grontch, and Grontch's progress depended on Plekva. The circle had no exit. This was, in the terminology Glagalbagal was developing, a deadlock.

Glagalbagal: We need to break the circle. How?

Blortz: Two options. Detect it after it happens, or prevent it from happening.

Detection

The detection approach was simple in concept. A supervising velociraptor — Blortz proposed the role of "lock watcher" — would periodically scan all active locks and check for circular dependencies. If Plekva held A and waited for B, and Grontch held B and waited for A, the lock watcher would identify the circle and force one of them to release its lock, abandoning its in-progress work.
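A lock watcher's scan might look something like the following sketch. It assumes the watcher can see two things: which fields each worker holds, and which single field each worker is waiting for. The names `find_deadlock`, `holds`, and `waits_for` are hypothetical, chosen only for this example.

```python
def find_deadlock(holds, waits_for):
    """Return the workers that form a circle of waiting, or [] if there is none.

    holds:     worker -> set of fields the worker currently has locked
    waits_for: worker -> the one field the worker is waiting to lock
    """
    # Work out who owns each locked field.
    owner = {field: worker for worker, fields in holds.items() for field in fields}

    for start in waits_for:
        path = []
        worker = start
        # Follow the chain: this worker waits for a field, which someone holds...
        while worker is not None and worker not in path:
            path.append(worker)
            wanted = waits_for.get(worker)
            worker = owner.get(wanted)
        if worker is not None:
            # The chain came back to a worker already on the path: a circle.
            return path[path.index(worker):]
    return []

if __name__ == "__main__":
    holds = {"Plekva": {"A"}, "Grontch": {"B"}}
    waits_for = {"Plekva": "B", "Grontch": "A"}
    print(find_deadlock(holds, waits_for))
```

The walk is this simple only because each worker waits for at most one field at a time; a system where workers can wait on several resources at once would need a more general cycle search.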

The abandoned work would need to be restarted from the last checkpoint (Part 32). This was wasteful — the velociraptor might have nearly completed its task — but it was better than both velociraptors waiting forever.

Glagalbagal: Which one do we force to release?

Blortz: The one that has done less work. If Plekva is 90% complete and Grontch is 20% complete, force Grontch to restart. Less work is lost.

The lock watcher checked for deadlocks every ten minutes. When one was detected, the velociraptor with the least progress was interrupted, its locks released, its task re-queued. It was inelegant but effective.
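Blortz's rule for choosing whom to interrupt can be sketched as below. The progress figures are the ones from his example (90% and 20%); the function names, the `holds` table, and the re-queue list are assumptions made for illustration, and restarting from a checkpoint is not modelled.

```python
def choose_victim(cycle, progress):
    """Pick the deadlocked worker that has completed the least work."""
    return min(cycle, key=lambda worker: progress[worker])

def break_deadlock(cycle, progress, holds, requeue):
    """Interrupt the victim: release its locks and put its task back in the queue."""
    victim = choose_victim(cycle, progress)
    released = holds.pop(victim, set())   # the victim's busy stones are removed
    requeue.append(victim)                # its task will start again later
    return victim, released

if __name__ == "__main__":
    holds = {"Plekva": {"A"}, "Grontch": {"B"}}
    progress = {"Plekva": 0.9, "Grontch": 0.2}
    requeue = []
    print(break_deadlock(["Plekva", "Grontch"], progress, holds, requeue))
```

Once Grontch's lock on field B is released, Plekva can take it and finish; Grontch's task runs again afterwards.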

[Illustration: Two velociraptors stand frozen on opposite sides of a shelf, each with a busy stone on one basket and a claw reaching toward the other's basket. Between them, a third velociraptor (the lock watcher) examines the situation with a detective-like magnifying glass, about to break the deadlock.]

Prevention

Detection was reactive — it fixed deadlocks after they happened. Glagalbagal wanted to prevent them entirely.

Blortz analysed the deadlock. The circular dependency arose because the two velociraptors acquired locks in different orders: Plekva locked A then B, Grontch locked B then A. If both had locked in the same order (say, A first, then B), the deadlock could not have formed. One of them would have locked A, and the other would have waited at A while holding nothing. Field B would have stayed free for whoever held A, who could then finish the task and release both locks.

The rule was surprisingly simple:

All locks must be acquired in a fixed global order. If a task needs locks on fields A and B, it must lock A before B, regardless of which field it modifies first.

The global order was defined alphabetically by field identifier (or, more precisely, by the numeric code of each field). A velociraptor needing fields A, B, and D would lock them in that order: A, then B, then D. If another velociraptor needed D and A, it would still lock A first, then D — even if its computation started with D.
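The rule can be sketched in Python as a small helper that always takes locks in sorted order, whatever order the work itself follows. The helper name `locked` and the `run_jobs` demonstration are hypothetical; sorting the field names as strings stands in for ordering by each field's numeric code.

```python
import threading
from contextlib import ExitStack, contextmanager

@contextmanager
def locked(locks, needed):
    """Acquire every needed lock in the fixed global order, release all on exit."""
    order = sorted(set(needed))            # global order: by field identifier
    with ExitStack() as stack:
        for field in order:
            stack.enter_context(locks[field])
        yield order                        # ExitStack releases in reverse order

def run_jobs(rounds=200):
    locks = {"A": threading.Lock(), "B": threading.Lock()}
    ledger = {"A": 0, "B": 0}

    def job(work_order):
        for _ in range(rounds):
            with locked(locks, work_order):
                # The work order is free to differ from the locking order.
                for field in work_order:
                    ledger[field] += 1

    workers = [
        threading.Thread(target=job, args=(["A", "B"],), daemon=True),  # Plekva
        threading.Thread(target=job, args=(["B", "A"],), daemon=True),  # Grontch
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join(timeout=5)
    finished = all(not worker.is_alive() for worker in workers)
    return ledger, finished

if __name__ == "__main__":
    print(run_jobs())
```

Both jobs complete every round, even though the second one works on B before A, because both of them lock A first.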

Grontch: I need to update the feed allocation first. But I have to lock the herd count first, even though I do not need it yet?

Glagalbagal: You lock the herd count, then the feed allocation. You update the feed allocation first — the lock does not dictate the order of work, only the order of locking. When both updates are done, you release both locks.

Grontch: This seems unnecessarily rigid.

Glagalbagal: It prevents you from standing motionless for forty minutes while your colleague does the same. Rigidity, in this case, is the point.

The Cost of Safety

The prevention rule worked. No deadlocks occurred after its introduction. But it had a cost: velociraptors sometimes held locks longer than necessary, because they acquired all needed locks at the beginning rather than one at a time as needed. This increased wait times for other velociraptors who needed those fields.

The trade-off was familiar by now. Detection was cheaper in the normal case (locks held only as long as needed) but expensive when deadlocks occurred (work was lost and restarted). Prevention was more expensive in the normal case (locks held longer) but eliminated deadlocks entirely. Glagalbagal chose prevention for critical customer data and detection for routine operations.

Blortz: Every concurrency problem is a choice between correctness and performance. You have been making this choice since the beginning — but now the stakes are higher because six velociraptors are working simultaneously, and any two of them might need the same basket.

Glagalbagal: The system was simpler when I did everything myself.

Blortz: The system was simpler. It was also slower, smaller, and less profitable. Complexity is the price of scale.

Deadlocks happen in computer systems for the same reason as in GlagalCloud: two processes each hold a resource the other needs, and neither can proceed. Operating systems, databases, and distributed systems all have to deal with them. The prevention strategy Glagalbagal used, always acquiring locks in the same global order, is a real technique called lock ordering. It works because a circular wait can only form if at least one participant requests a lock that comes earlier in the order than one it already holds. If everyone follows the same order, no circle can form. The detection strategy is real too: databases like PostgreSQL detect deadlocks and abort one of the conflicting transactions.

Think about a four-way intersection with no traffic lights, where each car waits for the car to its right. All four cars wait forever: a deadlock. How do traffic rules prevent this? What is the "global order" equivalent in traffic?