Part 15 of 58
The Sleeping Velociraptor
By Madhav Kaushish · Ages 12+
The weight penalty helped. But Trviksha noticed another problem. Even with moderate weights, the network sometimes came to rely too heavily on a single hidden velociraptor. If that velociraptor's weights happened to correlate well with a large chunk of the training data, the network would route most of its prediction through that one unit, leaving the others underused.
Then an accident pointed toward a solution.
The Accident
During one training session, Grinjka fell ill. Not metaphorically — the velociraptor developed a fever and could not report for work. The network had to process training patients that day with only nineteen of its twenty hidden velociraptors functioning.
Trviksha expected the predictions to suffer. They did — slightly. Training accuracy dropped from 91% to 87% for the session. But when she tested on held-out data the next day, with all twenty velociraptors back at their stations, the test accuracy was 86% — a point higher than usual.
Trviksha: One velociraptor missed a day of training, and the network got better on new data? How?
Blortz: With Grinjka absent, the other nineteen velociraptors had to compensate. They could not rely on Grinjka's contribution. So they adjusted their own weights to cover the gap. The network distributed its knowledge more evenly.
Trviksha: And when Grinjka came back, the network had two sources of information where it previously had one — Grinjka's patterns and the backup patterns the others had developed in her absence.
The Method
Trviksha turned the accident into a deliberate practice. On each training pass, she randomly selected a handful of hidden velociraptors — say, four of the twenty — and sent them to sleep. They did nothing for that pass. The remaining sixteen processed the data and adjusted their weights. On the next pass, a different random four slept while the previous four returned.
No velociraptor could predict when it would be asked to sleep. No velociraptor could rely on always being present. The network had to function well even when any subset of its units was missing.
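Trviksha's random sleeping is, in modern terms, the technique called dropout. A minimal NumPy sketch under the story's numbers — twenty hidden units, four asleep on each training pass. The function names and the choice of activation are illustrative assumptions, not from the story:

```python
import numpy as np

N_HIDDEN = 20   # twenty hidden velociraptors
N_ASLEEP = 4    # four chosen at random to sleep on each pass

def sleep_mask(rng):
    """Return a mask that silences exactly N_ASLEEP random hidden units.

    The awake units are scaled up by 20/16 so the layer's average output
    matches test time, when everyone is awake. This rescaling is standard
    dropout practice; the story does not spell it out.
    """
    mask = np.ones(N_HIDDEN)
    asleep = rng.choice(N_HIDDEN, size=N_ASLEEP, replace=False)
    mask[asleep] = 0.0
    return mask * N_HIDDEN / (N_HIDDEN - N_ASLEEP)

def forward_train(x, W_in, W_out, rng):
    """One training pass: a fresh random four contribute nothing."""
    h = np.maximum(0.0, x @ W_in)       # hidden activations (ReLU assumed)
    return (h * sleep_mask(rng)) @ W_out

def forward_test(x, W_in, W_out):
    """Test time: all twenty awake, no mask applied."""
    h = np.maximum(0.0, x @ W_in)
    return h @ W_out
```

Because a different four sleep on every pass, no memorization strategy can count on any particular unit being present.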

Drysska: This is inefficient. You are deliberately reducing my team's capacity during training.
Trviksha: I am deliberately preventing your team from becoming fragile. If every velociraptor is essential, then the loss of any one of them is catastrophic. If no single velociraptor is essential, the network is robust.
Drysska: Robustness through redundancy. Each pattern is encoded in multiple velociraptors rather than one.
Trviksha: Exactly. And redundancy prevents memorization, because no single velociraptor can encode a specific patient if it might be asleep the next time that patient comes through. The knowledge must be shared.
The Results
She re-trained the twenty-velociraptor network with random sleeping — four velociraptors disabled on each pass, chosen at random.
Training accuracy: 87%. Lower than the 91% with the weight penalty alone. The sleeping velociraptors cost some training performance.
Test accuracy: 87%.
The gap between training and test had effectively vanished. The network could no longer memorize, because any memorization strategy depended on specific velociraptors being present — and they might not be. The only strategies that survived the random sleeping were general ones, distributed across the full team.
Blortz: You have combined two forces against overfitting. The weight penalty makes extreme individual weights expensive. The random sleeping makes reliance on individual velociraptors unreliable. Together, they push the network toward distributed, moderate, general patterns.
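Blortz's two forces can be combined in a single training step: a fresh sleep mask on each pass, plus a penalty term pulling every weight toward moderation. A hedged sketch, assuming the weight penalty is the squared-weight kind and using illustrative values for the learning rate and penalty strength (neither number comes from the story):

```python
import numpy as np

def train_step(x, y, W_in, W_out, rng, lr=0.01, lam=0.001,
               n_hidden=20, n_asleep=4):
    """One pass combining the two forces against overfitting:
    random sleeping (dropout) plus a squared-weight penalty.
    lr and lam are illustrative assumptions."""
    # random sleeping: silence four hidden units, rescale the rest
    mask = np.ones(n_hidden)
    mask[rng.choice(n_hidden, size=n_asleep, replace=False)] = 0.0
    mask *= n_hidden / (n_hidden - n_asleep)

    # forward pass (a linear hidden layer keeps the backward pass short)
    h = (x @ W_in) * mask
    pred = h @ W_out

    # squared-error gradient, propagated back through the sleep mask
    err = pred - y
    grad_out = np.outer(h, err)
    grad_in = np.outer(x, (W_out @ err) * mask)

    # gradient step plus the weight penalty's pull toward moderate weights
    W_in -= lr * (grad_in + lam * W_in)
    W_out -= lr * (grad_out + lam * W_out)
    return W_in, W_out
```

The sleeping units receive no gradient on that pass, so whatever they knew must also be learned by the sixteen who stayed awake — the redundancy Drysska named.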
Trviksha: And the remarkable thing is: at test time, when all twenty velociraptors are awake, the network performs better than the version trained with all twenty always awake. The training disruption made the final network stronger.
Glagalbagal: Like pruning a tree. Cut some branches, and the remaining ones grow thicker.
Trviksha: I would not have expected that analogy from you.
Glagalbagal: I have been a shepherd longer than you have been alive. I know about pruning.