Part 38 of 58

The Surprise

By Madhav Kaushish · Ages 12+

The language model had been trained on one task: predict the next token. It had never been trained to translate between dialects. It had never been trained to summarise documents. It had never been trained to answer questions about history. And yet.

The Translation

Hjentova was testing the model on a routine task — generating continuations for partial texts — when she tried something on impulse.

She wrote a prompt: "The following is a passage in Northern dialect, followed by its translation into Standard Sonhlagoti. Northern: 'Vrothka glem portzhaj en kvalik dremja.' Standard:"

The model produced: "'The merchant from the port city arrived at dawn.'"

Hjentova checked. The translation was correct. She had never trained the model on translation pairs. She had never labelled any text as "Northern dialect" or "Standard Sonhlagoti." The model had simply seen both dialects in the archive — Northern texts and Standard texts, sometimes discussing the same events — and had learned enough about the correspondence between them to translate when prompted in the right format.
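The trick Hjentova stumbled on can be sketched in a few lines: translation is framed as ordinary text continuation, so the "translator" is nothing but a prompt whose most likely next tokens are the translation. This is a minimal illustration (the function and its wording are invented for this sketch; nothing here runs the story's model):

```python
def translation_prompt(source_dialect, target_dialect, text):
    """Frame translation as next-token prediction: the model is asked to
    continue a text whose natural continuation is the translation."""
    return (
        f"The following is a passage in {source_dialect}, "
        f"followed by its translation into {target_dialect}. "
        f"{source_dialect}: '{text}' {target_dialect}:"
    )

prompt = translation_prompt(
    "Northern dialect", "Standard Sonhlagoti",
    "Vrothka glem portzhaj en kvalik dremja.")
# Feeding this prompt to a next-token predictor and taking its continuation
# yields the translation, even though no translation pairs were ever labelled.
```

The only "engineering" is in the string: the model does exactly what it was trained to do, and the prompt makes the translation the most probable continuation.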

Hjentova: I did not ask you to build a translator.

Trviksha: I did not build a translator. I built a next-word predictor. The translation capability is... a side effect.

The Pattern

Hjentova tested more tasks that the model had never been explicitly trained for.

Summarisation: Given a long passage followed by "Summary:", the model produced reasonable summaries.

Question answering: Given a passage followed by "Question: [question]. Answer:", the model answered correctly about seventy percent of the time.

Sentiment: Given a text followed by "This passage expresses: ", the model predicted "satisfaction" or "complaint" or "concern" with surprising accuracy.

None of these tasks had been part of the training. No one had ever handed the model a labelled dataset for summarisation, question answering, or sentiment. But the archive it was trained on contained texts that naturally included summaries, questions and answers, and evaluative language. The model had absorbed these patterns, and when prompted with the right format, it produced the corresponding output.
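Each of these "new" tasks is just a different prompt template wrapped around the same next-token predictor. A small sketch of the idea (the template names and exact wording are illustrative, not taken from the story's system):

```python
# Every task becomes a string the model is asked to continue.
TEMPLATES = {
    "summarise": "{passage}\nSummary:",
    "answer":    "{passage}\nQuestion: {question}. Answer:",
    "sentiment": "{passage}\nThis passage expresses: ",
}

def make_prompt(task, **fields):
    """Build a task-specific prompt from a template; the model's job is
    unchanged -- it only ever predicts what text comes next."""
    return TEMPLATES[task].format(**fields)
```

The point is that the model is not switched into a "summarisation mode"; the prompt format alone selects which learned pattern gets completed.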

Blortz: How is this possible? It was only trained to predict the next word.

Trviksha: Predicting the next word is harder than it sounds. To predict what comes after "Summary:", the model must understand what a summary is. To predict what comes after a question, it must understand the question. The prediction task, applied to enough diverse text, forces the model to develop a surprising range of capabilities — because those capabilities are useful for prediction.

A single transformer network (depicted as a row of velociraptors at workstations) with multiple speech bubbles coming out of it. One bubble shows a translation, another a summary, a third an answer to a question, a fourth a sentiment label. Hjentova stands beside the network looking astonished, holding up a sign that reads "Trained on: predict next word." Trviksha shrugs.

Few-Shot Learning

The most surprising capability emerged when Trviksha gave the model examples in the prompt itself.

She had never trained the model on plant classification. But she wrote a prompt:

"Classify the following plants. Vrentjak: flowering. Ghorla: non-flowering. Plombik: flowering. Tressvaj: non-flowering. Krinthol:"

The model predicted: "flowering."

It was correct. Trviksha had included information about Krinthol deep in the agricultural reports. But the remarkable thing was not that the model knew the answer — it was that the model understood the task from the examples alone. Four examples of classification, and the model inferred that it should classify the fifth.

Trviksha: I did not tell it to classify. I showed it four examples of classification and gave it a fifth to complete. It inferred the task from the pattern.

Glagalbagal: It learned from examples inside the prompt? Without any training?

Trviksha: Without any additional training. The model's weights did not change. It read the examples, recognised the pattern, and applied it. The "learning" happened entirely within the forward pass — the model used its existing knowledge of patterns to recognise a new pattern in the prompt.

She tested with two examples instead of four. The accuracy dropped but was still above chance. With zero examples — just the instruction "Classify the following plant:" — the model sometimes got it right and sometimes did not, depending on how clear the instruction was.
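The few-shot prompts above all share one shape: labelled examples laid out as "input: label" pairs, with the final label left blank for the model to predict. A minimal sketch of that layout, using the plant examples from the story (the helper function is invented for illustration):

```python
def few_shot_prompt(examples, query):
    """Lay out labelled examples as 'input: label.' pairs, then the query
    with its label left blank -- next-token prediction fills it in."""
    lines = [f"{name}: {label}." for name, label in examples]
    lines.append(f"{query}:")
    return " ".join(lines)

examples = [("Vrentjak", "flowering"), ("Ghorla", "non-flowering"),
            ("Plombik", "flowering"), ("Tressvaj", "non-flowering")]
prompt = few_shot_prompt(examples, "Krinthol")
# prompt == "Vrentjak: flowering. Ghorla: non-flowering. "
#           "Plombik: flowering. Tressvaj: non-flowering. Krinthol:"
```

Trimming the `examples` list to two entries, or to none, reproduces Trviksha's experiment: the prompt format stays the same, only the number of demonstrations changes.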

Blortz: More examples in the prompt means better performance. But the model is not training on these examples. It is not adjusting its weights. How?

Trviksha: I am not entirely sure. The attention mechanism allows the model to look at the examples and the new input simultaneously. Somehow, the pattern of "input, label, input, label, input, ???" is a pattern the model has learned to complete — because the training data contains many instances of similar structures. Lists, tables, categorised entries.

Blortz: You are saying you do not fully understand why this works.

Trviksha: Correct. I know that it works. I can measure that it works. I have a rough hypothesis about why. But the detailed mechanism — exactly how the attention weights implement this in-context task recognition — is not something I can explain precisely. The model is doing something clever, and I built it, and I do not fully understand what it is doing.

This was an uncomfortable admission for someone who had built the system from pebbles up. But it was honest. The model's capabilities had outgrown Trviksha's ability to explain them mechanistically. She could describe what it did. She could not always explain how.