Einlang

Chapter 11

Storage Follows Observation

A recurrence tells us which values depend on which earlier values. It does not automatically tell us how much history must be stored. Storage depends on two things together: what each point reads, and which points later code observes.

The Design Fork: Array Semantics or Family Semantics

If an indexed recurrence is treated immediately as an array allocation, the language has already chosen a storage policy. Every time point appears to need materialization because every time point has a name. That is simple, but it throws away an optimization opportunity before analysis begins.

The other choice is to treat the recurrence first as a family of values. The family has a dependency relation; storage is chosen later, after observations are known. This lets the same source definition support a rolling state, a full history, or a checkpointed evaluation depending on how the value is used.

Einlang’s recurrence notation takes the second route. Define the values first; commit to memory only after the program shows what it will observe.

This is a semantic claim first and an implementation hook second. The current compiler performs recurrence ordering and records lowered execution facts that backends can use, including vectorized and recurrence-aware execution paths. The book’s checkpointing examples describe the policy space those facts make available; they are not a promise that every checkpoint schedule is already a separate user-visible pass.

Consider:

let h[0] = init
let h[t in 1..T] = step(h[t - 1], x[t])
let final = h[T - 1]

The recurrence reads only one step back, and the range 1..T is half-open, so the last defined state is h[T - 1]. If the only observed value is final, an implementation can evaluate the sequence with a rolling slot:

previous -> current -> previous -> current

The full array h[0..T] does not have to exist as stored data, even though it exists as a family of values in the source model. This is a distinction between meaning and storage: the recurrence defines all of the values, but the runtime may not need to retain all of them.

Ask for the same recurrence twice. First ask only for the last state. Then ask for a plot of every state. The recurrence did not change, but the storage obligation did. Observation is not an afterthought; it is half of the storage story.

A Small Execution Trace

Take four time steps:

h[0] = init
h[1] = step(h[0], x[1])
h[2] = step(h[1], x[2])
h[3] = step(h[2], x[3])

If the program only asks for h[3], the runtime can overwrite:

slot = h[0]
slot = step(slot, x[1])
slot = step(slot, x[2])
slot = step(slot, x[3])

This execution does not deny that h[1] and h[2] are meaningful source values. It only says they do not need long-term storage for this observation. That distinction is hard to make when the source program is already written as mutation. The recurrence gives names to the whole family first; the storage plan comes second.

Put differently, there are two valid views of the same definition:

source family: h[0], h[1], h[2], h[3]
runtime slots: previous, current

The first is the semantic object. The second is one legal storage plan under a particular observation.
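The rolling-slot plan can be sketched in ordinary Python rather than Einlang; the `step` function and the inputs below are placeholder assumptions, not part of the chapter's language:

```python
def final_state(init, xs, step):
    """Evaluate h[t] = step(h[t - 1], x[t]) with one live slot.

    Only the last member of the family is observed, so the runtime
    never retains the intermediate states h[1] .. h[T - 2]."""
    slot = init                  # slot plays the role of h[0]
    for x in xs:
        slot = step(slot, x)     # previous -> current, overwriting
    return slot

# Toy step function (an assumption for illustration): accumulate.
step = lambda h, x: h + x
print(final_state(0, [1, 2, 3], step))  # -> 6, computed in one slot
```

The function returns the same value a fully materialized history would produce; only the storage plan differs.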

When History Is Observed

Now change only the observation:

let all[t] = h[t]

The recurrence is the same. The dependency offset is still t - 1. But the program now asks for every member of the family, so the implementation must make the whole history available.
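The changed obligation can be sketched the same way; again `step` and the inputs are illustrative assumptions:

```python
def full_history(init, xs, step):
    """Same recurrence, but every h[t] is observed, so the whole
    family must be materialized (or streamed as it is produced)."""
    history = [init]                          # h[0]
    for x in xs:
        history.append(step(history[-1], x))  # h[t] = step(h[t-1], x[t])
    return history                            # T values retained, not one slot

step = lambda h, x: h + x
print(full_history(0, [1, 2, 3], step))  # -> [0, 1, 3, 6]
```

The loop body is identical to the rolling version; only the retention decision differs, which is the point of separating definition from storage.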

This distinction is important:

definition   what values mean
observation  which values are demanded
storage      what the runtime must retain

In a loop, these concerns are often entangled with mutation. In a recurrence, they can be discussed separately. That separation is useful even when the final runtime still lowers the recurrence to a loop.

Observation is not only a model output. Debugging can observe a value as well:

let h[0] = init;
let h[t in 1..T] = step(h[t - 1], x[t]);
print(h);

Before the print, an implementation may have room to keep only a rolling state if no later expression needs the full history. After print(h), the program has asked to see the family. That request changes what must be evaluated and materialized. A debugging statement is therefore part of the storage story.

This is why printing is more subtle in a language of formulas. Printing is not only display. It is an observation boundary. It turns an expression, a recurrence family, or a derivative request into a concrete witness.

For a smaller example:

let z[i] = x[i] * x[i];
print(z);
print(shape(z));

The first line defines a family of values. The print asks the runtime to show that family. The shape print asks for a different witness: not the values, but the coordinate extent. Both observations reduce implementation freedom because the program has made a demand.

A Wider Window

Some recurrences need more history:

let y[0] = a
let y[1] = b
let y[t in 2..T] = y[t - 1] + y[t - 2]

If only the final value is observed, a two-value rolling window is enough. The source makes that visible through the offsets t - 1 and t - 2.

If a later expression reads every y[t], the storage plan changes. The meaning of the recurrence does not.
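For the final-value observation, the two offsets translate directly into a two-slot window. A Python sketch, using addition as a stand-in combiner (an assumption; the chapter's y recurrence is abstract over a and b):

```python
def second_order_final(a, b, T):
    """y[t] = y[t - 1] + y[t - 2]; only the final value is observed,
    so a two-value rolling window matches the offsets t-1 and t-2."""
    prev2, prev1 = a, b                       # y[0], y[1]
    for _ in range(2, T):
        prev2, prev1 = prev1, prev1 + prev2   # slide the window forward
    return prev1 if T > 1 else prev2

print(second_order_final(0, 1, 8))  # -> 13, the last of 0,1,1,2,3,5,8,13
```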

The Caveat

The largest backward offset is not a complete storage plan. It is only the local dependency window. Other forces may require more:

later code may observe intermediate states
reverse-mode differentiation may need saved activations
debugging may request a trace
checkpointing may trade recomputation for memory

Visible recurrence does not solve those trade-offs by itself. It makes the first fact explicit: what each point needs in order to be computed. Once that fact is visible, storage choices become compiler/runtime policy rather than guesswork hidden inside a loop.

Checkpointing as a Policy

Reverse-mode differentiation is the clearest case where the caveat matters. A training run may need intermediate states during the backward pass. Keeping every state is simple but memory-heavy. Recomputing every state is memory-light but expensive. Checkpointing chooses points in between: save some states, recompute segments as needed.

The recurrence notation does not choose the checkpoint policy. It gives the policy the dependency facts:

h[t] needs h[t - 1]
y[t] needs y[t - 1] and y[t - 2]

With those facts, the compiler or runtime can reason about legal recomputation. Without them, it has to recover the temporal structure from a loop body and mutation pattern.
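One way to picture "giving the policy the dependency facts" is to hold the offsets as data rather than as loop structure. A minimal sketch; the dictionary encoding is an illustrative choice, not Einlang's internal representation:

```python
# Backward offsets per recurrence, recorded as explicit facts.
DEPS = {
    "h": [1],      # h[t] needs h[t - 1]
    "y": [1, 2],   # y[t] needs y[t - 1] and y[t - 2]
}

def history_window(name):
    """Largest backward offset: the local window a forward
    evaluation needs, before observation is taken into account."""
    return max(DEPS[name])

print(history_window("h"), history_window("y"))  # -> 1 2
```

A checkpointing or recomputation policy can consume these facts directly, instead of reverse-engineering them from a mutating loop body.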

This is why the chapter says storage follows observation, not “storage is solved by recurrence.” Observation includes ordinary outputs, derivative requests, debug traces, and profiling hooks. Each observation changes what must be retained or recomputed.

Which Observations Force Memory?

The largest backward offset in a recurrence gives the local history window for forward computation. But storage is not determined by that window alone. Later uses, derivative requests, and debugging observations can demand more storage than the recurrence itself. The source tells us the dependency; observation tells us how much of the family must become concrete.

Definition Before Storage

The storage discussion keeps the recurrence story honest. It would be too easy to say “visible dependency offsets imply storage optimization” and stop there. The truth is subtler. Visible offsets give the compiler a fact it can use, but storage is a policy decision under observation.

That distinction is important for production systems. A compiler should be able to use the one-step dependency of an RNN when only the final state is needed. It should also be able to keep all states when the output sequence is requested. It should be able to choose checkpointing when differentiation makes the memory/recompute trade-off worthwhile.

The source notation should not pretend those policies are all the same. It should expose the dependency relation clearly enough that different policies can be justified. Visible structure does not remove implementation choices; it makes their inputs explicit.

A useful warning follows from this: do not confuse a recurrence definition with an array allocation request. The notation:

let h[t] = ...

defines a family of values. It does not necessarily demand that every h[t] be stored simultaneously. Whether the family is materialized depends on later uses and runtime policy.

That distinction is one of the quiet benefits of a declarative source form. It lets the program describe the mathematical object first and lets the implementation decide how much of that object must become memory at once. The better the dependency information, the more room the implementation has to make that decision responsibly.

The practical advice is therefore conservative: never infer storage only from the surface existence of an indexed family. First read the dependency window. Then read the observations. Only then talk about materialization.

Pressure Test: Same Recurrence, Different Observations

Consider this recurrence. The dependency is simple; the storage question is not, because observation changes the contract:

let h[0] = init
let h[t in 1..8] = step(h[t - 1], x[t])

The dependency window is one step. That fact alone permits a rolling execution for a final-state query:

let final = h[7]

An implementation can keep a single live state:

slot = h[0]
slot = step(slot, x[1])
slot = step(slot, x[2])
...
slot = step(slot, x[7])

The source family still contains h[0] through h[7]. The storage plan does not. The difference is legal because the observation only demands the last member of the family.

Now add one line:

let trace[t] = h[t]

The recurrence definition has not changed. The dependency window is still one step. But the observation now demands every time point, so the runtime must make all eight values available. A rolling slot is no longer enough unless the runtime also emits or stores each value as it is produced.

A third observation changes the policy again:

let every_other[u] = h[2 * u]

Now not every time point is externally needed. A storage planner could retain only even states for the output while still using odd states transiently during computation. The source recurrence and the observation together determine the obligation.
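The partial plan can be sketched as a rolling slot that retains only the externally demanded states; `step` and the inputs are placeholder assumptions:

```python
def every_other_states(init, xs, step):
    """Observation demands h[2u] only; odd states are computed
    transiently to advance the recurrence, then dropped."""
    kept = [init]                      # h[0] is observed
    slot = init
    for t, x in enumerate(xs, start=1):
        slot = step(slot, x)           # compute h[t] in the rolling slot
        if t % 2 == 0:
            kept.append(slot)          # retain only even time points
    return kept

step = lambda h, x: h + x
print(every_other_states(0, [1, 2, 3, 4], step))  # -> [0, 3, 10]
```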

This is the audit:

definition    h[t] depends on h[t - 1]
observation   final, trace, or every_other
policy        rolling slot, full history, partial materialization

Checkpointing is a more complex version of the same separation. Suppose a reverse pass needs intermediate states, but memory cannot hold all of them. The dependency graph says which states can be recomputed from which earlier states. A checkpoint schedule might store h[0], h[4], and h[7], then recompute segments when the backward computation needs them. The recurrence notation does not choose that schedule, but it gives the schedule a graph that is explicit rather than recovered from mutation.
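That replay step can be sketched concretely. The schedule {0, 4, 7}, the accumulating `step`, and the saved values below are illustrative assumptions, not a policy the language prescribes:

```python
def replay(checkpoints, t, xs, step):
    """Recompute h[t] from the nearest saved checkpoint at or before t,
    walking the explicit dependency h[u] = step(h[u - 1], x[u])."""
    base = max(s for s in checkpoints if s <= t)   # nearest earlier save
    slot = checkpoints[base]
    for u in range(base + 1, t + 1):
        slot = step(slot, xs[u])                   # recompute the segment
    return slot

step = lambda h, x: h + x
xs = [None, 1, 2, 3, 4, 5, 6, 7]   # x[1..7]; x[0] unused by the recurrence
saved = {0: 0, 4: 10, 7: 28}       # schedule {0, 4, 7} with init = 0
print(replay(saved, 6, xs, step))  # -> 21, rebuilt from the h[4] checkpoint
```

The recomputation is legal precisely because the dependency graph is explicit: h[6] is reachable from h[4] through two applications of step.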

Debugging belongs in the same analysis. A line such as:

print(h);

is not harmless from the storage point of view. It observes the family. If a compiler had planned to keep only the final state, the print changes the contract. The user has asked for a witness of all h[t] values, so the runtime must either materialize them or produce them in an observable stream.

This concrete audit prevents two opposite mistakes. The first mistake is to assume every indexed family is an array allocation. That gives up optimization too early. The second is to assume a small dependency window always means small memory. That ignores observations, debugging, and autodiff. The visible source form lets the program state the family first and lets implementation policy respond to what is actually demanded.

The Minimal Compiler Rule

A practical compiler can begin with a conservative rule:

dependency window gives what is needed next
observation set gives what must remain available

For h[t] reading h[t - 1], the next-step dependency is small. For print(h) or let trace[t] = h[t], the observation set is large. For let final = h[T - 1], the observation set is small. This rule does not solve all memory planning, but it prevents the most common confusion: source families are semantic objects, not automatic allocation demands.
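The conservative rule can be written down as a small decision function. This is a sketch of the rule's shape, not the Einlang compiler's actual planner; the observation-set encoding is an assumption:

```python
def storage_policy(max_back_offset, observed_indices, T):
    """Conservative rule: the dependency window says what is needed
    next; the observation set says what must remain available."""
    if len(observed_indices) == T:
        return "full history"                # e.g. print(h) or trace[t] = h[t]
    if observed_indices == {T - 1}:
        return f"rolling window of {max_back_offset}"  # e.g. final = h[T - 1]
    return "partial materialization"         # e.g. every_other[u] = h[2 * u]

print(storage_policy(1, {7}, 8))             # -> rolling window of 1
print(storage_policy(1, set(range(8)), 8))   # -> full history
```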

Once that distinction is stable, more advanced policies can be discussed without changing the source meaning. Full materialization, rolling buffers, streaming output, and checkpointing are different answers to the same visible dependency and observation facts.