Einlang

Notes

Notes on Notation

The notation here keeps to a small Einlang core. Some examples are executable Einlang; others use mathematical spelling for the same coordinate ideas. Executable snippets keep Einlang’s normal semicolons; explanatory equations may omit them when the coordinate relation is the point.

Core Forms

let x = 5
let y[i] = x[i] + 1
let z[i, j] = A[i, j] + B[i, j]

An indexed let defines a family of values. Free indices on the left are the visible coordinates of the result.

let dot = sum[i](u[i] * v[i])
let C[i, j] = sum[k](A[i, k] * B[k, j])

sum introduces a local index and consumes it. Indices outside the reduction survive into the result.
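For contrast, the same surviving-versus-consumed distinction in the familiar tensor idiom, here as NumPy einsum (a sketch, not Einlang):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)

# sum[i](u[i] * v[i]): i is introduced and consumed, so the result is a scalar.
dot = np.einsum("i,i->", u, v)

# sum[k](A[i, k] * B[k, j]): k is consumed; i and j survive into the result.
C = np.einsum("ik,kj->ij", A, B)

print(dot)      # 32.0
print(C.shape)  # (2, 4)
```

The einsum string states the same fact the Einlang indices make visible: only the indices on the right of -> survive.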

let dy_dx = @y / @x

A derivative request asks how one named value changes with respect to another named value.
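A rough host-language analogue: a derivative request names two values and asks for the sensitivity of one to the other. A central difference makes the same question numeric (the function here is illustrative):

```python
def y_of(x):
    return x * x + 1.0  # any differentiable relation between y and x

x = 3.0
eps = 1e-6
# Numerical stand-in for @y / @x at x = 3; the exact derivative is 2 * x = 6.
dy_dx = (y_of(x + eps) - y_of(x - eps)) / (2 * eps)
```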

let h[0] = init
let h[t in 1..T] = step(h[t - 1], x[t])

A recurrence defines a family of values over an index and makes offsets such as t - 1 visible.
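In a host language, the same recurrence becomes a loop whose iteration order is forced by the visible offset (the step function here is a placeholder for illustration):

```python
def step(h_prev, x_t):
    return h_prev + x_t  # illustrative step function

T = 4
init = 10.0
x = [0.0, 1.0, 2.0, 3.0, 4.0]  # x[1..T] is used; x[0] is padding

h = [0.0] * (T + 1)
h[0] = init                      # base case: h[0] = init
for t in range(1, T + 1):        # t - 1 on the right forces this order
    h[t] = step(h[t - 1], x[t])

print(h)  # [10.0, 11.0, 13.0, 16.0, 20.0]
```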

Implementation-Aligned Contracts

The book sometimes speaks of “axis contracts” because coordinate names carry semantic obligations. The implemented type syntax is more concrete:

let n: i32 = 4;
let x: [f32; 2, 3] = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]];
let y: [f32; ?, ?] = load_matrix();
let z: [f32; *] = load_dynamic_rank();

i32, i64, f32, f64, bool, and str are scalar types. [f32; 2, 3] is a rectangular tensor contract: element type plus rank and, when known, extents. ? keeps a dimension unknown but rank-checked. * means dynamic rank.
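A runtime sketch of the three contract strengths, with NumPy arrays standing in for Einlang tensors (the helper name and encoding are illustrative, not Einlang API):

```python
import numpy as np

def check_contract(value, dtype, extents):
    """extents: a tuple with ints or None per dimension ('?'), or None overall ('*')."""
    assert value.dtype == dtype, "element type mismatch"
    if extents is None:                 # [f32; *]: dynamic rank, only dtype checked
        return value
    assert value.ndim == len(extents), "rank mismatch"
    for actual, expected in zip(value.shape, extents):
        if expected is not None:        # '?' leaves the extent unknown, rank-checked
            assert actual == expected, "extent mismatch"
    return value

x = check_contract(np.zeros((2, 3), np.float32), np.float32, (2, 3))        # [f32; 2, 3]
y = check_contract(np.zeros((5, 7), np.float32), np.float32, (None, None))  # [f32; ?, ?]
z = check_contract(np.zeros(4, np.float32), np.float32, None)               # [f32; *]
```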

Index variables are not annotated with role types. They are introduced by indexed declarations, reductions, and explicit ranges:

let doubled[i, j] = matrix[i, j] * 2.0;
let total = sum[i in 0..10](data[i]);

Here i and j are integer coordinates. Their ranges are inferred from indexed operands or stated with in 0..10. A role name such as batch may appear as an index name or a rest prefix such as ..batch, but Einlang does not use syntax like i: batch to assign an axis type.
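The two range sources can be mirrored in plain Python: inferred ranges come from operand extents, explicit ranges from the binder (a contrast sketch, not full Einlang semantics):

```python
import numpy as np

matrix = np.arange(6.0).reshape(2, 3)
data = np.arange(12.0)

# let doubled[i, j] = matrix[i, j] * 2.0
# i and j take their ranges from the indexed operand's extents.
doubled = np.empty_like(matrix)
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        doubled[i, j] = matrix[i, j] * 2.0

# let total = sum[i in 0..10](data[i]): the binder states the range outright.
total = sum(data[i] for i in range(0, 10))
```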

The compiler pipeline that enforces these contracts is similarly concrete:

parse
name resolution
AST to IR
Einstein grouping
constraint classification
rest-pattern preprocessing
range analysis
shape analysis
type inference
autodiff rewriting
autodiff request lowering
Einstein lowering
recurrence ordering
lowered execution facts
validation

This is why the book distinguishes a conceptual reading from an implementation claim. If the two ever disagree, the implementation wins.

How to Read Inference Passes in the Book

The implementation passes are not side machinery; they are the operational version of the book’s claims.

Name resolution gives each binding and index occurrence an identity. Without that, i is only a spelling.

Einstein grouping collects clauses that define the same indexed family. That lets later passes reason about base cases, literal-index clauses, and range clauses as one definition rather than unrelated statements.

Constraint classification reads where clauses and separates local bindings from guards. A binding introduces a value used by the expression. A guard filters cells or reduction terms.

Rest-pattern preprocessing handles prefixes such as ..batch. When rank is known, it expands the prefix into concrete index slots so later passes work with flat index lists.

Range analysis assigns domains to index variables. It uses explicit binders such as i in 0..10, indexed operands such as A[i, k], and derived index expressions such as t - 1.
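A toy version of that inference, with hypothetical data structures rather than the compiler's own:

```python
def infer_domains(binders, operand_extents):
    """binders: explicit 'i in 0..10' facts; operand_extents: (index, extent)
    pairs read off indexed operands such as A[i, k]."""
    domains = dict(binders)             # explicit binders win outright
    for index, extent in operand_extents:
        # A[i, k] with A: [f32; 4, 5] pins i to 0..4 and k to 0..5
        domains.setdefault(index, range(0, extent))
    return domains

domains = infer_domains(
    binders=[("i", range(0, 10))],      # i in 0..10
    operand_extents=[("i", 4), ("k", 5)],
)
print(domains)  # {'i': range(0, 10), 'k': range(0, 5)}
```

Derived index expressions such as t - 1 then shift or shrink these domains, which is why the base case h[0] and the range clause t in 1..T fit together as one definition.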

Shape analysis uses those domains to compute result extents and check rectangular consistency. It is also where literal shapes, casts, and indexed families meet.

Type inference runs after shape analysis so scalar types and rectangular shapes can agree. It checks arithmetic, annotations, casts, and function boundaries.

Function calls are also part of this story. Unannotated parameters are monomorphized at call sites when enough type information is available, and specialized function bodies can then be revisited by rest-pattern, shape, and type analysis. A function is therefore not always opaque to the compiler, but it is still an abstraction boundary in the source. The book treats a function as appropriate when it preserves the coordinate contract and as suspicious when it hides the coordinate role that decides correctness.

Autodiff then works over IR that already has those facts. Einstein lowering and recurrence ordering come later, after the high-level coordinate relationships have been checked and transformed.

Standard Library Examples

The chapters draw on small excerpts from stdlib/ml:

  • transform_ops.ein: transpose and depth_to_space for coordinate maps.
  • reduction_ops.ein: reduce_sum for surviving and consumed indices.
  • activations.ein: softmax for batch-preserving normalization.
  • linalg_ops.ein: matmul and batch_matmul for contraction structure.
  • recurrent_ops.ein: rnn for time as an explicit dependency axis.
  • attention_ops.ein: attention for a larger model-shaped example.

These excerpts are not reference documentation. They are concrete examples of visible coordinate structure.

How to Read the Examples

Each example answers a design question before it answers a structural one:

  • What fact would a conventional tensor notation hide here?
  • Which later phase would need that fact back?
  • What does the visible-index form force into the source?

Then it answers the structural questions:

  • Which binding is being added to the environment?
  • Which indices are free and therefore survive?
  • Which indices are local to a sum, max, or recurrence rule?
  • What dependency relation can the compiler preserve?

Then it answers the practical questions:

  • What concrete coordinate would you test by hand?
  • What small mistake would this notation catch early?
  • What phrase keeps the example precise enough to remember?

Code Fence Labels

The examples use three fence labels:

  • python shows a familiar tensor-programming contrast.
  • rust marks Einlang library excerpts.
  • text marks equations, coordinate readings, and compiler-visible structure.

The label separates executable excerpts from explanatory equations.

Missing-Coordinate Marker

The marker ~ is used in the broadcasting chapter to mean an intentionally missing coordinate:

let a[i, ~, k] = ...

Read it as a boundary annotation: the value has no dependence on that coordinate.
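In NumPy the same intent is usually spelled with a length-1 broadcast axis; the contrast shows what the ~ marker makes explicit (a sketch):

```python
import numpy as np

# a[i, ~, k]: the value has no dependence on the middle coordinate.
a = np.arange(6.0).reshape(2, 3)    # depends on i and k
a3 = a[:, np.newaxis, :]            # shape (2, 1, 3): the ~ slot is length 1
b = np.ones((2, 4, 3))
combined = a3 + b                   # the missing coordinate broadcasts to 4
```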

Likewise, expressions such as:

let y[b * group, feat * slice] = x[b, feat, group, slice]

show coordinate relationships. Implementations may spell packing, ranges, or layout constraints more explicitly.
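One NumPy spelling of that packing relation, to show what the coordinate equation leaves implicit (the transpose-then-reshape layout is one choice, not part of the relation):

```python
import numpy as np

B, F, G, S = 2, 3, 4, 5
x = np.arange(B * F * G * S).reshape(B, F, G, S)

# Pack (b, group) into the row coordinate and (feat, slice) into the column.
y = x.transpose(0, 2, 1, 3).reshape(B * G, F * S)

# Spot-check one coordinate: y[b * group, feat * slice] = x[b, feat, group, slice]
b, feat, group, slc = 1, 2, 3, 4
assert y[b * G + group, feat * S + slc] == x[b, feat, group, slc]
```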

Boundary Notes

Python and host-language interop mostly sit outside the notation’s core, but one boundary rule matters for production code: external values need receipts before tensor formulas rely on them.

let image = python::numpy::load("digit.npy") as [f32; 28, 28];
let centered[i, j] = image[i, j] - 0.5;

The Python call may load data dynamically. The cast states the contract that the tensor formula needs: rank, element type, and extents. If the file contains a flattened vector of length 784, the mismatch belongs at the boundary. The formula should not have to discover later that the value did not have row and column coordinates.

This is the boundary-receipt principle:

host language brings the value
Einlang receives a shaped contract
indexed code relies on the contract
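That three-step principle can be sketched host-side, with a hypothetical receive helper standing in for the as cast (the length-784 failure is the one discussed above):

```python
import numpy as np

def receive(value, dtype, shape):
    # The boundary, not the formula, is where the mismatch should surface.
    arr = np.asarray(value)
    if arr.dtype != dtype or arr.shape != shape:
        raise TypeError(f"boundary contract violated: {arr.dtype} {arr.shape}")
    return arr

flat = np.zeros(784, dtype=np.float32)           # a flattened digit, not 28 x 28
try:
    image = receive(flat, np.float32, (28, 28))  # fails here, at the boundary
except TypeError:
    pass                                         # the formula never sees the bad value

image = receive(flat.reshape(28, 28), np.float32, (28, 28))
centered = image - 0.5
```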

The same idea applies to debugging. print(x) is an observation boundary; it asks the runtime to produce a concrete witness for a value that may otherwise remain symbolic, lazy, or subject to storage optimization.

The Rule of the Notation

When in doubt, ask one question:

Which coordinates are visible here?

That question is the whole experiment.