Mathematical proofs can be difficult to read—even for the person who wrote them. Some of this difficulty is intrinsic: abstract structures, nested quantifiers, long dependency chains. But some is incidental, arising not from the logical content but from choices in exposition. Proofs tend to be compressed, and they are rarely refactored for comprehension.

The interpretive burden

Disciplines that manage complex information tend to develop organizational systems. Accounting has double-entry bookkeeping, standardized ledgers, and audit trails. Software has type systems, version control, explicit interfaces, and refactoring practices. Law has citation systems and structured argumentation.

Mathematics has conventions too, but they tend toward implicit patterns absorbed through practice rather than explicit structure in the writing itself. Difficulty with a proof is usually attributed to the reader’s background—but sometimes the writing itself could be clearer.

There may be historical reasons. Mathematical practice evolved partly in oral contexts, where a proof was explained and discussed. Compression made sense when you could say “here’s what I mean” in real time. Written proofs inherit the compression without the accompanying dialogue.

Prior work

Others have thought carefully about proof presentation and mathematical writing.

Leslie Lamport developed structured proofs—a hierarchical style where each step is numbered by level, with explicit statement of what is being proved. Lamport, whose background is in concurrent and distributed systems, found that rewriting proofs in structured form revealed errors that traditional presentation had obscured. His papers “How to Write a Proof” (1995) and “How to Write a 21st Century Proof” (2012) describe this approach. The style is also used in the TLA+ proof system.

Donald Knuth taught a course on mathematical writing at Stanford (1987) covering clarity, precision, and exposition. Topics included word choice, notation, the refereeing process, and how to present algorithms. Knuth is also known for literate programming, an approach where programs are written as narratives with embedded code.

Cognitive load theory, developed by John Sweller, distinguishes intrinsic load (inherent complexity of material) from extraneous load (load imposed by how material is presented). A related distinction appears in Fred Brooks' “No Silver Bullet” (1986): essential complexity (inherent to the problem) versus accidental complexity (introduced by the solution).

Constructive mathematics requires proofs to provide explicit witnesses: to prove something exists, one must show how to construct it. Proof assistants like Lean, Coq, and Agda are built on constructive type theories that make such witnesses first-class (classical axioms can be added on top), though their primary aim is machine-checkable correctness rather than human readability.
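
In Lean 4, for instance, an existence proof is literally a pair of witness and evidence. A minimal sketch, no libraries assumed:

```lean
-- The proof term carries the witness 4 explicitly,
-- plus a decidability check that 4 > 3.
example : ∃ n : Nat, n > 3 :=
  ⟨4, by decide⟩
```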

The question here isn’t formal verification. It’s more modest: what would it look like to make the semantic structure of a proof more visible in the writing?

What stays implicit

In conventional mathematical prose, several aspects of the reasoning tend to remain unstated:

Intent. Why is this object introduced? What role will it play? A proof often reads forward (step after step) while the reasoning that motivated it ran backward from the goal. The reader discovers purpose retrospectively, reconstructing the author’s planning after the fact.

Scope and quantification. Which variables are fixed for the whole proof? Which are arbitrary within a local scope? Which are existentially chosen, depending on prior values? The word “let” alone can mean: let $x$ be arbitrary (universal), let $x$ be this specific value (definition), let $x$ be some value satisfying… (obtaining a witness), or let $x$ be given (assumption). Context disambiguates, but the reader must infer the quantifier status.
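
In a proof assistant these readings are forced apart into distinct moves. A minimal Lean 4 sketch (Mathlib imported only for the `obtain` tactic; `P`, `Q`, and `step` are hypothetical placeholders):

```lean
import Mathlib

-- Each "let" from the prose becomes a different tactic.
example (P Q : Nat → Prop) (step : ∀ a b, P a → P b → Q (a + b)) :
    ∀ k, P k → (∃ n, P n) → ∃ m, Q m := by
  intro k                -- let k be arbitrary             (universal)
  intro hk               -- let P k be given               (assumption)
  intro hex
  obtain ⟨n, hn⟩ := hex  -- let n be some value with P n   (witness)
  let m := k + n         -- let m = k + n                  (definition)
  exact ⟨m, step k n hk hn⟩  -- the constructed m closes the goal
```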

Strategy. Is this a direct proof, a contradiction, a contrapositive, a case analysis, an induction? The technique may become apparent only after reading the whole proof, or may be recognized partway through from structural cues.

Justification. What licensed each step? Which prior statements are being used? Sometimes obvious, sometimes requiring the reader to scan backward or supply background knowledge.

Decomposition. Complex proofs are often broken into lemmas, but the structure of that breakdown can be opaque. Why was this lemma isolated? How do the pieces fit together? The same questions of intent and dependency arise at the level of proof organization.

Context. Which mathematical setting are we in? A proof in analysis carries conventions about limits and approximation; a proof in algebra assumes familiarity with group-theoretic moves. Notation, standard techniques, and background definitions often go unstated. This is less about proof structure than about prerequisites—but it’s another layer that remains implicit.

These are matters of exposition, not rigor. A proof can be correct without being clear.

Proofs as plans

A proof is a deductive argument with a goal: starting from axioms, definitions, and prior results, proceeding through justified steps to a conclusion. It’s not empirical or inductive (inductive in the epistemological sense; mathematical induction, despite its name, is deductive).

Each move in a proof has a semantic role. Without aiming to be exhaustive, some common ones:

  • Goal—stating what needs to be shown. The main goal comes from the theorem; subgoals arise from proof strategies (e.g., to prove $P \Rightarrow Q$, we assume $P$ and set $Q$ as our new goal).
  • Introducing objects—bringing a new object into scope (a variable, constant, function, set, etc.), with a specific quantifier status:
    • Arbitrary: to prove a universal claim, we let $x$ be an arbitrary element and reason about it without using any special properties. What we derive holds for all $x$.
    • Witness: to use an existential claim, we obtain a specific object satisfying some property. This object may depend on earlier choices (e.g., “for this $\varepsilon$, pick $N$ such that…”).
    • Defined: we construct a new object from existing ones (e.g., “let $N = \max(N_1, N_2)$”). This is neither arbitrary nor a witness—it’s a specific construction.
  • Assuming—temporarily taking a proposition as given, opening a scope that will later close. This happens in conditional proofs (assume $P$ to derive $Q$, then conclude $P \Rightarrow Q$), proof by contradiction (assume $\neg P$, derive a contradiction), case analysis, and “without loss of generality” (assume one of several symmetric cases, claiming the others follow analogously).
  • Deriving—obtaining a new fact from what’s established, by applying a definition, inference rule, or prior result. The justification may be explicit or left for the reader to supply.
  • Concluding—discharging a goal and closing a scope. This happens when we complete a subproof (e.g., having derived $Q$ under assumption $P$, we conclude $P \Rightarrow Q$ and exit that scope).

These roles correspond to moves in natural deduction and related proof systems, though informal proofs rarely label them explicitly.
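
To make the correspondence concrete, here is a deliberately trivial Lean 4 proof with each move labeled by its role (core Lean, no libraries; the theorem itself is beside the point):

```lean
-- A contrapositive proof, annotated with the roles above.
example (P Q : Prop) (hPQ : P → Q) : ¬Q → ¬P := by
  -- Goal: ¬Q → ¬P.
  intro hnQ               -- Assume ¬Q; new goal: ¬P.
  intro hP                -- Assume P (¬P unfolds to P → False); new goal: False.
  have hQ : Q := hPQ hP   -- Derive Q from the assumption P and hPQ.
  exact hnQ hQ            -- Derive False; conclude, discharging both assumptions.
```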

A proof is often constructed backward (from goal to subgoals to known facts) but written forward (from assumptions through derivations to conclusion). The reader sees the forward presentation but must infer the backward reasoning that motivated each step. Marking the semantic role of each move can reduce this reconstruction effort.
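
The contrast is easy to see in a proof assistant, where both directions are available for the same statement. A minimal Lean 4 sketch (core Lean; the propositions are hypothetical placeholders):

```lean
-- Backward: each step reduces the current goal to a subgoal.
example (P Q R : Prop) (hPQ : P → Q) (hQR : Q → R) (hP : P) : R := by
  apply hQR   -- goal R becomes goal Q
  apply hPQ   -- goal Q becomes goal P
  exact hP    -- goal P matches the hypothesis

-- Forward: the same proof, written as derivations from the hypotheses.
example (P Q R : Prop) (hPQ : P → Q) (hQR : Q → R) (hP : P) : R := by
  have hQ : Q := hPQ hP   -- derive Q
  exact hQR hQ            -- derive R, which is the goal
```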

The logical content remains the same; the presentation carries more of the reasoning explicitly.

Example: uniqueness of limits

Theorem. In a metric space, a convergent sequence has a unique limit.

A conventional proof:

Suppose $(x_n) \to L$ and $(x_n) \to M$. Let $\varepsilon > 0$. There exist $N_1, N_2$ such that $n \geq N_1$ implies $d(x_n, L) < \varepsilon/2$ and $n \geq N_2$ implies $d(x_n, M) < \varepsilon/2$. For $n \geq \max(N_1, N_2)$, we have $d(L, M) \leq d(L, x_n) + d(x_n, M) < \varepsilon$. Since $\varepsilon$ was arbitrary, $d(L, M) = 0$, so $L = M$. $\blacksquare$

Correct and concise. But parsing requires reconstructing: What kind of argument? Why $\varepsilon/2$? What’s the role of $N_1, N_2$—fixed or chosen? Why does “$\varepsilon$ arbitrary” yield $d(L,M) = 0$?

A version making structure explicit:

Theorem. In a metric space $(X, d)$, if a sequence $(x_n)$ in $X$ converges, its limit is unique.

Definition. For $L \in X$, $(x_n) \to L$ means: $\forall \varepsilon > 0$, $\exists N$, $\forall n \geq N$: $d(x_n, L) < \varepsilon$.

Strategy. Assume two limits exist; show they must be equal by proving $d(L,M) < \varepsilon$ for arbitrary $\varepsilon$, forcing $d(L,M) = 0$.


Proof.

Let $(x_n)$ be a sequence in $X$ with $(x_n) \to L$ and $(x_n) \to M$.
Goal: $L = M$.

Let $\varepsilon > 0$ be arbitrary.

From $(x_n) \to L$: pick $N_1$ such that $n \geq N_1 \Rightarrow d(x_n, L) < \varepsilon/2$.
From $(x_n) \to M$: pick $N_2$ such that $n \geq N_2 \Rightarrow d(x_n, M) < \varepsilon/2$.

Let $N = \max(N_1, N_2)$ and take $n = N$, so that both bounds apply. By the triangle inequality: $$d(L, M) \leq d\bigl(L, x_n\bigr) + d\bigl(x_n, M\bigr) < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

Since $\varepsilon > 0$ was arbitrary, $d(L, M) < \varepsilon$ for all $\varepsilon > 0$.
A nonnegative quantity smaller than every positive $\varepsilon$ must be zero, so $d(L, M) = 0$, and therefore $L = M$. $\blacksquare$

The logical steps are the same. The structure makes explicit what the conventional version leaves implicit: the strategy is stated upfront, the role of each variable is marked (arbitrary vs. chosen), the definition anchors notation, and the $\varepsilon/2$ is visibly motivated (to sum to $\varepsilon$).
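
For comparison, a sketch of the same argument in Lean 4 with Mathlib. The lemma names (`Metric.tendsto_atTop`, `eq_of_forall_dist_le`, `half_pos`) are recalled from Mathlib and should be checked against the current library; Mathlib’s own `tendsto_nhds_unique` proves the result more generally.

```lean
import Mathlib

open Filter Topology

example {X : Type*} [MetricSpace X] (x : ℕ → X) (L M : X)
    (hL : Tendsto x atTop (𝓝 L)) (hM : Tendsto x atTop (𝓝 M)) :
    L = M := by
  -- Strategy: dist L M ≤ ε for every ε > 0 forces L = M.
  apply eq_of_forall_dist_le
  intro ε hε                          -- let ε > 0 be arbitrary
  -- Witnesses from the two convergence hypotheses.
  obtain ⟨N₁, hN₁⟩ := Metric.tendsto_atTop.mp hL (ε / 2) (half_pos hε)
  obtain ⟨N₂, hN₂⟩ := Metric.tendsto_atTop.mp hM (ε / 2) (half_pos hε)
  -- Defined: n = max N₁ N₂, so both bounds apply at once.
  set n := max N₁ N₂
  have h₁ : dist (x n) L < ε / 2 := hN₁ n (le_max_left _ _)
  have h₂ : dist (x n) M < ε / 2 := hN₂ n (le_max_right _ _)
  -- Triangle inequality, then chain the two bounds.
  have hlt : dist L M < ε := by
    calc dist L M ≤ dist L (x n) + dist (x n) M := dist_triangle _ _ _
      _ = dist (x n) L + dist (x n) M := by rw [dist_comm L (x n)]
      _ < ε / 2 + ε / 2 := add_lt_add h₁ h₂
      _ = ε := by ring
  exact le_of_lt hlt
```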

Example: composition of continuous functions

Theorem. The composition of continuous functions between metric spaces is continuous.

A conventional proof:

Let $f: X \to Y$ and $g: Y \to Z$ be continuous. Let $x \in X$ and $\varepsilon > 0$. Since $g$ is continuous at $f(x)$, there exists $\delta' > 0$ such that $d_Y(y, f(x)) < \delta'$ implies $d_Z(g(y), g(f(x))) < \varepsilon$. Since $f$ is continuous at $x$, there exists $\delta > 0$ such that $d_X(x', x) < \delta$ implies $d_Y(f(x'), f(x)) < \delta'$. Then $d_X(x', x) < \delta$ implies $d_Z(g(f(x')), g(f(x))) < \varepsilon$. $\blacksquare$

The reader must track: which $\delta$ came from which condition, what depends on what, why $g$’s continuity is used before $f$’s.

A version making structure explicit:

Theorem. Let $(X, d_X)$, $(Y, d_Y)$, $(Z, d_Z)$ be metric spaces. If $f: X \to Y$ and $g: Y \to Z$ are continuous, then $g \circ f: X \to Z$ is continuous.

Definition. $f$ continuous at $x$ means: $\forall \varepsilon > 0$, $\exists \delta > 0$, $\forall x' \in X$: $d_X(x', x) < \delta \Rightarrow d_Y\bigl(f(x'), f(x)\bigr) < \varepsilon$.

Strategy. Given $x \in X$ and $\varepsilon > 0$, find $\delta > 0$ satisfying the continuity condition for $g \circ f$. Work backward: use $g$’s continuity to obtain an intermediate bound $\delta'$, then $f$’s continuity to achieve it.


Proof.

Let $x \in X$ be arbitrary. Let $\varepsilon > 0$ be arbitrary.
Goal: find $\delta > 0$ such that for all $x' \in X$, $d_X(x', x) < \delta \Rightarrow d_Z\Bigl(g\bigl(f(x')\bigr), g\bigl(f(x)\bigr)\Bigr) < \varepsilon$.

Step 1. Apply $g$’s continuity at $f(x)$.

$g$ is continuous at $f(x)$, so for this $\varepsilon$:
pick $\delta' > 0$ such that for all $y \in Y$, $d_Y\bigl(y, f(x)\bigr) < \delta' \Rightarrow d_Z\Bigl(g(y), g\bigl(f(x)\bigr)\Bigr) < \varepsilon$.

Step 2. Apply $f$’s continuity at $x$.

$f$ is continuous at $x$, so for $\delta'$ from Step 1:
pick $\delta > 0$ such that for all $x' \in X$, $d_X(x', x) < \delta \Rightarrow d_Y\bigl(f(x'), f(x)\bigr) < \delta'$.

Step 3. Chain the implications.

Let $x' \in X$ satisfy $d_X(x', x) < \delta$.
By Step 2: $d_Y\bigl(f(x'), f(x)\bigr) < \delta'$.
By Step 1 (with $y = f(x')$): $d_Z\Bigl(g\bigl(f(x')\bigr), g\bigl(f(x)\bigr)\Bigr) < \varepsilon$.

We have produced $\delta > 0$ with the required property.

Since $x$ and $\varepsilon$ were arbitrary, $g \circ f$ is continuous. $\blacksquare$

The explicit structure shows: why $g$’s continuity comes first (to get the target $\delta'$), what each choice depends on, how implications chain.
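
The structured version transfers to Lean 4 with Mathlib nearly line by line. Mathlib proves this directly as `Continuous.comp`; the sketch below replays the $\varepsilon$–$\delta$ steps to mirror the prose, and `Metric.continuous_iff` is recalled from the library rather than guaranteed:

```lean
import Mathlib

example {X Y Z : Type*} [MetricSpace X] [MetricSpace Y] [MetricSpace Z]
    {f : X → Y} {g : Y → Z} (hf : Continuous f) (hg : Continuous g) :
    Continuous (g ∘ f) := by
  rw [Metric.continuous_iff] at hf hg ⊢
  intro x ε hε                          -- arbitrary x and ε > 0
  -- Step 1: g's continuity at f x yields δ'.
  obtain ⟨δ', hδ'pos, hg'⟩ := hg (f x) ε hε
  -- Step 2: f's continuity at x, applied to δ', yields δ.
  obtain ⟨δ, hδpos, hf'⟩ := hf x δ' hδ'pos
  -- Step 3: chain the implications.
  exact ⟨δ, hδpos, fun x' hx' => hg' (f x') (hf' x' hx')⟩
```

Each `obtain` records exactly which continuity hypothesis produced which bound, matching Steps 1 and 2 of the prose version.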

Discussion

This is not formal verification. Proof assistants like Lean or Coq serve a different purpose: machine-checkable correctness under specific type theories. The approach here is closer to documentation—prose that remains human-readable but carries more structural context.

Experienced mathematicians already recognize proof patterns implicitly: the setup for a contradiction, an imminent appeal to compactness. This knowledge exists but lives in the reader’s mind rather than in the text. Explicit structure externalizes some of it.

Restructuring proofs adds effort, but can clarify the writer’s own understanding and improve archival value. Language models may lower this cost by assisting with translation between formats.

Closing remarks

The aim is not to prescribe a format, but to notice that conventional mathematical writing favors some qualities (concision, elegance) over others (explicitness, accessibility). The right balance can depend on context—audience, purpose, complexity of the argument.

Proofs that mark their semantic structure don’t become less rigorous. Making the logical moves visible can sometimes reveal gaps or ambiguities that compressed prose obscures. The practice is compatible with ordinary mathematical work; it simply asks more of the written artifact.

References