Syntax and Semantics 3. The Expression Problem — why you cannot have it all

Table of Contents

Syntax and Semantics - This article is part of a series.

Part 1: Syntax and Semantics 1: Your code has a grammar problem

Part 2: Syntax and Semantics 2: Three ways to define what your code means

Part 3: This Article

In Part 1, we saw that every enum and sealed trait is a formal grammar: it defines the syntax of your domain. In Part 2, we gave those structures meaning through semantics: match arms, pure functions, type-level contracts. Now we ask a harder question: what happens when you need to extend both?

You have a data type. You have operations over it. Then a new requirement arrives: a new variant, or a new operation, or both. How much of your existing code must you touch?

Philip Wadler posed this as a precise challenge in 1998, building on earlier work by John Reynolds (1975) and William Cook (1990): define a data type by cases where one can add new cases and new functions over the data type, without recompiling existing code, while retaining static type safety. He called it the Expression Problem.

The matrix
#

The Expression Problem becomes clear when you think of your code as a two-dimensional matrix. Rows are data variants: the syntax, what can exist. Columns are operations: the semantics, what you can do with what exists.

                    Operations (Semantics)
                    eval()    print()    optimize()
          ┌─────────┬─────────┬───────────┐
Num(n)    │  n      │  "n"    │  Num(n)   │
          ├─────────┼─────────┼───────────┤
Add(l,r)  │  l+r    │  "l+r"  │  ...      │
          ├─────────┼─────────┼───────────┤
Mul(l,r)  │  l*r    │  "l*r"  │  ...      │
          └─────────┴─────────┴───────────┘

Adding a row    = new syntax   (new variant)
Adding a column = new semantics (new operation)

Every cell in the matrix is a piece of logic: what does this operation do for this variant? The entire matrix must be filled. No cell can be left blank; the compiler insists on exhaustiveness, one way or another.

The question is: can you add both a row and a column freely, without touching existing cells? The answer, with standard tools, is no. You must choose which axis is easy.

Rust: easy to add columns, hard to add rows
#

Rust’s enum plus match organizes code by operation. Each function contains a match expression that handles every variant. The logic for a single operation lives in one place.

enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
}

fn eval(e: &Expr) -> f64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(l, r) => eval(l) + eval(r),
    }
}

fn print_expr(e: &Expr) -> String {
    match e {
        Expr::Num(n) => n.to_string(),
        Expr::Add(l, r) => format!("({} + {})", print_expr(l), print_expr(r)),
    }
}

Adding a new column, a new operation, is trivial. Write a new function with its own match. No existing code changes:

// New operation: easy. Zero changes to existing code.
fn optimize(e: &Expr) -> Expr {
    match e {
        Expr::Num(n) => Expr::Num(*n),
        Expr::Add(l, r) => {
            let l = optimize(l);
            let r = optimize(r);
            match (&l, &r) {
                (Expr::Num(a), Expr::Num(b)) => Expr::Num(a + b),
                _ => Expr::Add(Box::new(l), Box::new(r)),
            }
        }
    }
}

Now add a new row. Add Mul(Box<Expr>, Box<Expr>) to the enum. The compiler immediately flags every existing match as non-exhaustive: eval, print_expr, optimize, all of them. The number of functions you must update is proportional to the number of existing operations. Rust’s exhaustive matching guarantees you will not forget a case, which is valuable. But the work is real. Adding a row touches every column.

Scala: easy to add rows, hard to add columns
#

Scala’s OOP style, a trait with methods implemented by case classes, organizes code by variant. Each class contains the logic for every operation. The logic for a single variant lives in one place.

trait Expr {
  def eval: Double
  def printExpr: String
}

case class Num(n: Double) extends Expr {
  def eval: Double = n
  def printExpr: String = n.toString
}

case class Add(l: Expr, r: Expr) extends Expr {
  def eval: Double = l.eval + r.eval
  def printExpr: String = s"(${l.printExpr} + ${r.printExpr})"
}

Adding a new row, a new variant, is trivial. Write a new case class that implements the trait. No existing code changes:

// New variant: easy. Zero changes to existing code.
case class Mul(l: Expr, r: Expr) extends Expr {
  def eval: Double = l.eval * r.eval
  def printExpr: String = s"(${l.printExpr} * ${r.printExpr})"
}

Now add a new column. Add def optimize: Expr to the Expr trait. Every existing class, Num, Add, Mul, must be updated to provide an implementation. The number of classes you must modify is proportional to the number of existing variants. Adding a column touches every row.

A note: Scala also supports the FP style: sealed trait with external pattern matching. When you use that approach, the extension trade-offs flip and behave exactly like Rust’s. The OOP style (methods on trait) is what gives Scala the opposite bias.

The fundamental tension
#

This is not a bug in Rust or Scala. It is not something a sufficiently clever language feature will make disappear. It is a consequence of how code is organized in two dimensions.

Organizing by columns (one function per operation, matching all variants) makes adding a new column cheap. Organizing by rows (one class per variant, implementing all operations) makes adding a new row cheap. You can choose one axis, but you cannot choose both simultaneously. Not with concrete ADTs and not with concrete class hierarchies. Any solution that claims to solve both axes introduces some form of indirection or abstraction.

Wadler put it precisely:

“The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety.”

Reynolds saw the duality in 1975. Cook formalized the relationship between abstract types and objects in 1990. Wadler named the problem in 1998. The tension is fundamental. The question is not whether a trade-off exists, but whether the abstraction that resolves it is worth its cost.

The resolution exists
#

There is a technique that solves the Expression Problem cleanly. The key insight is this: make the syntax abstract. Instead of a concrete enum or a concrete sealed trait, define the grammar as a type class (Rust) or an abstract trait parameterized by a type constructor (Scala). Each operation becomes an implementation of that abstraction. Each new variant extends the abstraction without touching existing implementations.

This technique is called tagless final. It is not a trick. It is denotational semantics applied to program construction: the syntax is never materialized as a data structure. Instead, it is interpreted directly into its meaning. New variants and new operations both extend the system without recompilation, without losing type safety, and without modifying existing code.

We cover tagless final in full detail, with working code in both Rust and Scala, in the dedicated post.

Series wrap-up
#

Three posts, one thread. We traced the connection from formal grammars to your domain types, from formal semantics to your pattern matching and trait implementations, and from the Expression Problem to the fundamental tension between extending what can be said and extending what it means. Syntax defines the space. Semantics assigns meaning. The Expression Problem asks whether you can grow both without breaking what already works. The answer is yes, but only if you are willing to make the syntax abstract.

Syntax and Semantics - This article is part of a series.

Part 1: Syntax and Semantics 1: Your code has a grammar problem

Part 2: Syntax and Semantics 2: Three ways to define what your code means

Part 3: This Article

The matrix #

Rust: easy to add columns, hard to add rows #

Scala: easy to add rows, hard to add columns #

The fundamental tension #

The resolution exists #

Series wrap-up #