
Tagless Final is denotational semantics in disguise

·10 mins
Rafael Fernandez

“Doing a tagless-final embedding is literally writing a denotational semantics for the DSL, in a host programming language rather than on paper.” (Oleg Kiselyov)

If you have ever defined a trait in Rust and provided two implementations, one for production, one for tests, you have done tagless final. If you have ever written a Scala type class with given instances for different types, you have done tagless final. The pattern that Kiselyov, Carette, and Shan formalized in their 2007 paper is the same pattern that hexagonal architecture calls “ports and adapters” and that every dependency injection framework calls “interface plus implementation.”

The formal name does not add ceremony. It reveals why the pattern works: defining an interface and swapping implementations is exactly what denotational semantics does, mapping syntax (what operations exist) to meaning (what they do). The Syntax and Semantics series introduced these concepts: grammars define syntax, interpretation functions define semantics, and the gap between them is where bugs live. Tagless final is the technique that closes that gap at the type level.

Initial vs final: two ways to represent a language

There are two fundamentally different ways to encode a small language inside a host language. The first one, the one you learned first, is to build a data structure.

Initial encoding: build the tree, then interpret it

// Rust: initial encoding
enum Expr {
    Lit(i32),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

fn eval(e: &Expr) -> i32 {
    match e {
        Expr::Lit(n) => *n,
        Expr::Add(l, r) => eval(l) + eval(r),
        Expr::Neg(e) => -eval(e),
    }
}
// Scala: initial encoding
sealed trait Expr
case class Lit(n: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Neg(e: Expr) extends Expr

def eval(e: Expr): Int = e match {
  case Lit(n)    => n
  case Add(l, r) => eval(l) + eval(r)
  case Neg(e)    => -eval(e)
}

You construct an AST, a syntax tree, and then fold over it with a function. Adding a new interpretation (say, pretty-printing) means writing another function over the same tree. Adding a new node type (say, Mul) means touching every existing function. This is the initial algebra: constructors, induction, building up.
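To make that cost concrete, here is the second interpretation written against the same tree: a pretty-printer over the initial Expr (the name pretty is illustrative, not from the original series).

```rust
// A second interpretation over the same initial-encoding tree.
// `pretty` walks the identical `Expr` structure that `eval` does.
enum Expr {
    Lit(i32),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

fn pretty(e: &Expr) -> String {
    match e {
        Expr::Lit(n) => n.to_string(),
        Expr::Add(l, r) => format!("({} + {})", pretty(l), pretty(r)),
        Expr::Neg(e) => format!("(-{})", pretty(e)),
    }
}

// Adding a `Mul` variant would force a new match arm in both
// `eval` and `pretty`: that is the maintenance cost of initial.
```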

Final encoding: define syntax as an interface, compute directly

// Rust: final encoding (tagless final)
trait ExprSym {
    type Repr;
    fn lit(&self, n: i32) -> Self::Repr;
    fn add(&self, l: Self::Repr, r: Self::Repr) -> Self::Repr;
    fn neg(&self, e: Self::Repr) -> Self::Repr;
}

struct Eval;
impl ExprSym for Eval {
    type Repr = i32;
    fn lit(&self, n: i32) -> i32 { n }
    fn add(&self, l: i32, r: i32) -> i32 { l + r }
    fn neg(&self, e: i32) -> i32 { -e }
}

struct PrettyPrint;
impl ExprSym for PrettyPrint {
    type Repr = String;
    fn lit(&self, n: i32) -> String { n.to_string() }
    fn add(&self, l: String, r: String) -> String { format!("({l} + {r})") }
    fn neg(&self, e: String) -> String { format!("(-{e})") }
}

// Same program, different meanings:
fn program<S: ExprSym>(s: &S) -> S::Repr {
    s.add(s.lit(1), s.neg(s.lit(2)))
}
// Scala: final encoding (tagless final)
trait ExprSym[Repr] {
  def lit(n: Int): Repr
  def add(l: Repr, r: Repr): Repr
  def neg(e: Repr): Repr
}

given ExprSym[Int] with {
  def lit(n: Int) = n
  def add(l: Int, r: Int) = l + r
  def neg(e: Int) = -e
}

given ExprSym[String] with {
  def lit(n: Int) = n.toString
  def add(l: String, r: String) = s"($l + $r)"
  def neg(e: String) = s"(-$e)"
}

def program[R](using alg: ExprSym[R]): R =
  alg.add(alg.lit(1), alg.neg(alg.lit(2)))

No AST is built. The program function computes its result directly, and the result type depends entirely on which interpreter you plug in. Call program(&Eval) and you get -1. Call program(&PrettyPrint) and you get "(1 + (-2))". Same syntax, different semantics.

This IS denotational semantics. The trait defines the syntax: what operations exist. Each implementation maps that syntax to a mathematical domain: integers, strings, effects. The ⟦·⟧ double-bracket notation from the formal semantics post is exactly this: ⟦program⟧_Eval = -1, ⟦program⟧_PrettyPrint = "(1 + (-2))".

What “tagged” and “tagless” mean

Early embeddings of typed DSLs inside a host language used runtime type tags to maintain type safety. You would wrap values in a Value enum with variants like IntVal(i32) and BoolVal(bool), then check tags at runtime. This is the “tagged” approach, and it throws away the host language’s type system.
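A minimal sketch of that tagged style in Rust, assuming an illustrative Value enum (not from any particular library):

```rust
// "Tagged" embedding: every value carries a runtime tag, and every
// operation must check the tag and handle the mismatch case.
#[derive(Debug, PartialEq)]
enum Value {
    IntVal(i32),
    BoolVal(bool),
}

// Addition can fail at runtime, even though a well-typed source
// program could never actually produce the failing case.
fn add(l: Value, r: Value) -> Result<Value, String> {
    match (l, r) {
        (Value::IntVal(a), Value::IntVal(b)) => Ok(Value::IntVal(a + b)),
        (l, r) => Err(format!("type error: cannot add {l:?} and {r:?}")),
    }
}
```

The Err branch is the tag check: a cost paid on every operation, for an error the host type system could have ruled out statically.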

“Tagless” means the host compiler handles types directly. In the final encoding above, Repr = i32 and Repr = String are checked at compile time. No tags, no casts, no runtime type errors. The metalanguage (Rust, Scala) does the work for you.

What “initial” and “final” mean

The names come from category theory, but the intuition is concrete. Initial means “defined by constructors”: you build up a structure (the AST) and then consume it. In algebraic terms, an initial algebra is the least fixed point: the smallest structure that satisfies the grammar. You can pattern-match over it, serialize it, optimize it.

Final means “defined by observations”: you interact through an interface and never see the internal representation. A final coalgebra is the greatest fixed point: it is characterized not by how you build it but by what you can ask of it. You cannot inspect a tagless-final program; you can only interpret it.

The ADT is initial. The trait is final.

Tagless final in Scala: the ecosystem pattern

The Scala ecosystem, particularly Cats-Effect, adopted tagless final as its dominant architecture pattern. The recipe is always the same: define an algebra (a trait parameterized by an effect type F[_]), write programs polymorphic in F, and choose a concrete effect at the edge of the world.

trait UserRepo[F[_]] {
  def find(id: UserId): F[Option[User]]
  def save(user: User): F[Unit]
}

def createUser[F[_]: Monad](repo: UserRepo[F], name: String): F[User] =
  for {
    user <- Monad[F].pure(User(name))
    _    <- repo.save(user)
  } yield user

// Production: F = IO
// Testing:    F = State[TestState, *]

The UserRepo[F[_]] trait is the algebra. Each implementation (one backed by PostgreSQL, one backed by an in-memory map) is an interpreter. The business logic in createUser is a program that works with any interpreter. You choose IO for production and State for testing. This is effect polymorphism: the program is written once and runs under different computational models.

The tagless final pattern in Scala IS dependency injection, but at the type level rather than at the runtime level. The algebra is the interface. The interpreter is the implementation. The F[_] parameter is the extra degree of freedom that lets you abstract over the entire effect system.

John De Goes famously criticized this pattern as “dependency injection with extra steps.” The criticism is fair when you never actually exploit effect polymorphism, when F is always IO and you have a single interpreter per algebra. In that case, you pay for the abstraction without using it. But the criticism is unfair when you genuinely need multiple interpretations: production vs test, sync vs async, real vs mock. The power of the pattern is proportional to the number of interpreters you actually write.

Tagless final in Rust: you are already doing it

Here is the claim: if you have written trait-based hexagonal architecture in Rust, you have written tagless final. You just never called it that.

Consider the port trait from the todo-cli series:

// This IS a tagless-final algebra:
pub trait TaskRepository {
    fn save(&mut self, task: &Task) -> Result<(), RepoError>;
    fn find_all(&self) -> Result<Vec<Task>, RepoError>;
    fn delete(&mut self, id: &TaskId) -> Result<(), RepoError>;
}

// These are interpreters:
impl TaskRepository for JsonFileTaskRepository { /* ... */ }
impl TaskRepository for InMemoryTaskRepository { /* ... */ }

// This is a program polymorphic in the interpreter:
pub struct AddTaskService<R: TaskRepository> {
    repo: R,
}

The trait TaskRepository defines the syntax: what operations exist. JsonFileTaskRepository maps those operations to JSON file I/O. InMemoryTaskRepository maps them to a HashMap. These are two different denotations of the same syntax. The generic parameter R: TaskRepository plays the same role as F[_] in Scala: it abstracts over the interpreter, and the business logic never knows which one it is talking to.
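To show the whole loop in one place, here is a trimmed, self-contained sketch of the same pattern, with simplified Task and RepoError types standing in for the todo-cli ones and the delete operation omitted for brevity:

```rust
// Simplified stand-ins for the todo-cli types.
#[derive(Clone, Debug, PartialEq)]
struct Task { title: String }

#[derive(Debug)]
struct RepoError;

// The algebra: what operations exist.
trait TaskRepository {
    fn save(&mut self, task: &Task) -> Result<(), RepoError>;
    fn find_all(&self) -> Result<Vec<Task>, RepoError>;
}

// An interpreter: maps the operations onto a Vec in memory.
#[derive(Default)]
struct InMemoryTaskRepository { tasks: Vec<Task> }

impl TaskRepository for InMemoryTaskRepository {
    fn save(&mut self, task: &Task) -> Result<(), RepoError> {
        self.tasks.push(task.clone());
        Ok(())
    }
    fn find_all(&self) -> Result<Vec<Task>, RepoError> {
        Ok(self.tasks.clone())
    }
}

// The program: polymorphic in the interpreter R, never knows which
// implementation it is talking to.
struct AddTaskService<R: TaskRepository> { repo: R }

impl<R: TaskRepository> AddTaskService<R> {
    fn add_task(&mut self, title: &str) -> Result<(), RepoError> {
        self.repo.save(&Task { title: title.into() })
    }
}
```

Swapping in a JSON-file interpreter changes nothing in AddTaskService; only the type parameter supplied at construction differs.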

Rust does not have native higher-kinded types, so you cannot write trait UserRepo[F[_]] the way Scala does. This means the full effect-polymorphic version of tagless final, where F abstracts over IO, State, Future simultaneously, requires GATs or associated types and is more verbose. But for the non-effect-polymorphic case, which covers most application architecture, trait + impl + generic is sufficient. And that case is the common one.

The pattern maps cleanly:

| Scala (tagless final) | Rust (hexagonal architecture) |
|---|---|
| Algebra (trait Repo[F[_]]) | Port trait (trait TaskRepository) |
| Interpreter (given Repo[IO]) | Adapter (impl TaskRepository for JsonFile...) |
| Program (def logic[F[_]: Monad]) | Service (struct Service<R: TaskRepository>) |
| Effect type (F[_]) | Generic parameter (R: TaskRepository) |

The vocabulary differs. The pattern is identical.

Resolving the Expression Problem

The Expression Problem, first articulated by Philip Wadler in 1998, asks: can you add both new data variants and new operations without modifying existing code? With the initial encoding (ADTs), adding a new operation is easy (write a new function) but adding a new variant forces you to update every existing function. With the final encoding (traits), both axes are independently extensible.

Adding new syntax? Define an extension trait:

// Rust: extension trait
trait MulSym: ExprSym {
    fn mul(&self, l: Self::Repr, r: Self::Repr) -> Self::Repr;
}

// Scala: extension trait
trait MulSym[Repr] extends ExprSym[Repr] {
  def mul(l: Repr, r: Repr): Repr
}

Adding new semantics? Write a new implementation. Neither change touches existing code. The Expression Problem post explores this tension in detail; tagless final is one of its cleanest resolutions.
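To see both extensions run on the Rust side, here is a self-contained sketch that restates the base algebra and Eval from earlier in the post, then adds mul without touching either of them:

```rust
// Base algebra and its evaluator, as defined earlier in the post.
trait ExprSym {
    type Repr;
    fn lit(&self, n: i32) -> Self::Repr;
    fn add(&self, l: Self::Repr, r: Self::Repr) -> Self::Repr;
    fn neg(&self, e: Self::Repr) -> Self::Repr;
}

struct Eval;
impl ExprSym for Eval {
    type Repr = i32;
    fn lit(&self, n: i32) -> i32 { n }
    fn add(&self, l: i32, r: i32) -> i32 { l + r }
    fn neg(&self, e: i32) -> i32 { -e }
}

// Extension: new syntax in a new trait, new semantics in a new impl.
trait MulSym: ExprSym {
    fn mul(&self, l: Self::Repr, r: Self::Repr) -> Self::Repr;
}

impl MulSym for Eval {
    fn mul(&self, l: i32, r: i32) -> i32 { l * r }
}

// Old programs still compile against Eval; new programs demand MulSym.
fn extended_program<S: MulSym>(s: &S) -> S::Repr {
    s.mul(s.lit(3), s.add(s.lit(1), s.lit(2)))
}
```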

When to use initial vs final

Neither encoding is universally better. The choice depends on what you need to do with the language.

| Use initial (ADTs) when… | Use final (tagless final) when… |
|---|---|
| You need to inspect, serialize, or optimize the AST | You need multiple interpretations |
| Pattern matching is your primary tool | You want to swap implementations at the type level |
| Operations grow faster than data types | Data types grow faster than operations |
| You need multi-pass transformations | You want effect polymorphism (Scala) |
| You need to debug by printing the tree | You are doing hexagonal architecture (Rust) |

In practice, many real systems use both. A compiler front-end builds an AST (initial encoding) and then interprets it through a pipeline of passes. A service layer defines port traits (final encoding) and injects implementations at startup. The distinction is not a religion; it is a design dimension.

The Böhm-Berarducci isomorphism proves that the two encodings are equivalent in expressive power: any initial encoding can be mechanically transformed into a final one via Church encoding, and vice versa. They are dual perspectives on the same structure. The question is never “which one is correct” but “which one makes change easier for this codebase.”
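One direction of that equivalence is easy to demonstrate: write an interpreter whose domain is the AST itself, and any final-encoded program reifies back into an initial one. A sketch restating the earlier Rust definitions:

```rust
// Final -> initial: an interpreter whose Repr IS the AST. Running a
// tagless-final program against it reifies the program as a tree.
trait ExprSym {
    type Repr;
    fn lit(&self, n: i32) -> Self::Repr;
    fn add(&self, l: Self::Repr, r: Self::Repr) -> Self::Repr;
    fn neg(&self, e: Self::Repr) -> Self::Repr;
}

#[derive(Debug, PartialEq)]
enum Expr {
    Lit(i32),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

struct Reify;
impl ExprSym for Reify {
    type Repr = Expr;
    fn lit(&self, n: i32) -> Expr { Expr::Lit(n) }
    fn add(&self, l: Expr, r: Expr) -> Expr { Expr::Add(Box::new(l), Box::new(r)) }
    fn neg(&self, e: Expr) -> Expr { Expr::Neg(Box::new(e)) }
}

fn program<S: ExprSym>(s: &S) -> S::Repr {
    s.add(s.lit(1), s.neg(s.lit(2)))
}
```

Running program against Reify rebuilds exactly the tree the initial encoding would have constructed by hand.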

Closing

The connection from formal semantics to daily engineering is not a metaphor. It is a precise correspondence. Building an ADT is writing a grammar. Writing a match over it is writing operational semantics. Defining a trait and providing implementations is writing denotational semantics. Kiselyov’s insight was not to invent a new pattern; it was to notice that the pattern programmers already used had a formal name and a body of theory behind it.

The next time you define a trait with multiple implementations, know exactly what you are doing: you are writing denotational semantics in a programming language rather than on paper. The algebra is your syntax. The implementation is your semantic function. The compiler is your proof assistant. You have been doing formal methods all along.

References:

  • Kiselyov, Carette, and Shan, “Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages” (JFP 2009; conference version 2007)
  • Wadler, “The Expression Problem” (1998)
  • okmij.org/ftp/tagless-final/ (Kiselyov’s resource page on tagless final)