In Making Invalid States Unrepresentable 1. Why boolean flags are bugs in disguise we saw that boolean flags create an exponential explosion of representable states, most of which are nonsense. Enums fix this by offering a 1:1 mapping between type and domain.
But why does this work? The answer is in the name: algebraic data types. Types obey arithmetic, and the difference between booleans and enums is the difference between multiplication and addition.
Product types (AND): structs, tuples, records #
A product type combines multiple fields. Its cardinality (number of possible values) is the product of its fields’ cardinalities:
|A x B| = |A| * |B|A struct with two bool fields has 2 * 2 = 4 possible values. A struct with a bool and a three-variant enum has 2 * 3 = 6 values.
Think of it as a grid. A bool x bool is a 2x2 grid with 4 cells. Every cell is a representable value, whether it makes sense or not.
Sum types (OR): enums, tagged unions, variants #
A sum type is one of several alternatives. Its cardinality is the sum of its variants’ cardinalities:
|A + B| = |A| + |B|An enum with three unit variants has 1 + 1 + 1 = 3 possible values. An enum where one variant carries a String and another carries nothing has |String| + 1 values.
Think of it as a list. You pick exactly one item from the list. No combinations, no grid.
The arithmetic in practice #
| Type | Algebra | Cardinality |
|---|---|---|
bool |
– | 2 |
(bool, bool) |
2 * 2 | 4 |
(bool, bool, bool) |
2 * 2 * 2 | 8 |
(bool, bool, bool, bool) |
2^4 | 16 |
enum { A, B, C } |
1 + 1 + 1 | 3 |
enum { A, B, C, D, E } |
1 + 1 + 1 + 1 + 1 | 5 |
Option<T> |
|T| + 1 | |T| + 1 |
Result<T, E> |
|T| + |E| | |T| + |E| |
() (unit) |
– | 1 |
! (never) |
– | 0 |
The key insight: Option<T> is not a special language feature. It is a sum type: Some(T) + None = |T| + 1. Result<T, E> is also a sum type: Ok(T) + Err(E) = |T| + |E|. The entire Rust error handling model is algebraic data types all the way down.
Multiplication vs addition: a concrete comparison #
Recall the online order from Part 1. Four boolean flags (is_paid, is_shipped, is_delivered, is_cancelled) create a product type:
Cardinality = 2 * 2 * 2 * 2 = 16 states
Valid states = 5 (Placed, Paid, Shipped, Delivered, Cancelled)
Wasted states = 11An enum with five variants creates a sum type:
Cardinality = 1 + 1 + 1 + 1 + 1 = 5 states
Valid states = 5
Wasted states = 0The difference between 2^n and n is not style. It is the difference between a type that lies about your domain and one that tells the truth.
Exhaustive pattern matching: your compile-time safety net #
When you match on an enum, the compiler performs three checks that mirror formal verification techniques:
- Totality checking: every possible variant is handled. No case is forgotten.
- Unreachability checking: if a branch can never match, the compiler warns. Dead code is detected statically.
- Refactoring safety: adding a new variant causes compile errors at every incomplete
match.
Compare the boolean approach to the enum approach:
// Booleans: no compiler help
if is_paid {
// handle paid
} else if is_shipped {
// handle shipped
} else {
// "placed"... but what about cancelled? delivered?
// The compiler says nothing when you add is_refunded.
}
// Enum: compiler enforces completeness
match status {
OrderStatus::Placed => { /* ... */ }
OrderStatus::Paid { .. } => { /* ... */ }
OrderStatus::Shipped { .. } => { /* ... */ }
OrderStatus::Delivered { .. } => { /* ... */ }
OrderStatus::Cancelled { .. } => { /* ... */ }
// Add OrderStatus::Refunded and this match fails to compile.
}The match is not syntactic sugar. It is a proof obligation: you must demonstrate that your code handles every possible state. The compiler verifies the proof on every build.
The same principle across languages #
Making invalid states unrepresentable is not a Rust-specific idea. Every language with sum types supports it.
Haskell #
data OrderStatus
= Placed
| Paid String -- transaction id
| Shipped String -- tracking number
| Delivered UTCTime
| Cancelled String -- reason
handleOrder :: OrderStatus -> IO ()
handleOrder Placed = putStrLn "Awaiting payment"
handleOrder (Paid txn) = putStrLn ("Paid: " ++ txn)
handleOrder (Shipped track) = putStrLn ("Shipped: " ++ track)
-- Missing Delivered and Cancelled: compile warning with -WallHaskell has had sum types since 1990. GHC’s -fwarn-incomplete-patterns turns missing cases into warnings.
OCaml #
type order_status =
| Placed
| Paid of string
| Shipped of string
| Delivered of float
| Cancelled of string
let handle_order = function
| Placed -> print_endline "Awaiting payment"
| Paid txn -> print_endline ("Paid: " ^ txn)
(* Missing cases: Warning 8 *)OCaml pioneered exhaustive pattern matching as an error, not just a warning. Jane Street’s entire trading infrastructure runs on this principle.
TypeScript #
type OrderStatus =
| { kind: "placed" }
| { kind: "paid"; transactionId: string }
| { kind: "shipped"; trackingNumber: string }
| { kind: "delivered"; deliveredAt: Date }
| { kind: "cancelled"; reason: string };
function handleOrder(status: OrderStatus): string {
switch (status.kind) {
case "placed": return "Awaiting payment";
case "paid": return `Paid: ${status.transactionId}`;
case "shipped": return `Shipped: ${status.trackingNumber}`;
case "delivered": return `Delivered at ${status.deliveredAt}`;
case "cancelled": return `Cancelled: ${status.reason}`;
default: {
const _exhaustive: never = status;
return _exhaustive; // Compile error if a case is missing
}
}
}TypeScript’s never trick provides exhaustive checking. The discriminant field (kind) acts as the tag. TypeScript narrows the type in each branch, so status.transactionId is only accessible inside case "paid".
Java (17+) #
sealed interface OrderStatus
permits Placed, Paid, Shipped, Delivered, Cancelled {}
record Placed() implements OrderStatus {}
record Paid(String transactionId) implements OrderStatus {}
record Shipped(String trackingNumber) implements OrderStatus {}
record Delivered(Instant deliveredAt) implements OrderStatus {}
record Cancelled(String reason) implements OrderStatus {}
// Java 21 exhaustive switch:
String label = switch (status) {
case Placed p -> "Awaiting payment";
case Paid p -> "Paid: " + p.transactionId();
case Shipped s -> "Shipped: " + s.trackingNumber();
case Delivered d -> "Delivered at " + d.deliveredAt();
case Cancelled c -> "Cancelled: " + c.reason();
// Missing case: compile error
};Java arrived late to this party (sealed in Java 17, pattern matching switch in Java 21), but the mechanism is the same.
Scala #
enum OrderStatus:
case Placed
case Paid(transactionId: String)
case Shipped(trackingNumber: String)
case Delivered(deliveredAt: Instant)
case Cancelled(reason: String)
def handleOrder(status: OrderStatus): String = status match
case OrderStatus.Placed => "Awaiting payment"
case OrderStatus.Paid(txn) => s"Paid: $txn"
case OrderStatus.Shipped(track) => s"Shipped: $track"
case OrderStatus.Delivered(at) => s"Delivered at $at"
case OrderStatus.Cancelled(reason) => s"Cancelled: $reason"Scala 3’s enum keyword provides first-class sum types with exhaustive matching. In Scala 2, sealed trait + case class achieved the same effect.
The takeaway #
The arithmetic is simple: product types multiply, sum types add. When your domain has k mutually exclusive states and you model them as n boolean flags, you represent 2^n states of which only k are valid. The remaining 2^n - k values are bugs in waiting.
The fix is also simple: use a sum type. In Rust, that is an enum. In Haskell, a data declaration. In TypeScript, a discriminated union. In Java, a sealed interface. In Scala, an enum or sealed trait. The syntax changes; the principle does not.
In Making Invalid States Unrepresentable 3. Real bugs from representable nonsense, we look at real-world consequences: the bugs that happen when types lie about your domain.