Making Invalid States Unrepresentable 3: Real bugs from representable nonsense

Rafael Fernandez
Making Invalid States Unrepresentable - This article is part of a series.
Part 3: This Article

Part 1 of this series, “Why boolean flags are bugs in disguise,” made the case against boolean flags. Part 2, “The algebra behind your types,” covered the algebra that explains the problem. Now let us look at the real damage this causes.

These are not hypothetical scenarios. They are patterns that have caused outages, security vulnerabilities, and financial loss across the industry.

Tony Hoare’s billion-dollar mistake

The most famous invalid state that should have been unrepresentable: null.

Tony Hoare, the inventor of null references, called it his “billion-dollar mistake.” In his own words:

“I call it my billion-dollar mistake. It was the invention of the null reference in 1965. […] This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.”

The problem is structural. In languages like Java, C# (before its opt-in nullable reference types), or JavaScript, any reference implicitly includes null in its domain. A String is not really a string; it is a String | null. Every function that takes a String must consider the possibility that it is actually nothing.

The fix is the simplest case of making invalid states unrepresentable:

// In Java or C#, a declaration like this hides the fact that the value may be null:
//     String name = getName();

// In Rust, get_name returns an Option, which makes the absence explicit:
let name: Option<String> = get_name();

match name {
    Some(n) => println!("Hello, {n}"),
    None => println!("No name provided"),
}

Option<String> forces you to handle both cases. You cannot call .len() on an Option<String> without first proving it is Some. The compiler does the checking, instead of relying on the programmer's memory.
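
As a minimal runnable sketch of that guarantee (get_name, its hard-coded data, and the two helpers are hypothetical, invented for illustration):

```rust
/// Hypothetical lookup: returns the user's name if the id is known.
fn get_name(user_id: u32) -> Option<String> {
    match user_id {
        1 => Some(String::from("Ada")),
        _ => None,
    }
}

/// Length of the name, or 0 when there is no name.
fn name_len(name: &Option<String>) -> usize {
    // `name.len()` would not compile: Option<String> has no `len` method.
    // The String has to be reached explicitly, with a fallback for None.
    name.as_ref().map(|n| n.len()).unwrap_or(0)
}

/// Both cases must be handled before a greeting can be produced.
fn greeting(user_id: u32) -> String {
    match get_name(user_id) {
        Some(n) => format!("Hello, {n}"),
        None => String::from("No name provided"),
    }
}

fn main() {
    println!("{}", greeting(1));
    println!("{}", greeting(2));
    println!("{}", name_len(&get_name(1)));
}
```

Removing either arm of the match in greeting turns the missing case into a compile error rather than a runtime surprise.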

The UI state management epidemic

This pattern is everywhere in frontend applications:

interface FetchState {
  isLoading: boolean;
  isError: boolean;
  data: Data | null;
  error: string | null;
}

Four fields. How many combinations? 2 * 2 * 2 * 2 = 16 (treating Data | null and string | null as two states each). How many make sense?

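| State | isLoading | isError | data | error | Valid? |
| --- | --- | --- | --- | --- | --- |
| Idle | false | false | null | null | Yes |
| Loading | true | false | null | null | Yes |
| Success | false | false | Data | null | Yes |
| Error | false | true | null | string | Yes |
| ??? | true | true | Data | string | No |
| ??? | true | false | Data | null | No |
| …10 more | | | | | No |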

Only 4 states are meaningful. The other 12 are ghosts. What does isLoading: true, isError: true, data: someValue mean? Loading with an error but also with data? Every component that reads this state has to decide for itself.
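
The count can be checked mechanically. The sketch below mirrors the interface in Rust and enumerates all 16 combinations, keeping only those that match one of the four rows marked valid in the table. The Flags struct, the is_valid rule, and the use of booleans to stand for "present or null" are assumptions made for illustration.

```rust
/// Mirror of the flag-based FetchState; `has_data` and `has_error`
/// stand in for `data != null` and `error != null`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Flags {
    is_loading: bool,
    is_error: bool,
    has_data: bool,
    has_error: bool,
}

/// True only for the four meaningful rows: idle, loading, success, error.
fn is_valid(f: Flags) -> bool {
    matches!(
        (f.is_loading, f.is_error, f.has_data, f.has_error),
        (false, false, false, false)       // idle
            | (true, false, false, false)  // loading
            | (false, false, true, false)  // success
            | (false, true, false, true)   // error
    )
}

/// Every combination of the four two-valued fields.
fn all_combinations() -> Vec<Flags> {
    let bools = [false, true];
    let mut out = Vec::new();
    for &is_loading in &bools {
        for &is_error in &bools {
            for &has_data in &bools {
                for &has_error in &bools {
                    out.push(Flags { is_loading, is_error, has_data, has_error });
                }
            }
        }
    }
    out
}

fn main() {
    let all = all_combinations();
    let valid = all.iter().filter(|f| is_valid(**f)).count();
    // Prints: 16 combinations, 4 valid, 12 ghosts
    println!("{} combinations, {} valid, {} ghosts", all.len(), valid, all.len() - valid);
}
```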

The fix:

type FetchState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; error: string };

Four variants. Four states. Each carries only its relevant data. A loading state does not have an error message. A success state does not have an error. The impossible combinations are gone.
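
The same shape carries over to any language with sum types. Here is a minimal Rust sketch; the render function and its strings are hypothetical. The match must cover every variant, and each arm can only reach the data its own variant carries.

```rust
/// Rust counterpart of the discriminated union.
#[derive(Debug, PartialEq)]
enum FetchState<T> {
    Idle,
    Loading,
    Success { data: T },
    Error { error: String },
}

/// Hypothetical view function: one arm per state, no defensive checks.
fn render(state: &FetchState<Vec<String>>) -> String {
    match state {
        FetchState::Idle => String::from("Nothing requested yet"),
        FetchState::Loading => String::from("Loading..."),
        // `data` exists only in this arm; there is no error to consult here.
        FetchState::Success { data } => format!("{} items", data.len()),
        // `error` exists only in this arm; there is no data to consult here.
        FetchState::Error { error } => format!("Failed: {error}"),
    }
}

fn main() {
    let states = [
        FetchState::Idle,
        FetchState::Loading,
        FetchState::Success { data: vec![String::from("a"), String::from("b")] },
        FetchState::Error { error: String::from("timeout") },
    ];
    for s in &states {
        println!("{}", render(s));
    }
}
```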

React, Vue, and Angular applications are prone to bugs where components render in impossible states because the type does not prevent them. Libraries such as TanStack Query expose a status field that TypeScript can narrow on as a discriminated union, which is the same idea applied at the library level.

Shotgun parsing

The 2016 paper “The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them” defines shotgun parsing: a programming antipattern where validation checks are scattered across processing code instead of being concentrated at the input boundary.

The consequence: some invalid input gets processed before being detected, leaving the program in an unpredictable state.

This is exactly what happens with boolean flags. Each function that reads the flags performs its own partial validation (“is this combination valid?”), and if any check is missing, invalid state flows through. The validation is shotgunned across the codebase instead of being handled once at the boundary.

With an enum, validation happens once: at the point where data enters the system and is parsed into the sum type. After that, every function downstream works with a type that can only represent valid states. No shotgun parsing needed.
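
A minimal sketch of that boundary, assuming the untyped input arrives as the flag-based record from the UI example. RawFetch, ParseError, parse, label, and the String payloads are hypothetical names chosen for illustration.

```rust
/// Untrusted input as it arrives: every combination is representable.
struct RawFetch {
    is_loading: bool,
    is_error: bool,
    data: Option<String>,
    error: Option<String>,
}

/// The only states the rest of the program ever sees.
#[derive(Debug, PartialEq)]
enum FetchState {
    Idle,
    Loading,
    Success { data: String },
    Error { error: String },
}

#[derive(Debug, PartialEq)]
struct ParseError(&'static str);

/// The single place where flag combinations are validated.
fn parse(raw: RawFetch) -> Result<FetchState, ParseError> {
    match (raw.is_loading, raw.is_error, raw.data, raw.error) {
        (false, false, None, None) => Ok(FetchState::Idle),
        (true, false, None, None) => Ok(FetchState::Loading),
        (false, false, Some(data), None) => Ok(FetchState::Success { data }),
        (false, true, None, Some(error)) => Ok(FetchState::Error { error }),
        // All twelve remaining combinations are rejected here, once.
        _ => Err(ParseError("contradictory flags")),
    }
}

/// Downstream code matches on valid states only.
fn label(state: &FetchState) -> &'static str {
    match state {
        FetchState::Idle => "idle",
        FetchState::Loading => "loading",
        FetchState::Success { .. } => "success",
        FetchState::Error { .. } => "error",
    }
}

fn main() {
    let good = RawFetch { is_loading: true, is_error: false, data: None, error: None };
    let bad = RawFetch {
        is_loading: true,
        is_error: true,
        data: Some(String::from("payload")),
        error: Some(String::from("boom")),
    };
    println!("{:?}", parse(good).map(|s| label(&s)));
    println!("{:?}", parse(bad));
}
```

Everything after parse takes a FetchState, so no later function has to ask whether the flags are consistent.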

Payment processing

A payment system using booleans:

struct Payment {
    is_authorized: bool,
    is_captured: bool,
    is_refunded: bool,
    is_voided: bool,
}

16 combinations. Most are financial nonsense:

  • is_captured: true, is_voided: true: Captured the money and voided the transaction?
  • is_refunded: true, is_captured: false: Refunded money that was never captured?
  • is_authorized: true, is_refunded: true, is_captured: false: Authorized and refunded but never captured?

Bugs in this pattern result in double charges, lost refunds, or inconsistent ledger entries. This is not a theoretical concern. Payment processing bugs are measured in dollars.

The fix:

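// Assumes the chrono crate for timestamps; the alias keeps the field types short.
type DateTime = chrono::DateTime<chrono::Utc>;

enum PaymentStatus {
    Pending,
    Authorized { auth_code: String, expires_at: DateTime },
    Captured { auth_code: String, captured_at: DateTime },
    Refunded { original_capture: String, refunded_at: DateTime },
    Voided { reason: String, voided_at: DateTime },
}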

Each state carries only its relevant data. You cannot have a Refunded payment without the original capture reference. You cannot have a Captured payment without an auth code. Contradictory combinations such as captured-and-voided can no longer be constructed at all.
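
A short sketch of what downstream code looks like with this enum. To stay self-contained it uses the standard library's SystemTime in place of DateTime, and the two helper functions are hypothetical.

```rust
use std::time::SystemTime;

/// Same shape as the enum above, with SystemTime standing in for DateTime.
#[derive(Debug)]
enum PaymentStatus {
    Pending,
    Authorized { auth_code: String, expires_at: SystemTime },
    Captured { auth_code: String, captured_at: SystemTime },
    Refunded { original_capture: String, refunded_at: SystemTime },
    Voided { reason: String, voided_at: SystemTime },
}

/// Hypothetical helper: is captured money currently held?
/// Listing every variant means a newly added state forces a decision here.
fn funds_held(status: &PaymentStatus) -> bool {
    match status {
        PaymentStatus::Captured { .. } => true,
        PaymentStatus::Pending
        | PaymentStatus::Authorized { .. }
        | PaymentStatus::Refunded { .. }
        | PaymentStatus::Voided { .. } => false,
    }
}

/// Hypothetical helper: the capture a refund points back to, if any.
/// Only the Refunded variant has this field, so no other state can supply one.
fn refunded_capture(status: &PaymentStatus) -> Option<&str> {
    match status {
        PaymentStatus::Refunded { original_capture, .. } => Some(original_capture.as_str()),
        _ => None,
    }
}

fn main() {
    let now = SystemTime::now();
    let refunded = PaymentStatus::Refunded {
        original_capture: String::from("C-9"),
        refunded_at: now,
    };
    println!("held: {}", funds_held(&refunded));
    println!("refers to: {:?}", refunded_capture(&refunded));
}
```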

State machines encoded in types

The deeper pattern behind all these examples is that domain states form a state machine, and the type should encode that machine.

Each variant of a sum type corresponds to a state in a finite state automaton. The methods that consume one variant and produce another are the transitions:

// Sketch: Payment is assumed to be a type that carries the PaymentStatus above.
impl Payment {
    // Pending -> Authorized
    fn authorize(self, auth_code: String) -> Payment {
        todo!()
    }
    // Authorized -> Captured
    fn capture(self) -> Payment {
        todo!()
    }
    // Captured -> Refunded
    fn refund(self) -> Payment {
        todo!()
    }
    // Authorized -> Voided (cannot void after capture)
    fn void(self) -> Payment {
        todo!()
    }
}

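The type system can ensure that only valid transitions are expressible, but only if each transition is defined on the state it starts from. With every method taking a general Payment, as in the sketch above, calling refund on a Pending payment still compiles and has to be rejected at run time. Giving each state its own type (the typestate pattern) turns that call into a compile error, because refund then exists only on the captured state. Either way, the state machine is not documented in a wiki page or a code comment. It is encoded in the type.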
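
A minimal typestate sketch of the same machine, with one struct per state. Timestamps are omitted for brevity, the struct names mirror the enum variants, and reusing the auth code as the capture reference is an assumption made only to keep the example small.

```rust
// One type per state, so a transition can exist only where it is valid.
struct Pending;
struct Authorized { auth_code: String }
struct Captured { auth_code: String }
struct Refunded { original_capture: String }
struct Voided { reason: String }

impl Pending {
    /// Pending -> Authorized
    fn authorize(self, auth_code: String) -> Authorized {
        Authorized { auth_code }
    }
}

impl Authorized {
    /// Authorized -> Captured
    fn capture(self) -> Captured {
        Captured { auth_code: self.auth_code }
    }

    /// Authorized -> Voided (Captured has no `void` method at all)
    fn void(self, reason: String) -> Voided {
        Voided { reason }
    }
}

impl Captured {
    /// Captured -> Refunded (capture reference taken from the auth code: an assumption)
    fn refund(self) -> Refunded {
        Refunded { original_capture: self.auth_code }
    }
}

fn main() {
    let refunded = Pending.authorize(String::from("A-1")).capture().refund();
    println!("refunded capture {}", refunded.original_capture);

    let voided = Pending
        .authorize(String::from("A-2"))
        .void(String::from("customer cancelled"));
    println!("voided: {}", voided.reason);

    // Pending.refund();     // does not compile: no method `refund` on `Pending`
    // refunded.capture();   // does not compile: no method `capture` on `Refunded`
}
```

Because each method takes self by value, the old state is consumed by the transition and cannot be reused afterwards. The trade-off is that a payment whose state is only known at run time (for example, one loaded from a database) still needs an enum over these types at the boundary.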

This mirrors what model checkers do in formal verification: enumerate all reachable states and verify that invalid states are unreachable. With algebraic data types, the compiler plays a lighter version of that role. Invalid states are excluded by construction rather than searched for, and the check runs on every build.

The takeaway

Making invalid states unrepresentable is not advice about coding style. It is a design principle rooted in type theory, connected to formal methods, and validated by decades of real-world bugs:

  • Null references are an invalid state (String that is not a string) that should have been Option<String>.
  • UI loading states with boolean flags create ghost states that cause rendering bugs.
  • Shotgun parsing scatters validation across the codebase instead of concentrating it at the boundary.
  • Payment state machines with boolean flags create financial nonsense that costs real money.

The fix is always the same: replace the product of booleans with a sum type, and let the compiler enforce your domain invariants.

That is what Minsky meant. Not “enums are nice.” But: design your types so that the impossible is inexpressible.


If you are curious about the theoretical foundation behind this principle, how types relate to logical propositions and why a well-typed program is literally a proof, check out “The Curry-Howard Correspondence. When types become proofs.”
