Stuart Axelbrooke e221cd8502 add crazy idea

2025-12-01 02:14:27 +08:00

Event-Sourced CPN Framework

A vision for a Rust library/framework combining event sourcing, Colored Petri Net semantics, and compile-time safety for building correct distributed systems.

The Problem

In highly connected applications with multiple entity types and relationships (like databuild's Wants, JobRuns, Partitions), developers face combinatorial complexity:

For each edge type between entities, you need:

Forward accessor
Inverse accessor (index)
Index maintenance on creation
Index maintenance on deletion
Consistency checks
Query patterns for traversal

As the number of entities and edges grows, this becomes:

Hard to keep in your head
Error-prone (forgot to update an index)
Lots of boilerplate
Testing burden for plumbing rather than business logic

The temptation is to throw up your hands and use SQL with foreign keys, or to accept eventual consistency. Either choice sacrifices the compile-time guarantees Rust can provide.

The Vision

A framework where developers declare:

Entities with their valid states (state machines)
Edges between entities (typed, directional, with cardinality)
Transitions (what state changes are valid, and when)

And the framework provides:

Auto-generated accessors (both directions)
Auto-maintained indexes
Compile-time invalid transition errors
Runtime referential integrity (fail-fast or transactional)
Event log as source of truth with replay capability
Potential automatic concurrency from CPN place-disjointness

Why

Correctness Guarantees

Compile-time: Invalid state transitions are type errors
Compile-time: Edge definitions guarantee bidirectional navigability
Runtime: Referential integrity violations detected immediately
Result: "If it compiles and the event log replays, the state is consistent"

Performance "For Free"

Indexes auto-maintained as edges are created/destroyed
No query planning needed - traversal patterns known at compile time
Potential: CPN place-disjointness → automatic safe concurrency

Developer Experience

Declare entities, states, edges, transitions
Library generates: accessors, inverse indexes, transition methods, consistency checks
Focus on what, not how: the plumbing disappears
Still Rust: escape hatch to custom logic when needed

Testing Burden Reduction

No tests for "did I update the index correctly"
No tests for "can I traverse this relationship backwards"
Focus tests on business logic, not graph bookkeeping

How

Foundations

Colored Petri Nets for state machine composition semantics
Typestate pattern for compile-time transition validity
Event sourcing for persistence and replay
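
To make the CPN foundation and the place-disjointness idea more concrete, here is a minimal sketch of places, tokens, and transitions. Everything in it is an assumption made for illustration: the `Place`, `Token`, `Transition`, and `Marking` names, the place names used in examples, and the simplification that a token's "color" is just an entity id and that a transition moves one token from all of its input places to all of its output places. A real CPN has richer arc expressions and guards.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical: a place is identified by a static name.
type Place = &'static str;
/// Hypothetical: a token's "color" is just an entity id.
type Token = u64;

/// A transition consumes a token from each input place and
/// produces it into each output place.
struct Transition {
    inputs: BTreeSet<Place>,
    outputs: BTreeSet<Place>,
}

impl Transition {
    fn new(inputs: &[Place], outputs: &[Place]) -> Self {
        Transition {
            inputs: inputs.iter().copied().collect(),
            outputs: outputs.iter().copied().collect(),
        }
    }

    /// Every place this transition reads or writes.
    fn touched(&self) -> BTreeSet<Place> {
        self.inputs.union(&self.outputs).copied().collect()
    }

    /// Transitions touching disjoint sets of places cannot interfere,
    /// so they are candidates for concurrent firing.
    fn is_disjoint_from(&self, other: &Transition) -> bool {
        self.touched().is_disjoint(&other.touched())
    }
}

/// The marking maps each place to the tokens currently in it.
#[derive(Default)]
struct Marking {
    tokens: HashMap<Place, Vec<Token>>,
}

impl Marking {
    fn put(&mut self, place: Place, token: Token) {
        self.tokens.entry(place).or_default().push(token);
    }

    fn count(&self, place: Place) -> usize {
        self.tokens.get(&place).map_or(0, Vec::len)
    }

    /// Fire `t` for `token`. Returns false (and changes nothing) unless
    /// the token is present in every input place.
    fn fire(&mut self, t: &Transition, token: Token) -> bool {
        let enabled = t
            .inputs
            .iter()
            .all(|p| self.tokens.get(p).map_or(false, |v| v.contains(&token)));
        if !enabled {
            return false;
        }
        for p in &t.inputs {
            let v = self.tokens.get_mut(p).unwrap();
            let idx = v.iter().position(|x| *x == token).unwrap();
            v.swap_remove(idx);
        }
        for p in &t.outputs {
            self.put(*p, token);
        }
        true
    }
}
```

Under this model, a scheduler could compare `touched()` sets to decide which transitions are safe to run in parallel. Whether that check can be done at compile time is one of the things a prototype would need to establish.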

Implementation Approach

Declarative DSL or proc macros for entity/edge/transition definitions:

// Hypothetical syntax
entity! {
    Want {
        states: [New, Idle, Building, Successful, Failed, Canceled],
        transitions: [
            New -> Idle,
            New -> Building,
            Idle -> Building,
            Building -> Successful,
            Building -> Failed,
            // ...
        ]
    }
}

edge! {
    servicing_wants: JobRun -> many Want,
    built_by: Partition -> one JobRun,
}

Code generation produces:

Entity structs with state type parameters
Edge storage with auto-maintained inverses
Transition methods that enforce valid source states
Query methods for traversal in both directions
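
As a rough idea of what the generated code for the Want entity might look like, the sketch below hand-writes the typestate pattern for a subset of the transitions listed above. It is an assumption about the output shape, not a committed design: the method names (`idle`, `start`, `succeed`, `fail`), the `into_state` helper, and the `id` field are all invented here.

```rust
use std::marker::PhantomData;

// Zero-sized state markers; names follow the Want example above.
struct New;
struct Idle;
struct Building;
struct Successful;
struct Failed;

/// An entity whose current state lives in the type parameter `S`.
struct Want<S> {
    id: u64,
    _state: PhantomData<S>,
}

impl<S> Want<S> {
    /// Internal helper: change the state marker, keep the data.
    fn into_state<T>(self) -> Want<T> {
        Want { id: self.id, _state: PhantomData }
    }
}

impl Want<New> {
    fn new(id: u64) -> Self {
        Want { id, _state: PhantomData }
    }
    /// New -> Idle
    fn idle(self) -> Want<Idle> {
        self.into_state()
    }
    /// New -> Building
    fn start(self) -> Want<Building> {
        self.into_state()
    }
}

impl Want<Idle> {
    /// Idle -> Building
    fn start(self) -> Want<Building> {
        self.into_state()
    }
}

impl Want<Building> {
    /// Building -> Successful
    fn succeed(self) -> Want<Successful> {
        self.into_state()
    }
    /// Building -> Failed
    fn fail(self) -> Want<Failed> {
        self.into_state()
    }
}

/// Walks one valid path through the state machine.
fn build_one(id: u64) -> Want<Successful> {
    // `Want::<New>::new(id).succeed()` would be a compile error:
    // there is no `succeed` method on `Want<New>`.
    Want::<New>::new(id).idle().start().succeed()
}
```

Because each transition consumes `self`, the old state value cannot be reused after a transition, and the state markers add no runtime size to the struct.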

The Graph Model

Entities are nodes (with state)
Edges are typed, directional, with cardinality (one/many)
Both directions always queryable
Edge creation/deletion is transactional within a step
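
A minimal sketch of edge storage with an auto-maintained inverse follows. The `Edge`, `Cardinality`, and `EdgeError` types and the `link` / `unlink` / `targets` / `sources` methods are hypothetical names, and plain `u64` ids stand in for entity references. The point is only that both maps are updated in one place, so callers cannot forget the inverse.

```rust
use std::collections::{BTreeSet, HashMap};
use std::hash::Hash;

#[derive(Clone, Copy, PartialEq)]
enum Cardinality {
    One,
    Many,
}

#[derive(Debug, PartialEq)]
enum EdgeError {
    CardinalityExceeded,
}

/// A typed, directional edge with forward and inverse indexes.
struct Edge<F, T> {
    cardinality: Cardinality,
    forward: HashMap<F, BTreeSet<T>>,
    inverse: HashMap<T, BTreeSet<F>>,
}

impl<F, T> Edge<F, T>
where
    F: Copy + Eq + Hash + Ord,
    T: Copy + Eq + Hash + Ord,
{
    fn new(cardinality: Cardinality) -> Self {
        Edge { cardinality, forward: HashMap::new(), inverse: HashMap::new() }
    }

    /// Add `from -> to`. With `Cardinality::One`, a source may point at
    /// only one target; a second, different target is rejected.
    fn link(&mut self, from: F, to: T) -> Result<(), EdgeError> {
        let targets = self.forward.entry(from).or_default();
        if self.cardinality == Cardinality::One
            && !targets.is_empty()
            && !targets.contains(&to)
        {
            return Err(EdgeError::CardinalityExceeded);
        }
        targets.insert(to);
        self.inverse.entry(to).or_default().insert(from);
        Ok(())
    }

    /// Remove `from -> to` from both indexes.
    fn unlink(&mut self, from: F, to: T) {
        if let Some(set) = self.forward.get_mut(&from) {
            set.remove(&to);
        }
        if let Some(set) = self.inverse.get_mut(&to) {
            set.remove(&from);
        }
    }

    /// Forward traversal: everything `from` points at, in sorted order.
    fn targets(&self, from: F) -> Vec<T> {
        self.forward
            .get(&from)
            .map(|s| s.iter().copied().collect())
            .unwrap_or_default()
    }

    /// Inverse traversal: everything pointing at `to`, in sorted order.
    fn sources(&self, to: T) -> Vec<F> {
        self.inverse
            .get(&to)
            .map(|s| s.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

For example, `built_by: Partition -> one JobRun` would map to `Edge::<PartitionId, JobRunId>::new(Cardinality::One)`, where `sources(job_run)` answers "which partitions did this job run build?" without a hand-written index.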

Entry Point

A single entry point, step(event) -> Result<(), StepError>, that:

Validates the event against current state
Applies state transitions
Updates all affected indexes
Returns Ok(()) on success, or rolls back and returns a StepError
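
The sketch below shows one way such an entry point could look for a single entity type, together with event replay built on top of it. It is a simplification under several assumptions: state is checked at runtime with an enum instead of typestate, index maintenance is omitted, and the `Event` and `StepError` variants and the `replay` function are invented for illustration. Validation happens before any mutation, so a failed step leaves the state unchanged without needing an explicit rollback.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum WantState {
    New,
    Idle,
    Building,
    Successful,
    Failed,
}

/// Hypothetical event shapes.
#[derive(Debug, Clone, Copy)]
enum Event {
    WantCreated { id: u64 },
    WantTransitioned { id: u64, to: WantState },
}

#[derive(Debug, PartialEq)]
enum StepError {
    DuplicateWant(u64),
    UnknownWant(u64),
    InvalidTransition { from: WantState, to: WantState },
}

#[derive(Debug, Default, Clone, PartialEq)]
struct State {
    wants: BTreeMap<u64, WantState>,
}

/// The transition table from the Want example.
fn is_valid(from: WantState, to: WantState) -> bool {
    use WantState::*;
    matches!(
        (from, to),
        (New, Idle) | (New, Building) | (Idle, Building)
            | (Building, Successful) | (Building, Failed)
    )
}

impl State {
    /// Validate first, then mutate: an Err never leaves partial changes.
    fn step(&mut self, event: Event) -> Result<(), StepError> {
        match event {
            Event::WantCreated { id } => {
                if self.wants.contains_key(&id) {
                    return Err(StepError::DuplicateWant(id));
                }
                self.wants.insert(id, WantState::New);
            }
            Event::WantTransitioned { id, to } => {
                let from = *self.wants.get(&id).ok_or(StepError::UnknownWant(id))?;
                if !is_valid(from, to) {
                    return Err(StepError::InvalidTransition { from, to });
                }
                self.wants.insert(id, to);
            }
        }
        Ok(())
    }

    /// Rebuild state from the event log; stops at the first invalid event.
    fn replay(events: &[Event]) -> Result<State, StepError> {
        let mut state = State::default();
        for event in events {
            state.step(*event)?;
        }
        Ok(state)
    }
}
```

Since `replay` is just a fold of `step` over the log, any log that replays without error yields a state that passed every validation along the way.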

Transactionality

Beyond Fail-Fast

Instead of panicking on consistency violations, support transactional semantics:

// Infallible (panics on error)
state.step(event);

// Fallible (returns Err and leaves state unchanged on failure)
let result: Result<(), StepError> = state.try_step(event);

// Explicit transaction (for multi-event atomicity)
let mut txn = state.begin();
txn.apply(event1)?;
txn.apply(event2)?;
txn.commit(); // dropping `txn` without committing rolls back

What This Enables

Local atomicity: A single event either fully applies or doesn't - no partial states
Distributed coordination: If step can return Err instead of panicking:
  - Try to apply an event
  - If it fails, coordinate with other systems before retrying
  - Implement saga patterns, 2PC, etc.
Speculative execution: "What if I applied this event?" without committing
  - Useful for validation, dry-runs, conflict detection
Optimistic concurrency:
  - Multiple workers try to apply events concurrently
  - Conflicts detected and rolled back
  - Retry with updated state
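
The optimistic-concurrency loop can be sketched with a version counter. The `Versioned` and `Conflict` types and the `snapshot` / `try_commit` methods are hypothetical, and the sketch is single-threaded: real workers would need the store behind a `Mutex` or similar, which is left out here.

```rust
/// State tagged with a version that increases on every commit.
struct Versioned<S> {
    version: u64,
    state: S,
}

#[derive(Debug, PartialEq)]
struct Conflict {
    expected: u64,
    actual: u64,
}

impl<S: Clone> Versioned<S> {
    fn new(state: S) -> Self {
        Versioned { version: 0, state }
    }

    /// Take a private copy to apply events to speculatively.
    fn snapshot(&self) -> (u64, S) {
        (self.version, self.state.clone())
    }

    /// Commit only if nobody else has committed since `base_version`.
    /// On conflict the store is untouched and the caller should retry
    /// from a fresh snapshot.
    fn try_commit(&mut self, base_version: u64, new_state: S) -> Result<u64, Conflict> {
        if base_version != self.version {
            return Err(Conflict { expected: base_version, actual: self.version });
        }
        self.state = new_state;
        self.version += 1;
        Ok(self.version)
    }
}
```

A worker that receives `Conflict` simply takes a new snapshot, re-applies its events, and tries again. Speculative execution falls out of the same shape: apply events to the snapshot and never call `try_commit`.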

Implementation Options

Copy-on-write / snapshot: Clone state, apply to clone, swap on success
  - Simple but memory-heavy for large state
Command pattern / undo log: Record inverse operations, replay backwards on rollback
  - More complex, but efficient for small changes to large state
MVCC-style: Version all entities, only "commit" versions on success
  - Most sophisticated, enables concurrent reads during transaction
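
Of the three options, copy-on-write is the simplest to sketch. In the version below, `Store`, `Txn`, and the closure-based `apply` are assumptions made to keep the example generic; the framework would presumably take events, not closures. The transaction works on a private clone and only writes it back in `commit`, so dropping the transaction is the rollback.

```rust
/// Hypothetical container for the framework's state.
struct Store<S> {
    state: S,
}

/// A transaction holding a private working copy of the state.
/// The store itself is untouched until `commit`.
struct Txn<'a, S> {
    target: &'a mut Store<S>,
    working: S,
}

impl<S: Clone> Store<S> {
    /// Start a transaction by cloning the current state.
    fn begin(&mut self) -> Txn<'_, S> {
        let working = self.state.clone();
        Txn { target: self, working }
    }
}

impl<'a, S> Txn<'a, S> {
    /// Apply a change to the working copy only. If `f` fails, the
    /// caller is expected to drop the transaction without committing.
    fn apply<E>(&mut self, f: impl FnOnce(&mut S) -> Result<(), E>) -> Result<(), E> {
        f(&mut self.working)
    }

    /// Swap the working copy in as the new state.
    fn commit(self) {
        let Txn { target, working } = self;
        target.state = working;
    }
}
```

The `&mut` borrow held by `Txn` also means the borrow checker rules out a second concurrent transaction on the same store, at the cost of one full clone per transaction.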

Relationship to Datomic

Datomic is a distributed database built on similar principles that validates many of these ideas in production:

Shared Concepts

| Concept | Datomic | This Framework |
|---|---|---|
| Immutable facts | Datoms (E-A-V-T tuples) | BEL events |
| Time travel | `as-of` queries | Event replay |
| Speculative execution | `d/with` | `try_step()` / transactions |
| Atomic commits | `d/transact` = `d/with` + durable swap | `step()` = validate + apply + persist |
| Transaction-time validation | Transaction functions with `db-before` | Transition guards |
| Post-transaction validation | Entity specs with `db-after` | Invariant checks |
| Single writer | Transactor serializes all writes | Single `step()` entry point |
| Horizontal read scaling | Peers cache and query locally | Immutable state snapshots |

Datomic's Speculative Writes

Datomic's d/with is particularly relevant - it's a pure function that takes a database value and proposed facts, returning a new database value without persisting. This enables:

Testing transactions without mutation
Composing transaction data before committing
Enforcing invariants by speculatively applying, checking, then committing or aborting
Development against production data safely (via libraries like Datomock)

What Datomic Doesn't Provide

CPN state machine semantics: Typed transitions between entity states
Compile-time transition validity: Invalid transitions caught by the type system
Auto-generated bidirectional indexes: Declared edges automatically traversable both ways
Rust: Memory safety, zero-cost abstractions, embeddable

The vision here is essentially: Datomic's transaction model + CPN state machines + Rust compile-time safety

Open Questions

How to express transition guards (conditions beyond "in state X")?
How to handle edges to entities that don't exist yet (forward references)?
Serialization format for the event log?
How much CPN formalism to expose vs. hide?
What's the right granularity for "places" in the CPN model?
How does this interact with async/distributed systems?

Potential Names

Something evoking: event-sourced + graph + state machines + Rust

petri-graph
ironweave (iron = Rust, weave = connected graph)
factforge
datumflow

Prior Art to Investigate

Datomic (Clojure, distributed immutable database)
Bevy ECS (Rust, entity-component-system with events)
CPN Tools (Petri net modeling/simulation)
Diesel / SeaORM (Rust, compile-time SQL checking)
EventStoreDB (event sourcing infrastructure)

Next Steps

This document captures the "why" and "how" at a conceptual level. To validate:

Prototype the macro/DSL syntax for a simple 2-3 entity system
Implement auto-indexed bidirectional edges
Implement typestate transitions
Add speculative execution (try_step)
Benchmark against hand-written equivalent
Evaluate ergonomics in real use (databuild as first consumer)
