# Event-Sourced CPN Framework

A vision for a Rust library/framework combining event sourcing, Colored Petri Net semantics, and compile-time safety for building correct distributed systems.

## The Problem

In highly connected applications with multiple entity types and relationships (like databuild's Wants, JobRuns, Partitions), developers face combinatorial complexity:

For each edge type between entities, you need:
1. Forward accessor
2. Inverse accessor (index)
3. Index maintenance on creation
4. Index maintenance on deletion
5. Consistency checks
6. Query patterns for traversal

As the number of entities and edges grows, this becomes:
- Hard to keep in your head
- Error-prone (forgot to update an index)
- Lots of boilerplate
- Testing burden for plumbing rather than business logic

The temptation is to "throw hands up" and use SQL with foreign keys, or accept eventual consistency. But this sacrifices the compile-time guarantees Rust can provide.

## The Vision

A framework where developers declare:
- **Entities** with their valid states (state machines)
- **Edges** between entities (typed, directional, with cardinality)
- **Transitions** (what state changes are valid, and when)

And the framework provides:
- Auto-generated accessors (both directions)
- Auto-maintained indexes
- Compile-time invalid transition errors
- Runtime referential integrity (fail-fast or transactional)
- Event log as source of truth with replay capability
- Potential automatic concurrency from CPN place-disjointness

## Why

### Correctness Guarantees

- **Compile-time**: Invalid state transitions are type errors
- **Compile-time**: Edge definitions guarantee bidirectional navigability
- **Runtime**: Referential integrity violations detected immediately
- **Result**: "If it compiles and the event log replays, the state is consistent"

### Performance "For Free"

- Indexes auto-maintained as edges are created/destroyed
- No query planning needed - traversal patterns known at compile time
- Potential: CPN place-disjointness → automatic safe concurrency

### Developer Experience

- Declare entities, states, edges, transitions
- Library generates: accessors, inverse indexes, transition methods, consistency checks
- Focus on *what* not *how* - the plumbing disappears
- Still Rust: escape hatch to custom logic when needed

### Testing Burden Reduction

- No tests for "did I update the index correctly"
- No tests for "can I traverse this relationship backwards"
- Focus tests on business logic, not graph bookkeeping

## How

### Foundations

- **Colored Petri Nets** for state machine composition semantics
- **Typestate pattern** for compile-time transition validity
- **Event sourcing** for persistence and replay

### Implementation Approach

Declarative DSL or proc macros for entity/edge/transition definitions:

```rust
// Hypothetical syntax
entity! {
    Want {
        states: [New, Idle, Building, Successful, Failed, Canceled],
        transitions: [
            New -> Idle,
            New -> Building,
            Idle -> Building,
            Building -> Successful,
            Building -> Failed,
            // ...
        ]
    }
}

edge! {
    servicing_wants: JobRun -> many Want,
    built_by: Partition -> one JobRun,
}
```

Code generation produces:
- Entity structs with state type parameters
- Edge storage with auto-maintained inverses
- Transition methods that enforce valid source states
- Query methods for traversal in both directions

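
To make the first and third bullets concrete, here is a hand-written sketch of what the generated code for `Want` might expand to under the typestate pattern. Everything in it is an assumption for illustration: the method names (`idle`, `start_building`, `succeed`, `fail`), the `id` field, and the `example` function are hypothetical, and only a subset of the declared states is covered.

```rust
use std::marker::PhantomData;

// Marker types for a subset of the states declared for Want.
#[derive(Debug)]
pub struct New;
#[derive(Debug)]
pub struct Idle;
#[derive(Debug)]
pub struct Building;
#[derive(Debug)]
pub struct Successful;
#[derive(Debug)]
pub struct Failed;

/// A Want whose current state is tracked by the type parameter `S`.
#[derive(Debug)]
pub struct Want<S> {
    pub id: u64, // hypothetical payload
    _state: PhantomData<S>,
}

impl<S> Want<S> {
    // Private helper: carry the data over into another state type.
    fn into_state<T>(self) -> Want<T> {
        Want { id: self.id, _state: PhantomData }
    }
}

impl Want<New> {
    pub fn new(id: u64) -> Self {
        Want { id, _state: PhantomData }
    }
    // New -> Idle
    pub fn idle(self) -> Want<Idle> {
        self.into_state()
    }
    // New -> Building
    pub fn start_building(self) -> Want<Building> {
        self.into_state()
    }
}

impl Want<Idle> {
    // Idle -> Building
    pub fn start_building(self) -> Want<Building> {
        self.into_state()
    }
}

impl Want<Building> {
    // Building -> Successful
    pub fn succeed(self) -> Want<Successful> {
        self.into_state()
    }
    // Building -> Failed
    pub fn fail(self) -> Want<Failed> {
        self.into_state()
    }
}

/// Walks one valid path through the state machine.
pub fn example() -> Want<Successful> {
    // `Want::new(7).succeed()` would not compile:
    // `succeed` only exists on `Want<Building>`.
    Want::new(7).idle().start_building().succeed()
}
```

Because each transition consumes `self`, a value in an old state cannot be reused after it has moved on, which is what turns an invalid transition into a type error rather than a runtime check.
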
### The Graph Model

- Entities are nodes (with state)
- Edges are typed, directional, with cardinality (one/many)
- Both directions always queryable
- Edge creation/deletion is transactional within a step

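
A minimal sketch of edge storage that keeps both directions queryable, assuming entities are identified by plain `u64` ids. The `EdgeIndex` type and its method names are hypothetical; a generated version would presumably use typed ids per entity and enforce the declared cardinality, which this sketch does not.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Storage for one declared edge type, e.g. `servicing_wants`.
/// The forward and inverse maps are only ever updated together.
#[derive(Debug, Default)]
pub struct EdgeIndex {
    forward: BTreeMap<u64, BTreeSet<u64>>, // source id -> target ids
    inverse: BTreeMap<u64, BTreeSet<u64>>, // target id -> source ids
}

impl EdgeIndex {
    /// Creates the edge and its inverse entry in one call.
    pub fn insert(&mut self, source: u64, target: u64) {
        self.forward.entry(source).or_default().insert(target);
        self.inverse.entry(target).or_default().insert(source);
    }

    /// Removes the edge from both directions.
    pub fn remove(&mut self, source: u64, target: u64) {
        Self::remove_one(&mut self.forward, source, target);
        Self::remove_one(&mut self.inverse, target, source);
    }

    fn remove_one(map: &mut BTreeMap<u64, BTreeSet<u64>>, key: u64, value: u64) {
        if let Some(set) = map.get_mut(&key) {
            set.remove(&value);
            // Drop empty sets so the maps do not accumulate dead keys.
            if set.is_empty() {
                map.remove(&key);
            }
        }
    }

    /// Forward traversal: all targets of `source`, in ascending id order.
    pub fn targets(&self, source: u64) -> Vec<u64> {
        self.forward
            .get(&source)
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default()
    }

    /// Inverse traversal: all sources pointing at `target`, in ascending id order.
    pub fn sources(&self, target: u64) -> Vec<u64> {
        self.inverse
            .get(&target)
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

Since callers can only reach the maps through `insert` and `remove`, there is no code path that updates one direction and forgets the other.
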
### Entry Point

Single `step(event) -> Result<(), StepError>` that:
1. Validates the event against current state
2. Applies state transitions
3. Updates all affected indexes
4. Returns success or rolls back

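
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the framework's API: the `Event` variants, the `StepError` variants, and the three-state `WantState` are invented for the example, rollback is done by the simplest means available (apply to a clone, swap on success), and index maintenance is left out to keep the sketch short.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WantState { New, Building, Successful }

// Hypothetical events for a single entity type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    WantCreated { id: u64 },
    BuildStarted { id: u64 },
    BuildSucceeded { id: u64 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum StepError {
    UnknownWant(u64),
    AlreadyExists(u64),
    InvalidTransition { id: u64, from: WantState },
}

#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct State {
    wants: BTreeMap<u64, WantState>,
    log: Vec<Event>, // the event log, appended to only on success
}

impl State {
    /// Applies one event atomically: on error, `self` is unchanged.
    pub fn step(&mut self, event: Event) -> Result<(), StepError> {
        let mut next = self.clone(); // work on a copy
        next.apply(&event)?;         // validate and transition
        next.log.push(event);        // record the accepted event
        *self = next;                // swap in the new state
        Ok(())
    }

    fn apply(&mut self, event: &Event) -> Result<(), StepError> {
        match *event {
            Event::WantCreated { id } => {
                if self.wants.contains_key(&id) {
                    return Err(StepError::AlreadyExists(id));
                }
                self.wants.insert(id, WantState::New);
            }
            Event::BuildStarted { id } => {
                self.transition(id, WantState::New, WantState::Building)?
            }
            Event::BuildSucceeded { id } => {
                self.transition(id, WantState::Building, WantState::Successful)?
            }
        }
        Ok(())
    }

    fn transition(&mut self, id: u64, from: WantState, to: WantState) -> Result<(), StepError> {
        let current = self.wants.get_mut(&id).ok_or(StepError::UnknownWant(id))?;
        if *current != from {
            return Err(StepError::InvalidTransition { id, from: *current });
        }
        *current = to;
        Ok(())
    }

    pub fn want(&self, id: u64) -> Option<WantState> {
        self.wants.get(&id).copied()
    }

    pub fn log_len(&self) -> usize {
        self.log.len()
    }
}
```

Replay falls out of the same entry point: feeding the logged events through `step` on an empty `State` reconstructs the current one.
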
## Transactionality

### Beyond Fail-Fast

Instead of panicking on consistency violations, support transactional semantics:

```rust
// Hypothetical API, shown as a fragment

// Infallible (panics on error)
state.step(event);

// Fallible (returns Err and leaves state unchanged on failure)
let result: Result<(), StepError> = state.try_step(event);

// Explicit transaction (for multi-event atomicity)
let mut txn = state.begin();
txn.apply(event1)?;
txn.apply(event2)?;
txn.commit(); // dropping `txn` without committing rolls back
```

### What This Enables

1. **Local atomicity**: A single event either fully applies or doesn't - no partial states

2. **Distributed coordination**: If `step` can return `Err` instead of panicking:
   - Try to apply an event
   - If it fails, coordinate with other systems before retrying
   - Implement saga patterns, 2PC, etc.

3. **Speculative execution**: "What if I applied this event?" without committing
   - Useful for validation, dry-runs, conflict detection

4. **Optimistic concurrency**:
   - Multiple workers try to apply events concurrently
   - Conflicts detected and rolled back
   - Retry with updated state

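
Items 3 and 4 can be illustrated together with a version-checked commit. The sketch below is a deliberately tiny stand-in: `Versioned`, `speculate`, `commit`, and `CommitError` are hypothetical names, the "state" is a single running total, and everything runs on one thread. A real implementation would need a lock or an atomic compare-and-swap around `commit`.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Versioned {
    pub version: u64, // bumped on every successful commit
    pub total: i64,   // stand-in for the real application state
}

#[derive(Debug, PartialEq, Eq)]
pub enum CommitError {
    /// Someone else committed after the speculation was based.
    Conflict { expected: u64, actual: u64 },
    /// The event itself is invalid against the state it was applied to.
    Rejected,
}

impl Versioned {
    /// Speculative execution: computes the would-be next state
    /// without modifying `self`.
    pub fn speculate(&self, delta: i64) -> Result<Versioned, CommitError> {
        let total = self.total + delta;
        if total < 0 {
            return Err(CommitError::Rejected); // example invariant: total >= 0
        }
        Ok(Versioned { version: self.version + 1, total })
    }

    /// Optimistic commit: succeeds only if the state is still at the
    /// version the speculation started from.
    pub fn commit(&mut self, based_on: u64, next: Versioned) -> Result<(), CommitError> {
        if self.version != based_on {
            return Err(CommitError::Conflict { expected: based_on, actual: self.version });
        }
        *self = next;
        Ok(())
    }
}
```

A worker whose commit returns `Conflict` re-reads the state, speculates again against the new version, and retries, which is the "retry with updated state" step in the list above.
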
### Implementation Options

1. **Copy-on-write / snapshot**: Clone state, apply to clone, swap on success
   - Simple but memory-heavy for large state

2. **Command pattern / undo log**: Record inverse operations, replay backwards on rollback
   - More complex, but efficient for small changes to large state

3. **MVCC-style**: Version all entities, only "commit" versions on success
   - Most sophisticated, enables concurrent reads during transaction

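
As a rough illustration of option 2, the sketch below records an inverse operation for every mutation and replays them newest-first if the transaction is dropped without being committed. The `Store`, `Txn`, and `Undo` types are hypothetical, and a key-value map stands in for the real entity and edge state.

```rust
use std::collections::BTreeMap;

/// Inverse operations recorded while a transaction is open.
enum Undo {
    Remove(u64),          // undoes the insert of a new key
    Restore(u64, String), // undoes an overwrite or a delete
}

pub struct Store {
    values: BTreeMap<u64, String>,
}

/// Mutates the store in place and remembers how to undo each change.
pub struct Txn<'a> {
    store: &'a mut Store,
    undo: Vec<Undo>,
    committed: bool,
}

impl Store {
    pub fn new() -> Self {
        Store { values: BTreeMap::new() }
    }

    pub fn begin(&mut self) -> Txn<'_> {
        Txn { store: self, undo: Vec::new(), committed: false }
    }

    pub fn get(&self, key: u64) -> Option<&str> {
        self.values.get(&key).map(String::as_str)
    }
}

impl Txn<'_> {
    pub fn set(&mut self, key: u64, value: &str) {
        match self.store.values.insert(key, value.to_string()) {
            Some(old) => self.undo.push(Undo::Restore(key, old)),
            None => self.undo.push(Undo::Remove(key)),
        }
    }

    pub fn delete(&mut self, key: u64) {
        if let Some(old) = self.store.values.remove(&key) {
            self.undo.push(Undo::Restore(key, old));
        }
    }

    /// Keeps the changes; the undo log is discarded when `self` drops.
    pub fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for Txn<'_> {
    fn drop(&mut self) {
        if self.committed {
            return;
        }
        // Rollback: replay the inverse operations newest-first.
        while let Some(op) = self.undo.pop() {
            match op {
                Undo::Remove(key) => {
                    self.store.values.remove(&key);
                }
                Undo::Restore(key, old) => {
                    self.store.values.insert(key, old);
                }
            }
        }
    }
}
```

The cost of a rollback is proportional to the number of changes in the transaction rather than to the size of the state, which is the trade-off against the snapshot approach.
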
## Relationship to Datomic

[Datomic](https://docs.datomic.com/datomic-overview.html) is a distributed database built on similar principles that validates many of these ideas in production:

### Shared Concepts

| Concept | Datomic | This Framework |
|---------|---------|----------------|
| Immutable facts | Datoms (E-A-V-T tuples) | BEL events |
| Time travel | `as-of` queries | Event replay |
| Speculative execution | [`d/with`](https://docs.datomic.com/transactions/transaction-processing.html) | `try_step()` / transactions |
| Atomic commits | `d/transact` = `d/with` + durable swap | `step()` = validate + apply + persist |
| Transaction-time validation | [Transaction functions](https://docs.datomic.com/transactions/transaction-functions.html) with `db-before` | Transition guards |
| Post-transaction validation | [Entity specs](https://docs.datomic.com/transactions/model.html) with `db-after` | Invariant checks |
| Single writer | Transactor serializes all writes | Single `step()` entry point |
| Horizontal read scaling | Peers cache and query locally | Immutable state snapshots |

### Datomic's Speculative Writes

Datomic's `d/with` is particularly relevant - it's a [pure function](https://vvvvalvalval.github.io/posts/2018-11-12-datomic-event-sourcing-without-the-hassle.html) that takes a database value and proposed facts, returning a new database value *without persisting*. This enables:

- Testing transactions without mutation
- Composing transaction data before committing
- [Enforcing invariants](https://stackoverflow.com/questions/48268887/how-to-prevent-transactions-from-violating-application-invariants-in-datomic) by speculatively applying, checking, then committing or aborting
- Development against production data safely (via libraries like Datomock)

### What Datomic Doesn't Provide

- **CPN state machine semantics**: Typed transitions between entity states
- **Compile-time transition validity**: Invalid transitions caught by the type system
- **Auto-generated bidirectional indexes**: Declared edges automatically traversable both ways
- **Rust**: Memory safety, zero-cost abstractions, embeddable

The vision here is essentially: *Datomic's transaction model + CPN state machines + Rust compile-time safety*

## Open Questions

- How to express transition guards (conditions beyond "in state X")?
- How to handle edges to entities that don't exist yet (forward references)?
- Serialization format for the event log?
- How much CPN formalism to expose vs. hide?
- What's the right granularity for "places" in the CPN model?
- How does this interact with async/distributed systems?

## Potential Names

Something evoking: event-sourced + graph + state machines + Rust

- `petri-graph`
- `ironweave` (iron = Rust, weave = connected graph)
- `factforge`
- `datumflow`

## Prior Art to Investigate

- Datomic (Clojure, distributed immutable database)
- Bevy ECS (Rust, entity-component-system with events)
- CPN Tools (Petri net modeling/simulation)
- Diesel / SeaORM (Rust, compile-time SQL checking)
- EventStoreDB (event sourcing infrastructure)

## Next Steps

This document captures the "why" and "how" at a conceptual level. To validate:

1. Prototype the macro/DSL syntax for a simple 2-3 entity system
2. Implement auto-indexed bidirectional edges
3. Implement typestate transitions
4. Add speculative execution (`try_step`)
5. Benchmark against hand-written equivalent
6. Evaluate ergonomics in real use (databuild as first consumer)