Build, Compile, Verify Pipeline¶

This guide explains the end-to-end pipeline for building a GDS specification, compiling it into an intermediate representation, verifying its structural and semantic properties, and extracting the canonical mathematical decomposition.

Overview¶

A GDS model moves through six stages from domain definitions to verified formal structure:

flowchart LR
    A["<b>Define</b><br/>types, spaces,<br/>entities"] --> B["<b>Build</b><br/>blocks with<br/>interfaces"]
    B --> C["<b>Compose</b><br/>operators:<br/>&gt;&gt; | .feedback()<br/>.loop()"]
    C --> D["<b>Register</b><br/>GDSSpec<br/>registry"]
    D --> E["<b>Compile</b><br/>SystemIR"]
    E --> F["<b>Verify</b><br/>generic +<br/>semantic checks"]
    D --> G["<b>Canonical</b><br/>h = f &#8728; g<br/>projection"]

There are two independent paths after registration:

Compile path: GDSSpec (or raw Block tree) --> compile_system() --> SystemIR --> verify() -- validates structural topology
Semantic path: GDSSpec --> semantic checks (SC-001..SC-007) + project_canonical() -- validates domain properties and extracts the formal decomposition

These paths are independent. You can compile without a GDSSpec, and you can run semantic checks without compiling. Most real workflows use both.

Step 1: Define Your Domain¶

Every GDS model starts with domain definitions: what types of data exist, what communication channels carry them, and what stateful entities persist across temporal boundaries.

from gds import typedef, space, entity, state_var

# Types -- runtime-validated data types
Temperature = typedef("Temperature", float, units="K")
Command = typedef("Command", float)
Energy = typedef("Energy", float, constraint=lambda x: x >= 0)

# Spaces -- typed communication channels (transient within an evaluation)
sensor_space = space("SensorSpace", measured_temp=Temperature)
command_space = space("CommandSpace", heater_command=Command)

# Entities -- stateful objects that persist across temporal boundaries (the state space X)
room = entity("Room",
    temperature=state_var(Temperature, symbol="T"),
    energy_consumed=state_var(Energy, symbol="E"),
)

These objects are plain Pydantic models. They do not reference each other and have no behavior. They become meaningful only when registered into a GDSSpec.

What `GDSSpec.collect()` does¶

GDSSpec is a mutable registry, not a validator. It stores objects by type and enforces name uniqueness. The collect() method type-dispatches each argument to the appropriate register_*() call:

from gds import GDSSpec

spec = GDSSpec(name="Thermostat")
spec.collect(
    Temperature, Command, Energy,     # TypeDefs
    sensor_space, command_space,       # Spaces
    room,                              # Entities
)

This is equivalent to calling register_type(), register_space(), and register_entity() individually. collect() accepts TypeDef, Space, Entity, Block, and ParameterDef instances. SpecWiring must be registered explicitly via register_wiring().

Step 2: Build Blocks¶

Blocks are the computational units of a GDS model. Each block has a role (what kind of computation it represents) and an interface (what ports it exposes for composition).

from gds import BoundaryAction, Policy, Mechanism, interface

# BoundaryAction -- exogenous input (no forward_in ports)
sensor = BoundaryAction(
    name="Temperature Sensor",
    interface=interface(forward_out=["Measured Temperature"]),
)

# Policy -- decision / observation logic
controller = Policy(
    name="PID Controller",
    interface=interface(
        forward_in=["Measured Temperature"],
        forward_out=["Heater Command"],
    ),
    params_used=["Kp", "Ki"],
)

# Mechanism -- state update (no backward ports)
heater = Mechanism(
    name="Heater",
    interface=interface(
        forward_in=["Heater Command"],
        forward_out=["Room Temperature"],
    ),
    updates=[("Room", "temperature"), ("Room", "energy_consumed")],
)

Four roles exist. Each enforces constraints at construction time:

Role	Purpose	Constraint
`BoundaryAction`	Exogenous input	`forward_in` must be empty
`Policy`	Decision / observation logic	No special constraints
`Mechanism`	State update	`backward_in` and `backward_out` must be empty
`ControlAction`	Reserved	No special constraints

Register blocks into the spec:

spec.collect(sensor, controller, heater)

Step 3: Compose¶

Compose blocks into a tree using four operators:

Operator	Syntax	Meaning
Sequential	`a >> b`	Output of `a` feeds input of `b`
Parallel	`a \\| b`	Side-by-side, no shared wires
Feedback	`a.feedback(wiring)`	Backward flow within an evaluation
Temporal	`a.loop(wiring)`	Forward flow across temporal boundaries

# Build the composition tree
root = sensor >> controller >> heater

The >> operator auto-wires ports by token overlap: port names are tokenized (split on + and ,, then lowercased), and ports with overlapping tokens are connected automatically. For example, "Temperature + Setpoint" auto-wires to a block with "Temperature" because they share the token "temperature".

When auto-wiring is insufficient, use explicit Wiring declarations in a StackComposition. See the Blocks & Roles guide for details.

Step 4: Compile to SystemIR¶

compile_system() transforms a Block composition tree into a flat SystemIR -- the intermediate representation used for verification and visualization. The compiler runs three stages:

flowchart TB
    subgraph "compile_system(name, root)"
        direction TB
        S1["<b>Stage 1: Flatten</b><br/>DFS walk extracts all<br/>AtomicBlock leaves"]
        S2["<b>Stage 2: Wire</b><br/>Extract explicit wirings +<br/>auto-wire stack compositions"]
        S3["<b>Stage 3: Hierarchy</b><br/>Build composition tree<br/>for visualization"]
        S1 --> IR1["list[BlockIR]"]
        S2 --> IR2["list[WiringIR]"]
        S3 --> IR3["HierarchyNodeIR"]
    end
    IR1 --> SIR["SystemIR"]
    IR2 --> SIR
    IR3 --> SIR

from gds import compile_system

system_ir = compile_system("Thermostat", root)

What each stage does¶

Stage 1 -- Flatten (flatten_blocks): Walks the composition tree depth-first and extracts every AtomicBlock leaf. Each leaf is passed through a block_compiler callback that produces a BlockIR with the block's name and interface signature.

Stage 2 -- Wire (extract_wirings): Walks the tree again, collecting all connections between blocks. For StackComposition nodes, it emits either the explicit Wiring declarations or auto-wires by token overlap. For FeedbackLoop and TemporalLoop nodes, it emits the feedback/temporal wirings with appropriate flags. Each connection is tagged with its origin (AUTO, EXPLICIT, FEEDBACK, TEMPORAL).

Stage 3 -- Hierarchy (extract_hierarchy): Builds a HierarchyNodeIR tree that mirrors the composition structure. Binary sequential/parallel chains are flattened into n-ary groups for cleaner visualization.

SystemIR structure¶

The output SystemIR is a flat, serializable model:

class SystemIR(BaseModel):
    name: str
    blocks: list[BlockIR]                    # All atomic blocks
    wirings: list[WiringIR]                  # All connections
    inputs: list[InputIR]                    # External inputs (domain-supplied)
    composition_type: CompositionType
    hierarchy: HierarchyNodeIR | None = None # Composition tree for visualization
    source: str = ""                         # Origin identifier
    metadata: dict[str, Any] = {}            # Domain-specific metadata
    parameter_schema: ParameterSchema = ...  # Parameter space (Theta)

SystemIR has no knowledge of types, spaces, entities, or roles. It is a purely structural graph of blocks and wires. This is by design -- it separates the compilation concern (structural topology) from the specification concern (domain semantics).

Domain-specific compilation¶

Domain DSL packages (stockflow, control, games) provide their own block_compiler and wiring_emitter callbacks to compile_system(). These produce BlockIR with domain-specific block_type and metadata fields. The three stages are also available as standalone functions (flatten_blocks, extract_wirings, extract_hierarchy) for custom compiler pipelines.

Step 5: Verify¶

Verification is where GDS catches specification errors. There are two independent verification paths that operate on different objects and catch different classes of problems.

flowchart TB
    subgraph "Generic Checks (Layer 0)"
        direction TB
        SIR["SystemIR"] --> V1["verify(system_ir)"]
        V1 --> G["G-001..G-006<br/>Structural topology"]
    end

    subgraph "Semantic Checks (Layer 1)"
        direction TB
        SPEC["GDSSpec"] --> SC1["check_completeness(spec)"]
        SPEC --> SC2["check_determinism(spec)"]
        SPEC --> SC3["check_reachability(spec, a, b)"]
        SPEC --> SC4["check_type_safety(spec)"]
        SPEC --> SC5["check_parameter_references(spec)"]
        SPEC --> SC6["check_canonical_wellformedness(spec)"]
    end

    G --> R1["VerificationReport"]
    SC1 --> F["list[Finding]"]
    SC2 --> F
    SC3 --> F
    SC4 --> F
    SC5 --> F
    SC6 --> F

Path A: Generic checks on SystemIR¶

Generic checks validate the structural topology of a compiled system. They know nothing about types, spaces, or domain semantics -- only blocks, wirings, and their signatures.

from gds import verify

report = verify(system_ir)

print(f"{report.checks_passed}/{report.checks_total} checks passed")
print(f"Errors: {report.errors}, Warnings: {report.warnings}")

for finding in report.findings:
    if not finding.passed:
        print(f"  [{finding.severity}] {finding.check_id}: {finding.message}")

The six generic checks:

Check	Name	What It Validates
G-001	Domain/codomain matching	Covariant wiring labels are token-subsets of source output or target input
G-002	Signature completeness	Every block has at least one input and one output slot
G-003	Direction consistency	No COVARIANT+feedback or CONTRAVARIANT+temporal contradictions; contravariant port-slot matching
G-004	Dangling wirings	All wiring endpoints reference blocks or inputs that exist in the system
G-005	Sequential type compatibility	Stack wiring labels are token-subsets of BOTH source output AND target input
G-006	Covariant acyclicity	The covariant (within-evaluation) flow graph is a DAG

You can also run a subset of checks:

from gds.verification.generic_checks import (
    check_g001_domain_codomain_matching,
    check_g006_covariant_acyclicity,
)

report = verify(system_ir, checks=[
    check_g001_domain_codomain_matching,
    check_g006_covariant_acyclicity,
])

Path B: Semantic checks on GDSSpec¶

Semantic checks validate domain properties at the specification level. They operate on GDSSpec (not SystemIR) because they need access to entities, types, spaces, parameters, and block roles -- information that is lost during compilation.

from gds.verification.spec_checks import (
    check_completeness,
    check_determinism,
    check_type_safety,
    check_parameter_references,
    check_canonical_wellformedness,
)

# Each returns list[Finding]
findings = check_completeness(spec)
findings += check_determinism(spec)
findings += check_type_safety(spec)
findings += check_parameter_references(spec)
findings += check_canonical_wellformedness(spec)

for f in findings:
    if not f.passed:
        print(f"  [{f.severity}] {f.check_id}: {f.message}")

The semantic checks:

Check	Name	What It Validates
SC-001	Completeness	Every entity variable is updated by at least one Mechanism
SC-002	Determinism	No entity variable is updated by multiple Mechanisms in the same wiring
SC-003	Reachability	Signal can propagate from one block to another through wires
SC-004	Type safety	Wire space references resolve to registered spaces
SC-005	Parameter references	All `params_used` entries on blocks match registered parameter definitions
SC-006	Canonical wellformedness	At least one Mechanism exists (f is non-empty)
SC-007	Canonical wellformedness	State space X is non-empty (entities with variables exist)

Why two verification paths?

Generic checks and semantic checks answer fundamentally different questions:

Generic checks ask: "Is this composition graph structurally sound?" They work on any Block tree, with or without a GDSSpec.
Semantic checks ask: "Does this specification make sense as a dynamical system?" They require the full domain context that only GDSSpec provides.

A model can pass all generic checks (structurally valid graph) but fail semantic checks (e.g., orphan state variables that no Mechanism updates). Conversely, a specification can pass all semantic checks but produce a SystemIR with structural problems.

Findings and reports¶

Both paths produce Finding objects:

class Finding(BaseModel):
    check_id: str              # e.g., "G-001" or "SC-002"
    severity: Severity         # ERROR, WARNING, or INFO
    message: str               # Human-readable description
    source_elements: list[str] # Block/variable names involved
    passed: bool               # True if the check passed

verify() aggregates findings into a VerificationReport with convenience properties:

class VerificationReport(BaseModel):
    system_name: str
    findings: list[Finding]

    @property
    def errors(self) -> int: ...      # Failed findings with ERROR severity
    @property
    def warnings(self) -> int: ...    # Failed findings with WARNING severity
    @property
    def checks_passed(self) -> int: ...
    @property
    def checks_total(self) -> int: ...

Custom checks¶

You can register custom verification checks with the @gds_check decorator:

from gds import gds_check, all_checks, Finding, Severity, SystemIR

@gds_check("CUSTOM-001", Severity.WARNING)
def check_no_orphan_blocks(system: SystemIR) -> list[Finding]:
    """Flag blocks with no wirings."""
    wired = {w.source for w in system.wirings} | {w.target for w in system.wirings}
    return [
        Finding(
            check_id="CUSTOM-001",
            severity=Severity.WARNING,
            message=f"Block '{b.name}' has no wirings",
            source_elements=[b.name],
            passed=False,
        )
        for b in system.blocks if b.name not in wired
    ]

# Run built-in + custom checks together
report = verify(system_ir, checks=all_checks())

Step 6: Canonical Projection¶

project_canonical() is a pure function that derives the formal GDS mathematical decomposition from a GDSSpec:

$$h_\theta : X \to X \quad \text{where} \quad h = f \circ g, \quad \theta \in \Theta$$

from gds import project_canonical

canonical = project_canonical(spec)

print(canonical.formula())
# "h_θ : X → X  (h = f_θ ∘ g_θ, θ ∈ Θ)"  (with parameters)
# "h : X → X  (h = f ∘ g)"                  (without parameters)

# State space X: entity variables
print(f"|X| = {len(canonical.state_variables)}")
for entity_name, var_name in canonical.state_variables:
    print(f"  {entity_name}.{var_name}")

# Decomposition
print(f"|f| = {len(canonical.mechanism_blocks)} mechanisms")
print(f"|g| = {len(canonical.policy_blocks)} policies")
print(f"|U| = {len(canonical.input_ports)} input ports")
print(f"|D| = {len(canonical.decision_ports)} decision ports")

The canonical projection classifies every registered block by its role:

Role	Maps to	Canonical component
`BoundaryAction`	Exogenous signal space Z	Exogenous inputs
`Policy`	Decision space D	g: X x Z --> D
`Mechanism`	State transition	f: X x D --> X
`ControlAction`	Output space Y	Output observable y = C(x, d)

project_canonical() operates on GDSSpec, not SystemIR, because it needs role information and entity definitions. It is deterministic, stateless, and never mutates the spec. The CanonicalGDS result is frozen (immutable).

When to use canonical projection vs verification

Use verification to check if a model is well-formed (catches errors).
Use canonical projection to understand what a model means mathematically (extracts structure).
SC-006 and SC-007 bridge both: they run project_canonical() internally to check that the canonical form is well-formed.

Two Type Systems¶

GDS has two coexisting type systems that serve different purposes at different stages of the pipeline:

Token-based types (compile-time)¶

Used during composition and wiring. Port names are automatically tokenized -- split on spaces, lowercased into a frozenset. The >> operator and auto-wiring use token overlap to match ports.

# These ports auto-wire because they share the token "temperature"
#   "Measured Temperature" -> {"measured", "temperature"}
#   "Room Temperature"     -> {"room", "temperature"}
#   Overlap: {"temperature"} -- match!

Token matching is used by the compiler (Stage 2) and by generic checks G-001 and G-005.

TypeDef-based types (runtime)¶

Used for data validation at the value level. TypeDef wraps a Python type with an optional constraint predicate. Used by Space and Entity to validate actual data values.

Temperature = typedef("Temperature", float, units="K")
Energy = typedef("Energy", float, constraint=lambda x: x >= 0)

TypeDefs are never invoked during compilation. They exist for runtime validation when actual data flows through the system (e.g., during simulation, which is outside the current scope of GDS).

These type systems do not interact

Token-based matching operates on port name strings. TypeDef validation operates on Python values. A port named "Temperature" will auto-wire to any port whose name contains the token "temperature", regardless of what TypeDef (if any) is associated with the data flowing through it.

Complete Example¶

Putting it all together:

from gds import (
    GDSSpec, BoundaryAction, Policy, Mechanism,
    compile_system, verify, project_canonical,
    typedef, space, entity, state_var, interface,
)
from gds.verification.spec_checks import (
    check_completeness, check_determinism,
    check_canonical_wellformedness,
)

# --- Step 1: Define domain ---
Temperature = typedef("Temperature", float, units="K")
Command = typedef("Command", float)

room = entity("Room", temperature=state_var(Temperature, symbol="T"))

# --- Step 2: Build blocks ---
sensor = BoundaryAction(
    name="Sensor",
    interface=interface(forward_out=["Temperature"]),
)
controller = Policy(
    name="Controller",
    interface=interface(
        forward_in=["Temperature"],
        forward_out=["Command"],
    ),
)
heater = Mechanism(
    name="Heater",
    interface=interface(
        forward_in=["Command"],
        forward_out=["Temperature"],
    ),
    updates=[("Room", "temperature")],
)

# --- Step 3: Compose ---
root = sensor >> controller >> heater

# --- Step 4: Register ---
spec = GDSSpec(name="Thermostat")
spec.collect(Temperature, Command, room, sensor, controller, heater)

# --- Step 5a: Compile + generic verification ---
system_ir = compile_system("Thermostat", root)
report = verify(system_ir)
print(f"Generic: {report.checks_passed}/{report.checks_total} passed")

# --- Step 5b: Semantic verification ---
findings = check_completeness(spec)
findings += check_determinism(spec)
findings += check_canonical_wellformedness(spec)
for f in findings:
    status = "PASS" if f.passed else "FAIL"
    print(f"  [{status}] {f.check_id}: {f.message}")

# --- Step 6: Canonical projection ---
canonical = project_canonical(spec)
print(f"\n{canonical.formula()}")
print(f"|X| = {len(canonical.state_variables)}, |f| = {len(canonical.mechanism_blocks)}")