Semantic Web Integration: What We Learned
A team summary of GDS + OWL/SHACL/SPARQL integration via gds-owl.
The Short Version
We can export 85% of a GDS specification to Turtle/RDF files and import it back losslessly. The 15% we lose is Python callables (transition functions, constraint predicates, distance functions). This is a mathematical certainty, not a gap we can close.
What Gets Exported (R1 -- Fully Representable)
Everything structural round-trips perfectly through Turtle:
| GDS Concept | RDF Representation | Validated By |
|---|---|---|
| Block names, roles, interfaces | OWL classes + properties | SHACL shapes |
| Port names and type tokens | Literals on Port nodes | SHACL datatype |
| Wiring topology (who connects to whom) | Wire nodes with source/target | SHACL cardinality |
| Entity/StateVariable declarations | Entity + StateVariable nodes | SHACL |
| TypeDef (name, python_type, units) | TypeDef node + properties | SHACL |
| Space fields | SpaceField blank nodes | SHACL |
| Parameter schema (names, types, bounds) | ParameterDef nodes | SHACL |
| Mechanism update targets (what writes where) | UpdateMapEntry nodes | SHACL |
| Admissibility dependencies (what reads what) | AdmissibilityDep nodes | SHACL |
| Transition read dependencies | TransitionReadEntry nodes | SHACL |
| State metric variable declarations | MetricVariableEntry nodes | SHACL |
| Canonical decomposition (h = f . g) | CanonicalGDS node | SHACL |
| Verification findings | Finding nodes | SHACL |
13 SHACL shapes enforce structural correctness on the RDF graph. 7 SPARQL query templates enable cross-node analysis (blocks by role, dependency paths, entity update maps, parameter impact, verification summaries).
What Requires SPARQL (R2 -- Structurally Representable)
Some properties can't be checked by SHACL alone (which validates individual nodes) but CAN be checked by SPARQL queries over the full graph:
| Property | SPARQL Feature | Why SHACL Can't |
|---|---|---|
| Acyclicity (G-006) | Transitive closure (`p+`) | No path traversal in SHACL-core |
| Completeness (SC-001) | `FILTER NOT EXISTS` | No "for all X, exists Y" |
| Determinism (SC-002) | `GROUP BY` + `HAVING` | No cross-node aggregation |
| Dangling wirings (G-004) | `FILTER NOT EXISTS` | Name existence, not class membership |
These all terminate (SPARQL over finite graphs always does) and are decidable.
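To make the table concrete, here is a plain-Python sketch of what two of these graph-wide checks compute. It is not the SPARQL used by gds-owl: the edge data and function names are hypothetical, and reading SC-002 as "no variable is written by more than one mechanism" is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical wiring edges (source block, target block); not from a real spec.
edges = {("sensor", "controller"), ("controller", "plant")}


def transitive_closure(pairs):
    """All (a, b) where b is reachable from a in one or more steps (like p+)."""
    closure = set(pairs)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            # Nothing new was found, so the closure is complete.
            return closure
        closure |= new_pairs


def has_cycle(pairs):
    """Acyclicity: a cycle exists exactly when some node reaches itself."""
    return any(a == b for (a, b) in transitive_closure(pairs))


def multiply_updated(updates):
    """Grouping check (like GROUP BY + HAVING count > 1) over (mechanism, variable)."""
    writers = defaultdict(set)
    for mechanism, variable in updates:
        writers[variable].add(mechanism)
    return {variable for variable, ms in writers.items() if len(ms) > 1}
```

Both functions need the whole edge set at once, which is why a per-node validator cannot express them. The closure loop stops because a finite graph has only finitely many node pairs.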
What Cannot Be Exported (R3 -- Not Representable)
These are fundamentally non-exportable. Not a tooling gap -- a mathematical impossibility (Rice's theorem for callables, computational class separation for string processing):
| GDS Concept | Why R3 | What Happens on Export |
|---|---|---|
| `TypeDef.constraint` (e.g. `lambda x: x >= 0`) | Arbitrary Python callable | Exported as boolean flag `hasConstraint`; imported as `None` |
| `f_behav` (transition functions) | Arbitrary computation | Not stored in `GDSSpec` -- user responsibility |
| `AdmissibleInputConstraint.constraint` | Arbitrary callable | Exported as boolean flag; imported as `None` |
| `StateMetric.distance` | Arbitrary callable | Exported as boolean flag; imported as `None` |
| Auto-wiring token computation | Multi-pass string processing | Results exported (WiringIR edges); process is not |
| Construction validation | Python `@model_validator` logic | Structural result preserved; validation logic is not |
Key insight: The results of R3 computation are always R1. Auto-wiring produces WiringIR edges (R1). Validation produces pass/fail (R1). Only the process is lost.
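The flag-then-`None` behavior described for callables can be sketched as follows. `TypeDefSketch`, `export_typedef`, and `import_typedef` are hypothetical stand-ins for illustration, not the gds-owl API, and a plain dict stands in for the Turtle serialization.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TypeDefSketch:
    """Hypothetical stand-in for a TypeDef; not the real GDS class."""
    name: str
    python_type: str
    constraint: Optional[Callable[[object], bool]] = None


def export_typedef(td):
    # Only structural facts survive; the callable is reduced to a boolean flag.
    return {
        "name": td.name,
        "python_type": td.python_type,
        "hasConstraint": td.constraint is not None,
    }


def import_typedef(record):
    # The predicate cannot be rebuilt from the flag, so it comes back as None.
    return TypeDefSketch(record["name"], record["python_type"], constraint=None)


prob = TypeDefSketch("Probability", "float", constraint=lambda x: 0 <= x <= 1)
restored = import_typedef(export_typedef(prob))
```

After the round trip, `restored` keeps its name and type, and the exported record still notes that a constraint existed, but the predicate itself has to be re-attached in Python.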
The Boundary in One Sentence
You can represent everything about a system except what its programs actually do. The canonical decomposition `h = f . g` makes this boundary explicit: `g` (topology) and `f_struct` (update targets) are fully representable; `f_behav` (how state actually changes) is not.
Practical Implications
What You Can Do With the Turtle Export
- Share specs between tools -- any RDF-aware tool (Protege, GraphDB, Neo4j via neosemantics) can import a GDS spec
- Validate specs without Python -- SHACL processors (TopBraid, pySHACL) can check structural correctness
- Query specs with SPARQL -- find all mechanisms that update a given entity, trace dependency paths, check acyclicity
- Version and diff specs -- Turtle is text, diffs are meaningful
- Cross-ecosystem interop -- other OWL ontologies can reference GDS classes/properties
What You Cannot Do
- Run simulations from Turtle -- you need the Python callables back
- Verify behavioral properties -- "does this mechanism converge?" requires executing `f_behav`
- Reproduce auto-wiring -- the token overlap computation can't run in SPARQL
Round-Trip Fidelity
Tested with property-based testing (Hypothesis): 100 random GDSSpecs generated, exported to Turtle, parsed back, reimported. All structural fields survive. Known lossy fields:
- `TypeDef.constraint` -> `None`
- `TypeDef.python_type` -> falls back to `str` for non-builtin types
- `AdmissibleInputConstraint.constraint` -> `None`
- `StateMetric.distance` -> `None`
- Port/wire ordering -> set-based (RDF is unordered)
- Blank node identity -> content-based comparison, not node ID
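Because port and wire ordering does not survive the trip through RDF, a round-trip comparison has to ignore order. A minimal sketch of such a comparison follows; the dict layout and the `canonical` helper are hypothetical and do not reflect how the gds-owl tests are written.

```python
def canonical(spec):
    """Normalize a spec-like dict so that port and wire order does not matter."""
    return {
        "ports": frozenset(spec["ports"]),
        "wires": frozenset(tuple(wire) for wire in spec["wires"]),
    }


# Hypothetical spec before export and after reimport, with the order shuffled.
before = {"ports": ["in", "out"], "wires": [("a", "b"), ("b", "c")]}
after = {"ports": ["out", "in"], "wires": [("b", "c"), ("a", "b")]}

same_structure = canonical(before) == canonical(after)
```

Comparing the canonical forms treats the two specs as equal even though their lists differ element by element, which matches the content-based comparison described for blank nodes.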
Numbers
| Metric | Count |
|---|---|
| R1 concepts (fully representable) | 13 |
| R2 concepts (SPARQL-needed) | 3 |
| R3 concepts (not representable) | 7 |
| SHACL shapes | 18 |
| SPARQL templates | 7 |
| Verification checks expressible in SHACL | 6 of 15 |
| Verification checks expressible in SPARQL | 6 more |
| Checks requiring Python | 2 of 15 |
| Round-trip PBT tests | 26 |
| Random specs tested | ~2,600 |
Paper Alignment
The structural/behavioral split is a framework design choice, not a
paper requirement. The GDS paper (Zargham & Shorish 2022) defines
U: X -> P(U) as a single map; we split it into U_struct (dependency
graph, R1) and U_behav (constraint predicate, R3) for ontological
engineering. Same for StateMetric and TransitionSignature. The
canonical decomposition h = f . g IS faithful to the paper.
Open Question: Promoting Common Constraints to R2
Zargham's feedback: "We can probably classify them as two different kinds of predicates -- those associated with the model structure (owl/shacl/sparql) and those associated with the runtime."
Currently all TypeDef.constraint callables are treated as R3 (lossy).
But many common constraints ARE expressible in SHACL:
- `lambda x: x >= 0` --> `sh:minInclusive 0`
- `lambda x: 0 <= x <= 1` --> `sh:minInclusive 0` + `sh:maxInclusive 1`
- `lambda x: x in {-1, 0, 1}` --> `sh:in (-1 0 1)`
A constraint classifier could promote these from R3 to R2, making them round-trippable through Turtle. The general case (arbitrary callable) remains R3. See #152 for the design proposal.
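One possible shape for such a classifier is sketched below, using the standard-library `ast` module. It assumes the lambda's source text is available as a string; `classify_constraint` is a hypothetical name, and the design proposed in #152 may differ. Only the three patterns listed above are recognized, and everything else returns `None` (stays R3).

```python
import ast


def _number(node):
    """Return the numeric literal held by an AST node, or None if it is not one."""
    try:
        value = ast.literal_eval(node)
    except (ValueError, TypeError):
        return None
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return value


def classify_constraint(source):
    """Map a one-argument lambda (as source text) to SHACL facets, else None."""
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except SyntaxError:
        return None
    lam = tree.body
    if not isinstance(lam, ast.Lambda) or len(lam.args.args) != 1:
        return None
    arg = lam.args.args[0].arg
    body = lam.body
    if not isinstance(body, ast.Compare):
        return None
    operands = [body.left, *body.comparators]
    ops = body.ops

    def is_arg(node):
        return isinstance(node, ast.Name) and node.id == arg

    # Patterns: x >= c, x <= c, x in {c1, c2, ...}
    if len(ops) == 1 and is_arg(operands[0]):
        op, rhs = ops[0], operands[1]
        if isinstance(op, ast.GtE) and _number(rhs) is not None:
            return {"sh:minInclusive": _number(rhs)}
        if isinstance(op, ast.LtE) and _number(rhs) is not None:
            return {"sh:maxInclusive": _number(rhs)}
        if isinstance(op, ast.In) and isinstance(rhs, (ast.Set, ast.List, ast.Tuple)):
            values = [_number(element) for element in rhs.elts]
            if values and all(v is not None for v in values):
                return {"sh:in": sorted(values)}
    # Pattern: lo <= x <= hi
    if len(ops) == 2 and all(isinstance(o, ast.LtE) for o in ops) and is_arg(operands[1]):
        lo, hi = _number(operands[0]), _number(operands[2])
        if lo is not None and hi is not None:
            return {"sh:minInclusive": lo, "sh:maxInclusive": hi}
    return None  # unrecognized: treat as an arbitrary callable (R3)
```

Matching on syntax rather than behavior keeps the classifier decidable: it never runs the callable, so anything it does not recognize falls through to R3 instead of being guessed at.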
Files
- `packages/gds-owl/` -- the full export/import/SHACL/SPARQL implementation
- `docs/research/formal-representability.md` -- the 800-line formal analysis
- `docs/research/verification/r3-undecidability.md` -- proofs for the R3 boundary
- `docs/research/verification/representability-proof.md` -- R1/R2 decidability + partition independence