Cascade SpecificationΒΆ
This document specifies how DataJoint propagates restrictions across the foreign-key (FK) graph for cascade delete and cascade preview operations. It defines the propagation rules, the role of part_integrity, and the Part-to-Master upward propagation that handles renamed FKs and Part-of-Part chains.
For the user-facing entry points, see Delete Data. For dependent concepts, see Master-Part, Diagram, and Data Manipulation.
OverviewΒΆ
A cascade starts at a (possibly restricted) seed table and propagates the restriction to every table that depends on it via foreign keys, so that a delete or preview affects all dependents consistently. Cascade is invoked by:
Table.delete(part_integrity=...)β executes the cascade as a delete.Diagram.cascade(table_expr, part_integrity=...).counts()β preview, no mutation.
Both follow the same propagation rules; only the terminal step (delete vs. count) differs.
Dependency graphΒΆ
DataJoint loads its FK structure into a directed acyclic graph (Connection.dependencies, a networkx.DiGraph). The graph encodes two kinds of structure:
| Element | Encodes |
|---|---|
| Node | A table, named by its fully-qualified SQL identifier (e.g. `schema`.`table_name`). The graph stores the table's primary key set as node data. |
Edge parent β child |
An FK constraint from child to parent. Edge data: attr_map (dict mapping child's FK columns β parent's referenced columns), aliased (true iff any column was renamed), primary (true iff the FK is in the child's primary key). |
| Alias node | Inserted between a parent and child when the FK is aliased=True. Named by an integer-string ("1", "2", β¦) to distinguish from real table nodes. Both half-edges (parent β alias, alias β child) carry the same attr_map and aliased props. Treat as transparent when walking. |
The cascade engine operates on a copy of this graph (the Diagram class), recording per-table restrictions in _cascade_restrictions and the set of restricted attributes in _restriction_attrs.
Restriction propagation rulesΒΆ
When propagating a restriction across an edge parent β child, one of three rules applies. The rule depends on whether the FK renames columns (aliased) and on whether the parent's currently-restricted attributes (parent_attrs) are contained in the child's primary key (child_pk).
Forward propagation (parent β child)ΒΆ
| Rule | Trigger | Effect on child |
|---|---|---|
| F1. Copy | not aliased and parent_attrs β child_pk |
Child inherits the parent's restriction directly (same attribute names; literal restriction values copy as-is). |
| F2. Aliased rename | aliased |
Child's restriction is parent.proj(**{fk_col: parent_col for fk_col, parent_col in attr_map.items()}) β the parent expression with columns renamed to match the child's column names. |
| F3. Project | not aliased and parent_attrs β child_pk |
Child's restriction is parent.proj() β the parent projected to its primary key. |
After applying the rule, the child's restricted-attributes set is updated to track what's now constrained on it. The child becomes a propagation source for its own children in the next pass.
Upward propagation (child β parent)ΒΆ
Symmetric inverses of the forward rules. Used by part_integrity="cascade" to propagate a Part's restriction up to its Master through the FK chain. Edge metadata is the same; the direction of travel is reversed.
| Rule | Trigger | Effect on parent |
|---|---|---|
| U1. Copy | not aliased and child_attrs β parent_pk |
Parent inherits the child's restriction directly (shared attribute names). |
| U2. Aliased reverse-rename | aliased |
Parent's restriction is child.proj(**{parent_col: fk_col for fk_col, parent_col in attr_map.items()}) β the child expression with FK columns renamed back to the parent's column names. |
| U3. Project | not aliased and child_attrs β parent_pk |
Parent's restriction is child.proj() β the child projected to attributes in the parent's primary key. |
Cascade flowΒΆ
flowchart TB
A[seed table_expr] --> B[Diagram.cascade]
B --> C{seed is a Part?
part_integrity=cascade?}
C -- yes --> D[Upward walk:
_propagate_part_to_master]
C -- no --> E[Forward propagation
over descendants]
D --> E
E --> F{any new master pulled in?}
F -- yes --> E
F -- no --> G[Build subgraph,
count or delete]
The engine performs multiple passes when part_integrity="cascade" is in effect: each pass forward-propagates over all allowed_nodes, and a pass may pull in a new master (with its descendants) that requires another pass. The loop terminates because the graph is a DAG and only finitely many nodes can be added.
part_integrity modesΒΆ
Master-Part integrity reflects the contract that a Master row exists for every Part row that references it. Three modes govern how cascade enforces or relaxes that contract:
| Mode | Behavior |
|---|---|
"enforce" (default) |
If a delete would remove a Part row without removing the corresponding Master row, the entire delete is rolled back with DataJointError. The Master row is checked after the delete; the integrity violation is detected post-hoc and reversed. |
"ignore" |
No upward propagation; no post-check. Use when the Master row is intentionally preserved and the user has accepted that Part rows may be orphaned. Caller is responsible for the consequences. |
"cascade" |
Upward propagation enabled. When cascade reaches a Part, the Master is also restricted (via the upward rules below), and the Master then forward-cascades back down to all its Parts (siblings of the originating Part included). Used when the user wants the master-part group treated atomically. |
This document focuses on the "cascade" mode; the "enforce" and "ignore" modes do not change the propagation graph.
Part-to-Master upward propagationΒΆ
When part_integrity="cascade" and the cascade reaches (or starts at) a Part node, the engine triggers an upward walk of the FK graph from the Part to its Master, applying the upward rules (U1, U2, U3) at each edge.
Identifying the MasterΒΆ
The Master is identified by naming convention via dependencies.extract_master: a Part table's SQL name ends in master__partname, and the Master is the prefix before the __. The graph must contain a node with that Master name; otherwise the upward walk is skipped.
Walking the FK pathΒΆ
The walk uses nx.shortest_path(master, part) to find the FK chain from Master to Part:
Master β [intermediate Part(s)] β Part
Alias nodes ("1", "2", β¦) on the path are skipped β both half-edges carry the same attr_map and aliased props, so the engine reads props from one and reasons about the real edge. Each real edge along the walk fires one upward rule (U1, U2, or U3) per the edge's metadata. Intermediate Parts in a Part-of-Part chain are restricted along the way β not only the Master.
Materialization at the MasterΒΆ
After the upward walk completes, the Master's accumulated restrictions are materialized to a literal value tuple via (master_ft & restrictions).proj().to_arrays() and stored as a single condition. Subsequent forward propagation from the Master back down to its other Parts then generates WHERE pk IN (literal-list) rather than WHERE pk IN (SELECT ... FROM <originating-part>).
Why materialization matters. Without this step, the Master's restriction would be a QueryExpression referencing the same Part being deleted. When the Master forward-cascades back to its Parts (including the originating Part), the generated DELETE contains a subquery against the table being modified β which MySQL forbids ("error 1093: You can't specify target table 'T' for update in FROM clause"). Materializing converts the restriction into a static value set, which both backends accept.
Intermediate Parts in the chain are not materialized β they appear only as restrictions on the path, not as forward-cascade sources, so the self-reference issue doesn't arise there.
Seed-is-Part caseΒΆ
When Table.delete(part_integrity="cascade") is invoked on a Part directly (e.g. Master.PartB.delete(part_integrity="cascade")), the cascade seed is the Part itself. A leaf Part has no out-edges, so the main propagation loop β which fires the part_integrity block from inside the out_edges iteration β would not trigger.
The engine handles this case by invoking the upward walk explicitly for the seed before the main loop runs. After the walk pulls in the Master and any intermediate Parts, the main loop then forward-cascades from the Master to its remaining descendants in the usual way.
Worked examplesΒΆ
Example 1: Part-of-Part with renamed FKΒΆ
import datajoint as dj
schema = dj.Schema("imaging")
@schema
class Subject(dj.Manual):
definition = """
subject_id : int32
"""
class Session(dj.Part):
definition = """
-> master
session_id : int32
"""
class Recording(dj.Part):
definition = """
-> Subject.Session.proj(src_subject='subject_id', src_session='session_id')
recording_id : int32
"""
Recording's columns are {src_subject, src_session, recording_id} β none of them are named subject_id. The FK from Subject.Session β Subject.Recording is aliased.
When (Subject.Recording & {"recording_id": 5}).delete(part_integrity="cascade") runs:
- Seed-is-Part check.
extract_master(Recording) == Subject. Trigger the upward walk. - FK path.
shortest_path(Subject, Recording) = [Subject, Subject.Session, alias_node, Subject.Recording]β real path[Subject, Subject.Session, Subject.Recording]. - Walk reversed.
- Edge
Subject.Session β Subject.Recording:aliased=True. Apply U2 βSubject.Sessionis restricted bySubject.Recording.proj(subject_id='src_subject', session_id='src_session'). - Edge
Subject β Subject.Session:aliased=False,child_attrs={subject_id, session_id} β parent_pk={subject_id}? No (session_idnot in parent pk). Apply U3 βSubjectis restricted bySubject.Session.proj()(projected tosubject_id).
- Edge
- Materialize Master.
Subject's restriction is fetched into a value tuple; replaces the chainedQueryExpression. - Forward pass. Master forward-cascades back down to
Subject.SessionandSubject.Recording(and any sibling Parts not on the original path), now with the materialized restriction.
Without the FK walk (the pre-fix behavior), the engine joined subject_ft.proj() & recording_ft.proj() on shared attribute names. Subject has subject_id; Recording has src_subject. No shared columns β empty restriction β Master not restricted. This is the failure mode from #1429 Case 1.
Example 2: Part-of-Part with no Master reference in PartBΒΆ
@schema
class Master(dj.Manual):
definition = """
master_id : int32
"""
class PartA(dj.Part):
definition = """
-> master
part_a_id : int32
"""
class PartB(dj.Part):
definition = """
-> Master.PartA
part_b_id : int32
"""
PartB's definition references Master.PartA, not master directly. The FK chain Master β PartA β PartB still exists in the dependency graph (PartA's FK to Master is aliased=False; PartA β PartB is also aliased=False).
For (Master.PartB & {"master_id": 1}).delete(part_integrity="cascade"):
- Upward walk: PartB β PartA β Master.
- Edge PartA β PartB:
aliased=False,child_attrs β parent_pk?child_attrs = {master_id},parent_pk = {master_id, part_a_id}. Yes β. Apply U1 βPartAinheritsPartB's restriction directly. - Edge Master β PartA:
aliased=False,child_attrs = {master_id} β parent_pk = {master_id}. Apply U1 βMasterinheritsPartA's restriction.
- Edge PartA β PartB:
- Materialize Master.
- Forward cascade Master β PartA β PartB picks up all sibling rows under
master_id=1.
Without the FK walk (the pre-fix behavior), the engine jumped directly from PartB to Master via master_ft.proj() & partB_ft.proj(). PartA was never restricted, and the chain semantics were silently incorrect. This is #1429 Case 2.
Algorithmic complexityΒΆ
For a cascade subgraph with N nodes and E edges, propagation runs in at most O(N Β· E) edge visits per pass, with at most one pass per master pulled in by upward propagation. For typical schemas (small Master-Part groups, shallow chains), the cost is dominated by the materialization fetch at the Master β one SELECT ... FROM master_ft over the restricted master rows.
What is not part of this specificationΒΆ
Diagram.trace()for general upstream restriction propagation: a related but distinct feature tracked in #1423.trace()exposes upstream propagation as a first-class operator; the cascade engine's upward walk above is a closed implementation detail ofpart_integrity="cascade".- Custom propagation rules (user-defined): not supported. The three forward and three upward rules cover the cases the FK graph can produce.
- Cross-schema cascade: handled by
dependencies.load_all_downstream()called fromDiagram.cascade(); orthogonal to the propagation rules described here.
ReferencesΒΆ
- Source:
src/datajoint/diagram.pyβDiagram.cascade,_propagate_restrictions,_apply_propagation_rule,_apply_propagation_rule_upward,_propagate_part_to_master. - Source:
src/datajoint/dependencies.pyβ graph construction,extract_master, alias node insertion. - Issue: datajoint-python #1429 β bug report and motivating examples for the upward propagation rules.
- Master-Part Specification β Part-Master contract.
- Diagram Specification β graph operations on the dependency graph.
- Data Manipulation Specification β
delete()anddelete_quick()user-facing API.