Table ¶
Base Table and FreeTable classes
ValidationResult dataclass ¶
Result of a table.validate() call.
Attributes:
- is_valid: True if all rows passed validation.
- errors: List of (row_index, field_name, error_message) tuples.
- rows_checked: Number of rows that were validated.
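Based only on the attributes listed above, the shape of the result can be sketched as a plain dataclass. This is a sketch, not the library's definition; the `__bool__` behavior is inferred from the `if result:` idiom shown in the validate() example below, and the real class also provides helpers such as summary().

```python
from dataclasses import dataclass

@dataclass
class ValidationResult:
    """Sketch of the documented shape; attribute names from the docs above."""
    is_valid: bool
    errors: list  # (row_index, field_name, error_message) tuples
    rows_checked: int

    def __bool__(self) -> bool:
        # lets callers write `if result: ...`
        return self.is_valid

# hypothetical validation outcome for a two-row batch
result = ValidationResult(
    is_valid=False,
    errors=[(0, "species", "field is required")],
    rows_checked=2,
)
print(bool(result))  # → False
```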
Table ¶
Bases: QueryExpression
Table is an abstract class that represents a table in the schema. It implements insert and delete methods and inherits query functionality. To make it a concrete class, override the abstract properties specifying the connection, table name, database, and definition.
heading property ¶
heading
Return the table's heading, or raise a helpful error if not configured.
Overrides QueryExpression.heading to provide a clear error message when the table is not properly associated with an activated schema. For base tier classes (Lookup, Manual, etc.), returns None to support introspection (e.g., help()).
declare ¶
declare(context=None)
Declare the table in the schema based on self.definition.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `context` | `dict` | The context for foreign key resolution. If None, foreign keys are not allowed. | `None` |
from_clause ¶
from_clause()
Return the FROM clause of SQL SELECT statements.
Returns:

| Type | Description |
|---|---|
| `str` | The full table name for use in SQL FROM clauses. |
get_select_fields ¶
get_select_fields(select_fields=None)
Return the selected attributes from the SQL SELECT statement.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `select_fields` | `list` | List of attribute names to select. If None, selects all attributes. | `None` |

Returns:

| Type | Description |
|---|---|
| `str` | The SQL field selection string. |
parents ¶
parents(primary=None, as_objects=False, foreign_key_info=False)
Return the list of parent tables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `primary` | `bool` | If None, then all parents are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute. | `None` |
| `as_objects` | `bool` | If False, return table names. If True, return table objects. | `False` |
| `foreign_key_info` | `bool` | If True, each element in the result also includes foreign key info. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of parents as table names or table objects with (optional) foreign key information. |
children ¶
children(primary=None, as_objects=False, foreign_key_info=False)
Return the list of child tables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `primary` | `bool` | If None, then all children are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute. | `None` |
| `as_objects` | `bool` | If False, return table names. If True, return table objects. | `False` |
| `foreign_key_info` | `bool` | If True, each element in the result also includes foreign key info. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of children as table names or table objects with (optional) foreign key information. |
descendants ¶
descendants(as_objects=False)
Return list of descendant tables in topological order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), return a list of table names. If True, return a list of table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of descendant tables in topological order. |
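As an illustration of what "topological order" guarantees here (every table appears before the tables that depend on it), a minimal sketch using Kahn's algorithm over a toy dependency graph; the table names are hypothetical and this is not DataJoint's internal code:

```python
from collections import deque

# hypothetical dependency graph: table -> tables that reference it
deps = {
    "Subject": ["Session"],
    "Session": ["Recording"],
    "Recording": [],
}

def topo_order(graph):
    """Kahn's algorithm: ancestors always precede their descendants."""
    indegree = {node: 0 for node in graph}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    queue = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in graph[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order

print(topo_order(deps))  # → ['Subject', 'Session', 'Recording']
```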
ancestors ¶
ancestors(as_objects=False)
Return list of ancestor tables in topological order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), return a list of table names. If True, return a list of table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of ancestor tables in topological order. |
parts ¶
parts(as_objects=False)
Return part tables for this master table.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), the output is a list of full table names. If True, return table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of part table names or table objects. |
is_declared property ¶
is_declared
Check if the table is declared in the schema.
Returns:

| Type | Description |
|---|---|
| `bool` | True if the table is declared in the schema. |
full_table_name property ¶
full_table_name
Return the full table name in the schema.
Returns:

| Type | Description |
|---|---|
| `str` | Full table name in the format `` `schema_name`.`table_name` ``. |
update1 ¶
update1(row)
Update one existing entry in the table.
Caution: In DataJoint, the primary modes of data manipulation are inserting and
deleting entire records, since referential integrity operates at the level of records,
not fields. Updates are therefore reserved for corrective operations outside the main
workflow. Use update methods sparingly and with full awareness of the assumptions
they may violate.
The primary key attributes must always be provided.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `row` | `dict` | A dict containing the primary key values and the attributes to update. Setting an attribute value to None will reset it to the default value (if any). | required |
Examples:
>>> table.update1({'id': 1, 'value': 3}) # update value in record with id=1
>>> table.update1({'id': 1, 'value': None}) # reset value to default
validate ¶
validate(rows, *, ignore_extra_fields=False)
Validate rows without inserting them.
Validates:
- Field existence (all fields must be in the table heading)
- Row format (correct number of attributes for positional inserts)
- Codec validation (type checking via codec.validate())
- NULL constraints (non-nullable fields must have values)
- Primary key completeness (all PK fields must be present)
- UUID format and JSON serializability

Cannot validate (database-enforced):
- Foreign key constraints
- Unique constraints (other than PK)
- Custom MySQL constraints
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | `iterable` | Same format as insert() - iterable of dicts, tuples, numpy records, or a pandas DataFrame. | required |
| `ignore_extra_fields` | `bool` | If True, ignore fields not in the table heading. | `False` |

Returns:

| Type | Description |
|---|---|
| `ValidationResult` | Result with is_valid, errors list, and rows_checked count. |
Examples:
>>> result = table.validate(rows)
>>> if result:
... table.insert(rows)
... else:
... print(result.summary())
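Because errors carries (row_index, field_name, error_message) tuples, a failed batch can also be split into passing and failing rows before retrying. A sketch with hypothetical values; whether to skip or repair failed rows is up to the workflow:

```python
# suppose validate() reported these errors (hypothetical values)
errors = [(0, "species", "field is required"), (2, "dob", "invalid date")]
rows = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]

# row_index is the first element of each error tuple
bad_indices = {row_index for row_index, _field, _msg in errors}
good_rows = [row for i, row in enumerate(rows) if i not in bad_indices]
print(len(good_rows))  # → 2
```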
insert1 ¶
insert1(row, **kwargs)
Insert one data record into the table.
For kwargs, see insert().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `row` | `numpy.void`, `dict`, or sequence | A numpy record, a dict-like object, or an ordered sequence to be inserted as one row. | required |
| `**kwargs` | | Additional arguments passed to insert(). | `{}` |
See Also
insert : Insert multiple data records.
staged_insert1 property ¶
staged_insert1
Context manager for staged insert with direct object storage writes.
Use this for large objects like Zarr arrays where copying from local storage is inefficient. Allows writing directly to the destination storage before finalizing the database insert.
Example:
>>> with table.staged_insert1 as staged:
...     staged.rec['subject_id'] = 123
...     staged.rec['session_id'] = 45
...     # Create object storage directly
...     z = zarr.open(staged.store('raw_data', '.zarr'), mode='w', shape=(1000, 1000))
...     z[:] = data
...     # Assign to record
...     staged.rec['raw_data'] = z
>>> # On successful exit: metadata computed, record inserted
>>> # On exception: storage cleaned up, no record inserted
Yields:
StagedInsert: Context for setting record values and getting storage handles.
insert ¶
insert(rows, replace=False, skip_duplicates=False, ignore_extra_fields=False, allow_direct_insert=None, chunk_size=None)
Insert a collection of rows.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | iterable or `Path` | Either (a) an iterable where an element is a numpy record, a dict-like object, a pandas.DataFrame, a polars.DataFrame, a pyarrow.Table, a sequence, or a query expression with the same heading as self, or (b) a pathlib.Path object specifying a path relative to the current directory with a CSV file, the contents of which will be inserted. | required |
| `replace` | `bool` | If True, replaces the existing tuple. | `False` |
| `skip_duplicates` | `bool` | If True, silently skip duplicate inserts. | `False` |
| `ignore_extra_fields` | `bool` | If False (default), fields that are not in the heading raise an error. | `False` |
| `allow_direct_insert` | `bool` | Only applies in auto-populated tables. If False (default), insert may only be called from inside the make callback. | `None` |
| `chunk_size` | `int` | If set, insert rows in batches of this size. Useful for very large inserts to avoid memory issues. Each chunk is a separate transaction. | `None` |
Examples:
>>> Table.insert([
... dict(subject_id=7, species="mouse", date_of_birth="2014-09-01"),
... dict(subject_id=8, species="mouse", date_of_birth="2014-09-02")])
Large insert with chunking:
>>> Table.insert(large_dataset, chunk_size=10000)
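The chunk_size batching above amounts to grouping the row iterable into fixed-size batches, each committed in its own transaction. A sketch of the batching idea only, not DataJoint's internal code:

```python
from itertools import islice

def chunked(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield batch

# each batch would be inserted in its own transaction
batches = list(chunked(range(25), 10))
print([len(b) for b in batches])  # → [10, 10, 5]
```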
insert_dataframe ¶
insert_dataframe(df, index_as_pk=None, **insert_kwargs)
Insert DataFrame with explicit index handling.
This method provides symmetry with to_pandas(): data fetched with to_pandas() (which sets primary key as index) can be modified and re-inserted using insert_dataframe() without manual index manipulation.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | DataFrame to insert. | required |
| `index_as_pk` | `bool` | How to handle the DataFrame index. If None (default), auto-detect whether the index holds primary key attributes (as set by to_pandas()). If True, use the index as primary key columns. If False, ignore the index. | `None` |
| `**insert_kwargs` | | Passed to insert() - replace, skip_duplicates, ignore_extra_fields, allow_direct_insert, chunk_size. | `{}` |
Examples:
Round-trip with to_pandas():
>>> df = table.to_pandas() # PK becomes index
>>> df['value'] = df['value'] * 2 # Modify data
>>> table.insert_dataframe(df) # Auto-detects index as PK
Explicit control:
>>> table.insert_dataframe(df, index_as_pk=True) # Use index
>>> table.insert_dataframe(df, index_as_pk=False) # Ignore index
delete_quick ¶
delete_quick(get_count=False)
Delete the table's contents without cascading and without a user prompt. If this table has populated dependent tables, this will fail.
delete ¶
delete(transaction=True, prompt=None, part_integrity='enforce')
Deletes the contents of the table and its dependent tables, recursively.
Args:
transaction: If True, the entire delete becomes one atomic transaction.
This is the default and recommended behavior. Set to False if this delete is
nested within another transaction.
prompt: If True, show what will be deleted and ask for confirmation.
If False, delete without confirmation. Default is dj.config['safemode'].
part_integrity: Policy for master-part integrity. One of:
- "enforce" (default): Error if parts would be deleted without masters.
- "ignore": Allow deleting parts without masters (breaks integrity).
- "cascade": Also delete masters when parts are deleted (maintains integrity).
Returns:
Number of deleted rows (excluding those from dependent tables).
Raises:
- DataJointError: Delete exceeds maximum number of delete attempts.
- DataJointError: When deleting within an existing transaction.
- DataJointError: Deleting a part table before its master (when part_integrity="enforce").
- ValueError: Invalid part_integrity value.
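The three part_integrity policies can be summarized as a small dispatch. This is a sketch of the documented semantics only; the function and action strings are illustrative, not DataJoint's API:

```python
def part_delete_action(part_integrity: str) -> str:
    """What happens when a delete would remove part rows without
    their master rows, per the documented policies."""
    actions = {
        "enforce": "raise DataJointError",           # default: protect integrity
        "ignore": "delete parts only",               # caller accepts broken integrity
        "cascade": "delete parts and their masters", # extend the delete upward
    }
    if part_integrity not in actions:
        raise ValueError(f"invalid part_integrity: {part_integrity!r}")
    return actions[part_integrity]

print(part_delete_action("cascade"))  # → delete parts and their masters
```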
drop_quick ¶
drop_quick()
Drops the table without cascading to dependent tables and without user prompt.
drop ¶
drop(prompt=None)
Drop the table and all tables that reference it, recursively.
Args:
prompt: If True, show what will be dropped and ask for confirmation.
If False, drop without confirmation. Default is dj.config['safemode'].
size_on_disk property ¶
size_on_disk
Return the size of data and indices in bytes on the storage device.
Returns:

| Type | Description |
|---|---|
| `int` | Size of data and indices in bytes. |
describe ¶
describe(context=None, printout=False)
Return the definition string for the query using DataJoint DDL.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `context` | `dict` | The context for foreign key resolution. If None, uses the caller's local and global namespace. | `None` |
| `printout` | `bool` | If True, also log the definition string. | `False` |

Returns:

| Type | Description |
|---|---|
| `str` | The definition string for the table in DataJoint DDL format. |
lookup_class_name ¶
lookup_class_name(name, context, depth=3)
Find a table's class in the context given its full table name.
Given a table name in the form schema_name.table_name, find its class in
the context.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Full table name in the format `` `schema_name`.`table_name` ``. | required |
| `context` | `dict` | Dictionary representing the namespace. | required |
| `depth` | `int` | Search depth into imported modules, helps avoid infinite recursion. | `3` |

Returns:

| Type | Description |
|---|---|
| `str` or `None` | Class name found in the context or None if not found. |
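A simplified sketch of the lookup idea: scan a namespace for an object whose full_table_name matches. The classes and names here are toy stand-ins, and the real method additionally recurses into imported modules up to depth:

```python
def find_class_name(name, context):
    """Toy version: return the key of the first object in `context`
    whose full_table_name attribute equals `name`, else None."""
    for key, obj in context.items():
        if getattr(obj, "full_table_name", None) == name:
            return key
    return None

class Subject:
    # hypothetical table class with a full table name
    full_table_name = "`lab`.`subject`"

print(find_class_name("`lab`.`subject`", {"Subject": Subject}))  # → Subject
```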
FreeTable ¶
Bases: Table
A base table without a dedicated class.
Each instance is associated with a table specified by full_table_name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `conn` | `Connection` | A DataJoint connection object. | required |
| `full_table_name` | `str` | Full table name in the format `` `schema_name`.`table_name` ``. | required |