Table

Base Table and FreeTable classes

ValidationResult dataclass

Result of table.validate() call.

Attributes:

    is_valid: True if all rows passed validation
    errors: List of (row_index, field_name, error_message) tuples
    rows_checked: Number of rows that were validated

raise_if_invalid

raise_if_invalid()

Raise DataJointError if validation failed.

summary

summary()

Return formatted error summary.
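To make the dataclass concrete, here is a minimal stand-in mirroring the attributes documented above (is_valid, errors, rows_checked) and a plausible summary() rendering. The class name, __bool__ behavior, and message format are illustrative assumptions, not DataJoint's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ValidationResultSketch:
    # Hypothetical mirror of the documented ValidationResult attributes
    is_valid: bool
    errors: List[Tuple[int, str, str]] = field(default_factory=list)
    rows_checked: int = 0

    def __bool__(self):
        # Allows callers to write `if result:` as in the validate() example
        return self.is_valid

    def summary(self) -> str:
        # One plausible formatting of the error list
        if self.is_valid:
            return f"All {self.rows_checked} rows passed validation."
        lines = [f"{len(self.errors)} error(s) in {self.rows_checked} rows:"]
        lines += [f"  row {i}, field {f!r}: {msg}" for i, f, msg in self.errors]
        return "\n".join(lines)


result = ValidationResultSketch(is_valid=False,
                                errors=[(0, "value", "may not be NULL")],
                                rows_checked=2)
print(result.summary())
```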

Table

Bases: QueryExpression

Table is an abstract class that represents a table in the schema. It implements insert and delete methods and inherits query functionality. To make it a concrete class, override the abstract properties specifying the connection, table name, database, and definition.

heading property

heading

Return the table's heading, or raise a helpful error if not configured.

Overrides QueryExpression.heading to provide a clear error message when the table is not properly associated with an activated schema. For base tier classes (Lookup, Manual, etc.), returns None to support introspection (e.g., help()).

declare

declare(context=None)

Declare the table in the schema based on self.definition.

Parameters:

Name Type Description Default
context dict

The context for foreign key resolution. If None, foreign keys are not allowed.

None

alter

alter(prompt=True, context=None)

Alter the table definition to match self.definition.

from_clause

from_clause()

Return the FROM clause of SQL SELECT statements.

Returns:

Type Description
str

The full table name for use in SQL FROM clauses.

get_select_fields

get_select_fields(select_fields=None)

Return the selected attributes from the SQL SELECT statement.

Parameters:

Name Type Description Default
select_fields list

List of attribute names to select. If None, selects all attributes.

None

Returns:

Type Description
str

The SQL field selection string.

parents

parents(primary=None, as_objects=False, foreign_key_info=False)

Return the list of parent tables.

Parameters:

Name Type Description Default
primary bool

If None, then all parents are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute.

None
as_objects bool

If False, return table names. If True, return table objects.

False
foreign_key_info bool

If True, each element in result also includes foreign key info.

False

Returns:

Type Description
list

List of parents as table names or table objects with (optional) foreign key information.

children

children(primary=None, as_objects=False, foreign_key_info=False)

Return the list of child tables.

Parameters:

Name Type Description Default
primary bool

If None, then all children are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute.

None
as_objects bool

If False, return table names. If True, return table objects.

False
foreign_key_info bool

If True, each element in result also includes foreign key info.

False

Returns:

Type Description
list

List of children as table names or table objects with (optional) foreign key information.

descendants

descendants(as_objects=False)

Return list of descendant tables in topological order.

Parameters:

Name Type Description Default
as_objects bool

If False (default), return a list of table names. If True, return a list of table objects.

False

Returns:

Type Description
list

List of descendant tables in topological order.
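"Topological order" here means every table appears after all tables it depends on. A minimal sketch of that ordering using the standard library's graphlib on a made-up dependency graph (the table names are illustrative, not from any real schema):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each table maps to the set of tables it
# references via foreign keys (its parents).
graph = {
    "session": {"subject"},
    "recording": {"session"},
    "analysis": {"recording", "session"},
}

# static_order() emits parents before children, the same guarantee
# descendants() and ancestors() provide for table lists.
order = list(TopologicalSorter(graph).static_order())
print(order)
```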

ancestors

ancestors(as_objects=False)

Return list of ancestor tables in topological order.

Parameters:

Name Type Description Default
as_objects bool

If False (default), return a list of table names. If True, return a list of table objects.

False

Returns:

Type Description
list

List of ancestor tables in topological order.

parts

parts(as_objects=False)

Return part tables for this master table.

Parameters:

Name Type Description Default
as_objects bool

If False (default), the output is a list of full table names. If True, return table objects.

False

Returns:

Type Description
list

List of part table names or table objects.

is_declared property

is_declared

Check if the table is declared in the schema.

Returns:

Type Description
bool

True if the table is declared in the schema.

full_table_name property

full_table_name

Return the full table name in the schema.

Returns:

Type Description
str

Full table name in the format database.table_name.

adapter property

adapter

Database adapter for backend-agnostic SQL generation.

update1

update1(row)

Update one existing entry in the table.

Caution: In DataJoint, the primary modes of data manipulation are inserting and deleting entire records, since referential integrity operates at the level of records, not fields. Updates are therefore reserved for corrective operations outside the main workflow. Use update methods sparingly, with full awareness of the assumptions they may violate.

The primary key attributes must always be provided.

Parameters:

Name Type Description Default
row dict

A dict containing the primary key values and the attributes to update. Setting an attribute value to None will reset it to the default value (if any).

required

Examples:

>>> table.update1({'id': 1, 'value': 3})  # update value in record with id=1
>>> table.update1({'id': 1, 'value': None})  # reset value to default

validate

validate(rows, *, ignore_extra_fields=False)

Validate rows without inserting them.

Validates:

- Field existence (all fields must be in the table heading)
- Row format (correct number of attributes for positional inserts)
- Codec validation (type checking via codec.validate())
- NULL constraints (non-nullable fields must have values)
- Primary key completeness (all PK fields must be present)
- UUID format and JSON serializability

Cannot validate (database-enforced):

- Foreign key constraints
- Unique constraints (other than PK)
- Custom MySQL constraints

Parameters:

Name Type Description Default
rows iterable

Same format as insert() - iterable of dicts, tuples, numpy records, or a pandas DataFrame.

required
ignore_extra_fields bool

If True, ignore fields not in the table heading.

False

Returns:

Type Description
ValidationResult

Result with is_valid, errors list, and rows_checked count.

Examples:

>>> result = table.validate(rows)
>>> if result:
...     table.insert(rows)
... else:
...     print(result.summary())

insert1

insert1(row, **kwargs)

Insert one data record into the table.

For kwargs, see insert().

Parameters:

Name Type Description Default
row numpy.void, dict, or sequence

A numpy record, a dict-like object, or an ordered sequence to be inserted as one row.

required
**kwargs

Additional arguments passed to insert().

{}
See Also

insert : Insert multiple data records.

staged_insert1 property

staged_insert1

Context manager for staged insert with direct object storage writes.

Use this for large objects like Zarr arrays where copying from local storage is inefficient. Allows writing directly to the destination storage before finalizing the database insert.

Example:

>>> with table.staged_insert1 as staged:
...     staged.rec['subject_id'] = 123
...     staged.rec['session_id'] = 45
...
...     # Create object storage directly
...     z = zarr.open(staged.store('raw_data', '.zarr'), mode='w', shape=(1000, 1000))
...     z[:] = data
...
...     # Assign to record
...     staged.rec['raw_data'] = z

On successful exit: metadata is computed and the record inserted.
On exception: storage is cleaned up and no record is inserted.

Yields:

StagedInsert: Context for setting record values and getting storage handles.

insert

insert(rows, replace=False, skip_duplicates=False, ignore_extra_fields=False, allow_direct_insert=None, chunk_size=None)

Insert a collection of rows.

Parameters:

Name Type Description Default
rows iterable or Path

Either (a) an iterable where an element is a numpy record, a dict-like object, a pandas.DataFrame, a polars.DataFrame, a pyarrow.Table, a sequence, or a query expression with the same heading as self, or (b) a pathlib.Path object specifying a path relative to the current directory with a CSV file, the contents of which will be inserted.

required
replace bool

If True, replaces the existing tuple.

False
skip_duplicates bool

If True, silently skip duplicate inserts.

False
ignore_extra_fields bool

If False (default), fields that are not in the heading raise an error.

False
allow_direct_insert bool

Only applies in auto-populated tables. If False (default), insert may only be called from inside the make callback.

None
chunk_size int

If set, insert rows in batches of this size. Useful for very large inserts to avoid memory issues. Each chunk is a separate transaction.

None

Examples:

>>> Table.insert([
...     dict(subject_id=7, species="mouse", date_of_birth="2014-09-01"),
...     dict(subject_id=8, species="mouse", date_of_birth="2014-09-02")])

Large insert with chunking:

>>> Table.insert(large_dataset, chunk_size=10000)
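The chunking behavior described for chunk_size can be sketched as a batching generator; this is an illustration of the splitting logic, not DataJoint's actual implementation (each resulting batch would be inserted in its own transaction):

```python
from itertools import islice


def chunked(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    it = iter(rows)
    while batch := list(islice(it, chunk_size)):
        yield batch


# 25 rows in chunks of 10 -> batches of 10, 10, and 5 rows
batches = list(chunked(range(25), 10))
print([len(b) for b in batches])  # [10, 10, 5]
```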

insert_dataframe

insert_dataframe(df, index_as_pk=None, **insert_kwargs)

Insert DataFrame with explicit index handling.

This method provides symmetry with to_pandas(): data fetched with to_pandas() (which sets primary key as index) can be modified and re-inserted using insert_dataframe() without manual index manipulation.

Parameters:

Name Type Description Default
df DataFrame

DataFrame to insert.

required
index_as_pk bool

How to handle DataFrame index:

  • None (default): Auto-detect. Use index as primary key if index names match primary_key columns. Drop if unnamed RangeIndex.
  • True: Treat index as primary key columns. Raises if index names don't match table primary key.
  • False: Ignore index entirely (drop it).
None
**insert_kwargs

Passed to insert() - replace, skip_duplicates, ignore_extra_fields, allow_direct_insert, chunk_size.

{}

Examples:

Round-trip with to_pandas():

>>> df = table.to_pandas()           # PK becomes index
>>> df['value'] = df['value'] * 2    # Modify data
>>> table.insert_dataframe(df)       # Auto-detects index as PK

Explicit control:

>>> table.insert_dataframe(df, index_as_pk=True)   # Use index
>>> table.insert_dataframe(df, index_as_pk=False)  # Ignore index
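The index_as_pk decision described above can be sketched as pure name-matching logic. This operates on plain lists of index level names and primary key columns (the function name and signature are hypothetical, for illustration only):

```python
def resolve_index_mode(index_names, primary_key, index_as_pk=None):
    """Sketch of the documented index_as_pk decision.

    index_names: names of the DataFrame index levels (None for unnamed).
    primary_key: the table's primary key column names.
    Returns True (use index as PK) or False (drop the index).
    """
    matches = set(n for n in index_names if n is not None) == set(primary_key)
    if index_as_pk is None:          # auto-detect
        return matches               # unnamed index -> no match -> dropped
    if index_as_pk and not matches:  # explicit True must match the PK
        raise ValueError("index names do not match table primary key")
    return index_as_pk


print(resolve_index_mode(["subject_id"], ["subject_id"]))  # True
print(resolve_index_mode([None], ["subject_id"]))          # False
```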

delete_quick

delete_quick(get_count=False)

Deletes the table without cascading and without user prompt. If this table has populated dependent tables, this will fail.

delete

delete(transaction=True, prompt=None, part_integrity='enforce')

Deletes the contents of the table and its dependent tables, recursively.

Args:

    transaction: If True, the entire delete becomes an atomic transaction. This is the default and recommended behavior. Set to False if this delete is nested within another transaction.
    prompt: If True, show what will be deleted and ask for confirmation. If False, delete without confirmation. Default is dj.config['safemode'].
    part_integrity: Policy for master-part integrity. One of:
        - "enforce" (default): Error if parts would be deleted without masters.
        - "ignore": Allow deleting parts without masters (breaks integrity).
        - "cascade": Also delete masters when parts are deleted (maintains integrity).

Returns: Number of deleted rows (excluding those from dependent tables).

Raises:

    DataJointError: Delete exceeds the maximum number of delete attempts.
    DataJointError: When deleting within an existing transaction.
    DataJointError: Deleting a part table before its master (when part_integrity="enforce").
    ValueError: Invalid part_integrity value.
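The ValueError case above can be sketched as a small argument check over the three documented policies; the helper name is hypothetical and this is not DataJoint's actual code:

```python
def check_part_integrity(policy):
    """Validate the documented part_integrity policies (illustrative sketch)."""
    allowed = ("enforce", "ignore", "cascade")
    if policy not in allowed:
        raise ValueError(
            f"Invalid part_integrity value: {policy!r}; expected one of {allowed}")
    return policy


print(check_part_integrity("cascade"))  # cascade
```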

drop_quick

drop_quick()

Drops the table without cascading to dependent tables and without user prompt.

drop

drop(prompt=None)

Drop the table and all tables that reference it, recursively.

Args: prompt: If True, show what will be dropped and ask for confirmation. If False, drop without confirmation. Default is dj.config['safemode'].

size_on_disk property

size_on_disk

Return the size of data and indices in bytes on the storage device.

Returns:

Type Description
int

Size of data and indices in bytes.

describe

describe(context=None, printout=False)

Return the definition string for the query using DataJoint DDL.

Parameters:

Name Type Description Default
context dict

The context for foreign key resolution. If None, uses the caller's local and global namespace.

None
printout bool

If True, also log the definition string.

False

Returns:

Type Description
str

The definition string for the table in DataJoint DDL format.

lookup_class_name

lookup_class_name(name, context, depth=3)

Find a table's class in the context given its full table name.

Given a table name in the form schema_name.table_name, find its class in the context.

Parameters:

Name Type Description Default
name str

Full table name in format schema_name.table_name.

required
context dict

Dictionary representing the namespace.

required
depth int

Search depth into imported modules, helps avoid infinite recursion.

3

Returns:

Type Description
str or None

Class name found in the context or None if not found.

FreeTable

Bases: Table

A base table without a dedicated class.

Each instance is associated with a table specified by full_table_name.

Parameters:

Name Type Description Default
conn Connection

A DataJoint connection object.

required
full_table_name str

Full table name in format database.table_name.

required