Table ¶
Base Table and FreeTable classes
ValidationResult dataclass ¶
Result of a table.validate() call.
Attributes:
- is_valid: True if all rows passed validation.
- errors: List of (row_index, field_name, error_message) tuples.
- rows_checked: Number of rows that were validated.
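Based only on the attributes listed above, the shape of the result can be sketched as a plain dataclass. This is a sketch, not the library's definition; the `__bool__` behavior is inferred from the `if result:` idiom shown in the validate() example below, and the real class also provides helpers such as summary().

```python
from dataclasses import dataclass

@dataclass
class ValidationResult:
    """Sketch of the documented shape; attribute names from the docs above."""
    is_valid: bool
    errors: list  # (row_index, field_name, error_message) tuples
    rows_checked: int

    def __bool__(self) -> bool:
        # lets callers write `if result: ...`
        return self.is_valid

# hypothetical validation outcome for a two-row batch
result = ValidationResult(
    is_valid=False,
    errors=[(0, "species", "field is required")],
    rows_checked=2,
)
print(bool(result))  # → False
```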
Table ¶
Bases: QueryExpression
Table is an abstract class that represents a table in the schema. It implements insert and delete methods and inherits query functionality. To make it a concrete class, override the abstract properties specifying the connection, table name, database, and definition.
heading property ¶
heading
Return the table's heading, or raise a helpful error if not configured.
Overrides QueryExpression.heading to provide a clear error message when the table is not properly associated with an activated schema. For base tier classes (Lookup, Manual, etc.), returns None to support introspection (e.g., help()).
declare ¶
declare(context=None)
Declare the table in the schema based on self.definition.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `context` | `dict` | The context for foreign key resolution. If None, foreign keys are not allowed. | `None` |
from_clause ¶
from_clause()
Return the FROM clause of SQL SELECT statements.
Returns:

| Type | Description |
|---|---|
| `str` | The full table name for use in SQL FROM clauses. |
get_select_fields ¶
get_select_fields(select_fields=None)
Return the selected attributes from the SQL SELECT statement.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `select_fields` | `list` | List of attribute names to select. If None, selects all attributes. | `None` |

Returns:

| Type | Description |
|---|---|
| `str` | The SQL field selection string. |
parents ¶
parents(primary=None, as_objects=False, foreign_key_info=False)
Return the list of parent tables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `primary` | `bool` | If None, then all parents are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute. | `None` |
| `as_objects` | `bool` | If False, return table names. If True, return table objects. | `False` |
| `foreign_key_info` | `bool` | If True, each element in the result also includes foreign key info. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of parents as table names or table objects with (optional) foreign key information. |
children ¶
children(primary=None, as_objects=False, foreign_key_info=False)
Return the list of child tables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `primary` | `bool` | If None, then all children are returned. If True, then only foreign keys composed of primary key attributes are considered. If False, return foreign keys including at least one secondary attribute. | `None` |
| `as_objects` | `bool` | If False, return table names. If True, return table objects. | `False` |
| `foreign_key_info` | `bool` | If True, each element in the result also includes foreign key info. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of children as table names or table objects with (optional) foreign key information. |
descendants ¶
descendants(as_objects=False)
Return list of descendant tables in topological order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), return a list of table names. If True, return a list of table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of descendant tables in topological order. |
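As an illustration of what "topological order" guarantees here (every table appears before the tables that depend on it), a minimal sketch using Kahn's algorithm over a toy dependency graph; the table names are hypothetical and this is not DataJoint's internal code:

```python
from collections import deque

# hypothetical dependency graph: table -> tables that reference it
deps = {
    "Subject": ["Session"],
    "Session": ["Recording"],
    "Recording": [],
}

def topo_order(graph):
    """Kahn's algorithm: ancestors always precede their descendants."""
    indegree = {node: 0 for node in graph}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    queue = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in graph[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order

print(topo_order(deps))  # → ['Subject', 'Session', 'Recording']
```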
ancestors ¶
ancestors(as_objects=False)
Return list of ancestor tables in topological order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), return a list of table names. If True, return a list of table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of ancestor tables in topological order. |
parts ¶
parts(as_objects=False)
Return part tables for this master table.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `as_objects` | `bool` | If False (default), the output is a list of full table names. If True, return table objects. | `False` |

Returns:

| Type | Description |
|---|---|
| `list` | List of part table names or table objects. |
is_declared property ¶
is_declared
Check if the table is declared in the schema.
Returns:

| Type | Description |
|---|---|
| `bool` | True if the table is declared in the schema. |
full_table_name property ¶
full_table_name
Return the full table name in the schema.
Returns:

| Type | Description |
|---|---|
| `str` | Full table name in the format `` `schema_name`.`table_name` ``. |
update1 ¶
update1(row)
Update one existing entry in the table.
Caution: In DataJoint, the primary modes of data manipulation are inserting and
deleting entire records, since referential integrity operates at the level of records,
not fields. Updates are therefore reserved for corrective operations outside the main
workflow. Use update methods sparingly and with full awareness of the assumptions
they may violate.
The primary key attributes must always be provided.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `row` | `dict` | A dict containing the primary key values and the attributes to update. Setting an attribute value to None will reset it to the default value (if any). | required |
Examples:
>>> table.update1({'id': 1, 'value': 3}) # update value in record with id=1
>>> table.update1({'id': 1, 'value': None}) # reset value to default
validate ¶
validate(rows, *, ignore_extra_fields=False)
Validate rows without inserting them.
Validates:
- Field existence (all fields must be in the table heading)
- Row format (correct number of attributes for positional inserts)
- Codec validation (type checking via codec.validate())
- NULL constraints (non-nullable fields must have values)
- Primary key completeness (all PK fields must be present)
- UUID format and JSON serializability

Cannot validate (database-enforced):
- Foreign key constraints
- Unique constraints (other than PK)
- Custom MySQL constraints
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | `iterable` | Same format as insert() - iterable of dicts, tuples, numpy records, or a pandas DataFrame. | required |
| `ignore_extra_fields` | `bool` | If True, ignore fields not in the table heading. | `False` |

Returns:

| Type | Description |
|---|---|
| `ValidationResult` | Result with is_valid, errors list, and rows_checked count. |
Examples:
>>> result = table.validate(rows)
>>> if result:
... table.insert(rows)
... else:
... print(result.summary())
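Because errors carries (row_index, field_name, error_message) tuples, a failed batch can also be split into passing and failing rows before retrying. A sketch with hypothetical values; whether to skip or repair failed rows is up to the workflow:

```python
# suppose validate() reported these errors (hypothetical values)
errors = [(0, "species", "field is required"), (2, "dob", "invalid date")]
rows = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]

# row_index is the first element of each error tuple
bad_indices = {row_index for row_index, _field, _msg in errors}
good_rows = [row for i, row in enumerate(rows) if i not in bad_indices]
print(len(good_rows))  # → 2
```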
insert1 ¶
insert1(row, **kwargs)
Insert one data record into the table.
For kwargs, see insert().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `row` | `numpy.void`, `dict`, or sequence | A numpy record, a dict-like object, or an ordered sequence to be inserted as one row. | required |
| `**kwargs` | | Additional arguments passed to insert(). | `{}` |
See Also
insert : Insert multiple data records.
staged_insert1 property ¶
staged_insert1
Context manager for staged insert with direct object storage writes.
Use this for large objects like Zarr arrays where copying from local storage is inefficient. Allows writing directly to the destination storage before finalizing the database insert.
Example:
>>> with table.staged_insert1 as staged:
...     staged.rec['subject_id'] = 123
...     staged.rec['session_id'] = 45
...     # Create object storage directly
...     z = zarr.open(staged.store('raw_data', '.zarr'), mode='w', shape=(1000, 1000))
...     z[:] = data
...     # Assign to record
...     staged.rec['raw_data'] = z
>>> # On successful exit: metadata computed, record inserted
>>> # On exception: storage cleaned up, no record inserted
Yields:
StagedInsert: Context for setting record values and getting storage handles.
insert ¶
insert(rows, replace=False, skip_duplicates=False, ignore_extra_fields=False, allow_direct_insert=None, chunk_size=None)
Insert a collection of rows.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | iterable or `Path` | Either (a) an iterable where an element is a numpy record, a dict-like object, a pandas.DataFrame, a polars.DataFrame, a pyarrow.Table, a sequence, or a query expression with the same heading as self, or (b) a pathlib.Path object specifying a path relative to the current directory with a CSV file, the contents of which will be inserted. | required |
| `replace` | `bool` | If True, replaces the existing tuple. | `False` |
| `skip_duplicates` | `bool` | If True, silently skip duplicate inserts. | `False` |
| `ignore_extra_fields` | `bool` | If False (default), fields that are not in the heading raise an error. | `False` |
| `allow_direct_insert` | `bool` | Only applies in auto-populated tables. If False (default), insert may only be called from inside the make callback. | `None` |
| `chunk_size` | `int` | If set, insert rows in batches of this size. Useful for very large inserts to avoid memory issues. Each chunk is a separate transaction. | `None` |
Examples:
>>> Table.insert([
... dict(subject_id=7, species="mouse", date_of_birth="2014-09-01"),
... dict(subject_id=8, species="mouse", date_of_birth="2014-09-02")])
Large insert with chunking:
>>> Table.insert(large_dataset, chunk_size=10000)
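The chunk_size batching above amounts to grouping the row iterable into fixed-size batches, each committed in its own transaction. A sketch of the batching idea only, not DataJoint's internal code:

```python
from itertools import islice

def chunked(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield batch

# each batch would be inserted in its own transaction
batches = list(chunked(range(25), 10))
print([len(b) for b in batches])  # → [10, 10, 5]
```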
insert_dataframe ¶
insert_dataframe(df, index_as_pk=None, **insert_kwargs)
Insert DataFrame with explicit index handling.
This method provides symmetry with to_pandas(): data fetched with to_pandas() (which sets primary key as index) can be modified and re-inserted using insert_dataframe() without manual index manipulation.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | DataFrame to insert. | required |
| `index_as_pk` | `bool` | How to handle the DataFrame index. If None (default), auto-detect whether the index holds primary key attributes (as set by to_pandas()). If True, use the index as primary key columns. If False, ignore the index. | `None` |
| `**insert_kwargs` | | Passed to insert() - replace, skip_duplicates, ignore_extra_fields, allow_direct_insert, chunk_size. | `{}` |
Examples:
Round-trip with to_pandas():
>>> df = table.to_pandas() # PK becomes index
>>> df['value'] = df['value'] * 2 # Modify data
>>> table.insert_dataframe(df) # Auto-detects index as PK
Explicit control:
>>> table.insert_dataframe(df, index_as_pk=True) # Use index
>>> table.insert_dataframe(df, index_as_pk=False) # Ignore index
delete_quick ¶
delete_quick(get_count=False)
Delete the table's contents without cascading and without a user prompt. If this table has populated dependent tables, this will fail.
delete ¶
delete(transaction=True, prompt=None, part_integrity='enforce')
Deletes the contents of the table and its dependent tables, recursively.
Args:
transaction: If True, the entire delete becomes one atomic transaction.
This is the default and recommended behavior. Set to False if this delete is
nested within another transaction.
prompt: If True, show what will be deleted and ask for confirmation.
If False, delete without confirmation. Default is dj.config['safemode'].
part_integrity: Policy for master-part integrity. One of:
- "enforce" (default): Error if parts would be deleted without masters.
- "ignore": Allow deleting parts without masters (breaks integrity).
- "cascade": Also delete masters when parts are deleted (maintains integrity).
Returns:
Number of deleted rows (excluding those from dependent tables).
Raises:
- DataJointError: Delete exceeds maximum number of delete attempts.
- DataJointError: When deleting within an existing transaction.
- DataJointError: Deleting a part table before its master (when part_integrity="enforce").
- ValueError: Invalid part_integrity value.
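The three part_integrity policies can be summarized as a small dispatch. This is a sketch of the documented semantics only; the function and action strings are illustrative, not DataJoint's API:

```python
def part_delete_action(part_integrity: str) -> str:
    """What happens when a delete would remove part rows without
    their master rows, per the documented policies."""
    actions = {
        "enforce": "raise DataJointError",           # default: protect integrity
        "ignore": "delete parts only",               # caller accepts broken integrity
        "cascade": "delete parts and their masters", # extend the delete upward
    }
    if part_integrity not in actions:
        raise ValueError(f"invalid part_integrity: {part_integrity!r}")
    return actions[part_integrity]

print(part_delete_action("cascade"))  # → delete parts and their masters
```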
drop_quick ¶
drop_quick()
Drops the table without cascading to dependent tables and without user prompt.
drop ¶
drop(prompt=None)
Drop the table and all tables that reference it, recursively.
Args:
prompt: If True, show what will be dropped and ask for confirmation.
If False, drop without confirmation. Default is dj.config['safemode'].
size_on_disk property ¶
size_on_disk
Return the size of data and indices in bytes on the storage device.
Returns:

| Type | Description |
|---|---|
| `int` | Size of data and indices in bytes. |
describe ¶
describe(context=None, printout=False)
Return the definition string for the query using DataJoint DDL.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `context` | `dict` | The context for foreign key resolution. If None, uses the caller's local and global namespace. | `None` |
| `printout` | `bool` | If True, also log the definition string. | `False` |

Returns:

| Type | Description |
|---|---|
| `str` | The definition string for the table in DataJoint DDL format. |
lookup_class_name ¶
lookup_class_name(name, context, depth=3)
Find a table's class in the context given its full table name.
Given a table name in the form schema_name.table_name, find its class in
the context.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Full table name in the format `` `schema_name`.`table_name` ``. | required |
| `context` | `dict` | Dictionary representing the namespace. | required |
| `depth` | `int` | Search depth into imported modules, helps avoid infinite recursion. | `3` |

Returns:

| Type | Description |
|---|---|
| `str` or `None` | Class name found in the context or None if not found. |
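A simplified sketch of the lookup idea: scan a namespace for an object whose full_table_name matches. The classes and names here are toy stand-ins, and the real method additionally recurses into imported modules up to depth:

```python
def find_class_name(name, context):
    """Toy version: return the key of the first object in `context`
    whose full_table_name attribute equals `name`, else None."""
    for key, obj in context.items():
        if getattr(obj, "full_table_name", None) == name:
            return key
    return None

class Subject:
    # hypothetical table class with a full table name
    full_table_name = "`lab`.`subject`"

print(find_class_name("`lab`.`subject`", {"Subject": Subject}))  # → Subject
```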
FreeTable ¶
Bases: Table
A base table without a dedicated class.
Each instance is associated with a table specified by full_table_name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `conn` | `Connection` | A DataJoint connection object. | required |
| `full_table_name` | `str` | Full table name in the format `` `schema_name`.`table_name` ``. | required |