How-To Guides

Practical guides for common tasks.

These guides help you accomplish specific tasks with DataJoint. Unlike tutorials, they assume you understand the basics and focus on getting things done.

Setup

Installation — Installing DataJoint
Manage Secrets and Credentials — Secure configuration management
Configure Database Connection — Connection settings
Configure Object Storage — S3, MinIO, file stores
Use the Command-Line Interface — Interactive REPL

Schema Design

Define Tables — Table definition syntax
Model Relationships — Foreign key patterns
Master-Part Tables — Compositional data patterns
Design Primary Keys — Key selection strategies
Read Schema Diagrams — Interpret visual diagrams

Project Management

Manage a Pipeline Project — Multi-schema pipelines, team collaboration
Deploy to Production — Production mode, schema prefixes, environment config

Data Operations

Insert Data — Single rows, batches, transactions
Query Data — Operators, restrictions, projections
Fetch Results — DataFrames, dicts, streaming
Delete Data — Safe deletion with cascades

Computation

Run Computations — populate() basics
Distributed Computing — Multi-process, cluster
Handle Errors — Error recovery and job management
Monitor Progress — Dashboards and status

Object Storage

Object Storage Overview — Navigation guide for all storage docs
Choose a Storage Type — Decision guide for codecs
Use Object Storage — When and how
Use the <npy> Codec — NumPy arrays with lazy loading
Use Plugin Codecs — Install codec packages via entry points
Create Custom Codecs — Domain-specific types
Manage Large Data — Blobs, streaming, efficiency
Clean Up Object Storage — Garbage collection

Maintenance

Migrate to v2.0 — Upgrading existing pipelines
Alter Tables — Schema evolution
Backup and Restore — Data protection

Testing

Testing Best Practices — Integration testing with pytest