How-To Guides
Practical guides for common tasks.
These guides help you accomplish specific tasks with DataJoint. Unlike tutorials, they assume you understand the basics and focus on getting things done.
Setup
- Installation: Installing DataJoint
- Manage Secrets and Credentials: Secure configuration management
- Configure Database Connection: Connection settings
- Configure Object Storage: S3, MinIO, file stores
- Use the Command-Line Interface: Interactive REPL
Schema Design
- Define Tables: Table definition syntax
- Model Relationships: Foreign key patterns
- Master-Part Tables: Compositional data patterns
- Design Primary Keys: Key selection strategies
- Read Schema Diagrams: Interpret visual diagrams
Project Management
- Manage a Pipeline Project: Multi-schema pipelines, team collaboration
- Deploy to Production: Production mode, schema prefixes, environment config
Data Operations
- Insert Data: Single rows, batches, transactions
- Query Data: Operators, restrictions, projections
- Fetch Results: DataFrames, dicts, streaming
- Delete Data: Safe deletion with cascades
Computation
- Run Computations: populate() basics
- Distributed Computing: Multi-process, cluster
- Handle Errors: Error recovery and job management
- Monitor Progress: Dashboards and status
Object Storage
- Object Storage Overview: Navigation guide for all storage docs
- Choose a Storage Type: Decision guide for codecs
- Use Object Storage: When and how
- Use the <npy> Codec: NumPy arrays with lazy loading
- Use Plugin Codecs: Install codec packages via entry points
- Create Custom Codecs: Domain-specific types
- Manage Large Data: Blobs, streaming, efficiency
- Clean Up Object Storage: Garbage collection
Maintenance
- Migrate to v2.0: Upgrading existing pipelines
- Alter Tables: Schema evolution
- Backup and Restore: Data protection
Testing
- Testing Best Practices: Integration testing with pytest