About DataJoint
DataJoint is an open-source framework for building scientific data pipelines. It was created to address the challenges of managing complex, interconnected data in research laboratories.
What is DataJoint?
DataJoint implements the Relational Workflow Model, a paradigm that extends relational databases with native support for computational workflows. Unlike traditional databases that only store data, DataJoint pipelines define how data flows through processing steps, when computations run, and how results depend on inputs.
Key characteristics:
- Declarative schema design: Define tables and relationships in Python
- Automatic dependency tracking: Foreign keys encode workflow dependencies
- Built-in computation: Imported and Computed tables run automatically
- Data integrity: Referential integrity and transaction support
- Reproducibility: Immutable data with full provenance
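To make the idea of foreign keys encoding workflow dependencies concrete, here is a minimal sketch in plain Python. It does not use DataJoint's API. The table names and the `foreign_keys` mapping are hypothetical, and the standard-library `graphlib` module stands in for whatever dependency resolution DataJoint performs internally. It only illustrates that once each table declares the tables it references, a valid computation order can be derived from those declarations.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each table maps to the upstream tables it
# references through foreign keys. Toy model only, not DataJoint's API.
foreign_keys = {
    "Subject": set(),
    "Session": {"Subject"},
    "Recording": {"Session"},
    "SpikeSorting": {"Recording"},
    "SessionSummary": {"Session", "SpikeSorting"},
}


def computation_order(dependencies):
    """Return table names so that each table comes after every table it references."""
    return list(TopologicalSorter(dependencies).static_order())


order = computation_order(foreign_keys)
print(order)
```

In this toy model, adding a new downstream table only requires declaring its references; the processing order follows from the declarations rather than from a hand-maintained script.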
History
DataJoint was developed at Baylor College of Medicine starting in 2009 to support neuroscience research. It has since been adopted by laboratories worldwide for a variety of scientific applications.
Citation
If you use DataJoint in your research, please cite it appropriately.
Contributing
DataJoint is developed openly on GitHub. Contributions are welcome.
License
DataJoint is released under the Apache License 2.0.
Copyright 2024 DataJoint Inc. and contributors.