What's New in DataJoint 2.2.4
DataJoint 2.2.4 introduces env-var-only storage configuration and a public plugin-adapter contract for third-party storage protocols, and it extends file-based credential loading to any store attribute.
Upgrading from 2.2.0–2.2.3? There are no breaking changes for projects using datajoint.json or .secrets/. The new env vars are purely additive.
Overview
The DataJoint platform, like many production deployments, provisions configuration entirely from environment variables: there is no datajoint.json in the container image and no .secrets/ directory on disk. Until 2.2.4, this worked for the database connection (DJ_HOST, DJ_USER, DJ_PASS, …) but not for object stores: per-store credentials had to be configured through datajoint.json or .secrets/stores.<name>.<attr> files.
DataJoint 2.2.4 closes that gap with two new env vars, both purely additive:
- DJ_STORES: a JSON-encoded copy of the entire stores dict, in the same shape used in datajoint.json.
- DJ_IGNORE_CONFIG_FILE: a boolean flag that skips both datajoint.json and the secrets directory entirely.
The 2.2.4 release also formalizes the storage-adapter plugin contract (datajoint.storage entry-point group), which had been used internally since 2.0 but lacked a published spec. Third-party packages can now register storage protocols (Databricks Unity Catalog Volumes, custom HTTP-based stores, lab-specific archive systems, …) by subclassing dj.StorageAdapter and declaring an entry point.
DJ_STORES: JSON-encoded stores configuration
New in 2.2.4
DJ_STORES accepts a JSON object identical to the stores block of datajoint.json.
A single env var carries the entire stores dict. The format matches what users already write in datajoint.json, so config can be moved between file and env var by copy-paste, with no per-field naming scheme to learn.
export DJ_STORES='{
  "default": "main",
  "main": {
    "protocol": "s3",
    "endpoint": "s3.amazonaws.com",
    "bucket": "my-bucket",
    "location": "my-project/production",
    "access_key": "AKIA...",
    "secret_key": "wJal..."
  }
}'
For plugin-registered adapters, the field names are whatever the adapter defines (token, api_key, workspace_url, etc.):
export DJ_STORES='{
  "uc": {
    "protocol": "databricks",
    "workspace_url": "https://my-workspace.cloud.databricks.com",
    "volume": "main.default.my_volume",
    "token": "dapibd..."
  }
}'
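When the environment is prepared from Python rather than a shell (a test harness or a launcher script, for example), the same value can be produced with the standard json module. The following is a minimal sketch: the stores dict and its placeholder credentials are illustrative, and it assumes the variable is set before DataJoint loads its configuration.

```python
import json
import os

# Illustrative stores dict, in the same shape as the "stores" block of datajoint.json.
stores = {
    "default": "main",
    "main": {
        "protocol": "s3",
        "endpoint": "s3.amazonaws.com",
        "bucket": "my-bucket",
        "location": "my-project/production",
        # Placeholder credentials; real ones would come from a secret manager.
        "access_key": "PLACEHOLDER_ACCESS_KEY",
        "secret_key": "PLACEHOLDER_SECRET_KEY",
    },
}

# Serialize the whole dict into the single env var.
os.environ["DJ_STORES"] = json.dumps(stores)

# Round-trip check: decoding the env var yields the original dict.
decoded = json.loads(os.environ["DJ_STORES"])
```

Because the env var holds ordinary JSON, the same string can be pasted into the stores block of datajoint.json unchanged.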
Precedence
If set, DJ_STORES replaces the stores block loaded from datajoint.json wholesale. The .secrets/ directory is still read after DJ_STORES and fills in any attributes that DJ_STORES omits. This is useful when a deployment supplies non-sensitive store config through DJ_STORES while keeping credentials in mounted secret files.
| Source | Priority |
|---|---|
| dj.config["stores"][...] (programmatic) | 1 (highest) |
| DJ_STORES env var | 2 |
| datajoint.json stores block | 3 |
| .secrets/stores.<name>.<attr> files | 4 (fills missing attrs only) |
Errors
If DJ_STORES is set but unparsable, DataJoint raises ValueError at config load time with the JSON error, rather than failing later with a confusing KeyError from a half-loaded store.
ValueError: DJ_STORES contains invalid JSON: Expecting property name enclosed in double quotes...
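The fail-fast behavior can be illustrated with the standard json module. The sketch below is a hypothetical parser, not DataJoint's code: parse_dj_stores is an invented name, and the check that the top-level value is an object is an added assumption.

```python
import json


def parse_dj_stores(raw):
    """Hypothetical parser: decode DJ_STORES or fail fast with a clear error."""
    try:
        stores = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Wrap the decoder's message so the env var name appears in the error.
        raise ValueError(f"DJ_STORES contains invalid JSON: {exc}") from exc
    if not isinstance(stores, dict):
        # Assumption: the top-level JSON value must be an object.
        raise ValueError("DJ_STORES must be a JSON object")
    return stores


try:
    parse_dj_stores("{'main': {}}")  # single quotes are not valid JSON
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

A common cause of this error is shell quoting: wrapping the value in double quotes lets the shell consume the inner quotes, so single quotes around the JSON (as in the export examples) are the safer choice.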
DJ_IGNORE_CONFIG_FILE: skip files entirely
New in 2.2.4
Set DJ_IGNORE_CONFIG_FILE=true to skip datajoint.json and the secrets directory.
For env-var-only deployments (Kubernetes pods, Lambda functions, the DataJoint platform), set:
export DJ_IGNORE_CONFIG_FILE=true
When true, DataJoint skips:
- the recursive parent-directory search for datajoint.json
- the project .secrets/ directory
- the Docker/Kubernetes /run/secrets/datajoint/ directory
Only env vars (DJ_HOST, DJ_USER, DJ_PASS, DJ_STORES, …) and defaults apply. This guarantees that no stray file in a container image can leak into config.
| Variable | Values | Default | Description |
|---|---|---|---|
| DJ_IGNORE_CONFIG_FILE | true, 1, yes / false, 0, no | false | Skip file-based config sources |
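The accepted values in the table map naturally onto a small boolean parser. The sketch below is illustrative only: env_flag is a hypothetical helper, and both the case-insensitive matching and the error on unrecognized values are assumptions rather than documented behavior.

```python
import os

TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def env_flag(name, default=False, environ=None):
    """Hypothetical helper: read a boolean flag from the environment."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()  # assumption: matching is case-insensitive
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Assumption: unrecognized values are rejected rather than silently ignored.
    raise ValueError(f"{name} must be one of true/1/yes or false/0/no, got {raw!r}")


ignore_files = env_flag(
    "DJ_IGNORE_CONFIG_FILE",
    environ={"DJ_IGNORE_CONFIG_FILE": "yes"},
)
```

Passing the environment as an argument keeps the helper easy to test without mutating os.environ.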
.secrets/stores.<name>.<attr> accepts any attribute
New in 2.2.4
Any .secrets/stores.<name>.<attr> file loads into dj.config["stores"][<name>][<attr>], not just access_key / secret_key.
Previously, only .secrets/stores.<name>.access_key and .secrets/stores.<name>.secret_key were honored. Plugin-registered adapters often need other field names: a Databricks adapter wants a Bearer token, an HTTP adapter might want api_key, etc.
In 2.2.4, any file matching stores.<name>.<attr> under the secrets directory is loaded:
.secrets/
├── stores.uc.token          # Databricks Bearer token
├── stores.main.access_key   # S3 access key
└── stores.main.secret_key   # S3 secret key
Config-file values and DJ_STORES still take precedence: secrets only fill attributes that are not already set.
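The fill-only-missing rule for secret files can be sketched as follows. This is a hypothetical loader, not DataJoint's implementation: load_store_secrets is an invented name, and stripping surrounding whitespace from file contents is an assumption.

```python
import tempfile
from pathlib import Path


def load_store_secrets(secrets_dir, stores):
    """Hypothetical loader: fill missing store attributes from secret files."""
    for path in sorted(Path(secrets_dir).glob("stores.*.*")):
        if not path.is_file():
            continue
        # "stores.main.access_key" -> ("stores", "main", "access_key")
        _, name, attr = path.name.split(".", 2)
        value = path.read_text().strip()  # assumption: whitespace is trimmed
        # setdefault keeps any value that an earlier source already set.
        stores.setdefault(name, {}).setdefault(attr, value)
    return stores


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "stores.uc.token").write_text("dapi-placeholder\n")
    Path(tmp, "stores.main.secret_key").write_text("from-file\n")
    existing = {"main": {"protocol": "s3", "secret_key": "already-set"}}
    loaded = load_store_secrets(tmp, existing)
```

Here the uc store gains a token from its secret file, while the secret_key already present on main is left untouched.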
Storage-adapter plugin contract
New in 2.2.4
The datajoint.storage entry-point group is now part of the public API.
Third-party packages can register additional storage protocols (Databricks Unity Catalog Volumes, custom HTTP-based stores, lab archive systems) by declaring an entry point. The built-in file, s3, gcs, and azure protocols continue to be served by the existing internal dispatch in StorageBackend; migrating them onto the public adapter contract is tracked separately.
# pyproject.toml of a plugin package
[project.entry-points."datajoint.storage"]
databricks = "dj_databricks:DatabricksVolumesAdapter"
Once installed, the protocol name (databricks in the example) is accepted in any stores.<name>.protocol field, and DataJoint will use the adapter to construct the underlying fsspec filesystem.
See Storage Adapter API for the full plugin contract.
See Also
- What's New in 2.2: Previous release (isolated instances, thread-safe mode, graph-driven cascade)
- Release Notes (v2.2.4): GitHub changelog
- Manage Secrets: Updated for DJ_STORES and DJ_IGNORE_CONFIG_FILE
- Configure Object Storage: Env-var-only deployments
- Storage Adapter API: Plugin contract
- Configuration Reference: Full env-var table
- datajoint-python PR #1452: Implementation