Working with Instances¶
This tutorial introduces dj.Instance, which provides isolated database connections. You'll learn to:
- Create an Instance with explicit credentials
- Define tables using
inst.Schema() - Insert and query data (same API as the global pattern)
- Connect to multiple databases simultaneously
- Understand when to use Instances vs the global pattern
New in DataJoint 2.2. For the rationale behind Instances, see What's New in 2.2.
Setup¶
Instead of configuring dj.config and calling dj.Schema(), we create an Instance with explicit credentials:
import os
import datajoint as dj
inst = dj.Instance(
host=os.environ.get("DJ_HOST", "localhost"),
user=os.environ.get("DJ_USER", "root"),
password=os.environ.get("DJ_PASS", "tutorial"),
)
[2026-02-19 18:32:23] Loading configuration from /main/datajoint.json
[2026-02-19 18:32:23][WARNING]: SSL connection failed (connection to server at "postgres" (172.20.0.3), port 5432 failed: server does not support SSL, but SSL was required ). Falling back to non-SSL connection. To require SSL, set use_tls=True explicitly.
The Instance has its own config. Any settings we change here don't affect dj.config:
inst.config.safemode = False
print(f"Instance safemode: {inst.config.safemode}")
print(f"Global safemode: {dj.config.safemode}")
Instance safemode: False Global safemode: False
Define Tables¶
Create a Schema from the Instance. Table definitions work exactly like the global pattern:
schema = inst.Schema('tutorial_instances')
@schema
class Lab(dj.Manual):
definition = """
lab_id : int32
---
lab_name : varchar(100)
institution : varchar(200)
"""
@schema
class Researcher(dj.Manual):
definition = """
-> Lab
researcher_id : int32
---
researcher_name : varchar(100)
"""
Insert and Query Data¶
All data operations work identically—insert, restriction, projection, join, aggregation:
Lab.insert([
{'lab_id': 1, 'lab_name': 'Neural Dynamics', 'institution': 'MIT'},
{'lab_id': 2, 'lab_name': 'Vision Lab', 'institution': 'Stanford'},
])
Researcher.insert([
{'lab_id': 1, 'researcher_id': 1, 'researcher_name': 'Alice Chen'},
{'lab_id': 1, 'researcher_id': 2, 'researcher_name': 'Bob Park'},
{'lab_id': 2, 'researcher_id': 3, 'researcher_name': 'Carol Ruiz'},
])
# Restriction
Researcher & {'lab_id': 1}
| lab_id | researcher_id | researcher_name |
|---|---|---|
| 1 | 1 | Alice Chen |
| 1 | 2 | Bob Park |
Total: 2
# Join
Researcher * Lab
| lab_id | researcher_id | researcher_name | lab_name | institution |
|---|---|---|---|---|
| 1 | 1 | Alice Chen | Neural Dynamics | MIT |
| 1 | 2 | Bob Park | Neural Dynamics | MIT |
| 2 | 3 | Carol Ruiz | Vision Lab | Stanford |
Total: 3
# Aggregation
Lab.aggr(Researcher, num_researchers='count(*)')
| lab_id | num_researchers |
|---|---|
| 1 | 2 |
| 2 | 1 |
Total: 2
# Fetch
Researcher.to_dicts()
[{'lab_id': 1, 'researcher_id': 1, 'researcher_name': 'Alice Chen'},
{'lab_id': 1, 'researcher_id': 2, 'researcher_name': 'Bob Park'},
{'lab_id': 2, 'researcher_id': 3, 'researcher_name': 'Carol Ruiz'}]
Multiple Instances¶
The real power of Instances is connecting to multiple databases at once. Let's create a second Instance that points to a different schema on the same server (in practice, these could be entirely different servers):
inst2 = dj.Instance(
host=os.environ.get("DJ_HOST", "localhost"),
user=os.environ.get("DJ_USER", "root"),
password=os.environ.get("DJ_PASS", "tutorial"),
)
schema2 = inst2.Schema('tutorial_instances_2')
@schema2
class Project(dj.Manual):
definition = """
project_id : int32
---
project_name : varchar(200)
"""
Project.insert([
{'project_id': 1, 'project_name': 'Brain Mapping Initiative'},
{'project_id': 2, 'project_name': 'Visual Cortex Study'},
])
[2026-02-19 18:32:23] Loading configuration from /main/datajoint.json
[2026-02-19 18:32:23][WARNING]: SSL connection failed (connection to server at "postgres" (172.20.0.3), port 5432 failed: server does not support SSL, but SSL was required ). Falling back to non-SSL connection. To require SSL, set use_tls=True explicitly.
# Query each independently
print("Labs from instance 1:")
for row in Lab.to_dicts():
print(f" {row['lab_name']} ({row['institution']})")
print("\nProjects from instance 2:")
for row in Project.to_dicts():
print(f" {row['project_name']}")
Labs from instance 1: Neural Dynamics (MIT) Vision Lab (Stanford) Projects from instance 2: Brain Mapping Initiative Visual Cortex Study
Global Pattern vs Instance Pattern¶
The global pattern reads credentials from config files and environment variables:
# Global pattern (unchanged, still works)
import datajoint as dj
schema = dj.Schema('my_db') # uses dj.config credentials
The Instance pattern provides credentials explicitly:
# Instance pattern (new in 2.2)
inst = dj.Instance(host='localhost', user='root', password='secret')
schema = inst.Schema('my_db') # uses inst's own connection
Once a Schema exists, all table operations are the same regardless of how it was created.
When to use which:
| Scenario | Recommendation |
|---|---|
| Interactive notebook, one database | Global pattern |
| Web server, multi-tenant | Instance pattern |
| Testing with isolation | Instance pattern |
| Multiple databases at once | Instance pattern |
| Single-user script | Either works |
Summary¶
| Concept | Description |
|---|---|
dj.Instance(host, user, password) |
Create an isolated config + connection |
inst.Schema(name) |
Create a Schema using the Instance's connection |
inst.config |
Per-instance configuration |
inst.FreeTable(full_name) |
Access an existing table without a class |
DJ_THREAD_SAFE=true |
Disable global state, force Instance usage |
Next Steps¶
- What's New in 2.2 — Feature overview and rationale
- Use Isolated Instances — Task-oriented guide
- Configuration Reference — Thread-safe mode settings
# Cleanup
schema.drop(prompt=False)
schema2.drop(prompt=False)