JSON Data Type¶
The json type stores semi-structured data as complete objects. This tutorial covers:
- When to use JSON vs normalized tables
- Defining, inserting, and fetching JSON data
When to Use JSON¶
Good for:
- Evolving schemas not yet finalized
- Heterogeneous data with varying fields per entry
- Metadata and configuration storage
- Preserving structure from external APIs
Prefer normalized tables when:
- Structure is consistent across entries
- You need database-level filtering on specific fields
- Referential integrity matters
JSON fields are stored and retrieved as complete objects. Filter on JSON content in Python after fetching.
In [1]:
Copied!
import datajoint as dj
schema = dj.Schema('tutorial_json')
import datajoint as dj
schema = dj.Schema('tutorial_json')
[2026-02-06 11:44:28] DataJoint 2.1.0 connected to datajoint@127.0.0.1:5432
Table Definition¶
In [2]:
Copied!
@schema
class Equipment(dj.Manual):
definition = """
equipment_id : int32
---
name : varchar(100)
specs=null : json # flexible specifications
"""
@schema
class Equipment(dj.Manual):
definition = """
equipment_id : int32
---
name : varchar(100)
specs=null : json # flexible specifications
"""
Inserting JSON Data¶
Pass Python dicts for JSON fields. Structure can vary between entries:
In [3]:
Copied!
Equipment.insert([
{
'equipment_id': 1,
'name': 'Microscope A',
'specs': {
'magnification': [10, 40, 100],
'camera': {'model': 'XR500', 'resolution': [2048, 2048]},
'calibrated': True,
},
},
{
'equipment_id': 2,
'name': 'Electrode Array',
'specs': {
'channels': 64,
'impedance_kohm': 0.5,
'material': 'tungsten',
},
},
{
'equipment_id': 3,
'name': 'Pending Setup',
'specs': None, # null allowed
},
])
Equipment.insert([
{
'equipment_id': 1,
'name': 'Microscope A',
'specs': {
'magnification': [10, 40, 100],
'camera': {'model': 'XR500', 'resolution': [2048, 2048]},
'calibrated': True,
},
},
{
'equipment_id': 2,
'name': 'Electrode Array',
'specs': {
'channels': 64,
'impedance_kohm': 0.5,
'material': 'tungsten',
},
},
{
'equipment_id': 3,
'name': 'Pending Setup',
'specs': None, # null allowed
},
])
Viewing Data¶
JSON displays as json in previews (like blobs):
In [4]:
Copied!
Equipment()
Equipment()
Out[4]:
| equipment_id | name | specs |
|---|---|---|
| 1 | Microscope A | json |
| 2 | Electrode Array | json |
| 3 | Pending Setup | json |
Total: 3
Fetching JSON Data¶
JSON deserializes to Python dicts on fetch:
In [5]:
Copied!
# Fetch all
for row in Equipment.to_dicts():
print(f"{row['name']}: {row['specs']}")
# Fetch all
for row in Equipment.to_dicts():
print(f"{row['name']}: {row['specs']}")
Microscope A: {'camera': {'model': 'XR500', 'resolution': [2048, 2048]}, 'calibrated': True, 'magnification': [10, 40, 100]}
Electrode Array: {'channels': 64, 'material': 'tungsten', 'impedance_kohm': 0.5}
Pending Setup: None
In [6]:
Copied!
# Fetch one and access nested fields
microscope = (Equipment & {'equipment_id': 1}).fetch1()
specs = microscope['specs']
print(f"Camera model: {specs['camera']['model']}")
print(f"Magnifications: {specs['magnification']}")
# Fetch one and access nested fields
microscope = (Equipment & {'equipment_id': 1}).fetch1()
specs = microscope['specs']
print(f"Camera model: {specs['camera']['model']}")
print(f"Magnifications: {specs['magnification']}")
Camera model: XR500 Magnifications: [10, 40, 100]
Filtering on JSON Content¶
Fetch then filter in Python:
In [7]:
Copied!
# Find calibrated equipment
calibrated = [
e for e in Equipment.to_dicts()
if e['specs'] and e['specs'].get('calibrated')
]
print("Calibrated:", [e['name'] for e in calibrated])
# Find equipment with >32 channels
multi_channel = [
e for e in Equipment.to_dicts()
if e['specs'] and e['specs'].get('channels', 0) > 32
]
print("Multi-channel:", [e['name'] for e in multi_channel])
# Find calibrated equipment
calibrated = [
e for e in Equipment.to_dicts()
if e['specs'] and e['specs'].get('calibrated')
]
print("Calibrated:", [e['name'] for e in calibrated])
# Find equipment with >32 channels
multi_channel = [
e for e in Equipment.to_dicts()
if e['specs'] and e['specs'].get('channels', 0) > 32
]
print("Multi-channel:", [e['name'] for e in multi_channel])
Calibrated: ['Microscope A'] Multi-channel: ['Electrode Array']
Updating JSON Data¶
In [8]:
Copied!
# Fetch, modify, delete, reinsert
pending = (Equipment & {'equipment_id': 3}).fetch1()
pending['specs'] = {'type': 'behavioral', 'sensors': ['IR', 'pressure']}
(Equipment & {'equipment_id': 3}).delete(prompt=False)
Equipment.insert1(pending)
(Equipment & {'equipment_id': 3}).fetch1()
# Fetch, modify, delete, reinsert
pending = (Equipment & {'equipment_id': 3}).fetch1()
pending['specs'] = {'type': 'behavioral', 'sensors': ['IR', 'pressure']}
(Equipment & {'equipment_id': 3}).delete(prompt=False)
Equipment.insert1(pending)
(Equipment & {'equipment_id': 3}).fetch1()
[2026-02-06 11:44:28] Deleting 1 rows from "tutorial_json"."equipment"
Out[8]:
{'equipment_id': 3,
'name': 'Pending Setup',
'specs': {'type': 'behavioral', 'sensors': ['IR', 'pressure']}}
Design Guidelines¶
| JSON | Normalized Tables |
|---|---|
| Flexible schema | Fixed schema |
| Filter in Python | Filter in SQL |
| No type enforcement | Type-safe |
| No referential integrity | FK constraints |
Tips:
- Use JSON for metadata, configs, external API responses
- If you frequently filter on a field, normalize it to a column
- Document expected JSON structure in comments
Cleanup¶
In [9]:
Copied!
schema.drop(prompt=False)
schema.drop(prompt=False)