Database Backends Specification¶
New in 2.1
PostgreSQL support was introduced in DataJoint 2.1.
DataJoint supports multiple database backends through a unified adapter architecture.
Supported Backends¶
| Backend | Minimum Version | Default Port | Status |
|---|---|---|---|
| MySQL | 8.0 | 3306 | Production |
| PostgreSQL | 15 | 5432 | Production |
Configuration¶
Select the backend via configuration:
import datajoint as dj
dj.config['database.backend'] = 'mysql' # Default
dj.config['database.backend'] = 'postgresql' # PostgreSQL
Or via environment variable:
export DJ_BACKEND=postgresql
Adapter Architecture¶
DataJoint uses database adapters to generate backend-specific SQL while maintaining a consistent API.
flowchart TB
subgraph api[DataJoint API]
tables[Tables, Queries, Schemas]
end
subgraph adapter[Database Adapter]
gen[SQL Generation, Type Mapping]
end
subgraph backends[Backend Adapters]
mysql[MySQL Adapter]
postgres[PostgreSQL Adapter]
end
api --> adapter
adapter --> mysql
adapter --> postgres
Backend Compatibility¶
Fully Compatible¶
The following features work identically across all backends:
- Table definitions: Same definition syntax for all backends
- Core types:
int32,float64,varchar,datetime, etc. - Codec types:
<blob>,<attach>,<object@>, etc. - Query operations: Restriction, projection, join, aggregation
- Foreign keys: Inheritance, nullable, unique modifiers
- Indexes: Single-column, composite, unique
- Auto-populate: Jobs queue, distributed computation
Backend-Specific Behavior¶
| Feature | MySQL | PostgreSQL |
|---|---|---|
| Default port | 3306 | 5432 |
| Schema = Database | Yes | Yes (uses schemas) |
| JSON operators | ->, ->> |
->, ->> |
| BLOB storage | LONGBLOB |
BYTEA |
| Boolean type | TINYINT(1) |
BOOLEAN |
String Quoting¶
MySQL and PostgreSQL handle quotes differently in SQL:
| Quote Type | MySQL | PostgreSQL |
|---|---|---|
Single quotes '...' |
String literals | String literals |
Double quotes "..." |
String literals | Identifiers (column names) |
Backticks `...` |
Identifiers | Not supported |
This affects restriction strings. MySQL accepts both quote styles for string values, but PostgreSQL interprets double quotes as identifier (column) references.
# MySQL only - double quotes work as string literals
Table & 'name = "Alice"'
# Both backends - single quotes for string literals
Table & "name = 'Alice'"
PostgreSQL migration: Replace double quotes with single quotes inside SQL restriction strings:
# Before (MySQL)
Table & 'strain = "C57BL/6"'
Table & 'date > "2024-01-01"'
# After (PostgreSQL compatible)
Table & "strain = 'C57BL/6'"
Table & "date > '2024-01-01'"
Dictionary restrictions handle quoting automatically but only support equality:
Table & {'name': 'Alice'} # Equality only - backend-agnostic
For range comparisons (>, <, LIKE, etc.), use string restrictions with single-quoted values.
SQL Function Translation¶
DataJoint automatically translates certain SQL functions between backends, allowing portable code:
| Function | MySQL | PostgreSQL |
|---|---|---|
| String aggregation | GROUP_CONCAT(col) |
STRING_AGG(col, ',') |
| String aggregation with separator | GROUP_CONCAT(col SEPARATOR ';') |
STRING_AGG(col, ';') |
You can use either syntax in your code—DataJoint translates to the appropriate form:
# Both work on both backends
Person.aggr(Proficiency, languages='GROUP_CONCAT(lang_code)')
Person.aggr(Proficiency, languages="STRING_AGG(lang_code, ',')")
The translation is bidirectional:
- On PostgreSQL:
GROUP_CONCAT(col)→STRING_AGG(col::text, ',') - On MySQL:
STRING_AGG(col, ',')→GROUP_CONCAT(col)
Type Mapping¶
DataJoint core types map to native database types:
| Core Type | MySQL | PostgreSQL |
|---|---|---|
int8 |
TINYINT |
SMALLINT |
int16 |
SMALLINT |
SMALLINT |
int32 |
INT |
INTEGER |
int64 |
BIGINT |
BIGINT |
float32 |
FLOAT |
REAL |
float64 |
DOUBLE |
DOUBLE PRECISION |
bool |
TINYINT(1) |
BOOLEAN |
varchar(n) |
VARCHAR(n) |
VARCHAR(n) |
char(n) |
CHAR(n) |
CHAR(n) |
date |
DATE |
DATE |
datetime |
DATETIME |
TIMESTAMP |
datetime(n) |
DATETIME(n) |
TIMESTAMP(n) |
json |
JSON |
JSONB |
uuid |
BINARY(16) |
UUID |
bytes |
LONGBLOB |
BYTEA |
Connection Management¶
Both backends support the same connection patterns:
# Singleton connection (default)
conn = dj.conn()
# Context manager (explicit lifecycle)
with dj.Connection(**creds) as conn:
schema = dj.Schema('my_schema', connection=conn)
# Reset connection
conn = dj.conn(reset=True)
Testing Against Multiple Backends¶
For development and CI/CD, you can test against both backends:
# Test MySQL only
DJ_BACKEND=mysql pytest tests/
# Test PostgreSQL only
DJ_BACKEND=postgresql pytest tests/
Migration Between Backends¶
DataJoint does not provide automatic migration between backends. To migrate data:
- Export data using
fetch()and pandas DataFrames - Create new schemas on the target backend
- Import data using
insert()
Schema definitions are portable—the same Python class definitions work on both backends.
See Also¶
- Configure Database — Connection setup
- Type System — Core type definitions
- Table Declaration — Definition syntax