Blob

Binary serialization

Binary serialization for DataJoint blob storage.

Provides (de)serialization for Python/NumPy objects with backward compatibility for MATLAB mYm-format blobs. Supports arrays, scalars, structs, cells, and Python built-in types (dict, list, tuple, set, datetime, UUID, Decimal).

MatCell

Bases: ndarray

NumPy ndarray subclass representing a MATLAB cell array.

Used to distinguish cell arrays from regular arrays during serialization for MATLAB compatibility.

MatStruct

Bases: recarray

NumPy recarray subclass representing a MATLAB struct array.

Used to distinguish struct arrays from regular recarrays during serialization for MATLAB compatibility.
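
As an illustration of how a subclass can act purely as a type tag, here is a minimal sketch. The names CellLike, StructLike, and describe are hypothetical stand-ins, not the library's classes, and the dispatch order is an assumption about how a serializer might tell the types apart.

```python
import numpy as np

# Hypothetical stand-ins for illustration; not the library's own classes.
class CellLike(np.ndarray):
    """Marks an object array as a MATLAB-style cell array."""

class StructLike(np.recarray):
    """Marks a record array as a MATLAB-style struct array."""

def describe(obj):
    # Check the marker subclasses first, because they are also ndarrays.
    if isinstance(obj, CellLike):
        return "cell"
    if isinstance(obj, StructLike):
        return "struct"
    if isinstance(obj, np.ndarray):
        return "array"
    return "other"

plain = np.empty(2, dtype=object)
plain[0], plain[1] = [1, 2], "text"
cell = plain.view(CellLike)      # same data, different Python type

rec = np.array([(1, 2.0)], dtype=[("a", "i4"), ("b", "f8")])
struct = rec.view(StructLike)    # same data, tagged as a struct
```

The view shares memory with the original array, so tagging costs nothing beyond the new Python object.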

Blob

Binary serializer/deserializer for DataJoint blob storage.

Handles packing Python objects into binary format and unpacking binary data back to Python objects. Supports two protocols:

  • mYm: Original MATLAB-compatible format (default)
  • dj0: Extended format for Python-specific types
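
A reader can tell the two protocols apart by their markers. The sketch below assumes that an uncompressed payload begins with the marker bytes documented for the protocol attribute; detect_protocol is a hypothetical helper, not part of the library.

```python
# Protocol markers as documented for the `protocol` attribute.
PROTOCOLS = {b"mYm\0": "mYm", b"dj0\0": "dj0"}

def detect_protocol(payload: bytes):
    """Hypothetical helper: return the protocol name, or None if unrecognized."""
    for marker, name in PROTOCOLS.items():
        # Assumption: the marker is the very first thing in the payload.
        if payload.startswith(marker):
            return name
    return None
```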

Parameters:

Name Type Description Default
squeeze bool

If True, remove singleton dimensions from arrays and convert 0-dimensional arrays to scalars. Default False.

False

Attributes:

Name Type Description
protocol bytes or None

Current serialization protocol (b"mYm\0" or b"dj0\0").

set_dj0

set_dj0()

Switch to dj0 protocol for extended type support.

squeeze

squeeze(array, convert_to_scalar=True)

Remove singleton dimensions from an array.

Parameters:

Name Type Description Default
array ndarray

Input array.

required
convert_to_scalar bool

If True, convert 0-dimensional arrays to Python scalars. Default True.

True

Returns:

Type Description
ndarray or scalar

Squeezed array or scalar value.
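
The behavior described above can be approximated in plain NumPy. This is a sketch under the assumption that the method behaves like np.squeeze followed by an optional scalar conversion; squeeze_sketch is a hypothetical name, not the library's implementation.

```python
import numpy as np

def squeeze_sketch(array, convert_to_scalar=True):
    """Hypothetical re-implementation for illustration only."""
    squeezed = np.squeeze(array)  # drop every dimension of length 1
    if convert_to_scalar and squeezed.ndim == 0:
        return squeezed.item()    # 0-dimensional array -> Python scalar
    return squeezed

row = squeeze_sketch(np.zeros((1, 3, 1)))                       # shape (3,)
value = squeeze_sketch(np.array([[7]]))                         # Python int 7
kept = squeeze_sketch(np.array([[7]]), convert_to_scalar=False)  # 0-d array
```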

pack_array

pack_array(array)

Serialize a NumPy array into bytes.

Parameters:

Name Type Description Default
array ndarray

Array to serialize. Scalars are encoded with ndim=0.

required

Returns:

Type Description
bytes

Serialized array data.
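
For readers unfamiliar with 0-dimensional arrays, the following NumPy-only snippet shows what ndim=0 means in practice. It does not call pack_array; it only contrasts a wrapped scalar with a one-element vector.

```python
import numpy as np

scalar = np.asarray(3.5)    # a Python float wrapped as a 0-dimensional array
vector = np.asarray([3.5])  # a one-element vector, for contrast

print(scalar.ndim, scalar.shape)  # 0 ()
print(vector.ndim, vector.shape)  # 1 (1,)
```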

read_recarray

read_recarray()

Deserialize a NumPy ndarray with fields (a structured array), including recarrays.

pack_recarray

pack_recarray(array)

Serialize a NumPy ndarray with fields (a structured array), including recarrays.

read_struct

read_struct()

Deserialize a MATLAB struct array.

pack_struct

pack_struct(array)

Serialize a MATLAB struct array.

read_cell_array

read_cell_array()

Deserialize MATLAB cell array.

Handles edge cases from MATLAB:

  • Empty cell arrays ({})
  • Cell arrays with empty elements ({[], [], []})
  • Nested arrays ({[1,2], [3,4,5]}), i.e. ragged arrays
  • Cell matrices with mixed content
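
To make these edge cases concrete on the Python side, the sketch below builds each one as a NumPy object array. It assumes cell arrays map onto object arrays, which MatCell being an ndarray subclass suggests but this page does not state; as_object_array is a hypothetical helper.

```python
import numpy as np

def as_object_array(items):
    """Hypothetical helper: one object-array slot per item, no broadcasting."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # store the item itself, whatever its shape
    return out

empty_cell = as_object_array([])                                   # like {}
empty_elems = as_object_array([np.array([]) for _ in range(3)])    # like {[], [], []}
ragged = as_object_array([np.array([1, 2]), np.array([3, 4, 5])])  # like {[1,2], [3,4,5]}
mixed = as_object_array([np.array([1.0]), "text", None])           # mixed content
```

Filling the slots one at a time avoids NumPy's attempt to broadcast equal-length elements into a regular 2-D array.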

read_datetime

read_datetime()

Deserialize a datetime.date, datetime.time, or datetime.datetime object.

pack

pack(obj, compress=True)

Serialize a Python object to binary blob format.

Parameters:

Name Type Description Default
obj any

Object to serialize. Supports NumPy arrays, Python scalars, collections (dict, list, tuple, set), datetime objects, UUID, Decimal, and MATLAB-compatible MatCell/MatStruct.

required
compress bool

If True (default), compress blobs larger than 1000 bytes using zlib.

True

Returns:

Type Description
bytes

Serialized binary data.

Raises:

Type Description
DataJointError

If the object type is not supported.
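
The size-based compression rule for the compress parameter can be sketched with the standard zlib module. The one-byte flags below are hypothetical and chosen only to keep the example self-contained; they are not the library's actual header layout.

```python
import zlib

THRESHOLD = 1000    # bytes, per the `compress` description above
COMPRESSED = b"Z"   # hypothetical flag, not the library's header
STORED = b"\0"      # hypothetical flag for "stored as-is"

def maybe_compress(payload: bytes, compress: bool = True) -> bytes:
    """Compress only when enabled and the payload exceeds the threshold."""
    if compress and len(payload) > THRESHOLD:
        return COMPRESSED + zlib.compress(payload)
    return STORED + payload

def restore(data: bytes) -> bytes:
    """Undo maybe_compress by inspecting the leading flag byte."""
    flag, body = data[:1], data[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body

small = b"abc"
big = b"a" * 5000
```

Skipping compression for small payloads avoids paying zlib's fixed overhead where it cannot save space.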

Examples:

>>> import numpy as np
>>> from datajoint.blob import pack, unpack
>>> data = np.array([1, 2, 3])
>>> blob = pack(data)
>>> unpacked = unpack(blob)

unpack

unpack(blob, squeeze=False)

Deserialize a binary blob to a Python object.

Parameters:

Name Type Description Default
blob bytes

Binary data from pack() or MATLAB mYm serialization.

required
squeeze bool

If True, remove singleton dimensions from arrays. Default False.

False

Returns:

Type Description
any

Deserialized Python object.

Examples:

>>> from datajoint.blob import pack, unpack
>>> blob = pack({'a': 1, 'b': [1, 2, 3]})
>>> data = unpack(blob)
>>> data['b']
[1, 2, 3]