Motivation
The serialization protocols implemented by this module have the following aims:
Readability / ease of manual editing of the serialized format.
Ease of extensibility with minimal coding overhead - Most classes can become serializable simply by decorating them with @serializable(). More flexibility can be obtained by deriving from Serializable.
Code unobtrusiveness - Custom objects can also be made serializable by instead implementing a stand-alone TypeSerializer, setting handled_type to the class to make serializable. This approach offers the most flexibility.
Syntax unobtrusiveness - JSON/YAML-compatible base types (numeric types, str, list, dict) are converted to serializable objects without any added verbosity [1]. Custom serializable types are serialized as dictionaries with a __type__ key.
Builtin type (tuple, set, slice) support out-of-the-box.
Numpy support (numpy.dtype, numpy.ndarray, numpy.datetime64), including (nested and/or shaped) structured dtypes, out-of-the-box.
Safety - Only Serializable objects or those with a TypeSerializer will be deserialized into objects by Serializer, and users have fine-grained control of enabled third-party and builtin plugins.
Todo
Document numpy support, including the _as_bytes options and how to enable them.
Syntax
See Examples for example serializations of various builtin Python types.
Syntax Overhead
JSON/YAML-compatible base types are converted to serializable objects without any added verbosity. The exception is dictionaries that contain the key __type__. Such dictionaries are represented in the following more verbose form:
{'__type__': 'dict',
'value': <original dictionary>}
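The escaping rule above can be pictured with a small stand-alone sketch. The helper names escape_dict and unescape_dict are hypothetical and are not part of xerializer; the code only mirrors the verbose representation shown above and makes no claim about how the library implements it.

```python
def escape_dict(d):
    """Wrap a plain dict in the verbose form only if it contains '__type__'."""
    if "__type__" in d:
        return {"__type__": "dict", "value": d}
    return d


def unescape_dict(d):
    """Invert escape_dict: unwrap the verbose form, pass other dicts through."""
    if d.get("__type__") == "dict":
        return d["value"]
    return d


plain = {"a": 1}
tricky = {"__type__": "not a real type", "a": 1}

escaped_plain = escape_dict(plain)    # unchanged, no added verbosity
escaped_tricky = escape_dict(tricky)  # wrapped in the verbose form
```

Only dictionaries that would otherwise be mistaken for a typed object pay the extra verbosity; every other dictionary is emitted as-is.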
Composability
Getting started
from xerializer import Serializer
serializer = Serializer()
# Get a human-readable string representation of a supported object
my_object = [{'key1':'val1', 'key2':[1,2,3]},
('tuple1', 'tuple2'),
{'set1', 'set2'},
slice(None,30)]
my_object_str = serializer.serialize(my_object)
assert my_object == serializer.deserialize(my_object_str)
print(my_object_str)
[{"key1": "val1", "key2": [1, 2, 3]}, {"__type__": "tuple", "value": ["tuple1", "tuple2"]}, {"__type__": "set", "value": ["set1", "set2"]}, {"__type__": "slice", "stop": 30}]
Examples of all builtin types can be found in Examples.
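To make the representation above concrete, the following is a minimal stand-alone sketch, using only the standard json module, of how tuple, set and slice objects could be mapped to and from dictionaries with a __type__ key. The functions encode and decode are hypothetical illustrations, not the xerializer implementation, and they ignore the case of plain dictionaries that themselves contain a __type__ key.

```python
import json


def encode(obj):
    """Recursively replace tuple/set/slice with tagged dictionaries."""
    if isinstance(obj, tuple):
        return {"__type__": "tuple", "value": [encode(x) for x in obj]}
    if isinstance(obj, set):
        # Sorting gives a stable output; it assumes mutually comparable items.
        return {"__type__": "set", "value": [encode(x) for x in sorted(obj)]}
    if isinstance(obj, slice):
        out = {"__type__": "slice"}
        for name in ("start", "stop", "step"):
            if getattr(obj, name) is not None:
                out[name] = getattr(obj, name)
        return out
    if isinstance(obj, list):
        return [encode(x) for x in obj]
    if isinstance(obj, dict):
        return {k: encode(v) for k, v in obj.items()}
    return obj


def decode(obj):
    """Invert encode on the output of json.loads."""
    if isinstance(obj, list):
        return [decode(x) for x in obj]
    if isinstance(obj, dict):
        kind = obj.get("__type__")
        if kind == "tuple":
            return tuple(decode(x) for x in obj["value"])
        if kind == "set":
            return set(decode(x) for x in obj["value"])
        if kind == "slice":
            return slice(obj.get("start"), obj.get("stop"), obj.get("step"))
        return {k: decode(v) for k, v in obj.items()}
    return obj


my_object = [{"key1": "val1", "key2": [1, 2, 3]},
             ("tuple1", "tuple2"),
             {"set1", "set2"},
             slice(None, 30)]
text = json.dumps(encode(my_object))
restored = decode(json.loads(text))
```

Base types pass through untouched, which is why the first list element is serialized without any added keys.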
Serializing custom types
Todo
Change intro to reflect @serializable discussion. Move this discussion to the top of the list.
There are two approaches to supporting custom types: having the custom type derive from Serializable, which requires modifying the source code for that class, or creating a stand-alone TypeSerializer, a less obtrusive approach.
Regardless of the approach, the user is responsible for implementing a method as_serializable() that maps an instance of the custom class to a dictionary whose values are builtin or custom serializable types. By default, this dictionary is used as keyword arguments for method from_serializable() which is, by default, the handled type’s __init__().
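The contract between the two methods matters most when the serialized fields do not match the initializer arguments. The sketch below uses a hypothetical plain Python class, Interval, that does not derive from any xerializer class; it only illustrates why one might override from_serializable() instead of relying on the default of calling __init__() with the dictionary as keyword arguments.

```python
class Interval:
    """Hypothetical class whose stored fields differ from its __init__ arguments."""

    def __init__(self, center, width):
        self.lo = center - width / 2
        self.hi = center + width / 2

    def as_serializable(self):
        # Serialize the stored bounds rather than the __init__ arguments.
        return {"lo": self.lo, "hi": self.hi}

    @classmethod
    def from_serializable(cls, lo, hi):
        # Map the serialized fields back to what __init__ expects.
        return cls(center=(lo + hi) / 2, width=hi - lo)


original = Interval(center=10, width=4)
payload = original.as_serializable()
restored = Interval.from_serializable(**payload)
```

If as_serializable() had returned center and width instead, the default behavior described above would suffice and no override would be needed.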
… by making the class a Serializable
A custom type can be made serializable by inheriting from xerializer.Serializable and implementing method as_serializable(). Optionally, the custom class can also implement class method from_serializable() and set class attributes signature and register.
# CREATING A SERIALIZABLE CLASS
#############################################################
# A new class that derives from `Serializable`
#############################################################
from xerializer import Serializable

class MySerializable(Serializable):
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

    ############################
    # Required
    ############################
    def as_serializable(self):
        return {'arg1': self.arg1, 'arg2': self.arg2}

    ############################
    # Optional (defaults shown)
    ############################
    @classmethod
    def from_serializable(cls, **kwargs):
        return cls(**kwargs)

    signature = '<module name>:MySerializable'
    register = True
###################################################
# To serialize a type, `MySerializable` needs to
# be declared before `Serializer` is instantiated.
###################################################
from xerializer import Serializer
print(Serializer().serialize(MySerializable(1,2)))
{"__type__": "my_serializable_module.MySerializable", "arg1": 1, "arg2": 2}
… with a stand-alone TypeSerializer
For classes that already exist, one can instead create a standalone type serializer without needing to modify the original source code:
# CREATING A STANDALONE TYPE SERIALIZER FOR AN EXISTING CLASS
#############################################################
# An existing class that cannot be modified
# in module `my_package.my_non_serializable_module`
#############################################################
class MyNonSerializable:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

####################################################
# A type serializer to handle `MyNonSerializable`
# in module `_my_xerializer`
####################################################
# from my_package.my_non_serializable_module import MyNonSerializable
from xerializer import TypeSerializer

class MyClassSerializer(TypeSerializer):
    # Required
    handled_type = MyNonSerializable

    def as_serializable(self, obj):
        return {'arg1': obj.arg1, 'arg2': obj.arg2}

    # Optional (defaults shown)
    # (TypeSerializer.from_serializable is a regular **instance** method)
    def from_serializable(self, **kwargs):
        return self.handled_type(**kwargs)

    signature = 'my_non_serializable_module:MyNonSerializable'
    register = True
####################################
# Serializing the custom type
####################################
# To serialize a type, the custom module containing MyClassSerializer needs to be imported
# before the Serializer is instantiated. Importing the module declares the class,
# which automatically registers it
# import _my_xerializer
from xerializer import Serializer
print(Serializer().serialize(MyNonSerializable(1,2)))
{"__type__": "my_non_serializable_module.MyNonSerializable", "arg1": 1, "arg2": 2}
… with the serializable class decorator
The module also exposes an @serializable() class decorator that greatly simplifies the process of making custom types serializable for the special case of classes that
are initialized only with serializable arguments and
have an initializer signature that can be introspected with inspect.signature. This includes the vast majority of initializers, including those with *args and **kwargs arguments.
Classes decorated with @serializable() will have the __init__ method wrapped in a function that appends an attribute _xerializable_params to the instantiated object. The decorator can also be used as a stand-alone function to make an existing class serializable – note that this also modifies the class initializer and needs to be done before instantiating the class.
Unlike classes deriving from xerializer.Serializable, classes derived from @serializable()-decorated classes do not inherit the serializable quality.
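The wrapping step described above can be sketched with the standard library alone. The decorator record_init_params below is a hypothetical stand-in, not the actual @serializable() implementation; it assumes that the recorded parameters are simply the arguments bound by inspect.signature, stored under the _xerializable_params attribute named in the text.

```python
import functools
import inspect


def record_init_params(cls):
    """Hypothetical stand-in: wrap __init__ so it records its bound arguments."""
    original_init = cls.__init__
    signature = inspect.signature(original_init)
    # Name of the instance parameter (conventionally 'self').
    self_name = next(iter(signature.parameters))

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()  # fill in defaults such as b=2
        params = dict(bound.arguments)
        params.pop(self_name)
        self._xerializable_params = params
        original_init(self, *args, **kwargs)

    cls.__init__ = wrapped_init
    return cls


@record_init_params
class Example:
    def __init__(self, a, *args, b=2, **kwargs):
        self.a = a
        self.b = b


obj = Example(1, 2, 3, c=20)
recorded = obj._xerializable_params
```

Because the class initializer itself is replaced, the wrapping has to happen before any instance is created, which matches the ordering requirement stated above.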
Example: Using serializable as a decorator
serializable() can be used as a class, method or function decorator, automatically making instances of these objects serializable, and calls to these methods or functions de-serializable.
from xerializer import Serializer, serializable
#####################################################
# Using serializable as a decorator.
# `signature` optional, defaults to fully qualified
# class name.
#####################################################
@serializable(signature='MyClass1')
class MyClass1:
    def __init__(self, a, b=2):
        self.a = a
        self.b = b

    def __eq__(self, x):
        return self.a == x.a and self.b == x.b
###########################
# Serializing/deserializing
###########################
# The `serializable`-decorated class declaration needs to happen before `Serializer` is instantiated.
>>> srlzr = Serializer()
>>> mc1 = MyClass1(1)
>>> mc1_srlzd = srlzr.serialize(mc1)
>>> assert mc1 == srlzr.deserialize(mc1_srlzd)
>>> print(mc1_srlzd)
{"__type__": "MyClass1", "a": 1, "b": 2}
Example: Using serializable as a function
Using serializable as a function makes it possible to register classes without modifying their source code, similarly to the approach that uses a stand-alone type serializer.
#########################################
# The target class to make serializable.
#########################################
class MyClass2(MyClass1):
    def __init__(self, a, *args, b=2, **kwargs):
        self.a = a
        self.b = b
#########################################
# Using `serializable` as a function
#########################################
# Setting explicit_defaults=False means that defaults such as b=2 are
# not serialized.
# The default is explicit_defaults=True.
MyClass2 = serializable(explicit_defaults=False, signature='MyClass2')(MyClass2)
###########################
# Serializing/deserializing
###########################
# The `serializable` call needs to happen before `Serializer` is instantiated.
>>> srlzr = Serializer()
#
>>> mc2 = MyClass2(3)
>>> mc2_srlzd = srlzr.serialize(mc2)
>>> assert mc2 == srlzr.deserialize(mc2_srlzd)
>>> print(mc2_srlzd)
{"__type__": "MyClass2", "a": 3}
Decorator serialization syntax
Type serializers generated automatically with the @serializable decorator will attempt to produce serializations that are compact and human-readable:
print(srlzr.serialize(MyClass2(1, 2, 3, b=10, c=20, d=30)))
{"__type__": "MyClass2", "a": 1, "args": [2, 3], "b": 10, "c": 20, "d": 30}
Todo
Won’t these clashes also happen if a keyword argument is named kwargs?
This syntax will create name clashes when one of the variable keywords has the same name 'args' as the variable positional argument *args, a situation that is detected automatically and addressed with a more verbose syntax:
# The keyword 'args' has the same name as the variable positional
# argument '*args' in the signature of MyClass2.__init__
print(srlzr.serialize(MyClass2(1, 2, 3, b=10, c=20, d=30, args=40)))
{"__type__": "MyClass2", "a": 1, "args": [2, 3], "b": 10, "kwargs": {"c": 20, "d": 30, "args": 40}}
The serializable() decorator takes a kwargs_level argument that can be used to explicitly choose the more compact syntax (kwargs_level='root') in situations where the user is sure no clashes will occur (detected name clashes will raise an exception). The more verbose but safe syntax can also be set explicitly (kwargs_level='safe'). By default, the choice is made automatically on-the-fly (kwargs_level='auto').
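The three kwargs_level modes can be summarized with a small stand-alone sketch. The function layout_arguments and its parameters are hypothetical and only reproduce the behavior described in this section: variable keywords are placed at the root unless one of them is named like the variable positional argument. Whether a keyword named like the variable keyword argument needs the same treatment is not covered here.

```python
def layout_arguments(named, var_args, var_kwargs, kwargs_level="auto",
                     var_positional_name="args", var_keyword_name="kwargs"):
    """Hypothetical helper: choose the compact ('root') or verbose ('safe') layout."""
    if kwargs_level not in ("auto", "root", "safe"):
        raise ValueError(f"Invalid kwargs_level {kwargs_level!r}.")

    # A clash occurs when a variable keyword reuses the name of *args.
    clash = var_positional_name in var_kwargs
    if kwargs_level == "root" and clash:
        raise ValueError(f"Name clash on {var_positional_name!r} at root level.")
    use_root = kwargs_level == "root" or (kwargs_level == "auto" and not clash)

    out = dict(named)
    if var_args:
        out[var_positional_name] = list(var_args)
    if use_root:
        out.update(var_kwargs)  # compact: keywords sit next to named arguments
    elif var_kwargs:
        out[var_keyword_name] = dict(var_kwargs)  # verbose: nested dictionary
    return out


compact = layout_arguments({"a": 1, "b": 10}, [2, 3], {"c": 20, "d": 30})
verbose = layout_arguments({"a": 1, "b": 10}, [2, 3],
                           {"c": 20, "d": 30, "args": 40})
```

With kwargs_level="root", the second call would raise instead of silently switching to the nested layout.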
Registering custom types
By default, all non-abstract classes derived from TypeSerializer (including those generated automatically for non-abstract Serializable-derived types, and those decorated with serializable()) are automatically registered by module xerializer. This means that any Serializer instantiated after their definition will by default include those plugins.
This behavior can be customized (except for the decorator syntax) using class variable register and metaclass variable register_meta. Both variables can be used when deriving from either Serializable or TypeSerializer.
Using the register class variable
Class variable register specifies whether a given class and all classes derived from it are registered (only non-abstract Serializable or TypeSerializer-derived classes are registered):
from xerializer import TypeSerializer, get_registered_serializers
class MyClass:
    pass

class MyTypeSerializer(TypeSerializer):
    """
    This and all derived classes are registered automatically because
    they are non-abstract and TypeSerializer.register=True.
    """
    handled_type = MyClass

    def as_serializable(self, obj):
        pass

class MyTypeSerializerUnregistered(MyTypeSerializer):
    """
    This and all derived classes are not registered automatically despite
    being non-abstract since register=False.
    """
    register = False
print(get_registered_serializers())
{'as_serializable': [<class 'MyTypeSerializer'>], 'from_serializable': [<class 'MyTypeSerializer'>]}
Using the register_meta keyword
Metaclass keyword register_meta is passed in as a class definition keyword argument and can be one of None, True, False. If None (the default), it has no effect. If True or False, it overrides the register class variable but only affects the class being defined and not its children:
class MyChildSerializer(TypeSerializer, register_meta=False):
    """
    This class is not registered despite being non-abstract since register_meta is False.
    All derived classes will be registered since register=True.
    """
    register = True
    handled_type = MyClass

    def as_serializable(self, obj):
        pass

class MyGrandchildSerializer(MyChildSerializer):
    """
    This class is registered since its parent has register=True.
    """
    pass
print(get_registered_serializers())
{'as_serializable': [<class 'MyGrandchildSerializer'>], 'from_serializable': [<class 'MyGrandchildSerializer'>]}
Using register_meta=True is also a good way to debug class registration issues, as it will force class registration or fail with a descriptive error message:
try:
    class AbstractTypeSerializer(TypeSerializer, register_meta=True):
        pass
except Exception as err:
    assert str(err) == "Cannot register abstract class <class 'AbstractTypeSerializer'>."
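The interplay between the inherited register class variable and the per-class register_meta keyword can be illustrated with a self-contained metaclass sketch. RegisteringMeta, REGISTRY and the example classes are hypothetical; the sketch assumes that "abstract" means having unimplemented abstract methods and does not reproduce the actual xerializer metaclass.

```python
import abc
import inspect

REGISTRY = []  # hypothetical stand-in for the module-level registry


class RegisteringMeta(abc.ABCMeta):
    def __new__(mcs, name, bases, namespace, register_meta=None, **kwargs):
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        # register_meta, when given, overrides `register` for this class only.
        if register_meta is None:
            wanted = getattr(cls, "register", True)
        else:
            wanted = register_meta
        if wanted and inspect.isabstract(cls):
            if register_meta:
                raise TypeError(f"Cannot register abstract class {cls}.")
            wanted = False  # abstract classes are silently skipped otherwise
        if wanted:
            REGISTRY.append(cls)
        return cls

    def __init__(cls, name, bases, namespace, register_meta=None, **kwargs):
        # Swallow the extra keyword so it is not forwarded to type.__init__.
        super().__init__(name, bases, namespace, **kwargs)


class Base(metaclass=RegisteringMeta):
    register = True

    @abc.abstractmethod
    def as_serializable(self, obj):
        ...


class Child(Base, register_meta=False):  # skipped: keyword overrides register
    def as_serializable(self, obj):
        return {}


class Grandchild(Child):  # registered: inherits register=True, no keyword
    pass


class Silent(Base):  # skipped, and so are its subclasses
    register = False

    def as_serializable(self, obj):
        return {}


class SilentChild(Silent):
    pass


forced_error = None
try:
    class Forced(Base, register_meta=True):  # abstract, so forcing fails
        pass
except TypeError as err:
    forced_error = str(err)
```

The class variable is inherited and therefore affects descendants, whereas the class definition keyword is consumed by the metaclass and leaves descendants untouched.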