Refactor shallow schema (Qiskit#1350)
* `shallow_schema` refactor to make its generation more maintainable.

Overview
========

This PR introduces a kind of _mixin field_ `ModelValidation` with model type validation.
All the Marshmallow fields are imported into `qiskit.validation.fields` and extended to
include the mixin. The same happens with polymorphic fields and containers.

Validation is possible thanks to a special copy of the model schema, the _shallow schema_,
which replaces the `_deserialize` private method of its fields (at instance-level) to use
the new `validate_model` from the `ModelValidator` mixin.

Validating a model consists of checking that its values are of the corresponding types.
Specifically:

1. For regular fields, the type they deserialize to (deserialization type).
2. For field containers (such as lists), the deserialization type of the contained field.
3. For schema containers (such as nested schemas), the type of the schema's model.
4. For polymorphic fields, one of the deserialization types of the choices.
5. For polymorphic schemas, one of the types of the possible schemas' model.
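
The rules above can be pictured with a small, dependency-free sketch. The classes below (`Field`, `String`, `Float`, `ListOf`, `AnyOf`) are hypothetical stand-ins, not the Marshmallow or Qiskit fields, and they raise `TypeError` rather than a Marshmallow `ValidationError`. They only illustrate how the expected types differ between a regular field (rule 1), a container (rule 2) and a polymorphic field (rule 4); rules 3 and 5 would follow the same pattern with model classes in place of built-in types.

```python
# Hypothetical stand-ins for illustration only; not the Qiskit or Marshmallow classes.

class Field:
    """Rule 1: a regular field accepts values of its deserialization type."""
    valid_types = (object,)

    def expected_types(self):
        return self.valid_types

    def check_type(self, value):
        if not isinstance(value, self.expected_types()):
            raise TypeError("{!r} is not one of {}".format(value, self.expected_types()))
        return value


class String(Field):
    valid_types = (str,)


class Float(Field):
    valid_types = (float,)


class ListOf(Field):
    """Rule 2: a container checks each item against the contained field."""

    def __init__(self, inner):
        self.inner = inner

    def check_type(self, value):
        if not isinstance(value, (list, tuple)):
            raise TypeError("{!r} is not a collection".format(value))
        for item in value:
            self.inner.check_type(item)
        return value


class AnyOf(Field):
    """Rule 4: a polymorphic field accepts the types of any of its choices."""

    def __init__(self, *choices):
        self.choices = choices

    def expected_types(self):
        return tuple(t for choice in self.choices for t in choice.expected_types())


tags = ListOf(String())
tags.check_type(["horror", "history"])   # passes: every item is a str

label = AnyOf(String(), Float())
label.check_type(1.5)                    # passes: float is one of the choices
```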

While validating, nothing more than the type of the value is considered. Validation
assumes that the values of the inner properties are already valid or they would have
failed during their instantiation. For instance, consider:

```python
Book(title="The Terror", author=Person(name="Dan Simmons"))
```

`Person` instantiation is executed first. If successful, validating `Book` does not
require `author`'s internal structure (i.e. `name`) to be validated again since otherwise,
construction of `Person` would have failed.
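
As a rough illustration of this shallow check, the sketch below uses plain Python classes instead of schema-bound models (the real models are built with `bind_schema`, which is not reproduced here). The hand-written `isinstance` checks are an assumption standing in for the schema-driven validation: `Book` inspects only the type of `author` and never revisits `name`.

```python
# Hypothetical plain-Python models; they only mimic the order of validation.

class Person:
    def __init__(self, name):
        # Person validates its own properties when it is constructed.
        if not isinstance(name, str):
            raise TypeError("name must be a str")
        self.name = name


class Book:
    def __init__(self, title, author):
        if not isinstance(title, str):
            raise TypeError("title must be a str")
        # Shallow check: only the type of `author` is inspected. Its `name`
        # was already validated by Person.__init__ and is not visited again.
        if not isinstance(author, Person):
            raise TypeError("author must be a Person")
        self.title = title
        self.author = author


book = Book(title="The Terror", author=Person(name="Dan Simmons"))
```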
---

This refactor keeps patching the `_deserialize` method of the schema fields but,
instead of deciding how to patch it according to the field type, it makes `_deserialize`
call a new custom method, `_validate_model`, with the desired behaviour.

The different strategies from the if/elif/else block of the original implementation are
now spread over the fields defined in `qiskit.validation.fields`.

This package also contains versions of marshmallow.fields that include their own
`_validate_model` method.
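
The instance-level patching described above can be sketched without Marshmallow. In this minimal example `IntegerField` and `make_validation_fields` are hypothetical names introduced for illustration; the only real API used is `types.MethodType` from the standard library, which binds a function to a single instance so that the class itself is left untouched.

```python
import copy
from types import MethodType


class IntegerField:
    """Hypothetical field; stands in for a schema field."""

    def _deserialize(self, value):
        # Regular loading path: convert raw input into an int.
        return int(value)

    def _validate_model(self, value):
        # Model validation path: only check the type, convert nothing.
        if not isinstance(value, int):
            raise TypeError("{!r} is not an int".format(value))
        return value


def make_validation_fields(fields):
    """Return copies of the fields whose _deserialize only checks types."""
    patched = {}
    for name, field in fields.items():
        clone = copy.copy(field)
        # Bind the class-level validator to this instance only; the class
        # and the original field keep their regular _deserialize.
        clone._deserialize = MethodType(type(clone)._validate_model, clone)
        patched[name] = clone
    return patched


original = {"pages": IntegerField()}
validation = make_validation_fields(original)

print(original["pages"]._deserialize("12"))   # regular path converts the string
print(validation["pages"]._deserialize(12))   # patched path accepts an int as-is
```

Because the field decides how to validate itself, the patching loop no longer needs to know about field types.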

* Organize the fields into several modules and factor out validation.

This PR splits the fields into general fields, containers and polymorphic fields. Each
module defines one kind of field.

It also provides a base field for those capable of model type validation and a general
mechanism to perform this validation.
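
A simplified sketch of such a base field follows. It mirrors the shape of the `ModelTypeValidator` class added in this commit (`valid_types`, `_expected_types`, `check_type`) but is an assumption-laden reduction: `TypeValidator` is a hypothetical name, it does not subclass a Marshmallow field, and it raises `TypeError` instead of `ValidationError`. It shows two of the subclassing options: overriding the `valid_types` attribute, and overriding `_expected_types` when the types depend on the instance.

```python
class TypeValidator:
    """Hypothetical, Marshmallow-free reduction of a type-validating base field."""

    valid_types = (object,)

    def _expected_types(self):
        return self.valid_types

    def check_type(self, value):
        expected = self._expected_types()
        if not isinstance(value, expected):
            raise TypeError(self._message(value, expected))
        # The value is returned untouched: validation never converts it.
        return value

    @staticmethod
    def _message(value, expected):
        # Unwrap one-element tuples so the message names a single type.
        if isinstance(expected, tuple) and len(expected) == 1:
            expected = expected[0]
        if isinstance(expected, tuple):
            body = "is none of the expected types {}".format(expected)
        else:
            body = "is not the expected type {}".format(expected)
        return "Value '{}' {}".format(value, body)


class Boolean(TypeValidator):
    # Option 1: override the class attribute.
    valid_types = (bool,)


class Number(TypeValidator):
    # Option 2: compute the expected types from instance state.
    def __init__(self, num_type=float):
        self.num_type = num_type

    def _expected_types(self):
        return (self.num_type,)


class Text(TypeValidator):
    valid_types = (str, bytes)
```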

* Polishing docstrings and reviewing container and polymorphic validations.

* Rename to make intentions clear and fix the field hierarchy

* Move tests to their own folder

* Split tests into different categories

* Docstring fixes and clarifications.

* Style, convention and doc tweaks

* Update CHANGELOG
delapuente authored and diego-plan9 committed Dec 5, 2018
1 parent 6525615 commit b7865d6
Showing 17 changed files with 777 additions and 504 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.rst
@@ -66,8 +66,8 @@ Changed
 - `IBMQ.save_account()` now takes an `overwrite` option to replace an existing
   account on disk. Default is False (#1295).
 - Backend and Provider methods defined in the specification use model objects
-  rather than dicts, along with validation against schemas (#1249, #1277). The
-  updated methods include:
+  rather than dicts, along with validation against schemas (#1249, #1277,
+  #1350). The updated methods include:
   - ``backend.status()``(#1301).
   - ``backend.configuration()`` (and ``__init__``) (#1323).
   - ``backend.properties()``, returning ``None`` for sims (#1331, #1401).
2 changes: 1 addition & 1 deletion qiskit/backends/models/backendconfiguration.py
@@ -7,10 +7,10 @@

 """Model and schema for backend configuration."""

-from marshmallow.fields import Boolean, DateTime, Integer, List, Nested, String
 from marshmallow.validate import Equal, Length, OneOf, Range, Regexp

 from qiskit.validation import BaseModel, BaseSchema, bind_schema
+from qiskit.validation.fields import Boolean, DateTime, Integer, List, Nested, String


 class GateConfigSchema(BaseSchema):
2 changes: 1 addition & 1 deletion qiskit/backends/models/backendproperties.py
@@ -7,10 +7,10 @@

 """Model and schema for backend configuration."""

-from marshmallow.fields import DateTime, List, Nested, Number, String
 from marshmallow.validate import Length, Regexp

 from qiskit.validation import BaseModel, BaseSchema, bind_schema
+from qiskit.validation.fields import DateTime, List, Nested, Number, String


 class NduvSchema(BaseSchema):
2 changes: 1 addition & 1 deletion qiskit/backends/models/backendstatus.py
@@ -7,10 +7,10 @@

 """Model and schema for backend status."""

-from marshmallow.fields import Boolean, Integer, String
 from marshmallow.validate import Range, Regexp

 from qiskit.validation import BaseModel, BaseSchema, bind_schema
+from qiskit.validation.fields import Boolean, Integer, String


 class BackendStatusSchema(BaseSchema):
2 changes: 1 addition & 1 deletion qiskit/backends/models/jobstatus.py
@@ -7,10 +7,10 @@

 """Model and schema for job status."""

-from marshmallow.fields import String
 from marshmallow.validate import OneOf

 from qiskit.validation import BaseModel, BaseSchema, bind_schema
+from qiskit.validation.fields import String


 class JobStatusSchema(BaseSchema):
2 changes: 1 addition & 1 deletion qiskit/result/models.py
@@ -7,11 +7,11 @@

 """Schema and helper models for schema-conformant Results."""

-from marshmallow.fields import Boolean, DateTime, Integer, List, Nested, Raw, String
 from marshmallow.validate import Length, OneOf, Regexp, Range

 from qiskit.validation.base import BaseModel, BaseSchema, ObjSchema, bind_schema
 from qiskit.validation.fields import Complex, ByType
+from qiskit.validation.fields import Boolean, DateTime, Integer, List, Nested, Raw, String
 from qiskit.validation.validate import PatternProperties


5 changes: 4 additions & 1 deletion qiskit/validation/__init__.py
@@ -6,4 +6,7 @@
 # the LICENSE.txt file in the root directory of this source tree.

 """Models and schemas for Terra."""
-from .base import BaseModel, BaseSchema, bind_schema
+
+from marshmallow import ValidationError
+
+from .base import BaseModel, BaseSchema, bind_schema, ModelTypeValidator
143 changes: 68 additions & 75 deletions qiskit/validation/base.py
@@ -23,14 +23,60 @@ class Person(BaseModel):
         pass
 """

-from functools import partial, wraps
-from types import SimpleNamespace
+from functools import wraps
+from types import SimpleNamespace, MethodType

 from marshmallow import ValidationError
-from marshmallow import Schema, post_dump, post_load, fields
+from marshmallow import Schema, post_dump, post_load
+from marshmallow import fields as _fields
 from marshmallow.utils import is_collection

-from .fields import BasePolyField, ByType

+class ModelTypeValidator(_fields.Field):
+    """A field able to validate the correct type of a value."""
+
+    valid_types = (object, )
+
+    def _expected_types(self):
+        return self.valid_types
+
+    def check_type(self, value, attr, data):
+        """Validates a value against the correct type of the field.
+
+        It calls ``_expected_types`` to get a list of valid types.
+
+        Subclasses can do one of the following:
+
+        1. They can override the ``valid_types`` property with a tuple with
+           the expected types for this field.
+
+        2. They can override the ``_expected_types`` method to return a
+           tuple of expected types for the field.
+
+        3. They can change ``check_type`` completely to customize
+           validation.
+
+        This method or the overrides must return the ``value`` parameter
+        untouched.
+        """
+        expected_types = self._expected_types()
+        if not isinstance(value, expected_types):
+            raise self._not_expected_type(
+                value, expected_types, fields=[self], field_names=attr, data=data)
+        return value
+
+    @staticmethod
+    def _not_expected_type(value, type_, **kwargs):
+        if is_collection(type_) and len(type_) == 1:
+            type_ = type_[0]
+
+        if is_collection(type_):
+            body = 'is none of the expected types {}'.format(type_)
+        else:
+            body = 'is not the expected type {}'.format(type_)
+
+        message = 'Value \'{}\' {}'.format(value, body)
+        return ValidationError(message, **kwargs)
+

 class BaseSchema(Schema):
@@ -142,88 +188,35 @@ def __call__(self, model_cls):
         model_cls.__init__ = self._validate_after_init(model_cls.__init__)

         # Add a Schema that performs minimal validation to the Model.
-        model_cls.shallow_schema = self._create_shallow_schema(self._schema_cls)
+        model_cls.shallow_schema = self._create_validation_schema(self._schema_cls)

         return model_cls

-    def _create_shallow_schema(self, schema_cls):
-        """Create a Schema with minimal validation for compound types.
+    @staticmethod
+    def _create_validation_schema(schema_cls):
+        """Create a patched Schema for validating models.

-        This is a helper for performing the initial validation when
-        instantiating the Model via **kwargs. It works on the assumption that
-        **kwargs will contain:
-        * for compound types (`Nested`, `BasePolyField`), it will already
-          contain `BaseModels`, which should have been validated earlier
-          (during _their_ instantiation), and only type checking is performed.
-        * for `Number` and `String` types, both serialized and deserialized
-          are equivalent, and the shallow_schema will try to serialize in
-          order to perform stronger validation.
-        * for the rest of fields (the ones where the serialized and deserialized
-          data is different), it will contain _deserialized_ types that are
-          passed through.
+        Model validation is not part of Marshmallow. Schemas have a ``validate``
+        method but this delegates execution on ``load`` and discards the result.
+        Similarly, ``load`` will call ``_deserialize`` on every field in the
+        schema.

-        The underlying idea is to be able to perform validation (in the schema)
-        at only the first level of the object, and at the same time take
-        advantage of validation during **kwargs instantiation as much as
-        possible (mimicking `.from_dict()` in that respect).
+        This function patches the ``_deserialize`` instance method of each
+        field to make it call a custom defined method ``check_type``
+        provided by Qiskit in the different fields at
+        ``qiskit.validation.fields``.

         Returns:
             BaseSchema: a copy of the original Schema, overriding the
                 ``_deserialize()`` call of its fields.
         """
-        shallow_schema = schema_cls()
-        for _, field in shallow_schema.fields.items():
-            if isinstance(field, fields.Nested):
-                field._deserialize = partial(self._overridden_nested_deserialize, field)
-            elif isinstance(field, BasePolyField):
-                field._deserialize = partial(self._overridden_basepolyfield_deserialize, field)
-            elif not isinstance(field, (fields.Number, fields.String, ByType)):
-                field._deserialize = partial(self._overridden_field_deserialize, field)
-        return shallow_schema
-
-    @staticmethod
-    def _overridden_nested_deserialize(field, value, _, data):
-        """Helper for minimal validation of fields.Nested."""
-        if field.many and not is_collection(value):
-            field.fail('type', input=value, type=value.__class__.__name__)
-
-        if not field.many:
-            values = [value]
-        else:
-            values = value
-
-        for v in values:
-            if not isinstance(v, field.schema.model_cls):
-                raise ValidationError(
-                    'Not a valid type for {}.'.format(field.__class__.__name__),
-                    data=data)
-        return value
-
-    @staticmethod
-    def _overridden_basepolyfield_deserialize(field, value, _, data):
-        """Helper for minimal validation of fields.BasePolyField."""
-        if not field.many:
-            values = [value]
-        else:
-            values = value
-
-        for v in values:
-            schema = field.serialization_schema_selector(v, data)
-            if not schema:
-                raise ValidationError(
-                    'Not a valid type for {}.'.format(field.__class__.__name__),
-                    data=data)
-        return value
-
-    @staticmethod
-    def _overridden_field_deserialize(field, value, attr, data):
-        """Helper for validation of generic Field."""
-        # Attempt to serialize, in order to catch validation errors.
-        field._serialize(value, attr, data)
+        validation_schema = schema_cls()
+        for _, field in validation_schema.fields.items():
+            if isinstance(field, ModelTypeValidator):
+                validate_function = field.__class__.check_type
+                field._deserialize = MethodType(validate_function, field)

-        # Propagate the original value upwards.
-        return value
+        return validation_schema

     @staticmethod
     def _to_dict(instance):
126 changes: 126 additions & 0 deletions qiskit/validation/fields/__init__.py
@@ -0,0 +1,126 @@
+# -*- coding: utf-8 -*-
+
+# Copyright 2018, IBM.
+#
+# This source code is licensed under the Apache License, Version 2.0 found in
+# the LICENSE.txt file in the root directory of this source tree.
+
+"""Fields to be used with Qiskit validated classes.
+
+When extending this module with new Fields:
+
+    1. Distinguish a new type, like the ``Complex`` number in this module.
+    2. Use a new Marshmallow field not used in ``qiskit`` yet.
+
+Marshmallow fields do not allow model validation so you need to create a new
+field, make it a subclass of the Marshmallow field *and* ``ModelTypeValidator``,
+and redefine ``valid_types`` to be the list of valid types. Usually, **the
+same types this field deserializes to**. For instance::
+
+    class Boolean(marshmallow.fields.Boolean, ModelTypeValidator):
+        __doc__ = _fields.Boolean.__doc__
+        valid_types = (bool, )
+
+See ``ModelTypeValidator`` for more subclassing options.
+"""
+from datetime import date, datetime
+
+from marshmallow import fields as _fields
+from marshmallow.utils import is_collection
+
+from qiskit.validation import ModelTypeValidator
+from qiskit.validation.fields.polymorphic import ByAttribute, ByType, TryFrom
+from qiskit.validation.fields.containers import Nested, List
+
+
+class Complex(ModelTypeValidator):
+    """Field for complex numbers.
+
+    Field for parsing complex numbers:
+
+    * deserializes to Python's `complex`.
+    * serializes to a tuple of 2 decimals `(real, imaginary)`
+    """
+
+    valid_types = (complex, )
+
+    default_error_messages = {
+        'invalid': '{input} cannot be parsed as a complex number.',
+        'format': '"{input}" cannot be formatted as complex number.',
+    }
+
+    def _serialize(self, value, attr, obj):
+        try:
+            return [value.real, value.imag]
+        except AttributeError:
+            self.fail('format', input=value)
+
+    def _deserialize(self, value, attr, data):
+        if not is_collection(value) or len(value) != 2:
+            self.fail('invalid', input=value)
+
+        try:
+            return complex(*value)
+        except (ValueError, TypeError):
+            self.fail('invalid', input=value)
+
+
+class String(_fields.String, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.String.__doc__
+
+    valid_types = (str, )
+
+
+class Date(_fields.Date, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Date.__doc__
+
+    valid_types = (date, )
+
+
+class DateTime(_fields.DateTime, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.DateTime.__doc__
+
+    valid_types = (datetime, )
+
+
+class Email(_fields.Email, String):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Email.__doc__
+
+
+class Url(_fields.Url, String):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Url.__doc__
+
+
+class Number(_fields.Number, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Number.__doc__
+
+    def _expected_types(self):
+        return self.num_type
+
+
+class Integer(_fields.Integer, Number):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Integer.__doc__
+
+
+class Float(_fields.Float, Number):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Float.__doc__
+
+
+class Boolean(_fields.Boolean, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Boolean.__doc__
+
+    valid_types = (bool, )
+
+
+class Raw(_fields.Raw, ModelTypeValidator):
+    # pylint: disable=missing-docstring
+    __doc__ = _fields.Raw.__doc__