Restructure readme/index/reasons (#562)
* Restructure readme/index/reasons

* cleanup
hynek authored Aug 1, 2024
1 parent c393abc commit 1d72dcc
Showing 5 changed files with 174 additions and 102 deletions.
103 changes: 12 additions & 91 deletions README.md
@@ -1,8 +1,6 @@
# *cattrs*: Flexible Object Serialization and Validation

<p>
<em>Because validation belongs to the edges.</em>
</p>
*Because validation belongs to the edges.*

[![Documentation](https://img.shields.io/badge/Docs-Read%20The%20Docs-black)](https://catt.rs/)
[![License: MIT](https://img.shields.io/badge/license-MIT-C06524)](https://github.com/hynek/stamina/blob/main/LICENSE)
@@ -13,27 +11,19 @@

---

<!-- begin-teaser -->

**cattrs** is a Swiss Army knife for (un)structuring and validating data in Python.
In practice, that means it converts **unstructured dictionaries** into **proper classes** and back, while **validating** their contents.

---

Python has a rich set of powerful, easy to use, built-in **unstructured** data types like dictionaries, lists and tuples.
These data types effortlessly convert into common serialization formats like JSON, MessagePack, CBOR, YAML or TOML.

But the data that is used by your **business logic** should be **structured** into well-defined classes, since not all combinations of field names or values are valid inputs to your programs.
The more trust you can have in the structure of your data, the simpler your code can be, and the fewer edge cases you have to worry about.
<!-- end-teaser -->

When you're handed unstructured data (by your network, file system, database, ...), _cattrs_ helps to convert this data into trustworthy structured data.
When you have to convert your structured data into data types that other libraries can handle, _cattrs_ turns your classes and enumerations into dictionaries, integers and strings.

_attrs_ (and to a certain degree dataclasses) are excellent libraries for declaratively describing the structure of your data, but they're purposefully not serialization libraries.
*cattrs* is there for you the moment your `attrs.asdict(your_instance)` and `YourClass(**data)` start failing you because you need more control over the conversion process.
## Example

<!-- begin-example -->

## Examples

_cattrs_ works best with [_attrs_](https://www.attrs.org/) classes, and [dataclasses](https://docs.python.org/3/library/dataclasses.html) where simple (un-)structuring works out of the box, even for nested data:
_cattrs_ works best with [_attrs_](https://www.attrs.org/) classes, and [dataclasses](https://docs.python.org/3/library/dataclasses.html) where simple (un-)structuring works out of the box, even for nested data, without polluting your data model with serialization details:

```python
>>> from attrs import define
@@ -50,74 +40,12 @@ C(a=1, b=['x', 'y'])

```

> [!IMPORTANT]
> Note how the structuring and unstructuring details do **not** pollute your class, meaning: your data model.
> Any configuration of the conversion is done within *cattrs* itself, not within your data model.
>
> There are popular validation libraries for Python that couple your data model with its validation and serialization rules based on, for example, web APIs.
> We think that's the wrong approach.
> Validation and serialization are concerns of the edges of your program – not the core.
> They should neither apply design pressure on your business code, nor affect the performance of your code through unnecessary validation.
> In bigger real-world code bases it's also common for data to come from multiple sources that need different validation and serialization rules.
>
> 🎶 You gotta keep 'em separated. 🎶
*cattrs* also works with the usual Python collection types like dictionaries, lists, or tuples when you want to **normalize** unstructured data into a certain (still unstructured) shape.
For example, to convert a list of a float, an int and a string into a tuple of ints:

```python
>>> import cattrs

>>> cattrs.structure([1.0, 2, "3"], tuple[int, int, int])
(1, 2, 3)

```

Finally, here's a much more complex example, involving _attrs_ classes where _cattrs_ interprets the type annotations to structure and unstructure the data correctly, including Enums and nested data structures:
<!-- end-teaser -->
<!-- end-example -->

```python
>>> from enum import unique, Enum
>>> from typing import Optional, Sequence, Union
>>> from cattrs import structure, unstructure
>>> from attrs import define, field

>>> @unique
... class CatBreed(Enum):
...     SIAMESE = "siamese"
...     MAINE_COON = "maine_coon"
...     SACRED_BIRMAN = "birman"

>>> @define
... class Cat:
...     breed: CatBreed
...     names: Sequence[str]

>>> @define
... class DogMicrochip:
...     chip_id = field()  # Type annotations are optional, but recommended
...     time_chipped: float = field()

>>> @define
... class Dog:
...     cuteness: int
...     chip: DogMicrochip | None = None

>>> p = unstructure([Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)),
...                  Cat(breed=CatBreed.MAINE_COON, names=('Fluffly', 'Fluffer'))])

>>> p
[{'cuteness': 1, 'chip': {'chip_id': 1, 'time_chipped': 10.0}}, {'breed': 'maine_coon', 'names': ['Fluffly', 'Fluffer']}]
>>> structure(p, list[Union[Dog, Cat]])
[Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)), Cat(breed=<CatBreed.MAINE_COON: 'maine_coon'>, names=['Fluffly', 'Fluffer'])]

```

> [!TIP]
> Consider unstructured data a low-level representation that needs to be converted to structured data to be handled, and use `structure()`.
> When you're done, `unstructure()` the data to its unstructured form and pass it along to another library or module.
>
> Use [*attrs* type metadata](http://attrs.readthedocs.io/en/stable/examples.html#types) to add type metadata to attributes, so _cattrs_ will know how to structure and destructure them.
Have a look at [*Why *cattrs*?*](https://catt.rs/en/latest/why.html) for more examples!

<!-- begin-why -->

## Features

@@ -175,14 +103,7 @@ _cattrs_ is based on a few fundamental design decisions:
A foolish consistency is the hobgoblin of little minds, so these decisions can be and sometimes are broken, but they have proven to be a good foundation.


## Additional documentation and talks

- [On structured and unstructured data, or the case for cattrs](https://threeofwands.com/on-structured-and-unstructured-data-or-the-case-for-cattrs/)
- [Why I use attrs instead of pydantic](https://threeofwands.com/why-i-use-attrs-instead-of-pydantic/)
- [cattrs I: un/structuring speed](https://threeofwands.com/why-cattrs-is-so-fast/)
- [Python has a macro language - it's Python (PyCon IT 2022)](https://www.youtube.com/watch?v=UYRSixikUTo)
- [Intro to cattrs 23.1](https://threeofwands.com/intro-to-cattrs-23-1-0/)

<!-- end-why -->

## Credits

10 changes: 10 additions & 0 deletions docs/conf.py
@@ -14,6 +14,14 @@
import sys
from importlib.metadata import version as v

# Set canonical URL from the Read the Docs Domain
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "")

# Tell Jinja2 templates the build is running on Read the Docs
if os.environ.get("READTHEDOCS", "") == "True":
    html_context = {"READTHEDOCS": True}


# If extensions (or modules to document with autodoc) are in another
# directory, add these directories to sys.path here. If the directory is
# relative to the documentation root, use os.path.abspath to make it
@@ -44,6 +52,8 @@
"myst_parser",
]

myst_enable_extensions = ["colon_fence", "smartquotes", "deflist"]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

4 changes: 2 additions & 2 deletions docs/customizing.md
@@ -1,8 +1,8 @@
# Customizing Un/structuring
# Customizing (Un-)structuring

This section describes customizing the unstructuring and structuring processes in _cattrs_.

## Custom Un/structuring Hooks
## Custom (Un-)structuring Hooks

You can write your own structuring and unstructuring functions and register them for types using {meth}`Converter.register_structure_hook() <cattrs.BaseConverter.register_structure_hook>` and {meth}`Converter.register_unstructure_hook() <cattrs.BaseConverter.register_unstructure_hook>`.
This approach is the most flexible but also requires the most boilerplate.
54 changes: 45 additions & 9 deletions docs/index.md
@@ -1,11 +1,34 @@
# *cattrs*: Flexible Object Serialization and Validation

*Because validation belongs to the edges.*

---

```{include} ../README.md
:start-after: "begin-teaser -->"
:end-before: "<!-- end-teaser"
```

```{include} ../README.md
:start-after: "begin-example -->"
:end-before: "<!-- end-example"
```

---

However, *cattrs* does **much** more with a focus on **functional composition** and **not coupling** your data model to its serialization and validation rules.

To learn more about why to use *cattrs*, have a look at {doc}`why`, and if you're convinced, jump right into {doc}`basics`!


```{toctree}
---
maxdepth: 2
hidden: true
caption: Introduction
---
self
why
basics
defaulthooks
```
@@ -31,20 +54,33 @@ indepth
---
maxdepth: 2
hidden: true
caption: Dev Guide
caption: Reference
---
history
benchmarking
contributing
API <modules>
modindex
genindex
```

```{include} ../README.md
```{toctree}
---
maxdepth: 2
hidden: true
caption: Dev Guide
---
contributing
benchmarking
```

# Indices and tables
```{toctree}
---
caption: Meta
hidden: true
maxdepth: 1
---
- {ref}`genindex`
- {ref}`modindex`
history
PyPI <https://pypi.org/project/cattrs/>
GitHub <https://github.com/python-attrs/cattrs>
```
105 changes: 105 additions & 0 deletions docs/why.md
@@ -0,0 +1,105 @@
# Why *cattrs*?

Python has a rich set of powerful, easy to use, built-in **unstructured** data types like dictionaries, lists and tuples.
These data types effortlessly convert into common serialization formats like JSON, MessagePack, CBOR, YAML or TOML.

But the data that is used by your **business logic** should be **structured** into well-defined classes, since not all combinations of field names or values are valid inputs to your programs.
The more trust you can have in the structure of your data, the simpler your code can be, and the fewer edge cases you have to worry about.

When you're handed unstructured data (by your network, file system, database, ...), _cattrs_ helps to convert this data into trustworthy structured data.
When you have to convert your structured data into data types that other libraries can handle, _cattrs_ turns your classes and enumerations into dictionaries, integers and strings.

_attrs_ (and to a certain degree dataclasses) are excellent libraries for declaratively describing the structure of your data, but they're purposefully not serialization libraries.
*cattrs* is there for you the moment your `attrs.asdict(your_instance)` and `YourClass(**data)` start failing you because you need more control over the conversion process.
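
To make that failure mode concrete, here is a small sketch using only the standard library's dataclasses as a stand-in for *attrs* (so `dataclasses.asdict()` plays the role of `attrs.asdict()`). The `Address` and `User` classes are hypothetical and exist only for this illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str


@dataclass
class User:
    name: str
    address: Address


user = User("Ada", Address("London"))

# asdict() recurses into nested dataclasses and yields plain dicts.
data = asdict(user)

# The naive way back does not recurse: `address` stays a dict, and
# nothing checks it against the annotation.
rebuilt = User(**data)
```

The round trip silently produces a `User` whose `address` is a dictionary rather than an `Address`, which is the kind of gap a dedicated structuring step is meant to close.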


## Examples

```{include} ../README.md
:start-after: "begin-example -->"
:end-before: "<!-- end-example"
```

:::{important}
Note how the structuring and unstructuring details do **not** pollute your class, meaning: your data model.
Any configuration of the conversion is done within *cattrs* itself, not within your data model.

There are popular validation libraries for Python that couple your data model with its validation and serialization rules based on, for example, web APIs.
We think that's the wrong approach.
Validation and serialization are concerns of the edges of your program – not the core.
They should neither apply design pressure on your business code, nor affect the performance of your code through unnecessary validation.
In bigger real-world code bases it's also common for data to come from multiple sources that need different validation and serialization rules.

🎶 You gotta keep 'em separated. 🎶
:::


*cattrs* also works with the usual Python collection types like dictionaries, lists, or tuples when you want to **normalize** unstructured data into a certain (still unstructured) shape.
For example, to convert a list of a float, an int and a string into a tuple of ints:

```python
>>> import cattrs

>>> cattrs.structure([1.0, 2, "3"], tuple[int, int, int])
(1, 2, 3)

```

Finally, here's a much more complex example, involving _attrs_ classes where _cattrs_ interprets the type annotations to structure and unstructure the data correctly, including Enums and nested data structures:

```python
>>> from enum import unique, Enum
>>> from typing import Optional, Sequence, Union
>>> from cattrs import structure, unstructure
>>> from attrs import define, field

>>> @unique
... class CatBreed(Enum):
...     SIAMESE = "siamese"
...     MAINE_COON = "maine_coon"
...     SACRED_BIRMAN = "birman"

>>> @define
... class Cat:
...     breed: CatBreed
...     names: Sequence[str]

>>> @define
... class DogMicrochip:
...     chip_id = field()  # Type annotations are optional, but recommended
...     time_chipped: float = field()

>>> @define
... class Dog:
...     cuteness: int
...     chip: DogMicrochip | None = None

>>> p = unstructure([Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)),
...                  Cat(breed=CatBreed.MAINE_COON, names=('Fluffly', 'Fluffer'))])

>>> p
[{'cuteness': 1, 'chip': {'chip_id': 1, 'time_chipped': 10.0}}, {'breed': 'maine_coon', 'names': ['Fluffly', 'Fluffer']}]
>>> structure(p, list[Union[Dog, Cat]])
[Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)), Cat(breed=<CatBreed.MAINE_COON: 'maine_coon'>, names=['Fluffly', 'Fluffer'])]

```

:::{tip}
Consider unstructured data a low-level representation that needs to be converted to structured data to be handled, and use `structure()`.
When you're done, `unstructure()` the data to its unstructured form and pass it along to another library or module.
:::


```{include} ../README.md
:start-after: "begin-why -->"
:end-before: "<!-- end-why"
```


## Additional Documentation and Talks

- [On structured and unstructured data, or the case for cattrs](https://threeofwands.com/on-structured-and-unstructured-data-or-the-case-for-cattrs/)
- [Why I use attrs instead of pydantic](https://threeofwands.com/why-i-use-attrs-instead-of-pydantic/)
- [cattrs I: un/structuring speed](https://threeofwands.com/why-cattrs-is-so-fast/)
- [Python has a macro language - it's Python (PyCon IT 2022)](https://www.youtube.com/watch?v=UYRSixikUTo)
- [Intro to cattrs 23.1](https://threeofwands.com/intro-to-cattrs-23-1-0/)
