2021-07-29 / exchange,server,python / Sam
Clean Architecture talked a little about how a framework is merely a development detail and should be deferred just like any other detail in your system. On first reading I found this quite confusing and unhelpful. I understood the sentiment, but as someone who has just spent the pandemic year working on a "Django application", I couldn't see how one could possibly engineer applications and leverage a framework without making a decision early in the process, and conforming to that framework before it was too late.
The reason I bought Architecture Patterns with Python ("Cosmic Python") was for its appendix showing how you might integrate your freestanding application into Django. This was particularly helpful as Django is the only tool I've used to build large service-like applications, so I could see the boundaries of the example application and Django's responsibilities in much clearer terms. Seeing is believing and here was proof that a framework really can live on the "outside" of your application. Still, this begged the question: what's the point in all these lovely frameworks if you're going to write a bunch of code to keep them at bay?
Architecture Patterns with Python also introduced me to the Repository and closely related Unit of Work patterns. I won't go into detail here (because every person and their dog seems to have their own personal interpretation of each pattern), but my brief interpretation at this time is:
- A Repository offers an interface for an application to manipulate a collection of objects (e.g. add, get) while hiding how and where the data is stored, effectively keeping your application ignorant of how data is persisted (e.g. in memory, a database, a file)
- A Unit of Work (UoW) offers a context in which objects that have changed are noted, and those changes can be persisted (or discarded) as part of a transaction in your application
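To make those two definitions concrete, here is a minimal sketch of the interfaces as abstract base classes. The class names and the `add`/`get`/`commit`/`rollback` signatures are my own assumptions for illustration, not taken from the book or from any library, and `DictBackedRepository` is a hypothetical toy implementation.

```python
from abc import ABC, abstractmethod


class AbstractRepository(ABC):
    """Collection-like interface; says nothing about where objects live."""

    @abstractmethod
    def add(self, obj_id, obj):
        ...

    @abstractmethod
    def get(self, obj_id):
        ...


class AbstractUnitOfWork(ABC):
    """Groups changes so they are persisted or discarded together."""

    @abstractmethod
    def commit(self):
        ...

    @abstractmethod
    def rollback(self):
        ...


class DictBackedRepository(AbstractRepository):
    """Hypothetical toy implementation, only to show the interface in use."""

    def __init__(self):
        self._items = {}

    def add(self, obj_id, obj):
        self._items[obj_id] = obj

    def get(self, obj_id):
        return self._items.get(obj_id)


repo = DictBackedRepository()
repo.add("sam", {"cash": 100})
print(repo.get("sam"))
```

The application only ever sees `add` and `get`; whether a dict, a file or a database sits behind them is the implementation's business.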
I set about building a Repository and UoW to hold the `Clients` and `Stocks` in memory, just like in my first program, but instead of interacting with a Python data structure directly, the application would have to interact with the Repository. I based my first Repository and UoW on the mock testing repo from the Cosmic Python book, but with extra flair; rather than merely mocking a Repository and holding a temporary list, I defined a class which held a dictionary named `_objects` as a class attribute, such that any instantiation of the Repository would be able to interact with the `_objects` stored inside. As suggested by the book, I made the UoW a Python context manager. A context manager requires an `__enter__` dunder method to set up some context (my UoW just returns itself) and an `__exit__` dunder method to specify what happens when you leave the context (my UoW calls its `rollback` function to discard uncommitted changes). I'd never written one of these before, but it seems perfect for this case of starting and ending a "session".
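A stripped-down sketch of that first version might look like the following. It is a simplified illustration rather than the real STEX code: `MemoryRepository`, `MemoryUoW` and the `rolled_back` flag are hypothetical names, and nothing is staged yet, so `commit` has nothing to do.

```python
class MemoryRepository:
    # Class attribute: one dictionary shared by every instance.
    _objects = {}

    def add(self, obj_id, obj):
        # Item assignment mutates the shared dict; it does not rebind the name.
        self._objects[obj_id] = obj

    def get(self, obj_id):
        return self._objects.get(obj_id)


class MemoryUoW:
    def __init__(self):
        self.users = MemoryRepository()
        self.rolled_back = False  # illustration only, to observe __exit__

    def __enter__(self):
        # Setting up the context: the UoW simply returns itself.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the context always rolls back whatever was not committed.
        self.rollback()
        return False  # do not swallow exceptions raised inside the block

    def commit(self):
        pass  # nothing is staged in this simplified sketch

    def rollback(self):
        self.rolled_back = True


with MemoryUoW() as uow:
    uow.users.add("sam", {"cash": 100})

with MemoryUoW() as other:
    print(other.users.get("sam"))  # visible through a different instance
```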
I felt a bit dirty about my Repository, as class attributes shared across all past, present and future instantiations of a class as a means of persisting data felt a bit weird -- it's easy to accidentally create an instance and shadow or overwrite the class variable. This wasn't helped by internet searches wherein I found plenty of conflicting examples of writing a Repository and UoW. I became a little frustrated with trying to do "the right thing" first time, which caused some procrastination.
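The shadowing worry is easy to reproduce. In this small sketch (`Repo` is a hypothetical stand-in for the Repository), mutating the dictionary through an instance affects everyone, but assigning to the attribute through an instance silently creates a separate, instance-only dictionary.

```python
class Repo:
    _objects = {}  # class attribute, shared by all instances


a = Repo()
b = Repo()

a._objects["x"] = 1   # mutates the shared dict
assert b._objects == {"x": 1}

b._objects = {}       # rebinding creates an *instance* attribute on b
b._objects["y"] = 2   # only b can see this entry

assert "y" not in Repo._objects   # the shared store never received "y"
assert a._objects == {"x": 1}     # a still reads the class attribute
```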
Persevering, I took the example from the Cosmic Python book much further. I gave the Repository an instance variable dictionary called `_staged_objects` to keep track of objects that needed to be committed. I felt like I was really in the swing of things now. I added an `_object_versions` class dict, and `_staged_versions` instance dict too. If you were to imagine a process to update a user's holdings, my Repository and UoW worked like so:
- A change in the system such as a bought or sold stock triggers a call to the `Exchange`'s `update_user` service function
- The `Exchange` service `update_user` method "enters" a context (using Python's `with` statement), instantiating a `UoW` that has access to the Repository for handling the users. A variable `uow` is in scope for dealing with the unit of work and is the only way to access the user Repository
- `update_user` uses the context of the UoW and queries the user repository with `uow.users.get`
- The Repository's `get` checks for the user object in its `_objects` class dictionary, copies (`copy.deepcopy`) it to its `_staged_objects` instance dictionary (and also copies the `_object_version[user_id]` to `_staged_version[user_id]`) and returns the staged object
- The `update_user` method makes a change to the domain object and calls `uow.commit`
- The UoW passes through the request to commit to the Repository:
  - The `_staged_version[user_id]` is checked against `_object_version[user_id]` to ensure the `_objects` dictionary has not been updated for this user since `get` was called
  - The `_staged_objects[user_id]` overwrites the `_objects[user_id]` and `_object_version[user_id]` is incremented
- The `update_user` exits the `with` block, closing the UoW context (calling `uow.rollback` automatically, but there is nothing to rollback)
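The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual STEX code: `User`, `StaleObjectError`, `VersionedMemoryRepository`, `UserMemoryUoW` and the `update_user` signature are hypothetical, and the version dictionaries use the plural names (`_object_versions`, `_staged_versions`) introduced earlier.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class User:  # hypothetical domain object
    user_id: str
    holdings: dict = field(default_factory=dict)


class StaleObjectError(Exception):
    """Hypothetical error: the stored object changed after we staged our copy."""


class VersionedMemoryRepository:
    _objects = {}          # committed objects, shared by all instances
    _object_versions = {}  # committed version counters, shared

    def __init__(self):
        self._staged_objects = {}   # per-instance working copies
        self._staged_versions = {}  # version seen when each copy was taken

    def add(self, obj_id, obj):
        self._staged_objects[obj_id] = obj
        self._staged_versions[obj_id] = self._object_versions.get(obj_id, 0)

    def get(self, obj_id):
        # Hand out a deep copy so edits stay private until commit.
        staged = copy.deepcopy(self._objects[obj_id])
        self._staged_objects[obj_id] = staged
        self._staged_versions[obj_id] = self._object_versions[obj_id]
        return staged

    def commit(self):
        # First check nobody committed a newer version since we staged ours.
        for obj_id, seen in self._staged_versions.items():
            if seen != self._object_versions.get(obj_id, 0):
                raise StaleObjectError(obj_id)
        # Then overwrite the committed objects and bump their versions.
        for obj_id, obj in self._staged_objects.items():
            self._objects[obj_id] = copy.deepcopy(obj)
            self._object_versions[obj_id] = self._staged_versions[obj_id] + 1
        self.rollback()  # nothing is left staged after a successful commit

    def rollback(self):
        self._staged_objects.clear()
        self._staged_versions.clear()


class UserMemoryUoW:
    def __init__(self):
        self.users = VersionedMemoryRepository()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.rollback()  # discards anything not committed

    def commit(self):
        self.users.commit()

    def rollback(self):
        self.users.rollback()


def update_user(user_id, symbol, quantity):
    with UserMemoryUoW() as uow:
        user = uow.users.get(user_id)
        user.holdings[symbol] = user.holdings.get(symbol, 0) + quantity
        uow.commit()


# Seed one user, then update their holdings through the service function.
with UserMemoryUoW() as uow:
    uow.users.add("sam", User("sam"))
    uow.commit()

update_user("sam", "STEX", 10)
```

The version check is a simple form of optimistic concurrency: two units of work can both `get` the same user, but only the first to commit wins, and the second gets an error instead of silently overwriting.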
It took some refining but it did indeed work! When the `uow` is instantiated by a service, it creates a new `GenericMemoryRepository` and specifies a prefix to be added to all the keys (for the `_objects` dict and so on), meaning the `GenericMemoryRepository` can be used by any model in our domain (`Stocks` and `Users`) without worrying about key clashes. This Repository is overkill as we'll likely migrate to some other means of persistence, but it was important for me to see how a Repository and UoW would work, even if just to abstract a Python list out of `main.py` or the `Exchange` class to an interface. I struggled with the idea of not "assigning" some memory in `main.py` or the service layer.
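The key-prefix idea can be sketched in a few lines. This is a simplified stand-in that leaves out staging and versions; the constructor signature, the `_key` helper and the `prefix:id` key format are assumptions for illustration.

```python
class GenericMemoryRepository:
    _objects = {}  # one shared store for every model in the domain

    def __init__(self, prefix):
        self._prefix = prefix  # e.g. "user" or "stock"

    def _key(self, obj_id):
        # Namespacing the key keeps different models from colliding.
        return f"{self._prefix}:{obj_id}"

    def add(self, obj_id, obj):
        self._objects[self._key(obj_id)] = obj

    def get(self, obj_id):
        return self._objects.get(self._key(obj_id))


users = GenericMemoryRepository("user")
stocks = GenericMemoryRepository("stock")

users.add("1", "Sam")
stocks.add("1", "STEX")  # same id, different model, no clash
```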
While this worked, it felt like a lot of effort to manage a dictionary, and I could certainly see the appeal of using a framework that takes all this work off you instead. I decided the way to test whether this was a worthwhile endeavour was to immediately write a new Repository and UoW to access an `sqlite` database and see how badly the `Exchange` was impacted.
Half a day or so later, I'd made some changes:
- `adapters/stex_sqlite.py` defines the `SQLAlchemy` boilerplate to set up a database table and "map" it to the domain object (to allow the ORM to commit domain objects and return domain objects from queries without writing any code)
- The `stex_sqlite` adapter also provides `StexSqliteSessionFactory`, which is used to create `SQLAlchemy` sessions. It also controls the instantiation of a Singleton `_engine` that represents the database and is required for session making. I'm not sure it belongs here at the moment...
- In my `io/persistence.py`, a `StockSqliteRepository` and `StockSqliteUoW` implement the required functions to add/get and commit/rollback (respectively), using the `SQLAlchemy` session
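To show the shape of such a Repository and UoW without pulling in a dependency, here is a sketch that swaps the `SQLAlchemy` session for Python's standard-library `sqlite3` connection. That swap is deliberate and means there is no ORM mapping here: the `Stock` fields, the table layout and the hand-written SQL are all assumptions for illustration, and the real adapter described above lets `SQLAlchemy` do that work.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Stock:  # hypothetical domain object
    symbol: str
    price: float


class StockSqliteRepository:
    def __init__(self, connection):
        self._conn = connection

    def add(self, stock):
        self._conn.execute(
            "INSERT OR REPLACE INTO stocks (symbol, price) VALUES (?, ?)",
            (stock.symbol, stock.price),
        )

    def get(self, symbol):
        row = self._conn.execute(
            "SELECT symbol, price FROM stocks WHERE symbol = ?", (symbol,)
        ).fetchone()
        return Stock(*row) if row else None


class StockSqliteUoW:
    def __init__(self, connection):
        self._conn = connection
        self.stocks = StockSqliteRepository(connection)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.rollback()  # uncommitted writes are discarded on exit

    def commit(self):
        self._conn.commit()

    def rollback(self):
        self._conn.rollback()


# An in-memory database stands in for the real sqlite file.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE stocks (symbol TEXT PRIMARY KEY, price REAL)")
connection.commit()

with StockSqliteUoW(connection) as uow:
    uow.stocks.add(Stock("STEX", 1.5))
    uow.commit()

with StockSqliteUoW(connection) as uow:
    uow.stocks.add(Stock("LOST", 9.9))  # never committed, rolled back on exit
```

The interface is the same `add`/`get` plus `commit`/`rollback` as the in-memory version; only the bodies know that a database is involved.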
What about for the service? I was pretty stunned to discover the pattern really does work! All I needed to change was the `StockMemoryUoW` to `StockSqliteUoW`!
Suddenly a few things clicked for me. We can take advantage of the benefits of incredible software like `SQLAlchemy`, but we don't have to let it into the inner circle of our application. In this case, the Repository (and UoW) allows us to take advantage of the bits of `SQLAlchemy` that we want, without letting our application in on the secret. Indeed, as far as the `Exchange` was concerned, the objects could still be in a class attribute in our `GenericMemoryRepository`. Incredibly, I could pick one or the other at start-up and the application just worked either way.
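Here is a compact sketch of what "pick one or the other at start-up" can look like. Everything in it is a simplified assumption: the `Exchange` methods `list_stock` and `price_of` are hypothetical, prices are stored as plain floats, and the sqlite variant uses the standard-library `sqlite3` module rather than `SQLAlchemy`. The point is only that the service is handed a UoW class and never learns which one it got.

```python
import sqlite3


class MemoryStocks:
    committed = {}  # shared store, like the class-attribute dictionary

    def __init__(self):
        self.staged = {}

    def add(self, symbol, price):
        self.staged[symbol] = price

    def get(self, symbol):
        return self.staged.get(symbol, self.committed.get(symbol))


class StockMemoryUoW:
    def __init__(self):
        self.stocks = MemoryStocks()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.rollback()

    def commit(self):
        MemoryStocks.committed.update(self.stocks.staged)
        self.stocks.staged.clear()

    def rollback(self):
        self.stocks.staged.clear()


class SqliteStocks:
    def __init__(self, conn):
        self.conn = conn

    def add(self, symbol, price):
        self.conn.execute(
            "INSERT OR REPLACE INTO stocks VALUES (?, ?)", (symbol, price)
        )

    def get(self, symbol):
        row = self.conn.execute(
            "SELECT price FROM stocks WHERE symbol = ?", (symbol,)
        ).fetchone()
        return row[0] if row else None


class StockSqliteUoW:
    # One shared in-memory database, standing in for the singleton engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stocks (symbol TEXT PRIMARY KEY, price REAL)")

    def __init__(self):
        self.stocks = SqliteStocks(self.conn)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.rollback()

    def commit(self):
        self.conn.commit()

    def rollback(self):
        self.conn.rollback()


class Exchange:
    def __init__(self, uow_factory):
        self._uow_factory = uow_factory  # the only persistence detail it holds

    def list_stock(self, symbol, price):
        with self._uow_factory() as uow:
            uow.stocks.add(symbol, price)
            uow.commit()

    def price_of(self, symbol):
        with self._uow_factory() as uow:
            return uow.stocks.get(symbol)


# The same service code runs unchanged against either backend.
for uow_factory in (StockMemoryUoW, StockSqliteUoW):
    exchange = Exchange(uow_factory)
    exchange.list_stock("STEX", 1.5)
    assert exchange.price_of("STEX") == 1.5
```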
This might seem like a long post to say "hey design patterns are pretty good you know", but it's a little more than that to me. Throughout my programming career so far, I'd always thought "less code is clean code". I don't see interfaces so much in Python code (compared to say, Java), and abstractions about storage and the like are quite "enterprise-y" and rarely seen in scientific programming paradigms. All this boilerplate to hold `SQLAlchemy` (or indeed, a simple Python dict) at arm's length is completely contrary to how I've worked in the past, and yet, the STEX2 server could swap to another ORM or persistence layer tomorrow with nothing more than a bit of grunt work to set up an appropriate Repository, and the application is none the wiser. It seems that more code, not less code, is the recipe for an adaptable system.