refactor PostgresClient #71

geospatial-jeff · 2021-01-19T02:13:27Z

No description provided.

geospatial-jeff · 2021-01-20T03:39:40Z

For context, the initial PostgreSQL application logic (PostgresCoreClient et al.) was written ~1 year ago in a different repo and was only lightly refactored to accommodate the API layer, which was added when this project was open sourced (#1). The original application used a FastAPI dependency to inject the required resources and code into the application.

There are some problems with the current implementation which are blocking development on other things (notably #57 #58 #64) - in no particular order:

PostgresClient is overloaded. It defines methods for database operations like committing, checking if a row exists, etc., but these are very opinionated and don't always work for our various use cases. Because so much functionality lives in this one class, we use it everywhere, even when it doesn't make much sense to do so. It's also heavily abstracted, which makes development hard.
We are using dataclass when we really need to be using attrs, mostly because attrs is more flexible when it comes to defining optional and optionally required attributes.
Defaulting to the sqlalchemy models defined by the library is weird and causes some unintuitive behavior, especially when writing certain test cases (see "test use of custom sqlalchemy model", #70). It is also poor separation of concerns.
I'm really on the fence about allowing sqlalchemy models to be instance attributes of a class. I'm not sure it's bad, but I think there is a better way of doing it.
The separation of pagination logic from the core is also very abstracted, and should be brought into core anyway to align with the spec (see "remove PaginationClient, support paging in core", #67).
Coupling of resource management and application logic makes "decouple backends from api layer" (#57) a nightmare.
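
To illustrate the dataclass vs. attrs point above, here is a minimal sketch of what "optional and optionally required attributes" could look like with attrs. The class name `SearchOptions` and all of its fields are hypothetical and are not taken from this repo; the reading of "optionally required" as "required only when another attribute is set" is also an assumption.

```python
from typing import Optional

import attr


@attr.s(auto_attribs=True)
class SearchOptions:
    """Hypothetical options object; names are illustrative only."""

    # Always required.
    table: str
    # Optional, with a plain default.
    limit: int = 10
    # Optional, but type-checked whenever a value is supplied.
    token: Optional[str] = attr.ib(
        default=None,
        validator=attr.validators.optional(attr.validators.instance_of(str)),
    )
    # Optional on its own.
    sort_direction: Optional[str] = None
    # "Optionally required": must be given whenever sort_direction is given.
    sort_field: Optional[str] = attr.ib(default=None)

    @sort_field.validator
    def _require_field_with_direction(self, attribute, value):
        # Validators run after all attributes are assigned in __init__,
        # so self.sort_direction is already available here.
        if self.sort_direction is not None and value is None:
            raise ValueError("sort_field is required when sort_direction is set")
```

Expressing the same conditional requirement with a standard library dataclass would typically mean hand-writing the checks in `__post_init__`.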

What I'm going for with the refactor:

Separation of resource management from application logic. This blog post has an interesting implementation using a mixin; I'm not sure we will take the same approach, but it's good food for thought. Sqlalchemy models, database configuration, etc. are owned by the api layer and passed down to the application logic, preferably in a separate class which is responsible for resource management across the various backends.
Don't worry about DRY. I'd rather have each database operation manage its own commits and deduplicate code as needed in the future. This should make the codebase much simpler, while making it obvious where abstraction of database operations really adds value.
Use the newest version of sqlalchemy with support for async (see "upgrade sqlalchemy to 1.4 for async/await", #64).
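
The first two goals above could be sketched roughly as follows. This is not the refactor itself: it uses the standard library `sqlite3` module as a stand-in for sqlalchemy and Postgres so that it runs anywhere, and `ConnectionManager`, `ItemOperations`, and the table layout are hypothetical names chosen for illustration.

```python
import sqlite3
from contextlib import contextmanager


class ConnectionManager:
    """Hypothetical resource manager: owns the connection and its lifecycle."""

    def __init__(self, database: str) -> None:
        self._conn = sqlite3.connect(database)

    @contextmanager
    def transaction(self):
        """Yield the connection; commit on success, roll back on any error."""
        try:
            yield self._conn
            self._conn.commit()
        except Exception:
            self._conn.rollback()
            raise

    def close(self) -> None:
        self._conn.close()


class ItemOperations:
    """Application logic: resources and table name are passed in by the caller."""

    def __init__(self, resources: ConnectionManager, table: str) -> None:
        self.resources = resources
        # The table name is interpolated into SQL below, so it must come
        # from trusted configuration, never from user input.
        self.table = table

    def create_table(self) -> None:
        with self.resources.transaction() as conn:
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {self.table} "
                "(id TEXT PRIMARY KEY, body TEXT)"
            )

    def create_item(self, item_id: str, body: str) -> None:
        # Each operation manages its own commit instead of sharing a helper.
        with self.resources.transaction() as conn:
            conn.execute(
                f"INSERT INTO {self.table} (id, body) VALUES (?, ?)",
                (item_id, body),
            )

    def get_item(self, item_id: str):
        with self.resources.transaction() as conn:
            cursor = conn.execute(
                f"SELECT id, body FROM {self.table} WHERE id = ?", (item_id,)
            )
            return cursor.fetchone()
```

In this arrangement the api layer would construct the `ConnectionManager` and decide which table (or model) to use, then hand both to the application logic, which never opens or configures a connection itself.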

I do think the base clients which define the interfaces are good; they follow the spec, which is all that matters. It's just that our postgres/canonical implementation of this interface is currently not that great.

This was referenced Jan 20, 2021

switch to attrs #72 (Closed)

decouple database connection #74 (Merged)

moradology closed this as completed Dec 7, 2021
