Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add endpoint for creating objects (fixes #178) #188

Merged
merged 17 commits into from
Jul 26, 2021

Conversation

DavidMStraub
Copy link
Member

Here is my first try at #178.

I added an endpoint /api/objects/ for POSTing a list of objects. You can check this example in the unit tests what it would look like for a person and the associated birth event.

Objects are validated against the Gramps objects' jsonschema and 400 is returned if they don't conform.

What is slightly annoying from a documentation point of view is that the schema for the posted objects is slightly different from the schema of the objects we return on the GET endpoints. But I think it makes sense to do it this way.

This is not ready to merge yet, but comments highly welcome.

@DavidMStraub DavidMStraub marked this pull request as ready for review July 17, 2021 11:58
@DavidMStraub
Copy link
Member Author

This is ready for review!

Apart from the /api/objects/ endpoint that allows to POST multiple, possibly interrelated, objects, I now also added POST to all the individual primary object endpoints, e.g. a person can be added by posting to /api/people/.

I also added the documentation and it turns out the concern in my first comment above is not relevant as properties can be explicitly specified as readOnly, i.e. they will only appear in GET requests but must not appear in POST.

Comments:

  • extended and profile are read-only properties, i.e. they must not appear in a POST
  • _class is not needed for the individual object endpoints but is mandatory for /api/objects/
  • Specifying a handle is optional (but useful particularly for /api/objects/ when posting related objects). Adding an object with a handle that already exists will fail
  • I believe setting change has no effect as it is being overwritten by Gramps (which makes sense)

There is just one open question: what to do about referenced handles that do not exist? As far as I can see, Gramps currently does not check whether any of the handles in lists like event_ref_list, citation_list, etc. exist or not when adding an object to the DB. This is probably OK for Gramps desktop which can make sure that these functions are only called from dialogues that select existing objects as references, but the question is if we should also leave this resposibility in the frontend or whether we should explicitly check the existence of each referenced handle.

@Nick-Hall
Copy link
Member

The backend doesn't enforce referential integrity. Instead a HandleError is raised when attempting to access a non-existent reference. It's probably OK to do the same in the web-api.

@DavidMStraub DavidMStraub requested a review from cdhorn July 17, 2021 19:33
Copy link

@lenzls lenzls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey,
I gave a review a try. There are only a few superficial comments, and another look by someone who really knows the code is certainly still necessary.

All in all, it looks good to me. Surprisingly little code was needed!

# Gramps Web API - A RESTful API for the Gramps genealogy program
#
# Copyright (C) 2020 David Straub
# Copyright (C) 2020 Christopher Horn
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy and paste date?

return add_method(obj, trans)
except AttributeError:
raise ValueError("Database does not support writing.")
raise ValueError("Unexpected object type.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this line dead?

It looks like we either already returned, or some error is already raised at this point.

description: "Bad Request: Malformed request could not be parsed."
401:
description: "Unauthorized: Missing authorization header."
422:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this thrown?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

422 is currently not thrown at all, true. I left it in because we will need it in case we add some query parameters in the future (although currently I cannot think of any...)

obj_dict = rv.json
self.assertEqual(obj_dict["desc"], obj["desc"])
# test referring to non-existing objects
# handle = make_handle()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I should remove this. I commented it out because it didn't fail as expected - as per Nicks's comment #188 (comment) this is expected as Gramps doesn't enforce referntial integrity.

rv = self.client.post("/api/notes/", json=obj, headers=headers)
self.assertEqual(rv.status_code, 201)
rv = self.client.get(f"/api/notes/{handle}", headers=headers)
self.assertEqual(rv.status_code, 200)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All other tests also verify the content to some extent. Was that intentionally left out here and in test_objects_add_note?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not intentional!

sqlauth.create_table()
sqlauth.add_user(name="user", password="123", role=ROLE_GUEST)
sqlauth.add_user(name="admin", password="123", role=ROLE_OWNER)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to also have a test for creating multiple objects in one call.

Especially in the case of [<good_obj>, <bad_obj>], to see that the transaction roll-back works as promised.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! Added one test for now.

@DavidMStraub
Copy link
Member Author

@lenzls thanks for the review!

@DavidMStraub DavidMStraub removed the request for review from cdhorn July 18, 2021 15:08
Copy link
Collaborator

@cdhorn cdhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick run through, seems to look pretty good to me. Nice work!

):
"""Commit a modified Gramps object to the database.

Failes with a ValueError if the object with this handle does not
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling...

@DavidMStraub
Copy link
Member Author

Thanks! Will wait with merging this until #195 is done, so we can have a tagged image before adding more features.

@DavidMStraub
Copy link
Member Author

While working on #193, I realized a fundamental flaw in my approach so far: we are opening the database without locking and setting db_handle.readonly = True, which was fine so far as everything was read-only as far as the database is concerned. But this is no longer the case with POST. The reason this was working anyhow is that Gramps' add_... methods don't check the readonly flag. Turns out the remove_... methods do.

I now added a new argument to properly open the database in writable mode. However, I'm also just naively locking it by creating a lock file. This will not work in general as there is no handling of a request that happens while another request has locked the DB.

My feeling is that we can just skip the lock step altogether, as the SQLite (or Postgres) DB should handle concurrent requests just fine. But I have to think about it more throughly.

@DavidMStraub
Copy link
Member Author

OK, done thinking more throroughly.

Writing a lock vs. checking for a lock. We definitely have to check for a lock, because we never ever want a web API process that someone starts locally on their computer to start adding or deleting objects in a database that is simultaneously opened in a desktop Gramps. But we already fail when opening a DB that is locked, so this is safe. That doesn't mean that we must write a lock though. Every API call that modifies the database (POST, PUT, DELETE) has their own database connection and Gramps database transaction that is commited once at the very end. Even with multiple users editing the database, write conflicts are very unlikely.

Concurrent writes in SQLite. Writing is always serial in SQLite with default settings, so the worst thing that can ever happen when many users are writing simultaneously to the DB is that some of them get a "database is locked" error, which for applications will manifest as a 500 Server Error, which needs to be handled anyway as it can happen for a number of reasons.

Concurrent writes in PostgreSQL. Here, the situation is slightly more tricky as there are different isolation levels that behave differently when two requests happen to commit a writing transaction exactly at the same time and modifying the same objects. But writing a lock file to the file system that prevents any concurrent write, even to totally disjoint rows in the database, is not reasonable.

So, my conclusion is that we should never use a lock file but leave database locking to the database itself. This is implemented in this PR now and from my side can be merged.

@Nick-Hall
Copy link
Member

Writing a lock file will prevent someone opening the database with the desktop GUI. We don't want a GUI session and the web API writing to the same database.

@DavidMStraub
Copy link
Member Author

Writing a lock file will prevent someone opening the database with the desktop GUI. We don't want a GUI session and the web API writing to the same database.

I think it's not a problem for the database or the web API if a desktop Gramps writes to it, as long as the web API is read-only. In fact, with the current read-only web API, I often have desktop Gramps open at the same time to modify things and see how the web app changes, and I never had problems. Since every API request opens a new (read-only) database connection, it doesn't matter if changes happen between requests.

The real issue would only be if the web API starts writing as well, as Gramps desktop does not expect simultaneous database changes. However this can never happen with the current design: as soon as a Gramps desktop opens the database, it writes a lock file. From this point in time, every attempt by a web API instance to obtain write access on the DB will fail, as we check for the lock (even if we don't write a lock ourselves).

"""Open the database and return a dbstate instance.

If `lock` is `False`, will not write a lock file (use with care!).
If `readonly` is `True` (default), write operations will fail (note,
this is not enforced by Gramps but must be taken care of in Web
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by enforced here?
Surely a write action on a read-only database will raise some kind of error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that self.readonly is a flag on DbGeneric (the base class for database implementations) that is set to True when the database file is found to be read-only; it is also always true on a ProxyDbBase (such as the PrivateProxyDb we use for users who don't have ViewPrivate permissions). However, setting self.readonly = True on a DbGeneric child does not mean that write operations will fail. (Actually, for children of DBAPI, deleting objects will silently fail as the attribute is checked in the _do_remove method.)

I'm not saying that this is a problem in Gramps - the thing is just that when we use get_db_handle(readonly=True), it should be made clear that this sets the readonly flag, but that is not enough to prevent write operations - this is why the flag is explicitly checked and an error is raised in the add_object and update_object functions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Sadly, I'm still a little confused about the database layer and its components, but that is fine.

I just wanted to make sure that you don't have to check manually the read-only flag on every write action in the codebase, but rather it is checked in an underlying shared method which centralized this check. Otherwise, it sounds error-prone.

@DavidMStraub
Copy link
Member Author

Any objections to merging this?

@cdhorn
Copy link
Collaborator

cdhorn commented Jul 26, 2021

None from me.

@DavidMStraub DavidMStraub merged commit a75de35 into gramps-project:master Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants