-
Notifications
You must be signed in to change notification settings - Fork 697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a special deleted
journalist user for associations with deleted users
#6192
Comments
deleted
journalist user to associate deleted contentdeleted
journalist user for associations with deleted users
I think there are two approaches, but first here are some notes (recap): Notes
Option A
Option B
|
I looked into this a bit, my personal preference would be to have the placeholder "deleted" user (Option A), just so we can rely on journalist_id being type A bit more concrete proposal: Rather than calling The migration step would basically do the same thing, look for NULL journalist_ids, either assigning them to the "deleted" user or deleting them. Another option is to set an event to listen for deletion attempts (https://docs.sqlalchemy.org/en/13/orm/session_events.html), and intercept them to update the seen/reply models to point to "deleted". This seems more much more complex. Also the system was changed in 1.4, so we might have to update/migrate it when upgrading sqlalchemy. |
Thanks for joining the conversation @legoktm! Option A is part of what I've been advocating for for at least a year to resolve downstream issues (the other part being: #5467). The reason I mention Option B is that it allows us to avoid the data migration and to continue relying on the DB to null out the journalist IDs upon account deletion. It is worth considering in case more issues come up with Option A. Either way, Read Receipts is partially unblocked, so I'll head over that way while you work on this Github Issue and iron out even more details with @zenmonkeykstop. |
I pushed my current WIP to the "delete-journalist" branch, so far it lazily creates a "deleted" journalist and reassigns the seen rows to it upon deletion (or so I think, I'll do more testing tomorrow). |
From what I can tell, nearly all the calls to db.session.flush() have been introduced so that a newly created object's autoincrement ID can be referenced in a second object to be created, most often this is creating a Reply, and then a corresponding SeenReply. However, if we just pass the Reply object into SeenReply, rather than the id (which is None if it hasn't been flushed), SQLAlchemy takes care of it all for us, properly assigning the correct autoincrement ID using the specified relationship. The motivation for doing this is that JournalistLoginAttempt in loaddata was not properly flushing and was being assigned to a journalist_id=NULL, which will soon be an error as part of #6192. Rather than stick another flush in to fix that case, we can just get rid of nearly all of them. There is still one explicit flush() left in loaddata.py's record_source_interaction(), but that's a different pattern than these so I've left it alone for now.
…users Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
…users Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into and has no passphase set. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username and passphrase validation. Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using a special DeletedJournalist subclass that bypasses username validation. For consistency reasons, a randomly generated diceware passphrase and TOTP secret are set in the database for this account, but never revealed to anyone. Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using `Journalist.get_deleted()`. For consistency reasons, a randomly generated diceware passphrase and TOTP secret are set in the database for this account, but never revealed to anyone. We use a placeholder username to bypass Journalist's normal validation check that the username is not "deleted". Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using `Journalist.get_deleted()`. For consistency reasons, a randomly generated diceware passphrase and TOTP secret are set in the database for this account, but never revealed to anyone. We use a placeholder username to bypass Journalist's normal validation check that the username is not "deleted". Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using `Journalist.get_deleted()`. For consistency reasons, a randomly generated diceware passphrase and TOTP secret are set in the database for this account, but never revealed to anyone. We use a placeholder username to bypass Journalist's normal validation check that the username is not "deleted". Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
As discussed in #5467 and #5503, we currently delete rows for deleted journalist users. After deletion,
journalist_id
will beNULL
in other tables that reference the deleted user (e.g.,replies
and theseen
tables; see the database diagram). The API will return the usernamedeleted
and the UUIDdeleted
wherever these users are referenced.There are a few issues with this:
/users
endpoint of the API to get a complete list of users that an API consumer may need to associate data withIn #5467, we considered preserving a record for each user, while removing all data about the user. This may be desirable in the long run (alongside features such as user locking: #3926), but even if we do this, we will need a migration for historical data. The solution we've settled on for now is to create a global
deleted
user. A database migration will need to a) create this user, b) associate records currently associated withNULL
users with this new user.This puts us in a good position to make other changes like #5467 in future, and unblocks further work on SecureDrop Client features that rely on journalist attribution.
The text was updated successfully, but these errors were encountered: