-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721
Conversation
Adds __repr__ and __eq__ for DataItem, further adjusts interfaces.
The tests exposed that DataItem was too generic, and required too many run-time checks to ensure it was used correctly. Instead, factor to a hierarchy with more constraints.
Fix users whose primary email was not set from their FxA data.
Be more consistant with Item suffix, and save ReportEntry for new data structure.
This method collects the relevant data items, injects CleanedItem records, and sorts in print order along with the data needed to generate reports. It works in some corner cases that the markdown_report() method would print in the wrong order.
All the current clean_users, clean_relay_addresses, etc. just used item.get_queryset(), so pass this instead of the DataItem.
7fe3563
to
d4513d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After I cleared a user's email and ran ./manage.py cleanup_data -v 2 --missing-email --clean
I saw this ...
{"Timestamp": 1717186451021793024, "Type": "eventsinfo.cleanup_data", "Logger": "fx-private-relay", "Hostname": "groovetop", "EnvVersion": "2.0", "Severity": 6, "Pid": 19675, "Fields": {"cleaned": true, "timers": {"query_s": 0.008, "clean_s": 0.001}, "tasks": {"missing-email": {"counts": {"summary": {"ok": 3, "needs_cleaning": 0, "cleaned": 0}, "users": {"all": 4, "email": 3, "!email": 1, "!email.fxa": 0, "cleaned": 0}}, "timers": {"query_s": 0.008, "clean_s": 0.001}}}, "rid": null, "msg": "cleanup_data complete, cleaned 0 of 0 issues."}}
# Summary
| task |issues|fixed|query (s)|fix (s)|
|------------:|-----:|----:|--------:|------:|
|missing-email| 0| 0| 0.008| 0.001|
| _Total_| 0| 0| 0.008| 0.001|
# Details
## missing-email
When User.email is empty, we are unable to forward email to the Relay user. We
can get the email from the FxA profile if available.
Detected 0 issues in 0.008 seconds.
Cleaned 0 issues in 0.001 seconds.
Users:
All: 4
Email : 3 ( 75.0%)
No Email: 1 ( 25.0%)
Has FxA: 0 ( 0.0%)
And the user's email address was not restored? 😢
It restores the email by copying it from the FxA profile, and the user without an email doesn't have an FxA profile ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of all the refactoring is to make these easier to read and write, and require less repetitive code.
suggestion (non-blocking): Maybe it’s because I’ve been gone for a week, but this refactored data cleaner code doesn’t seem easier to read? Or at least, not for a "first-time" reader of the code. I had to debug it and to find the queryset filter()
call that was failing, I had to navigate thru MissingEmailCleaner
, CleanerTask
, DataModelSpec
, DataBisectSpec
, BaseDataItem
, etc. ... and I'm still not sure I understand it all.
Do you remember which parts of the refactor were the ones you meant specifically for read- and write-ability? Any chance of teasing those apart? The refactor that was meant to reduce code duplication seems to move a couple hundred lines of duplicate code into 800 lines of more abstract code?
privaterelay/cleaners.py
Outdated
model=User, | ||
subdivisions=[ | ||
DataBisectSpec("email", ~Q(email__exact="")), | ||
DataBisectSpec("!email.fxa", Q(socialaccount__provider="fxa")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hard-coded fxa
here means that the application must be set up with "fxa" as the name of the SocialAccount
SocialApp
name or this code will fail to find the user's FXA, which caused my error with the spot-check.
suggestion (blocking): either change this value to check for something like Q(socialaccount__provider=SocialAccount.SocialApp.get(provider="fxa").name)
(not exactly sure of the syntax), or let's update the README directions to always use "fxa" for the provider id+name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell me more about the bug you were seeing? The query Q(socialaccout_provider="fxa")
means what you are suggesting - the SocialAccount
where provider=fxa
, not the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I might be getting SocialAccount
and SocialApplication
mixed up - I'll dig in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm not. SocialApp
has a provider that is fxa
for Firefox Accounts, and SocialAccount
should have provider set to fxa
as well. What is SocialAccount.provider
for the test user that wasn't working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SocialAccount.provider
for the user was accounts.stage.mozaws.net
when it wasn't working, which seems to be copied from SocialApplication
at creation-time ...
Test 1: SocialApplication
"fxa"
- Sign-up new user
- Check http://127.0.0.1:8000/admin/socialaccount/socialaccount/
- provider is "fxa"
- Delete the
User.email
value - Run
python manage.py cleanup_data -v 2 --missing-email --clean
- Fixes the issue! (
Cleaned: 1 (100.0%)
)
- Fixes the issue! (
Test 2: SocialApplication
"accounts.stage.mozaws.net"
- Change SocialApplication "Provider ID" and "Name" value to
accounts.stage.mozaws.net
- Sign-up new user
- Check http://127.0.0.1:8000/admin/socialaccount/socialaccount/
- provider is "accounts.stage.mozaws.net"
- Delete the
User.email
value - Run
python manage.py cleanup_data -v 2 --missing-email --clean
- Doesn't find the FXA! (
Has FxA: 0 ( 0.0%)
)
- Doesn't find the FXA! (
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks. Mine is looks like this:
The optional provider_id
was added in 0.55.0. I'll dig in and see what that is for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR - you should not be setting provider_id
, leave it blank like it is in dev, stage, and prod. However, I'll update the code to handle it.
The documentation is not very helpful. There's specific provider docs, and this doc for allauth.socialaccount.adapter.DefaultSocialAccountAdapter.get_provider()
:
Looks up a
provider
, supporting subproviders by looking up byprovider_id
.
The code is a little more helpful:
For providers that support subproviders, such as OpenID Connect and SAML, this ID identifies that instance. SocialAccount's originating from app will have their
provider
field set to theprovider_id
if available, elseprovider
.
When a provider is providing OpenID Connect or SAML, then django-allauth
has been removing the provider and using the generic provider:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest commit fixes the provider_id
issue, with tests.
The "cleaner" part is in the cleaning tasks. It went from these 127 lines: fx-private-relay/emails/cleaners.py Lines 14 to 141 in e2544bb
to these 50: fx-private-relay/emails/cleaners.py Lines 18 to 68 in 0685e1e
The retained stuff:
The new stuff:
The dropped stuff:
The altered stuff:
Do you think some more documentation would help? Or would you like it broken into smaller pieces that don't try to do so much at once? |
I think it's the number of "generic methods" calling back and forth, using more generic types & objects that made it harder to debug the specific issue I was having. |
@@ -84,10 +92,13 @@ def test_missing_email_cleaner_data_to_clear() -> None: | |||
assert report == expected | |||
|
|||
|
|||
@pytest.mark.parametrize("provider_id", ("", "fxa", "test.example.com")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: perfect, thanks!
Some users in production have
user.email == ""
. This prevents us from forwarding them any emails.At least one also has an email in their Mozilla account data in the database. The
user.email
should have been set to theirsocialaccount.extra_data["email"]
during signup, but some other issue must have prevented it from happening, or caused it to be cleared.This PR adds a cleaner
MissingEmailCleaner
to the cleaner lists, so that this issue can be fixed in production. The cleaners are run daily as a Cron Job in production and other environments, so these and future incidents will be fixed.The Refactor
The changes for MPP-3439 are small compared to the refactor of
DataIssueTask
and the derivedCleanerTask
. However, none of the tests for the existing email cleaners changed, so the refactor (hopefully!) works the same.It is probably more enlightening to read the code in
emails/cleaners.py
andprivaterelay/cleaners.py
than to try to understand all the code changes, or the explanation that follows. The purpose of all the refactoring is to make these easier to read and write, and require less repetitive code. Feel free to skip ahead to "How to Test".Previously a cleaner would derive from
CleanerTask
and define the methods:_get_counts_and_data
- Make databaseCOUNT
queries, and create a two-level dictionary of counts, a "summary" plus one for each model. Return the counts and querysets for getting the problem rows._clean
- Take the "problem row" querysets, and update the rows or iterate over the rows and fix them one at a time. Return the count of fixed rows.markdown_report
- Generate a blob of text representing the data and how many rows were cleaned.After the refactor, these functions have been generalized and moved up into the parent classes. Because there is quite a lot of shared code, it has been moved to
privaterelay/cleaner_task.py
, leavingprivaterelay/cleaners.py
for the newMissingEmailCleaner
.After the refactor, a data cleaner derives from
CleanerTask
and defines adata_specification
with two models:DataModelSpec
- This represents a model, such aUser
orRelayAddress
, as well as some details for data cleaningDataBisectSpec
- A filter of a model or a further filtering, as well as the opposite of that filter. For example,User.objects.filter(is_active=True)
andUser.objects.exclude(is_active=True)
bisects theUser
table into two sets of rows. All the current cleaning operations can be represented as a series of filters that bisect the rows until the problem rows, if any, remain.The data cleaner class also defines functions with the pattern
clean_<plural model_name>
. These are passed a queryset representing the problem rows, if any. The function fixes the issues if it can, and returns a count of the issues it fixed.The
CleanerTask
uses some new classes (ReportItem
,CleanedItem
,BaseDataItem
,DataModelItem
, andDataItem
) to implement querying, generating counts, running cleaning (in cooperation with theclean_x
methods) and creating reports. These mostly hold and validate data, and are thoroughly tested.The unused
DetectorTask
is gone. The base classDataIssueTask
could do the same, or be extended if we ever want to use this framework to explore the data but not fix it.Other changes:
verbose_name_plural
forRelayAddress
andDomainAddress
. This changes them from "addresss" to "addresses" in the Django admin, and also means it can be used inDataTask
where a dictionary key is needed."raise NotImplementedError"
to the coverage ignore list, so@abstractmethod
functions do not appear as uncovered.How to test:
In your local environment, ensure you have at least one staff user (can access the Django admin) and one regular user (logs in with Mozilla Accounts).
As the staff user, go to http://127.0.0.1:8000/admin/auth/user/
On this branch, run:
The output contains a log entry and a text report, and is something like:
As the staff user, refresh the user list at http://127.0.0.1:8000/admin/auth/user/ . The regular user should have their email again.
As a bonus, run all the cleaners: