MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721

jwhitlock · 2024-05-22T22:47:31Z

Some users in production have user.email == "". This prevents us from forwarding them any emails.

At least one also has an email in their Mozilla account data in the database. The user.email should have been set to their socialaccount.extra_data["email"] during signup, but some other issue must have prevented it from happening, or caused it to be cleared.

This PR adds a cleaner MissingEmailCleaner to the cleaner lists, so that this issue can be fixed in production. The cleaners are run daily as a Cron Job in production and other environments, so these and future incidents will be fixed.

The Refactor

The changes for MPP-3439 are small compared to the refactor of DataIssueTask and the derived CleanerTask. However, none of the tests for the existing email cleaners changed, so the refactor (hopefully!) works the same.

It is probably more enlightening to read the code in emails/cleaners.py and privaterelay/cleaners.py than to try to understand all the code changes, or the explanation that follows. The purpose of all the refactoring is to make these easier to read and write, and require less repetitive code. Feel free to skip ahead to "How to Test".

Previously a cleaner would derive from CleanerTask and define the methods:

_get_counts_and_data - Make database COUNT queries, and create a two-level dictionary of counts, a "summary" plus one for each model. Return the counts and querysets for getting the problem rows.
_clean - Take the "problem row" querysets, and update the rows or iterate over the rows and fix them one at a time. Return the count of fixed rows.
markdown_report - Generate a blob of text representing the data and how many rows were cleaned.

After the refactor, these functions have been generalized and moved up into the parent classes. Because there is quite a lot of shared code, it has been moved to privaterelay/cleaner_task.py, leaving privaterelay/cleaners.py for the new MissingEmailCleaner.

After the refactor, a data cleaner derives from CleanerTask and defines a data_specification with two models:

DataModelSpec - This represents a model, such a User or RelayAddress, as well as some details for data cleaning
DataBisectSpec - A filter of a model or a further filtering, as well as the opposite of that filter. For example, User.objects.filter(is_active=True) and User.objects.exclude(is_active=True) bisects the User table into two sets of rows. All the current cleaning operations can be represented as a series of filters that bisect the rows until the problem rows, if any, remain.

The data cleaner class also defines functions with the pattern clean_<plural model_name>. These are passed a queryset representing the problem rows, if any. The function fixes the issues if it can, and returns a count of the issues it fixed.

The CleanerTask uses some new classes (ReportItem, CleanedItem, BaseDataItem, DataModelItem, and DataItem) to implement querying, generating counts, running cleaning (in cooperation with the clean_x methods) and creating reports. These mostly hold and validate data, and are thoroughly tested.

The unused DetectorTask is gone. The base class DataIssueTask could do the same, or be extended if we ever want to use this framework to explore the data but not fix it.

Other changes:

Added verbose_name_plural for RelayAddress and DomainAddress. This changes them from "addresss" to "addresses" in the Django admin, and also means it can be used in DataTask where a dictionary key is needed.
Added "raise NotImplementedError" to the coverage ignore list, so @abstractmethod functions do not appear as uncovered.

How to test:

In your local environment, ensure you have at least one staff user (can access the Django admin) and one regular user (logs in with Mozilla Accounts).

As the staff user, go to http://127.0.0.1:8000/admin/auth/user/

From the user list, select your regular user.
From the user detail page, clear the email then click the "Save" button.
On the user list, confirm the regular user's email is clear.

On this branch, run:

./manage.py cleanup_data -v 2 --missing-email --clean

The output contains a log entry and a text report, and is something like:

{"Timestamp": 1716414385805936896, "Type": "eventsinfo.cleanup_data", "Logger": "fx-private-relay", "Hostname": "jwhitlock-MB-M2.lan", "EnvVersion": "2.0", "Severity": 6, "Pid": 17036, "Fields": {"cleaned": true, "timers": {"query_s": 0.002, "clean_s": 0.002}, "tasks": {"missing-email": {"counts": {"summary": {"ok": 5, "needs_cleaning": 1, "cleaned": 1}, "users": {"all": 7, "email": 5, "!email": 2, "!email.fxa": 1, "cleaned": 1}}, "timers": {"query_s": 0.002, "clean_s": 0.002}}}, "rid": null, "msg": "cleanup_data complete, cleaned 1 of 1 issue."}}

# Summary

|    task     |issues|fixed|query (s)|fix (s)|
|------------:|-----:|----:|--------:|------:|
|missing-email|     1|    1|    0.002|  0.002|
|      _Total_|     1|    1|    0.002|  0.002|

# Details
## missing-email
When User.email is empty, we are unable to forward email to the Relay user. We
can get the email from the FxA profile if available.

Detected 1 issue in 0.002 seconds.
Cleaned 1 issue in 0.002 seconds.

Users:
  All: 7
    Email   : 5 ( 71.4%)
    No Email: 2 ( 28.6%)
      Has FxA: 1 ( 50.0%)
        Cleaned: 1 (100.0%)

As the staff user, refresh the user list at http://127.0.0.1:8000/admin/auth/user/ . The regular user should have their email again.

As a bonus, run all the cleaners:

./manage.py cleanup_data -v 2 --clean

I've added a unit test to test for potential regressions of this bug.
Commits in this PR are minimal and have descriptive commit messages.

Adds __repr__ and __eq__ for DataItem, further adjusts interfaces.

The tests exposed that DataItem was too generic, and required too many run-time checks to ensure it was used correctly. Instead, factor to a hierarchy with more constraints.

Fix users whose primary email was not set from their FxA data.

Be more consistant with Item suffix, and save ReportEntry for new data structure.

This method collects the relevant data items, injects CleanedItem records, and sorts in print order along with the data needed to generate reports. It works in some corner cases that the markdown_report() method would print in the wrong order.

All the current clean_users, clean_relay_addresses, etc. just used item.get_queryset(), so pass this instead of the DataItem.

groovecoder

After I cleared a user's email and ran ./manage.py cleanup_data -v 2 --missing-email --clean I saw this ...

{"Timestamp": 1717186451021793024, "Type": "eventsinfo.cleanup_data", "Logger": "fx-private-relay", "Hostname": "groovetop", "EnvVersion": "2.0", "Severity": 6, "Pid": 19675, "Fields": {"cleaned": true, "timers": {"query_s": 0.008, "clean_s": 0.001}, "tasks": {"missing-email": {"counts": {"summary": {"ok": 3, "needs_cleaning": 0, "cleaned": 0}, "users": {"all": 4, "email": 3, "!email": 1, "!email.fxa": 0, "cleaned": 0}}, "timers": {"query_s": 0.008, "clean_s": 0.001}}}, "rid": null, "msg": "cleanup_data complete, cleaned 0 of 0 issues."}}

# Summary

|    task     |issues|fixed|query (s)|fix (s)|
|------------:|-----:|----:|--------:|------:|
|missing-email|     0|    0|    0.008|  0.001|
|      _Total_|     0|    0|    0.008|  0.001|

# Details
## missing-email
When User.email is empty, we are unable to forward email to the Relay user. We
can get the email from the FxA profile if available.

Detected 0 issues in 0.008 seconds.
Cleaned 0 issues in 0.001 seconds.

Users:
  All: 4
    Email   : 3 ( 75.0%)
    No Email: 1 ( 25.0%)
      Has FxA: 0 (  0.0%)

And the user's email address was not restored? 😢

jwhitlock · 2024-06-03T14:02:56Z

Users:
All: 4
Email : 3 ( 75.0%)
No Email: 1 ( 25.0%)
Has FxA: 0 ( 0.0%)
And the user's email address was not restored? 😢

It restores the email by copying it from the FxA profile, and the user without an email doesn't have an FxA profile (Has FxA: 0). Sorry, "Regular User" meant a user with an FxA profile.

groovecoder

The purpose of all the refactoring is to make these easier to read and write, and require less repetitive code.

suggestion (non-blocking): Maybe it’s because I’ve been gone for a week, but this refactored data cleaner code doesn’t seem easier to read? Or at least, not for a "first-time" reader of the code. I had to debug it and to find the queryset filter() call that was failing, I had to navigate thru MissingEmailCleaner, CleanerTask, DataModelSpec, DataBisectSpec, BaseDataItem, etc. ... and I'm still not sure I understand it all.

Do you remember which parts of the refactor were the ones you meant specifically for read- and write-ability? Any chance of teasing those apart? The refactor that was meant to reduce code duplication seems to move a couple hundred lines of duplicate code into 800 lines of more abstract code?

groovecoder · 2024-06-11T18:01:11Z

privaterelay/cleaners.py

+            model=User,
+            subdivisions=[
+                DataBisectSpec("email", ~Q(email__exact="")),
+                DataBisectSpec("!email.fxa", Q(socialaccount__provider="fxa")),


The hard-coded fxa here means that the application must be set up with "fxa" as the name of the SocialAccount SocialApp name or this code will fail to find the user's FXA, which caused my error with the spot-check.

suggestion (blocking): either change this value to check for something like Q(socialaccount__provider=SocialAccount.SocialApp.get(provider="fxa").name) (not exactly sure of the syntax), or let's update the README directions to always use "fxa" for the provider id+name.

Can you tell me more about the bug you were seeing? The query Q(socialaccout_provider="fxa") means what you are suggesting - the SocialAccount where provider=fxa, not the name.

Ah, I might be getting SocialAccount and SocialApplication mixed up - I'll dig in.

No, I'm not. SocialApp has a provider that is fxa for Firefox Accounts, and SocialAccount should have provider set to fxa as well. What is SocialAccount.provider for the test user that wasn't working?

SocialAccount.provider for the user was accounts.stage.mozaws.net when it wasn't working, which seems to be copied from SocialApplication at creation-time ...

Test 1: SocialApplication "fxa"

Sign-up new user

Check http://127.0.0.1:8000/admin/socialaccount/socialaccount/

provider is "fxa"

Delete the User.email value

Run python manage.py cleanup_data -v 2 --missing-email --clean

Fixes the issue! (Cleaned: 1 (100.0%))

Test 2: SocialApplication "accounts.stage.mozaws.net"

Change SocialApplication "Provider ID" and "Name" value to accounts.stage.mozaws.net

Sign-up new user

Check http://127.0.0.1:8000/admin/socialaccount/socialaccount/

provider is "accounts.stage.mozaws.net"

Delete the User.email value

Run python manage.py cleanup_data -v 2 --missing-email --clean

Doesn't find the FXA! (Has FxA: 0 ( 0.0%))

Ah, thanks. Mine is looks like this:

The optional provider_id was added in 0.55.0. I'll dig in and see what that is for.

TL;DR - you should not be setting provider_id, leave it blank like it is in dev, stage, and prod. However, I'll update the code to handle it.

The documentation is not very helpful. There's specific provider docs, and this doc for allauth.socialaccount.adapter.DefaultSocialAccountAdapter.get_provider():

Looks up a provider, supporting subproviders by looking up by provider_id.

https://docs.allauth.org/en/latest/socialaccount/adapter.html#allauth.socialaccount.adapter.DefaultSocialAccountAdapter.get_provider

The code is a little more helpful:

For providers that support subproviders, such as OpenID Connect and SAML, this ID identifies that instance. SocialAccount's originating from app will have their provider field set to the provider_id if available, else provider.

https://github.com/pennersr/django-allauth/blob/9be5ef00e8d1ea697baf046c00957f259133aae5/allauth/socialaccount/models.py#L47-L55

When a provider is providing OpenID Connect or SAML, then django-allauth has been removing the provider and using the generic provider:

https://docs.allauth.org/en/latest/socialaccount/providers/authelia.html

https://docs.allauth.org/en/latest/socialaccount/providers/cern.html

https://docs.allauth.org/en/latest/socialaccount/providers/openid_connect.html

https://docs.allauth.org/en/latest/socialaccount/providers/saml.html

The latest commit fixes the provider_id issue, with tests.

jwhitlock · 2024-06-11T21:16:54Z

The "cleaner" part is in the cleaning tasks. It went from these 127 lines:

fx-private-relay/emails/cleaners.py

Lines 14 to 141 in e2544bb

    
           class ServerStorageCleaner(CleanerTask): 
        
               slug = "server-storage" 
        
               title = "Ensure no data is stored when server_storage=False" 
        
               check_description = ( 
        
                   "When Profile.server_storage is False, the addresses (both regular and domain)" 
        
                   " should have empty data (the fields description, generated_for and used_on)." 
        
               ) 
        
               def _get_counts_and_data(self) -> tuple[Counts, CleanupData]: 
        
                   """ 
        
                   Analyze usage of the server_storage flag and server-stored data. 
        
                   Returns: 
        
                   * counts: two-level dict of row counts for Profile, RelayAddress, and 
        
                     DomainAddress 
        
                   * cleanup_data: two-element dict of RelayAddresses and DomainAddress 
        
                     queries to clear 
        
                   """ 
        
                   profiles_without_server_storage = Profile.objects.filter(server_storage=False) 
        
                   no_store_relay_addresses = RelayAddress.objects.filter( 
        
                       user__profile__server_storage=False 
        
                   ) 
        
                   no_store_domain_addresses = DomainAddress.objects.filter( 
        
                       user__profile__server_storage=False 
        
                   ) 
        
                   blank_used_on = Q(used_on="") | Q(used_on__isnull=True) 
        
                   blank_relay_data = blank_used_on & Q(description="") & Q(generated_for="") 
        
                   blank_domain_data = blank_used_on & Q(description="") 
        
                   empty_relay_addresses = no_store_relay_addresses.filter(blank_relay_data) 
        
                   empty_domain_addresses = no_store_domain_addresses.filter(blank_domain_data) 
        
                   non_empty_relay_addresses = no_store_relay_addresses.exclude(blank_relay_data) 
        
                   non_empty_domain_addresses = no_store_domain_addresses.exclude( 
        
                       blank_domain_data 
        
                   ) 
        
                   empty_relay_addresses_count = empty_relay_addresses.count() 
        
                   empty_domain_addresses_count = empty_domain_addresses.count() 
        
                   non_empty_relay_addresses_count = non_empty_relay_addresses.count() 
        
                   non_empty_domain_addresses_count = non_empty_domain_addresses.count() 
        
                   counts: Counts = { 
        
                       "summary": { 
        
                           "ok": empty_relay_addresses_count + empty_domain_addresses_count, 
        
                           "needs_cleaning": non_empty_relay_addresses_count 
        
                           + non_empty_domain_addresses_count, 
        
                       }, 
        
                       "profiles": { 
        
                           "all": Profile.objects.count(), 
        
                           "no_server_storage": profiles_without_server_storage.count(), 
        
                       }, 
        
                       "relay_addresses": { 
        
                           "all": RelayAddress.objects.count(), 
        
                           "no_server_storage": no_store_relay_addresses.count(), 
        
                           "no_server_storage_or_data": empty_relay_addresses_count, 
        
                           "no_server_storage_but_data": non_empty_relay_addresses_count, 
        
                       }, 
        
                       "domain_addresses": { 
        
                           "all": DomainAddress.objects.count(), 
        
                           "no_server_storage": no_store_domain_addresses.count(), 
        
                           "no_server_storage_or_data": empty_domain_addresses_count, 
        
                           "no_server_storage_but_data": non_empty_domain_addresses_count, 
        
                       }, 
        
                   } 
        
                   cleanup_data: CleanupData = { 
        
                       "relay_addresses": non_empty_relay_addresses, 
        
                       "domain_addresses": non_empty_domain_addresses, 
        
                   } 
        
                   return counts, cleanup_data 
        
               def _clean(self) -> int: 
        
                   """Clean addresses with unwanted server-stored data.""" 
        
                   counts = self.counts 
        
                   cleanup_data = self.cleanup_data 
        
                   counts["relay_addresses"]["cleaned"] = cleanup_data["relay_addresses"].update( 
        
                       description="", generated_for="", used_on="" 
        
                   ) 
        
                   counts["domain_addresses"]["cleaned"] = cleanup_data["domain_addresses"].update( 
        
                       description="", used_on="" 
        
                   ) 
        
                   return ( 
        
                       counts["relay_addresses"]["cleaned"] + counts["domain_addresses"]["cleaned"] 
        
                   ) 
        
               def markdown_report(self) -> str: 
        
                   """Report on server-stored data found and optionally cleaned.""" 
        
                   def model_lines(name: str, counts: dict[str, int]) -> list[str]: 
        
                       """Get the report lines for a model (Profile, Relay Addr., Domain Addrs.)""" 
        
                       total = counts["all"] 
        
                       lines = [f"{name}:", f"  All: {total}"] 
        
                       if total == 0: 
        
                           return lines 
        
                       no_server_storage = counts["no_server_storage"] 
        
                       lines.append( 
        
                           "    Without Server Storage: " 
        
                           f"{self._as_percent(no_server_storage, total)}" 
        
                       ) 
        
                       if no_server_storage == 0: 
        
                           return lines 
        
                       no_data = counts.get("no_server_storage_or_data") 
        
                       has_data = counts.get("no_server_storage_but_data") 
        
                       if no_data is None or has_data is None: 
        
                           return lines 
        
                       lines.extend( 
        
                           [ 
        
                               "      No Data : " 
        
                               f"{self._as_percent(no_data, no_server_storage)}", 
        
                               "      Has Data: " 
        
                               f"{self._as_percent(has_data, no_server_storage)}", 
        
                           ] 
        
                       ) 
        
                       cleaned = counts.get("cleaned") 
        
                       if cleaned is None or has_data == 0: 
        
                           return lines 
        
                       lines.append(f"        Cleaned: {self._as_percent(cleaned, has_data)}") 
        
                       return lines 
        
                   lines: list[str] = ( 
        
                       model_lines("Profiles", self.counts["profiles"]) 
        
                       + model_lines("Relay Addresses", self.counts["relay_addresses"]) 
        
                       + model_lines("Domain Addresses", self.counts["domain_addresses"]) 
        
                   ) 
        
                   return "\n".join(lines)

to these 50:

fx-private-relay/emails/cleaners.py

Lines 18 to 68 in 0685e1e

    
           class ServerStorageCleaner(CleanerTask): 
        
               slug = "server-storage" 
        
               title = "Ensure no data is stored when server_storage=False" 
        
               check_description = ( 
        
                   "When Profile.server_storage is False, the addresses (both regular and domain)" 
        
                   " should have empty data (the fields description, generated_for and used_on)." 
        
               ) 
        
               _blank_used_on = Q(used_on="") | Q(used_on__isnull=True) 
        
               data_specification = [ 
        
                   # Report on how many users have turned off server storage 
        
                   DataModelSpec( 
        
                       Profile, 
        
                       [DataBisectSpec("server_storage", "user__profile__server_storage")], 
        
                       omit_key_prefixes=["server_storage"], 
        
                       metric_name_overrides={"!server_storage": "no_server_storage"}, 
        
                       report_name_overrides={"!server_storage": "Without Server Storage"}, 
        
                   ) 
        
               ] + [ 
        
                   # Detect users with no server storage but address records with data 
        
                   DataModelSpec( 
        
                       AddressModel, 
        
                       [ 
        
                           DataBisectSpec("server_storage", "user__profile__server_storage"), 
        
                           DataBisectSpec("!server_storage.empty", blank_data), 
        
                       ], 
        
                       omit_key_prefixes=["server_storage"], 
        
                       metric_name_overrides={ 
        
                           "!server_storage": "no_server_storage", 
        
                           "!server_storage.empty": "no_server_storage_or_data", 
        
                           "!server_storage.!empty": "no_server_storage_but_data", 
        
                       }, 
        
                       report_name_overrides={ 
        
                           "!server_storage": "Without Server Storage", 
        
                           "!server_storage.empty": "No Data", 
        
                           "!server_storage.!empty": "Has Data", 
        
                       }, 
        
                       ok_key="!server_storage.empty", 
        
                       needs_cleaning_key="!server_storage.!empty", 
        
                   ) 
        
                   for AddressModel, blank_data in [ 
        
                       (RelayAddress, _blank_used_on & Q(description="") & Q(generated_for="")), 
        
                       (DomainAddress, _blank_used_on & Q(description="")), 
        
                   ] 
        
               ] 
        
               def clean_relay_addresses(self, queryset: QuerySet[RelayAddress]) -> int: 
        
                   return queryset.update(description="", generated_for="", used_on="") 
        
               def clean_domain_addresses(self, queryset: QuerySet[DomainAddress]) -> int: 
        
                   return queryset.update(description="", used_on="")

The retained stuff:

slug - used on the command line
title- used on reports
check_description - Used for help output/

The new stuff:

data_specification - Says what model(s) we're looking at, what queries can get the problematic rows / items, how they should show up in logs, and how they should show up in human-level reports

The dropped stuff:

_get_counts_and_data() - This hand-coded method is replaced by generic methods
markdown_report() - This hand-coded method is replaced by generic methods

The altered stuff:

_clean() - This hand-coded method becomes smaller methods like clean_relay_addresses and clean_domain_addresses, often one-liners, called by generic methods

Do you think some more documentation would help? Or would you like it broken into smaller pieces that don't try to do so much at once?

groovecoder · 2024-06-13T17:51:08Z

Do you think some more documentation would help? Or would you like it broken into smaller pieces that don't try to do so much at once?

I think it's the number of "generic methods" calling back and forth, using more generic types & objects that made it harder to debug the specific issue I was having.

groovecoder · 2024-06-20T15:25:16Z

privaterelay/tests/cleaners_tests.py

@@ -84,10 +92,13 @@ def test_missing_email_cleaner_data_to_clear() -> None:
    assert report == expected


+@pytest.mark.parametrize("provider_id", ("", "fxa", "test.example.com"))


praise: perfect, thanks!

jwhitlock added 25 commits May 17, 2024 12:41

Complete type annotations

72588a9

Set pluralization for addresses

efa6fa4

Add class DataItem for querysets

2d109d2

Use for other cleaner

3b8061b

Refactor to data specs

3c26b09

Refactor to DataModelSpec

29a5bb0

Move DataItem generation to DataSpec classes

c320bc1

Init data_items at startup

47ab9d0

Refactor markdown to base class

0297246

Simplify clean lookup

80859b8

Merge to DataIssueTask

459dd72

Add tests and docs for DataItem

8885846

Add tests for DataModelSpec and DataBisectSpec

f787a8c

Adds __repr__ and __eq__ for DataItem, further adjusts interfaces.

Refactor to ReportEntry

75b397c

The tests exposed that DataItem was too generic, and required too many run-time checks to ensure it was used correctly. Instead, factor to a hierarchy with more constraints.

Complete tests

fa68591

Rename to privaterelay/data_issue_task.py

9cef366

Add MissingEmailCleaner

ea368e6

Fix users whose primary email was not set from their FxA data.

Rename ReportEntry to ReportItem

93d6ef4

Be more consistant with Item suffix, and save ReportEntry for new data structure.

Add DataTask.get_report_entries()

a34d33f

This method collects the relevant data items, injects CleanedItem records, and sorts in print order along with the data needed to generate reports. It works in some corner cases that the markdown_report() method would print in the wrong order.

Reimplement markdown_report()

500019e

Add _CLEANED_METRIC_NAME as constant

62fb1b1

Add test coverage or skip coverage

3be7847

Move to privaterelay/tests/cleaners_tests.py

fcf76e1

Move to privaterelay/cleaner_task.py

6ae0370

Change clean_x methods to take a queryset

d4513d6

All the current clean_users, clean_relay_addresses, etc. just used item.get_queryset(), so pass this instead of the DataItem.

jwhitlock force-pushed the clean-users-with-blank-emails-mpp-3439 branch from 7fe3563 to d4513d6 Compare May 22, 2024 22:51

jwhitlock requested a review from groovecoder May 22, 2024 22:55

Fix spelling

0685e1e

groovecoder assigned groovecoder and jwhitlock May 23, 2024

groovecoder reviewed May 31, 2024

View reviewed changes

groovecoder reviewed Jun 11, 2024

View reviewed changes

jwhitlock added 2 commits June 11, 2024 16:30

Explicitly create the SocialApp in the test

f5c899d

Remove unused parameters

fc8a67a

Handle when SocialApp.provider_id is set

355389a

groovecoder approved these changes Jun 20, 2024

View reviewed changes

jwhitlock added this pull request to the merge queue Jun 20, 2024

Merged via the queue into main with commit 5cbdedf Jun 20, 2024
29 checks passed

jwhitlock deleted the clean-users-with-blank-emails-mpp-3439 branch June 20, 2024 15:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721

MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721

jwhitlock commented May 22, 2024

groovecoder left a comment

jwhitlock commented Jun 3, 2024 •

edited

Loading

groovecoder left a comment

groovecoder Jun 11, 2024 •

edited

Loading

jwhitlock Jun 11, 2024

jwhitlock Jun 11, 2024

jwhitlock Jun 11, 2024

groovecoder Jun 13, 2024

jwhitlock Jun 13, 2024

jwhitlock Jun 18, 2024

jwhitlock Jun 19, 2024

jwhitlock commented Jun 11, 2024

groovecoder commented Jun 13, 2024

groovecoder Jun 20, 2024

		@@ -84,10 +92,13 @@ def test_missing_email_cleaner_data_to_clear() -> None:
		assert report == expected


		@pytest.mark.parametrize("provider_id", ("", "fxa", "test.example.com"))

MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721

MPP-3439: Refactor DataIssueTask / CleanerTask, clean users with blank emails #4721

Conversation

jwhitlock commented May 22, 2024

The Refactor

How to test:

groovecoder left a comment

Choose a reason for hiding this comment

jwhitlock commented Jun 3, 2024 • edited Loading

groovecoder left a comment

Choose a reason for hiding this comment

groovecoder Jun 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Test 1: SocialApplication "fxa"

Test 2: SocialApplication "accounts.stage.mozaws.net"

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jwhitlock commented Jun 11, 2024

groovecoder commented Jun 13, 2024

Choose a reason for hiding this comment

jwhitlock commented Jun 3, 2024 •

edited

Loading

groovecoder Jun 11, 2024 •

edited

Loading

Test 1: `SocialApplication` "fxa"

Test 2: `SocialApplication` "accounts.stage.mozaws.net"