Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mpp 2062 return trackers found removed in email header context #2232

Merged

Conversation

say-yawn
Copy link
Contributor

@say-yawn say-yawn commented Jul 21, 2022

New feature description

  • Migration 0049 on Emails app to store number of tracker removed for Relay and Domain addresses, and for deleted addresses on the Profile object

  • Add field for tracker removed count per Relay and Domain addresses

  • Return tracker removed count per Relay and Domain addresses on the relayaddresses and domainaddresses API endpoint under thenum_level_one_trackers_blocked field

  • Add field for tracker removed count for deleted addresses on the profile object

  • Return aggregate tracked removed per profile on the profile API endpoint under level_one_trackers_blocked field

  • Use profile setting value under remove_level_one_email_trackers on Profile to remove email trackers in level one category

  • Generate report link for tracker removed with data stored in URL's hash bit so that it does not get logged

  • Pass report link and tracker removed count for Relay header/footer wrapper for related emails

Screenshot (if applicable)

Not applicable.

How to test

(Prerequisite) Email tracker removal is behind the tracker_removal flag which needs to be set by:

./manage.py waffle_flag tracker_removal --everyone --authenticated

and because Django-Waffle's logic for checking it the flag is enabled for user (method is_active_for_user) does not check the everyone we need to use the authenticated.

Check that emails relayed with user's setting DISABLED (default) for remove email trackers relays emails with trackers NOT removed

  1. Login to the account and go to /accounts/settings/
  2. Check that the checkbox for Remove email trackers in all forwarded emails. is NOT checked
  3. Send an email with trackers to your alias (Relay or Domain Address). Sign-up for activecampaign.com is a good sample email with trackers.
  4. Check that email is relayed without any changes to the content

Check that emails relayed with user's setting ENABLED for remove email trackers relays emails with trackers removed

  1. Login to the account and go to /accounts/settings/
  2. Click the checkbox for Remove email trackers in all forwarded emails. and click "Save" button on the bottom right corner
  3. Go back to the /accounts/profile/ and check that a new column for Trackers Removed is added to the address card like:

Screen Shot 2022-07-21 at 21 45 04

4. Send an email with trackers to your alias. Sign-up for activecampaign.com is a good sample email with trackers. 5. Check that email is relayed with trackers removed count and "Learn more" link on the header of the relayed email like:

Screen Shot 2022-07-21 at 21 48 46

6. Click on "Learn more" and check that number of trackers removed count in the generated report is the same as the tracker removed count on the email header. 7. Go back to the `/accounts/profile` and check that the count for `Trackers Removed` on the address card increased

Check that emails relayed with user's setting ENABLED then DISABLED for remove email trackers relays emails with trackers NOT removed

  1. Login back to the account used for the previous test
  2. Uncheck the checkbox for Remove email trackers in all forwarded emails. and click "Save" button on the bottom right corner
  3. Go back to the /accounts/profile/ and check that the column for Trackers Removed is greyed out on the address card like:
    Screen Shot 2022-07-21 at 21.45.31
  4. Send an email with trackers to your alias. Sign-up for activecampaign.com is a good sample email with trackers.
  5. Check that email is relayed with trackers NOT removed and the removed count and "Learn more" link on the header of the relayed email is no longer there.
  6. Go back to the /accounts/profile and check that the count for Trackers Removed stayed the same

Checklist

  • l10n changes have been submitted to the l10n repository, if any.
  • All acceptance criteria are met.
  • I've added or updated relevant docs in the docs/ directory.
  • All UI revisions follow the coding standards, and use Protocol tokens where applicable (see /frontend/src/styles/tokens.scss).
  • Commits in this PR are minimal and have descriptive commit messages.

say-yawn added 9 commits July 21, 2022 11:23
Use level one and two instead of general and strict
for email tracker removed report on relayed email as well as cleaning old foxfooding work
if there is no trackers removed
on RelayAddresses, DomainAddresses, and for deleted addresses on Profile
and remove override_sample since it was used for Foxfooding and not needed anymore
@netlify
Copy link

netlify bot commented Jul 21, 2022

Deploy Preview for fx-relay-demo canceled.

Name Link
🔨 Latest commit d7c4d8b
🔍 Latest deploy log https://app.netlify.com/sites/fx-relay-demo/deploys/62da0c386b31900009a52419

@say-yawn say-yawn mentioned this pull request Jul 21, 2022
5 tasks
Copy link
Member

@jwhitlock jwhitlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @say-yawn! Some required changes:

  • Make the tracker counts read-only fields in the API
  • Update the database schema to ensure that data isn't lost during deployment

and a bunch of nits.

My testing process (after a few iterations!):

  • Add waffle flag "tracker_removal", enable for all users
  • Set "remove_level_one_email_trackers" for my test user. At first, I did this manually. After rebuilding the front-end I could see how to do it in my user settings.
  • Send an HTML email with 2 links to a level-2 tracker and 1 to a strict tracker. The link text was the URL.

In the email processing log, I saw the event log:

{"Timestamp": 1658435438043749120, "Type": "eventsinfo", "Logger": "fx-private-relay", "Hostname": "johnwhitlock-4zq05n", "EnvVersion": "2.0", "Severity": 6, "Pid": 18832, "Fields": {"level": "general", "tracker_removed": 2, "level_one": {"count": 4, "trackers": {"8d8.biz": 4}}, "level_two": {"count": 2, "trackers": {"1-2-1marketing.com": 2}}, "msg": "email_tracker_summary"}}

I'm not sure why it detected 4 trackers and removed 2. Maybe because the link text was the URL? That remained in the forwarded email.

In the relayed message, the level one tracker links were replaced with a link to /faq. The header had the text "2 email trackers removed - Learn more". "Learn more" was a link to the report:

https://relay-quhitloch.ngrok.io/tracker-report/#%7B%22sender%22:%20%[email protected]%22,%20%22received_at%22:%201658435438,%20%22trackers%22:%20%7B%228d8.biz%22:%204%7D%7D

The linked report had more data, including a very old "Received by" time:

Relay Tracker Report

api/serializers/__init__.py Show resolved Hide resolved
api/serializers/__init__.py Show resolved Hide resolved
api/serializers/__init__.py Show resolved Hide resolved
@@ -91,6 +91,9 @@ class Profile(models.Model):
last_account_flagged = models.DateTimeField(blank=True, null=True, db_index=True)
num_email_forwarded_in_deleted_address = models.PositiveIntegerField(default=0)
num_email_blocked_in_deleted_address = models.PositiveIntegerField(default=0)
num_level_one_trackers_blocked_in_deleted_address = models.PositiveIntegerField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New fields need to have null=True to avoid data loss during deployment. The existing code will be running during the deployment, and will fail with NOT NULL constraint failed when adding a new address or profile. You'll need to recreate the migration:

manage.py migrate emails 0048
rm emails/migrations/0049_*
manage.py makemigrations

Later, we can do a data migration to populate the nulls and a schema migration to disallow nulls.

Alternatively, you might be able to add a step to the migration to add a database-level default, something like ALTER TABLE "emails_domainaddress" ALTER COLUMN "num_level_one_trackers_blocked" SET DEFAULT 0. I haven't tried that, I'm not sure how it will work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked, tested as:

  1. Apply the migration
  2. Check out main. I should check out v3.5.21, but main should work too
  3. Load the site, create a relay alias

Previously, it fails, because of the NOT NULL constraint failed. Now it succeeds, and sets num_level_one_trackers_blocked to NULL.

However, the profile breaks when switching back to this branch, because it is attempting to add 0 and None and failing. The code needs to interpret a None as 0.

Maybe we'll try the database default next time...

@property
def level_one_trackers_blocked(self):
return (
sum(ra.num_level_one_trackers_blocked for ra in self.relay_addresses)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd need to check if Django converts a NULL to a 0, if we make the fields nullable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not 🤦

emails/views.py Outdated
else:
tracker_report_link = ""

if "has_num_level_one_email_trackers_removed" in request.GET:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the override could set the number directly:

Suggested change
if "has_num_level_one_email_trackers_removed" in request.GET:
if "num_level_one_email_trackers_removed" in request.GET:
num_level_one_email_trackers_removed = int(request.GET["num_level_one_email_trackers_removed"])
else
num_level_one_email_trackers_removed = 0

emails/views.py Outdated
<dd>{"Yes" if has_email_tracker_study_link else "No"}</dd>
<dt>has_tracker_report_link</dt>
<dd>{"Yes" if has_tracker_report_link else "No"}</dd>
<dt>has_num_level_one_email_trackers_removed</dt>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: line up <dt>s...

Suggested change
<dt>has_num_level_one_email_trackers_removed</dt>
<dt>has_num_level_one_email_trackers_removed</dt>

@@ -5,7 +5,7 @@
from django.test import TestCase, override_settings
from unittest.mock import patch
from model_bakery import baker
from waffle.testutils import override_sample
from waffle.testutils import override_flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

override_flag is not used in this file, please remove the import

emails/views.py Outdated
else:
has_tracker_report_link = False
if has_tracker_report_link:
tracker_report_link = "https://example.com/fake_tracker_report_link"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This could be a real report to test that as well:

Suggested change
tracker_report_link = "https://example.com/fake_tracker_report_link"
tracker_report_link = '/tracker-report/#{"sender": "sender@example.com", "received_at": 1658434657, "trackers": {"fake-tracker.example.com": 2}}'

if "has_email_tracker_study_link" in request.GET:
has_email_tracker_study_link = strtobool(
request.GET["has_email_tracker_study_link"]
if "has_tracker_report_link" in request.GET:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for updating the wrapped email template and test view!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test! I didn't know we added tests for this but the failed tests told me about it and was happy to add on to it!

say-yawn added 10 commits July 21, 2022 17:17
to prevent errors during requests to the service during  migrations
since it is not used while the email is relayed
and update email wrapper test to be more similar to actual data passed in context
and add more tests for tracker removal
which the frontend expects in instead of in seconds
so that we are using the lists as according to the license for Disconnect list as stated in shavar-prod-lists repo
until we can handle the CircleCI throwing FileNotFoundError
@jwhitlock jwhitlock self-requested a review July 22, 2022 02:51
Copy link
Member

@jwhitlock jwhitlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @say-yawn! This is OK for merging and deploying to stage, where usage is low, but the null value for profiles and aliases created during the deploy will bite us in production.

@@ -11,6 +13,22 @@
logger = logging.getLogger("events")


def get_trackers(category="Email"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this a little in a different channel, and we're deferring live download of data until the a future PR. I'd rather this code not be added until that time, but I'll try to remember to review it then. Since the code is unused, these comments are not blockers.

I'm curious about the license terms, and if they require download at the time of use, or if we can download once as part of building the Docker image, or something else.

for _, resources in entity.items():
for _, domains in resources.items():
trackers.extend(domains)
# when storing use a tuple so it's not mutable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this comment doesn't appear to reflect the code, since the data is returned as a list instead of a tuple.

@@ -34,6 +52,16 @@ def __init__(self, app_name, app_module):
self.badwords = self._load_terms("badwords.txt")
self.blocklist = self._load_terms("blocklist.txt")

# TODO: fix the relative path issue on CircleCI to use the commented code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocker: I believe the issue is that the folder emails/tracker_lists does not exist. It can be created on the fly, or we can create a file like emails/tracker_lists/.keep or emails/tracker_lists/README.md to ensure the directory exists.

@property
def level_one_trackers_blocked(self):
return (
sum(ra.num_level_one_trackers_blocked for ra in self.relay_addresses)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not 🤦

Comment on lines +318 to +320
sum(ra.num_level_one_trackers_blocked for ra in self.relay_addresses)
+ sum(da.num_level_one_trackers_blocked for da in self.domain_addresses)
+ self.num_level_one_trackers_blocked_in_deleted_address
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these can be null, we need something like:

Suggested change
sum(ra.num_level_one_trackers_blocked for ra in self.relay_addresses)
+ sum(da.num_level_one_trackers_blocked for da in self.domain_addresses)
+ self.num_level_one_trackers_blocked_in_deleted_address
sum(ra.num_level_one_trackers_blocked or 0 for ra in self.relay_addresses)
+ sum(da.num_level_one_trackers_blocked or 0 for da in self.domain_addresses)
+ self.num_level_one_trackers_blocked_in_deleted_address or 0

Comment on lines +710 to +712
profile.num_level_one_trackers_blocked_in_deleted_address += (
self.num_level_one_trackers_blocked
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, either could be null:

Suggested change
profile.num_level_one_trackers_blocked_in_deleted_address += (
self.num_level_one_trackers_blocked
)
profile.num_level_one_trackers_blocked_in_deleted_address = (
(profile.num_level_one_trackers_blocked_in_deleted_address or 0) +
(self.num_level_one_trackers_blocked or 0)
)

@@ -5,7 +5,6 @@
from django.test import TestCase, override_settings
from unittest.mock import patch
from model_bakery import baker
from waffle.testutils import override_sample
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

general_removed = tracker_details["tracker_removed"]
general_count = tracker_details["level_one"]["count"]

assert changed_content == "<a href='https://test.com/faq'>A link</a>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like checking the expected content

@@ -373,12 +373,17 @@ def set_user_group(user):
internal_group.user_set.add(user)


def convert_domains_to_regex_patterns(domain_pattern):
return r"""(["'])(\S*://(\S*\.)*""" + re.escape(domain_pattern) + r"\S*)\1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 pyflakes is happy!

)
address.num_level_one_trackers_blocked += removed_count
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here:

Suggested change
address.num_level_one_trackers_blocked += removed_count
address.num_level_one_trackers_blocked = (
(address.num_level_one_trackers_blocked or 0) + removed_count
)

@say-yawn say-yawn merged commit 8fd00f8 into main Jul 22, 2022
@say-yawn say-yawn deleted the MPP-2062-return-trackers-found-removed-in-email-header-context branch July 22, 2022 03:27
@say-yawn say-yawn mentioned this pull request Jul 25, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants