Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add safe search filter for NSFW rooms #208

Merged
merged 16 commits into from
May 3, 2023

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented May 2, 2023

Add safe search filter for NSFW rooms

Alternative community PR (unfinished): #155

Fix #89

Todo

  • Add custom styles to safe search toggle
  • Should we add safe search to the query parameters: ?safe-search=false
    • On one-hand, it would be nice for NSFW-tolerant people to be able to link to room directory as they see it. This would also solve the flash of blocked to unblocked when the page loads.
    • On the other hand, it allows people to just fling people to some nasty content although they could just link the room anyway.
    • In any case, it's simpler not to do this for now and we can iterate when people actually speak up with an actual need/use-case.
  • Add test to ensure NSFW room is blocked on page-load
  • Update Hydrogen with latest scratch changes

@MadLittleMods MadLittleMods added T-Enhancement New feature or request A-room-directory Room directory landing page where you can explore the list of rooms labels May 2, 2023
@MadLittleMods MadLittleMods added the A-moderation Moderating content and rooms (safe search) label May 2, 2023
@MadLittleMods MadLittleMods marked this pull request as ready for review May 3, 2023 05:34
Comment on lines +2478 to +2494
// Browse the room directory searching the room directory for those NSFW rooms
// (narrowing down results).
archiveUrl = matrixPublicArchiveURLCreator.roomDirectoryUrl({
searchTerm: roomPlanetPrefix,
});
const { data: roomDirectoryWithSearchPageHtml } = await fetchEndpointAsText(archiveUrl);
const domWithSearch = parseHTML(roomDirectoryWithSearchPageHtml);

const roomsCardsOnPageWithSearch = [
...domWithSearch.document.querySelectorAll(`[data-testid="room-card"]`),
];

// Assert that the rooms we searched for are on the page
const roomsIdsOnPageWithSearch = roomsCardsOnPageWithSearch.map((roomCardEl) => {
return roomCardEl.getAttribute('data-room-id');
});
assert.deepStrictEqual(roomsIdsOnPageWithSearch.sort(), [roomUranusId, roomMarsId].sort());
Copy link
Contributor Author

@MadLittleMods MadLittleMods May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this test flakey (it's failed 3/4 times I've retried in CI)? Example failure: https://github.com/matrix-org/matrix-public-archive/actions/runs/4868717759/jobs/8682443906#step:7:203

Synapse doesn't cache responses for searches on /publicRooms so it shouldn't be a stale results problem with other tests cases interfering. I even see the Bypassing cache as search request. log line on the server when running the test:

$ docker logs -f --tail 10 matrix_public_archive_test_hs1_1
...
2023-05-03 06:51:17,515 - synapse.handlers.room_list - 90 - INFO - POST-25759 - Getting public room list: limit=9, since=None, search=True, network=None
2023-05-03 06:51:17,515 - synapse.handlers.room_list - 101 - INFO - POST-25759 - Bypassing cache as search request.
2023-05-03 06:51:17,546 - synapse.access.http.8008 - 460 - INFO - POST-25759 - ::ffff:172.22.0.1 - 8008 - {@archiver:hs1} Processed request: 0.031sec/0.001sec (0.020sec, 0.001sec) (0.008sec/0.022sec/2) 554B 200 "POST /_matrix/client/v3/publicRooms HTTP/1.1" "undici" [0 dbevts]

Copy link
Contributor Author

@MadLittleMods MadLittleMods May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can actually reproduce consistently locally after re-pulling the latest Synapse image (minimal test run reproduction: npm run test -- --grep "Room directory")

Just did some bisecting of Synapse versions to figure out where the Synapse change was introduced:

  • v1.81.0: Test fails consistently ❌
  • v1.80.0: Test fails consistently ❌
  • v1.80.0rc1: Test fails consistently ❌
  • v1.79.0: Working ✅
  • v1.68.0: Working ✅

I don't know which Synapse commit/PR specifically yet: https://github.com/matrix-org/synapse/blob/develop/CHANGES.md#synapse-1800rc1-2023-03-21

Copy link
Contributor Author

@MadLittleMods MadLittleMods May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a bisect on Synapse commits and found the culprit -> matrix-org/synapse#15229. Not totally obvious what's going wrong/different to cause this but probably something with visibility.

Bisect (relevant page of commits):

  • 4953cd71dfbb1925dc4a477efd2ed48c2dfd70d6 (2023-03-16): Test fails consistently ❌ (locally built matrixdotorg/synapse:4953cd71)
    a1c986939444db9bbf2343224091c88d18b3f598 (2023-03-13): Test fails consistently ❌ (locally built matrixdotorg/synapse:a1c98693)
    be4ea209e8bf92fb3660807c1fe8ad3d7d05621f (2023-03-08): Test fails consistently ❌ (locally built matrixdotorg/synapse:be4ea209)
    88efc75bab2849b7b1cee52770dea3cf9925b2e8 (2023-03-08): Working ✅ (locally built matrixdotorg/synapse:88efc75b)
    8314646cd3563d9aaaf8028c9e58989b4ed980ba (2023-03-07): Working ✅ (locally built matrixdotorg/synapse:8314646c)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to be a visibility issue as the rooms still end up as preset_config=public_chat and visibility=public.

Seems like a race condition because it passes if I add a delay after the room creation. And I would think this is a read-after-write consistency issue since we're not even using workers. Created matrix-org/synapse#15526 to track this ⏩

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pinned the Synapse version to v1.79.0 before the regression for now ⏩

If we need to test with the latest Synapse, we will need to introduce some wait/retry complexity like this:

// Try to avoid flakey tests where the homeserver hasn't added the rooms
// to the room directory yet
await waitForRoomIdsInHomeserverRoomDirectory([roomUranusId, roomMarsId]);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitForResultsInHomeserverRoomDirectory(...) was added in #276

await waitForResultsInHomeserverRoomDirectory({
  client,
  searchTerm: 'room-foo',
});

It's not possible to search directly for a room_id (at least in Synapse) so I've opted to searching for the room name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-moderation Moderating content and rooms (safe search) A-room-directory Room directory landing page where you can explore the list of rooms T-Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add NSFW safe search filter to room directory landing page
1 participant