Begin work on a data request API (#4045)
[Core] Data Deletion And Disclosure APIs

 - Adds a Data Deletion API
   - Deletion comes in a few forms based on who is requesting
   - Deletion must be handled by 3rd party
 - Adds a Data Collection Disclosure Command
   - Provides a dynamically generated statement from 3rd party
   extensions
 - Modifies the always available commands to be cog compatible
   - Also prevents them from being unloaded accidentally
Michael H authored Aug 3, 2020
1 parent bb1a256 commit c0b1e50
Showing 38 changed files with 1,763 additions and 224 deletions.
8 changes: 8 additions & 0 deletions docs/framework_commands.rst
@@ -13,6 +13,14 @@ extend functionalities used throughout the bot, as outlined below.

.. autofunction:: redbot.core.commands.group

.. autoclass:: redbot.core.commands.Cog

.. automethod:: format_help_for_context

.. automethod:: red_get_data_for_user

.. automethod:: red_delete_data_for_user

.. autoclass:: redbot.core.commands.Command
:members:
:inherited-members: format_help_for_context
18 changes: 18 additions & 0 deletions docs/guide_cog_creation.rst
@@ -98,6 +98,7 @@ Open :code:`__init__.py`. In that file, place the following:
from .mycog import Mycog
def setup(bot):
bot.add_cog(Mycog())
@@ -238,3 +239,20 @@ Not all of these are strict requirements (some are) but are all generally advisa
but a cog which takes actions based on messages should not.

15. Respect settings when treating non command messages as commands.

16. Handle user data responsibly

- Don't do unexpected things with user data.
- Don't expose user data to additional audiences without permission.
- Don't collect data your cogs don't need.
- Don't store data in unexpected locations.
Utilize the cog data path, Config, or, if you need something more,
prompt the owner to provide it.

17. Utilize the data deletion and statement APIs

- See `redbot.core.commands.Cog.red_delete_data_for_user`
- Make a statement about what data your cogs use with the module level
variable ``__red_end_user_data_statement__``.
This should be a string containing a user friendly explanation of what data
your cog stores and why.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -31,6 +31,7 @@ Welcome to Red - Discord Bot's documentation!
:caption: User guides:

getting_started
red_core_data_statement

.. toctree::
:maxdepth: 2
87 changes: 87 additions & 0 deletions docs/red_core_data_statement.rst
@@ -0,0 +1,87 @@
.. Red Core Data Statement
=====================
Red and End User Data
=====================

Notes for everyone
******************

What data Red collects
----------------------

Red and the cogs included with it collect some amount of data
about users for the bot's normal operations.

In particular, the bot will keep track of a short history of usernames/nicknames,
which actions refer to your Discord account (such as creating a playlist),
as well as the content of specific messages used directly as commands for the bot
(such as reports sent to servers).

By default, Red will not collect any more data than it needs, and will not use it
for anything other than the portion of Red's functionality that necessitated it.

3rd party extensions may store additional data beyond what Red does by default.
You can use the command ``[p]mydata 3rdparty``
to view statements about how extensions use your data made by the authors of
the specific 3rd party extensions an instance of Red has installed.

How can I delete data Red has about me?
---------------------------------------

The command ``[p]mydata forgetme`` provides a way for users to remove
large portions of their own data from the bot. This command will not
remove operational data, such as a record that your
Discord account was the target of a moderation action.

3rd party extensions to Red are able to delete data when this command
is used as well, but this is something each extension must implement.
If a loaded extension does not implement this, the user will be informed.

Additional Notes for Bot Owners and Hosts
*****************************************

How to comply with a request from Discord Trust & Safety
--------------------------------------------------------

There are a handful of commands for this available to bot owners in the command group
``[p]mydata ownermanagement``.

The most pertinent one if asked to delete data by a member of Trust & Safety
is

``[p]mydata ownermanagement processdiscordrequest``

This will cause the bot to get rid of or disassociate all data
from the specified user ID.

.. warning::

You should not use this unless
Discord has specifically requested this with regard to a deleted user.
This will remove the user from various anti-abuse measures.
If you are processing a manual request from a user, read the next section.


How to process deletion requests from users
-------------------------------------------

You can point users to the command ``[p]mydata forgetme`` as a first step.

If users cannot use that for some reason, the command

``[p]mydata ownermanagement deleteforuser``

exists as a way to handle this as if the user had done it themselves.

Be careful about using the other owner level deletion options on behalf of users,
as this may also result in losing operational data such as data used to prevent spam.

What owners and hosts are responsible for
-----------------------------------------

Owners and hosts must comply both with Discord's terms of service and any applicable laws.
Owners and hosts are responsible for all actions their bot takes.

We cannot give specific guidance on this, but recommend that if there are any issues
you be forthright with users, own up to any mistakes, and do your best to handle it.
4 changes: 4 additions & 0 deletions redbot/cogs/admin/admin.py
@@ -87,6 +87,10 @@ def __init__(self):
async def cog_before_invoke(self, ctx: commands.Context):
await self._ready.wait()

async def red_delete_data_for_user(self, **kwargs):
""" Nothing to delete """
return

async def handle_migrations(self):

lock = self.config.get_guilds_lock()
2 changes: 1 addition & 1 deletion redbot/cogs/alias/__init__.py
@@ -4,5 +4,5 @@

async def setup(bot: Red):
cog = Alias(bot)
await cog.initialize()
bot.add_cog(cog)
cog.sync_init()
107 changes: 98 additions & 9 deletions redbot/cogs/alias/alias.py
@@ -1,7 +1,9 @@
import asyncio
import logging
from copy import copy
from re import search
from string import Formatter
from typing import Dict, List
from typing import Dict, List, Literal

import discord
from redbot.core import Config, commands, checks
@@ -14,6 +16,8 @@

_ = Translator("Alias", __file__)

log = logging.getLogger("red.cogs.alias")


class _TrackingFormatter(Formatter):
def __init__(self):
@@ -38,24 +42,107 @@ class Alias(commands.Cog):
and append them to the stored alias.
"""

default_global_settings: Dict[str, list] = {"entries": []}

default_guild_settings: Dict[str, list] = {"entries": []} # Going to be a list of dicts

def __init__(self, bot: Red):
super().__init__()
self.bot = bot
self.config = Config.get_conf(self, 8927348724)

self.config.register_global(**self.default_global_settings)
self.config.register_guild(**self.default_guild_settings)
self.config.register_global(entries=[], handled_string_creator=False)
self.config.register_guild(entries=[])
self._aliases: AliasCache = AliasCache(config=self.config, cache_enabled=True)
self._ready_event = asyncio.Event()

async def red_delete_data_for_user(
self,
*,
requester: Literal["discord_deleted_user", "owner", "user", "user_strict"],
user_id: int,
):
if requester != "discord_deleted_user":
return

await self._ready_event.wait()
await self._aliases.anonymize_aliases(user_id)

async def cog_before_invoke(self, ctx):
await self._ready_event.wait()

async def _maybe_handle_string_keys(self):
# This isn't a normal schema migration because it's being added
# after the fact for GH-3788
if await self.config.handled_string_creator():
return

async with self.config.entries() as alias_list:
bad_aliases = []
for a in alias_list:
for keyname in ("creator", "guild"):
if isinstance((val := a.get(keyname)), str):
try:
a[keyname] = int(val)
except ValueError:
# Because migrations weren't created as changes were made,
# and the prior form was a string of an ID,
# if this fails, there's nothing to go back to
bad_aliases.append(a)
break

for a in bad_aliases:
alias_list.remove(a)

# if this was using a custom group of (guild_id, aliasname) it would be better but...
all_guild_aliases = await self.config.all_guilds()

for guild_id, guild_data in all_guild_aliases.items():

to_set = []
modified = False

for a in guild_data.get("entries", []):

for keyname in ("creator", "guild"):
if isinstance((val := a.get(keyname)), str):
try:
a[keyname] = int(val)
except ValueError:
break
finally:
modified = True
else:
to_set.append(a)

if modified:
await self.config.guild_from_id(guild_id).entries.set(to_set)

await asyncio.sleep(0)
# control yielded per loop since this is most likely to happen
# at bot startup, where this is most likely to have a performance
# hit.

await self.config.handled_string_creator.set(True)

def sync_init(self):
t = asyncio.create_task(self._initialize())

def done_callback(fut: asyncio.Future):
try:
t.result()
except Exception as exc:
log.exception("Failed to load alias cog", exc_info=exc)
# Maybe schedule extension unloading with message to owner in future

t.add_done_callback(done_callback)

async def _initialize(self):
""" Should only ever be a task """

await self._maybe_handle_string_keys()

async def initialize(self):
# This can be where we set the cache_enabled attribute later
if not self._aliases._loaded:
await self._aliases.load_aliases()

self._ready_event.set()

def is_command(self, alias_name: str) -> bool:
"""
The logic here is that if this returns true, the name should not be used for an alias
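
The string-key handling in `_maybe_handle_string_keys` above can be illustrated without Config. The sketch below is an assumption-laden simplification: `normalize_id_keys` is a hypothetical helper, it works on plain lists of dicts, and it copies each entry instead of mutating stored data in place. It keeps the same rule as the diff: string values under `creator` or `guild` are converted to `int`, and an entry whose value cannot be converted is dropped.

```python
from typing import Any, Dict, List, Tuple


def normalize_id_keys(
    entries: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], bool]:
    """Return (kept_entries, modified) for a list of alias-like dicts."""
    kept: List[Dict[str, Any]] = []
    modified = False
    for original in entries:
        entry = dict(original)  # work on a copy so the input is left untouched
        convertible = True
        for key in ("creator", "guild"):
            value = entry.get(key)
            if isinstance(value, str):
                modified = True  # a string ID was seen, so data must be rewritten
                try:
                    entry[key] = int(value)
                except ValueError:
                    # Nothing sensible to fall back to; drop the entry.
                    convertible = False
                    break
        if convertible:
            kept.append(entry)
    return kept, modified


entries = [
    {"name": "hi", "creator": "123", "guild": 456},
    {"name": "bad", "creator": "not-an-id", "guild": "789"},
    {"name": "ok", "creator": 1, "guild": None},
]
kept, modified = normalize_id_keys(entries)
```

Returning a `modified` flag lets a caller skip the write entirely when no string IDs were found, which matches the intent of the `if modified:` check in the guild loop.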
@@ -327,6 +414,8 @@ async def _list_global_alias(self, ctx: commands.Context):
@commands.Cog.listener()
async def on_message_without_command(self, message: discord.Message):

await self._ready_event.wait()

if message.guild is not None:
if await self.bot.cog_disabled_in_guild(self, message.guild):
return
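
The combination of `sync_init`, `_ready_event`, and the `await self._ready_event.wait()` calls amounts to gating handlers behind a background initialization task. A minimal sketch of that pattern, using only `asyncio` from the standard library, follows. `Service`, `handle`, and the logger name are hypothetical, and the `asyncio.sleep(0)` in `_initialize` stands in for the real migration and cache loading.

```python
import asyncio
import logging
from typing import List, Tuple

log = logging.getLogger("example.ready_gate")  # hypothetical logger name


class Service:
    def __init__(self) -> None:
        self._ready_event = asyncio.Event()
        self.handled: List[str] = []

    def sync_init(self) -> "asyncio.Task[None]":
        # Schedule initialization without blocking the caller.
        task = asyncio.create_task(self._initialize())

        def done_callback(fut: "asyncio.Future[None]") -> None:
            try:
                fut.result()  # re-raises anything _initialize raised
            except Exception as exc:
                log.exception("Failed to initialize", exc_info=exc)

        task.add_done_callback(done_callback)
        return task

    async def _initialize(self) -> None:
        await asyncio.sleep(0)  # stand-in for migrations and cache loading
        self._ready_event.set()

    async def handle(self, item: str) -> None:
        # Handlers wait here until initialization has finished.
        await self._ready_event.wait()
        self.handled.append(item)


async def demo() -> Tuple[bool, List[str]]:
    service = Service()
    pending = asyncio.create_task(service.handle("early"))
    await asyncio.sleep(0)  # let the handler start and block on the event
    blocked = not pending.done()
    service.sync_init()
    await pending
    return blocked, service.handled


blocked, handled = asyncio.run(demo())
```

One consequence of this design is that if initialization fails, the event is never set and handlers wait indefinitely; the comment in the diff about unloading the extension in future hints at the same concern.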
24 changes: 24 additions & 0 deletions redbot/cogs/alias/alias_entry.py
@@ -90,6 +90,30 @@ def __init__(self, config: Config, cache_enabled: bool = True):
self._loaded = False
self._aliases: Dict[Optional[int], Dict[str, AliasEntry]] = {None: {}}

async def anonymize_aliases(self, user_id: int):

async with self.config.entries() as global_aliases:
for a in global_aliases:
if a.get("creator", 0) == user_id:
a["creator"] = 0xDE1
if self._cache_enabled:
self._aliases[None][a["name"]] = AliasEntry.from_json(a)

all_guilds = await self.config.all_guilds()
async for guild_id, guild_data in AsyncIter(all_guilds.items(), steps=100):
for a in guild_data["entries"]:
if a.get("creator", 0) == user_id:
break
else:
continue
# basically, don't build a context manager without a need.
async with self.config.guild_from_id(guild_id).entries() as entry_list:
for a in entry_list:
if a.get("creator", 0) == user_id:
a["creator"] = 0xDE1
if self._cache_enabled:
self._aliases[guild_id][a["name"]] = AliasEntry.from_json(a)

async def load_aliases(self):
if not self._cache_enabled:
self._loaded = True
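
The approach taken by `anonymize_aliases` above (scan read-only first, then rewrite only the guilds that actually contain a match) can be sketched on plain dictionaries. The helper names and sample data below are hypothetical; `0xDE1` is the placeholder creator ID the diff writes in place of the deleted user. Config access and cache updates are deliberately left out.

```python
from typing import Any, Dict, List

ANONYMIZED_CREATOR = 0xDE1  # placeholder ID used by the diff above


def anonymize_entries(entries: List[Dict[str, Any]], user_id: int) -> int:
    """Replace the creator on matching entries; return how many changed."""
    changed = 0
    for entry in entries:
        if entry.get("creator", 0) == user_id:
            entry["creator"] = ANONYMIZED_CREATOR
            changed += 1
    return changed


def anonymize_all(guilds: Dict[int, Dict[str, Any]], user_id: int) -> List[int]:
    """Anonymize across guilds; return the IDs of guilds that were rewritten."""
    touched: List[int] = []
    for guild_id, data in guilds.items():
        entries = data.get("entries", [])
        # Cheap read-only scan first, so untouched guilds are never rewritten.
        if not any(e.get("creator", 0) == user_id for e in entries):
            continue
        anonymize_entries(entries, user_id)
        touched.append(guild_id)
    return touched


guilds = {
    10: {"entries": [{"name": "a", "creator": 42}, {"name": "b", "creator": 7}]},
    20: {"entries": [{"name": "c", "creator": 7}]},
}
touched = anonymize_all(guilds, 42)
```

Keeping the alias and replacing only the creator means the alias continues to work for the guild while no longer being tied to the deleted account.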
11 changes: 11 additions & 0 deletions redbot/cogs/audio/apis/playlist_wrapper.py
@@ -27,6 +27,7 @@
PRAGMA_SET_read_uncommitted,
PRAGMA_SET_temp_store,
PRAGMA_SET_user_version,
HANDLE_DISCORD_DATA_DELETION_QUERY,
)
from ..utils import PlaylistScope
from .api_utils import PlaylistFetchResult
@@ -58,6 +59,8 @@ def __init__(self, bot: Red, config: Config, conn: APSWConnectionWrapper):
self.statement.get_all_with_filter = PLAYLIST_FETCH_ALL_WITH_FILTER
self.statement.get_all_converter = PLAYLIST_FETCH_ALL_CONVERTER

self.statement.drop_user_playlists = HANDLE_DISCORD_DATA_DELETION_QUERY

async def init(self) -> None:
"""Initialize the Playlist table"""
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
@@ -247,3 +250,11 @@ async def upsert(
"tracks": json.dumps(tracks),
},
)

async def handle_playlist_user_id_deletion(self, user_id: int):
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
executor.submit(
self.database.cursor().execute,
self.statement.drop_user_playlists,
{"user_id": user_id},
)
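
The executor pattern in `handle_playlist_user_id_deletion` can be tried in isolation. The sketch below swaps in the standard library's `sqlite3` for the APSW connection wrapper used by the Audio cog, and the table layout and `DELETE_USER_PLAYLISTS` query are hypothetical, since the real `HANDLE_DISCORD_DATA_DELETION_QUERY` is not shown in this diff. Unlike the diff, it also calls `future.result()` so that an error raised in the worker thread is surfaced to the caller.

```python
import concurrent.futures
import sqlite3

# Hypothetical query and schema; the real deletion query is not shown in the diff.
DELETE_USER_PLAYLISTS = "DELETE FROM playlists WHERE author_id = :user_id"

# check_same_thread=False lets the worker thread use this connection.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE playlists (name TEXT, author_id INTEGER)")
conn.executemany(
    "INSERT INTO playlists VALUES (?, ?)",
    [("a", 1), ("b", 2), ("c", 1)],
)


def drop_user_playlists(user_id: int) -> None:
    # A single worker keeps database access serialized, as in the diff.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(
            conn.execute, DELETE_USER_PLAYLISTS, {"user_id": user_id}
        )
        future.result()  # re-raise any exception from the worker thread
    conn.commit()


drop_user_playlists(1)
remaining = conn.execute("SELECT name FROM playlists ORDER BY name").fetchall()
```

Passing the user ID as a named parameter, as the diff does with `{"user_id": user_id}`, avoids building the SQL string by hand.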