Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider merging tags and metadata #266

Closed
lemon24 opened this issue Dec 16, 2021 · 6 comments
Closed

Consider merging tags and metadata #266

lemon24 opened this issue Dec 16, 2021 · 6 comments

Comments

@lemon24
Copy link
Owner

lemon24 commented Dec 16, 2021

Two main reasons:

First, there are 7 methods for metadata/tags. If we add entry and global metadata/tags, there may be 21 (if we actually want to add methods for them is debatable).

Second, there are two namespaces, and some functionality overlaps – to filter by metadata, you have to have an identical tag; for example, the dedupe plugin may need to do this:

for feed in feeds with the .dedupe tag:
    metadata = get .dedupe metadata for feed
    dedupe based on options in metadata

... the issue is not necessarily that you have to maintain both, but that you have to do or not do something when one is missing (and document that etc.).

Third, we need to maintain two very similar implementations.

Finally, AWS does it, and they spent far more time thinking about it.


Here's an example of how it might work (omitting the reader object and the first feed argument).

For reference, tags are conceptually set[str], and metadata is dict[str, JSONType].

Currently:

>>> set_feed_metadata_item('one', {})
>>> add_feed_tag('two')
>>> get_feed_metadata()
{'one': {}}
>>> get_feed_tags()
{'two'}

With the same namespace (to do: is this backwards compatible?):

>>> set_feed_metadata_item('one', {})
>>> add_feed_tag('two')
>>> get_feed_metadata()
{'one': {}, 'two': None}
>>> get_feed_tags()
{'one', 'two'}

At this point we could either leave both sets of methods, or unify them (to do: what would it look like?).

FWIW, the boto3 EC2 client has create_tags(), delete_tags(), describe_tags() (which can filter).

@lemon24
Copy link
Owner Author

lemon24 commented Dec 30, 2021

OK, I decided going forward with this is the Right Thing™.

Unifying the namespaces should happen as soon as possible, so as few people as possible depend on them being separated. This does not break the API in any way, but does change behavior.

Unifying the methods we can postpone for a bit; we can only remove old methods in 3.0, after deprecating them in 2.x.

@lemon24
Copy link
Owner Author

lemon24 commented Dec 30, 2021

To do for unifying the namespaces:

  • change storage/reader
  • migration
  • tests
  • docstrings
  • user guide
  • changelog

@lemon24
Copy link
Owner Author

lemon24 commented Dec 30, 2021

Here are all the tag and metadata methods as of 2.6:

get_feed_metadata(feed[, key]) -> (key, value), ...
get_feed_metadata_item(feed, key[, default]) -> value # raises FeedMetadataNotFoundError
set_feed_metadata_item(feed, key, value) # raises FeedNotFoundError
delete_feed_metadata_item(feed, key) # raises FeedMetadataNotFoundError

get_feed_tags([feed]) -> tag, ...
add_feed_tag(feed, tag) # raises FeedNotFoundError
remove_feed_tag(feed, tag)

Note:

  • "delete" for metadata, "remove" for tags (copy guidance in 2.0 backwards compatibility breaks #183); once we merge them, it should be "delete" for tags too.
  • delete_feed_metadata_item() raises an exception if the metadata does not exist; remove_feed_tag() does not.
  • get_feed_metadata() requires a feed, get_feed_tags() does not, so it can't be built efficiently on top of get_feed_metadata() (n+1); also, we'd be getting the values as well, despite not needing them.
  • There's no get_feed_tag().
  • add_feed_tag() is a no-op if the tag already exists; to preserve semantics, we'd need a setdefault_feed_metadata_item().

@lemon24
Copy link
Owner Author

lemon24 commented Dec 30, 2021

Here's what a unified feed tag API would look like:

get_feed_tags(feed[, key]) -> (key, value), ...
get_feed_tag_keys([feed]) -> key, ...
get_feed_tag(feed, key[, default]) -> value  # raises FeedTagNotFoundError
set_feed_tag(feed, key[, value])  # raises FeedNotFoundError
delete_feed_tag(feed, key)  # raises FeedTagNotFoundError

# optional, can be added later (or never)
describe_feed_tags([feed[, key]]) -> TagDescription(feed, key, value), ...

To do: What would this look like if we were to generalize it to different types of resources, per #228 (global, feed, entry)? The following would have to be generic:

  • object ids, including wildcards (global, any feed, any entry, any thing)
  • TagNotFoundError needs to be generic, but the user also needs to know which kind of resource triggered it (?)

lemon24 added a commit that referenced this issue Jan 2, 2022
lemon24 added a commit that referenced this issue Jan 2, 2022
Was breaking the build for #266.
lemon24 added a commit that referenced this issue Jan 2, 2022
lemon24 added a commit that referenced this issue Jan 2, 2022
lemon24 added a commit that referenced this issue Jan 2, 2022
lemon24 added a commit that referenced this issue Jan 2, 2022
@lemon24
Copy link
Owner Author

lemon24 commented Jan 15, 2022

This is what generic global/feed/entry tag methods would look like:

class ResourceNotFoundError(ReaderError):
    object_id: tuple[str, ...]
class FeedError(ReaderError):
    object_id: tuple[str]
class FeedNotFoundError(ResourceNotFoundError, FeedError):
    object_id: tuple[str]

# same as above for entries    
class EntryNotFoundError(EntryError, ResourceNotFoundError)

class TagError(ReaderError):
    tag: str
    # optional, can be added later (or never)
    object_id: tuple[str, ...]
class TagNotFoundError(TagError)

# optional, can be added later (or never)
class FeedTagNotFoundError(FeedError, TagNotFoundError)
class EntryTagNotFoundError(FeedError, TagNotFoundError)
class GlobalTagNotFoundError(FeedError, TagNotFoundError)

get_tags(resource, *, key=None) -> (key, value), ...
get_tag_keys(resource) -> key, ...
get_tag(resource, key, default=MISSING, /) -> value  # raises TagNotFoundError
set_tag(resource, key, value=None, /)  # raises specific ResourceNotFoundError
delete_tag(resource, key, /, missing_ok=False)  # raises TagNotFoundError

# types of resource arguments
EntryInput = EntryLike | tuple[str, str]
FeedInput = FeedLike | tuple[str] | str
GlobalInput = tuple[]
AnyEntryInput = EntryInput | tuple[str, None] | tuple[None, None]
AnyFeedInput = FeedInput | tuple[None]
AnyInput = AnyEntryInput | AnyFeedInput | None

@lemon24
Copy link
Owner Author

lemon24 commented Jan 15, 2022

To do:

  • add Tag exception tree
  • add ResourceNotFoundError tree
  • add new methods, rewriting the old feed tag/metadata methods in terms of them
    • get_tags()
    • get_tag_keys()
    • get_tag()
    • set_tag()
    • delete_tag()
  • add deprecation warnings for all the old methods
    • update tests to use the new methods
  • add deprecation notices to the metadata exception docstrings
  • docs
    • changelog / deprecation notices
    • user guide
    • dev notes
    • docstrings
  • remove deprecated exceptions and methods (in 3.0)

lemon24 added a commit that referenced this issue Jan 15, 2022
lemon24 added a commit that referenced this issue Jan 15, 2022
lemon24 added a commit that referenced this issue Jan 15, 2022
lemon24 added a commit that referenced this issue Jan 16, 2022
lemon24 added a commit that referenced this issue Jan 16, 2022
lemon24 added a commit that referenced this issue Jan 16, 2022
lemon24 added a commit that referenced this issue Jan 16, 2022
lemon24 added a commit that referenced this issue Jan 16, 2022
lemon24 added a commit that referenced this issue Jan 17, 2022
1092 warnins -> 0 warnings.

For #266.
lemon24 added a commit that referenced this issue Jan 17, 2022
lemon24 added a commit that referenced this issue Jan 19, 2022
lemon24 added a commit that referenced this issue Jan 19, 2022
lemon24 added a commit that referenced this issue Jan 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant