Changes for v2 content libs #143

ormsbee · 2024-01-18T15:22:09Z

Work to support openedx/edx-platform#34066

This bumps the version for openedx-learning to 0.5.0.

Some highlights:

new API functions and queries needed for draft/publish comparisons, fetching & creating component versions, etc.
re-created the migrations yet again, since this is the last time I should be able to get away with it before it gets installed into edx-platform.
made some fixes that I realized would bite us later (like making the text collation text insensitive for search purposes later).
new custom manager to make it easier to specify common "select_related" queries on models, first used on Component.
a few performance tweaks based on looking at the queries that show up when using this in edx-platform in Switch v2 libraries to Learning Core data models edx-platform#34066 (mostly getting rid of n+1 queries)

Squashed commit message:

This commit adds a number of features needed to accomodate content
libraries. These are all collected together because this work was done
while iteratively making more and more of libraries functional, and it
was difficult to properly separate out later.

This bumps the version to 0.5.0.

New publishing API functions:

* Learning Packages
    * get_learning_package
    * get_learning_package_by_key,
    * update_learning_package

* Publishable Entities
    * get_publishable_entity and
    * get_publishable_entity_by_key

* Draft/Publishing
    * get_last_publish
    * get_all_drafts
    * get_entities_with_unpublished_changes
    * get_entities_with_unpublished_deletes
    * get_published_version
    * set_draft_version
    * soft_delete_draft
    * reset_drafts_to_published

New Components API functions:

* create_next_version
* get_component_by_key
* component_exists_by_key
* get_components

PublishableEntityMixin improvements:

* New property methods: latest, has_unpublished_changes
* Existing methods were improved to use model-cached values.

Encoding fixes:

* collations for multiple fields were changed to be case insensitive in
  order to better support simple text searches.

Model optimizations:

* created the WithRelationsManager to more easily define querysets that
  involve adding select_related fields. This was used to elininate a
  number of n+1 queries that came up during libraries development.
* PublishableEntityVersion.version_num was reduced to a 4-byte
  PositiveIntegerField instead of using the 8-byte default for
  openedx_learning apps (it's assumed we will never make more than 2
  billion versions of a particular PublishableEntity). This was done to
  help reduce the size of indexes.

Typing:

* type annotations were added to many functions and test functions.

ormsbee · 2024-01-22T01:50:52Z

Okay, I think I've gotten all the basic changes in there. Now to fix old tests and add new ones for the new functionality. Then a final re-base and version bump.

ormsbee · 2024-01-22T23:58:03Z

Old tests are fixed, added a few new ones, still some more to go. I'll deal with mypy last.

ormsbee · 2024-01-23T00:18:18Z

@kdmccormick, @bradenmacdonald: If you have some time, please give this a review pass. The tests aren't fully there yet, but I'd appreciate your thoughts on the API additions especially.

bradenmacdonald

These API changes look great @ormsbee ! All my notes here are very minor nits, so feel free to ignore.

openedx_learning/core/components/api.py

bradenmacdonald · 2024-01-24T19:46:21Z

openedx_learning/core/components/api.py

-def get_component_by_pk(component_pk: int) -> Component:
-    return Component.objects.get(pk=component_pk)
+def get_component_by_key(
+    learning_package_id: int,


Nit/Q: Is it inconsistent that Learning Package PK is suffixed _id whereas component PKs are suffixed _pk in parameter names in this API? Or is there some difference related to what types of ID each thing has?

learning_package_id: int vs. component_pk: int

Good point. Maybe I should stick with _pk for everything?

The reason why it's not component_id is because there is no Component.id field–Components have a OneToOneField relationship to PublishableEntity with primary_key=True. So the actual database field is oel_components_component.publishable_entity_id.

So maybe I should just convert everything to be _pk for consistency instead, since learning_package.pk will just be an alias to learning_package.id anyway. But one of the things that I didn't think through at the time is that if you have a foreign key relation to Component from another model, the db field will be component_id, and component_version.component_pk doesn't work.

Personally I don't think it's a big deal either way, but I'd probably lean toward calling it component_id for consistency, even though Component.id doesn't exist.

@ormsbee @bradenmacdonald Consider me a soft vote for _pk. In edx-platform, _id could indicate all sorts of things: primary db keys, opaque keys (course_id, library_id, usage_id), and other things (block_id, user_id). _pk is unambiguously a primary db key.

Yeah, I'm going to go with _pk everywhere and do that refactoring now.

openedx_learning/core/components/api.py

openedx_learning/core/publishing/api.py

bradenmacdonald · 2024-01-24T20:13:25Z

openedx_learning/core/publishing/api.py

+               .filter(learning_package_id=learning_package_id) \
+               .exclude(draft__version=F('published__version'))
+
+def get_entities_with_unpublished_deletes(learning_package_id: int):


What's the intended use case of this API method? Perhaps a widget in the UI that says something like "These things have been deleted since your last publish: ____" ?

I'm not entirely sure, actually. It's returned as part of the metadata in the get_library call.

In the new code, I just run .exists() on the queryset returned here.

openedx_learning/core/publishing/model_mixins.py

tests/openedx_learning/core/components/test_api.py

openedx_learning/core/components/api.py

ormsbee · 2024-01-30T16:51:17Z

Okay, I wrote the tests, made changes based on the review comments, and the linters are appeased. I'm going to do the _pk refactoring now, but with any luck I'm hoping this can merge today.

ormsbee · 2024-01-30T17:38:38Z

Okay, slight delay because I noticed that it's always doing the work of loading all components no matter the pagination params in the REST API. I think this is a problem on the edx-platform side, but I'm looking into it to be sure.

kdmccormick

I reviewed all the apis, but had to sign off before getting to the mixins and model managers. Looking great! No blockers from me.

openedx_learning/core/components/api.py

ormsbee · 2024-01-30T23:31:38Z

Okay, I wrote the tests, made changes based on the review comments, and the linters are appeased. I'm going to do the _pk refactoring now, but with any luck I'm hoping this can merge today.

I went back and forth on _id vs. _pk for a bit and then decided to punt entirely by making these arguments positional-only for the moment. >_<

ormsbee · 2024-01-31T00:30:54Z

Also, I tracked down the query issue to something in edx-platform, and I'll have an update for that PR with the fix later. Part of it points to an issue that our internal APIs have with QuerySets and pagination. Namely, the pattern for DRF and Django in general is QuerySet -> Pagination -> Serialization. So it gets awkward when we're returning something from an API that we want to be QuerySet-like in that it's filterable and pageable, but also presents something that is not actually a model. I hacked up something before that wrapped the __iter()__ call and did serialization that seemed to work, but I feel like that's probably too much hackery to want to support in the longer term.

Anyhow, I'll discuss it more on the other PR.

ormsbee · 2024-01-31T05:27:54Z

It occurred to me while looking at some recent comments in the wiki w.r.t. library content needs that we probably want to have ComponentType represented as its own model. Partly for saving space, but mostly because it gives us something to later hang policy off of on a per-type basis, e.g. "videos are okay to show in libraries, SGA blocks are experimental for library use".

I implemented that this evening. It was pleasantly straightforward, and it also gets rid of our having to tell pylint to ignore our use of type as a variable name. Tests already pass, so hopefully I can just do it as a fast-follow to this PR tomorrow if this merges.

ormsbee · 2024-01-31T16:31:50Z

BTW, the only reason this build isn't green is because I was a doofus and mis-spelled a couple of conventional commit messages. Please know that it's ready for final review, and I'll squash it at the end.

bradenmacdonald

These are nice API changes and the code looks good to me!

I didn't run/test the code - let me know if you'd like me to.

bradenmacdonald · 2024-01-31T19:13:07Z

Question that's not directly related: Can you use add_content_to_component_version to modify a ComponentVersion that has already been published?

This commit adds a number of features needed to accomodate content libraries. These are all collected together because this work was done while iteratively making more and more of libraries functional, and it was difficult to properly separate out later. This bumps the version to 0.5.0. New publishing API functions: * Learning Packages * get_learning_package * get_learning_package_by_key, * update_learning_package * Publishable Entities * get_publishable_entity and * get_publishable_entity_by_key * Draft/Publishing * get_last_publish * get_all_drafts * get_entities_with_unpublished_changes * get_entities_with_unpublished_deletes * get_published_version * set_draft_version * soft_delete_draft * reset_drafts_to_published New Components API functions: * create_next_version * get_component_by_key * component_exists_by_key * get_components PublishableEntityMixin improvements: * New property methods: latest, has_unpublished_changes * Existing methods were improved to use model-cached values. Encoding fixes: * collations for multiple fields were changed to be case insensitive in order to better support simple text searches. Model optimizations: * created the WithRelationsManager to more easily define querysets that involve adding select_related fields. This was used to elininate a number of n+1 queries that came up during libraries development. * PublishableEntityVersion.version_num was reduced to a 4-byte PositiveIntegerField instead of using the 8-byte default for openedx_learning apps (it's assumed we will never make more than 2 billion versions of a particular PublishableEntity). This was done to help reduce the size of indexes. Typing: * type annotations were added to many functions and test functions.

ormsbee · 2024-01-31T20:00:32Z

Thanks folks! I squashed my commits and made a decent commit message out of it. Will merge once things are green.

ormsbee · 2024-01-31T20:41:23Z

Question that's not directly related: Can you use add_content_to_component_version to modify a ComponentVersion that has already been published?

Yes, you can. 95% of the time you shouldn't, but some real use cases came up where doing so seems like it would be beneficial:

Bug fixes: We expect people to attach data to ComponentVersions, but that code will have bugs at some point, and people will want to be able to correct that programmatically without creating a new "version" of that thing.
Data upgrade/migration: Say we have tens of thousands of Videos with SRT files, and we want to bulk upgrade everything to VTT files instead. We'd probably update the player to recognize both (with a preference for VTT), and then bulk convert SRT to VTT where VTT is absent.

ormsbee mentioned this pull request Jan 18, 2024

Switch v2 libraries to Learning Core data models openedx/edx-platform#34066

Merged

6 tasks

ormsbee marked this pull request as draft January 18, 2024 15:23

ormsbee force-pushed the changes-for-v2-content-libs branch from 7e5546a to 979d6cc Compare January 22, 2024 23:38

ormsbee marked this pull request as ready for review January 23, 2024 00:07

ormsbee requested review from bradenmacdonald and kdmccormick January 23, 2024 00:07

bradenmacdonald reviewed Jan 24, 2024

View reviewed changes

kdmccormick reviewed Jan 26, 2024

View reviewed changes

openedx_learning/core/components/api.py Outdated Show resolved Hide resolved

ormsbee force-pushed the changes-for-v2-content-libs branch from 3207984 to f62f8c5 Compare January 29, 2024 16:18

kdmccormick reviewed Jan 30, 2024

View reviewed changes

openedx_learning/core/components/api.py Outdated Show resolved Hide resolved

openedx_learning/core/components/api.py Show resolved Hide resolved

ormsbee force-pushed the changes-for-v2-content-libs branch from f5a042f to d36ec08 Compare January 30, 2024 23:17

ormsbee mentioned this pull request Jan 31, 2024

Add ComponentType model #147

Closed

kdmccormick self-assigned this Jan 31, 2024

bradenmacdonald approved these changes Jan 31, 2024

View reviewed changes

kdmccormick approved these changes Jan 31, 2024

View reviewed changes

ormsbee force-pushed the changes-for-v2-content-libs branch from 9695377 to ad6a897 Compare January 31, 2024 19:58

ormsbee force-pushed the changes-for-v2-content-libs branch from ad6a897 to 68a32d8 Compare January 31, 2024 20:00

ormsbee merged commit 2ec2918 into openedx:main Jan 31, 2024
7 checks passed

ormsbee deleted the changes-for-v2-content-libs branch January 31, 2024 20:16

ormsbee mentioned this pull request Feb 2, 2024

Discovery: Determine how v2 Content Libraries can make use of Learning Core #30

Closed

ormsbee self-assigned this Feb 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changes for v2 content libs #143

Changes for v2 content libs #143

ormsbee commented Jan 18, 2024 •

edited

Loading

ormsbee commented Jan 22, 2024

ormsbee commented Jan 22, 2024

ormsbee commented Jan 23, 2024

bradenmacdonald left a comment

bradenmacdonald Jan 24, 2024

ormsbee Jan 24, 2024

bradenmacdonald Jan 24, 2024

kdmccormick Jan 29, 2024

ormsbee Jan 30, 2024

bradenmacdonald Jan 24, 2024

ormsbee Jan 26, 2024

ormsbee Jan 26, 2024

ormsbee commented Jan 30, 2024

ormsbee commented Jan 30, 2024 •

edited

Loading

kdmccormick left a comment

ormsbee commented Jan 30, 2024

ormsbee commented Jan 31, 2024

ormsbee commented Jan 31, 2024 •

edited

Loading

ormsbee commented Jan 31, 2024

bradenmacdonald left a comment

bradenmacdonald commented Jan 31, 2024

ormsbee commented Jan 31, 2024

ormsbee commented Jan 31, 2024 •

edited

Loading

Changes for v2 content libs #143

Changes for v2 content libs #143

Conversation

ormsbee commented Jan 18, 2024 • edited Loading

ormsbee commented Jan 22, 2024

ormsbee commented Jan 22, 2024

ormsbee commented Jan 23, 2024

bradenmacdonald left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ormsbee commented Jan 30, 2024

ormsbee commented Jan 30, 2024 • edited Loading

kdmccormick left a comment

Choose a reason for hiding this comment

ormsbee commented Jan 30, 2024

ormsbee commented Jan 31, 2024

ormsbee commented Jan 31, 2024 • edited Loading

ormsbee commented Jan 31, 2024

bradenmacdonald left a comment

Choose a reason for hiding this comment

bradenmacdonald commented Jan 31, 2024

ormsbee commented Jan 31, 2024

ormsbee commented Jan 31, 2024 • edited Loading

ormsbee commented Jan 18, 2024 •

edited

Loading

ormsbee commented Jan 30, 2024 •

edited

Loading

ormsbee commented Jan 31, 2024 •

edited

Loading

ormsbee commented Jan 31, 2024 •

edited

Loading