I've detailed the existing backend methods in this Gist: https://gist.github.com/tech4him1/9972cf4465e88e7f4e8cc29e6019e8c6

I've been attempting to implement an S3-based custom backend, using the reference material from here, @talves, and @tech4him1, and continually digging through netlify-cms-core, netlify-cms-backend-github, and netlify-cms-backend-test. I'm still lacking a lot of clarity, and there seem to be inconsistencies and asymmetries in the existing methods and documentation. I would be interested in participating in discussions about the backend API and how it will change.

@snoopdouglas I apologize that my spare time for this effort has been reappropriated. I have not been keeping up with the current state of netlify-cms, but anyone else should feel welcome to pick up the torch and run with it, as I will not likely get back to it.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Closing this as stale
Summary
This is a design proposal for our backend API - a goal of the 2.0 release. It is also an update on the immediate plans for implementing pagination for the GitLab API. This is an architecturally complex feature, and the current code needs significant refactoring to support more pagination paradigms, such as the links used by the GitLab API or the JSON wrapper used by BitBucket. Developing this feature should therefore be done with thought toward our future API.

I've had extensive discussions with other maintainers and contributors about improvements we want out of this process - these have been a major influence on the design I'm proposing, so I encourage continuing that process with further critique of the suggestions here. While I've written things in an imperative style ("the best way to do this is foo", "we should do bar"), that's just for the sake of brevity. This is all currently in flux, so now is the ideal time to bring up as many concerns as you can think of.
The immediate goals are listed at the end - the proposed design is not intended to be implemented top-to-bottom all at once. It is important to consider, though, since the entire entry-listing process already requires refactoring to support pagination.
Reasons for restructuring
The current pagination code is heavily coupled to a few assumptions which prevent using most REST APIs idiomatically. These include the following:
TODO: link to the code assuming these conditions
The reducer that manages `store.entries.pages` maintains a single Immutable.js `List` of ids for each collection, which is simply concatenated to as new pages are loaded. It's the minimum possible implementation for the infinite-scroll pagination used by the Algolia integration. However, it is incompatible with the way most third-party APIs implement pagination, and it prevents us from loading arbitrary pages and having their order make sense.

Usually, APIs return - either in the JSON body or in headers - a set of links which allow for navigation to given pages relative to the current one. In GitLab, for instance, you can navigate to the next, previous, first, or last page via links in specific headers, but the docs specifically warn against trying to construct URLs based on indexes. BitBucket, on the other hand, uses a wrapper with pagination information around the values returned by its API, and only provides `previous` and `next` (and `previous` isn't guaranteed).

Additionally, pages can change between requests as entries are added or removed, and arbitrary changes in pagination (the sources of which may be well outside the current page) must be resolved on top of sorting.
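For illustration, GitLab exposes its pagination links via an RFC 5988 style `Link` header. A minimal parser for such a header might look like the following (a sketch, not existing netlify-cms code):

```javascript
// Parse an RFC 5988 style Link header, e.g.
//   <https://example.com/api?page=2>; rel="next", <https://example.com/api?page=5>; rel="last"
// into an object mapping rel names to URLs.
function parseLinkHeader(header) {
  const links = {};
  for (const part of header.split(',')) {
    const match = part.match(/<([^>]+)>;\s*rel="(\w+)"/);
    if (match) links[match[2]] = match[1];
  }
  return links;
}
```

A backend could then navigate using only the rels the API actually returned, rather than constructing page URLs from indexes.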
Finally, there are significant improvements we've already discussed which can be leveraged to make dealing with pagination easier.
Current backend API
Init and auth lifecycle
The CMS starts its lifecycle by rendering the `App` component, which dispatches `loadConfig` when it mounts:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/components/App/App.js#L60-L62
When the config finishes loading in `loadConfig`, it dispatches `authenticateUser`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/actions/config.js#L102-L105
`authenticateUser` dispatches `authenticating` (`AUTH_REQUEST`), then calls `backend.currentUser`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/actions/auth.js#L46-L51
`backend.currentUser` checks `this.user` and `this.authStore` for existing user information, then passes that to `this.implementation.restoreUser` and resets the `authStore` user to that, returning `null` if there is no user info:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/backends/backend.js#L95-L107
When the `App` component is rendering and the user is `null`, it renders the return value of `this.authenticating()`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/components/App/App.js#L121-L123
`this.authenticating` gets `auth` (originally from the redux store) as a prop and creates a backend with `currentBackend`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/components/App/App.js#L68-L70
`currentBackend` returns a function that takes a `config` and returns a memoized backend with `resolveBackend`, which constructs and returns a backend based on the `config`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/backends/backend.js#L358-L367

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/backends/backend.js#L343-L356
Next, `App.authenticating` renders `backend.authComponent()`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/components/App/App.js#L76-L89
It is passed the following props:

- `onLogin`: bound version of the `App.handleLogin(credentials)` method
- `error`: the last auth error (during init this should be null)
- `isFetching`: whether the authentication is fetching (during init this is null)
- `base_url`: `backend.base_url` from the `config.yml`
- `siteId`: `backend.site_domain` from the `config.yml`

(The names of the `siteId` and `base_url` props should be made consistent - they will be a public face of our backend API. We may want to rethink how we handle backend configuration as well.)

The component returned from `backend.authComponent` calls its `onLogin` prop with a `credentials` argument (e.g., the GitHub backend). This is passed to `App.handleLogin`, which dispatches `loginUser` with the same argument:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/components/App/App.js#L64-L66
`loginUser` dispatches an `authenticating` (`AUTH_REQUEST`), then calls `backend.authenticate`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/actions/auth.js#L66-L72
`backend.authenticate` calls `this.implementation.authenticate` to get a `user`, adds a `backendName` property, caches it, and finally resolves to it:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/backends/backend.js#L113-L119
In current implementations, this sets properties on the `backend` instance itself (`this.token`, for instance) and initializes the backend `API` object. These are not technically public interfaces, but they are patterns used by all current backends (e.g., the GitHub `backend.authenticate`).

Next, `loginUser` dispatches `authenticate` (`AUTH_SUCCESS`), or, on error, dispatches a notification action and then an `authError` (`AUTH_FAILURE`):

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/actions/auth.js#L72-L84
The `AUTH_SUCCESS` action is received by the auth reducer, which resets the stored auth information in Redux to `Map({ user })`:

https://github.com/netlify/netlify-cms/blob/b4b584682473924556a41cae10f64f085b0f432b/src/reducers/auth.js#L4-L19
Methods
The methods a backend implementation is required to have are a mixed bag: some are synchronous, some asynchronous, and some are called with a `Promise.resolve` wrapper, so a synchronous return works even though the rest of the code treats the operation as asynchronous.

TODO: expand on these methods
Authentication
sync authComponent()
Returns a component which presents an interface to login to the CMS. This is usually either a login form or a button to trigger a login popup.
async authenticate(credentials)
Calls the backend with the credentials returned from the `authComponent`. Returns a `user` object which is stored in Redux (see auth lifecycles above).

sync getToken()
This is used solely to pass a `token` to integrations, and is called synchronously.

sync/async logout()

sync/async restoreUser(storedUser)
Published entries
async entriesByFiles(collection, extension)
This only receives the `extension` argument because of the dynamic dispatch used for this and `entriesByFolder`.

async entriesByFolder(collection, extension)
Note that the existing pagination API is not currently exposed to backends (this would be the most commonly paginated endpoint), as it's only used by an integration.
async getEntry(collection, slug, path)
The `path` argument to this strikes me as odd - shouldn't the backend be responsible for resolving a collection/slug pair to a path? This seems to couple the API needlessly to a filesystem-based backend, or at least require other backends to emulate being based on a filesystem.

async persistEntry(entryObj, options)
TODO: figure out the arguments to this function
TODO: figure out if the collection object is consistent across calls
Media library
async persistMedia(file, options)
async getMedia()
async deleteMedia(path)
Editorial workflow
async unpublishedEntries()
async unpublishedEntry(collection, slug)
async updateUnpublishedEntryStatus(collection, slug, newStatus)
async publishUnpublishedEntry(collection, slug)
async deleteUnpublishedEntry(collection, slug)
No clear category
async deleteFile(path, commitMessage)
This method couples the backend interface not only to fs-like storage, but also to GitHub's particular implementation of this API method. This call should be reworked to be in terms of a collection and entry (or just a collection in the case of a single-file collection, if we end up implementing those). It is also confusingly similar to `deleteUnpublishedEntry`.

Redux store design
New backend design
Goals
Backends as stateless functions
Instead of storing auth, token, and configuration information in the state of an object, we can pass the required state into the backend when calling it, along with functions that allow the backend to store the state it needs. This state can then be handled by the core code however we want - this would allow us to more easily debug and test backend functions, including the eventual possibility of a backend-agnostic test suite we can provide to assist backend authors. All of that is much harder to do reliably if our API requires the backends to store state themselves.
We can do this by passing a `context` object as an argument containing the value `state` and the asynchronous function `setState`. `setState` triggers a Redux action which stores a new backend state in the primary Redux store. This allows us to control backend state by changing the arguments we pass in - a huge benefit for both testing backends and using them in unanticipated ways.

Is there a use-case for changing `state` to an async function `getState`?

Additionally, this would make creating a backend as simple as exporting functions and a capabilities object (perhaps some other metadata) from a JS module. The imported object could be passed directly to `CMS.registerBackend`.
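A minimal sketch of what such a stateless backend function could look like, assuming a `context` shape of `state` plus an async `setState` (the property names and the placeholder authentication logic are assumptions, not a finalized API):

```javascript
// Hypothetical stateless backend: no `this`, no instance state. All
// state flows in through `context.state` and out through `context.setState`.
const statelessBackend = {
  async authenticate(credentials, context) {
    // Placeholder logic - a real backend would exchange credentials
    // with its API here.
    const user = { name: credentials.name, token: credentials.token };
    // Persist state via the core-provided setter instead of mutating `this`.
    await context.setState({ ...context.state, user });
    return user;
  },
};
```

Because the function's behavior is fully determined by its arguments, the core (or a test suite) can reproduce any backend state simply by passing in a different `context`.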
Backend state storage
Since backends will be implemented by arbitrary third parties, we can't allow the core code to depend on any part of the redux structure that's specific to a given backend. However, we also want to move to a stateless backend API that stores state in the Redux store, letting us leverage Redux state tracking. We also don't want to store functions in the Redux store, since they can't be serialized - this rules out hiding state with closures, as well as using a `Symbol` for a unique backend key.

I believe the best way to do this is to create a separate tree in the redux store specifically for backends. It should be considered a bug for any of the core code to rely on the data within this structure directly - if there is any way to enforce that, we should do so. It should be solely for the purposes of storing backend state, which should be considered opaque to the core CMS application. This allows us to store data in Redux without coupling the core code to a specific backend storage layout.
One important restriction on what backends should set as state is that it should remain serializable, to preserve the benefits of being able to serialize our entire Redux store.
Credential elision with stateless backends
One notable concern with saving Redux state is that we don't want to risk exposing a user's credentials. While we don't currently serialize the Redux store at all, we should keep in mind that we'll need some sort of credential elision if we decide to add an export of the Redux state for things like bug reports.
Allowing backends to use Redux actions/reducers
While I think it would be a mistake to require the use of Redux for backend state, I think it would be useful to allow backends to use the actions/reducers pattern for managing their own state.
An easy way to implement this would be to provide a default action and reducer that simply set the new backend state wholesale. Backends that wish to override the default backend reducer could export their own reducer, which will be used if it exists. This would make Redux part of the public API, albeit a small one.
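A minimal sketch of that default action/reducer pair (the action type and function names are illustrative, not existing netlify-cms code):

```javascript
// Hypothetical default action/reducer for opaque backend state.
const BACKEND_SET_STATE = 'BACKEND_SET_STATE';

// Action creator: replace the backend's state wholesale.
const setBackendState = newState => ({ type: BACKEND_SET_STATE, payload: newState });

// Default reducer: swap in the new state; ignore everything else. A backend
// wanting finer-grained actions would export its own reducer instead.
const defaultBackendReducer = (state = {}, action) =>
  action.type === BACKEND_SET_STATE ? action.payload : state;
```

Backends that never touch Redux directly still get working state management through this default, while Redux-aware backends can opt into richer action sets.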
Redux structure
Configuration
TODO
Init and auth
TODO
Functions
TODO: determine function signatures - these will be mostly the same with the addition of the `context` argument and a few refactors.

Capabilities
API
A capabilities API for backends would allow them to declare what featureset they support. For instance, GitLab's pagination capabilities might look like the following:
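Since no concrete shape was finalized in this proposal, here is a hedged sketch of what such a capabilities object could look like - the key names are illustrative assumptions, based on the next/previous/first/last page links GitLab exposes:

```javascript
// Illustrative only - not a finalized schema. GitLab's API can link to
// the next, previous, first, and last pages, so its pagination
// capabilities might advertise all four actions.
const gitlabCapabilities = {
  pagination: {
    next: true,
    prev: true,
    first: true,
    last: true,
  },
};
```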
Since the plan is to use capabilities only for pagination to begin with, GitHub's capabilities would start out as simply an empty object.
The UI would receive the current capabilities as a prop, and could change UI based on what actions are available.
Pagination
Initially, capabilities will be implemented to handle pagination.
None of these capabilities can be assumed - BitBucket, for instance, only guarantees that the actual values and the link to the next page will exist.
Search
Media library
Pagination
Features impacted by pagination
I summarized this for @talves in Gitter as follows:
The impact of this is that backends will need some way of communicating their capabilities to the core code, since feature support differences between backends can be small and fine-grained. The "crash on unsupported features" approach we use for backends without editorial workflow support won't work for more subtle differences like "which pagination links are available".
API
The core application will represent pagination state as an opaque `cursor`. The cursor is simply a value used to pass pagination context to the backend - the core code should never look into or examine the cursor's contents. Cursors should be serializable, so they can be stored in Redux.

The `cursor` lifecycle will look something like the following:

When a paginated backend endpoint is called, the backend creates and returns a new cursor alongside the results.
The core code stores the cursor object as part of the current view state.
When the UI is rendered, it is passed function props that allow it to perform backend pagination. These should close around the cursor and pass it to the action creator so that the UI itself never has direct access to the cursor. It is also passed the static backend capabilities.
The UI decides which actions to show based on the backend's capabilities.
When the user performs a pagination action, the UI calls the corresponding function prop to create and dispatch that action (the cursor is passed along by calling the function prop, which closed over it earlier).
The action creator calls `backend.cursor.<ACTION>(cursor)` with the cursor, and receives the results and a new cursor.

TODO: how do we handle the count, size, and index?
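The lifecycle above can be sketched as follows. The function names and the `backend.cursor.next` shape are hypothetical - `next` stands in for whichever `<ACTION>` the backend's capabilities advertise:

```javascript
// Sketch of the proposed cursor lifecycle; all names are illustrative.
async function loadFirstPage(backend, collection) {
  // The backend returns results plus an opaque, serializable cursor.
  const { entries, cursor } = await backend.entriesByFolder(collection);
  // Core code stores the cursor as view state but never inspects it.
  return { entries, cursor };
}

async function loadNextPage(backend, cursor) {
  // The action creator hands the stored cursor back to the backend,
  // receiving new results and a replacement cursor.
  return backend.cursor.next(cursor);
}
```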
Page Caching
What do we currently cache?
We currently treat pagination as if it is a static, lazy list of entries. That is, we don't require fetching all pages up-front, but we don't ever invalidate the pagination of the current request.
Should we cache?
Since we're leveraging Git-based APIs, I believe that we should cache collection contents as much as possible. Git helps us here by providing SHAs which can be used to tell when the folder underlying a collection has changed.
Where should caching be done, architecturally?
Since the CMS could conceivably be used with APIs that either cannot or do not provide immutable keys to data, we will want different caching strategies for different backends. Caching should therefore be done in the backends themselves, and be transparent to the core code.
Caching goes quite a ways past pagination - backends are also responsible for caching auth information, entry contents, etc. Doing page caching in the backend fits our current architecture best.
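As a sketch of what backend-local page caching could look like for a Git-based API that exposes folder SHAs - the `fetchFolderSha` and `fetchPage` helpers are hypothetical stand-ins for real API calls:

```javascript
// Hypothetical SHA-keyed page cache. `fetchFolderSha` would issue a cheap
// HEAD request for the folder's SHA; `fetchPage` would fetch the page body.
const pageCache = new Map();

async function getPage(fetchFolderSha, fetchPage, folder, page) {
  const sha = await fetchFolderSha(folder);
  const key = `${sha}:${page}`;
  if (!pageCache.has(key)) {
    // Either the folder changed (new SHA) or this page was never
    // fetched - in both cases the cached entry is missing, so fetch it.
    pageCache.set(key, await fetchPage(folder, page));
  }
  return pageCache.get(key);
}
```

Because the SHA changes whenever the folder's contents change, stale pages are never served: they simply stop being looked up.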
When can page data be safely cached?
When it refers to an immutable collection with a unique identity. For instance, if a Git-based API provides SHAs of folders, we can safely cache the pages of that folder, since if the folder changes its SHA will also change. This can be leveraged to reduce bandwidth usage even further if the API supports `HEAD` requests to get the SHA of a directory or file without the body.

For instance, a page-cache system for a paginated folder using such an API could work like this:

1. Make a `HEAD` request to the currently paginated folder, and retrieve the SHA.

UI
Infinite scrolling
Infinite scrolling should not be specified as a capability of its own, since it is simply an implementation of various other pagination capabilities (next/previous being the minimum required for 2-way infinite scrolling).
Standard page navigation
It may be simpler to drop infinite scrolling entirely in favor of explicit paging. It's extremely easy to drop features dynamically with standard paging - you just show a different set of buttons to go to other pages.
Semantic changes
These changes are goals for the new API, but can't be implemented as a simple wrapper without changes to the core code:
Backends should be able to trigger display of the authentication dialog.
If we pursue composable backends, we'll want to support multiple backends by allowing the default backend to be overridden for specific collections.
Composable backends can implement several features which are currently implemented separately:
Ideally, we'll remove `files` and `folder` collections in favor of single-entry and multiple-entry collections, leaving concerns such as paths to be handled by backend-specific configuration.
Examples
TODO
Integrations
The integrations API is also in need of refactoring or elimination (of the API, not the functionality it's implementing). The writeup on that is now located here: #1171
Immediate Goals
All this long-term API design has a short-term purpose: it directly impacts how we'll want to implement pagination for the GitLab API.
Capabilities API for GitLab/BitBucket pagination
For purposes of GitLab and BitBucket support, the capabilities API will only be used for pagination to begin with. Other backends will be modified to return an empty object for capabilities. The capabilities will be passed into the collections pages to allow pagination support, but a comprehensive capabilities system is beyond the scope of the immediate pagination changes.
Cursor implementation
Since the current pagination implementation cannot be neatly refactored to work with the paginated GitLab or BitBucket APIs, the cursor API will be implemented immediately. Existing backends should need minimal or no changes in order to use the cursor API - they'll simply export no pagination capabilities, return any value whatsoever for the cursor (since it won't actually be used), and continue returning all entries when requesting the first page.

BitBucket
BitBucket will also be updated to use the new capabilities API for pagination. Merging may require some cleanup of the auth code as well, to implement BitBucket's refresh token, but a new auth workflow won't be implemented until this is merged.
New API methods
These will not be implemented by the GitLab or BitBucket backends for now. See the longer-term migration strategy below for more details.
Longer-term migration strategy
Once we've got pagination working and GitLab/BitBucket support merged into master, we'll want to start the work of moving to the new, stateless API (assuming we've decided to move forward with this).
The best way to migrate to this backend API would be to build an object which wraps a backend written to support the new, stateless API, and implements the stateful lifecycle expected by the current API. The object wrapper will likely be quite simple, as it's just providing the state management without any of the redux details. Once this is done, we'll want to develop a generic backend test suite that can be run on any backend (perhaps with some backend-specific setup code for reproducible tests). This will allow several things:
This won't immediately provide the benefits of the stateless backend structure, such as more robust Redux playback, but we've lived without that so far. It's important to prove that the new design is both robust and usable before we try to move heavily-used, battle-hardened code like the GitHub backend to it.
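A rough sketch of the wrapper described above, under the assumption that the stateless API receives a `context` with `state` and `setState` (all names hypothetical):

```javascript
// Hypothetical adapter wrapping a stateless backend so it satisfies the
// current stateful API: the adapter owns the state that the core's new
// Redux tree would otherwise manage.
class StatefulAdapter {
  constructor(statelessBackend) {
    this.backend = statelessBackend;
    this.state = {};
  }

  // Build the `context` argument the stateless API expects.
  context() {
    return {
      state: this.state,
      setState: async newState => { this.state = newState; },
    };
  }

  // Stateful-API method delegating to its stateless counterpart.
  authenticate(credentials) {
    return this.backend.authenticate(credentials, this.context());
  }
}
```

Only the state plumbing lives in the adapter; every actual backend operation is implemented once, against the stateless API, which is what makes a generic backend test suite feasible.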
TODO: expand