Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Paginating ClusterManager Read APIs] Paginate _cat/indices API. #14258

Closed
gargharsh3134 opened this issue Jun 13, 2024 · 30 comments · Fixed by #14718
Closed

[Paginating ClusterManager Read APIs] Paginate _cat/indices API. #14258

gargharsh3134 opened this issue Jun 13, 2024 · 30 comments · Fixed by #14718
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request v2.17.0

Comments

@gargharsh3134
Copy link
Contributor

gargharsh3134 commented Jun 13, 2024

Is your feature request related to a problem? Please describe

As the number of indices grow in an opensearch cluster, the response size and the latency of the default _cat/indices API increases which makes it difficult not only for the client to consume such large responses, but also stresses out the cluster by making it accumulate stats across all the indices.

Thus, pagination will not only help in limiting the size of response for a single query but will also prevent the cluster from accumulating indices stats for all the indices. So, this issue tracks the approaches that can be used to paginate the response.

Describe the solution you'd like

Paginated API would require new query params and response formats to be introduced, thus it is being proposed to add new _list/indices API.
As for the high level approach, for paginating the response, a pagination key would be required for which a deterministic order is/can be maintained/generated in the cluster. Deterministic order is required for starting a new response page from the point where the last page left. Index creation timestamps will thus be used as pagination keys.

Overview

Each index has a creation timestamp stored in IndexMetadata which is part of Metadata object of ClusterState. These creation timestamps can act as sort/pagination key using which list of indices, sorted as per their respective creation timestamps, can be generated. The generated sorted list can then be used to prepare a list of indices to be sent as a response as per the page size.

Proposed User Experience

New API Path/URL:

curl "localhost:9200/_list/indices?___
curl "localhost:9200/_list/indices/{indices}?___ where {indices} is a comma separated list of indices.

New Query Parameters:

Parameter Name Type Default Value Description
next_token String null to be used for fetching the next page
size Integer 500 maximum number of indices that can be returned in a single page
sort String/ENUM asc order of indices in page. Allowed values will be "asc"/"desc". If "desc", most recently created indices would be displayed first

Sample Query :
curl "localhost:9200/_list/indices?next_token=<nextToken>&size=1000&sort=asc"

curl "localhost:9200/_list/indices/{indices}?next_token=<nextToken>&size=1000&sort=asc"

New Response Parameters:

Parameter Name Type Description
next_token String to be used for fetching the next page

Note: The next_token would be Base64 encoded.

New Response Formats:

format=JSON: next_token, and indices will be new keys of the JSON response object.


{
  "next_token" : "<nextToken>",
  "indices" : [{
    "health" : "green",
    "status" : "open",
    "index" : "test-index1",
    "uuid" : "0TbPDgYBRkmifYvJGWYk9w",
    "pri" : "1",
    "rep" : "1",
    "docs.count" : "1",
    "docs.deleted" : "0",
    "store.size" : "7.8kb",
    "pri.store.size" : "3.9kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : ""test-index2,
    "uuid" : "VVU90yukSCqxs9N8VLWT0w",
    "pri" : "1",
    "rep" : "2",
    "docs.count" : "0",
    "docs.deleted" : "0",
    "store.size" : "624b",
    "pri.store.size" : "208b"
  }]
}

Plain text format (or table format): next_token will be present as the last row of the table.

green open test-index  FclPNwqmQie3RnpoOYpLTg 2 3 0 0 1.6kb 416b
green open test-index2 k5FNOQcvSV679Hcs1g-OwQ 2 3 0 0 1.6kb 416b
green open index4-test s2AJ5CoYRru-8lbB871pWA 2 3 0 0 1.6kb 416b
next_token <nextToken>

Proposed Pagination Behaviour

Note: The indices which might get created while the paginated queries are being executed, will be referred to as newly created indices for the following points.

  1. Number of indices in a page will always be less than or equal to the user provided size query parameter.

  2. Displaying newly created indices will depend on the requested sort type.
    If sort is specified as "asc", then the newly created indices will be shown as part of rear end of the response pages. However, for sort specified as "desc", because the subsequent pages will only contain the indices which are older than the already displayed ones, newly created indices will be filtered out.

  3. Any index yet to be displayed, if deleted will NOT be a part of response.

Implementation Details:

Common PageParams class which would contain parameters required for paginating APIs, and the underlying implementation to paginate would thus interact with it:


public class PageParams {

    public static final String PARAM_SORT = "sort";
    public static final String PARAM_NEXT_TOKEN = "next_token";
    public static final String PARAM_SIZE = "size";
    public static final String PARAM_ASC_SORT_VALUE = "asc";
    public static final String PARAM_DESC_SORT_VALUE = "desc";

    private final String requestedTokenStr;
    private final String sort;
    private final int size;

    public PageParams(String requestedToken, String sort, int size) {
        this.requestedTokenStr = requestedToken;
        this.sort = sort;
        this.size = size;
    }

    public String getSort() {
        return sort;
    }

    public String getRequestedToken() {
        return requestedTokenStr;
    }

    public int getSize() {
        return size;
    }

    public void writePageParams(StreamOutput out) throws IOException {
        out.writeString(requestedTokenStr);
        out.writeString(sort);
        out.writeInt(size);
    }

    public static PageParams readPageParams(StreamInput in) throws IOException {
        String requestedToken = in.readString();
        String sort = in.readString();
        int size = in.readInt();
        return new PageParams(requestedToken, sort, size);
    }

}


Common PageToken class which would contain response body parameters for paginated APIs, which the pagination logic should output:

public class PageToken {

    public static final String PAGINATED_RESPONSE_NEXT_TOKEN_KEY = "next_token";

    /**
     * String denoting the next_token of paginated response, which will be used to fetch next page (if any).
     */
    private final String nextToken;

    /**
     * String denoting the element which is being paginated (for e.g. shards, indices..).
     */
    private final String paginatedEntity;

    public PageToken(String nextToken, String paginatedElement) {
        assert paginatedElement != null : "paginatedElement must be specified for a paginated response";
        this.nextToken = nextToken;
        this.paginatedEntity = paginatedElement;
    }

    public String getNextToken() {
        return nextToken;
    }

    public String getPaginatedEntity() {
        return paginatedEntity;
    }

    public void writePageToken(StreamOutput out) throws IOException {
        out.writeString(nextToken);
        out.writeString(paginatedEntity);
    }

    public static PageToken readPageToken(StreamInput in) throws IOException {
        String nextToken = in.readString();
        String paginatedEntity = in.readString();
        return new PageToken(nextToken, paginatedEntity);
    }
}

As, parsing request is the responsibility of Rest Layer, the implementation to do so for paginated APIs can be abstracted out. A new AbstractListAction would be introduced, and the API to be paginated (_list/indices in this case) can thus extend it:


public abstract class AbstractListAction extends AbstractCatAction {

    private static final int DEFAULT_PAGE_SIZE = 100;
    protected PageParams pageParams;

    protected abstract void documentation(StringBuilder sb);

    @Override
    public RestChannelConsumer prepareRequest(final RestRequest request, final NodeClient client) throws IOException {
        boolean helpWanted = request.paramAsBoolean("help", false);
        if (helpWanted || isActionPaginated() == false) {
            return super.prepareRequest(request, client);
        }
        this.pageParams = validateAndGetPageParams(request);
        assert Objects.nonNull(pageParams) : "pageParams can not be null for paginated queries";
        return doCatRequest(request, client);
    }

    @Override
    public boolean isActionPaginated() {
        return true;
    }

    /**
     *
     * @return Metadata that can be extracted out from the rest request. Query params supported by the action specific
     * to pagination along with any respective validations to be added here.
     */
    protected PageParams validateAndGetPageParams(RestRequest restRequest) {
        PageParams pageParams = restRequest.parsePaginatedQueryParams(defaultSort(), defaultPageSize());
        // validating pageSize
        if (pageParams.getSize() <= 0) {
            throw new IllegalArgumentException("size must be greater than zero");
        }
        // Validating sort order
        if (!(PARAM_ASC_SORT_VALUE.equals(pageParams.getSort()) || PARAM_DESC_SORT_VALUE.equals(pageParams.getSort()))) {
            throw new IllegalArgumentException("value of sort can either be asc or desc");
        }
        return pageParams;
    }

    protected int defaultPageSize() {
        return DEFAULT_PAGE_SIZE;
    }

    protected String defaultSort() {
        return PARAM_ASC_SORT_VALUE;
    }

Since paginating an API would require hold of clusterState to fetch metadata for indices, the core logic to paginate can only be executed after a TransportClusterStateAction has been completed. Each API has its own way of calling Transport Actions. APIs would either be making the TransportAction calls directly from the Rest layer or would have a wrapper Transport Action which would call required actions. Given these behaviours co-exist, the approach has to be agnostic of the way TransportClusterState call is being made.
Thus, the proposal is to have PaginationStrategies implementing a common Strategy interface, which the onboarding APIs can instantiate:

public interface PaginationStrategy<T> {

    String INCORRECT_TAINTED_NEXT_TOKEN_ERROR_MESSAGE =
        "Parameter [next_token] has been tainted and is incorrect. Please provide a valid [next_token].";

    /**
     *
     * @return Base64 encoded string, which can be used to fetch next page of response.
     */
    PageToken getResponseToken();

    /**
     *
     * @return List of elements fetched corresponding to the store and token received by the strategy.
     */
    List<T> getRequestedEntities();

    /**
     *
     * Utility method to get list of indices filtered as per {@param filterPredicate} and the sorted according to {@param comparator}.
     */
    static List<IndexMetadata> getSortedIndexMetadata(
        final ClusterState clusterState,
        Predicate<IndexMetadata> filterPredicate,
        Comparator<IndexMetadata> comparator
    ) {
        return clusterState.metadata().indices().values().stream().filter(filterPredicate).sorted(comparator).collect(Collectors.toList());
    }

    static String encryptStringToken(String tokenString) {
        if (Objects.isNull(tokenString)) {
            return null;
        }
        return Base64.getEncoder().encodeToString(tokenString.getBytes(UTF_8));
    }

    static String decryptStringToken(String encTokenString) {
        if (Objects.isNull(encTokenString)) {
            return null;
        }
        try {
            return new String(Base64.getDecoder().decode(encTokenString), UTF_8);
        } catch (IllegalArgumentException exception) {
            throw new OpenSearchParseException(INCORRECT_TAINTED_NEXT_TOKEN_ERROR_MESSAGE);
        }
    }
}


Related component

Cluster Manager

Describe alternatives you've considered

Re-using existing APIs by introducing versioning in Opensearch APIs.
Following URLs were taken into consideration:

  1. _cat/v2/indices - This messes up the API routing for all _cat APIs - Not Preferred
  2. _cat/indices/v2 - v2 will conflict with index names which are supported as filters - Not Preferred
  3. v2/_cat/indices - More rest friendly but it would impact fine grained access control users may have configured on their clusters. Most of the APIs will still be without version in the path and specifics one will have v2 in the path - Preferred
  4. _cat/indicesV2 - This will be less friction but v2 has to be appended to every API being changed - This might also be ok
  5. Using header for version - It is not so obvious or visible. Also for certain APIs (like here where next_token will be required to get next pages after first page), it may require request params or filters to be different so just the setting header may not be sufficient as well. It would require client changes to pass version header.

Alternate implementation details which were tried out:

  1. Generating nextToken and previousToken.
    During the first paginated query (when nextToken in request is null), generate the following:

    queryStartTime = CreationTimeOfMostRecentlyCreatedIndex

    The queryStartTime will be used to filter out any newly created indices & then passed along in the next & previous tokens for subsequent page requests. Creation time of most recently created index is being chosen to avoid using coordinator node's clock which might be skewed.

    nextToken = base64Encoded("PositionOfCurrentPage'sLastIndexInSortedListOfIndices" + "$ " + "CreationTimeStampOfCurrentPage'sLastIndex" + "$" + "queryStartTime" + "NameOfCurrentPage'sLastIndex")

    For previousToken, need to first get the:

    previousPage'sStartingIndexPosition = max(PositionOfCurrentPage'sLastIndexInSortedListOfIndices - maxPageSize, 0)

    then:

    previousToken= base64Encoded("previousPage'sStartingIndexPosition" + "$ " + "CreationTimeStampOfPreviousPage'sStartingIndex" + "$" + "queryStartTime" + "NameOfPreviousPage'sStartingIndex")

  2. Generating sorted list of indices based on creation time. To generate a sorted list, the following approaches need to be compared:

  • (RECOMMENDED) Sorting the list obtained from clusterState metadata at the rest layer each time a paginated query is obtained.

    Pros:
    * Lesser refactoring and cleaner.

    Cons:
    * Performance/Latency overhead of sorting list of indices for each paginated query.

  • Making the indices() map of the Metadata object itself deterministic/sorted. The current Hashmap implementation can be replaced with say LinkedHashMap and each time a new entry (index) is put/inserted into the map, a sorted map will be maintained.

    Pros:
    * Metadata will always remain sorted and can then be directly used at the rest layer.

    Cons:
    Needs significant refactoring & will introduce branching because of the following issues:
    * The order needs to maintained while publishing and applying the clusterState's diff.
    * RemoteState has a different diff logic and does't make use of the common diff utility.
    * Handling multi version nodes in a cluster will be required. While reading the state from a node on an older version, a new sorting layer would be required in readFrom.
    * Handling case when clusterState is built from state stored in Lucene index on the disc.

  • Maintaining a cache on each node which can be updated as part of appliers and listeners being run after each clusterState update. Also, a new TransportAction (with a term version check) might also be required for rest layer to fetch that cache.

  1. New paginationMetadata object to be introduced in the Table class which can then be used while preparing the rest response.

    public static class PaginationMetadata {

        /**
         * boolean denoting whether the table is paginated or not.
         */
        public final boolean isResponsePaginated;

        /**
         * String denoting the element which is being paginated (for e.g. shards, indices..).
         */
        public final String paginatedElement;

        /**
         * String denoting the nextToken of paginated response, which will be used to fetch nextPage (if any).
         */
        public final String nextToken;
        
        /**
         * String denoting the previousToken of paginated response, which will be used to fetch previous page (if any).
         */
        public final String previousToken;

        public PaginationMetadata(@NonNull boolean isResponsePaginated, @Nullable String paginatedElement, @Nullable String nextToken, @Nullable String previousToken) {
            this.isResponsePaginated = isResponsePaginated;
            assert !isResponsePaginated || paginatedElement != null : "paginatedElement must be specified for a table which is paginated";
            this.paginatedElement = paginatedElement;
            this.nextToken = nextToken;
            this.previous= previousToken;
        }
    }

Additional context

No response

@dblock
Copy link
Member

dblock commented Jul 1, 2024

[Catch All Triage - Attendees 1, 2, 3, 4, 5]

@rwali-aws rwali-aws added the v2.16.0 Issues and PRs related to version 2.16.0 label Jul 11, 2024
@backslasht
Copy link
Contributor

(OPEN POINT) Response for a JSON format request is easy to define with the existing behaviour. However, with plain text format response, which is actually a (n*n) table with headers and rows, nextToken doesn't fit in.

Curious to know how are we solving it.

Whether to expose a parameter for user to get the list of indices in the ascending or descending order of their corresponding index create timestamps.

This makes sense, helps to identify the newly created indices with descending time stamp.

@shwetathareja
Copy link
Member

asc/ desc sorting will definitely be helpful.

@shwetathareja
Copy link
Member

Introducing nextToken in the responseBody will change how response is parsed today

@rwali-aws rwali-aws added v2.17.0 and removed v2.16.0 Issues and PRs related to version 2.16.0 labels Jul 22, 2024
@andrross
Copy link
Member

@gargharsh3134 Regarding the question of returning JSON, the "cat" in "_cat" literally means "compact aligned text". If the proposed API is going to return JSON then it really sounds like a different API that isn't "cat". Is there an example today of a _cat API that returns something other than compact aligned text? I'm not totally opposed to changing the meaning of "cat", but you're already proposing to stick a new "v2" in the URL so maybe all that suggests this really should just be a new API all together?

@gargharsh3134
Copy link
Contributor Author

@andrross Sorry if I misunderstood your comment, but JSON format is an existing common url parameter for _cat APIs. Plain text format is however the default one, but if specified, the JSON response as of today looks like this:

[email protected]: ~
% curl "localhost:9200/_cat/indices?format=json&pretty" 
[
  {
    "health" : "green",
    "status" : "open",
    "index" : ".plugins-ml-config",
    "uuid" : "0TbPDgYBRkmifYvJGWYk9w",
    "pri" : "1",
    "rep" : "1",
    "docs.count" : "1",
    "docs.deleted" : "0",
    "store.size" : "7.8kb",
    "pri.store.size" : "3.9kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : ".opensearch-observability",
    "uuid" : "VVU90yukSCqxs9N8VLWT0w",
    "pri" : "1",
    "rep" : "2",
    "docs.count" : "0",
    "docs.deleted" : "0",
    "store.size" : "624b",
    "pri.store.size" : "208b"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : ".kibana_1",
    "uuid" : "1YfSlVzyQHSOOLW9OB6BMw",
    "pri" : "1",
    "rep" : "2",
    "docs.count" : "1",
    "docs.deleted" : "0",
    "store.size" : "15.6kb",
    "pri.store.size" : "5.2kb"
  }
]


The proposal is to add new keys to the JSON object in case requested format is JSON (will edit the issue to bring this out clearly). Please do let me know if you still see this as a concern or otherwise.

Thanks!

@rajiv-kv
Copy link
Contributor

rajiv-kv commented Aug 1, 2024

@gargharsh3134 - Given that we are proposing to sort on indices creation time (latest_indices_first), do we see a usecase for introducing prevToken as parameter ? It would help if you can share an example on prevToken based iteration.

In table format, would the last two rows returned always (even if there no tokens) ? Just thinking as to how clients will adapt to the table parsing.

@gargharsh3134
Copy link
Contributor Author

Given that we are proposing to sort on indices creation time (latest_indices_first), do we see a usecase for introducing prevToken as parameter ? It would help if you can share an example on prevToken based iteration.

@rajiv-kv latest_indices_first and previous_token are being introduced as two independent features. latest_indices_first would be just used to define the order in which the indices are to be returned, while previous_token can be used to move between the pages (i.e. say moving from page3 back to page2).

In table format, would the last two rows returned always (even if there no tokens) ? Just thinking as to how clients will adapt to the table parsing.

Given that the default behaviour of V2 API will be paginated, they will always be returned (will be null if doesn't exist).

Please let me know, if this behaviour seems correct to you.
Thanks!

@andrross
Copy link
Member

andrross commented Aug 1, 2024

Thanks @gargharsh3134! I didn't realize the _cat APIs could already return formats other than compact aligned text. Ignore my comment.

@Bukhtawar
Copy link
Collaborator

My two cents lets start with nextToken and avoid a previous token since it's quite possible to see inconsistent results when user performs prev and then next when the backing data/metadata has changed. Keeping it consistent would require a local copy in some cases, and since there is no keep_alive we might be holding or would require to hold certain resources for long.

@gargharsh3134
Copy link
Contributor Author

@Bukhtawar Agree, this is prone to introducing some inconsistency (as is already called out in the issue) if the already displayed indices/elements are deleted, then the previous page will not be the same as it was when last queried. I was thinking to clarify and explicitly call this behaviour out in the documentation.
The primary motivation behind supporting previous_token was to cover the use case of UI clients such as Dashboards, where if a user requires to query the previous page, they can make use of previous token. That way, the clients don't need to cache or store either of the page contents or next_tokens from the last set of responses to display the previous page.

Please let me know if you still think that we should avoid providing previous_token.

@shwetathareja
Copy link
Member

@gargharsh3134 : Adding Rest API versioning with v2 at the server side needs a more wider discussion. I don't want this decision to be taken in the context of limited APIs (_cat APIs here) rather think more holistically so that in future if versioning is introduced for any other API, it can follow it consistently.

@andrross
Copy link
Member

andrross commented Aug 8, 2024

Adding Rest API versioning with v2 at the server side needs a more wider discussion

@gargharsh3134 @shwetathareja Agreed. There was some initial discussion in #3035 for this.

@shwetathareja
Copy link
Member

shwetathareja commented Aug 20, 2024

Thanks @andrross #3035 this issue is more from client perspective. On the Server side to add versioning in routing path, the different proposals are:

  1. _cat/v2/indices - This messes up the API routing for all _cat APIs - Not Preferred
  2. _cat/indices/v2 - v2 will conflict with index names which are supported as filters - Not Preferred
  3. v2/_cat/indices - More rest friendly but it would impact fine grained access control users may have configured on their clusters. Most of the APIs will still be without version in the path and specifics one will have v2 in the path - Preferred
  4. _cat/indicesV2 - This will be less friction but v2 has to be appended to every API being changed - This might also be ok
  5. _list/indices - Switch to different terminology altogether. The major friction is these are diagnostic APIs and users may be too comfortable with current APIs and it will add discoverability and adoption challenge
  6. Using header for version - It is not so obvious or visible. Also for certain APIs (like here where next_token will be required to get next pages after first page), it may require request params or filters to be different so just the setting header may not be sufficient as well. It would require client changes to pass version header.

I am in favor of Option 3 (Option 4 is also acceptable), but open for feedback if you think any other option is better.

Tagging : @Bukhtawar @backslasht @andrross @dblock @reta @rajiv-kv @rwali-aws

@Bukhtawar
Copy link
Collaborator

Thanks @shwetathareja do we also want to consider versioning using custom headers
For instance we can have curl -H "Accepts-version:2.0" _cat/indices. The benefit of using this is it doesn't clutter the URI with the versioning information and keeps the security bindings intact.

@shwetathareja
Copy link
Member

shwetathareja commented Aug 20, 2024

Thanks @Bukhtawar . I did think about header for versioning, just that it is not so obvious or visible. Also for certain APIs (like here where next_token will be required to get next pages after first page), it may require request params or body to be different so just the setting header may not be sufficient as well. Updated above as well.

@Bukhtawar
Copy link
Collaborator

Thanks @Bukhtawar . I did think about header for versioning, just that it is not so obvious or visible. Also for certain APIs (like here where next_token will be required to get next pages after first page), it may require request params or body to be different so just the setting header may not be sufficient as well. Updated above as well.

The header only serves to remove the URI clutter, while I agree it is not so obvious and that's also what de-clutters the URI. Yes it would require the params to be different but thats with all the other options too, as all of them require client side code changes. I am good with 3, 4 and 6

@backslasht
Copy link
Contributor

I prefer option 3, the most cleanest and less ambiguous of all.

@reta
Copy link
Collaborator

reta commented Aug 20, 2024

Thanks @shwetathareja , I would side with @Bukhtawar here and the (early) conclusions from #3035 as @andrross mentioned. We already use some versioning as part of content type (fe application/vnd.opensearch+json;compatible-with=7), it looks logical (to me at least) to stay consistent and stick to it (OpenAPI specs support content type based schemas if I am not mistaken).

Alternatively, we could keep the same API endpoint _cat/indices, same response payload, but document that:

  • limit the output to N indices (for example 100, could be controlled using query string parameter)
  • return Link header with the next page, fe: Link: </_cat/indices/next_token=xxxx>; rel="next" to indicate more results could be fetched

This will not introduce any changes in the payload / URL but only use hypermedia (Link header) to navigate over the resultset.

@backslasht
Copy link
Contributor

While header solves the problem of not modifying the request/response structure (directly), it is not very apparent from the user perspective. Also, high level clients (across all languages) does need to introduce new request/response structures to support pagination.

@reta
Copy link
Collaborator

reta commented Aug 20, 2024

While header solves the problem of not modifying the request/response structure (directly), it is not very apparent from the user perspective.

Yes, but this is the approach OS (and ES) already offers for quite a while, this not something new we introducing

Also, high level clients (across all languages) does need to introduce new request/response structures to support pagination.

Depending on the approach, but not at all when using Link header (nor request / nor response changes). From the other side, changing response to return token (copying example from description):

green open test-index2 k5FNOQcvSV679Hcs1g-OwQ 2 3 0 0 1.6kb 416b
green open index4-test s2AJ5CoYRru-8lbB871pWA 2 3 0 0 1.6kb 416b
previous_token <previousToken>
next_token <nextToken>

breaks all command line tooling (notably, awk, that people use very often, myself included).

@rajiv-kv
Copy link
Contributor

Thanks @shwetathareja . I think option-3 is better as we can set expectation to users that the response is partial and user needs to make follow-up calls to fetch additional pages.

I think command line tooling would anyways require a change as they would have to interpret the token and accumulate responses across multiple calls.

@shwetathareja
Copy link
Member

Thanks @Bukhtawar @backslasht @reta @rajiv-kv for the feedback.

@reta I do agree with the pain of changing the custom scripts written for _cat API to handle pagination. But, the current _cat APIs were not designed considering large cluster setup (both in terms of nodes and indices/shards) in perspective. That, we all agree we need to fix.

Now, the reason to keep the routing same much as possible is the familiarity of users on how they interact _cat APIs and adding headers is also a friction to users on how they interact with these APIs e.g. users using OpenSearch Dashboards to perform GET _cat/shards, doesn't look for response headers by default. I do agree with @rajiv-kv , in OpenSearch we don't use response headers to store part of the response. This is not something OpenSearch users are familiar with. It will make their diagnostics even more difficult.

@reta
Copy link
Collaborator

reta commented Aug 27, 2024

I do agree with @rajiv-kv , in OpenSearch we don't use response headers to store part of the response. This is not something OpenSearch users are familiar with. It will make their diagnostics even more difficult.

Thanks @shwetathareja , I will very much disagree with you here, we do store the information in response headers (deprecation warnings is one of them, there are more). In my opinion, the goal to keep tooling intact and change non-breaking is very important. In any case, I share just my opinion on the implementation.

Thanks!

@shwetathareja
Copy link
Member

Thanks @reta for sharing your opinion.
100% in for non breaking changes but using Link Header or providing the next token in the response both are breaking changes from user perspective, they will need to change their client or scripts to handle it. Any thing which requires users to change their client is breaking change IMHO. The deprecation header and others are secondary information, it will not impact your response handling in the application layer even if you ignore them.
In case _cat APIs carry lot of baggage with them, I am open to introducing users to _list/* like _list/indices where it is completely new set of APIs and we can set the right expectation as well.

@backslasht
Copy link
Contributor

Any thing which requires users to change their client is breaking change IMHO

+1, completely agree here.

In case _cat APIs carry lot of baggage with them, I am open to introducing users to _list/* like _list/indices where it is completely new set of APIs and we can set the right expectation as well.

IMHO, this is a very clean option. This also provides enough time for users to switch to new API as and when it is required (when they start using large number of indices) while the existing _cat API will work as is.

@reta
Copy link
Collaborator

reta commented Aug 28, 2024

100% in for non breaking changes but using Link Header or providing the next token in the response both are breaking changes from user perspective,

Using new Link header ls 100% non-breaking change: existing clients are not forced to consume it, there is no any user facing change in request or/and response.

In case _cat APIs carry lot of baggage with them, I am open to introducing users to _list/* like _list/indices where it is completely new set of APIs and we can set the right expectation as well.

Agree with @backslasht , yet another good option to consider

@shwetathareja
Copy link
Member

Using new Link header ls 100% non-breaking change: existing clients are not forced to consume it, there is no any user facing change in request or/and response.

Just to clarify, existing _cat APIs wouldn't have been removed even in 3.0 (in case this caused the confusion).

So we have alignment on introducing users to _list/* APIs. We are aiming to add this support for selected APIs in 2.17 itself.

@dblock
Copy link
Member

dblock commented Aug 29, 2024

So the cat APIs support a number of options that change what they return, such as format or h. But we are saying that with pagination that does not return CAT format anymore, only JSON, so adding a size parameter that turns on pagination and wraps the response in a standard paginated envelope with next and previous is not one of the many options above. Is text format the only exception? In that case would it be simpler to just not return next/previous in text format or error if format=text and size?

I do like _list too if it's implemented very generically and wraps anything that supports pagination in a standard envelope with next/prev and other metadata. It would give us a cookbook to add pagination to any API, and could be similar to how HAL format works for all APIs that enables generic clients.

{
    "_links": {
        "first": {
            "href": "http://localhost:9200/_list/_cat/aliases"
        },
        "prev": {
            "href": "http://localhost:9200/_list/_cat/aliases?cursor=..."
        },
        "self": {
            "href": "http://localhost:9200/_list/_cat/aliases?cursor=..."
        },
        "next": {
            "href": "http://localhost:9200/_list/_cat/aliases?cursor=..."
        }
    },
    "page": {
        "size": 2,
        "totalElements": 10,
        "totalPages": 5,
        "number": 1
    },
    "_embedded": {
        "_cat/aliases": [

I am not attached to the HAL format at all, OpenSearch probably wants to develop its own page envelope.

If any API can be paginated this way then clients can implement something generic for pagination too. In ruby for example we can make this very elegant and easy to paginate over a large volume of data (example from slack-ruby-client).

client.cat_aliases(size: 10) # returns the first page of aliases from /_cat/aliases

client.cat_aliases(size: 10) do |alias|
   # automatically paginates underneath, yields aliases one by one
end

I want to call out that it's important the embedded items be in the identical format to what _cat/aliases?format=json returns today to avoid duplicating schema all over the place.

@reta
Copy link
Collaborator

reta commented Aug 29, 2024

I am not attached to the HAL format at all, OpenSearch probably wants to develop its own page envelope.

I very like this direction but it is difficult to replicate it for the _cat plain output (non JSON/YAML), but the links could be propagated over Link HTTP standard header.

I want to call out that it's important the embedded items be in the identical format to what _cat/aliases?format=json returns today to avoid duplicating schema all over the place.

💯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request v2.17.0
Projects
Status: ✅ Done
Status: 2.17 (First RC 09/03, Release 09/17)
Development

Successfully merging a pull request may close this issue.

9 participants