Indexer: adhoc dataset indexing API #3345

Closed
t83714 opened this issue Apr 19, 2022 · 2 comments

t83714 commented Apr 19, 2022

Indexer: Adhoc dataset indexing API

The indexer currently subscribes (via a registry webhook) to the dataset change event stream and automatically indexes any dataset changes.

This process may introduce a delay of a few seconds between a change being made to the original dataset in the registry and that change being indexed in the search engine.

This short delay won't cause much trouble for general-purpose updates. But for access-control-related updates, we want a proactive interface that indexes the changes without delay.

This ticket is about adding 2 new APIs to the indexer:

  • index dataset by id: PUT /v0/dataset/:id
    • the indexer will attempt to retrieve the dataset data from the registry using the ID and index the new data into the search engine
  • delete dataset by id: DELETE /v0/dataset/:id
    • the indexer will delete the dataset with the specified ID from the search engine
  • Once the APIs are added, we also need to expose them via the gateway & add auth to the existing APIs (which were previously internal-only APIs).
  • The following APIs will be exposed:
    • POST /reindex : trigger a proactive full index.
      • validate auth via operationUri api/indexer/reindex.
    • POST /reindex/snapshot: trigger a snapshot.
      • validate auth via operationUri api/indexer/reindex/snapshot.
      • We don't use this API at the moment, as it might not be fully functional due to this ticket
    • GET /reindex/in-progress: get progress info for a triggered full index.
      • validate auth via operationUri api/indexer/reindex/in-progress.
    • DELETE /dataset/{datasetId}: delete a dataset from search engine index
      • validate auth via operationUri object/dataset/delete
    • PUT /dataset/{datasetId}: reindex a single dataset
      • validate auth via operationUri object/dataset/update
  • The following APIs will stay internal-only:
    • POST /registry-hook: web-hook listener
      • stays an internal API for performance reasons
    • Status APIs, e.g. /status/ready & /status/live
      • stay internal APIs, as they are for internal use only
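
To make the route-to-permission mapping above easier to check, here is a minimal sketch in Python. It is not the indexer's or gateway's actual code: the `EXPOSED_ROUTES` table and the `operation_uri_for` helper are hypothetical names used for illustration only. The paths and operationUri values are the ones listed above, and any route missing from the table (e.g. /registry-hook or the status APIs) is treated as internal-only.

```python
import re
from typing import Optional

# (HTTP method, path pattern, operationUri) for the routes proposed for exposure.
# Hypothetical table: values are copied from the list in this ticket.
EXPOSED_ROUTES = [
    ("POST", r"/reindex", "api/indexer/reindex"),
    ("POST", r"/reindex/snapshot", "api/indexer/reindex/snapshot"),
    ("GET", r"/reindex/in-progress", "api/indexer/reindex/in-progress"),
    ("DELETE", r"/dataset/[^/]+", "object/dataset/delete"),
    ("PUT", r"/dataset/[^/]+", "object/dataset/update"),
]


def operation_uri_for(method: str, path: str) -> Optional[str]:
    """Return the operationUri to validate for a request.

    Returns None when the route is not in the exposed list, i.e. it is
    assumed to stay internal-only. Hypothetical helper, not a real API.
    """
    for route_method, pattern, operation_uri in EXPOSED_ROUTES:
        # fullmatch ensures "/reindex" does not also match "/reindex/snapshot".
        if method.upper() == route_method and re.fullmatch(pattern, path):
            return operation_uri
    return None
```

For example, `operation_uri_for("PUT", "/dataset/ds-123")` yields the `object/dataset/update` operation, while `operation_uri_for("POST", "/registry-hook")` yields `None`, so the webhook listener would not be reachable through the gateway under this sketch.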

t83714 commented Apr 21, 2022

Partially done in PR: #3344
i.e. the APIs have been added, but we still need to expose them via the gateway & add auth to the existing APIs (which were previously internal-only APIs)


t83714 commented May 25, 2022

Closed via PRs #3344 and #3358

t83714 closed this as completed May 25, 2022