DELETE syntax for MSQ #14145

Closed
adarshsanjeev opened this issue Apr 23, 2023 · 4 comments

Comments

@adarshsanjeev
Contributor

Description

Currently, MSQ supports INSERT (for adding data) and REPLACE (for replacing rows). REPLACE can, however, also drop segments by replacing them with an empty set. This works by reingesting all of the data except the rows to be deleted, which are selected by some condition.

This is a proposal to add a new DELETE syntax that would let users state what they want to do more directly, without having to understand the concept of reindexing. Internally, a DELETE would be translated to a REPLACE query with the DELETE's condition inverted, so that all rows except the ones matched by the DELETE are reingested. This means that few changes would be needed in the MSQ part of Druid.

Example:

DELETE FROM stats
WHERE country = 'New Zealand'
PARTITIONED BY MONTH
CLUSTERED BY city

would be translated to

REPLACE INTO stats OVERWRITE ALL
SELECT * FROM stats 
WHERE NOT(country = 'New Zealand')
PARTITIONED BY MONTH
CLUSTERED BY city
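
One subtlety of inverting the condition with NOT is worth checking: under standard SQL three-valued logic, a row whose `country` is NULL matches neither `country = 'New Zealand'` nor `NOT(country = 'New Zealand')`, so it would not be reingested even though a SQL DELETE would leave it alone. The sketch below illustrates this with Python's built-in `sqlite3` module standing in for Druid. The sample rows and the `IS NOT TRUE` alternative are assumptions for illustration, not part of the proposal, and whether this matters in practice depends on how the Druid cluster treats NULLs.

```python
import sqlite3

# Plain SQLite stands in for Druid here; only the WHERE semantics are illustrated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (country TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [
        ("New Zealand", "Auckland"),
        ("Australia", "Sydney"),
        (None, "Unknown"),  # hypothetical row with a NULL country
    ],
)


def cities(where):
    """Return the sorted cities of the rows matching the given WHERE clause."""
    rows = conn.execute(f"SELECT city FROM stats WHERE {where}")
    return sorted(city for (city,) in rows)


# Rows a SQL DELETE with this condition would remove.
deleted = cities("country = 'New Zealand'")

# Rows the translated REPLACE would reingest when the condition is wrapped in NOT.
kept_not = cities("NOT (country = 'New Zealand')")

# A NULL-safe inversion: keep every row for which the condition is not TRUE.
kept_null_safe = cities("(country = 'New Zealand') IS NOT TRUE")

print(deleted)
print(kept_not)
print(kept_null_safe)
```

With these rows, the NOT form keeps only Sydney, while the `IS NOT TRUE` form also keeps the row with the NULL country, which matches what a SQL DELETE would leave behind.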

Syntax

DELETE FROM "table name"
WHERE "condition"
PARTITIONED BY "partitioning"
CLUSTERED BY "clustering"

This is similar in structure to a SQL DELETE query, with the addition of partitioning and clustering. Ideally, a DELETE should not need to specify these. However, because the table is reindexed internally, they are required. If there were a mechanism to look up the partitioning and clustering of a datasource while parsing, these clauses could be made optional, making the query simpler:

DELETE FROM "table name"
WHERE "condition"
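
The translation, including the optional clauses, can be sketched as a small string rewrite. This is only an illustration of the idea, not how Druid's planner would implement it: the function name `delete_to_replace` and the `EXAMPLE_LAYOUT` dictionary are hypothetical, with the dictionary standing in for whatever mechanism would supply a datasource's existing partitioning and clustering.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical stand-in for a metadata lookup. The proposal only says such a
# mechanism would be needed, not what it would look like.
EXAMPLE_LAYOUT = {
    "stats": {"partitioned_by": "MONTH", "clustered_by": ["city"]},
}


def delete_to_replace(
    table: str,
    condition: str,
    partitioned_by: Optional[str] = None,
    clustered_by: Optional[Sequence[str]] = None,
    layouts: Mapping[str, dict] = EXAMPLE_LAYOUT,
) -> str:
    """Build the REPLACE query text that a DELETE would be translated to."""
    layout = layouts.get(table, {})

    # Explicit clauses win; otherwise fall back to the looked-up layout.
    partitioning = partitioned_by or layout.get("partitioned_by")
    if partitioning is None:
        raise ValueError(f"no partitioning given or known for {table!r}")
    if clustered_by is None:
        clustered_by = layout.get("clustered_by", [])

    lines = [
        f"REPLACE INTO {table} OVERWRITE ALL",
        f"SELECT * FROM {table}",
        f"WHERE NOT({condition})",  # inverted DELETE condition
        f"PARTITIONED BY {partitioning}",
    ]
    if clustered_by:
        lines.append("CLUSTERED BY " + ", ".join(clustered_by))
    return "\n".join(lines)


# The short form of the DELETE: only the table and the condition are supplied.
print(delete_to_replace("stats", "country = 'New Zealand'"))
```

For the `stats` example, this prints the same REPLACE query as in the example earlier in this description. A table with no known layout and no explicit PARTITIONED BY raises an error, which mirrors why the clauses are currently required.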

PR

There is one PR, which is still a work in progress.

#13674

@imSanko

imSanko commented Apr 28, 2023

Can I work on this issue?

@LakshSingla
Contributor

@imSanko Sure! Some initial work was done as part of PR #13674; however, that was put on hold due to some related work that might change how the feature looks. LMK if you need help proceeding.


github-actions bot commented Mar 9, 2024

This issue has been marked as stale due to 280 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If this issue is still
relevant, please simply write any comment. Even if closed, you can still revive the
issue at any time or discuss it on the [email protected] list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Mar 9, 2024

github-actions bot commented Apr 6, 2024

This issue has been closed due to lack of activity. If you think that
is incorrect, or the issue requires additional review, you can revive the issue at
any time.

@github-actions github-actions bot closed this as not planned Apr 6, 2024