Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: delete / cancel query #11708

Merged
merged 10 commits into from
Sep 16, 2021
25 changes: 24 additions & 1 deletion docs/querying/sql.md
Original file line number Diff line number Diff line change
Expand Up @@ -849,7 +849,6 @@ include:

- [Inline datasources](datasource.md#inline).
- [Spatial filters](../development/geo.md).
- [Query cancellation](querying.md#query-cancellation).
- [Multi-value dimensions](#multi-value-strings) are only partially implemented in Druid SQL. There are known
inconsistencies between their behavior in SQL queries and in native queries due to how they are currently treated by
the SQL planner.
Expand Down Expand Up @@ -956,6 +955,30 @@ trailer they all include: one blank line at the end of the result set. If you de
through a JSON parsing error or through a missing trailing newline, you should assume the response was not fully
delivered due to an error.

### HTTP DELETE
You can use the HTTP `DELETE` method to cancel a SQL query. When you cancel a query canceled, Druid marks its `SqlLifecycle` canceled immediately. After that, any operations in `SqlLifecycle` become non-operational to avoid unnecessary execution of expensive operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can use the HTTP `DELETE` method to cancel a SQL query. When you cancel a query canceled, Druid marks its `SqlLifecycle` canceled immediately. After that, any operations in `SqlLifecycle` become non-operational to avoid unnecessary execution of expensive operations.
You can use the HTTP `DELETE` method to cancel a SQL query. When you cancel a query, Druid immediately marks its `SqlLifecycle` as canceled. After that, any operations in `SqlLifecycle` become non-operational to avoid unnecessary execution of expensive operations.


HTTP DELETE method uses the following syntax:
```
DELETE https://BROKER:8082/druid/v2/sql/{sqlQueryId}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we should promote using router instead of broker since the router is now sort of mandatory to use the web console. Or, should we mention that you can use either router or broker address here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POST examples used BROKER, but I can. change to ROUTER and mention both are options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this is clearly laid out in a code block, but it's a different format than is used to describe HTTP POST, which says: "You can make Druid SQL queries using HTTP via POST to the endpoint /druid/v2/sql/."

Perhaps have a DELETE endpoint description similar to HTTP POST, or edit HTTP POST to have a code block like done here?

```

The DELETE method requires the `sqlQueryId`, therefore to predict the query id you must set it in the query context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Queries are allowed to have the same sqlQueryId even when they are different. It would be nice to mention that all queries will be cancelled if you have multiple sql queries running with the same sqlQueryId.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about that actually. Thank you!


For example if you issue the following query:
```bash
curl --request POST 'https://BROKER:8082/druid/v2/sql' \
--header 'Content-Type: application/json' \
--data-raw '{"query" : "SELECT sleep(CASE WHEN sum_added > 0 THEN 1 ELSE 0 END) FROM wikiticker WHERE sum_added > 0 LIMIT 15",
"context" : {"sqlQueryId" : "myQuery01"}}'
```
You can cancel the query using the query id as follows:
```bash
curl --request DELETE 'https://BROKER:8082/druid/v2/sql/myQuery01' \
```

Druid returns an HTTP 202 response for successful deletion requests. If the the query id is incorrect or if the query completes before your cancellation request, Druid returns an HTTP 404 response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cancellation request requires the read permission on all resources (currently datasource and view) in the sql query. It returns 404 as well when authorization fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jihoonson Why is status code 404 returned for autorization failure? In compliance with the HTTP status code definition, code 403 is the appropriate one. Actually, the native query endpoint returns 403 for such case

if (!authResult.isAllowed()) {
throw new ForbiddenException(authResult.toString());
}

Copy link
Contributor

@jihoonson jihoonson Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @FrankChen021, per the RFC for HTTP, you can return either 403 or 404 (https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.3, https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.4). From security perspective, 404 is preferred to 403 since 403 leaks the existence of resource.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in such case, this endpoint is already public to everyone, I think there's no concern of security problem so that we need to hide its existence. 404 might be a confusion for users when authorization fails. And it's better that we keep it consistent with the implementation of native query.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL authorization checks the permission on the resources used in the SQL query, such as datasources or views. Returning 404 is to hide the existence of those datasources or views, not the endpoint. For user confusion, what is the use case where the user should know that he doesn't have permission to cancel the query? The query is usually cancelled by the user who submitted the query or the cluster admin. Either case, they will have the right permission already. For the difference from native query behavior, I'm pretty sure that most clients should handle native queries and sql queries separately anyway since their result format is different.

BUT, I think authorization failure handling should be consistent with the query posting endpoint which returns 403. I will create a PR to fix this. Thank you for bringing this up!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised #11710.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jihoonson


### JDBC

You can make Druid SQL queries using the [Avatica JDBC driver](https://calcite.apache.org/avatica/downloads/). We recommend using Avatica JDBC driver version 1.17.0 or later. Note that as of the time of this writing, Avatica 1.17.0, the latest version, does not support passing connection string parameters from the URL to Druid, so you must pass them using a `Properties` object. Once you've downloaded the Avatica client jar, add it to your classpath and use the connect string `jdbc:avatica:remote:url=http://BROKER:8082/druid/v2/sql/avatica/`.
Expand Down