wip
317brian committed Jan 9, 2025
1 parent 9906544 commit 91ec717
Showing 16 changed files with 73 additions and 39 deletions.
3 changes: 2 additions & 1 deletion docs/api-reference/sql-ingestion-api.md
@@ -379,7 +379,8 @@ print(response.text)

The response shows an example report for a query.

-<details><summary>View the response</summary>
+<details>
+<summary>View the response</summary>

```json
{
2 changes: 1 addition & 1 deletion docs/comparisons/druid-vs-spark.md
@@ -39,4 +39,4 @@ One typical setup seen in production is to process data in Spark, and load the p

For more information about using Druid and Spark together, including benchmarks of the two systems, please see:

-<https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani>
+https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani
2 changes: 1 addition & 1 deletion docs/configuration/extensions.md
@@ -100,7 +100,7 @@ All of these community extensions can be downloaded using [pull-deps](../operati
|druid-momentsketch|Support for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library|[link](../development/extensions-contrib/momentsketch-quantiles.md)|
|druid-tdigestsketch|Support for approximate sketch aggregators based on [T-Digest](https://github.com/tdunning/t-digest)|[link](../development/extensions-contrib/tdigestsketch-quantiles.md)|
|gce-extensions|GCE Extensions|[link](../development/extensions-contrib/gce-extensions.md)|
-|prometheus-emitter|Exposes [Druid metrics](../operations/metrics.md) for Prometheus server collection (<https://prometheus.io/>)|[link](../development/extensions-contrib/prometheus.md)|
+|prometheus-emitter|Exposes [Druid metrics](../operations/metrics.md) for Prometheus server collection (https://prometheus.io/)|[link](../development/extensions-contrib/prometheus.md)|
|druid-kubernetes-overlord-extensions|Support for launching tasks in k8s without Middle Managers|[link](../development/extensions-contrib/k8s-jobs.md)|
|druid-spectator-histogram|Support for efficient approximate percentile queries|[link](../development/extensions-contrib/spectator-histogram.md)|
|druid-rabbit-indexing-service|Support for creating and managing [RabbitMQ](https://www.rabbitmq.com/) indexing tasks|[link](../development/extensions-contrib/rabbit-stream-ingestion.md)|
5 changes: 3 additions & 2 deletions docs/configuration/index.md
@@ -403,7 +403,7 @@ Metric monitoring is an essential part of Druid operations. The following monito
|`org.apache.druid.server.metrics.SegmentStatsMonitor` | **EXPERIMENTAL** Reports statistics about segments on Historical services. Available only on Historical services. Not to be used when lazy loading is configured.|
|`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted.|
|`org.apache.druid.server.metrics.SubqueryCountStatsMonitor`|Reports how many subqueries have been materialized as rows or bytes and various other statistics related to the subquery execution|
-|`org.apache.druid.server.emitter.HttpEmittingMonitor`|Reports internal metrics of `http` or `parametrized` emitter (see below). Must not be used with another emitter type. See the description of the metrics here: <https://github.com/apache/druid/pull/4973>.|
+|`org.apache.druid.server.emitter.HttpEmittingMonitor`|Reports internal metrics of `http` or `parametrized` emitter (see below). Must not be used with another emitter type. See the description of the metrics here: https://github.com/apache/druid/pull/4973.|
|`org.apache.druid.server.metrics.TaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting and also the number of successful/failed tasks per emission period.|
|`org.apache.druid.server.metrics.TaskSlotCountStatsMonitor`|Reports metrics about task slot usage per emission period.|
|`org.apache.druid.server.metrics.WorkerTaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting, the number of successful/failed tasks, and metrics about task slot usage for the reporting worker, per emission period. Only supported by Middle Manager node types.|
@@ -1195,7 +1195,8 @@ The following table shows the dynamic configuration properties for the Overlord.

The following is an example of an Overlord dynamic config:

-<details><summary>Click to view the example</summary>
+<details>
+<summary>Click to view the example</summary>

```json
{
2 changes: 1 addition & 1 deletion docs/development/docs-contribute.md
@@ -34,7 +34,7 @@ Druid docs contributors:
Druid docs contributors can open an issue about documentation, or contribute a change with a pull request (PR).

The open source Druid docs are located here:
-<https://druid.apache.org/docs/latest/design/index.html>
+https://druid.apache.org/docs/latest/design/index.html

If you need to update a Druid doc, locate and update the doc in the Druid repo following the instructions below.

3 changes: 2 additions & 1 deletion docs/ingestion/concurrent-append-replace.md
@@ -83,7 +83,8 @@ druid.indexer.task.default.context={"useConcurrentLocks":true}

We recommend that you use the `useConcurrentLocks` context parameter so that Druid automatically determines the task lock types for you. If you need to set the task lock types explicitly, you can read more about them in this section.

-<details><summary>Click here to read more about the lock types.</summary>
+<details>
+<summary>Click here to read more about the lock types.</summary>

Druid uses task locks to make sure that multiple conflicting operations don't happen at once.
There are two task lock types: `APPEND` and `REPLACE`. The type of lock you use is determined by what you're trying to accomplish.
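For illustration, this is roughly what setting a lock type explicitly can look like. The following is a minimal, hypothetical fragment of an append task's payload, showing only the `context` object with the `taskLockType` parameter; the rest of the task spec is omitted:

```json
{
  "context": {
    "taskLockType": "APPEND"
  }
}
```

A replace task would set `"taskLockType": "REPLACE"` instead.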
3 changes: 2 additions & 1 deletion docs/ingestion/kinesis-ingestion.md
@@ -43,7 +43,8 @@ This section outlines the configuration properties that are specific to the Amaz

The following example shows a supervisor spec for a stream with the name `KinesisStream`:

-<details><summary>Click to view the example</summary>
+<details>
+<summary>Click to view the example</summary>

```json
{
24 changes: 16 additions & 8 deletions docs/multi-stage-query/examples.md
@@ -39,7 +39,8 @@ When you insert or replace data with SQL-based ingestion, set the context parame

This example inserts data into a table named `w000` without performing any data rollup:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
INSERT INTO w000
@@ -85,7 +86,8 @@ CLUSTERED BY channel

This example inserts data into a table named `kttm_rollup` and performs data rollup. This example implements the recommendations described in [Rollup](./concepts.md#rollup).

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
INSERT INTO "kttm_rollup"
@@ -126,7 +128,8 @@ CLUSTERED BY browser, session

This example aggregates data from a table named `w000` and inserts the result into `w002`.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
INSERT INTO w002
@@ -153,7 +156,8 @@ CLUSTERED BY page

This example inserts data into a table named `w003` and joins data from two sources:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
INSERT INTO w003
@@ -209,7 +213,8 @@ PARTITIONED BY HOUR

This example replaces the entire datasource used in the table `w007` with the new query data while dropping the old data:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
REPLACE INTO w007
@@ -256,7 +261,8 @@ CLUSTERED BY channel

This example replaces certain segments in a datasource with the new query data while dropping old segments:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
REPLACE INTO w007
@@ -279,7 +285,8 @@ CLUSTERED BY page

## REPLACE for reindexing an existing datasource into itself

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
REPLACE INTO w000
@@ -305,7 +312,8 @@ CLUSTERED BY page

## SELECT with EXTERN and JOIN

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
WITH flights AS (
2 changes: 1 addition & 1 deletion docs/querying/nested-columns.md
@@ -330,7 +330,7 @@ FROM (
PARTITIONED BY ALL
```

-## Ingest a JSON string as COMPLEX<json\>
+## Ingest a JSON string as COMPLEX\<json\>

If your source data contains serialized JSON strings, you can ingest the data as `COMPLEX<JSON>` as follows:
- During native batch ingestion, call the `parse_json` function in a `transform` object in the `transformSpec`, as sketched below.
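As a sketch of that option, a `transformSpec` that parses a serialized JSON string into a `COMPLEX<JSON>` column might look like the following, assuming a hypothetical input column `raw_json` and output column `nested_data`:

```json
{
  "transformSpec": {
    "transforms": [
      {
        "type": "expression",
        "name": "nested_data",
        "expression": "parse_json(\"raw_json\")"
      }
    ]
  }
}
```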
18 changes: 12 additions & 6 deletions docs/querying/sql-translation.md
@@ -78,7 +78,8 @@ EXPLAIN PLAN statements return:

Example 1: EXPLAIN PLAN for a `SELECT` query on the `wikipedia` datasource:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
EXPLAIN PLAN FOR
@@ -93,7 +94,8 @@ GROUP BY channel

The above EXPLAIN PLAN query returns the following result:

-<details><summary>Show the result</summary>
+<details>
+<summary>Show the result</summary>

```json
[
@@ -235,7 +237,8 @@

Example 2: EXPLAIN PLAN for an `INSERT` query that inserts data into the `wikipedia` datasource:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
EXPLAIN PLAN FOR
@@ -263,7 +266,8 @@ PARTITIONED BY ALL

The above EXPLAIN PLAN returns the following result:

-<details><summary>Show the result</summary>
+<details>
+<summary>Show the result</summary>

```json
[
@@ -452,7 +456,8 @@ The above EXPLAIN PLAN returns the following result:
Example 3: EXPLAIN PLAN for a `REPLACE` query that replaces all the data in the `wikipedia` datasource with a `DAY`
time partitioning, and `cityName` and `countryName` as the clustering columns:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
EXPLAIN PLAN FOR
@@ -482,7 +487,8 @@ CLUSTERED BY cityName, countryName

The above EXPLAIN PLAN query returns the following result:

-<details><summary>Show the result</summary>
+<details>
+<summary>Show the result</summary>

```json
[
3 changes: 2 additions & 1 deletion docs/release-info/upgrade-notes.md
@@ -314,7 +314,8 @@ This property affects both storage and querying, and must be set on all Druid se

The following table illustrates some example scenarios and the impact of the changes.

-<details><summary>Show the table</summary>
+<details>
+<summary>Show the table</summary>

| Query| Druid 27.0.0 and earlier| Druid 28.0.0 and later|
|------|------------------------|----------------------|
3 changes: 2 additions & 1 deletion docs/tutorials/index.md
@@ -145,7 +145,8 @@ Follow these steps to load the sample Wikipedia dataset:
5. Click **Done**. You're returned to the **Query** view that displays the newly generated query.
The query inserts the sample data into the table named `wikiticker-2015-09-12-sampled`.
-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>
```sql
REPLACE INTO "wikiticker-2015-09-12-sampled" OVERWRITE ALL
6 changes: 4 additions & 2 deletions docs/tutorials/tutorial-msq-convert-spec.md
@@ -41,7 +41,8 @@ To convert the ingestion spec to a query task, do the following:
![Convert ingestion spec to SQL](../assets/multi-stage-query/tutorial-msq-convert.png "Convert ingestion spec to SQL")
3. In the **Ingestion spec to convert** window, insert your ingestion spec. You can use your own spec or the sample ingestion spec provided in the tutorial. The sample spec uses data hosted at `https://druid.apache.org/data/wikipedia.json.gz` and loads it into a table named `wikipedia`:

-<details><summary>Show the spec</summary>
+<details>
+<summary>Show the spec</summary>

```json
{
@@ -127,7 +128,8 @@

4. Click **Submit** to submit the spec. The web console uses the JSON-based ingestion spec to generate a SQL query that you can use instead. This is what the query looks like for the sample ingestion spec:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
-- This SQL query was auto generated from an ingestion spec
6 changes: 4 additions & 2 deletions docs/tutorials/tutorial-msq-extern.md
@@ -46,7 +46,8 @@ To generate a query from external data, do the following:
- Customize how Druid handles the data by selecting the **Input format** and its related options, such as adding **JSON parser features** for JSON files.
5. When you're ready, click **Done**. You're returned to the **Query** view where you can see the starter query that will insert the data from the external source into a table named `wikipedia`.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
REPLACE INTO "wikipedia" OVERWRITE ALL
@@ -122,7 +123,8 @@ ORDER BY COUNT(*) DESC

With the EXTERN function, you could run the same query on the external data directly without ingesting it first:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
SELECT
12 changes: 8 additions & 4 deletions docs/tutorials/tutorial-query-deep-storage.md
@@ -39,7 +39,8 @@ Use the **Load data** wizard or the following SQL query to ingest the `wikipedia

Partitioning by hour provides more segment granularity, so you can selectively load segments onto Historicals or keep them in deep storage.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```sql
REPLACE INTO "wikipedia" OVERWRITE ALL
@@ -152,7 +153,8 @@ This query looks for records with timestamps that precede `00:10:00`. Based on t

When you submit the query from deep storage through the API, you get the following response:

-<details><summary>Show the response</summary>
+<details>
+<summary>Show the response</summary>

```json
{
@@ -209,7 +211,8 @@ A successful query also returns a `pages` object that includes the page numbers

Note that `sampleRecords` has been truncated for brevity.

-<details><summary>Show the response</summary>
+<details>
+<summary>Show the response</summary>

```json
{
@@ -265,7 +268,8 @@ curl --location 'http://ROUTER:PORT/druid/v2/sql/statements/:queryId'

Note that the response has been truncated for brevity.

-<details><summary>Show the response</summary>
+<details>
+<summary>Show the response</summary>

```json
[
18 changes: 12 additions & 6 deletions docs/tutorials/tutorial-unnest-arrays.md
@@ -271,7 +271,8 @@ You can use a single unnest datasource to unnest multiple columns. Be careful wh

The following native Scan query returns the rows of the datasource and unnests the values in the `dim3` column by using the `unnest` datasource type:

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
@@ -334,7 +335,8 @@ You can implement filters. For example, you can add the following to the Scan qu

The following query returns an unnested version of the column `dim3` as the column `unnest-dim3` sorted in descending order.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
@@ -375,7 +377,8 @@ The following query returns an unnested version of the column `dim3` as the colu

The example topN query unnests `dim3` into the column `unnest-dim3`. The query uses the unnested column as the dimension for the topN query. The results are output to a column named `topN-unnest-d3` and are sorted numerically in ascending order based on the column `a0`, an aggregate value representing the minimum of `m1`.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
@@ -434,7 +437,8 @@ The example topN query unnests `dim3` into the column `unnest-dim3`. The query u

This query joins the `nested_data` table with itself and outputs the unnested data into a new column called `unnest-dim3`.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
@@ -539,7 +543,8 @@ The `unnest` datasource supports unnesting virtual columns, which is a queryable

The following query returns the columns `dim45` and `m1`. The `dim45` column is the unnested version of a virtual column that contains an array of the `dim4` and `dim5` columns.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
@@ -585,7 +590,8 @@ The following query returns the columns `dim45` and `m1`. The `dim45` column is

The following Scan query unnests the column `dim3` into `d3` and a virtual column composed of `dim4` and `dim5` into the column `d45`. It then returns those source columns and their unnested variants.

-<details><summary>Show the query</summary>
+<details>
+<summary>Show the query</summary>

```json
{
