Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removes support for Hadoop 2 #14763

Merged

Conversation

tejaswini-imply
Copy link
Member

Release note
Hadoop 2 support is formally deprecated in Druid 27.0.0. Hadoop 2 support is now terminated.

@@ -112,7 +112,6 @@ jobs:
strategy:
fail-fast: false
matrix:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line can be removed as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we move these files or just simply deleted them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted original tutorial/hadoop/docker/* files and moved tutorial/hadoop3/docker/* files to tutorial/hadoop/docker/ folder

@@ -1534,7 +1534,7 @@ Additional peon configs include:
|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/task`|
|`druid.indexer.task.batchProcessingMode`| Batch ingestion tasks have three operating modes to control construction and tracking for intermediary segments: `OPEN_SEGMENTS`, `CLOSED_SEGMENTS`, and `CLOSED_SEGMENT_SINKS`. `OPEN_SEGMENTS` uses the streaming ingestion code path and performs a `mmap` on intermediary segments to build a timeline to make these segments available to realtime queries. Batch ingestion doesn't require intermediary segments, so the default mode, `CLOSED_SEGMENTS`, eliminates `mmap` of intermediary segments. `CLOSED_SEGMENTS` mode still tracks the entire set of segments in heap. The `CLOSED_SEGMENTS_SINKS` mode is the most aggressive configuration and should have the smallest memory footprint. It eliminates in-memory tracking and `mmap` of intermediary segments produced during segment creation. `CLOSED_SEGMENTS_SINKS` mode isn't as well tested as other modes so is currently considered experimental. You can use `OPEN_SEGMENTS` mode if problems occur with the 2 newer modes. |`CLOSED_SEGMENTS`|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.8.5|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|
Copy link
Contributor

@ektravel ektravel Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|

@@ -1605,7 +1605,7 @@ then the value from the configuration below is used:
|`druid.worker.numConcurrentMerges`|Maximum number of segment persist or merge operations that can run concurrently across all tasks.|`druid.worker.capacity` / 2, rounded down|
|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/tasks`|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.8.5|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|
Copy link
Contributor

@ektravel ektravel Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|
|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|

@@ -89,7 +89,7 @@ classloader.
2. Batch ingestion uses jars from `hadoop-dependencies/` to submit Map/Reduce jobs (location customizable via the
`druid.extensions.hadoopDependenciesDir` runtime property; see [Configuration](../configuration/index.md#extensions)).

`hadoop-client:2.8.5` is the default version of the Hadoop client bundled with Druid for both purposes. This works with
`hadoop-client-api:3.3.6, hadoop-client-runtime:3.3.6` is the default version of the Hadoop client bundled with Druid for both purposes. This works with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`hadoop-client-api:3.3.6, hadoop-client-runtime:3.3.6` is the default version of the Hadoop client bundled with Druid for both purposes. This works with
The default version of the Hadoop client bundled with Druid is 3.3.6. This works with

Copy link
Contributor

@ektravel ektravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few suggestions for the docs.

@tejaswini-imply
Copy link
Member Author

Thanks @abhishekagarwal87, @ektravel for the review. I have addressed your comments in the latest commit.

Copy link
Contributor

@ektravel ektravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look good.

@abhishekagarwal87 abhishekagarwal87 merged commit a45b25f into apache:master Aug 9, 2023
@LakshSingla LakshSingla added this to the 28.0 milestone Oct 12, 2023
xvrl added a commit to xvrl/druid that referenced this pull request Dec 5, 2023
Several dependabot ignore directives are no longer relevant. Unpin them
to ensure we get again get timely updates via dependabot.

* support for Hadoop 2 was dropped as part of apache#14763
* Guava was upgraded to 31 as part of apache#14767
* Calcite was upgraded to 1.35 as part of apache#14510
xvrl added a commit to xvrl/druid that referenced this pull request Dec 5, 2023
Several dependabot ignore directives are no longer relevant. Unpin them
to ensure we get again get timely updates via dependabot.

* support for Hadoop 2 was dropped as part of apache#14763
* Guava was upgraded to 31 as part of apache#14767
* Calcite was upgraded to 1.35 as part of apache#14510
xvrl added a commit that referenced this pull request Dec 6, 2023
Several dependabot ignore directives are no longer relevant. Unpin them
to ensure we get again get timely updates via dependabot.

* support for Hadoop 2 was dropped as part of #14763
* Guava was upgraded to 31 as part of #14767
* Calcite was upgraded to 1.35 as part of #14510
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants