Upgrade calcite in druid #13532
Comments
Do you know if Hadoop 2 + Guava 19 has been tested? Guava 19 is Calcite's minimum version these days, so we could update to that as a compromise version if it works for Hadoop 2. I raised a PR here so it can at least run through CI: #13544. That won't exercise all the Hadoop distributions we want to support, though, so we'll need additional testing even if it passes.
A note about shading: it looks like Apache Beam does this too. Here's their build file for Calcite: https://github.com/apache/beam/blob/52753a9c854786ad0732af4b8577d1cdcfc66047/vendor/calcite-1_28_0/build.gradle. It relocates Guava. They do their own releases as well. Here's the latest one: https://mvnrepository.com/artifact/org.apache.beam/beam-vendor-calcite-1_28_0/0.2
Another question for anyone who has tried this: what went wrong when using the latest Calcite with Guava 16.0.1? From the thread at https://lists.apache.org/thread/oy84c1607hyjhkbop8svtrzlzgj5632q, it seems like this may actually work OK, even though Calcite currently claims a minimum version of 19.
Relevant action in this sub-thread: https://lists.apache.org/thread/oy84c1607hyjhkbop8svtrzlzgj5632q. Calcite targets a minimum Guava version of 16.0.1 as of https://issues.apache.org/jira/browse/CALCITE-5428. I'm working on integrating this into Druid. So far, I ran into these three things along the way: https://issues.apache.org/jira/browse/CALCITE-5477 Other than those things, there are also a lot of changes in Calcite that require adjustments on the Druid side.
Thanks @abhishekagarwal87. I did indeed start from your PR as a base.
Motivation
We are currently stuck on an older version of Calcite because of the old Guava dependencies that come from Hadoop 2. Until we get rid of Hadoop 2, we cannot upgrade Calcite.
The upgrade will be very helpful because
Proposed Solutions
The major blocker for the upgrade is the old Guava dependencies.
Removing Hadoop 2 entirely
We can remove Hadoop 2 entirely and thus rid ourselves of its transitive dependencies. We have a Hadoop 3 distribution profile and will be shipping a Hadoop 3 compatible Druid distribution bundle with 25.0. There is some discussion about it here: https://lists.apache.org/thread/1j5w9dmt1gp8hx31tvrmyomcko4mlp03
However, there are concerns in the community about Hadoop 3 not being a viable alternative. We now have classic batch ingestion as well as SQL-based batch ingestion. We have also added MM-less ingestion on Kubernetes, which lets users run Druid ingestion jobs on common/shared infra. However, MM-less ingestion is still experimental.
Calcite shading
The other option is to shade the Calcite jars. The Calcite dev team is not going to do this, so we would shade the jars ourselves. There is a prototype here and it works. I ran into a problem with tests; however, it was fixed in Calcite 1.30.0.
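For readers unfamiliar with shading, here is a minimal sketch of how Guava could be relocated inside a shaded Calcite artifact using the Maven Shade Plugin. This is an illustration only, not the configuration used by the prototype; the relocated package prefix `org.apache.druid.shaded` is a hypothetical name, and a real build would probably need to include further Calcite dependencies as well.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Bundle Calcite together with the Guava version it needs -->
            <include>org.apache.calcite:*</include>
            <include>com.google.guava:guava</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Rewrite Guava's package so it cannot clash with Hadoop 2's older Guava -->
            <pattern>com.google.common</pattern>
            <!-- Hypothetical prefix, chosen for illustration -->
            <shadedPattern>org.apache.druid.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the shaded Calcite classes reference the bundled Guava under the new package name, while Hadoop 2 keeps resolving `com.google.common` to its own older version.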
Next steps
To proceed with the upgrade, we can go with the shading approach, since there is a prototype that already works. We need to figure out where we want to host these shaded jars. Once that is done, we also need to test the SQL layer thoroughly to avoid regressions.