-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split Commons Compress into multiple artifacts #490
base: master
Are you sure you want to change the base?
Conversation
I like this idea. If we follow this path I'd push further and fully modularize commons-compress, with a separate module for each compression and archive format. That may be easier to split the component in two groups with new names, for example commons-archive-* and commons-compression-*. |
8b04ef0
to
433bd3a
Compare
433bd3a
to
6db519f
Compare
421241a
to
d207ba7
Compare
d207ba7
to
a6e6df6
Compare
<module>compress</module> | ||
<module>compress-brotli</module> | ||
<module>compress-core</module> | ||
<module>compress-parent</module> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we simply have this pom as parent which looks more obvious in the tree :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean moving compress-parent/pom.xml
here and this file to compress-bom/pom.xml
, without changing the inheritance tree (commons-compress-bom
-> commons-compress-parent
-> commons-compress-core
? It should probably work.
In Log4j we structure our projects this way, which probably has to do with the fact that we use ${revision}
as version number and the definition of the revision
property is in the root pom.xml
. @vy, are the other reasons to put the BOM at the root of the project?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean why not having a structure as follows?
/commons-bom/pom.xml
(commons-bom
module)/pom.xml
(commons-parent
module parented bycommons-bom
)/commons-core/pom.xml
(commons-core
moduleparented by
commons-parent`)
This setup has the following shortcoming:
Imagine you have a, say, maven-enforcer-plugin
configuration that applies to all pom.xml
s, including /commons-bom/pom.xml
. If you run ./mvnw enforcer:enforce
at the project root directory, it will only run on commons-parent
and modules defined in /pom.xml
. That is, it won't run for /commons-bom/pom.xml
. You can add more examples: apache-rat-plugin
won't check the license of /commons-bom/pom.xml
, etc.
[To give some context to this PR, it was motivated by this discussion on Yes, it would be nice to split Commons Compress into a small API and many modules. Right now we use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is just a PoC, it would help to convert it to a draft PR.
As written, this does not work and cannot be shipped. The fundamental problem is that it moves classes from the commons-compress artifact to the commons-compress-core artifact without changing the Java package or class names.
This breaks existing code, even of projects that do not depend directly on commons-compress, and will cause a lot of needless pain in the Java community.
Adding a BOM is a workaround for a problem that we shouldn't cause users in the first place.
From my POV, if we do anything, and as a first cut, I would follow the route we are taking in Commons VFS: Move to new modules, format-specific code that is already in their own package that depend on a currently optional library, which is: brotli, zstd, and xz. There is no breakage of BC, just new artifacts. If you don't want the new artifacts, stick to the old one. Simple. For the implementation, the root POM needs to no longer generate a jar obviously, so the current commons-compress artifact can move to a module dir and the root can be renamed commons-compress-parent. Having a BOM POM is nice and easy to add. Alternatively, the root POM can keep the same name, and the current code can move to a commons-compress-core module. I think I like the first option better. |
@elharo, I am confused with your remarks given @ppkarwasz stated that it works and it is backward compatible. I see your objections in the mailing list due to JPMS issues:
AFAIK about JPMS, Piotr is right.
I am not able to see the problem here. Could you give a concrete example where this change will break the application?
Could you elaborate on your reasoning a bit, please? It is really difficult to come up with a fix when you only say "this breaks existing code". |
I believe @ppkarwasz is simply wrong. This change is not backwards compatible, and will break existing projects. It works in the very simple case where a project only has a direct dependency on commons-compress and has no transitive dependencies on commons-compress. However, in then much more common case where the dependency on commons-compress lives deep in the transitive dependencies this does not work. It's another instance of diamond dependencies and split packages. The basic problem is that a project can have the old commons-compress deep in the transitive dependency tree. One then updates some other library to a new version that now has common-compress-core deep in its transitive dependency tree. The classpath now contains both commons-compress and commons-compress-core, even though the developer of the project might never have heard of commons-compress. The project breaks in more or less clear ways, depending on Java version and the dependency graph, because a class is found in both commons-compress and commons-compress-core. |
This is a crystal clear argument @elharo, thanks so much for the elaborate explanation. @ppkarwasz indeed needs to address this. |
@elharo, I see your point. I might modify the PR, so that most of the code remains in However I am not sure if the pro/con balance of such a solution would be positive:
So it appears that the only way to make breaking changes in Java is to break everything by repackaging all the code. It is a pity, because it means that any attempt to make smaller artifacts practically has the opposite effect for users and requires Jakarta-like code changes. |
The problem of deep dependencies can be solved if Compress provides a BOM POM and my app depends on this BOM POM (commons-compress-bom). The refactored Compress can contain an empty commons-compress artifact/jar. Or, the refactored Compress keeps core code in commons-compress and introduces a commons-compress-parent POM (this is not the BOM POM). |
I think the point that @elharo tried to make is that many developers will not notice that
I am sure that Spring Boot will rapidly replace Anyway: if you know about an Apache-wide BOM that features coherent library versions, my only question is: where do I sign up? |
Yes, a good BOM is a workaround; but it's still only a workaround. and one that operates post facto. That is, first the developer sees their project break. Then they spend an hour to a day or more debugging it. How long it takes depends on how familiar they are with Maven dependency trees and whether they immediately recognize the symptoms. If you've encountered it before, you curse and say, "oh yeah, that again" and then start inspecting with mvn dependency:tree and google searches until you figure out exactly which exclusion or BOM you need to add where. If you haven't debugged one of these problems before, though, it's going to take a lot longer to figure it out. The kindest thing we can do for developers is to change package names and Maven coordinates at the same time, or not at all. As to an Apache-wide BOM that features coherent library versions, it's not a bad idea. A cross-functional team I was part of spent a couple of years creating such a thing for the Google Cloud Platform. The amount of cat herding involved was significant. The majority of the effort was not in writing the BOM, but simply synchronizing the dozens of projects so that a consistent, compatible set of library versions existed and could be released at the same time. Some of the tooling we built for that effort could be reused here. However, that was within one company and mostly within a single product area. Apache is far more diverse and much less well funded. |
This is proof-of-concept of the possibility of splitting Commons Compress into artifacts without optional dependencies, while maintaining backward compatibility.
The Commons Compress code is split into:
commons-compress-brotli
: an artifact that includes Brotli support and with a non-optional dependency onorg.brotli:dec
,commons-compress-core
: the old Commons Compress with one entire package and the dependency onorg.brotli:dec
removed,common-compress
: an almost empty artifact, whose purpose is to provide backward compatibility. It depends on the other artifacts and contains a module descriptor that allows JPMS users that declaresrequires org.apache.commons.compress;
to also read the other modules:commons-compress-bom
: a BOM artifact to help users prevent having multiple copies of the same classes on the classpath (or module path). Without using a BOM for dependency management, users could end up havingcommons-compress-core
version 1.27.0 and an oldercommons-compress
as transitive dependency. This would lead to a duplication of classes on the classpath.commons-compress-parent
: a technical POM to manage dependency versions and plugin configurations in a single place. We might use Maven Flatten Plugin to remove this artifact from the published ones.The PoC is fully functional, although the POMs might need some small adjustments.