Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-17750: Add docs for dependency management changes #170

Open
wants to merge 13 commits into
base: trunk
Choose a base branch
from

Conversation

aratno
Copy link
Contributor

@aratno aratno commented Sep 1, 2022

No description provided.

@@ -12,13 +12,26 @@ As Cassandra is an Apache product, all included libraries must follow
Apache's https://www.apache.org/legal/resolved.html[software license
requirements].

=== Changes to dependency management in Cassandra 4.1

Before Cassandra 4.1, dependencies were managed in `build.xml` and managed with
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need ,


For more information on this change, see
https://issues.apache.org/jira/browse/CASSANDRA-17750[CASSANDRA-17750].

Copy link
Member

@michaelsembwever michaelsembwever Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm going to nit-pick a bit here… forgive me…

but i'm curious whether folk think this page would be more readable if it contains two separate unadulterated sections:

  • Dependency Management 4.2+
  • Dependency Management before 4.2

For example, the Changes to dependency management in Cassandra 4.2 section puts unnecessary cognitive load onto the reader if they just don't care about how it worked before 4.2
The reader just wants to know what to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm up for that - how's my last edit? This separates those two sections a bit more clearly. We can always edit based on feedback from more users as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the latest changes, I would probably just put before 4.2 first and then 4.2. POM file types shouldn't be only at the end I think or we need some kind of anchor to them maybe, where we refer to changes to be done to them or what changed in different versions.
Otherwise LGTM, this is more around how to organize to make the flow easier to follow.
Thanks for looking into that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the main audience for this doc (in the short term) is developers who are familiar with how to add dependencies in the pre-4.2 world, and are trying to figure out how to do things post-17750. In that scenario, I think it makes sense to increase visibility around the changes by having that section come first. Over time, this will change as 4.2 becomes more familiar to more folks.

Copy link
Member

@michaelsembwever michaelsembwever Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the main audience for this doc (in the short term) is developers who are familiar with how to add dependencies in the pre-4.2 world, and are trying to figure out how to do things post-17750.

I would disagree with this statement :) because that method only existed for a year, ref CASSANDRA-16391. There was a much longer precedence before (checking jar and license files into lib/). I do suspect most developers are next to starting from scratch (and we don't add deps that often).

If you want the extra explanation there, I don't object if it appears later. But I'm in the "don't explain, don't complain" camp, where a tl;dr what do i do comes first so I can be quick. (My example was overly concise, and only meant as an example.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am with Mick here. Also, I don't like the tribe knowledge which I am honestly even sure that it is not the case here. Not many people deal with the dependencies or do it regularly. Also, yes, good point, things changed last year. I prefer to be more noisy and things to be more documented than less.

Comment on lines 17 to 40
In Cassandra 4.2 and onwards, dependencies are managed in Maven POM templates
in `.build/*-template.xml`. These templates are processed into valid Maven POMs
with the `ant write-poms` task and copied to `build/*.pom`.

The following POMs from `build.xml` have been migrated to templates:
* `parent-pom` is now at `.build/parent-pom-template.xml`
* `all-pom` is now at `.build/cassandra-deps-template.xml`
* `build-deps-pom` is now at `.build/cassandra-build-deps-template.xml`

To add new dependencies in Cassandra 4.2 onwards, add the dependency to
`parent-pom-template` and `cassandra-deps-template`, and optionally
`cassandra-build-deps-template` if the dependency is required for build only.
See
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management[the
Maven docs] on how to reference dependencies in the parent POM from the child
POMs.

For dependency versions that need to be available in `build.xml` and
`parent-pom-template`, specify the version as a property in `build.xml`, add it
to the `ant write-poms` target, then add the property to `parent-pom-template`
with the value of the template substitution.

For more information on this change, see
https://issues.apache.org/jira/browse/CASSANDRA-17750[CASSANDRA-17750].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In Cassandra 4.2 and onwards, dependencies are managed in Maven POM templates
in `.build/*-template.xml`. These templates are processed into valid Maven POMs
with the `ant write-poms` task and copied to `build/*.pom`.
The following POMs from `build.xml` have been migrated to templates:
* `parent-pom` is now at `.build/parent-pom-template.xml`
* `all-pom` is now at `.build/cassandra-deps-template.xml`
* `build-deps-pom` is now at `.build/cassandra-build-deps-template.xml`
To add new dependencies in Cassandra 4.2 onwards, add the dependency to
`parent-pom-template` and `cassandra-deps-template`, and optionally
`cassandra-build-deps-template` if the dependency is required for build only.
See
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management[the
Maven docs] on how to reference dependencies in the parent POM from the child
POMs.
For dependency versions that need to be available in `build.xml` and
`parent-pom-template`, specify the version as a property in `build.xml`, add it
to the `ant write-poms` target, then add the property to `parent-pom-template`
with the value of the template substitution.
For more information on this change, see
https://issues.apache.org/jira/browse/CASSANDRA-17750[CASSANDRA-17750].
* Update dependency in `.build/parent-pom-template.xml` with version
** Update `.build/cassandra-deps-template.xml` for runtime and test dependencies
** Update `.build/cassandra-build-deps-template.xml` for build dependencies

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important to spell this out in a little more detail, at least while the change is new. I'd support reducing it in the future, when people are more familiar with the process. But right now the change is so fresh that a newcomer might not get a good answer even after asking in Slack, and adding a little more detail here could make their lives a little easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Abe here. Even people long time on the project who don't deal with those things regularly will benefit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the process needs to be spelled out because the people reading this will not be the developer that has experience with it. It needs to be clear in a fairly step by step process.

Linking to external documentation is advantageous and while I think we can expect that developers know how to specify a Maven dependency, the link to the Maven documentation should be included.

Comment on lines 29 to 32
See
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management[the
Maven docs] on how to reference dependencies in the parent POM from the child
POMs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines 5-9 need fixing. this sentence would be work after that.

Copy link
Contributor

@ekaterinadimitrova2 ekaterinadimitrova2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, my main point was around organizing the sections.
One thing on my mind is - should we add a note that new dependencies require a consensus on the mailing list here? Someone can miss it in the Code Style so I think it will be a good reminder here.


For more information on this change, see
https://issues.apache.org/jira/browse/CASSANDRA-17750[CASSANDRA-17750].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the latest changes, I would probably just put before 4.2 first and then 4.2. POM file types shouldn't be only at the end I think or we need some kind of anchor to them maybe, where we refer to changes to be done to them or what changed in different versions.
Otherwise LGTM, this is more around how to organize to make the flow easier to follow.
Thanks for looking into that!

@aratno
Copy link
Contributor Author

aratno commented Sep 6, 2022

One thing on my mind is - should we add a note that new dependencies require a consensus on the mailing list here?

Let's make those edits after the discussion here settles? https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh

I don't see anything in the code style page here about community consensus when adding dependencies: https://cassandra.apache.org/_/development/code_style.html

@aratno
Copy link
Contributor Author

aratno commented Sep 6, 2022

Just updated the intro based on feedback from @ekaterinadimitrova2 and @michaelsembwever. Could you approve if it feels ready?

@michaelsembwever
Copy link
Member

michaelsembwever commented Sep 6, 2022

One thing on my mind is - should we add a note that new dependencies require a consensus on the mailing list here?

Yes.

I don't see anything in the code style page here about community consensus when adding dependencies: https://cassandra.apache.org/_/development/code_style.html

See
https://lists.apache.org/thread/33dt0c3kgskrzqtp4h8y411tqv2d6qvh and https://lists.apache.org/thread/fwylw50s4nh393y79yz4z8k4gyj7ybvx

Hasn't moved from the gdoc yet: https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo/edit

Comment on lines 17 to 40
In Cassandra 4.2 and onwards, dependencies are managed in Maven POM templates
in `.build/*-template.xml`. These templates are processed into valid Maven POMs
with the `ant write-poms` task and copied to `build/*.pom`.

The following POMs from `build.xml` have been migrated to templates:
* `parent-pom` is now at `.build/parent-pom-template.xml`
* `all-pom` is now at `.build/cassandra-deps-template.xml`
* `build-deps-pom` is now at `.build/cassandra-build-deps-template.xml`

To add new dependencies in Cassandra 4.2 onwards, add the dependency to
`parent-pom-template` and `cassandra-deps-template`, and optionally
`cassandra-build-deps-template` if the dependency is required for build only.
See
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management[the
Maven docs] on how to reference dependencies in the parent POM from the child
POMs.

For dependency versions that need to be available in `build.xml` and
`parent-pom-template`, specify the version as a property in `build.xml`, add it
to the `ant write-poms` target, then add the property to `parent-pom-template`
with the value of the template substitution.

For more information on this change, see
https://issues.apache.org/jira/browse/CASSANDRA-17750[CASSANDRA-17750].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the process needs to be spelled out because the people reading this will not be the developer that has experience with it. It needs to be clear in a fairly step by step process.

Linking to external documentation is advantageous and while I think we can expect that developers know how to specify a Maven dependency, the link to the Maven documentation should be included.

Comment on lines 18 to 21
The following POMs from `build.xml` have been migrated to templates:
* `parent-pom` is now at `.build/parent-pom-template.xml`
* `all-pom` is now at `.build/cassandra-deps-template.xml`
* `build-deps-pom` is now at `.build/cassandra-build-deps-template.xml`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block does not seem to render correctly in the github viewer. That may be an artifact of the viewer or there may be a need for a blank line to separate the bulleted list from the proceeding paragraph.

However, I think the sentences should be changed to read:
X is now Y rather than X is now at Y as the later sounds like a file with the name Y will be found in the directory.

Comment on lines 15 to 16
in `.build/*-template.xml`. These templates are processed into valid Maven POMs
with the `ant write-poms` task and copied to `build/*.pom`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block does not seem to render correctly in the github viewer. That may be an artifact of the viewer. However the * symbols are swallowed in the View File -> Preview mode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POM FIle Types section does not make sense where it is now. As it is at the same level as the 4.2 and pre-4.2 sections it would appear to apply to both, however, it mentions files that are only in the pre-4.2 code. I think that a section for the POMs found in each section would be cleaner and easier to understand.

Copy link
Member

@michaelsembwever michaelsembwever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments.

and… of curiosity… given the versioning differences in the doc, wouldn't this be better moved in-tree ? (easier to read, easier to maintain)

`maven-ant-tasks-retrieve-build` target. This should contain libraries that are
required for build tools (grammar, docs, instrumentation), but are not
shipped as part of the Cassandra distribution.
- Since Cassandra 4.0, `coverage-deps-pom` has been removed and the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an item about lib/ and resolver-dist-lib ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share more about what you mean? I'll admit I haven't used resolver-dist-lib directly so I'm not the right person to document its behavior for new contributors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you ant artifacts this will depend on the resolver-dist-lib target. This propagates the lib/ folder. This folder contains the jar files that are actually distributed in a release (e.g. the tarball, the deb and rpm packages). As opposed to build/lib/jars which contains more, e.g. build libs, provided scope, optional deps, etc.

We have to keep a closer eye on what gets generated into the lib/ folder, according to ASF distribution policy: https://www.apache.org/legal/resolved.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added more details here

`build-deps-pom` template is `.build/cassandra-build-deps-template.xml`.
* *all-pom*
- POM for https://mvnrepository.com/artifact/org.apache.cassandra/cassandra-all[cassandra-all.jar]
that can be installed or deployed to public maven repos via `ant publish`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no you can't. nothing can be pushed public without a vote and three PMC +1s

(what ant publish does it stage it the ASF nexus, where we can then vote on it. but no one AFAIK needs to do this because we only do it as part of our release process.)

when a vote is finalised the staging repo in ASF nexus is "published" and gets pushed out to other public repos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an existing part of the documentation, but I just made some clarifications - let me know if this is suitable for you

Comment on lines 56 to 58
In Cassandra 4.2 and later, dependencies are managed in Maven POM templates in
`.build/\*-template.xml`. These templates are processed into valid Maven POMs
and copied to `build/\*.pom` by the `ant write-poms` task.
Copy link
Contributor

@Claudenw Claudenw Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird but the .build/\*-template.xml displays correctly on my system while the build/\*.pom does not. In my editor they both display correctly. Is there a way to see how the publishing system will render them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see build instructions in README, and output of GHA.

@ekaterinadimitrova2
Copy link
Contributor

ekaterinadimitrova2 commented Nov 2, 2023

Same notes as @michaelsembwever , except:

and… of curiosity… given the versioning differences in the doc, wouldn't this be better moved in-tree ? (easier to read, easier to maintain)

It is better if the instruction is in one place. Otherwise, people will be merging, breaking, and searching for more instructions... Keeping it in one place makes it easier. On the other hand, if we decide to split - this should be done in another ticket. This one is almost ready, and it has been reviewed three times already.

aratno and others added 2 commits November 2, 2023 12:32
@aratno
Copy link
Contributor Author

aratno commented Nov 3, 2023

and… of curiosity… given the versioning differences in the doc, wouldn't this be better moved in-tree ? (easier to read, easier to maintain)

Agree with Ekaterina - I don't have a preference where this documentation lives. I'd just like it to be updated with information on the recent changes in how builds work.

@michaelsembwever
Copy link
Member

michaelsembwever commented Nov 4, 2023

I don't have a preference where this documentation lives. I'd just like it to be updated with information on the recent changes in how builds work.

That would be an argument to have it versioned :-)

No strong opinions here, the trade-offs I see are

  • versioned: simpler to read, just go to the version you are working with and read half as much content. Less error-prone, and an easier page to maintain.
  • unversioned: one place to learn how it works in all branches. more to read, more cognitive load. if working with multiple branches (e.g. forward merging) you have to do this anyway.

A question I have is if it's unversioned then what's happens to the docs on dependency version for branches that are no longer maintained. The code is not deleted but are we then deleting the documentation to it ? This isn't just archival purposes, we do still (in exception circumstances) apply CVE fixes to past branches (and the older branches are often those even the experienced devs need to refresh their memories on how things worked…)

(Even if we come to conclusion versioned is better, it does not need to be done in this PR.)

@aratno
Copy link
Contributor Author

aratno commented Nov 6, 2023

(Even if we come to conclusion versioned is better, it does not need to be done in this PR.)

I agree with your trade-offs listed. If you believe this structural change doesn't have to be done as part of this PR, do you need the discussion of pros and cons to happen here?

I've addressed the other feedback and would love for this to merge - we can always relocate docs as necessary afterwards.

@michaelsembwever
Copy link
Member

do you need the discussion of pros and cons to happen here?

yes, totally make sense to me. it's input to what and where to do something that stems from and impacts here :-)

Dependencies added to the `lib/` directory are built into the release artifacts
by the `ant artifacts` target (see target `resolver-dist-lib`). Libraries
distributed this way must meet the
[https://www.apache.org/legal/resolved.html]ASF distribution policy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link is not working,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oopsies - fixed

@aratno
Copy link
Contributor Author

aratno commented Nov 6, 2023

Here's my personal opinion:

I'm against any changes to documentation structure that are motivated by the minor changes in dependency management that started in 5.0. Right now there's a pretty straightforward path for a new contributor to find the contribution docs on cassandra.apache.org, and find their way to the dependency section. I'm not convinced that the changes in this PR exceed the threshold where restructuring makes things easier and not more complicated. It's not uncommon for a dependency change to impact multiple versions, so having both docs in one place feels very reasonable to me, especially while the change is relatively new. Using versioned documentation can make it harder to tell what's different between versions, which is relevant for contributors working with multiple release branches.

I'm not opposed to versioned dependencies in general, but it's more common for users to work with a single version than contributors, so defaulting to non-versioned docs for contributor-facing docs feels reasonable to me.

@michaelsembwever
Copy link
Member

michaelsembwever commented Nov 6, 2023

@aratno we're talking past each other here… 😃

Your involvement in this work makes your opinion super valuable and accessible. That's an opportunity to take advantage of 🙏 Creating a separate ticket to discuss the trade-offs risks creating a ticket unnecessarily, and losing the attention from the folks that probably have the most valuable opinion on it currently. Raising the discussion, having the discussion, deciding on an action, and where and when to do that action are all very separate tasks, there's no assumptions being made that doing some enforce others.

Appreciate your feedback! I generally agree with you, my concern remains about what happens to this documentation on EOL'd branches (there's benefit of just reading the accurate docs in -tree, rather than crawling through git log to find it…)

@aratno
Copy link
Contributor Author

aratno commented Nov 6, 2023

I think I understand your position better now - previously I had interpreted your comment as you prioritizing perfection over incremental progress, but now it's more clear that you find the discussion valuable even if it doesn't lead to an active effort immediately. I was maybe too wary of over-discussion - thank you for taking the step back to clarify.

Regarding EOL'd branches - aren't branches usually EOL'd when a new major comes out? Maybe we can do a docs sweep for every new major, as there's likely going to be a lot of other change happening to the docs at the same time?

@michaelsembwever
Copy link
Member

Regarding EOL'd branches - aren't branches usually EOL'd when a new major comes out? Maybe we can do a docs sweep for every new major, as there's likely going to be a lot of other change happening to the docs at the same time?

Well yes, that's kinda my point. We'll possibly be cleaning up docs quickly (after each major release), and then those EOL'd branches no longer have the docs for them anywhere – and that's a problem, because our EOL'd isn't "nuke them", it's just we don't have the contributor resources to maintain these older branches but they still exist and others are free to step in and work with them. For examples: we do still apply CVE to older EOL'd branches as we see fit, and Microsoft has stepped forward and said they will provide support for older branches.

@aratno
Copy link
Contributor Author

aratno commented Nov 6, 2023

It seems reasonable to me that someone preparing a patch for an EOL'd version might have to dig a bit deeper and re-build the docs for when that version was supported. Having in-tree docs would make that easier, since their checkout of the EOL'd release would include the docs at that point in time, but we could also include a link to the SHA of the docs before they were pruned for that EOL. Then contributors would clone the EOL'd branch, find the ref for the docs, and be able to build those themselves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants