Slow Maven import and update on large projects #155

Closed
eliasbalasis opened this issue Apr 3, 2021 · 25 comments

@eliasbalasis

eliasbalasis commented Apr 3, 2021

Importing a Maven repository with a large number of artifacts into an Eclipse workspace takes too long to complete.

I have a project with 288 artifacts that takes over 90 minutes to import.

I suspect that the larger the project is, the more permutations M2E has to process, and so the slower the import becomes.

This is not productive, and the members of my team cannot afford to wait that long each time they start a new piece of work.

Can something be done about this? Is there room for tuning and optimization of the M2E processes?

@mickaelistria
Contributor

Do you have an example of a public maven project that can be used to reproduce this issue?

@eliasbalasis
Author

Unfortunately not, I don't have any public projects big enough to demonstrate the problem.

However, having observed this issue for a very long time, I have to say it feels as if M2E is unnecessarily calculating all possible permutations. The more Maven artifacts are imported into the Eclipse workspace, the more permutations M2E seems to process, which slows down not only the import but also any M2E rebuild, whether triggered by changes to pom.xml files or by a "Maven / Update" action.

To be fair, M2E does a great job rebuilding the Maven structure when changes are made, but it is probably calculating too many permutations, and perhaps some optimizations could speed it up.

@laeubi
Member

laeubi commented Apr 3, 2021

Can you attach a profiler (e.g. JProfiler, YourKit, VisualVM ...) and share the results showing where most of the time seems to be spent?
This might be related to #123

@eliasbalasis
Author

eliasbalasis commented Apr 3, 2021

I am running Eclipse JEE 2021-03 with Spring Boot Tools 4 from "Marketplace" under Windows 10

I managed to run the VisualVM executable and attach it to the running "eclipse.exe" process, and I will try to collect measurements similar to those in #123.

However, I don't think this is a new problem, and I doubt a VisualVM investigation will help much: I have been experiencing this for years, and it has only become more pronounced as the complexity of the projects increased, meaning the number of artifacts and their relationships, plus heavy use of transitive dependency chains, dependency baselines, BOMs, Maven profiles, interpolation, etc.

@eliasbalasis
Author

Correction,

I had forgotten to assign enough Java heap to the Eclipse process.

After adjusting the Eclipse memory settings, the process that used to take over an hour with earlier versions of Eclipse (4.15) now takes just 15 minutes.

Great job and well done.

@eliasbalasis
Author

eliasbalasis commented Nov 23, 2021

After many years of experimenting with Eclipse and M2E, and with more than enough Java heap assigned to the Eclipse process, I am certain that the larger an imported Maven solution is, the more permutations M2E seems to perform. Consequently, it takes longer not only to import the Maven solution the first time (as "Existing Maven Projects") but also to process each pom.xml change (292 Maven artifacts in the Eclipse workspace).

The problem becomes worse when importing a hierarchy of Maven solutions (952 Maven artifacts in the Eclipse workspace).

Creating a minimal yet sufficiently complex project to reproduce the issue is practically impossible: only a project as big as the ones I have been working on could reproduce it, and unfortunately I cannot share those projects.

I cannot explain the behavior described in my last comment, and I consider it erroneous, as it happened only once and on a project that was smaller at the time (<300 Maven artifacts in the Eclipse workspace, having grown to 366 at present).

I suspect the slowness is to be expected given the large number of artifacts I am working with.
Yet I am sure other people have experienced this issue, and I hope there is room for optimizations and performance improvements in M2E.

@eliasbalasis eliasbalasis reopened this Nov 23, 2021
@eliasbalasis eliasbalasis changed the title Slow Maven import on large projects Slow Maven import and update on large solutions Nov 23, 2021
@eliasbalasis eliasbalasis changed the title Slow Maven import and update on large solutions Slow Maven import and update on large projects Nov 23, 2021
@eliasbalasis
Author

eliasbalasis commented Nov 25, 2021

As promised and in relation to #123

I have collected, using VisualVM, call stacks of very large depth from a running Eclipse instance while M2E was processing permutations.
See snapshot-1637846394151.zip and the thread named "Worker-712: Building", similar to the one in #123 but without using "Tycho".

To clarify, Eclipse is not freezing; it just seems to take M2E too long to process all the permutations.

Is this helpful?

@mickaelistria
Contributor

Can you please clarify what makes you think there are "permutations" involved? I don't get where this comes from.
My impression is rather that the m2e code relies too much on resolveParentProject, which is an expensive operation, while other, more efficient strategies could probably be used.

@eliasbalasis
Author

eliasbalasis commented Nov 25, 2021

I keep seeing heavy repetition of "Building: project: XXX" messages in the "Progress" view, often for projects that have already been built, over and over, as if possible Maven permutations and artifact relationship paths were being examined continuously.

Also, at different stages of the build process, the method call depth becomes shallower, for example during the "Setting classpath containers" stage.

Overall, after having studied various JVM snapshots, the m2e method calls always seem to have a much larger depth than anything else, if that helps.

@mickaelistria
Contributor

I've looked at the code in more detail, and calling "resolveParentProject" could be the root cause. I don't really know (yet?) why project resolution is necessary instead of just project.getParent()...
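
To make the suspected difference concrete, here is a minimal sketch of why re-resolving a parent for every module could cost far more than reading a parent that is already attached. It is a toy model, not m2e or Maven code: `Project`, `Stats`, `buildWorkspace`, `withResolution` and `withGetParent` are hypothetical names, and the cost model (one simulated model build per ancestor on every resolution) is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentLookupDemo {

    /** Hypothetical stand-in for a project model; not an m2e or Maven class. */
    static final class Project {
        final String name;
        final Project parent; // set once, when the model is first built

        Project(String name, Project parent) {
            this.name = name;
            this.parent = parent;
        }

        Project getParent() {
            return parent;
        }
    }

    /** Simple counters so both strategies can be compared. */
    static final class Stats {
        long lookups;
        long modelBuilds;
    }

    /** Builds a chain of ancestor POMs and a number of leaf modules under the last one. */
    static List<Project> buildWorkspace(int ancestors, int leaves) {
        Project parent = null;
        for (int i = 0; i < ancestors; i++) {
            parent = new Project("ancestor-" + i, parent);
        }
        List<Project> modules = new ArrayList<>();
        for (int i = 0; i < leaves; i++) {
            modules.add(new Project("module-" + i, parent));
        }
        return modules;
    }

    /** Assumed cost model: every lookup rebuilds each ancestor model from scratch. */
    static Stats withResolution(List<Project> modules) {
        Stats stats = new Stats();
        for (Project module : modules) {
            stats.lookups++;
            for (Project a = module.getParent(); a != null; a = a.getParent()) {
                stats.modelBuilds++; // simulated expensive rebuild
            }
        }
        return stats;
    }

    /** Reuses the parent that is already attached to the module. */
    static Stats withGetParent(List<Project> modules) {
        Stats stats = new Stats();
        for (Project module : modules) {
            if (module.getParent() != null) {
                stats.lookups++; // plain field read, nothing is rebuilt
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        // Hypothetical workspace shape: 3 ancestor POMs, 366 modules.
        List<Project> modules = buildWorkspace(3, 366);
        Stats resolving = withResolution(modules);
        Stats cached = withGetParent(modules);
        System.out.println("re-resolving: " + resolving.lookups + " lookups, "
                + resolving.modelBuilds + " simulated model builds");
        System.out.println("getParent():  " + cached.lookups + " lookups, "
                + cached.modelBuilds + " simulated model builds");
    }
}
```

With the hypothetical shape used in `main` (three ancestors, 366 modules), the first strategy performs 1098 simulated model builds and the second performs none. The real cost of a resolution in m2e is not measured here; the sketch only shows how the work can multiply with the number of modules.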

@eliasbalasis
Author

Thanks @mickaelistria,

this makes a lot of sense and aligns with my observations and thoughts.

I fear we are about to open a can of worms, but not unnecessarily: I am experiencing these delays more and more frequently and intensely, which gives me the feeling that some expensive operation (probably "resolveParentProject") is being called continuously and/or unnecessarily, delaying the whole process.

I am not an M2E expert but let me know if there is anything I can do to help.

mickaelistria added a commit to mickaelistria/m2e-core that referenced this issue Nov 25, 2021
…ject()

It's not clear why a project resolution is required, although it's a
very expensive operation and we already have the parent project at hand.
Just skip resolution for now.
@mickaelistria
Contributor

Submitted #155 about that. Let's see if it causes some test failures; if not, I'll probably just merge ASAP to get feedback.

mickaelistria added a commit that referenced this issue Nov 25, 2021
It's not clear why a project resolution is required, although it's a
very expensive operation and we already have the parent project at hand.
Just skip resolution for now.
@mickaelistria
Contributor

I merged a patch that should improve this case. It will be available in snapshots in the next ~15 minutes. Please try upgrading to newer snapshots once they're available and report whether the fix is enough for you.

@eliasbalasis
Author

eliasbalasis commented Nov 25, 2021

Thanks @mickaelistria ,

I am running Eclipse JEE 2021-03 with Spring Boot Tools 4 from "Marketplace" under Windows 10

This implies M2E 1.17.2.20210219-1922

I will try with latest M2E 1.19.0 before trying your fix.

@eliasbalasis
Author

eliasbalasis commented Nov 26, 2021

As expected, latest M2E 1.19.0 didn't make any difference.

However, your snapshot is lightning fast.

Where importing a hierarchy of Maven solutions into Eclipse used to take 90 minutes or more, it now takes <10 minutes.
After a change in the topmost ancestor pom.xml, Eclipse took <10 minutes to rebuild all 366 imported Maven artifacts, compared to more than one hour before the fix.

All of my struggles have miraculously disappeared.

I will keep experimenting with this and keep you informed, but excellent progress so far...

@eliasbalasis
Author

I have been experimenting with the snapshot release, and I notice that Eclipse has stopped trying to resolve the whole universe all the time.
Instead, Eclipse now has the opportunity to run more tasks in parallel, which seems to have a performance impact on my machine, but that is understood and expected, and the increased parallelism is very welcome.
I did have a few issues where Eclipse seemed to freeze a couple of times while saving file changes, and I had to forcibly stop it, but that may have been a one-off; I am not 100% convinced.

Overall, both the initial import of the Maven solutions hierarchy (roughly 1000 artifacts) and the processing of pom.xml changes have proven very fast so far, incomparable to the previous state.

I will keep using the snapshot and let you know if I experience anything unusual...

@eliasbalasis
Author

I keep experimenting quite successfully with the snapshot release.

I did experience Eclipse freezing occasionally for longer than has normally been acceptable, this time without recovering, and I blamed it on my haste in making and saving changes while Eclipse was building, which in my long experience with Eclipse has never been a good approach.
Lesson confirmed once again: if Eclipse is building, wait for it.

Overall, the replacement of resolveParentProject() with project.getParent() seems to be crucial for the build lifecycle.
However, I don't understand why project resolution was necessary instead of just project.getParent()...
Is it possible that use of resolveParentProject() is actually useful, even though my experimentation so far is proving otherwise?

@mickaelistria
Contributor

> I did experience Eclipse freezing occasionally for longer than it has normally been acceptable

I don't think the change is likely to cause longer freezes; it may be another issue. When this happens, run jstack against the Eclipse Java process. The stack for the "main" thread usually shows the reason for the freeze, and the package names in it can point to the culprit project where the freeze should be reported.
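
For readers unfamiliar with thread dumps, the sketch below shows the kind of information to look for. It is not jstack and does not attach to Eclipse: jstack is run from outside against the process ID, whereas this hypothetical `MainThreadDump` class only prints the stack of a named thread in its own JVM, using the standard `Thread.getAllStackTraces()` call. The fully qualified class names in the `at ...` lines are where the package names mentioned above would be read from.

```java
import java.util.Map;

public class MainThreadDump {

    /**
     * Returns the stack of every live thread with the given name,
     * formatted roughly like a thread dump, or "" if none matches.
     */
    static String dumpThread(String threadName) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().equals(threadName)) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                // Each frame starts with a fully qualified class name; its
                // package hints at which component the thread is busy in.
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustration only: dumps this program's own "main" thread, not Eclipse's.
        System.out.print(dumpThread("main"));
    }
}
```

In a real jstack dump of a frozen Eclipse, the frames near the top of the "main" thread are the ones to read first.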

@eliasbalasis
Author

Thanks @mickaelistria, I will follow your advice next time.

However, just to be absolutely certain,
is there a possibility that resolveParentProject() is actually useful, even though my experimentation so far is proving otherwise?

@mickaelistria
Contributor

> is there a possibility that resolveParentProject() is actually useful, even though my experimentation so far is proving otherwise?

I don't know for sure. I didn't figure out a reason why readParentProject would be interesting here.

@dsbanks99

@mickaelistria can you confirm whether the enhancement that removes resolveParentProject() is in the latest m2e I would get from the Marketplace? If not, how do I get it? I tried pointing my Eclipse install to https://download.eclipse.org/technology/m2e/snapshots/1.19.0/latest/ but the latest version there seems to be
M2E - Complete Development Kit (optional) 1.19.0.20211118-0811

@mickaelistria
Contributor

No, it's not yet released and not accessible on Marketplace. You'll need to install snapshots as described in https://github.com/eclipse-m2e/m2e-core#-installation

@HannesWell
Contributor

> I tried pointing my eclipse install to https://download.eclipse.org/technology/m2e/snapshots/1.19.0/latest/ but the latest version seems to be M2E - Complete Development Kit (optional) 1.19.0.20211118-0811

Because M2E 1.19.0 has already been released, the snapshot repo you used is no longer updated.
To get the latest updates for the release currently under development use (notice 1.19.1 vs. 1.19.0):
https://download.eclipse.org/technology/m2e/snapshots/1.19.1/latest/
or to always get the latest snapshots use:
https://download.eclipse.org/technology/m2e/snapshots/latest/

@StrongSteve

Can anyone point me to the version of m2e I should install to get the speedup in Maven handling?
Is it https://download.eclipse.org/technology/m2e/snapshots/1.19.1/latest/?

So basically a version > 1.19.1?

@HannesWell
Contributor

The latest release of m2e is 1.20.0, which was published a few days ago (and is likely the last release in the 1.x line).
You can find the repository of that specific release at: https://download.eclipse.org/technology/m2e/releases/1.20.0/
To always be up to date, you can use https://download.eclipse.org/technology/m2e/releases/latest/ which always points to the latest release.
