-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Configuration hashing into file hash breaks existing caching - only confirmed with XML at the moment #757
Comments
Linking reference to #734 |
I have this entire issue fixed. It will be a day or two before I bring it over as I have to extensively clean it up my work. I spent about 5 to 6 hours to get this working. I also took care of the impsort change to caching to bring here as well as addressed the dependabot PR in full and another small bug. Some good faith will be necessary as it was a nightmare to test in the current project structure as I need it to use the 'new' version not the 'old' version against project build and without more work to build out integration tests that can handle this, I had to do some tricky business of installing the snapshot jar so it would run off that on github. So that is to say, on my windows 11 platform, I had to run twice (once to install a copy of latest snapshot and second to run to generate the cache based off usage of the fixes). On remote, since this project fails to deliver snapshots to sonatype (it could very easily - I do that with every single project I support otherwise), I had to check that jar in and run a maven install first. I followed that by outputting the cache and using notepad++ to compare. So all nix and windows were confirmed to now remain the same. To the original issue, windows locally seemed ok, not entirely sure why. But on github (as I saw here) or jenkins in our case where we found this issue, it was immediately clear that it was an issue and we stored the cache with the projects automatically which caused endless looping of commits. It corrupted the cache for every single entry both windows and linux on github windows which was very strange. Both mac and linux corrupted the cache for nix style. Our case was just linux and caught with the XML files only on a super pom. Eventually we will be moving the caches to develocity like others are doing. Please do understand that the cache is all we are discussing here. The formatting was not a problem but for those that store the cache, the case file was a major problem due to the change indicated on this original ticket. It would IMO be problematic even for develocity if every single build continually was changing the cache needlessly. Reviewing the original logic again, it was entirely wrong. All files were affected including java. There was zero reason to make the one formatter (java) have equals / hashcode when its entire setup comes from a map called options passed into the init methods. In fact all formatters have that. So only a hash directly on the map which is contract guaranteed to be correct and making it available was all that was needed. The final implementation is very straight forwards and simple. Pull requests incoming. I further found a minor bug in how reporting no configuration was setup as json was missing from the setup. All the further caching tests were updated as well along with the updates. I'm expecting probably 4 or so pull requests once I get this cleaned up. A proper integration test to handle this for the future with verification on the caching will be necessary but don't see that happening at the moment as I lack the time for it. Even before this is released though, we need to make sure all the submodules that are used either directly like javascript or indirectly are up to date and released too as those were skipped last time around. While I don't think that mattered much we need a solid release that leaves nothing behind. FWIW - At my day job, we are trying to go out with formatting to thousands of projects and scale that up from there. Without getting this fixed, we were going to just entirely drop it. And its not been a pressing thing until now because we got reworked just towards end of last year and these decisions were made then. Prior to that we ran this on few 100 repos based off a parent pom and just used the older version but now we are practically moving everything to java 21 and making highly configurable pipelines for Jenkins and later actions and probably git hooks too for various formatters including impsort, spotless, openrewrite, and whitespace (mine). I probably missed a few but that is intended to happen fast and work started on mine already by an offshore team. So I'm back here now as much as I can be to make sure these issues are fixed so we can get these scaled up. |
Original change in d1d5852 only addressed java and did so at detriment of all formatters. This caused certain platforms to continually update the cache file on every build. As seen on GitHub, Linux and Macos would change the *nix hashes for every file constantly including java. For Windows on Github, it was changing the entire file (not clear as to why entire file but that was observed over and over). Testing to show issue was done with 2.23.0 formatter release that introduced the problem.
This commit d1d5852 is breaking caching.
Multiple issues...
It only addressed java files. All the formatters go through that same code. So it is treating all files as if java and entirely ignoring fact that we have the following.
See code...
final var originalHash = this.sha512hash(originalCode + this.javaFormatter.hashCode());
Negative impact
Resolution
Dropping back to the prior release 2.22.0, it goes back to working without issue.
Some of the original commit may be worth keeping but as a whole we need to remove that logic and it be looked at in more detail. I did have some misgivings to start with putting configuration data hash into every single entry but it was a small issue. I think we should not be doing that at all now and rather store separate line for the physical configuration and if the map is the issue, just store the actual values. Otherwise its getting complicated. I'm not sure there is any need to go crazy hashing in that way but if we do, it needs to be done in far more files and possibly over on the xml project if config there is held elsewhere.
Otherwise 2.23.0 is IMO not usable now until addressed for anyone expecting it to actually work correctly. Also, unrelated but do note that as I brought up before eclipse broke formatter when using mockito 'when' and as such, we should be on the 3.33.0 ecj release until they fix the issue which has not been fixed yet.
One other note, it didn't say it wrote the file, so that part seemed ok but rather it just keeps writing the cache file. So those of us that store this are broken. I would not expect that file to keep changing as a result of this anyways.
Quickest solution is to just revert this commit, release again. Come back to it. I would also consider in that the note on ecj drop back until eclipse fixes their issues.
The text was updated successfully, but these errors were encountered: