Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache config #734

Merged
merged 5 commits into from
Apr 26, 2023
Merged

Cache config #734

merged 5 commits into from
Apr 26, 2023

Conversation

delanym
Copy link
Contributor

@delanym delanym commented Apr 25, 2023

Include the formatting configuration in the hash calc as per #453

If you're happy with the approach I'll do the same for the other formatters.

@ctubbsii ctubbsii added this to the 2.23.0 milestone Apr 25, 2023
@ctubbsii ctubbsii merged commit d1d5852 into revelc:main Apr 26, 2023
@ctubbsii
Copy link
Member

Thanks @delanym !

@@ -711,14 +727,14 @@ protected void doFormatFile(final File file, final ResultCollector rc, final Pro
final var log = this.getLog();
log.debug("Processing file: " + file);
final var originalCode = this.readFileAsString(file);
final var originalHash = this.sha512hash(originalCode);
final var originalHash = this.sha512hash(originalCode + this.javaFormatter.hashCode());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@delanym @ctubbsii Thought I commented on separate PR possibly about why we need this on every single entry. Would it not be by far easier to just write the configuration hash 1 time for each configuration type we end up storing and look that up to see if we simply invalidate the entire file right away? Not sure how big the hash is but could grow the file needlessly IMO.

Copy link
Member

@ctubbsii ctubbsii Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hazendaz This change seemed pretty simple... it didn't change the format of the cache file at all, just included some information about the formatter config when computing existing hash code entries for a file. This isn't perfect, because there can be hash collisions in the formatter config (new, and possibly common), as well as hash collisions in the sha512 cache entry (existing, but rare), but it's a simple fix for the issue the user was seeing.

This can definitely grow the file needlessly if the formatter config is changed frequently and the cache is not cleared. The number of entries is approximately number of files * number of times the formatter config changed, but I think that's a reasonable trade-off for users in exchange for the convenience of automatically detecting formatter changes most of the time. They'll probably need to clear the cache occasionally if they make a lot of formatter config changes. But that's better than needing to clear it every time when they make a config change, like they have to now.

If you have a better idea, please go ahead with it. I'm not sure I understand your proposed solution, though. Maybe I would if I saw the code changes in a PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctubbsii Think I was reading this wrong now that I'm looking again and reading your comments. I was thinking two separate hashes applied together making the field really long, did not catch or notice that its not that but rather a hash of the code+formatter config hash so only one hash entry with probably nearly identical results to file sizing. So it probably is no real difference. I'm good.

@delanym All good, looking forwards to seeing the others. Maybe you might be interested in trying to rework for potential live cache. That is to say, today we write the cache file but I envision we get the before hash and after hash and if they don't match, write as I think the write is really the only tangible cost. Especially for repos not large in size. Believe we probably still need both options for repos like camel that I believe are using this still. I have a ticket out there for that effort #674. You seem to have a very good handle here and possible cycles to address.

@@ -37,17 +38,37 @@
*/
public class JavaFormatter extends AbstractCacheableFormatter implements Formatter {

@Override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@delanym @ctubbsii Sorry late again here, this part should move down after the class variables and needs a bit of cleanup to match our coding style. I can take care of later if not addressed, just noting this. And as I think I mentioned on other PR originally, Never thought to store it this way, nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New formatter config must invalidate cache
3 participants