Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to control progress reporting #1513

Merged
merged 5 commits into from
Mar 9, 2021

Conversation

pepeiborra
Copy link
Collaborator

@pepeiborra pepeiborra commented Mar 7, 2021

The current implementation of progress reporting uses actionBracket which has O(logC) complexity in the number of registered Cleanups (ndmitchell/shake#797).

Since we do progress reporting for every File and Rule updated, the worst case complexity is around O(FRlogC) for all the files in the project. In large projects, this is quite expensive.

Since otTracedAction is implemented using actionBracket too, I have made it conditional on eventlogs being enabled.

For reviewers understanding: I had to move some types around to avoid cyclic module dependencies.

@ndmitchell
Copy link
Collaborator

I think the O(log n) is easily fixable on the Shake side. If it was fixed, would you still want to make this change? When you say "quite expensive" do you have any numbers you can share?

@pepeiborra
Copy link
Collaborator Author

pepeiborra commented Mar 7, 2021

I think the O(log n) is easily fixable on the Shake side. If it was fixed, would you still want to make this change? When you say "quite expensive" do you have any numbers you can share?

Yes, this change still makes sense, to remove the arbitrary exclusion of GetFileExists and to make this configurable for users of ghcide as a library.

The icicle graph below is the result of profiling an "edit" experiment over a project with ~40k source files where there is only a FOI which only has 3 transitive dependencies. It shows that ~50% of the time spent in defineEarlyCutoff is used by actionBracket for the GetFileExists and GetModificationTime rules. Since these rules do little or no work, ~50% of the total overhead induced by defineEarlyCutoff is attributable to actionBracket

image

@ndmitchell
Copy link
Collaborator

Ouch, I'll get a constant time actionBracket done today with luck...

@pepeiborra pepeiborra requested a review from ndmitchell March 7, 2021 17:06
Copy link
Collaborator

@ndmitchell ndmitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice simplification and performance saving when we don't have tracing.

import Bag (listToBag)
import ErrUtils (mkErrMsg)
import Outputable (text, neverQualify)
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this #if disappear? Seems unrelated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh, again! It's the second time it happens to me, I think this could be a problem with the pre-commit hook, /cc @Ailrun

Copy link
Member

@Ailrun Ailrun Mar 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be... It's probably because CPP support for the formatter is not so reliable. Unfortunately, I don't know any formatter that works well with CPP...

}

defaultSkipProgress :: Typeable a => a -> Bool
defaultSkipProgress key = case () of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we comment as to why these nodes get excluded? The original reason for excluding GetFileExists was that given a module Foo.hs, we might have N GetFileExists at each include path, at .lhs, at .hs-boot etc. Whereas every other rule is pretty much only at a single filepath. Is that property true for modtime? It definitely isn't for getfilecontents, but is that about performance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's performance, but I haven't measured. I'll add a comment or revert to just GetFileExists

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think is unreasonable - its a commonly called thing that is mostly irrelevant for performance concerns. I do think its important to say that GetFileExists is specially, because its not obvious that is has that weird behaviour, and how if you do include GetFileExists your progress is basically meaningless.

@pepeiborra pepeiborra force-pushed the optionalProgressReporting branch from dc0966e to 0d9df1b Compare March 9, 2021 08:47
@pepeiborra pepeiborra added the merge me Label to trigger pull request merge label Mar 9, 2021
@mergify mergify bot merged commit 3995f3a into master Mar 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge me Label to trigger pull request merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants