Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite hls-graph to not use the Shake code #1759

Closed
wants to merge 1 commit into from

Conversation

ndmitchell
Copy link
Collaborator

This PR reimplements the key algorithms from Shake directly. Most of the tests still work, but there are some failures relating to progress messages. Unfortunately, those failures also stop the benchmark from working, and thus I have no idea of the relative performance. In order to simplify relative to Shake, and make it more suitable for HLS:

  • No serialisation logic at all.
  • A lot less focus on error handling and fast exception propagation, since we never have exceptions.
  • No continuation monad or thread pool, everything is separate threads (managed by async) and direct ReaderT IO.
  • The Ids and Intern modules are lifted from Shake unchanged.

All the interesting code is in the Database module. Everything else is just wrapping and implementing required type classes on the underlying types.

Things that don't work:

  • The benchmarks, which seem to have race conditions around progress messages.
  • Most tests work, but the "fix syntax error" ones don't. Unfortunately, they also fail with race conditions around progress messages, but sometimes I won those races and they still failed.
  • All profiling is stubbed out. If we consider that important, I think we can get it back. I think reusing the Shake progress reporting HTML is even feasible, with a bit of modularisation in Shake. It wouldn't be unreasonable for Shake to make that a separate package.

How we move forward is probably heavily influenced by what the benchmarks say. If this new Shake is either within the ballpark of same speed or faster, I suggest we move entirely to this new one, at which point things like @pepeiborra's reverse dependencies are way easier to integrate.

@jneira
Copy link
Member

jneira commented Apr 19, 2021

Sorry if it is not related at all but would be #796 relevant? this could enable the change suggested there?

@ndmitchell
Copy link
Collaborator Author

The work already merged enables #796. This would be going in a different direction with our own implementation

Copy link
Collaborator

@pepeiborra pepeiborra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive! I read through the code and was able to understand most of it. Planning to play with it this weekend and make it work with the benchmarks.

Comment on lines +80 to +83
-- Hold the lock while mutating Ids/Values
databaseLock :: !Lock,
databaseIds :: !(IORef (Intern Key)),
databaseValues :: !(Ids (Key, Status))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- Hold the lock while mutating Ids/Values
databaseLock :: !Lock,
databaseIds :: !(IORef (Intern Key)),
databaseValues :: !(Ids (Key, Status))
-- TODO (someday) Replace the 3 fields below by a concurrent (mutable) Hashtable
-- Hold the lock while mutating Ids/Values
databaseLock :: !Lock,
databaseIds :: !(IORef (Intern Key)),
databaseValues :: !(Ids (Key, Status))

Comment on lines +77 to +78
databaseExtra :: Dynamic,
databaseRules :: TheRules,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these fields lazy on purpose?

Comment on lines +87 to +89
= Clean Result
| Dirty (Maybe Result)
| Running (IO Result) (Maybe Result)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
= Clean Result
| Dirty (Maybe Result)
| Running (IO Result) (Maybe Result)
= Clean !Result
| Dirty !(Maybe Result)
| Running !(IO Result) !(Maybe Result)

resultBuilt :: !Step,
resultChanged :: !Step,
resultDeps :: !(Maybe [Id]), -- Nothing = alwaysRerun
resultData :: BS.ByteString
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
resultData :: BS.ByteString
resultData :: !BS.ByteString

Comment on lines +142 to +144
let act2 = Box <$> act
let res = unsafePerformIO act2
(void $ evaluate res, fromBox res)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this trick. What is the Box for?

-- Catch only catches exceptions that were caused by this code, not those that
-- are a result of program termination
f e | isAsyncException e = Nothing
| otherwise = fromException e

actionBracket :: IO a -> (a -> IO b) -> (a -> Action c) -> Action c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
actionBracket :: IO a -> (a -> IO b) -> (a -> Action c) -> Action c
-- TODO Generalize type signatures and implement exception classes
actionBracket :: IO a -> (a -> IO b) -> (a -> Action c) -> Action c

pure $ ShakeDatabase threads (length actions) actions db

shakeRunDatabase :: ShakeDatabase -> [Action a] -> IO ([a], [IO ()])
shakeRunDatabase (ShakeDatabase threads lenAs1 as1 db) as2 = withNumCapabilities threads $ do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
shakeRunDatabase (ShakeDatabase threads lenAs1 as1 db) as2 = withNumCapabilities threads $ do
-- TODO move capability bits out of hls-graph
shakeRunDatabase (ShakeDatabase threads lenAs1 as1 db) as2 = withNumCapabilities threads $ do


data SAction = SAction {
actionDatabase :: !Database,
actionDeps :: !(IORef (Maybe [Id])) -- Nothing means always rerun
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
actionDeps :: !(IORef (Maybe [Id])) -- Nothing means always rerun
actionDeps :: !(IORef (Maybe [Id])) -- ^ Nothing means always rerun

@pepeiborra
Copy link
Collaborator

After rebasing on top of #1754 benchmarks are failing with out of range errors:

╰─>$ tail -n 20 ghcide/bench-results/unprofiled/Cabal-3.0.0.0/HEAD/completions.output.log
                "method": "workspace/didChangeWatchedFiles",
                "id": "globalFileWatches"
            }
        ]
    },
    "method": "client/registerCapability",
    "id": 0
}
ghcide: Ix{Int}.index: Index (43) out of range ((0,0))
[DEBUG] Set files of
--> {
    "jsonrpc": "2.0",
    "params": null,
    "method": "shutdown",
    "id": 0
}
fd:12: hPutBuf: resource vanished (Broken pipe)
name        | success | samples | startup | setup | userTime | delayedTime | totalTime
----------- | ------- | ------- | ------- | ----- | -------- | ----------- | ---------
completions | False   | 50      | 0.00s   | 0.00s | 0.00s    | 0.00s       | 0.00s

@ndmitchell
Copy link
Collaborator Author

That's a different error to what I was seeing. I'll double check tomorrow morning that I didn't forget to send up any of my changes.

act <- uninterruptibleMask $ \restore -> do
-- the child actions should always be spawned unmasked
-- or they can't be killed
async <- async $ restore $ check db key id s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On IRC, we were discussing if it was possible to make all the book-keeping contention free, by eliminating the locks and running in STM (and using a concurrent map like stm-containers?).

This bit seems to be the problematic part, the rest of the computation seem like it can easily be moved into STM. But it might just be possible to make something work using judicious use of unsafePerformIO, unsafeIOToSTM and the proper care. Do you think this would be possible?

@pepeiborra pepeiborra mentioned this pull request Aug 1, 2021
5 tasks
@pepeiborra
Copy link
Collaborator

Continued in #2060

@pepeiborra pepeiborra closed this Aug 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants