Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run plugins' test suites with server in the same process #1628

Merged
merged 19 commits into from
Apr 6, 2021

Conversation

berberman
Copy link
Collaborator

Following #1617

Hope this would make tests more stable

@berberman
Copy link
Collaborator Author

@berberman berberman requested a review from pepeiborra March 28, 2021 10:18
@pepeiborra
Copy link
Collaborator

Tests on Windows seemd got stuck again:
https://github.com/haskell/haskell-language-server/pull/1628/checks?check_run_id=2211603744
https://github.com/haskell/haskell-language-server/pull/1628/checks?check_run_id=2211603749

Isn't it enough to set shakeThreads to 2?

You also need to build with --threaded

@berberman
Copy link
Collaborator Author

berberman commented Mar 29, 2021

Tests on Windows seemd got stuck again:
https://github.com/haskell/haskell-language-server/pull/1628/checks?check_run_id=2211603744
https://github.com/haskell/haskell-language-server/pull/1628/checks?check_run_id=2211603749
Isn't it enough to set shakeThreads to 2?

You also need to build with --threaded

Mmm, -threaded is already here:

test-suite tests
type: exitcode-stdio-1.0
default-language: Haskell2010
hs-source-dirs: test
main-is: Main.hs
ghc-options: -threaded -rtsopts -with-rtsopts=-N

@berberman berberman force-pushed the plugin-tests2 branch 2 times, most recently from d29c7dd to 405635c Compare March 29, 2021 12:22
@berberman
Copy link
Collaborator Author

So far, tests of plugins on linux and macOS look great, but on Windows, they seem to hang100%. I don't have Windows machine with development environment setup, no idea how to debug this.

@berberman berberman marked this pull request as ready for review March 30, 2021 04:41
@jneira
Copy link
Member

jneira commented Mar 30, 2021

I'll try to take a look in my local win 10

@jneira jneira self-requested a review March 30, 2021 05:33
@jneira
Copy link
Member

jneira commented Mar 31, 2021

Unfortunately tests pass in my local windows 10, after a first try with some resource vanished but no stuck. 😟

@berberman
Copy link
Collaborator Author

Ugh, it's still getting stuck... BTW definition and hover tests of ghcide is fairly unstable, and it even fails on my machine constantly: #1626 (comment)

@pepeiborra
Copy link
Collaborator

Unfortunately tests pass in my local windows 10, after a first try with some resource vanished but no stuck. 😟

But this is the same problem that CI shows, isn't it? Tests are not stuck, they are failing in Windows GHC 8.10.4. "resource vanished - broken pipe" means that HLS stopped running. Either it was killed, it crashed, or it decided to exit on its own.

Why is HLS crashing in Windows 8.10.4 all of a sudden? Has anything changed?

@jneira
Copy link
Member

jneira commented Mar 31, 2021

But this is the same problem that CI shows, isn't it?

Yeah, i thought they stuck without no response until a ci timeout, without reviewing the logs,thanks for pointing it out.
The symtomps are similar locally but the situation is much worse in ci, locally i got three errors in a cold first run and it passed in the second one.

Why is HLS crashing in Windows 8.10.4 all of a sudden? Has anything changed?

It is failing since some time ago, although it seems in this pr is worse. I have been reviewing the gh actions and dont see a pattern. There are recent builds with succesful test run at first like https://github.com/haskell/haskell-language-server/runs/2197782538?check_suite_focus=true
Being transient errors and the differences between ci and my local machine make a bisect is hard 😟

@berberman
Copy link
Collaborator Author

Sometimes a dying server (after the test session finished successfully) throws this error:

tests.exe: Control.Concurrent.Extra.signalBarrier, attempt to signal a barrier that has already been signaled
CallStack (from HasCallStack):
  errorIO, called at src\Control\Concurrent\Extra.hs:196:16 in extra-1.7.9-1fb0534e9d12add2b941eefeb5d3e9bbdea5a70c:Control.Concurrent.Extra
  signalBarrier, called at src\Development\IDE\LSP\LanguageServer.hs:64:16 in ghcide-1.1.0.0-inplace:Development.IDE.LSP.LanguageServer

-- These barriers are signaled when the threads reading from these chans exit.
-- This should not happen but if it does, we will make sure that the whole server
-- dies and can be restarted instead of losing threads silently.
clientMsgBarrier <- newBarrier
-- Forcefully exit
let exit = signalBarrier clientMsgBarrier ()

Does it mean the server received two exit messages?

@pepeiborra
Copy link
Collaborator

But this is the same problem that CI shows, isn't it?

Yeah, i thought they stuck without no response until a ci timeout, without reviewing the logs,thanks for pointing it out.
The symtomps are similar locally but the situation is much worse in ci, locally i got three errors in a cold first run and it passed in the second one.

Why is HLS crashing in Windows 8.10.4 all of a sudden? Has anything changed?

It is failing since some time ago, although it seems in this pr is worse. I have been reviewing the gh actions and dont see a pattern. There are recent builds with succesful test run at first like https://github.com/haskell/haskell-language-server/runs/2197782538?check_suite_focus=true
Being transient errors and the differences between ci and my local machine make a bisect is hard 😟

This PR didn't make any changes to the ghcide test suite, so I don't see the connection. Unfortunately master has the getting-stuck problem which means we cannot see if the Windows issue is specific to this PR or not

@pepeiborra
Copy link
Collaborator

Sometimes a dying server (after the test session finished successfully) throws this error:

tests.exe: Control.Concurrent.Extra.signalBarrier, attempt to signal a barrier that has already been signaled
CallStack (from HasCallStack):
  errorIO, called at src\Control\Concurrent\Extra.hs:196:16 in extra-1.7.9-1fb0534e9d12add2b941eefeb5d3e9bbdea5a70c:Control.Concurrent.Extra
  signalBarrier, called at src\Development\IDE\LSP\LanguageServer.hs:64:16 in ghcide-1.1.0.0-inplace:Development.IDE.LSP.LanguageServer

-- These barriers are signaled when the threads reading from these chans exit.
-- This should not happen but if it does, we will make sure that the whole server
-- dies and can be restarted instead of losing threads silently.
clientMsgBarrier <- newBarrier
-- Forcefully exit
let exit = signalBarrier clientMsgBarrier ()

Does it mean the server received two exit messages?

Yes, I think so. Perhaps this is a new feature of lsp-test.

Great find! Please fix the crash by replacing the Barrier with a plain MVar and let's see if that unblocks the test suite!

@berberman
Copy link
Collaborator Author

One difference between local machine and Windows CI I noticed (by checking lsp messages log one by one) was that in CI, there were no hiedb related messages sent from the server, such as ghcide/reference/ready, Finished indexing 1 files.

@berberman
Copy link
Collaborator Author

berberman commented Apr 1, 2021

Sometimes a dying server (after the test session finished successfully) throws this error:

tests.exe: Control.Concurrent.Extra.signalBarrier, attempt to signal a barrier that has already been signaled
CallStack (from HasCallStack):
  errorIO, called at src\Control\Concurrent\Extra.hs:196:16 in extra-1.7.9-1fb0534e9d12add2b941eefeb5d3e9bbdea5a70c:Control.Concurrent.Extra
  signalBarrier, called at src\Development\IDE\LSP\LanguageServer.hs:64:16 in ghcide-1.1.0.0-inplace:Development.IDE.LSP.LanguageServer

-- These barriers are signaled when the threads reading from these chans exit.
-- This should not happen but if it does, we will make sure that the whole server
-- dies and can be restarted instead of losing threads silently.
clientMsgBarrier <- newBarrier
-- Forcefully exit
let exit = signalBarrier clientMsgBarrier ()

Does it mean the server received two exit messages?

Yes, I think so. Perhaps this is a new feature of lsp-test.

Great find! Please fix the crash by replacing the Barrier with a plain MVar and let's see if that unblocks the test suite!

But according to the lsp messages log, the client sends exactly one exit message 🤔

Anyway, I don't see the reason that we need Barrier instead of MVar -- being signalled more than one time should be OK as well.

@wz1000
Copy link
Collaborator

wz1000 commented Apr 1, 2021

You might want to investigate/patch lsp-test. I believe the resource vanished error comes from this line: https://github.com/haskell/lsp/blob/6dd6ad630ddb4a08a2bbcd6f71588f0864a555b6/lsp-test/src/Language/LSP/Test/Session.hs#L440

It might be good to add a catch there, or at least print out more information for debugging (like the message we are trying to send).

Comment on lines 123 to 125
timeout 3 (wait server) >>= \case
Just () -> pure ()
Nothing -> putStrLn "Server does not exit on time, canceling the async task..." >> cancel server
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests on Windows passed with this. We should investigate why the server didn't exit properly.

Copy link
Collaborator

@pepeiborra pepeiborra Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just looking at the exit code in ghcide and I think it might be time to update it. I'm not sure that the waitAnyCancel construction is needed any more, and it could be hiding exceptions coming from the lsp library:

void $ waitAnyCancel =<< traverse async
[ void $ LSP.runServerWithHandles
inH
outH
serverDefinition
, void $ waitBarrier clientMsgBarrier
]

EDIT: according to the docs, exceptions will be re-thrown to the parent thread, so maybe it's harmless

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is that defaultMain does not call setupLogger so we might be missing lsp errors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-- These barriers are signaled when the threads reading from these chans exit.
-- This should not happen but if it does, we will make sure that the whole server
-- dies and can be restarted instead of losing threads silently.
clientMsgBarrier <- newBarrier

These comments match the old code:

void $ waitAnyCancel =<< traverse async
[ void $ LSP.runWithHandles
stdin
newStdout
( const $ Right ()
, handleInit (signalBarrier clientMsgBarrier ()) clientMsgChan
)
(handlers clientMsgChan)
options
Nothing
, void $ waitBarrier clientMsgBarrier
]
where
handleInit :: IO () -> TChan LSP.FromClientMessage -> LSP.LspFuncs () -> IO (Maybe LSP.ResponseError)
handleInit exitClientMsg clientMsgChan lspFuncs@LSP.LspFuncs{..} = do
Handlers{..} <- getHandlers lspFuncs
let requestHandler' (req, reqId) = requestHandler
(\res -> ResponseMessage "2.0" (responseId reqId) (Just res) Nothing)
(\err -> ResponseMessage "2.0" (responseId reqId) Nothing (Just $ ResponseError err "" Nothing))
req
_ <- flip forkFinally (const exitClientMsg) $ forever $ do
msg <- atomically $ readTChan clientMsgChan
case convClientMsg msg of
Nothing -> Logger.logSeriousError loggerH $ "Unknown client msg: " <> T.pack (show msg)
Just (Left notif) -> notificationHandler notif
Just (Right req) -> sendFunc =<< requestHandler' req
pure Nothing

but now this barrier will be signalled when the server receives exit message instead.

I think we could try removing the waitAnyCancel.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, now we signal the barrier from two places - the exit handler and also if the initialisation fails, which probably explains the error that you were seeing earlier.

I'm not convinced that removing it will actually fix anything. runServerWithHandles doesn't actually exit until the input stream is closed, and I don't know whether lsp-test closes the stream until the server exits

Copy link
Collaborator Author

@berberman berberman Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that lsp-test won't close the handle for us (in our case mServerProc is Nothing):

let cleanup
| Just sp <- mServerProc = do
-- Give the server some time to exit cleanly
-- It makes the server hangs in windows so we have to avoid it
#ifndef mingw32_HOST_OS
timeout msgTimeoutMs (waitForProcess sp)
#endif
cleanupProcess (Just serverIn, Just serverOut, Nothing, sp)
| otherwise = pure ()
finally (timeout msgTimeoutMs (runSession' exitServer))
-- Make sure to kill the listener first, before closing
-- handles etc via cleanupProcess
(killThread tid >> cleanup)

But I didn't find any manual close in the test of lsp-test. Maybe @wz1000 could help?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure the handle closes when the server exits.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handle closes when the server exits, sure, but as I said runServerWithHandles doesn't exit until the stream ends, so it won't exit.

@berberman
Copy link
Collaborator Author

I observed some snippets from debug log, like this:

[Debug] Finishing build session(exception: Error when running Shake build system:
thread blocked indefinitely in an STM transaction
)
[Debug] Finishing build session(exception: Error when running Shake build system:
[Debug] Finishing build session(exception: Error when running Shake build system:
[Debug] Finishing build session(exception: Error when running Shake build system:
thread blocked indefinitely in an STM transaction
thread blocked indefinitely in an STM transaction
thread blocked indefinitely in an STM transaction
)))

@pepeiborra
Copy link
Collaborator

pepeiborra commented Apr 2, 2021

I observed some snippets from debug log, like this:

[Debug] Finishing build session(exception: Error when running Shake build system:
thread blocked indefinitely in an STM transaction
)
[Debug] Finishing build session(exception: Error when running Shake build system:
[Debug] Finishing build session(exception: Error when running Shake build system:
[Debug] Finishing build session(exception: Error when running Shake build system:
thread blocked indefinitely in an STM transaction
thread blocked indefinitely in an STM transaction
thread blocked indefinitely in an STM transaction
)))

That is not necessarily an issue, helper threads like the reactor thread or the hiedb thread would fail like that when the main thread exits the lsp loop and no longer keeps a reference to the shared state

@berberman
Copy link
Collaborator Author

The server sends "Finishing build session..." using notifyTestingLogMessage when a worker exits, but in our case, the input stream has been closed. Thus, calling shakeShut in exit message handler results in hPutBuf: resource vanished (Broken pipe).

workRun restore = withSpan "Shake session" $ \otSpan -> do
let acts' = pumpActionThread otSpan : map (run otSpan) (reenqueued ++ acts)
res <- try @SomeException (restore $ shakeRunDatabase shakeDb acts')
let res' = case res of
Left e -> "exception: " <> displayException e
Right _ -> "completed"
let msg = T.pack $ "Finishing build session(" ++ res' ++ ")"
return $ do
logDebug logger msg
notifyTestingLogMessage extras msg

notifyTestingLogMessage :: ShakeExtras -> T.Text -> IO ()
notifyTestingLogMessage extras msg = do
(IdeTesting isTestMode) <- optTesting <$> getIdeOptionsIO extras
let notif = LSP.LogMessageParams LSP.MtLog msg
when isTestMode $ mRunLspT (lspEnv extras) $ LSP.sendNotification LSP.SWindowLogMessage notif

And it seems that the server still reports progress after shutting down the shake. Should we call stopProgressReporting before cancelShakeSession in shakeShut?

shakeShut :: IdeState -> IO ()
shakeShut IdeState{..} = withMVar shakeSession $ \runner -> do
-- Shake gets unhappy if you try to close when there is a running
-- request so we first abort that.
void $ cancelShakeSession runner
shakeClose
stopProgressReporting

Also, is it possible to disable lsp capabilities when we want server die?

A piece of log from Windows CI:

brittany
Starting LSP server...
If you are seeing this in a terminal, you probably should have run ghcide WITHOUT the --lsp option!
Started LSP server in 0.08s
setInitialDynFlags cradle: Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
Output from setting up the cradle Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
lsp:Got EOF, exiting 1 ...

warning: LF will be replaced by CRLF in C:/Users/runneradmin/AppData/Local/Temp/BriA841.actual.
The file will have its original line endings in your working directory
Starting LSP server...
  formats a document with LF endings:   OK (0.79s)
If you are seeing this in a terminal, you probably should have run ghcide WITHOUT the --lsp option!
Started LSP server in 0.00s
  formats a document with CRLF endings: IO Exception: clientOut:
<file descriptor: 6>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Finishing build session(exception: AsyncCancelled)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 6>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Restarting build session (aborting the previous one took 0.00s)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 6>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Finishing build session(exception: AsyncCancelled)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 6>: hPutBuf: resource vanished (Broken pipe)
{"method":"$/progress","params":{"token":"14","value":{"message":"2/2","kind":"report"}},"jsonrpc":"2.0"}
setInitialDynFlags cradle: Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
Output from setting up the cradle Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
lsp:Got EOF, exiting 1 ...

warning: LF will be replaced by CRLF in C:/Users/runneradmin/AppData/Local/Temp/BriB274.actual.
The file will have its original line endings in your working directory
Starting LSP server...
OK (2.61s)
If you are seeing this in a terminal, you probably should have run ghcide WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
Output from setting up the cradle Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
  formats a range with LF endings:      IO Exception: clientOut:
<file descriptor: 9>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Finishing build session(exception: AsyncCancelled)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 9>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Restarting build session (aborting the previous one took 2.03s)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 9>: hPutBuf: resource vanished (Broken pipe)
{"method":"window/logMessage","params":{"message":"Finishing build session(exception: AsyncCancelled)","type":4},"jsonrpc":"2.0"}
IO Exception: clientOut:
<file descriptor: 9>: hPutBuf: resource vanished (Broken pipe)
{"method":"$/progress","params":{"token":"37","value":{"message":"2/2","kind":"report"}},"jsonrpc":"2.0"}
lsp:Got EOF, exiting 1 ...

warning: LF will be replaced by CRLF in C:/Users/runneradmin/AppData/Local/Temp/BriB4B9.actual.
The file will have its original line endings in your working directory
Starting LSP server...
OK (0.57s)
If you are seeing this in a terminal, you probably should have run ghcide WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
Output from setting up the cradle Cradle {cradleRootDir = "D:\\a\\haskell-language-server\\haskell-language-server\\plugins\\hls-brittany-plugin\\test\\testdata", cradleOptsProg = CradleAction: Default}
lsp:Got EOF, exiting 1 ...

warning: LF will be replaced by CRLF in plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.formatted_range.hs.
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in C:/Users/runneradmin/AppData/Local/Temp/BriB6AF.actual.
The file will have its original line endings in your working directory
  formats a range with CRLF endings:    OK (0.51s)

All 4 tests passed (4.47s)
Test suite tests: PASS
Test suite logged to:
D:\a\haskell-language-server\haskell-language-server\dist-newstyle\build\x86_64-windows\ghc-8.10.4\hls-brittany-plugin-1.0.0.0\t\tests\test\hls-brittany-plugin-1.0.0.0-tests.log
1 of 1 test suites (1 of 1 test cases) passed.

@pepeiborra
Copy link
Collaborator

shakeShut already calls progress reporting - could you try to reorder the statements so that it calls it first?

This output is caused by lsp-test closing the stream immediately in Windows, whereas it waits for the server to exit in Linux:

https://github.com/haskell/lsp/blob/6dd6ad630ddb4a08a2bbcd6f71588f0864a555b6/lsp-test/src/Language/LSP/Test/Session.hs#L283-L288

So it's probably not a big deal

hls-test-utils/src/Test/Hls.hs Outdated Show resolved Hide resolved
@berberman
Copy link
Collaborator Author

The barrier did be signaled two times. Is this exit what we expected?

[Error] Fatal error in server thread: thread blocked indefinitely in an MVar operation
tests: Control.Concurrent.Extra.signalBarrier, attempt to signal a barrier that has already been signaled
CallStack (from HasCallStack):
  errorIO, called at src/Control/Concurrent/Extra.hs:196:16 in extra-1.7.9-B289Y5Rzkww81ONKh8ABWZ:Control.Concurrent.Extra
  signalBarrier, called at src/Development/IDE/LSP/LanguageServer.hs:64:16 in ghcide-1.1.0.0-inplace:Development.IDE.LSP.LanguageServer

Full test log:

  formats a range with CRLF endings:    Starting LSP server...
If you are seeing this in a terminal, you probably should have run ghcide WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata", cradleOptsProg = CradleAction: Default}
[Info] Registering ide configuration: IdeConfiguration {workspaceFolders = fromList [NormalizedUri (-6923107000992859315) "file:///home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata"], clientSettings = hashed Nothing}
[Debug] Set files of interest to: [(NormalizedFilePath "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs",Modified {firstOpen = True})]
[Debug] Finishing build session(exception: AsyncCancelled)
[Debug] Restarting build session (aborting the previous one took 0.00s)
[Debug] Opened text document: file:///home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs
[Info] Consulting the cradle for "BrittanyCRLF.hs"
[Warning] No [cradle](https://github.com/mpickering/hie-bios#hie-bios) found for BrittanyCRLF.hs.
 Proceeding with [implicit cradle](https://hackage.haskell.org/package/implicit-hie).
You should ignore this message, unless you see a 'Multi Cradle: No prefixes matched' error.
Output from setting up the cradle Cradle {cradleRootDir = "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata", cradleOptsProg = CradleAction: Default}
[Debug] Session loading result: Right (ComponentOptions {componentOptions = ["-dynamic"], componentRoot = "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata", componentDependencies = []},"/usr/lib/ghc-8.10.4")
[Info] Using interface files cache dir: /home/berberman/.cache/ghcide/main-1a596a151463f2c53ee4feb14ecd276a1ccebfda
[Info] Making new HscEnv[main]
[Debug] New Component Cache HscEnvEq: (([],Just HscEnvEq 87),fromList [])
[Debug] Known files updated: fromList [(TargetFile NormalizedFilePath "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs",fromList ["/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs"])]
[Debug] Restarting build session (aborting the previous one took 0.00s)
[Debug] Finishing build session(exception: AsyncCancelled)
[Info] finish: brittany (took 0.00s)
[Error] Fatal error in server thread: thread blocked indefinitely in an MVar operation
tests: Control.Concurrent.Extra.signalBarrier, attempt to signal a barrier that has already been signaled
CallStack (from HasCallStack):
  errorIO, called at src/Control/Concurrent/Extra.hs:196:16 in extra-1.7.9-B289Y5Rzkww81ONKh8ABWZ:Control.Concurrent.Extra
  signalBarrier, called at src/Development/IDE/LSP/LanguageServer.hs:64:16 in ghcide-1.1.0.0-inplace:Development.IDE.LSP.LanguageServer
[Debug] finish: InitialLoad (took 0.10s)
[Debug] Set files of interest to: [(NormalizedFilePath "/home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs",Modified {firstOpen = False})]
[Debug] Finishing build session(exception: AsyncCancelled)
[Debug] Restarting build session (aborting the previous one took 0.00s)
[Debug] Modified text document: file:///home/berberman/Desktop/haskell/haskell-language-server/plugins/hls-brittany-plugin/test/testdata/BrittanyCRLF.hs
[Debug] Finishing build session(exception: AsyncCancelled)
OK (0.45s)

@pepeiborra
Copy link
Collaborator

I thought you replaced this barrier with an MVar at some point - that change makes sense to me, since we call it from two places and therefore it shouldn't be a barrier.

The "thread blocked indefinitely in an MVar operation" error is ugly - it means that the main loop has finished and no longer holds a reference to the shared state, but there are background threads still waiting on MVars. It would be good to identify which threads and clean them up properly, or just make sure we use withAsync instead of async where possible

@berberman berberman added the merge me Label to trigger pull request merge label Apr 5, 2021
@berberman berberman linked an issue Apr 5, 2021 that may be closed by this pull request
@@ -107,22 +107,6 @@ library
TypeOperators,
ViewPatterns


executable test-server
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an excellent change. Thanks!

Copy link
Member

@jneira jneira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thanks!

@berberman
Copy link
Collaborator Author

Branch protection feels unhappy with the skipped nix jobs :(

@jneira
Copy link
Member

jneira commented Apr 6, 2021

Branch protection feels unhappy with the skipped nix jobs :(

Umm, i have to change it to skip steps instead the entire job, like #1656

@berberman berberman merged commit 2d1a588 into haskell:master Apr 6, 2021
@berberman berberman deleted the plugin-tests2 branch April 6, 2021 07:37
Comment on lines +79 to +88
silenceStderr :: IO a -> IO a
silenceStderr action = withTempFile $ \temp ->
bracket (openFile temp ReadWriteMode) hClose $ \h -> do
old <- hDuplicate stderr
buf <- hGetBuffering stderr
h `hDuplicateTo'` stderr
action `finally` do
old `hDuplicateTo'` stderr
hSetBuffering stderr buf
hClose old
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we can no longer use LSP_TEST_LOG_STDERR=1 to get more info about a failing test, since silencing stderr is now hard coded?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//cc @berberman

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably check for the environment variable manually and pretend that it was lsp-test that did it, to minimize disturbance of people's workflows.

(Here's lsp-test's code for it: https://github.com/haskell/lsp/blob/e707cbf5ca7077f70884ae0d2a8d016aa30ced5a/lsp-test/src/Language/LSP/Test.hs#L267-L275)

That way the current uses, like in ci doesn't have to change. Regardless of what we do, it should probably be documented together with the LSP_TEST_LOG_MESSAGES env variable in the contributing section, which it is currently not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, pr's welcome!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we can no longer use LSP_TEST_LOG_STDERR=1 to get more info about a failing test, since silencing stderr is now hard coded?

Yes, output from lsp server is completely removed in test suites which use this function since that change, because there were too many logs were printed not through the logger, but were directly written to stderr, messing up testing status. So setting environment variable LSP_TEST_LOG_MESSAGES actually won't work. However, this function is a temporary workaround, and we should cleanup such code in server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge me Label to trigger pull request merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI jobs of plugins' test suites got stuck randomly
6 participants