Errors in sequencer aren't easily visible/exported anywhere #733

Open
RJPercival opened this issue Jul 11, 2017 · 11 comments

Comments

@RJPercival
Contributor

RJPercival commented Jul 11, 2017

$ go test -v github.com/google/trillian/integration
=== RUN   TestLiveLogIntegration
--- SKIP: TestLiveLogIntegration (0.00s)
        log_integration_test.go:50: Log integration test skipped as no tree ID provided
=== RUN   TestInProcessLogIntegration
E0711 11:45:35.614068   18160 log_operation_manager.go:388] failed to execute operation on logs: failed to retrieve full list of log IDs: failed to get tx for retrieving logIDs: context canceled
--- PASS: TestInProcessLogIntegration (17.52s)
=== RUN   TestInProcessLogIntegrationDuplicateLeaves
*** Test killed: ran too long (10m0s).
FAIL    github.com/google/trillian/integration  600.600s

MySQL 5.7 server installed using developer configuration.

@RJPercival
Contributor Author

After seeing a similar timeout occur after reducing the max number of connections to the MySQL server (in PR #722), I tried increasing the max_connections MySQL server parameter from the default of 151 to 500. This had no effect.

@RJPercival
Contributor Author

The cause of this was a bug in PR #734 that resulted in sequencing passes failing, but the integration test not reporting this as an error. Instead, it would just spin attempting pass after pass until the timeout was reached. The integration test should be made to abort if a terminal error occurs.
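
A minimal sketch of the abort-on-terminal-error behaviour suggested above. It is not the Trillian integration test code: `errTerminal`, `runPasses`, and `demoAbort` are hypothetical names, and how a real sequencing pass would mark an error as terminal is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errTerminal is a hypothetical sentinel for errors that retrying cannot fix.
var errTerminal = errors.New("terminal sequencing error")

// runPasses calls pass repeatedly until it reports done, the context
// expires, or pass returns an error wrapping errTerminal.
func runPasses(ctx context.Context, pass func(context.Context) (bool, error), interval time.Duration) error {
	for {
		done, err := pass(ctx)
		switch {
		case errors.Is(err, errTerminal):
			// Stop immediately instead of spinning until the test timeout.
			return fmt.Errorf("aborting: %w", err)
		case err != nil:
			// Treated as transient: report it and try another pass.
			fmt.Printf("sequencing pass failed, retrying: %v\n", err)
		case done:
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(interval):
		}
	}
}

// demoAbort simulates two transient failures followed by a terminal one
// and returns the number of passes attempted plus the resulting error.
func demoAbort() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	calls := 0
	err := runPasses(ctx, func(context.Context) (bool, error) {
		calls++
		if calls == 3 {
			return false, fmt.Errorf("pass %d: %w", calls, errTerminal)
		}
		return false, errors.New("transient failure")
	}, time.Millisecond)
	return calls, err
}

func main() {
	calls, err := demoAbort()
	fmt.Println(calls, err)
}
```

With a loop shaped like this, a broken sequencer would fail the test after the first terminal error rather than after the 10 minute timeout.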

RJPercival changed the title from "Integration tests timeout on Windows using local MySQL server" to "Integration tests timeout when sequencing passes fail" on Jul 12, 2017
@Martin2112
Contributor

Is this still an issue? I haven't seen this fail recently.

@RJPercival
Contributor Author

Yes, the LogOperationManager swallows sequencing errors and so the integration tests can't detect that there's a problem - they just run until the timeout. There's a relevant TODO here: https://github.com/google/trillian/blob/master/server/log_operation_manager.go#L378
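
To illustrate the difference between swallowing and surfacing errors, here is a rough sketch that collects per-log failures and returns them to the caller. It does not reproduce the real LogOperationManager: `sequenceAll`, `demoPropagate`, and the `op` callback are hypothetical, and `errors.Join` assumes Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// sequenceAll runs op for every log ID and returns all failures joined
// into a single error, rather than only logging them and carrying on.
// It returns nil when every operation succeeds.
func sequenceAll(logIDs []int64, op func(logID int64) error) error {
	var errs []error
	for _, id := range logIDs {
		if err := op(id); err != nil {
			errs = append(errs, fmt.Errorf("log %d: %w", id, err))
		}
	}
	return errors.Join(errs...)
}

// demoPropagate simulates three logs where only log 2 fails to sequence.
func demoPropagate() error {
	return sequenceAll([]int64{1, 2, 3}, func(id int64) error {
		if id == 2 {
			return errors.New("sequencing failed")
		}
		return nil
	})
}

func main() {
	fmt.Println(demoPropagate())
}
```

A caller such as a test harness could then inspect the returned error instead of having to infer failure from a lack of progress.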

@Martin2112
Contributor

The TODO seems to no longer be there. Not sure if this is fixed.

@RJPercival
Contributor Author

@Martin2112
Contributor

Martin2112 commented Jan 26, 2018 via email

@RJPercival
Contributor Author

RJPercival commented Jan 26, 2018

The TODO is now here:

// TODO(Martin2112): No mechanism for error reporting etc., this is OK for v1 but needs work

@Martin2112
Contributor

I added error metrics in #957. This doesn't help tests much, but you can see signer errors on production logs.

daviddrysdale changed the title from "Integration tests timeout when sequencing passes fail" to "Errors in sequencer aren't easily visible/exported anywhere" on Aug 14, 2018
RJPercival removed their assignment on Jan 20, 2020
@paulmattei
Contributor

Next step: The integration test should use the new metric
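
One way the test could consume such a metric is sketched below. `failureCounter` is a hypothetical stand-in for the exported error metric (the real metric type and name from #957 are not shown in this thread), and `waitForSequencing` and `demoMetric` are illustrative helpers, not existing Trillian functions.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// failureCounter is a hypothetical stand-in for an exported error metric.
type failureCounter struct{ n int64 }

func (c *failureCounter) Inc()         { atomic.AddInt64(&c.n, 1) }
func (c *failureCounter) Value() int64 { return atomic.LoadInt64(&c.n) }

// waitForSequencing polls done up to maxPolls times. It fails fast as soon
// as the failure counter rises above the value it had when waiting began.
func waitForSequencing(failures *failureCounter, done func() bool, maxPolls int, interval time.Duration) error {
	start := failures.Value()
	for i := 0; i < maxPolls; i++ {
		if n := failures.Value() - start; n > 0 {
			return fmt.Errorf("sequencer reported %d failed pass(es)", n)
		}
		if done() {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("gave up waiting for sequencing")
}

// demoMetric simulates a sequencer whose second pass fails and bumps the counter.
func demoMetric() error {
	failures := &failureCounter{}
	polls := 0
	done := func() bool {
		polls++
		if polls == 2 {
			failures.Inc() // simulated failed sequencing pass
		}
		return false
	}
	return waitForSequencing(failures, done, 10, time.Millisecond)
}

func main() {
	fmt.Println(demoMetric())
}
```

Comparing against the counter value at the start of the wait, rather than against zero, keeps the check valid when earlier tests in the same process have already recorded failures.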

pav-kv self-assigned this on Jan 22, 2021
@pav-kv
Contributor

pav-kv commented Jan 22, 2021

Blocked on #1640.

pav-kv removed their assignment on Mar 14, 2022