Spurious Android testing failure: connection reset by peer #38710

Closed
alexcrichton opened this issue Dec 30, 2016 · 0 comments · Fixed by #38883
Labels
A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) O-android Operating system: Android

Comments

@alexcrichton
Member

First seen at https://travis-ci.org/rust-lang/rust/jobs/187737907. I've also seen this from time to time in other places but have never tracked it down. Wanted to have an issue to track it, though!

Failures tend to look like:

```
---- [debuginfo-gdb] debuginfo-gdb/vec.rs stdout ----
	NOTE: compiletest thinks it is using GDB without native rust support
error: line not found in debugger output: $1 = {1, 2, 3}
status: exit code: 0
command: arm-linux-androideabi-gdb -quiet -batch -nx -command=/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/vec.debugger.script
stdout:
------------------------------------------
------------------------------------------
stderr:
------------------------------------------
/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/vec.debugger.script:4: Error in sourced command file:
Remote communication error.  Target disconnected.: Connection reset by peer.
------------------------------------------
thread '[debuginfo-gdb] debuginfo-gdb/vec.rs' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2465
```
@alexcrichton alexcrichton added O-android Operating system: Android A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) labels Dec 30, 2016
alexcrichton added a commit to alexcrichton/rust that referenced this issue Jan 6, 2017
Local testing showed that I was able to reproduce an error where debuginfo tests
on Android would fail with "connection reset by peer". Further investigation
showed that the gdb tests on Android involve a bit of process management:

* First an `adb forward` command is run so that the host's port 5039 is
  forwarded to port 5039 inside the emulator.
* Next an `adb shell` command is run to execute the `gdbserver` executable
  inside the emulator. This gdb server will bind to port 5039 and listen for
  remote gdb debugging sessions.
* Finally, we run `gdb` on the host (not in the emulator) and then connect to
  this gdb server to send it commands.
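
The three steps above can be sketched as plain command construction. This is
only an illustration, not the compiletest code: the helper names and the
device-side executable path are hypothetical, and the gdb flags are copied from
the failing command shown in the issue. Nothing is spawned here, so the sketch
runs without `adb` installed.

```rust
use std::process::Command;

// Port named in the commit message.
const GDB_PORT: u16 = 5039;

/// Step 1: forward the host's TCP port to the same port in the emulator.
fn forward_port() -> Command {
    let mut cmd = Command::new("adb");
    cmd.arg("forward")
        .arg(format!("tcp:{}", GDB_PORT))
        .arg(format!("tcp:{}", GDB_PORT));
    cmd
}

/// Step 2: start gdbserver inside the emulator.
/// `device_exe` is a hypothetical path to the test binary on the device.
fn start_gdbserver(device_exe: &str) -> Command {
    let mut cmd = Command::new("adb");
    cmd.arg("shell")
        .arg("gdbserver")
        .arg(format!(":{}", GDB_PORT))
        .arg(device_exe);
    cmd
}

/// Step 3: run gdb on the host with a command script. The script is assumed
/// to contain the command that connects to the forwarded port.
fn host_gdb(script: &str) -> Command {
    let mut cmd = Command::new("arm-linux-androideabi-gdb");
    cmd.args(&["-quiet", "-batch", "-nx"])
        .arg(format!("-command={}", script));
    cmd
}

fn main() {
    // Print the commands instead of running them.
    println!("{:?}", forward_port());
    println!("{:?}", start_gdbserver("/data/tmp/example-test"));
    println!("{:?}", host_gdb("vec.debugger.script"));
}
```

The ordering matters: step 3 is only safe once the process started in step 2
is actually listening, which is the part the old readiness check got wrong.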

The problem was happening when the host's gdb was failing to connect to the
remote gdbserver running inside the emulator. The previous readiness check was
that after `adb shell` executed we'd sleep for a second and then attempt to make
a TCP connection to port 5039. If successful we'd run gdb, and on failure we'd
sleep again.

It turns out, however, that as soon as we've executed `adb forward` all TCP
connections to 5039 will succeed. This means that we would only ever sleep for
at most one second, and if this wasn't enough time we'd just fail later because
we would assume that gdbserver had started but it may not have done so yet.
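
The failure mode described above can be reproduced in miniature without an
emulator. In this sketch a local listener stands in for the forwarded port: it
accepts every connection but has nothing behind it, so it just drops the
socket. This is a simplified model, not how `adb forward` is implemented, and
all function names are made up for illustration.

```rust
use std::io::{self, Read};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Stand-in for a forwarded port with no server behind it: every connection
/// is accepted and then immediately dropped. Returns the port it listens on.
fn spawn_forwarder_without_backend() -> io::Result<u16> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let port = listener.local_addr()?.port();
    thread::spawn(move || {
        for stream in listener.incoming() {
            drop(stream); // nothing to hand the connection to
        }
    });
    Ok(port)
}

/// The old-style readiness probe: "did a TCP connection succeed?"
fn probe_says_ready(port: u16) -> bool {
    TcpStream::connect(("127.0.0.1", port)).is_ok()
}

/// What a real client would notice: no data ever arrives on the connection.
fn session_is_usable(port: u16) -> bool {
    let mut stream = match TcpStream::connect(("127.0.0.1", port)) {
        Ok(s) => s,
        Err(_) => return false,
    };
    let _ = stream.set_read_timeout(Some(Duration::from_secs(2)));
    let mut buf = [0u8; 1];
    // Ok(0) means the peer closed; Err covers resets and timeouts.
    matches!(stream.read(&mut buf), Ok(n) if n > 0)
}

fn main() -> io::Result<()> {
    let port = spawn_forwarder_without_backend()?;
    println!("probe says ready: {}", probe_says_ready(port));
    println!("session usable:   {}", session_is_usable(port));
    Ok(())
}
```

The probe reports success even though no usable server exists, which is why a
successful connection says nothing about whether gdbserver has started.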

This commit fixes these issues by removing the TCP connection to test if
gdbserver is ready to go. Instead we read the stdout of the process and wait for
it to print that it's listening at which point we start running gdb. I've found
that locally at least I was unable to reproduce the failure after these changes.
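
A minimal sketch of the new readiness check, assuming the server announces
itself with a line containing some known marker. The marker string used below
is an assumption for illustration, the sample input is placeholder text rather
than real gdbserver output, and `wait_for_line` is a hypothetical helper, not
the function in compiletest. With a real child process the reader would be a
`BufReader` wrapped around the child's piped stdout.

```rust
use std::io::{self, BufRead};

/// Read lines until one contains `marker`.
/// Returns Ok(true) when the marker is seen, Ok(false) if the stream ends
/// first (for example because the process exited before it was ready).
fn wait_for_line<R: BufRead>(mut reader: R, marker: &str) -> io::Result<bool> {
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            return Ok(false); // EOF without ever seeing the marker
        }
        if line.contains(marker) {
            return Ok(true);
        }
    }
}

fn main() -> io::Result<()> {
    // Placeholder output standing in for the server's stdout.
    let fake_stdout = "starting up\nListening on port 5039\n".as_bytes();
    let ready = wait_for_line(fake_stdout, "Listening on port")?;
    println!("server ready: {}", ready);
    Ok(())
}
```

Blocking on the process's own output ties readiness to what the server has
actually done, rather than to a port that accepts connections regardless.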

Closes rust-lang#38710
bors added a commit that referenced this issue Jan 8, 2017
compiletest: Fix flaky Android gdb test runs

frewsxcv pushed a commit to frewsxcv/rust that referenced this issue Jan 9, 2017