roachtest: restore2TB/nodes=32 failed #31654

Closed
cockroach-teamcity opened this issue Oct 19, 2018 · 4 comments

Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/04cba2800919bdcf6a8467e8da97ae44b77a9626

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=restore2TB/nodes=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=974812&tab=buildLog

The test failed on master:
	test.go:1002: test timed out (3h0m0s)
	test.go:606,cluster.go:1110,restore.go:244,cluster.go:1432,errgroup.go:58: /home/agent/work/.go/bin/roachprod run teamcity-974812-restore2tb-nodes-32:1 -- ./cockroach sql --insecure -e "
						RESTORE csv.bank FROM
						'gs://cockroach-fixtures/workload/bank/version=1.0.0,payload-bytes=10240,ranges=0,rows=65104166,seed=1/bank'
						WITH into_db = 'restore2tb'" returned:
		stderr:
		
		stdout:
		: signal: killed
	test.go:606,cluster.go:1453,restore.go:250: context canceled

cockroach-teamcity added this to the 2.2 milestone on Oct 19, 2018
cockroach-teamcity added the C-test-failure and O-robot labels on Oct 19, 2018
@benesch
Contributor

benesch commented Oct 19, 2018

This is almost certainly #31618.

benesch closed this as completed on Oct 19, 2018
benesch reopened this on Oct 19, 2018
@benesch
Contributor

benesch commented Oct 19, 2018

I'll let @tschottdorf close it when the etcd/raft fix lands.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/3035b84a682e61fb1cd34db4027dd41f7f2f226a

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=restore2TB/nodes=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=977057&tab=buildLog

The test failed on master:
	test.go:1037: test timed out (3h0m0s)
	test.go:639,cluster.go:1110,restore.go:244,cluster.go:1432,errgroup.go:58: /home/agent/work/.go/bin/roachprod run teamcity-977057-restore2tb-nodes-32:1 -- ./cockroach sql --insecure -e "
						RESTORE csv.bank FROM
						'gs://cockroach-fixtures/workload/bank/version=1.0.0,payload-bytes=10240,ranges=0,rows=65104166,seed=1/bank'
						WITH into_db = 'restore2tb'" returned:
		stderr:
		
		stdout:
		: signal: killed
	test.go:639,cluster.go:1453,restore.go:250: context canceled

tbg added a commit to tbg/cockroach that referenced this issue Oct 22, 2018
The tracking of the uncommitted portion of the log had a bug: it did not
release everything it should have. As a result, over time, all proposals
would be dropped. We hit this much earlier in our import tests, which
submit large proposals. As an intentional implementation detail, a
proposal that by itself exceeds the max uncommitted log size is allowed
only if the uncommitted log is empty. Due to the leak, the tracked size
never returned to zero, so we never hit this case and AddSSTable
commands were often dropped indefinitely.

Fixes cockroachdb#31184.
Fixes cockroachdb#28693.
Fixes cockroachdb#31642.

Optimistically:
Fixes cockroachdb#31675.
Fixes cockroachdb#31654.
Fixes cockroachdb#31446.

Release note: None
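
The failure mode described in the commit message above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the etcd/raft code: the `tracker` type, its fields, the 100-byte limit, and the fixed per-entry `leak` are all hypothetical, and the thread does not say how the real leak arose. The model shows the two effects the message names: an oversized proposal is admitted only when the tracked uncommitted size is zero, and a leak that keeps the counter above zero eventually causes every proposal to be dropped.

```go
package main

import "fmt"

// tracker is a hypothetical stand-in for the leader's bookkeeping of
// uncommitted log bytes. It is NOT the etcd/raft implementation.
type tracker struct {
	max         uint64 // limit on tracked uncommitted bytes (assumed value)
	uncommitted uint64 // bytes currently counted as uncommitted
	leak        uint64 // bytes wrongly kept per committed entry (models the bug)
}

// propose admits an entry of the given size or reports that it is dropped.
// An entry that would push the counter past max is admitted only when
// nothing is currently counted as uncommitted.
func (t *tracker) propose(size uint64) bool {
	if t.uncommitted > 0 && t.uncommitted+size > t.max {
		return false
	}
	t.uncommitted += size
	return true
}

// commit releases an entry's bytes, minus the modelled leak.
func (t *tracker) commit(size uint64) {
	var released uint64
	if size > t.leak {
		released = size - t.leak
	}
	if released > t.uncommitted {
		released = t.uncommitted
	}
	t.uncommitted -= released
}

// largeAdmittedAfterCommit commits one small entry, then proposes an entry
// larger than max and reports whether it was admitted.
func largeAdmittedAfterCommit(leak uint64) bool {
	t := &tracker{max: 100, leak: leak}
	t.propose(10)
	t.commit(10)
	return t.propose(500)
}

// smallAdmittedBeforeStall proposes and commits 10-byte entries one at a
// time and returns how many were admitted before the first drop.
func smallAdmittedBeforeStall(leak uint64, rounds int) int {
	t := &tracker{max: 100, leak: leak}
	admitted := 0
	for i := 0; i < rounds; i++ {
		if !t.propose(10) {
			break
		}
		t.commit(10)
		admitted++
	}
	return admitted
}

func main() {
	fmt.Println("large proposal admitted, no leak:", largeAdmittedAfterCommit(0))
	fmt.Println("large proposal admitted, 1-byte leak:", largeAdmittedAfterCommit(1))
	fmt.Println("small proposals admitted before stall, 1-byte leak:",
		smallAdmittedBeforeStall(1, 1000))
}
```

With no leak the oversized proposal goes through once the log has drained. With a one-byte leak the counter never returns to zero, so the oversized proposal is rejected on every retry, and in this model small proposals stall as well after 91 rounds.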
craig bot pushed a commit that referenced this issue Oct 22, 2018
31554: exec: initial commit of execgen tool r=solongordon a=solongordon

Execgen will be our tool for generating templated code necessary for
columnarized execution. So far it only generates the
EncDatumRowsToColVec function, which is used by the columnarizer to
convert a RowSource into a columnarized Operator.

Release note: None

31610: sql: fix pg_catalog.pg_constraint's confkey column r=BramGruneir a=BramGruneir

Prior to this patch, all columns in the index were included instead of only the
ones being used in the foreign key reference.

Fixes #31545.

Release note (bug fix): Fix pg_catalog.pg_constraint's confkey column from
including columns that were not involved in the foreign key reference.

31689: storage: pick up fix for Raft uncommitted entry size tracking r=benesch a=tschottdorf

Waiting for the upstream PR

etcd-io/etcd#10199

to merge, but this is going to be what the result will look like.

----

The tracking of the uncommitted portion of the log had a bug: it did not
release everything it should have. As a result, over time, all proposals
would be dropped. We hit this much earlier in our import tests, which
submit large proposals. As an intentional implementation detail, a
proposal that by itself exceeds the max uncommitted log size is allowed
only if the uncommitted log is empty. Due to the leak, the tracked size
never returned to zero, so we never hit this case and AddSSTable
commands were often dropped indefinitely.

Fixes #31184.
Fixes #28693.
Fixes #31642.

Optimistically:
Fixes #31675.
Fixes #31654.
Fixes #31446.

Release note: None

Co-authored-by: Solon Gordon
Co-authored-by: Bram Gruneir
Co-authored-by: Tobias Schottdorf
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/2998190f18fab952357133aaca9fdda8bc52d5ac

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=restore2TB/nodes=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=978508&tab=buildLog

The test failed on master:
	test.go:1037: test timed out (3h0m0s)
	test.go:639,cluster.go:1110,restore.go:244,cluster.go:1432,errgroup.go:58: /home/agent/work/.go/bin/roachprod run teamcity-978508-restore2tb-nodes-32:1 -- ./cockroach sql --insecure -e "
						RESTORE csv.bank FROM
						'gs://cockroach-fixtures/workload/bank/version=1.0.0,payload-bytes=10240,ranges=0,rows=65104166,seed=1/bank'
						WITH into_db = 'restore2tb'" returned:
		stderr:
		
		stdout:
		: signal: killed
	test.go:639,cluster.go:1453,restore.go:250: context canceled

craig bot closed this as completed in #31689 on Oct 22, 2018
4 participants