-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roach{prod,test}: build roach{prod,test}
from the SHA being tested
#51897
Comments
I'm thumbs-up on making this change, though we need to be quite careful in doing so. We haven't historically backported changes/fixes to roachtests to release branches. So we'll need to make sure to backport current master roachtests to the releases we're still making patch releases for (19.1, 19.2, and 20.1). |
Fixes cockroachdb#51965 (and all referencing issues). Roachprod clusters running v20.1+ crdb nodes persist this `cluster-bootstrapped` file on disk after explicitly bootstrapping the cluster. Roachprod then uses the existence of this file to avoid doubly bootstrapping the cluster. Given cockroachdb#51897 remains unresolved, master-built roachprod is used to run roachtests against the 20.1 branch. Some of those roachtests test mixed-version clusters that start off at 19.2. Consequently, we manually add this file where roachprod expects to find it for already-initialized clusters. (This is a pretty gross hack, that we should address by addressing cockroachdb#51897.) Release note: None
Fixes cockroachdb#51965 (and all referencing issues). Roachprod clusters running v20.1+ crdb nodes persist this `cluster-bootstrapped` file on disk after explicitly bootstrapping the cluster. Roachprod then uses the existence of this file to avoid doubly bootstrapping the cluster. Given cockroachdb#51897 remains unresolved, master-built roachprod is used to run roachtests against the 20.1 branch. Some of those roachtests test mixed-version clusters that start off at 19.2. Consequently, we manually add this file where roachprod expects to find it for already-initialized clusters. (This is a pretty gross hack, that we should address by addressing cockroachdb#51897.) Release note: None
52040: roachtest: fix release-20.1 roachtests failing due to double-init r=RaduBerinde a=irfansharif Fixes #51965 (and all referencing issues). Roachprod clusters running v20.1+ crdb nodes persist this `cluster-bootstrapped` file on disk after explicitly bootstrapping the cluster. Roachprod then uses the existence of this file to avoid doubly bootstrapping the cluster. Given #51897 remains unresolved, master-built roachprod is used to run roachtests against the 20.1 branch. Some of those roachtests test mixed-version clusters that start off at 19.2. Consequently, we manually add this file where roachprod expects to find it for already-initialized clusters. (This is a pretty gross hack, that we should address by addressing #51897.) Release note: None Co-authored-by: irfan sharif <[email protected]>
52015: sql: use a structured error to detect roachpb.BatchTimestampBeforeGCE… r=ajwerner a=ajwerner …rror Fixes #50198. See the issue for more details. This issue is so obscure it does not deserve a release note. Release note: None 52040: roachtest: fix release-20.1 roachtests failing due to double-init r=irfansharif a=irfansharif Fixes #51965 (and all referencing issues). Roachprod clusters running v20.1+ crdb nodes persist this `cluster-bootstrapped` file on disk after explicitly bootstrapping the cluster. Roachprod then uses the existence of this file to avoid doubly bootstrapping the cluster. Given #51897 remains unresolved, master-built roachprod is used to run roachtests against the 20.1 branch. Some of those roachtests test mixed-version clusters that start off at 19.2. Consequently, we manually add this file where roachprod expects to find it for already-initialized clusters. (This is a pretty gross hack, that we should address by addressing #51897.) Release note: None Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: irfan sharif <[email protected]>
Fixes cockroachdb#52181. This test doesn't make use of all the `roachtest start` smartness to figure out what sub-command to use and instead constructs the raw start command by hand. Given our support policy, let's just bump the minimum version this test expects to run. This feels like another instance where cockroachdb#51897 would be nice to have. Release note: None
51959: backupccl: distribute RESTORE work using DistSQL r=dt,yuzefovich a=pbardea This commit replaces restore's old method of coordinating work on a single node to instead use DistSQL. It creates a 2 stage DistSQL flow. The first stage assigns chunks to SplitAndScatter processors which forward spans that they scattered to the second stage made up of the RestoreData processors which ingest the data. The SplitAndScatter processors attempt to send the work to the RestoreData colocated with the leaseholder of the span after the scattering. Release note: None. 52269: roachtest: correctly start crdb in acceptance/rapid_restart r=andreimatei a=andreimatei --start-single-node was needed. Without it, I think the test's killing of a node raced with that node dieing by itself, and sometimes the race resulted in `cockroach stop` first seeing the process but then `/bin/bash: line 8: kill: (16016) - No such proces` Fixes #52060 Release note: None 52273: roachtest: remove confusing "--- PASS" log lines r=andreimatei a=andreimatei Two workloads were printing PASS/FAIL lines from inside the predicates passed to Searcher.Search. This is really confusing when reading the test's output, because they don't correspond to the test's disposition. Release note: None 52343: roachtest: reflake (skip) disk-stalled test for release-19.1 r=irfansharif a=irfansharif Fixes #52181. This test doesn't make use of all the `roachtest start` smartness to figure out what sub-command to use and instead constructs the raw start command by hand. Given our support policy, let's just bump the minimum version this test expects to run. This feels like another instance where #51897 would be nice to have. Release note: None Co-authored-by: Paul Bardea <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: irfan sharif <[email protected]>
We should do the same for |
Slack thread about this subject. @adityamaru is taking on this issue. Targeting making these changes post 20.2.0 being cut. |
This is in preparation for the work being done in cockroachdb#51897 Release note: None
This reverts commit 6b11918. Fixes cockroachdb#55954. Fixes cockroachdb#55953. Fixes cockroachdb#55952. Fixes cockroachdb#55951. Fixes cockroachdb#55949. In cockroachdb#30624, we split various column families up in the TPC-C workload. This had the unintended consequence of breaking CDC roachtests, because CDC does not support multiple column families per row: ``` CHANGEFEEDs are currently supported on tables with exactly 1 column family: warehouse has 2 ``` This PR reverts that commit. I intend to come back to this and introduce a `--families` flag to TPC-C (like we have to YCSB) that can be used to selectively toggle column families on or off. However, I'll do this after cockroachdb#51897 lands, because it's going to be more effort than it's worth to conditionally set this flag based on whether the version of workload used in the test supports it or not.
55983: Revert "tpcc: use multiple column families to avoid contention" r=nvanbenschoten a=nvanbenschoten This reverts commit 6b11918. Fixes #55954. Fixes #55953. Fixes #55952. Fixes #55951. Fixes #55949. In #30624, we split various column families up in the TPC-C workload. This had the unintended consequence of breaking CDC roachtests, because CDC does not support multiple column families per row: ``` CHANGEFEEDs are currently supported on tables with exactly 1 column family: warehouse has 2 ``` This PR reverts that commit. I intend to come back to this and introduce a `--families` flag to TPC-C (like we have to YCSB) that can be used to selectively toggle column families on or off. However, I'll do this after #51897 lands, because it's going to be more effort than it's worth to conditionally set this flag based on whether the version of workload used in the test supports it or not. Co-authored-by: Nathan VanBenschoten <[email protected]>
Informs: cockroachdb#51897 Release note: None
@adityamaru, going to close this out now. Thanks for diligently chewing through this! |
In our CI,
roach{prod,test}
is built from HEAD and run against all our major release branches. It's also run from HEAD when testing out our alpha releases. #51535 (comment) is an example of what can go wrong with this approach.This "temporary mismatch" between roachprod code and CRDB code, where roachprod expects to find certain commits as part of the 20.2 release and does not find them because of when the alpha was cut doesn't seem like something roachprod should be taught to handle. If anything, we should start considering running roachprod from the SHA being tested. Asking around, we've been running it from a version built off of HEAD because
roach{prod,test}
used to change quite a lot, which I don't think is any longer the case. In fact, making roachprod compatible across multiple versions is increasingly cumbersome and fragile, not to mention largely untested.If we don't run our unit tests from HEAD, we probably shouldn't do the same for our integration tests. Admittedly doing so forces us to think about compatibility issues, but I'm not sure that's enough of a reason to do such a thing.
The text was updated successfully, but these errors were encountered: