roachtest: alterpk-tpcc-250 failed #48428
Hm, I expected to find a fatal OOM, but that's not the case. For some reason nodes 2 and 3 lost their connections to node 1, which is struggling tremendously. Node 2:
Node 3:
And here are the logs around 08:31:15 from node 1 (note that I copy-pasted a chunk, so the logs contain far fewer entries than usual):
I'm very confused why there are no runtime stats messages after 08:30:43 - those should be logged every 10 seconds. The next one appears only at 08:46:53 and then at 08:54:19. It appears to me that the cockroach process is somehow getting suspended(?). Or maybe it does hit an OOM condition, but the process is not killed and thrashes instead (not sure if that is possible). That seems somewhat unlikely because the machine has 14GiB of RAM, and the last logged stats show 12GiB RSS. I don't see anything suspicious in the logs from node 1 before this badness started happening. I'm thinking that it could have been an infra flake and want to see another occurrence of the problem before digging in further. cc @jordanlewis @asubiotto |
Possible that this is just due to the overload, but it could indicate that the disks had issues. The runtime stats disappearing for 8 minutes is suspicious, especially since other logs do make it through in the meantime (if the disks were stalled, you'd get a bunch of log messages all at once at some point). The overload/issue must have blocked something in the loop that periodically prints the runtime. We had such problems before, though we addressed them: f53c14a#diff-09856fe9becddf0199651f451409356aR1957 There might be something else that is blocking. It will be hard to find out now. |
The rocksdb log also shows a giant gap (usually there's something every minute):
I agree that this might be an infra flake. |
(roachtest).alterpk-tpcc-250 failed on master@683f0d561bf9b73902edb27d681bca5333bdad60:
|
@tbg thanks for looking into the failure. The new failure is a fatal OOM:
I'm confused though why it would occur at this point - the machine has 14GiB RAM, so there should be 1-2GiB available. We turned the vectorized engine
I'll investigate whether we're missing some memory accounting in the vectorized engine. |
Another interesting question here is why the memory usage differs between the two engines. Here is the peak of the usage I observed when running
and with
I'm guessing that with the rowexec hash aggregator we now eagerly delete buckets from the hash table, which allows for faster GC, and the "true" peak usage with |
(roachtest).alterpk-tpcc-250 failed on master@f89c93fa20887f8d269149d8bc573ba8e00c99e2:
|
(roachtest).alterpk-tpcc-250 failed on master@1d6db7e386e3ab21bca53f38637014f61153dc90:
|
(roachtest).alterpk-tpcc-250 failed on master@c1abb272c94a437f1df9175fc30dc6a6729d3338:
|
(roachtest).alterpk-tpcc-250 failed on master@e6b47088aba9c9978501b966a0e88aeb273d9990:
|
48511: colexec: some memory accounting and performance optimizations r=yuzefovich a=yuzefovich

**colexec: account for aggregate functions structs** Previously, we were not accounting for the memory used by aggregate function structs. This behavior is acceptable in the case of the ordered aggregator (because we will only create a constant number of such structs), but it's no bueno for the hash aggregator - we will be creating a separate struct for each group. This is now fixed by performing the memory accounting upon creation of these aggregate function structs. This commit also adds accounting for `hashAggFuncs` structs in the hash aggregator. However, this commit does not address the missing memory accounting of the Golang `map` that we use for mapping a hash code to all aggregate builtins (i.e. "groups") that correspond to that hash code. This is left as a TODO, but we need to address it for 20.2. A note here is that we might be replacing the usage of this `map` with our vectorized hash table, so it'll probably make sense to wait for that. Addresses: #48428. Release note: None

**colexec: introduce Flush method into aggregateFunc interface** I was looking at a cpu profile of memory-intensive hash aggregation, and it was saying that we spend a noticeable amount of time checking whether an aggregation function is `done`. This can be avoided if we refactor the interface slightly, which is what this commit does. Previously, we used the `Compute` method both to compute the aggregation on a non-empty batch and to flush the result of the aggregation of the last group on an empty batch. These two purposes are now split into two functions: the former is done by the same `Compute` function and the latter is done by the newly-introduced `Flush` function, which must be called to "flush" the result for the last group. Release note: None

**colexec: generate different structs for two COUNT variants** In one profile I saw noticeable time spent on an `if` condition that determines whether we have the `count_rows` aggregate or not. This commit refactors the template to generate two different structs, which allows us to remove that check. Release note: None

**typeconv: change map usage to family switch** In a cpu profile I noticed that some time was spent in accesses to the map when instantiating aggregate funcs. This can be avoided by changing the usage of the map to a switch on a type family, and the function should be inlined. Release note: None

48548: colexec: clean up materializer a bit r=yuzefovich a=yuzefovich

While working on cfetcher + materializer instead of table reader, I noticed something weird - that we get `OutputTypes` in the materializer and use those when converting physical vectors to datums. I thought this was a bug because `OutputTypes` returns the type schema after the post-processing stage, and the internal type schema of a processor could be different. It turns out we simply always pass an empty `PostProcessSpec` into the materializer because we expect the input operator to handle the post-processing itself. This commit cleans it up by removing that unused argument and the call to `ProcessRowHelper`. Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
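The `Compute`/`Flush` split described in the commit message above can be sketched as follows. The names mirror the commit's description but are illustrative, not the real colexec interface: `Compute` only ever processes input, and `Flush` returns the final group's result, removing the per-batch check for an empty "flush" batch and a `done` flag.

```go
package main

import "fmt"

// aggregateFunc is an illustrative version of the refactored interface:
// Compute consumes input values; Flush returns the result for the last
// group. In the older design, Compute had to detect an empty batch and
// branch on a "done" check on every call.
type aggregateFunc interface {
	Compute(vals []int64)
	Flush() int64
}

// sumAgg is a toy SUM aggregate implementing the interface.
type sumAgg struct{ sum int64 }

func (a *sumAgg) Compute(vals []int64) {
	for _, v := range vals {
		a.sum += v
	}
}

func (a *sumAgg) Flush() int64 { return a.sum }

func main() {
	var agg aggregateFunc = &sumAgg{}
	agg.Compute([]int64{1, 2, 3})
	agg.Compute([]int64{4})
	fmt.Println(agg.Flush()) // 1+2+3+4 = 10
}
```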
(roachtest).alterpk-tpcc-250 failed on master@752c32580ff452bb0039e5af0c00044b91bdcb12:
|
(roachtest).alterpk-tpcc-250 failed on master@f66ff29e25bc546ee239867e316485528e11e3dc:
|
(roachtest).alterpk-tpcc-250 failed on master@5156843cc23adecb6a70fabf19f51f46de1241ec:
|
(roachtest).alterpk-tpcc-250 failed on master@2cbf620cd229a55622fe17fb15d20ada1dbcccd3:
|
(roachtest).alterpk-tpcc-250 failed on master@7ab7fb86f0634df1f5a0b04460e1f3a1d6bead1f:
|
48831: opt: implement ON DELETE SET NULL/DEFAULT r=RaduBerinde a=RaduBerinde

This implements the ON DELETE SET NULL and SET DEFAULT actions. These actions trigger an update in the child table whenever a row is deleted from the parent. Release note: None

49102: colexec: aggregator accounting fix and performance optimizations r=yuzefovich a=yuzefovich

**colexec: fix wrong usage of pointer when getting the size of a struct** I mistakenly used `unsafe.Sizeof(&aggFuncStruct{})` when calculating the size of the struct, and as a result, the sizes were all 8 bytes, but they should have been higher. This fixes the discrepancy I saw between the heap profiles and the usage reported to the bytes monitor. Addresses: #48428. Release note: None

**colexec: put aggregate funcs in a separate file** This is just code movement and some file renaming. Release note: None

**colexec: introduce aggregate function allocator** This commit introduces an aggregate-function allocator object that supports an arbitrary function schema, as well as a single aggregate function alloc that pools allocations of the same statically-typed aggregate function. All of these objects support an arbitrary allocation size, so the ordered aggregator sets it to 1 whereas the hash aggregator sets it to 64. Release note: None

**colexec: pool allocations of slices of pointers in hash aggregator** This gives modest (5% or so) improvements on microbenchmarks when group sizes are small. Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
I looked into the logs of three latest failures, and for all of them the allocation request that tipped the node over came from the storage engine (AFAICT). Three days ago:
Two days ago:
One day ago:
Note that the machine has 14GB of RAM, so both Go and CGo usages seem to be exceeding their limits. I'm assuming that the former is due to some missing memory accounting in Pebble as well as the missing memory accounting of the Golang `map` in the hash aggregator. What makes these crashes interesting is that in the past I've seen that Go's allocations do not increase this significantly at once due to the SQL engine - we see that one day ago there was still about 2GB of RAM available. This makes me think that maybe Pebble's allocations should be investigated. I'll ask Storage friends. |
Four days ago the killing request allocation came from the vectorized engine:
Five days ago the logs abruptly stop at
8 days ago the logs also abruptly stopped at
|
Have you been looking at heap profiles? You can run this test with |
(roachtest).alterpk-tpcc-250 failed on master@73631089c58554c289d0fb1b26cad62667388335:
|
(roachtest).alterpk-tpcc-250 failed on master@8fec3f4c6d136a86f472c975edd36b75e5ab9a8c:
|
(roachtest).alterpk-tpcc-250 failed on master@1520ad2ba7c926f8043de8b6e044ab35c2f67b13:
|
(roachtest).alterpk-tpcc-250 failed on master@809829bfe7ff27a610fa78f409fae658b9f2d9d9:
|
I think that |
Here is the commit message that will close this issue:
|
(roachtest).alterpk-tpcc-250 failed on master@425eaa8fb05fc32b2c42827b85338daa52f4177c:
Artifacts: /alterpk-tpcc-250
See this test on roachdash