changefeedccl: fail changefeed when server.child_metrics.enabled cluster setting is false and metrics label config used #75682
Labels
A-cdc
Change Data Capture
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
E-quick-win
Likely to be a quick win for someone experienced.
T-cdc
Comments
amruss added the C-enhancement, A-cdc, and T-cdc labels on Jan 29, 2022
cc @cockroachdb/cdc
We decided instead to do a warning, since we want users to be able to toggle this cluster setting to change whether they are receiving the scopes.
@samiskin -- close this issue? Or is it being worked on?
samiskin added a commit to samiskin/cockroach that referenced this issue on Jan 9, 2023
Resolves cockroachdb#75682

Surfaces a notice of
```
server.child_metrics.enabled is set to false, metrics will only be published to the '<scope>' label when it is set to true"
```
When child_metrics setting isn't enabled during changefeed creation

Release note (enterprise change): Changefeeds created/altered with a metrics_label set while server.child_metrics.enabled is false will now provide the user a notice upon creation.

<what was there before: Previously, ...>
<why it needed to change: This was inadequate because ...>
<what you did about it: To address this, this patch ...>
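The behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration only, not the actual changefeedccl code: the function name `metricsLabelNotice` and its parameters are hypothetical, and the real patch reads the cluster setting and surfaces the notice through CockroachDB's own machinery. The sketch just shows the decision: a notice is produced only when a metrics label is set while `server.child_metrics.enabled` is false, and changefeed creation is not failed.

```go
package main

import "fmt"

// metricsLabelNotice is a hypothetical helper for illustration; it is not
// part of CockroachDB. It returns the notice text and true when a
// metrics_label was supplied while server.child_metrics.enabled is false,
// and ("", false) otherwise. The caller is expected to surface the notice
// and still create the changefeed.
func metricsLabelNotice(childMetricsEnabled bool, metricsLabel string) (string, bool) {
	// No label requested, or child metrics already enabled: nothing to report.
	if metricsLabel == "" || childMetricsEnabled {
		return "", false
	}
	return fmt.Sprintf(
		"server.child_metrics.enabled is set to false, metrics will only be published to the '%s' label when it is set to true",
		metricsLabel,
	), true
}
```

Returning a notice rather than an error matches the decision in the comments above: users can toggle the cluster setting later without recreating the changefeed.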
craig bot pushed a commit that referenced this issue on Jan 17, 2023
94239: loqrecovery: use captured meta range content for LOQ plans r=erikgrinaker a=aliher1911

Note: only last commit belongs to this PR. Will update description once #93157 lands.

Previously, the loss of quorum recovery planner used local replica info collected from all nodes to find the source of truth for replicas that lost quorum. With the online approach, local info snapshots don't have atomicity. This could cause the planner to fail if available replicas are caught in different states on different nodes.

This commit adds an alternative planning approach for when range metadata is available. Instead of fixing individual replicas that can't make progress, it finds ranges that can't make progress from metadata using descriptors and updates their replicas to recover from loss of quorum.

This commit also adds a replica collection stage as part of the make-plan command itself. To invoke collection from a cluster instead of using files, one needs to provide --host and other standard cluster connection related flags (--cert-dir, --insecure etc.) as appropriate.

Example command output for a local cluster with 3 out of 5 nodes surviving looks like:
```
~/tmp ❯❯❯ cockroach debug recover make-plan --insecure --host 127.0.0.1:26257 >recover-plan.json
Nodes scanned: 3
Total replicas analyzed: 228
Ranges without quorum: 15
Discarded live replicas: 0
Proposed changes:
range r4:/System/tsd updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
range r80:/Table/106/1 updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):3,(n4,s4):2].
range r87:/Table/106/1/"paris"/"\xcc\xcc\xcc\xcc\xcc\xcc@\x00\x80\x00\x00\x00\x00\x00\x00(" updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):3,(n4,s4):2].
range r88:/Table/106/1/"seattle"/"ffffffH\x00\x80\x00\x00\x00\x00\x00\x00\x14" updating replica (n3,s3):3 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
range r105:/Table/106/1/"washington dc"/"L\xcc\xcc\xcc\xcc\xccL\x00\x80\x00\x00\x00\x00\x00\x00\x0f" updating replica (n3,s3):3 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):1,(n4,s4):2].
range r98:/Table/107/1/"boston"/"333333D\x00\x80\x00\x00\x00\x00\x00\x00\x03" updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
range r95:/Table/107/1/"seattle"/"ffffffH\x00\x80\x00\x00\x00\x00\x00\x00\x06" updating replica (n3,s3):2 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n4,s4):4,(n5,s5):3].
range r125:/Table/107/1/"washington dc"/"DDDDDDD\x00\x80\x00\x00\x00\x00\x00\x00\x04" updating replica (n3,s3):2 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):1,(n5,s5):3].
range r115:/Table/108/1/"boston"/"8Q\xeb\x85\x1e\xb8B\x00\x80\x00\x00\x00\x00\x00\x00n" updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
range r104:/Table/108/1/"new york"/"\x1c(\xf5\u008f\\I\x00\x80\x00\x00\x00\x00\x00\x007" updating replica (n2,s2):2 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):3].
range r102:/Table/108/1/"seattle"/"p\xa3\xd7\n=pD\x00\x80\x00\x00\x00\x00\x00\x00\xdc" updating replica (n3,s3):2 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n4,s4):4,(n5,s5):3].
range r126:/Table/108/1/"washington dc"/"Tz\xe1G\xae\x14L\x00\x80\x00\x00\x00\x00\x00\x00\xa5" updating replica (n3,s3):2 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):1,(n5,s5):3].
range r86:/Table/108/3 updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):3,(n5,s5):2].
range r59:/Table/109/1 updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
range r65:/Table/111/1 updating replica (n3,s3):3 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
Discovered dead nodes would be marked as decommissioned: n4, n5
Proceed with plan creation [y/N] y
Plan created.
To stage recovery application in half-online mode invoke:
'cockroach debug recover apply-plan --host=127.0.0.1:26257 --insecure=true <plan file>'
Alternatively distribute plan to below nodes and invoke 'debug recover apply-plan --store=<store-dir> <plan file>' on:
- node n2, store(s) s2
- node n1, store(s) s1
- node n3, store(s) s3
```

Release note: None

Fixes: #93038
Fixes: #93046

94948: changefeedccl: give notice when metrics_label set without child_metrics r=samiskin a=samiskin

Resolves #75682

Surfaces a notice of
```
server.child_metrics.enabled is set to false, metrics will only be published to the '<scope>' label when it is set to true"
```
When child_metrics setting isn't enabled during changefeed creation

Release note (enterprise change): Changefeeds created/altered with a metrics_label set while server.child_metrics.enabled is false will now provide the user a notice upon creation.

95009: tree: fix panic when encoding tuple r=rafiss a=rafiss

fixes #95008

This adds a bounds check to avoid a panic.

Release note (bug fix): Fixed a crash that could happen when formatting a tuple with an unknown type.

95294: sql: make pg_description aware of builtin function descriptions r=rafiss,msirek a=knz

Epic: CRDB-23454
Fixes #95292.
Needed for #88061.
First commit from #95289.

This also extends the completion rules to properly handle functions in multiple namespaces.

Release note (bug fix): `pg_catalog.pg_description` and `pg_catalog.obj_description()` are now able to retrieve the descriptive help for built-in functions.

95356: server: remove unused migrationExecutor r=ajwerner a=ajwerner

This is no longer referenced since #91627.

Epic: none
Release note: None

Co-authored-by: Oleg Afanasyev <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
See: https://cockroachlabs.atlassian.net/wiki/spaces/CDC/pages/2398552506/22.1+Metrics+Labels+Acceptance+Testing for reproducibility
When creating a changefeed using the CREATE CHANGEFEED ... WITH metrics_label=X configuration, the user must set the cluster setting server.child_metrics.enabled=true in order for the feature to work. If they do not set this cluster setting, we still allow them to create the changefeed, but the metrics label feature is silently not applied. We should instead fail the changefeed creation when this happens, with an error message similar to the one returned when COCKROACH_EXPERIMENTAL_ENABLE_PER_CHANGEFEED_METRICS=true is not set.
Jira issue: CRDB-12779
Epic CRDB-13931