Backport of snapshot: some improvements to the snapshot process into release/1.14.x #17276

hc-github-team-consul-core

Backport

This PR is auto-generated from #17236 to be assessed for backporting due to the inclusion of the label backport/1.14.

The below text is copied from the body of the original PR.


Description

snapshot: some improvements to the snapshot process

  • Add raft to the snapshot logger name, because these log messages come from the raft package. This makes them consistent with other snapshot logs, such as starting snapshot up to, which also come from the raft package, and it tells users which logger to search for.
// before
2023-05-08T10:10:19.954-0400 [INFO]  agent.server.raft: starting snapshot up to: index=30
2023-05-08T10:10:19.954-0400 [INFO]  agent.server.snapshot: creating new snapshot: path=/tmp/dc-2-consul-server-1/raft/snapshots/3-30-1683555019954.tmp
2023-05-08T10:10:19.967-0400 [INFO]  agent.server.raft: snapshot complete up to: index=30

// after
2023-05-08T10:04:20.361-0400 [INFO]  agent.server.raft: starting snapshot up to: index=25
2023-05-08T10:04:20.361-0400 [INFO]  agent.server.raft.snapshot: creating new snapshot: path=/tmp/dc-2-consul-server-1/raft/snapshots/2-25-1683554660361.tmp
2023-05-08T10:04:20.382-0400 [INFO]  agent.server.raft: snapshot complete up to: index=25
2023-05-08T10:04:20.382-0400 [INFO]  agent.server: creating temporary file of snapshot: path=/var/folders/c0/0_4qftyd47g8bkq_4_5dpw4m0000gn/T/snapshot3125735996
  • Print out the path of the temp snapshot file for troubleshooting

  • Update the doc a bit
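
As an illustration of why the logger rename helps when searching logs, here is a minimal Go sketch that pulls the logger name out of lines shaped like the before/after output above and keeps only those emitted under `agent.server.raft`. The helper names `loggerName` and `underLogger` are hypothetical and are not part of Consul; the line format is assumed from the samples in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// loggerName extracts the logger name from a line shaped like
// "<timestamp> [LEVEL]  <logger.name>: <message>".
// It returns "" when the line does not match that shape.
// Hypothetical helper for illustration only.
func loggerName(line string) string {
	// Drop everything up to and including the "[LEVEL]" field.
	_, rest, ok := strings.Cut(line, "] ")
	if !ok {
		return ""
	}
	// The logger name ends at the first ": " separator.
	name, _, ok := strings.Cut(strings.TrimSpace(rest), ": ")
	if !ok {
		return ""
	}
	return name
}

// underLogger reports whether name is parent itself or one of its sub-loggers.
func underLogger(name, parent string) bool {
	return name == parent || strings.HasPrefix(name, parent+".")
}

func main() {
	lines := []string{
		"2023-05-08T10:04:20.361-0400 [INFO]  agent.server.raft: starting snapshot up to: index=25",
		"2023-05-08T10:04:20.361-0400 [INFO]  agent.server.raft.snapshot: creating new snapshot: path=/tmp/dc-2-consul-server-1/raft/snapshots/2-25-1683554660361.tmp",
		"2023-05-08T10:04:20.382-0400 [INFO]  agent.server: creating temporary file of snapshot: path=/var/folders/c0/0_4qftyd47g8bkq_4_5dpw4m0000gn/T/snapshot3125735996",
	}
	// Print only the lines logged under the raft logger or its sub-loggers.
	for _, l := range lines {
		if underLogger(loggerName(l), "agent.server.raft") {
			fmt.Println(l)
		}
	}
}
```

With the old name, `agent.server.snapshot`, the "creating new snapshot" line would not match a filter on `agent.server.raft`; with the new name it does.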

PR Checklist

  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

Overview of commits

David Yu and others added 30 commits March 9, 2023 14:29
* jira pr check filter out dependabot and oss/ent merges
Add peer locality to discovery chains
* fixes for unsupported partitions field in CRD metadata block

* Apply suggestions from code review

Co-authored-by: Luke Kysow <[email protected]>

---------

Co-authored-by: Luke Kysow <[email protected]>
* Consul WAN Fed with Vault Secrets Backend document updates

* Corrected dc1-consul.yaml and dc2-consul.yaml file highlights

* Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx

Co-authored-by: trujillo-adam <[email protected]>

* Update website/content/docs/k8s/deployment-configurations/vault/wan-federation.mdx

Co-authored-by: trujillo-adam <[email protected]>

---------

Co-authored-by: trujillo-adam <[email protected]>
Co-authored-by: Ashvitha Sridharan <[email protected]>
Co-authored-by: Freddy <[email protected]>

Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory
where a unix socket will be created with the name
`<namespace>_<proxy_id>.sock` to forward Envoy metrics.

If set, this will configure:
- In bootstrap configuration a local stats_sink and static cluster.
  These will forward metrics to a loopback listener sent over xDS.

- A dynamic listener listening at the socket path that the previously
  defined static cluster is sending metrics to.

- A dynamic cluster that will forward traffic received at this listener
  to the hcp-metrics-collector service.
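
A minimal sketch of how the socket path described above could be derived, assuming the directory comes from `envoy_hcp_metrics_bind_socket_dir` and the file name follows the `<namespace>_<proxy_id>.sock` pattern. The function name `metricsSocketPath` and the example values are hypothetical; this is not the code from the commit.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// metricsSocketPath joins the configured socket directory with a file name
// of the form "<namespace>_<proxy_id>.sock".
// Hypothetical helper for illustration only.
func metricsSocketPath(dir, namespace, proxyID string) string {
	return filepath.Join(dir, fmt.Sprintf("%s_%s.sock", namespace, proxyID))
}

func main() {
	// Example values are assumptions, not taken from the commit.
	fmt.Println(metricsSocketPath("/tmp/consul/hcp-metrics", "default", "web-sidecar-proxy"))
}
```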


Reasons for having a static cluster pointing at a dynamic listener:
- We want to secure the metrics stream using TLS, but the stats sink can
  only be defined in bootstrap config. With dynamic listeners/clusters
  we can use the proxy's leaf certificate issued by the Connect CA,
  which isn't available at bootstrap time.

- We want to intelligently route to the HCP collector. Configuring its
  addreess at bootstrap time limits our flexibility routing-wise. More
  on this below.

Reasons for defining the collector as an upstream in `proxycfg`:
- The HCP collector will be deployed as a mesh service.

- Certificate management is taken care of, as mentioned above.

- Service discovery and routing logic is automatically taken care of,
  meaning that no code changes are required in the xds package.

- Custom routing rules can be added for the collector using discovery
  chain config entries. Initially the collector is expected to be
  deployed to each admin partition, but in the future could be deployed
  centrally in the default partition. These config entries could even be
  managed by HCP itself.
This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions.

Note that this change purely introduces the configuration entry and does not include the full functionality of sameness-groups.
If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates.
* Add copyright headers to UI files

* Ensure copywrite file ignores external libs
* docs(discovery): typo

* docs(discovery): EOF and trim lines

---------

Co-authored-by: trujillo-adam <[email protected]>
This commit fixes an issue where trust bundles could not be read
by services in a non-default namespace, unless they had excessive
ACL permissions given to them.

Prior to this change, `service:write` was required in the default
namespace in order to read the trust bundle. Now, `service:write`
to a service in any namespace is sufficient.
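
The difference between the two permission checks can be sketched roughly as follows. This is a deliberately simplified model, not Consul's ACL resolver: the `serviceRule` type and both functions are hypothetical and only illustrate the "default namespace only" versus "any namespace" condition described above.

```go
package main

import "fmt"

// serviceRule is a hypothetical, simplified stand-in for one ACL service rule.
type serviceRule struct {
	Namespace string
	Service   string
	Policy    string // "read" or "write"
}

// canReadBefore models the old behavior: service:write had to be granted
// in the default namespace.
func canReadBefore(rules []serviceRule) bool {
	for _, r := range rules {
		if r.Policy == "write" && r.Namespace == "default" {
			return true
		}
	}
	return false
}

// canReadAfter models the new behavior: service:write on a service in
// any namespace is sufficient.
func canReadAfter(rules []serviceRule) bool {
	for _, r := range rules {
		if r.Policy == "write" {
			return true
		}
	}
	return false
}

func main() {
	// A token scoped to a single service in a non-default namespace.
	rules := []serviceRule{{Namespace: "team-a", Service: "web", Policy: "write"}}
	fmt.Println(canReadBefore(rules), canReadAfter(rules))
}
```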
* Add known issues to Raft WAL docs.

* Refactor update based on review feedback
* Refactored "NewGatewayService" to handle namespaces, fixed
TestHTTPRouteFlattening test

* Fixed existing http_route tests for namespacing

* Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept
namespace for creating connect services and regular services

* Use require instead of assert after creating namespaces in
http_route_tests

* Refactor NewConnectService and NewGatewayService functions to use cfg
objects to reduce number of method args

* Rename field on SidecarConfig in tests from `SidecarServiceName` to
`Name` to avoid stutter
* ip config entry

* name changing

* move to ent

* ent version

* renaming

* change format

* renaming

* refactor

* add default values
…to connect (#16430)

* First cluster grpc service should be NodePort

This is based on the issue opened here hashicorp/consul-k8s#1903

If you follow the documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s exactly as written, the first cluster creates only the Consul UI service as a NodePort, not the other services (including gRPC). By default, the Helm chart creates them as headless services by setting clusterIP to None. As a result, the second cluster cannot discover the Consul server on the first cluster over gRPC, because it simply cannot connect through the default gRPC port 8502, and it ends up with the error shown in the issue hashicorp/consul-k8s#1903

As a solution, the grpc service should be exposed using NodePort (or LoadBalancer). I added those changes required in both cluster1-values.yaml and cluster2-values.yaml, and also a description for those changes for the normal users to understand. Kindly review and I hope this PR will be accepted.

* Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx

Co-authored-by: trujillo-adam <[email protected]>

* Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx

Co-authored-by: trujillo-adam <[email protected]>

* Update website/content/docs/k8s/deployment-configurations/single-dc-multi-k8s.mdx

Co-authored-by: trujillo-adam <[email protected]>

---------

Co-authored-by: trujillo-adam <[email protected]>
…16661)

* Add test for http routes

* Add fix

* Fix tests

* Add changelog entry

* Refactor and fix flaky tests
Bumps [tomhjp/gh-action-jira-search](https://github.com/tomhjp/gh-action-jira-search) from 0.2.1 to 0.2.2.
- [Release notes](https://github.com/tomhjp/gh-action-jira-search/releases)
- [Commits](tomhjp/gh-action-jira-search@v0.2.1...v0.2.2)

---
updated-dependencies:
- dependency-name: tomhjp/gh-action-jira-search
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…5921)

Bumps [atlassian/gajira-transition](https://github.com/atlassian/gajira-transition) from 2.0.1 to 3.0.1.
- [Release notes](https://github.com/atlassian/gajira-transition/releases)
- [Commits](atlassian/gajira-transition@v2.0.1...v3.0.1)

---
updated-dependencies:
- dependency-name: atlassian/gajira-transition
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: David Yu <[email protected]>
* add snapshot restore test

* add logstore as test parameter

* Use the correct image version

* make sure we read the logs from a follower to test the follower snapshot install path.

* update to raft-wal v0.3.0

* add changelog.

* updating changelog for bug description and removed integration test.

* setting up test container builder to only set logStore for 1.15 and higher

---------

Co-authored-by: Paul Banks <[email protected]>
Co-authored-by: John Murret <[email protected]>
@hc-github-team-consul-core hc-github-team-consul-core enabled auto-merge (squash) May 9, 2023 19:29
@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/minor-update-snapshot-logger/vertically-choice-akita branch from 996460b to 8bd9bbe Compare May 9, 2023 19:29
@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/minor-update-snapshot-logger/vertically-choice-akita branch from 507cbbf to 1e336aa Compare May 9, 2023 19:29

Auto approved Consul Bot automated PR

@github-actions github-actions bot added pr/dependencies PR specifically updates dependencies of project theme/acls ACL and token generation theme/agent-cache Agent Cache theme/api Relating to the HTTP API interface theme/certificates Related to creating, distributing, and rotating certificates in Consul theme/cli Flags and documentation for the CLI interface theme/config Relating to Consul Agent configuration, including reloading theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/consul-terraform-sync Relating to Consul Terraform Sync and Network Infrastructure Automation theme/contributing Additions and enhancements to community contributing materials theme/envoy/xds Related to Envoy support theme/health-checks Health Check functionality theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics theme/telemetry Anything related to telemetry or observability theme/tls Using TLS (Transport Layer Security) or mTLS (mutual TLS) to secure communication theme/ui Anything related to the UI type/ci Relating to continuous integration (CI) tooling for testing or releases type/docs Documentation needs to be created/updated/clarified labels May 9, 2023
@nathancoleman

Closing in favor of #17378

auto-merge was automatically disabled May 15, 2023 21:53

Pull request was closed

@nathancoleman nathancoleman deleted the backport/minor-update-snapshot-logger/vertically-choice-akita branch May 15, 2023 21:53