Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REFACTORED] Add support for Thrift and HTTP client-side fault injection. #673

Open
wants to merge 41 commits into
base: master
Choose a base branch
from

Conversation

HiramSilvey
Copy link

@HiramSilvey HiramSilvey commented Jan 16, 2025

💸 TL;DR

NOTE: This is a refactor of the approved PR #666 based on offline feedback. In addition to being a refactor, this also moves the fault injection middleware to be the final middleware for each protocol, as it is meant to simulate the upstream service being unavailable.

This PR introduces client-side fault injection capability to HTTP and Thrift clients. The behavior is controlled via new X-Bp-Fault headers, and does nothing if they aren't present, are invalid, or don't match the outgoing request.

📜 Details

If clients are currently passing matching X-Bp-Fault headers around and they happen to be valid according to this change, then unintended fault injection could take place. This is extremely unlikely to occur given how specific the headers and their valid values are, but I wanted to call it out as technically it could affect requests unintentionally.

🧪 Testing Steps / Validation

Thorough unit tests were added across the common library and protocol-specific code.

✅ Checks

  • CI tests (if present) are passing
  • Adheres to code style for repo
  • Contributor License Agreement (CLA) completed if not a Reddit employee

HiramSilvey and others added 30 commits November 14, 2024 18:19
TODO: Determine how to short-circuit Thrift requests.
TODO: Refactor into a common shared function with func parameters.
Need to add logging, tests, and potentially special Thrift error logic.
Need to adjust server name matching either for tests or
permanently. The issue with using server prefix instead of the
<service>.<namespace> paradigm is that care needs to be taken to
ensure the prefix doesn't match too many destinations. Blast radius
can explode in that model.

Prometheus exports and logging are next after that.
Co-authored-by: Andrew Boyle <[email protected]>
@HiramSilvey HiramSilvey changed the title Refactored fault injection [REFACTORED] Add support for Thrift and HTTP client-side fault injection. Jan 16, 2025
@HiramSilvey HiramSilvey marked this pull request as ready for review January 17, 2025 21:52
@HiramSilvey HiramSilvey requested a review from a team as a code owner January 17, 2025 21:52
@HiramSilvey HiramSilvey requested review from fishy, kylelemons and konradreiche and removed request for a team January 17, 2025 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant