Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPA: authorization based on request body #2518

Merged
merged 3 commits into from
Feb 15, 2024
Merged

Conversation

mjungsbluth
Copy link
Collaborator

@mjungsbluth mjungsbluth commented Aug 16, 2023

This PR adds the ability to use the request's body in authorisation decisions with Open Policy Agent.

It supports the OPA Envoy plugin behaviour and only parses the body up to a configurable maximum size and otherwise tees the request body. This is to avoid parsing large bodies and create memory and performance issues inside Skipper.

Disabling body parsing can be configured via the Open Policy Agent configuration file. It supports the same content types that the upstream OPA plugin supports: application/json and multipart/form-data.

To make negative impact on latency or memory consumption measurable, two new filter names are introduced that capture the intent to use the body. The implementation is also smart in the sense that it will check if the policy actually uses the body on from the input object before inspecting the body stream.

@mjungsbluth mjungsbluth marked this pull request as ready for review August 21, 2023 07:47
@@ -37,6 +40,7 @@ const (
defaultReuseDuration = 30 * time.Second
defaultShutdownGracePeriod = 30 * time.Second
DefaultCleanIdlePeriod = 10 * time.Second
DefaultMaxBodySize = 128 * 1024 * 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels to much. What is the rule of thumb to estimate this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we have no default constant, have flag value zero and configure actual value in the deployment manifest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also make it lower case, private, and change it later. I agree that right now 128MB seems to be a bit big for a body.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would 10 MB be fine?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit I got carried away and did not see it when reviewing my changes. 128KB is what I intended to actually use. This is based on internal use cases that we looked at where 60-100KB of JSON are valid cases so far.
I think the max header size is currently 1MB which we could also default to and would already be generous. 10MB is already strange for an authz use case IMO.

}

var err error
rawBody, err = bufferedReader.Peek(opa.maxBodyBytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's likely blocking call to read(2) and can be used as DoS vector.
Can you please make sure it's not blocking, but streaming in case you need to work on the request body?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point it is very likely blocking, any pointers how to prove that or improve it? As we need to parse the body before feeding it into OPA, we kind of need to wait. We could timeout though on read and rather fail for slow clients (or attackers) but I am not familiar enough with Skipper's architecture or Golang...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to create another filter for the body inspection.
Then for the implementation it would be a streaming filter that works on the body and as soon as it is clear that the body should be denied stop by returning an error, similar to blockContent filter.
For details how to do streaming filters: #2428 , I also had some more diret pointers in chat.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szuecs I have some questions
With streaming do you mean that

  1. We read the body partially like the blockContent filter does?
  2. With streaming is the expectation to still hit the backend before the request filter finishes the processing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers @szuecs, seems doable after reading the linked PR, just a few clarifying questions (and slightly separate topics)

(a) Separate Filter
Is the motivation to not touch the body unless needed or is there some other thing that is missing?
It should be possible to be a bit smarter in the filter and only parse the body only if it is really needed by the policy (needs some exploration with partial evaluation) and it fits from a size perspective. From a UX perspective I think ending up with different filters for each use case could be confusing but it is also a bit hard to predict which other cases we'll see. This would also be a good motivator to include a benchmark...

(b) Streaming body
Given that the mentioned PR is not yet merged, this essentially would result in a special purpose wrapper, do you think this is ok and we try to unify later or should we wait for the PR to be merged?
Secondly, the policy can influence the returned body / headers. If the deny happens during the body processing, I am not yet sure where we would implement that. I assume that the error returned during Read() is somehow propagated to the Response() method of the filter or is there a better way for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a) if it's possible to know at parse time that you need to touch the body, then it would be great to only touch the body in case you need.
(b) yes it would already send the request headers to the backend and start streaming, but your filter will be able to stop the body streaming, because it gets the bytes before these will be written into the wire. You can pass errors to the Response of the filter via filtercontext's stateBag (map[string]interface{}) .

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, then I'll proceed with making the changes:
(a) will try to get this info, if that is not feasible or slow, we can default to the separate filter
(b) will switch to a streaming processing then with the information you provided...

Copy link
Collaborator Author

@mjungsbluth mjungsbluth Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When starting to implement (b), I encountered an issue that I missed: The decision can influence the headers that are sent to the downstream service and we have an internal use case that will rely on that.
I assume that the actual Read for passing the request is actually done by the http.Client that makes the outgoing request. This client actually copies the headers before starting to read the body which means that they can no longer be influenced.
I am now a bit unsure how to proceed. (a) seems possible after initial testing and the read buffer can be optimised so that parsing the body does not always consume the max body size but this would still block in the Request() method of the filter. I will reach out via internal chat to discuss options...

body := req.Body
var rawBody []byte
if body != nil && !opa.EnvoyPluginConfig().SkipRequestBodyParse && opa.maxBodyBytes > 0 {
bufferedReader := bufio.NewReaderSize(body, opa.maxBodyBytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always allocate maxBodyBytes while actual body could be much smaller

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, it saved re-implementing a smarter buffered reader which I can also do. My assumption was that the buffer is typically in the range of hundreds of KBs so should not hurt too much. Would you recommend re-doing this with a fixed size buffer (that is in low two digits KBs)?

@mjungsbluth mjungsbluth force-pushed the opa_authorize_body branch 3 times, most recently from c4ff548 to 7853a61 Compare September 21, 2023 07:39
@mjungsbluth
Copy link
Collaborator Author

The PR is updated with the following changes:

  • There are two separate filter aliases that enable body parsing but otherwise re-use the existing filter code
  • body parsing is only done if not allowed by the control plane and the policy actually depends on the body for decision making
  • The body is read in chunks (by default 8kb) and stops at either content length or the maximum configured buffer size (1MB = may header size)

Happy to change things esp. if there are concerns with the default sizes.

config/config.go Outdated
@@ -497,6 +498,7 @@ func NewConfig() *Config {
flag.StringVar(&cfg.OpenPolicyAgentConfigTemplate, "open-policy-agent-config-template", "", "file containing a template for an Open Policy Agent configuration file that is interpolated for each OPA filter instance")
flag.StringVar(&cfg.OpenPolicyAgentEnvoyMetadata, "open-policy-agent-envoy-metadata", "", "JSON file containing meta-data passed as input for compatibility with Envoy policies in the format")
flag.DurationVar(&cfg.OpenPolicyAgentCleanerInterval, "open-policy-agent-cleaner-interval", openpolicyagent.DefaultCleanIdlePeriod, "JSON file containing meta-data passed as input for compatibility with Envoy policies in the format")
flag.Uint64Var(&cfg.OpenPolicyAgentMaxBodySize, "open-policy-agent-max-body-size", http.DefaultMaxHeaderBytes, "Maximum number of bytes from the body that are passed as input to the policy")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this reasonable to depend on stdlib max header bytes?
I guess we should have our own MaxBodySize in openpolicyagent package.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change

@@ -161,6 +161,7 @@ func defaultConfig() *Config {
LuaModules: commaListFlag(),
LuaSources: commaListFlag(),
OpenPolicyAgentCleanerInterval: 10 * time.Second,
OpenPolicyAgentMaxBodySize: 1048576,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openpolicyagent.MaxBodySize ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, will change


This filter has the same parameters that the `opaAuthorizeRequest` filter has.

The body is parsed up to a maximum size that can be configured via the `-open-policy-agent-max-body-size` command line argument.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell about the default size here, too


This filter has the same parameters that the `opaServeResponse` filter has.

The body is parsed up to a maximum size that can be configured via the `-open-policy-agent-max-body-size` command line argument.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tell about the default size, too

body := req.Body

if body != nil && !opa.EnvoyPluginConfig().SkipRequestBodyParse &&
opa.maxBodyBytes > 0 && req.ContentLength <= int64(opa.maxBodyBytes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opa.maxBodyBytes > 0 seems to be included in req.ContentLength <= int64(opa.maxBodyBytes) and 0 case should be handled by body != nil

Copy link
Member

@szuecs szuecs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a test case for chaining the request filter?
I expect that people will use chaining and 2 days ago we found #2605.

@mjungsbluth
Copy link
Collaborator Author

Can you please add a test case for chaining the request filter?
I expect that people will use chaining and 2 days ago we found #2605.

Interesting 🤔. Yes will do that…

@mjungsbluth
Copy link
Collaborator Author

The PR has been updated, the chaining luckily worked without any changes and the other comments have been addressed as well.

@szuecs
Copy link
Member

szuecs commented Sep 25, 2023

👍

@szuecs
Copy link
Member

szuecs commented Sep 25, 2023

@AlexanderYastrebov please review, thanks

@mjungsbluth
Copy link
Collaborator Author

I will add one more memory setting to keep the total size of all concurrent bodies in check and avoid an OOM.

@mjungsbluth
Copy link
Collaborator Author

Added a new memory setting that is global for all in-flight requests that depend on body authz which is enforced using a semaphore. Exceeding that limit will start failing requests with a 5xx status. Happy to chat on names and if we should block instead of failing ...

func (opa *OpenPolicyAgentInstance) ExtractHttpBodyOptionally(req *http.Request) (io.ReadCloser, []byte, error) {
func bodyUpperBound(contentLength, maxBodyBytes int64) int64 {
if contentLength <= 0 {
return maxBodyBytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test that has no body in the request?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no. Will add one...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few tests against unknown and nil bodies.

skipper.go Outdated Show resolved Hide resolved
@@ -1804,7 +1804,7 @@ Requests can also be authorized based on the request body the same way that is s

This filter has the same parameters that the `opaAuthorizeRequest` filter has.

The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-body-size` command line argument.
The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many authorized body requests, another flag `-open-policy-agent-max-total-body-size` controls how much memory can be used across all requests with a default of 100MB. If in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also explain the equation that is used to limit max concurrent users.
Is it 100MB/1MB = 100 or something else?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add this. It is effectively depending on the actual content length. So if you have an average of 80kb bodies, you can have 100MB/80KB=1280 concurrent requests before the filters would start to consume a lot more memory than granted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended the docs.

@szuecs
Copy link
Member

szuecs commented Oct 4, 2023

Added a new memory setting that is global for all in-flight requests that depend on body authz which is enforced using a semaphore. Exceeding that limit will start failing requests with a 5xx status. Happy to chat on names and if we should block instead of failing ...

Fail fast is preferable.

@@ -1857,7 +1857,7 @@ If you want to serve requests directly from an Open Policy Agent policy that use

This filter has the same parameters that the `opaServeResponse` filter has.

The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many authorized body requests, another flag `-open-policy-agent-max-total-body-size` controls how much memory can be used across all requests with a default of 100MB. If in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error.
A request's body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many concurrent authorized body requests, another flag `-open-policy-agent-max-memory-body-parsing` controls how much memory can be used across all requests with a default of 100MB. If in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error. The number of concurrent requests is <max-memory-body-parsing> / min(avg(<request content-length>), <max-request-body-size>), so if requests on average have 100KB and the maximum memory is set to 100MB, on average 1024 authorized requests can be processed concurrently.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use Math syntax like in https://github.com/zalando/skipper/blob/master/docs/reference/filters.md?plain=1#L2289
I think this would be better readable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, did not know that. Looks indeed nicer, will change

@szuecs
Copy link
Member

szuecs commented Oct 7, 2023

lgtm, only the doc could be nicer :)

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:559cd36314207aa50947cfa2a76e57d804e71e7d0b39af76e6b2529d440b3492" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:3ce2c32c2cc876e24a635a18828f8c7dd3536a727eb3fd67b147d7cd62d3401b" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:f9913c8c83cb014286853decb6a95354b266d4adb7097540407fcb44a5377d15" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:2f02d6e381f84dc6ce255296a49a7f138e3694a8c71a449ca115452db3643785" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:3e6e4423cddff23f38031c8fb8943773836aaaf7a3c92d531fcf71a8cd902539" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:0546dda4be56cf48aa3685aee31d0d2d92c6f00c41941eed13295e12c1589b3d" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:be90d71d7d76cf9131d3a7890929bf1e6a6e75d39b07e14eaf7e761afd4d0954" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@zalando-robot
Copy link

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:17bf6b3d9e34d30fb702c658c73766b5f410f6981e58f6134bc67aa66a43c70d" is not based on an approved base image. Any production deployment relying on this image will be blocked.

To create a compliant Docker image of your application, you must reference an allowed Docker image as its base image in your Dockerfile. This base image must come from the Open Source Registry namespace library and use a recommended version as listed in the documentation.

@szuecs szuecs added the major moderate risk, for example new API, small filter changes that have no risk like refactoring or logs label Jan 19, 2024
@szuecs
Copy link
Member

szuecs commented Jan 24, 2024

👍

@AlexanderYastrebov
Copy link
Member

@mjungsbluth This needs a rebase

Signed-off-by: Magnus Jungsbluth <[email protected]>
Signed-off-by: Magnus Jungsbluth <[email protected]>
@mjungsbluth
Copy link
Collaborator Author

@AlexanderYastrebov rebased onto current tip...

@szuecs
Copy link
Member

szuecs commented Feb 15, 2024

👍

1 similar comment
@AlexanderYastrebov
Copy link
Member

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major moderate risk, for example new API, small filter changes that have no risk like refactoring or logs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants