Add dos and monitoring docs #160

MarcoPolo · 2022-07-30T13:11:53Z

Adds some documentation and links for libp2p monitoring and dos mitigation.

mxinden

Thank you @MarcoPolo for writing this up!

mxinden · 2022-08-02T04:28:00Z

content/reference/dos-mitigation.md

+
+# General tactics that work across libp2p implementations
+
+## Limit the number of concurrent streams your protocol needs


Optional: I think a similar section on the number of connections would be helpful.

BigLep

Thanks for doing this, @MarcoPolo. It's great to start seeing how this will be exposed to users.

I'm copying in some of the comments I had in https://www.notion.so/pl-strflt/Guide-for-how-to-respond-to-resource-exhaustion-attacks-b10f55cc9a3d4917ae80c9b914e05e8c

The information above is really helpful/good. Looking at this fresh, I think we need to paint a big picture. I view these as layers that someone should think about. I’m apt to order them chronologically, based on the lifecycle of a libp2p application: design, development, testing, and operating.

What are the steps I can take to protect my application? Basically what are all the defenses I can have in place before my application is deployed?
1. (Should be thought about early) Architect the application well, limiting the blast radius as described above.
2. Give a clear signal of problems by using canonicallog, as highlighted above.
3. Set up and tune resource management for each libp2p node. Here we can link to the resource manager docs. We have to make sure this is user-friendly. There are follow-ups to handle in "More documentation around limits" (go-libp2p-resource-manager#68).
How do I know the health of my node? This is the monitoring doc.
What are the steps I can take when being attacked? If the steps above were taken, an attack shouldn’t fully compromise my node, but it may degrade performance. Follow-up actions may be needed.
1. Ban offending IPs. It’s best to automate this as shown above using fail2ban.
2. Maybe something about using the denylist?

BigLep · 2022-08-07T20:13:02Z

content/reference/dos-mitigation.md

+Depending on your use case, it can help to limit the number of inbound
+connections. You can use go-libp2p's


Do we give guidance on when to use this mechanism vs. the resource manager? I see this is a good hook for custom logic, but it seems like what Prysm is doing could be covered by go-libp2p resource manager right?

> Do we give guidance on when to use this mechanism vs. the resource manager?

No, but I could add something here.

> I see this is a good hook for custom logic, but it seems like what Prysm is doing could be covered by go-libp2p resource manager right?

Not really. If you're trying to defend against an adversary that can connect to you and give you a ton of work to do all at once, the rcmgr doesn't protect you at all. This attack can easily be mitigated by rate limiting, though.

Not all applications will want this rate limiting, or they may want to rate limit certain things (e.g. something in the protocol rather than in the connections). For example, if I'm Google I wouldn't want to rate limit any new connection to me. I would rather rate limit work per connection.

Should the rcmgr do this? I don't think so. It's not directly related to limiting the resources being used, and if it can be handled by a smaller component that already exists, all the better.

I hope that makes sense, but happy to expand more as well.
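To make the rate-limiting point concrete, here is a minimal sketch of the kind of per-IP check a connection gater could run when accepting inbound connections. It is a language-agnostic model written in Python, not go-libp2p code: `TokenBucket`, `InboundRateLimiter`, and the `rate`/`burst` parameters are hypothetical names, and `intercept_accept` only loosely mirrors the accept hook on go-libp2p's ConnectionGater.

```python
import time


class TokenBucket:
    """Allows up to `burst` events at once, refilled at `rate` tokens per second."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class InboundRateLimiter:
    """Hypothetical per-IP gate for inbound connections (not a go-libp2p type)."""

    def __init__(self, rate=1.0, burst=5, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets = {}  # remote IP -> TokenBucket

    def intercept_accept(self, ip):
        # Return True to accept the inbound connection, False to drop it.
        now = self.clock()
        bucket = self.buckets.get(ip)
        if bucket is None:
            bucket = self.buckets[ip] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)
```

This limits how fast a single address can open connections, which the resource manager's static limits do not address. A real implementation would also need to evict idle buckets so the map itself cannot grow without bound.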

BigLep

Thanks @MarcoPolo for all the work here. I'm game to take one last look once comments are incorporated.

BigLep · 2022-08-11T13:52:51Z

content/reference/dos-mitigation.md

+Using a stream for a short period of time and then closing it is fine. It's
+really the number of _concurrent_ streams that you need to be careful of.


You know more than me here. Would it be useful to use Identify as an example?

BigLep · 2022-08-12T00:24:07Z

content/reference/dos-mitigation.md

+own resource usage. So limiting connections can have a leveraged effect on your
+resource usage. 
+
+In go-libp2p the number of active connections is managed by the


From reading through this doc, I don't think it's clear to a user when to use the connmgr versus the resource manager in go-libp2p.

BigLep · 2022-08-12T00:24:51Z

content/reference/dos-mitigation.md

+`ConnManager` will trim connections when you hit the high watermark number of
+connections. You can protect certain connections with the `.Protect` method.
+
+In rust-libp2p handlers should implement


For Rust, I think we should draw more from https://www.notion.so/pl-strflt/rust-libp2p-2a62a76b60c54bd69aea2aa3760d6efe . At the minimum we should get a link to https://docs.rs/libp2p/latest/libp2p/swarm/struct.ConnectionLimits.html .

BigLep · 2022-08-12T00:25:16Z

content/reference/dos-mitigation.md

+Using a stream for a short period of time and then closing it is fine. It's
+really the number of _concurrent_ streams that you need to be careful of.
+
+## Limit the number of connections your application needs


Given that connections happen before streams in an application's lifecycle, maybe move this section above the previous one?

BigLep · 2022-08-12T00:29:37Z

content/reference/dos-mitigation.md

+
+Here are some more specific recommendations
+
+## Limit the number of concurrent streams per connection your protocol needs


Can we give pointers on how to do this?

For go-libp2p this means using resource manager right?

For rust, this isn't at the connection level, but somewhere I think we should be linking to https://docs.rs/libp2p/latest/libp2p/swarm/struct.SwarmBuilder.html#method.max_negotiating_inbound_streams . Maybe we say, "rust-libp2p relies on each protocol to limit the number of streams per connection in XXX. A global upper bound on negotiating/transient inbound streams can be set using https://docs.rs/libp2p/latest/libp2p/swarm/struct.SwarmBuilder.html#method.max_negotiating_inbound_streams."

I should be more clear.

Here I'm talking about limiting the number of concurrent streams you need by design of the protocol, as opposed to taking an existing protocol and trying to limit its streams after the fact. For example, imagine an RPC-style protocol whose procedures are async and often take a long time to return (say > 1 min). Here are two ways you could implement it:

1. Open a stream for each RPC call, and keep that stream open until the RPC call returns.
2. Open a stream for the start of the call, then close it. The remote side will open a new stream with the answer.

Assume you make a lot of concurrent calls. Method 1 would result in a large number of concurrent and mostly inactive streams. Method 2 would result in fewer concurrent streams, and thus a lower memory footprint.

If you add a limit here of, say, 10 streams, then method 1 means you can only have 10 concurrent RPC calls, while method 2 lets you have a much larger number of concurrent RPC calls.

Does that make sense? I should rephrase this to focus on the fact this is about protocol design (the inception stage of a p2p application) not about the deployed stage.
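A rough way to see the difference is to count how many streams are open at once under each design. The sketch below is a toy model, not libp2p code: `peak_concurrency`, `method_1`, `method_2`, and the timing numbers are all assumptions chosen for illustration, with times in milliseconds.

```python
def peak_concurrency(intervals):
    """Return the maximum number of (start, end) intervals open at the same time."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # stream opened
        events.append((end, -1))    # stream closed
    # At equal timestamps, closes (-1) sort before opens (+1),
    # so back-to-back streams are not counted as overlapping.
    events.sort()
    open_now = peak = 0
    for _, delta in events:
        open_now += delta
        peak = max(peak, open_now)
    return peak


def method_1(call_starts, call_duration):
    # One stream per call, held open until the call returns.
    return [(s, s + call_duration) for s in call_starts]


def method_2(call_starts, call_duration, transfer_time):
    # One short stream to send the request, then a second short stream,
    # opened by the remote side, to deliver the response.
    streams = []
    for s in call_starts:
        streams.append((s, s + transfer_time))
        done = s + call_duration
        streams.append((done, done + transfer_time))
    return streams


# 100 calls, one started per second, each taking 60 s to return;
# sending a request or a response is assumed to take 100 ms.
starts = range(0, 100_000, 1000)
peak_1 = peak_concurrency(method_1(starts, 60_000))
peak_2 = peak_concurrency(method_2(starts, 60_000, 100))
```

Under these assumed numbers, method 1 peaks at 60 concurrent streams and method 2 at 2, so a limit of 10 streams would throttle method 1 but not method 2.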

Got it - makes sense.

Side: let's find/create a place to point to https://docs.rs/libp2p/latest/libp2p/swarm/struct.SwarmBuilder.html#method.max_negotiating_inbound_streams . Maybe there's a section about transient/negotiating connections and resources, and that those should be guarded against too. Go and Rust both have some protections here.

MarcoPolo · 2022-08-12T20:58:30Z

@BigLep Thanks for the review! I appreciate the help in getting us some good docs.

I believe I addressed all the comments. Another review would be very much appreciated.

BigLep

Hi @MarcoPolo - good stuff. A few last comments, but feel free to ship once incorporated. Good times!

BigLep · 2022-08-12T21:47:27Z

content/reference/dos-mitigation.md

+The `ConnManager` will trim connections when you hit the high watermark number of
+connections, and try to keep the number of connections above the low watermark.
+You can protect certain connections with the
+[`.Protect`](https://github.com/libp2p/rust-libp2p/blob/ea487aebfe6eb672b05d2bec2d9d79bbd92450ba/protocols/kad/src/handler.rs#L562)


This is a rust-libp2p link.
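For readers unfamiliar with the watermark behavior described in the excerpt above, here is a small model of it. This is a sketch under assumptions, not the go-libp2p implementation: `trim_connections` is a hypothetical function, the numeric `score` stands in for whatever value the connection manager assigns to a peer, and peer IDs are assumed to be unique.

```python
def trim_connections(conns, low, high, protected=frozenset()):
    """Model of watermark trimming.

    conns: list of (peer_id, score) pairs.
    Nothing is trimmed until the count exceeds `high`. Then the lowest-scoring
    unprotected connections are closed until `low` remain, or until only
    protected connections are left.
    """
    if len(conns) <= high:
        return list(conns)
    to_close = len(conns) - low
    # Candidates are the unprotected connections, least valuable first.
    candidates = sorted(
        (c for c in conns if c[0] not in protected),
        key=lambda c: c[1],
    )
    closing = {peer for peer, _ in candidates[:to_close]}
    return [c for c in conns if c[0] not in closing]


conns = [("a", 5), ("b", 1), ("c", 3), ("d", 0), ("e", 2)]
kept = trim_connections(conns, low=2, high=4, protected={"d"})
```

Here peer `d` has the lowest score but survives the trim because it is protected, which is the effect the `.Protect` method is meant to provide.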

BigLep · 2022-08-12T21:50:48Z

content/reference/dos-mitigation.md

+2. Open a stream for the start of the call then close it. The remote side will
+   open a new stream with the response.
+
+Assume we make a lot of concurrent calls, method 1 would result in a large


Suggested change

-Assume we make a lot of concurrent calls, method 1 would result in a large
+Assume we make a lot of concurrent calls. Method 1 would result in a large

BigLep · 2022-08-12T21:58:15Z

content/reference/dos-mitigation.md

+implements their (Connection
+Gater)[https://github.com/prysmaticlabs/prysm/blob/63a8690140c00ba6e3e4054cac3f38a5107b7fb2/beacon-chain/p2p/connection_gater.go#L43].


Suggested change

-implements their (Connection
-Gater)[https://github.com/prysmaticlabs/prysm/blob/63a8690140c00ba6e3e4054cac3f38a5107b7fb2/beacon-chain/p2p/connection_gater.go#L43].
+implements their (ConnectionGater)[https://github.com/prysmaticlabs/prysm/blob/63a8690140c00ba6e3e4054cac3f38a5107b7fb2/beacon-chain/p2p/connection_gater.go#L43].

Fixing the rendering issue here: https://bafybeid4zqncc4v5epc4urfvl5ajgnmqeksksk4xrgabdwaxxswpsigh6y.on.fleek.co/reference/dos-mitigation/#leverage-the-resource-manager-to-limit-resource-usage-go-libp2p-only

It's a misuse of (foo)[bar] vs [foo](bar)

BigLep · 2022-08-12T22:03:49Z

content/reference/dos-mitigation.md

+usage. So limiting connections can have a leveraged effect on your resource
+usage.
+
+In go-libp2p the number of active connections is managed by the


A few nitpicks here:

1. We use both "connmgr" and "ConnManager".
2. When we're hyperlinking, I think it's good to remove the ticks so it's clear that it's a hyperlink. (see screenshot of rendering)
3. We write "ConnManager" with no space and "Resource Manager" with a space.
4. Maybe the first time we talk about Resource Manager here we make it a hyperlink?

> We do "ConnManager" no space and "Resource Manager" with space.

In go-libp2p-core they're called ConnManager and ResourceManager. Using "Conn Manager" feels weird, and so does "ResourceManager", although I'm happy to make one change or the other if you feel strongly.

> When we're hyperlinking, I think it's good to remove the ticks so it's clear that it's a hyperlink. (see screenshot of rendering)

I think our template should support the fixed width + hyperlink. I'll see if I can fix it.

> I think our template should support the fixed width + hyperlink. I'll see if I can fix it.

Fixed this by adding a `text-decoration: underline` CSS property.

BigLep

Looks great - let's ship!

Add dos and monitoring docs

626b5ff

MarcoPolo requested review from marten-seemann and mxinden July 30, 2022 13:11

mxinden approved these changes Aug 2, 2022

BigLep mentioned this pull request Aug 7, 2022

Guide for libp2p node monitoring #158

Closed

BigLep reviewed Aug 7, 2022

BigLep mentioned this pull request Aug 7, 2022

self-service guide for handling "resource limit exceeded" messages libp2p/go-libp2p-resource-manager#27

Closed

MarcoPolo and others added 3 commits August 8, 2022 11:09

Update content/reference/dos-mitigation.md

f475a8c

Co-authored-by: Steve Loeppky

Update content/reference/dos-mitigation.md

da36b11

Co-authored-by: Steve Loeppky

Rewrite

104dbbc

MarcoPolo requested review from mxinden and BigLep August 10, 2022 15:08

MarcoPolo commented Aug 10, 2022

MarcoPolo added 5 commits August 10, 2022 17:24

Add links

6752ff4

Update docs. Use mp4

e684b07

Add allowlist section

300fcf3

Reformat to numbered list

7b6c544

Update ToC

a6a83a5

BigLep mentioned this pull request Aug 11, 2022

Repo Consolidation: Round 3 libp2p/go-libp2p#1556

Closed

5 tasks

BigLep reviewed Aug 12, 2022

Update docs

dc691dd

MarcoPolo requested a review from BigLep August 12, 2022 20:58

BigLep approved these changes Aug 12, 2022

PR comments

c192206

BigLep approved these changes Aug 15, 2022

MarcoPolo merged commit 5b2b5d4 into master Aug 15, 2022

MarcoPolo deleted the marco/monitor-and-dos-docs branch August 15, 2022 23:45
