UnifiedPush: Rate limiting issues with matrix.gateway.unifiedpush.org and up.schildi.chat #144

Closed
binwiederhier opened this issue Feb 14, 2022 · 11 comments
Labels
🪲 bug Something isn't working server Relates to the main binary (server or client) unified-push UnifiedPush feature or bug

Comments

@binwiederhier
Owner

binwiederhier commented Feb 14, 2022

On ntfy.sh, I'm seeing the IPs of matrix.gateway.unifiedpush.org and up.schildi.chat being heavily rate limited. We need to find a solution, or otherwise message delivery will keep on being severely impacted.

@binwiederhier binwiederhier added 🪲 bug Something isn't working server Relates to the main binary (server or client) labels Feb 14, 2022
@karmanyaahm
Contributor

could you add a temporary exception for those two IPs as we work out a solution? perhaps just an if ip!=xyz { apply rate limit }
It's not a proper solution but should solve the problem while we develop a proper solution

Also, thanks for pointing out this issue, has it been happening for long or has it just started now?

@karmanyaahm
Contributor

To restate the problem here on GitHub:

They're the default gateways for ntfy.sh (for fluffychat and schildichat respectively) so all users' messages are being forwarded through those two. Having an exception for those two is probably the simplest solution, however, it's unsustainable as more apps are supported. Even ignoring the gateway, a server with a large number of users (like matrix.org) would start seeing similar problems soon.

The best solution imho is per-endpoint rate limiting, but that can be bypassed by creating multiple endpoints. This is a problem that was missed earlier, since topics in ntfy are not the property of authenticated users (like in Gotify/Nextpush) but rather are created on message send.

@binwiederhier
Owner Author

Also, thanks for pointing out this issue, has it been happening for long or has it just started now?

From the logs it looks like for 2 weeks or so.

could you add a temporary exception for those two IPs as we work out a solution?

I think that's what it'll be. A rate-limits-exclude-ips setting in the config or something like that. It's not all that unsustainable if I only ever have to add an app every couple of weeks. It's not like ntfy is gonna be the new Google soon.
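A minimal sketch of what such an exemption check could look like, assuming the setting holds a comma-separated list of IP addresses or CIDR ranges. The function names `parseExemptions` and `isExempt`, the list format, and the sample addresses are hypothetical and are not taken from ntfy's code or config.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseExemptions turns a comma-separated list of IPs or CIDR ranges into prefixes.
// The list format is an assumption for this sketch, not ntfy's actual config syntax.
func parseExemptions(list string) ([]netip.Prefix, error) {
	var prefixes []netip.Prefix
	for _, item := range strings.Split(list, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		if strings.Contains(item, "/") {
			p, err := netip.ParsePrefix(item)
			if err != nil {
				return nil, err
			}
			prefixes = append(prefixes, p.Masked())
			continue
		}
		addr, err := netip.ParseAddr(item)
		if err != nil {
			return nil, err
		}
		// A single address becomes a prefix covering exactly that address.
		prefixes = append(prefixes, netip.PrefixFrom(addr, addr.BitLen()))
	}
	return prefixes, nil
}

// isExempt reports whether ip falls inside any exempt prefix.
// Unparseable input is treated as not exempt, so the rate limit still applies.
func isExempt(prefixes []netip.Prefix, ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as plain IPv4
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	// Documentation-range addresses stand in for the real gateway IPs.
	exempt, err := parseExemptions("192.0.2.10, 198.51.100.0/24")
	if err != nil {
		panic(err)
	}
	for _, ip := range []string{"192.0.2.10", "198.51.100.77", "203.0.113.5"} {
		fmt.Printf("%s exempt: %v\n", ip, isExempt(exempt, ip))
	}
}
```

In a request handler, the per-IP limiter would only be consulted when `isExempt` returns false, which matches the `if ip!=xyz { apply rate limit }` idea suggested earlier in the thread.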

It'd be cool if the gateway could use some sort of credentials, then I could tie the rate limits to the "app" as opposed to the IP address. But that's likely not in the spec.

The best solution imho is per-endpoint rate limiting

As you have stated, that's not sufficient, because people could just create many many topics and circumvent the limits.

@binwiederhier binwiederhier changed the title Rate limiting issues with matrix.gateway.unifiedpush.org and up.schildi.chat UnifiedPush: Rate limiting issues with matrix.gateway.unifiedpush.org and up.schildi.chat Feb 14, 2022
@binwiederhier binwiederhier added the unified-push UnifiedPush feature or bug label Feb 14, 2022
@karmanyaahm
Contributor

gateway could use some sort of credentials, then I could tie the rate limits to the "app"

The endpoint URL is that credential.

One possible way to use that could be to check ratelimit := perEndpoint[endpoint] if (endpoint has been recently subscribed to) else global. This then restricts topics based on the subscription rate limits.

@binwiederhier
Owner Author

The endpoint URL is that credential.

I don't fully follow. Can you elaborate on that?

One possible way to use that could be to check ..

Because endpoints can be freely chosen, I have to maintain a global per-IP limit, because there is no auth.

Rate limiting on a per-endpoint basis can be an additional measure to prevent individual users from being abusive in UP-land (e.g. 5 messages per 10s for each topic), but on its own it doesn't solve the problem: the per-IP limit is still hit even if the volume on each individual endpoint is low.

Example: If you have 6 endpoints, each publishing 1 message every 10 seconds, you've already exceeded the global rate limit of 1 message per 10s per IP (that's what ntfy.sh is set to, with an initial burst of 60 messages).
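To make the arithmetic in this example concrete, here is a small simulation of a token bucket with the stated numbers (burst of 60, one token refilled every 10 seconds) receiving 6 messages per 10-second tick from a single IP. The discrete-tick model and the `simulate` function are simplifying assumptions and do not reproduce ntfy's actual limiter.

```go
package main

import "fmt"

// simulate runs a token bucket in discrete ticks (one tick = 10 seconds here).
// Each tick after the first refills the bucket, then msgsPerTick messages arrive.
// It returns the tick of the first rejection (-1 if none) and the totals.
func simulate(burst, refillPerTick, msgsPerTick, ticks int) (firstRejectTick, accepted, rejected int) {
	tokens := burst
	firstRejectTick = -1
	for tick := 0; tick < ticks; tick++ {
		if tick > 0 {
			tokens += refillPerTick
			if tokens > burst {
				tokens = burst
			}
		}
		for m := 0; m < msgsPerTick; m++ {
			if tokens > 0 {
				tokens--
				accepted++
				continue
			}
			rejected++
			if firstRejectTick < 0 {
				firstRejectTick = tick
			}
		}
	}
	return firstRejectTick, accepted, rejected
}

func main() {
	// 60 ticks of 10s each = 10 minutes of traffic from one gateway IP.
	first, accepted, rejected := simulate(60, 1, 6, 60)
	fmt.Printf("first rejection at t=%ds, accepted=%d, rejected=%d\n", first*10, accepted, rejected)
}
```

Under this model the initial burst absorbs the traffic for a little under two minutes; after that, only about one message in six gets through, which is consistent with the heavy rate limiting reported for the gateways.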

@binwiederhier
Owner Author

Here's a working implementation that I could put live tonight: https://github.com/binwiederhier/ntfy/pull/145/files

I'll likely also increase the other rate limits a bit, and update the docs accordingly.

@karmanyaahm
Contributor

I don't fully follow. Can you elaborate on that?
Because endpoints can be freely chosen, I have to maintain a global per-IP limit, because there is no auth.

I don't know how viable these techniques would be, but these are just my thoughts:
Idea 1
The goal is to simulate a list of "valid" topics. Then, rate limits could be per-endpoint (rather than global) for such topics; these limits could be more liberal than the combined global limit (say, 1msg / 10s, the same rate but applied per endpoint). Non-"valid" topics continue to exist and are treated with a simple global limit.

There are various ways to judge topics as "valid". Auth is a simple one, but one that core ntfy lacks (and that's good for usability).
Another method more applicable to ntfy to judge "valid" topics could be based on subscriptions. If >1 user is subscribed to the topic for a while (say, >12h/day), that means the topic is probably not spam. And since the number of subscriptions per IP is limited, that basically shifts the burden of rate limiting from the sender to the subscriber.

However, this means a potential abuser can send a lot more messages with just 1 IP - 30 subs * 1msg / 10sec / sub, rather than just 1msg/10sec.

Idea 2
Process incoming messages with a much higher rate limit (say, 10msg / s / IP), but add rate limits to the subscribers (say, 1msg/sec?), and just drop any messages over the limit from being sent to that subscriber. This will limit overuse/abuse of the server for any practical purpose (since sent messages can't be received), but still allows for potential DoS attacks on the sending side.
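As a rough illustration of Idea 2, the sketch below keeps one small token bucket per subscriber and drops (rather than queues) anything over the limit. The `bucket` and `fanout` types, the subscriber IDs, and the 1 msg/sec figure are assumptions made for the example; it says nothing about how ntfy delivers messages internally.

```go
package main

import (
	"fmt"
	"time"
)

// bucket is a minimal token bucket; time is passed in so the demo is deterministic.
type bucket struct {
	capacity     float64
	tokens       float64
	refillPerSec float64
	last         time.Time
}

func newBucket(capacity, refillPerSec float64, now time.Time) *bucket {
	return &bucket{capacity: capacity, tokens: capacity, refillPerSec: refillPerSec, last: now}
}

// allow refills the bucket for the elapsed time and spends one token if available.
func (b *bucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.refillPerSec
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// fanout holds one bucket per subscriber (hypothetical structure).
type fanout struct {
	capacity      float64
	refillPerSec  float64
	perSubscriber map[string]*bucket
}

func newFanout(capacity, refillPerSec float64) *fanout {
	return &fanout{capacity: capacity, refillPerSec: refillPerSec, perSubscriber: make(map[string]*bucket)}
}

// deliver reports whether a message may be sent to the subscriber right now.
// A false result means the message is dropped for that subscriber, not buffered.
func (f *fanout) deliver(subscriber string, now time.Time) bool {
	b, ok := f.perSubscriber[subscriber]
	if !ok {
		b = newBucket(f.capacity, f.refillPerSec, now)
		f.perSubscriber[subscriber] = b
	}
	return b.allow(now)
}

// at returns a fixed reference time plus an offset in milliseconds.
func at(ms int) time.Time {
	return time.Date(2022, 2, 14, 0, 0, 0, 0, time.UTC).Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	f := newFanout(1, 1) // 1 msg/sec per subscriber, no extra burst (assumed values)
	fmt.Println(f.deliver("phone-a", at(0)))    // first message goes through
	fmt.Println(f.deliver("phone-a", at(100)))  // too soon, dropped
	fmt.Println(f.deliver("phone-b", at(100)))  // other subscriber has its own bucket
	fmt.Println(f.deliver("phone-a", at(2000))) // bucket has refilled
}
```

Because over-limit messages are dropped at delivery time, this variant needs no buffer, but the server still has to accept and process every incoming publish, which is the DoS concern noted above.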

Overall
This problem is pretty complicated, but the above two are the simplest things that seem like they would work to me within these constraints, without having to add exceptions.

Exceptions (along with user-agent logging, which I'll work on for common-proxies soon) are probably the only simple solution though, and unless ntfy+UnifiedPush sees crazy scale, exceptions will be the most practical. I'm sure there are some other advanced ways; I'll do research on how webpush servers handle this.

@binwiederhier
Owner Author

re idea 1:

Auth is a simple one, but one that core ntfy lacks

ntfy has auth now, and I've actually thought about adding configurable limits to users; so this would be in line with the future strategy. The proxies could implement basic auth; though I do not know how much they are "aware" of what they are talking to.
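One way to picture auth-based limits: key the rate limiter by user name when a request carries basic auth, and fall back to the client IP otherwise. The sketch below only shows the key selection. The function names, user name, and URL are hypothetical, and it assumes the credentials have already been verified elsewhere, since `BasicAuth` merely parses the header.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// keyFromParts chooses the rate-limit key: the user name for authenticated
// requests, otherwise the host part of the remote address.
func keyFromParts(user string, authed bool, remoteAddr string) string {
	if authed && user != "" {
		return "user:" + user
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present; use the value as-is
	}
	return "ip:" + host
}

// limiterKey extracts the inputs from an HTTP request.
// Assumption: password verification happens before this is called.
func limiterKey(r *http.Request) string {
	user, _, ok := r.BasicAuth()
	return keyFromParts(user, ok, r.RemoteAddr)
}

func main() {
	// Placeholder URL and documentation-range address, not real endpoints.
	req, err := http.NewRequest(http.MethodPost, "https://ntfy.example/mytopic", nil)
	if err != nil {
		panic(err)
	}
	req.RemoteAddr = "203.0.113.7:41234"
	fmt.Println(limiterKey(req)) // prints "ip:203.0.113.7"

	req.SetBasicAuth("matrix-gateway", "secret")
	fmt.Println(limiterKey(req)) // prints "user:matrix-gateway"
}
```

With a key like this, a gateway that authenticates would get its own bucket regardless of which IP it publishes from, which is the "tie the rate limits to the app" idea raised earlier in the thread.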

If >1 user is subscribed to the topic for a while (say, >12h/day), that means the topic is probably not spam

That seems like a can of worms to me. I'd really rather not...

re idea 2:

Process incoming messages with a much higher rate limit (say, 10msg / s / IP), but add rate limits to the subscribers (say, 1msg/sec?)

I like your thinking there, but you are right; I'd have to buffer these and store them, and that would be impractical. Also, Firebase or outgoing email, for instance, is not buffered at all; messages are just forwarded as I get them. So this option is sadly not feasible at all.


I think the thing I implemented (exemption based on IP) is alright for now. I'd love it if we could add auth-based exemption or limits instead, but you'll have to answer how feasible that is for the proxies.

I'll close this ticket for now, since the problem is solved, but we can keep discussing here.

@bmarty

bmarty commented May 7, 2024

Hello,
It seems that ntfy.sh is rate-limiting requests from matrix.org.
From the matrix.org log:

Failed to push data to @UserRedacted:matrix.org/im.vector.app.android/https://ntfy.sh/upRPREDACTEDWI?up=1: <class 'synapse.http.RequestTimedOutError'> 504: Timeout connecting to remote server

Is there anything we can do to fix this?
Thanks!

@binwiederhier
Owner Author

@bmarty If you provide hostnames + IP address of the publishers, I am happy to whitelist them.

@bmarty

bmarty commented May 13, 2024

Thanks @binwiederhier, here is an official request: #1106


3 participants