
current rate limiting is ineffective #4359

Closed
marten-seemann opened this issue Mar 10, 2024 · 9 comments · Fixed by #4362

@marten-seemann

The current rate limiting only works for well-behaved clients, but fails against an active attacker. The reason is that the current implementation imposes a concurrency limit. However, the concurrency is attacker-controlled: an attacker can send a ClientHello, immediately followed by an invalid frame (potentially in the same packet). The server then starts the TLS handshake and incurs the CPU cost, only to immediately close the connection, which is then no longer counted towards the limit.

What we need to do instead is impose a limit that's independent of what the attacker is doing. One option would be token-bucket-style rate limiting, enforcing that we don't handle more than X (unvalidated / total) handshakes per second, with a burst budget of Y.

x/time/rate provides a token-bucket rate limiter, which we might be able to use in the API: https://pkg.go.dev/golang.org/x/time/rate

@marten-seemann

We can actually make the API more general by taking a function instead of a rate.Limiter:

RejectUnverifiedConnection(rcvTime time.Time) bool
RejectConnection(rcvTime time.Time) bool

If unset, we’d use the AllowN function of a rate.Limiter, initialized with reasonable limits.

It is expected that most users would use a rate.Limiter (with limits different from the defaults), but this API would allow users to go fancy and implement their own rate limiting logic.

@marten-seemann

marten-seemann commented Mar 11, 2024

Here's an updated proposal. We actually only need a single callback:

// VerifySourceAddress decides if a connection attempt originating from unvalidated source
// addresses first needs to go through source address validation using QUIC's Retry mechanism,
// as described in RFC 9000 section 8.1.2.
// Validating the source address adds one additional network roundtrip to the handshake,
// and should therefore only be used if a suspiciously high number of incoming connections is recorded.
// If unset, a rate limit of 128 unvalidated connection attempts per second will be applied.
// For most use cases, wrapping the AllowN function of a rate.Limiter will be a reasonable
// implementation of this callback.
VerifySourceAddress func(receiveTime time.Time) bool

Question: Is it reasonable to set a default value here?

It doesn't make sense to have a separate RejectConnection callback, since:

  • If it's called before VerifySourceAddress, an attacker can simply flood us with packets (potentially with spoofed source addresses) until this limit is reached.
  • That means the only option is to call it after VerifySourceAddress. However, then you might as well do the rate limiting in GetConfigForClient, which is also called after VerifySourceAddress, just before we're about to create a new connection struct. Together with #4358 (adding an AddrVerified field to the ClientHelloInfo), this will allow the application to:
    1. Impose a rate limit on the total number of handshakes. Note that this only protects against getting overloaded by new connection attempts (i.e. it prevents an attacker from bringing the CPU load to 100%). An attacker can still consume the entire rate limit by just spamming us with connection attempts, from a single machine.
    2. Perform blacklisting based on the remote address. This should only be done for verified addresses, otherwise an attacker would be able to prevent legitimate clients from connecting by spoofing their source addresses. This means an attacker has to control a large number of IPs to saturate our limit.

That means that a reasonable implementation would look something like this:

limiter := rate.NewLimiter(limit, burst) // e.g. rate.Limit(128), 256
quicConf := &quic.Config{
    GetConfigForClient: func(info *quic.ClientHelloInfo) (*quic.Config, error) {
        if !limiter.Allow() {
            return nil, errors.New("rate limited")
        }
        if info.AddrVerified {
            // do IP-address based rate limiting based on info.RemoteAddr
        }
        return &quic.Config{ /* your config here */ }, nil
    },
}

@MarcoPolo

I like the API as a greenfield design. I'm worried that reusing GetConfigForClient will prevent quic-go from introducing a reasonable default, and thus rely on all users to implement the rate limit logic in their own GetConfigForClient. That said, I'm not sure what a reasonable default would even be, since it depends on the use case. And I do appreciate having a smaller API surface. So I think this is an okay tradeoff.

Does it make sense to allow verified IP addresses to have their own rate limit? If another node went through the process of verifying their address, should they still get blocked if a malicious actor is consuming the whole rate limit? I'm thinking something like this:

unverifiedLimiter := rate.NewLimiter(unverifiedLimit, unverifiedBurst)
verifiedLimiter := rate.NewLimiter(verifiedLimit, verifiedBurst)
quicConf := &quic.Config{
    GetConfigForClient: func(info *quic.ClientHelloInfo) (*quic.Config, error) {
        if info.AddrVerified {
            if !verifiedLimiter.Allow() {
                return nil, errors.New("rate limited")
            }
            // do IP-address based rate limiting based on info.RemoteAddr
        } else if !unverifiedLimiter.Allow() {
            return nil, errors.New("rate limited")
        }
        return &quic.Config{ /* your config here */ }, nil
    },
}

@marten-seemann

> I like the API as a greenfield design. I'm worried that reusing GetConfigForClient will prevent quic-go from introducing a reasonable default, and thus rely on all users to implement the rate limit logic in their own GetConfigForClient. That said, I'm not sure what a reasonable default would even be since it depends on the use case. And I do appreciate having a smaller api surface. So I think this is an okay tradeoff.

Thanks! I really appreciate that you're evaluating the tradeoffs here.

> Does it make sense to allow verified IP addresses to have their own rate limit? If another node went through the process of verifying their address, should they still get blocked if a malicious actor is consuming the whole rate limit?

I think the flow is a little bit different here, since it only takes 1 RTT to convert an unverified into a verified client. Here's how I imagine the layers of defence to work:

  1. If the rate of incoming connections is low, no client has to verify its source address. Handshakes complete in 1 RTT. This is the common case (if limits are set correctly).
  2. If the rate of incoming connections is higher, all connections above the rate limiter's steady state are forced into Retry, incurring a 1 RTT penalty to verify their address.
  3. If the rate gets even higher and threatens to overwhelm the CPU capacity, a hard rate limit applies. Applications will need to come up with some kind of blacklisting of IP addresses / ranges, to make sure their capacity is not used up by connections coming from a single IP.
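The escalation could be sketched like this. All types and names below are hypothetical stand-ins for the quic-go API discussed in this thread, and the fixed-window counters stand in for real rate.Limiters; nothing here is final:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-in for quic-go's ClientHelloInfo.
type clientHelloInfo struct{ addrVerified bool }

// window is a fixed-size budget standing in for a rate.Limiter.
type window struct{ used, limit int }

func (w *window) allow() bool {
	if w.used >= w.limit {
		return false
	}
	w.used++
	return true
}

func main() {
	retryBudget := &window{limit: 2} // layers 1→2: above this, force a Retry
	hardBudget := &window{limit: 3}  // layer 3: above this, reject outright

	// Layers 1/2: decide whether a connection attempt must first validate
	// its source address via Retry (true = send a Retry).
	verifySourceAddress := func() bool { return !retryBudget.allow() }

	// Layer 3: hard limit, applied just before creating the connection.
	getConfigForClient := func(info *clientHelloInfo) error {
		if !hardBudget.allow() {
			return errors.New("rate limited")
		}
		// info.addrVerified would additionally gate per-IP blacklisting here.
		_ = info.addrVerified
		return nil
	}

	for i := 0; i < 4; i++ {
		needsRetry := verifySourceAddress()
		err := getConfigForClient(&clientHelloInfo{addrVerified: needsRetry})
		fmt.Println(needsRetry, err)
	}
}
```

The first two attempts complete without Retry, the third pays the extra round trip, and the fourth is rejected outright once the hard budget is exhausted.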

Does this make sense, @MarcoPolo?

@sukunrt

sukunrt commented Mar 12, 2024

Do you think it would be useful to provide the source address here:

VerifySourceAddress func(addr net.Addr, receiveTime time.Time) bool

@marten-seemann

How would you act upon the source address here? It's not verified yet, so you'd be using spoofed information, right in the mechanism designed to prevent this spoofing.

@sukunrt

sukunrt commented Mar 13, 2024

I am not sure how useful this is in the wild, but you can whitelist intranet addresses when you know that packets from outside presenting an intranet source IP will be rejected at the ELB/gateway etc.

@marten-seemann

Good point. We just need to be very clear that the source address cannot be trusted.

@MarcoPolo

> I think the flow is a little bit different here, since it only takes 1 RTT to convert an unverified into a verified client. Here's how I imagine the layers of defence to work:
>
>   1. If the rate of incoming connections is low, no client has to verify its source address. Handshakes complete in 1 RTT. This is the common case (if limits are set correctly).
>   2. If the rate of incoming connections is higher, all connections above the rate limiter's steady state are forced into Retry, incurring a 1 RTT penalty to verify their address.
>   3. If the rate is getting even higher, and threatening to overwhelm the CPU capacity, a hard rate limiting applies. Applications will need to come up with some kind of blacklisting of IP addresses / ranges, to make sure their capacity is not used up by connections coming from a single IP.
>
> Does this make sense, @MarcoPolo?

Yeah, that makes sense. I think I need to figure out an ergonomic way to define rate limits for libp2p users.
