API Rate Limiting by IP with NGINX

Last time, I wrote about HTTP request rate limiting in ASP.NET Core. It works well when you are hosting a simple REST API or website on the Kestrel web server. But you can achieve even more by hosting your application behind a reverse proxy server, and the most popular one nowadays is NGINX. According to W3Techs, it hosted 32.7% of websites at the beginning of 2021.

Before showing you how to set up the NGINX rate limits, I would like to discuss why Kestrel is not enough and why you may want a reverse proxy server. Kestrel is an excellent high-performance web server; it even ranks at the top of the TechEmpower Web Framework Benchmarks. Kestrel supports HTTPS and HTTP/2 and ships with .NET Core. But, as with most built-in web servers on the market, it does not support more advanced features like load balancing. Kestrel is also a bit slow at serving static files. In the past, I got performance boosts on a few projects by moving static content out of the ASP.NET Core application to NGINX and IIS. Of course, moving to a CDN, e.g., Amazon CloudFront, will give you the best performance.

Let's reuse the API I built in the previous article:

[ApiController]
[Route("[controller]")]
public class SampleController : ControllerBase
{
    [HttpGet]
    [Route("time")]
    public TimeResponse GetTime()
    {
        var response = new TimeResponse { Time = DateTime.Now };
        return response;
    }

    [HttpGet]
    [Route("status")]
    public IActionResult GetStatus()
    {
        return Ok("OK");
    }
}

Our deployment will include a public NGINX proxy server that passes traffic to the .NET API. The next step is to prepare a Dockerfile that builds and runs our application on .NET 5.

# build the app
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /src

COPY . .
RUN dotnet restore
RUN dotnet publish -c release -o /app --no-restore

# build the final image
FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
COPY --from=build /app ./
ENTRYPOINT ["dotnet", "RateLimiting.dll"]

The next step is to run the NGINX proxy server. I usually use the official NGINX image from Docker Hub. By default, it is ready to serve static content, but you can customize it by placing your default.conf or any domain-specific configuration files into the /etc/nginx/conf.d folder.

server {
    listen       80;
    server_name  localhost;
                                                                                                                                       
    location /time {
        proxy_pass http://rate_api/sample/time;
    }

    location /status {
        proxy_pass http://rate_api/sample/status;
    }
}

Here is the Dockerfile to build the NGINX image for the project.

FROM nginx
WORKDIR /etc/nginx/conf.d/
COPY default.conf .

The final step is to create a Docker Compose file to run the services.

version: "3.9"
services:
  web:
    build: ./nginx
    ports:
      - "8080:80"
    links:
      - rate_api
  rate_api:
    build: ./RateLimiting

Let's implement the requirements from the previous article:

  • The Time API allows only two requests/minute per IP. This rate does not make sense in the real world, but it is OK for testing purposes to see the actual rate limiting errors.
  • The Status API has no restrictions.

NGINX supports three types of limits, each illustrated in the sketch after this list:

  • The number of connections per IP address
  • The request rate limit, e.g., total requests per IP address per second.
  • The download speed per client connection
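
To make the three options concrete, here is a hedged sketch that combines all of them in one configuration. The zone names, sizes, and rate values are illustrative only; in this article, we only use the request rate limit.

# 1. Concurrent connections per IP address
limit_conn_zone $binary_remote_addr zone=per_ip_conn:10m;

# 2. Request rate per IP address
limit_req_zone $binary_remote_addr zone=per_ip_req:10m rate=5r/s;

server {
    location /downloads {
        limit_conn per_ip_conn 10;   # at most 10 concurrent connections per IP
        limit_req  zone=per_ip_req;  # at most 5 requests/second per IP

        # 3. Download speed per client connection
        limit_rate 500k;             # throttle each connection to 500 KB/s
    }
}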

You configure the limit using the limit_req_zone and limit_req directives. First, you use limit_req_zone to define a key (usually an IP address), a shared memory zone that stores the state for each IP address, including how often it has requested the rate-limited URLs, and the expected request rate. The rate is specified in requests per second (r/s), or in requests per minute (r/m) if a rate of less than one request per second is desired.

The next step is to apply the desired rate limit to a route, e.g., the /time endpoint, using the limit_req directive within a location {} context.

limit_req_zone $binary_remote_addr zone=time_api:10m rate=2r/m;

server {
    ...
    location /time {
        limit_req zone=time_api;
    }
    ...
}

In the example above, I allocated 10 MB to keep the request counters per IP address; per the NGINX documentation, a one-megabyte zone can keep about 16 thousand 64-byte states, so 10 MB is more than enough for a demo. Please pay attention to the fact that I used the $binary_remote_addr variable as a key instead of $remote_addr, which also holds the client's IP address. The reason is that $binary_remote_addr holds the binary representation of the IP address, which requires less memory and is more efficient.

Sometimes you may want to test the limits before enabling them on a production server. You can do that by adding the limit_req_dry_run on; directive to your context.

location /time {
    limit_req zone=time_api;
    limit_req_dry_run on;
    ...
}

Once an IP address reaches the limit, you will see error messages in the NGINX logs, but the requests will still pass through.

web_1       | 172.27.0.1 - - [17/Jan/2021:13:29:58 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"
web_1       | 2021/01/17 13:29:59 [error] 28#28: *3 limiting requests, dry run, excess: 0.963 by zone "time_api", client: 172.27.0.1, server: localhost, request: "GET /time HTTP/1.1", host: "localhost:8080"
web_1       | 172.27.0.1 - - [17/Jan/2021:13:29:59 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"
web_1       | 2021/01/17 13:30:00 [error] 28#28: *5 limiting requests, dry run, excess: 0.918 by zone "time_api", client: 172.27.0.1, server: localhost, request: "GET /time HTTP/1.1", host: "localhost:8080"
web_1       | 172.27.0.1 - - [17/Jan/2021:13:30:00 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"

By default, once the number of requests exceeds the specified rate, NGINX will respond with an error.

$ curl -v http://localhost:8080/time
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying ::1:8080...
* Connected to localhost (::1) port 8080 (#0)
> GET /time HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 503 Service Temporarily Unavailable
< Server: nginx/1.19.6
< Date: Sun, 17 Jan 2021 13:37:35 GMT
< Content-Type: text/html
< Content-Length: 197
< Connection: keep-alive
<
{ [197 bytes data]
100   197  100   197    0     0  19700      0 --:--:-- --:--:-- --:--:-- 21888<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx/1.19.6</center>
</body>
</html>

* Connection #0 to host localhost left intact
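
By the way, the 503 comes from NGINX itself, not from the API. If you prefer to signal throttling explicitly, the limit_req_status directive lets you change the response code, e.g., to 429 Too Many Requests. A minimal sketch, not used in the rest of this article:

location /time {
    limit_req zone=time_api;
    limit_req_status 429;   # respond with 429 instead of the default 503
    proxy_pass http://rate_api/sample/time;
}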

Sometimes you may want to queue requests that exceed the allowed limit and process them later instead of rejecting them. That is doable with the burst parameter of the limit_req directive, as shown in the sketch below.
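
For example, assuming we want to tolerate short spikes, the following sketch queues up to 10 requests above the allowed rate; adding the optional nodelay parameter forwards the queued requests immediately instead of pacing them out at the configured rate:

location /time {
    limit_req zone=time_api burst=10;            # queue up to 10 extra requests
    # limit_req zone=time_api burst=10 nodelay;  # or forward the burst immediately
    proxy_pass http://rate_api/sample/time;
}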

The final NGINX configuration file looks as follows:

limit_req_zone $binary_remote_addr zone=time_api:10m rate=2r/m;

server {
    listen       80;
    server_name  localhost;
                                                                                                                                       
    location /time {
        limit_req zone=time_api burst=10;
        proxy_pass http://rate_api/sample/time;
    }

    location /status {
        proxy_pass http://rate_api/sample/status;
    }
}

Now you can run the services on your own and test them.

$ docker-compose up -d
Creating network "nginx_demo_default" with the default driver
Creating nginx_demo_rate_api_1 ... done
Creating nginx_demo_web_1      ... done
$ curl -v http://localhost:8080/time
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying ::1:8080...
* Connected to localhost (::1) port 8080 (#0)
> GET /time HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.19.6
< Date: Sun, 17 Jan 2021 13:44:50 GMT
< Content-Type: application/json; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
<
{ [55 bytes data]
100    44    0    44    0     0   3142      0 --:--:-- --:--:-- --:--:--  3142{"time":"2021-01-17T13:44:50.7810712+00:00"}
* Connection #0 to host localhost left intact

In this article, I showed how you can set up an ASP.NET Core REST API behind an NGINX proxy, run it using Docker and Docker Compose, and use NGINX's built-in features to set up request rate limits. Please consult the official NGINX documentation if you need more details on rate-limiting capabilities, or message me if you are looking for a consultant to help with a project.