API Rate Limiting by IP with NGINX
Last time I wrote about HTTP request rate limiting in ASP.NET Core. It works well when you are hosting a simple REST API or website on the Kestrel web server. But you can achieve even more by hosting your application behind a reverse proxy server, and the most popular one nowadays is NGINX. According to W3Techs, it hosted 32.7% of websites at the beginning of 2021.
Before showing you how to set up NGINX rate limits, I would like to discuss why Kestrel is not enough and why you may want a reverse proxy server. Kestrel is an excellent high-performance web server; it even sits at the top of the TechEmpower Web Framework Benchmarks. Kestrel supports HTTPS and HTTP/2 and comes with .NET Core. But, like most built-in web servers on the market, it does not provide more advanced features such as load balancing. Kestrel is also relatively slow at serving static files. I have gained performance boosts on a few past projects by moving static content out of the ASP.NET Core application to NGINX and IIS. Of course, moving to a CDN, e.g., Amazon CloudFront, will give you the best performance.
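To illustrate the static content point, a couple of location blocks inside a server {} context are enough to let NGINX serve assets straight from disk while everything else is proxied to Kestrel. The paths and port below are made up for the sketch:
# Serve static assets straight from disk, bypassing Kestrel
location /static/ {
    root /var/www/myapp;
    expires 7d;
}

# Everything else goes to the ASP.NET Core application
location / {
    proxy_pass http://localhost:5000;
}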
Let's reuse the API I built in the previous article:
[ApiController]
[Route("[controller]")]
public class SampleController : ControllerBase
{
    [HttpGet]
    [Route("time")]
    public TimeResponse GetTime()
    {
        var response = new TimeResponse { Time = DateTime.Now };
        return response;
    }

    [HttpGet]
    [Route("status")]
    public IActionResult GetStatus()
    {
        return Ok("OK");
    }
}
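The TimeResponse type comes from the previous article and is not shown here; a minimal definition would be a plain DTO like this (my assumption, not a copy from that post):
public class TimeResponse
{
    // Serialized as {"time":"..."} by System.Text.Json's default camelCase policy
    public DateTime Time { get; set; }
}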
Our deployment will include a public NGINX proxy server that passes traffic to the .NET API. The next step is to prepare a Dockerfile that builds and runs our application on .NET 5.
# build the app
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /src
COPY . .
RUN dotnet restore
RUN dotnet publish -c release -o /app --no-restore
# build the final image
FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
COPY --from=build /app ./
ENTRYPOINT ["dotnet", "RateLimiting.dll"]
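If you want to smoke-test the image before wiring up Compose, you can build and run it manually. The RateLimiting folder name matches the Compose file shown later, and the aspnet:5.0 base image listens on port 80 by default:
$ docker build -t rate-limiting ./RateLimiting
$ docker run --rm -p 5000:80 rate-limiting
$ curl http://localhost:5000/sample/time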
The next step is to run the NGINX proxy server. I usually use the official NGINX image from Docker Hub. By default, it is ready to serve static content, but you can customize it by placing your default.conf or any domain-specific configuration files in the /etc/nginx/conf.d folder.
server {
    listen 80;
    server_name localhost;

    location /time {
        proxy_pass http://rate_api/sample/time;
    }

    location /status {
        proxy_pass http://rate_api/sample/status;
    }
}
Note that the rate_api hostname in proxy_pass resolves to the API container through Docker's built-in DNS once both services run in the same Compose network. Here is the Dockerfile that builds the NGINX image for the project.
FROM nginx
WORKDIR /etc/nginx/conf.d/
COPY default.conf .
The final step is to create a Docker Compose file to run the services.
version: "3.9"
services:
  web:
    build: ./nginx
    ports:
      - "8080:80"
    links:
      - rate_api
  rate_api:
    build: ./RateLimiting
Let's implement the requirements from the previous article:
- The Time API allows only two requests/minute per IP. This rate does not make sense in the real world, but it is OK for testing purposes to see the actual rate limiting errors.
- The Status API has no restrictions.
NGINX supports three possible limits (a combined sketch follows the list):
- The number of connections per IP address
- The request rate, e.g., the total number of requests per IP address per second
- The download speed per client connection
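Here is what all three look like side by side; the zone names and numbers are arbitrary examples for the sketch, not recommendations:
limit_conn_zone $binary_remote_addr zone=per_ip_conn:10m;
limit_req_zone  $binary_remote_addr zone=per_ip_req:10m rate=10r/s;

server {
    location /downloads {
        limit_conn per_ip_conn 5;   # at most 5 concurrent connections per IP
        limit_req  zone=per_ip_req; # at most 10 requests per second per IP
        limit_rate 100k;            # throttle each connection to 100 KB/s
    }
}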
You configure the request rate limit using the limit_req_zone and limit_req directives. First, you use limit_req_zone to define a key (usually an IP address), a shared memory zone that stores each IP address's state and how often it has requested the URL, and the expected request rate. The rate value is expressed in requests/second (r/s), or requests/minute (r/m) if a rate of less than one request per second is desired.
The next step is to apply the desired rate limit to a route, e.g., the /time endpoint, using the limit_req directive within a location {} context.
limit_req_zone $binary_remote_addr zone=time_api:10m rate=2r/m;

server {
    ...
    location /time {
        limit_req zone=time_api;
    }
    ...
}
In the example above, I allocated 10 MB of shared memory to keep the request counters per IP address. Please pay attention to the fact that I used the $binary_remote_addr variable as the key instead of $remote_addr, which also holds the client's IP address. The reason is that $binary_remote_addr holds the binary representation of the IP address, which requires less memory and is more efficient.
Sometimes you may want to test the limits before enabling them on a production server. You can do that by adding the limit_req_dry_run on; directive to your context.
location /time {
    limit_req zone=time_api;
    limit_req_dry_run on;
    ...
}
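To trigger the dry-run log entries, fire a few requests in quick succession:
$ for i in 1 2 3; do curl -s http://localhost:8080/time; echo; done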
Once an IP address reaches the limit, you will see error messages in the NGINX logs, but the requests still pass through.
web_1 | 172.27.0.1 - - [17/Jan/2021:13:29:58 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"
web_1 | 2021/01/17 13:29:59 [error] 28#28: *3 limiting requests, dry run, excess: 0.963 by zone "time_api", client: 172.27.0.1, server: localhost, request: "GET /time HTTP/1.1", host: "localhost:8080"
web_1 | 172.27.0.1 - - [17/Jan/2021:13:29:59 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"
web_1 | 2021/01/17 13:30:00 [error] 28#28: *5 limiting requests, dry run, excess: 0.918 by zone "time_api", client: 172.27.0.1, server: localhost, request: "GET /time HTTP/1.1", host: "localhost:8080"
web_1 | 172.27.0.1 - - [17/Jan/2021:13:30:00 +0000] "GET /time HTTP/1.1" 200 55 "-" "curl/7.74.0" "-"
By default, once the number of requests exceeds the specified rate, NGINX will respond with an error.
$ curl -v http://localhost:8080/time
*   Trying ::1:8080...
* Connected to localhost (::1) port 8080 (#0)
> GET /time HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 503 Service Temporarily Unavailable
< Server: nginx/1.19.6
< Date: Sun, 17 Jan 2021 13:37:35 GMT
< Content-Type: text/html
< Content-Length: 197
< Connection: keep-alive
<
{ [197 bytes data]
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx/1.19.6</center>
</body>
</html>
* Connection #0 to host localhost left intact
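If a generic 503 is not what your API clients expect, the limit_req_status directive lets you change the rejection code; 429 Too Many Requests is a common choice:
location /time {
    limit_req zone=time_api;
    limit_req_status 429;
    proxy_pass http://rate_api/sample/time;
}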
Sometimes you may want to queue the requests that exceed the allowed limit and process them later instead of rejecting them outright. This is possible with the burst parameter of the limit_req directive.
The final NGINX configuration file looks as follows:
limit_req_zone $binary_remote_addr zone=time_api:10m rate=2r/m;

server {
    listen 80;
    server_name localhost;

    location /time {
        limit_req zone=time_api burst=10;
        proxy_pass http://rate_api/sample/time;
    }

    location /status {
        proxy_pass http://rate_api/sample/status;
    }
}
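Note that with burst=10 alone, queued requests are drained at the configured rate of 2 requests/minute, so clients may wait a long time for a response. If you prefer to serve a burst immediately and reject only what overflows the queue, add the nodelay parameter:
location /time {
    limit_req zone=time_api burst=10 nodelay;
    proxy_pass http://rate_api/sample/time;
}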
Now you can run the services on your own and test them.
$ docker-compose up -d
Creating network "nginx_demo_default" with the default driver
Creating nginx_demo_rate_api_1 ... done
Creating nginx_demo_web_1 ... done
$ curl -v http://localhost:8080/time
*   Trying ::1:8080...
* Connected to localhost (::1) port 8080 (#0)
> GET /time HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.19.6
< Date: Sun, 17 Jan 2021 13:44:50 GMT
< Content-Type: application/json; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
<
{ [55 bytes data]
{"time":"2021-01-17T13:44:50.7810712+00:00"}
* Connection #0 to host localhost left intact
In this article, I showed how you can set up an ASP.NET Core REST API behind an NGINX proxy, run it with Docker and Docker Compose, and use NGINX's built-in features to set up request rate limits. Please consult the official NGINX documentation if you need more details on its rate-limiting capabilities, or message me if you are looking for a consultant to help with a project.