Rate Limits

To manage the high volume of requests sent to the platform daily, limits are placed on the number of requests a tenant can make. These limits help provide a stable, responsive, and fair REST API for all tenants, and protect the database against DDoS attacks. Sending too many requests in quick succession may result in error responses with a 429 status code.

API rate limits

Calls to the REST Storefront API are governed by request-based limits, which means that the total number of API calls made by the application should be considered. Rate limits are per tenant, so calls to different shops of the same tenant still share the same global limit.

For example, if 20 requests are made to one shop of a tenant within a 1-minute period and 20 more requests are made to another shop of the same tenant within the same period, all 40 requests count against the same per-tenant limit. If the API has a rate limit of 50 requests per minute, the application can then make only 10 more requests across all of the tenant's shops in that minute.

Sandbox environment

The rate limit is set to a fixed value.

Production environment

The limit is calculated dynamically: the peak number of requests per minute is taken for each day, these daily peaks are averaged over the last 30 days, and a 20% buffer is added on top.

Environment   Number of requests       Window of time
Test          500                      1 minute
Preview       500                      1 minute
Live          Dynamically calculated   1 minute
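
As a rough illustration of the production calculation described above, the following sketch averages a list of daily peak requests-per-minute values and adds the 20% buffer. The function name, the input shape, and the rounding are assumptions made for this example; the documentation does not specify how the platform stores or rounds these values.

```python
from statistics import mean


def estimate_dynamic_limit(daily_peaks_rpm, buffer=0.20):
    """Average the daily peak requests-per-minute values and add a buffer.

    daily_peaks_rpm: one peak value per day, e.g. for the last 30 days.
    Rounding to the nearest integer is an assumption of this sketch.
    """
    if not daily_peaks_rpm:
        raise ValueError("at least one daily peak is required")
    return round(mean(daily_peaks_rpm) * (1 + buffer))


# Hypothetical data: 30 days, each with a peak of 1,000 requests per minute.
limit = estimate_dynamic_limit([1000] * 30)
print(limit)  # 1200
```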

Rate limit algorithm

To enforce the rate limit, a sliding window log algorithm is applied. It tracks the requests made within a given interval and applies the limit accordingly, which provides better protection against burst traffic. The main difference from a fixed-window strategy is that it always takes into account the traffic generated over the last time window, counted back from the current moment, instead of waiting for a counter to expire.

For example, suppose that the limit is set to 5 requests per minute (on a sliding window). In this scenario:

  • T+0 to T+60 seconds: the application makes 5 successful requests in the first 5 seconds at a constant rate of 1 per second. A 6th request within this period receives an error because no requests remain in the current 60-second window.
  • T+60 to T+120 seconds: up to 5 new requests are allowed. Because the window slides, capacity is returned at a rate of 1 request per second as the earlier requests age out of the window. Trying to make more than 1 request in the same second during the first 5 seconds results in an error.
  • T+125 seconds: all 5 requests can be made at whatever rate is preferred. Assuming the application used each request as it became available after T+60, those requests have aged out of the window at a rate of 1 per second over the past 5 seconds.
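
The scenario above can be reproduced with a minimal, client-side sketch of a sliding window log. This is not the platform's implementation: the class name and the choice to log only accepted requests are assumptions made for illustration, and timestamps are passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class SlidingWindowLog:
    """Allow at most `limit` requests in any span of `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.log = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now):
        # Drop timestamps that have aged out of the window ending at `now`.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False


limiter = SlidingWindowLog(limit=5, window=60)
first_burst = [limiter.allow(t) for t in range(5)]  # requests at 0..4 s
sixth = limiter.allow(5)         # rejected: the window already holds 5 requests
at_60 = limiter.allow(60)        # accepted: the request from 0 s has aged out
again_at_60 = limiter.allow(60)  # rejected: only one slot has been freed so far
```

A fixed-window counter would instead reset to zero at the start of each minute, which allows a burst at the end of one window followed immediately by another burst at the start of the next.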

HTTP headers and response code

All responses from the API include the following headers, which can be examined to understand the current rate limit status of the tenant:

Header                  Description
X-RateLimit-Limit       The maximum number of allowed requests in a given window of time
X-RateLimit-Remaining   The number of remaining requests in the current window of time
X-RateLimit-Reset       The time at which the rate limit resets, specified in UTC epoch time (in seconds)

When a call exceeds the rate limit, the API will return a JSON response with the status code HTTP 429 “Too Many Requests” and the following body:

{
  "code": 429,
  "message": "API rate limit exceeded."
}

The following example shows the relevant portion of headers returned for a request that hasn't exceeded the limit:

HTTP/1.1 200 OK
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 20
X-RateLimit-Reset: 1724668982

From this information, it can be understood that:

  • The tenant has used 480 of the 500 allowed requests.
  • The tenant can still make 20 requests before more are added.
  • 1 new request will be added at 10:43:02 AM GMT on August 26, 2024.
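
The same reading of the headers can be sketched in code. The function name below is hypothetical, and the headers are passed as a plain dictionary for simplicity; most HTTP client libraries expose headers through a case-insensitive mapping instead.

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers):
    """Return (used, remaining, reset_time_utc) from the rate limit headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # X-RateLimit-Reset is a UTC epoch timestamp in seconds.
    reset_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    return limit - remaining, remaining, reset_at


used, remaining, reset_at = summarize_rate_limit({
    "X-RateLimit-Limit": "500",
    "X-RateLimit-Remaining": "20",
    "X-RateLimit-Reset": "1724668982",
})
print(used, remaining, reset_at.isoformat())
# 480 20 2024-08-26T10:43:02+00:00
```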

By contrast, the following example shows the relevant portion of headers returned for a request that has exceeded the limit:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1724668982

Handling limiting gracefully

To prevent throttling errors, design your application around the following key strategies:

  • Regulate request rates to ensure a smooth distribution of your requests over time.
  • Make use of caching techniques to store frequently used data and reduce repetitive requests.
  • Optimize data fetching to retrieve only the necessary data for your application.
  • Handle errors effectively to ensure your app can recover gracefully.

A common technique for handling rate limiting is to build a retry mechanism for responses that return a 429 status code. A simple solution is to read the X-RateLimit-Reset header of the 429 response and wait until that time has passed before retrying.
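
A minimal sketch of such a retry loop is shown below. To keep it independent of any particular HTTP library, it takes a hypothetical `send` callable that performs the request and returns a `(status_code, headers)` tuple; the function names, the `max_attempts` parameter, and the injectable `sleep` and `clock` arguments are assumptions made for illustration.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait until the X-RateLimit-Reset epoch time (never negative)."""
    now = time.time() if now is None else now
    return max(0.0, int(headers["X-RateLimit-Reset"]) - now)


def call_with_retry(send, max_attempts=3, sleep=time.sleep, clock=time.time):
    """Call `send()` and retry on 429 after waiting for the reset time.

    `send` is a hypothetical callable returning (status_code, headers).
    The last response is returned if every attempt is rate limited.
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            sleep(seconds_until_reset(headers, now=clock()))
    return status, headers
```

Capping the number of attempts keeps a persistently throttled client from retrying forever; the caller can then surface the 429 response as an error.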

Advanced solutions

For more advanced solutions, consider using algorithms like exponential backoff to decrease the request rate when needed. Adding some randomness to the backoff schedule is recommended to prevent a thundering herd effect.
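
One possible way to compute such a delay is sketched below: the upper bound doubles with each retry up to a cap, and the actual delay is drawn at random between zero and that bound. The function name and the `base`, `cap`, and `rng` parameters are illustrative assumptions, not values recommended by the API.

```python
import random


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based), with jitter.

    The ceiling doubles with each attempt up to `cap`; the returned delay is
    a random value between 0 and that ceiling.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Example: randomized delays for the first five retries.
delays = [backoff_delay(n) for n in range(5)]
```

Because each client draws a different random delay, clients that were throttled at the same moment are less likely to retry at the same moment.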