To maintain platform stability and fair usage, we enforce rate limits on all API requests using a token bucket algorithm.

How it works

The system uses buckets of tokens to manage how many requests each client can make:
  • Each bucket has a maximum capacity (how many tokens it can hold, which also sets the largest burst of requests you can send at once).
  • Each request consumes tokens from a shared rate-limit bucket tied to your organization. This means all API keys under the same org count toward the same limits.
  • Tokens are refilled automatically over time, at a set rate per second.
  • When you make a request, 1 token is consumed.
  • If your bucket runs out of tokens, your request is rejected until enough tokens have been refilled.
This lets you send short bursts of traffic when needed, without blocking you for a fixed cooldown period.
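
The snippet below is a minimal, illustrative Python sketch of the token-bucket behavior described above. It is not the service's implementation; the class and parameter names (TokenBucket, capacity, refill_rate) are ours.

import time

class TokenBucket:
    """Illustrative token bucket: refills continuously, spends 1 token per request."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity          # maximum tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity            # start full, allowing an initial burst
        self.last_refill = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        elapsed = now - self.last_refill
        # Add tokens for the elapsed time, capped at the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_consume(self, tokens: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it would be rejected (429)."""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

# Example: a bucket shaped like the BASE / DEFAULT limits (capacity 50, 5 tokens/sec).
bucket = TokenBucket(capacity=50, refill_rate=5)
print(bucket.try_consume())  # True while tokens remain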

Identifier Scope

Rate limits are applied per organization, not per IP address. Whenever possible, the rate limiter uses the most specific and reliable identifier available to track usage:
  • Organization ID – Primary rate limit scope (tied to your API key)
  • API Key – First fallback, used when the organization cannot be determined from the request
  • User ID – Second fallback, used when neither organization nor API key context is available
  • IP Address – Only used as a last resort (e.g., unauthenticated or anonymous requests)
As a result, all requests made using the same organization’s API keys will share the same rate-limiting bucket.
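
A short sketch of that fallback order in Python (the function and field names here are hypothetical; the service may resolve identifiers differently internally):

from typing import Optional

def rate_limit_key(org_id: Optional[str],
                   api_key: Optional[str],
                   user_id: Optional[str],
                   ip_address: str) -> str:
    """Pick the most specific identifier available, falling back to the IP address."""
    if org_id:
        return f"org:{org_id}"
    if api_key:
        return f"key:{api_key}"
    if user_id:
        return f"user:{user_id}"
    return f"ip:{ip_address}"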

Token buckets by tier

Rate limits vary based on your organization’s tier. Each tier defines different capacities and refill rates depending on the type of request.

Available request types

  • DEFAULT: General-purpose requests (e.g., products, invoices, webhooks)
  • PAYMENTS: High-sensitivity endpoints such as payment creation
  • AUTH: Authentication-related requests (e.g., token exchanges, sessions)

Tier limits

Tier     Type      Capacity   Refill Rate (tokens/sec)
BASE     DEFAULT   50         5
BASE     PAYMENTS  10         1
BASE     AUTH      5          1
TIER_1   DEFAULT   150        15
TIER_1   PAYMENTS  50         5
TIER_1   AUTH      5          1
TIER_2   DEFAULT   450        45
TIER_2   PAYMENTS  250        50
TIER_2   AUTH      5          1
TIER_3   DEFAULT   1000       100
TIER_3   PAYMENTS  500        100
TIER_3   AUTH      5          1
If you’re unsure about your current tier or want to upgrade, please contact [email protected].
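
If it is convenient to keep these values in code, the table above can be mirrored as a simple lookup structure. This is only a convenience sketch (the TIER_LIMITS name and tuple layout are ours):

# (capacity, refill_rate_per_sec) per tier and request type, mirroring the table above.
TIER_LIMITS = {
    "BASE":   {"DEFAULT": (50, 5),     "PAYMENTS": (10, 1),    "AUTH": (5, 1)},
    "TIER_1": {"DEFAULT": (150, 15),   "PAYMENTS": (50, 5),    "AUTH": (5, 1)},
    "TIER_2": {"DEFAULT": (450, 45),   "PAYMENTS": (250, 50),  "AUTH": (5, 1)},
    "TIER_3": {"DEFAULT": (1000, 100), "PAYMENTS": (500, 100), "AUTH": (5, 1)},
}

capacity, refill_rate = TIER_LIMITS["TIER_1"]["PAYMENTS"]  # (50, 5)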

Retry Behavior

When your bucket runs out of tokens:
  • Your request will return an error with a 429 Too Many Requests status.
  • The response will include a Retry-After header (in seconds), which tells you how long to wait before retrying.
Implementing retry logic with exponential backoff is strongly recommended to handle rate-limited responses gracefully.
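
A minimal sketch of such retry logic, using the requests library and honoring the Retry-After header when present (the URL and API key below are placeholders, not real endpoints):

import random
import time

import requests

def request_with_retry(url: str, max_retries: int = 5, **kwargs) -> requests.Response:
    """GET with exponential backoff, honoring the Retry-After header on 429 responses."""
    for attempt in range(max_retries):
        response = requests.get(url, **kwargs)
        if response.status_code != 429:
            return response
        # Prefer the server's hint (seconds); otherwise back off exponentially with jitter.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else (2 ** attempt) + random.random()
        time.sleep(delay)
    return response  # give up after max_retries attempts

# Example usage (placeholder URL and key):
# resp = request_with_retry("https://api.example.com/v1/products",
#                           headers={"Authorization": "Bearer <API_KEY>"})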

Tips for Developers

  • Group related operations to reduce the number of requests.
  • Cache frequently accessed data instead of refetching it constantly.
  • Monitor response headers for usage patterns and implement alerts when nearing limits.
  • Use the Retry-After value returned with the error response to delay your retry appropriately.

Example Error Response (Rate Limit)

{
  "status": 429,
  "error": "Rate limit exceeded, try again in 5 seconds",
  "log": {
    "support": "Reach out to [email protected] to request a higher tier, or upgrade your plan. Read more at docs.cci.prem.io.",
    "rate_limit": {
      "resource": "example_resource",
      "tier": "BASE",
      "type": "DEFAULT"
    }
  }
}
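
If you want to surface these details to operators or logs, a small parsing sketch (field names match the example payload above; the response object is assumed to come from the requests library):

def describe_rate_limit_error(response) -> str:
    """Build a readable message from a 429 error body shaped like the example above."""
    body = response.json()
    info = body.get("log", {}).get("rate_limit", {})
    return (f"{body.get('error', 'Rate limit exceeded')} "
            f"(tier={info.get('tier')}, type={info.get('type')}, "
            f"resource={info.get('resource')})")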