Rate Limiting Guide in Vets API
A complete guide: when to add rate limiting and how to implement it.
Part 1: When Should Your Team Add Rate Limiting?
Rate limiting in vets-api is opt-in. Not every endpoint needs it. Use the guidance below to decide whether to add a Rack::Attack rule for your endpoint.
First: Understand the Two Types of 429s
Before adding rate limiting, distinguish between two different sources of 429 errors in vets-api:
Type | Description |
Rack::Attack throttles (inbound) | vets-api itself rejects requests before they reach your endpoint. Protects vets-api from abusive or excessive inbound traffic. |
Upstream 429s (outbound) | An external service (e.g. Lighthouse) returns a 429 to vets-api because your service is calling it too frequently. These are NOT solved by Rack::Attack — they require retry logic, caching, or coordination with the upstream service. |
Real Example (May 2026): benefits_documents/service generated 376 HTTP 429 errors over a 3-week period. Investigation showed the referrers were almost entirely va.gov/track-claims/your-claim-letters: real veterans checking their claim letters, many immediately after login. This was Lighthouse rate limiting vets-api’s outbound calls, not inbound abuse. Adding a Rack::Attack rule here would have blocked legitimate users. The correct fix is caching, retry logic, or working with the Lighthouse team to increase their rate limit.
How to tell the difference: If you’re seeing 429s in your logs but your endpoint isn’t in rack_attack.rb, check the referrer and controller in Datadog. User-facing referrers (e.g. va.gov/track-claims/*) with real controller names point to an upstream issue, not inbound abuse.
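For upstream 429s, retry logic lives in the calling service rather than in Rack::Attack. The following is a minimal sketch of retry with exponential backoff in plain Ruby. The `UpstreamRateLimited` error class and the `with_upstream_retry` helper are hypothetical names used for illustration; substitute whatever error your client raises when the upstream service returns a 429.

```ruby
# Hypothetical error standing in for whatever your client raises on an upstream 429.
class UpstreamRateLimited < StandardError
  attr_reader :retry_after

  def initialize(retry_after: nil)
    super('upstream service returned 429')
    @retry_after = retry_after # seconds suggested by the upstream, if any
  end
end

# Runs the block, retrying when it raises UpstreamRateLimited.
# `sleeper` is injectable so the helper can be exercised without real waiting.
def with_upstream_retry(max_attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue UpstreamRateLimited => e
    raise if attempt >= max_attempts # give up and surface the error

    # Prefer the upstream's hint; otherwise double the delay on each attempt.
    sleeper.call(e.retry_after || (base_delay * (2**(attempt - 1))))
    retry
  end
end
```

Keep `max_attempts` small for user-facing requests: each retry adds latency, and aggressive retries can make upstream rate limiting worse.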
Should You Add a Rack::Attack Rule?
Ask yourself the following questions:
1. Is your endpoint unauthenticated or lightly authenticated?
Unauthenticated endpoints are the highest priority for rate limiting. Without authentication, there’s no barrier to abuse. See representation_management/next_steps_email as an example — without throttling it functioned as an open email relay.
2. Does your endpoint trigger expensive downstream calls?
If a single request fans out to multiple upstream services (e.g. Lighthouse FHIR APIs), a high request rate can cascade into upstream rate limit exhaustion. Consider rate limiting to protect both vets-api and your upstream dependencies.
3. Does your endpoint accept file uploads or send external communications?
File upload endpoints and anything that sends emails, notifications, or triggers external actions should be rate limited to prevent abuse and resource exhaustion.
4. Has your endpoint experienced a traffic spike or near-DoS incident?
Most existing Rack::Attack rules were added reactively after incidents. Don’t wait for an incident — if your endpoint is publicly accessible and handles sensitive operations, add a rule proactively.
5. Is your endpoint part of a form submission flow?
Form submission endpoints (POST) are good candidates for rate limiting. A legitimate user rarely submits a form more than a few times per minute, so a limit of 15–30 submissions per minute leaves ample headroom.
You Probably Don’t Need Rack::Attack If…
Your endpoint is fully authenticated and only accessible to credentialed users
Your endpoint is read-only with low computational cost and no upstream fan-out
Traffic to your endpoint is low and stable with no history of abuse
You’re seeing 429s that trace back to upstream services rather than inbound request volume
Quick Decision Reference
Scenario | Priority | Suggested Limit |
Unauthenticated POST (email, form) | High | 5–15/min |
File upload | High | 8/5min |
Form submission | Medium | 15–30/min |
Read endpoint with upstream calls | Medium | 20–30/min |
High-volume lookup (e.g. facility search) | Medium | 30/min |
Authenticated, read-only, low traffic | Low/None | Probably no rule needed |
When in doubt, reach out to the Platform SRE team in #vfs-platform-support on Slack or open a support request. They can help review your endpoint’s traffic patterns in Datadog and recommend an appropriate limit.
Part 2: How to Implement Rate Limiting
Rate limiting is configured in config/initializers/rack_attack.rb using the Rack::Attack gem. There is no global rate limiting — it is added per-endpoint as needed.
When to Add Rate Limiting (Checklist)
Rate limiting should be considered when:
Your endpoint is publicly accessible
The endpoint calls expensive upstream services
The endpoint could be abused to cause denial of service
A Staging Review or Security Review requires it
Reference: The Security Review checklist includes “Rate limits defined” as a required item.
How to Add Rate Limiting
Add a throttle block to config/initializers/rack_attack.rb:
throttle('your_endpoint_name/ip', limit: 10, period: 1.minute) do |req|
  req.remote_ip if req.path.starts_with?('/your/endpoint/path')
end
Configuration Options
Parameter | Description | Notes |
limit | Maximum requests allowed in the period | |
period | Time window | e.g. 1.minute, 5.minutes |
req.remote_ip | Use this (not req.ip) since we’re behind a load balancer | Preferred over req.ip |
req.path | Can use == for exact match or .starts_with? for prefix | |
req.get? / req.post? | Optional: filter by HTTP method | |
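The path and method options above can be combined in one discriminator. The sketch below is illustrative only: the `/v0/example_form` path is hypothetical, and `FakeRequest` is a stand-in for the request object Rack::Attack yields, included so the lambda can be exercised outside of Rails.

```ruby
# Stand-in for the request object Rack::Attack yields (hypothetical, for illustration).
FakeRequest = Struct.new(:path, :request_method, :remote_ip) do
  def post?
    request_method == 'POST'
  end
end

# Returns the client IP (the throttle key) only for POSTs to one exact path.
# Returning nil means the request is not counted against this throttle.
EXAMPLE_FORM_DISCRIMINATOR = lambda do |req|
  req.remote_ip if req.post? && req.path == '/v0/example_form'
end

# In config/initializers/rack_attack.rb this could be wired up as:
#   throttle('example_form/ip', limit: 15, period: 1.minute, &EXAMPLE_FORM_DISCRIMINATOR)
```

An exact match with `==` avoids accidentally throttling sibling routes that share the same prefix.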
What Happens When Rate Limited
Returns HTTP 429 Too Many Requests
Includes headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
Response body: “throttled”
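A client that receives this response can use the headers to decide how long to wait. The helper below is a sketch under one stated assumption: that X-RateLimit-Reset carries a Unix timestamp in seconds. Confirm the actual format against the throttled responder in rack_attack.rb before relying on it. The method name is hypothetical.

```ruby
# Returns how many seconds a client should wait before retrying.
# Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
def seconds_until_reset(status, headers, now: Time.now)
  return 0 unless status == 429

  reset_at = headers['X-RateLimit-Reset'].to_i # nil.to_i is 0 when the header is missing
  [reset_at - now.to_i, 0].max                 # never return a negative wait
end
```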
Part 3: Determining the Right Rate Limit
For New Endpoints (No Existing Traffic Data)
When launching a new endpoint, you won’t have production traffic data to analyze. Here’s how to approach rate limiting without historical data:
Step 1: Model the User Journey
Map out realistic usage scenarios. Your rate limit should accommodate the power user scenario with headroom.
Scenario | Calculation | Requests/min |
Normal user | 1 page load = 2 API calls, user visits 3 pages/min | 6 |
Power user | Rapid searching/filtering, 10 actions/min | 20 |
Automated refresh | Page polls every 30 seconds | 2 |
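The arithmetic behind the table can be written out as a small Ruby sketch. The power-user row assumes each action makes 2 API calls, matching the page-load assumption in the first row. The 3x headroom default follows the "2-3x expected peak usage" launch guidance in Step 4; `launch_limit` is a hypothetical helper name.

```ruby
# Per-minute request estimates for each scenario in the table.
REQUESTS_PER_MINUTE = {
  normal_user: 2 * 3,        # 2 API calls per page load x 3 page loads per minute
  power_user: 10 * 2,        # 10 actions per minute, assuming 2 API calls per action
  automated_refresh: 60 / 30 # one poll every 30 seconds
}.freeze

# Suggests a launch limit: the busiest scenario multiplied by a headroom factor.
def launch_limit(requests_per_minute, headroom: 3)
  (requests_per_minute.values.max * headroom).ceil
end

launch_limit(REQUESTS_PER_MINUTE) # => 60
```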
Step 2: Find a Similar Endpoint for Reference
Your Endpoint Type | Similar Existing Endpoint | Their Limit |
Search/lookup | facilities_api/v2/va | 30/min |
Form submission | education_benefits_claims | 15/min |
File operations | vic/profile_photo_attachments | 8/5min |
Status/polling | medical_copays | 20/min |
Step 3: Check Upstream Service Constraints
If your endpoint calls external services, their limits set your ceiling:
Your rate limit ≤ Upstream service limit / Expected concurrent users
Contact the upstream service team to understand their constraints.
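As a worked example of the ceiling formula, here is a minimal sketch. The input numbers are hypothetical, and `per_client_ceiling` is an illustrative helper, not part of vets-api.

```ruby
# Upper bound on a per-client limit so that all concurrent clients together
# stay within the upstream budget. Both values must use the same time period.
def per_client_ceiling(upstream_limit:, concurrent_users:)
  raise ArgumentError, 'concurrent_users must be positive' unless concurrent_users.positive?

  upstream_limit / concurrent_users # integer division rounds down, the safe direction
end

# Hypothetical: upstream allows 100 requests/min and 5 users are active at once.
per_client_ceiling(upstream_limit: 100, concurrent_users: 5) # => 20
```

If the ceiling comes out lower than what a normal user needs, a Rack::Attack rule alone will not solve the problem; caching or a higher upstream limit is needed.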
Step 4: Start High and Plan to Adjust
Recommended approach for new endpoints:
# Phase 1: Launch with permissive limit (2-3x expected peak usage)
throttle('new_endpoint/ip', limit: 60, period: 1.minute) do |req|
  req.remote_ip if req.path.starts_with?('/v0/new_endpoint')
end
Then follow this timeline:
Week | Action |
1–2 | Monitor traffic patterns in Datadog, no changes |
3 | Analyze P95 usage, identify if limit is too high |
4+ | Adjust limit based on actual data |
Step 5: Add Monitoring From Day One
Deploy with Datadog monitoring so you can adjust quickly:
# In your controller or service
StatsD.increment('api.new_endpoint.request', tags: ["ip:#{request.remote_ip}"])
Step 6: Document Your Assumptions
In your PR, document:
Expected user behavior and request patterns
Similar endpoints used as reference
Upstream service constraints (if any)
Plan for adjusting limits post-launch
Example PR description:
Rate limit set to 30/min based on:
• Similar to facilities_api endpoint (30/min)
• Expected max 10 requests/min for power users
• Upstream service X has 100/min limit
• Will review after 2 weeks of production traffic
For Existing Endpoints (With Traffic Data)
If your endpoint already exists and has traffic, you can use Datadog to make data-driven decisions.
Step 1: Analyze Expected User Behavior
Think through the user journey:
How many times would a legitimate user hit this endpoint in a session?
Is it called once per page load? Multiple times during form submission?
Are there any frontend polling patterns?
Example: If a user searches for facilities and might refine their search 5–6 times, and each search makes 2 API calls, that’s ~12 requests in a few minutes for an active user.
Step 2: Check Existing Traffic in Datadog
Before adding rate limiting, query Datadog for current traffic patterns:
# Requests per IP per minute
sum:vets_api.requests{path:/your/endpoint/*} by {client_ip}.rollup(count, 60)
Look for:
P95/P99 requests per IP per minute — what do normal heavy users look like?
Max requests per IP — what do potential abusers look like?
Distribution — is there a clear gap between normal and abnormal traffic?
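If you export per-IP counts for offline analysis, the percentiles above can be computed with a few lines of Ruby. This is a sketch using the nearest-rank method; the sample data is invented for illustration and `percentile` is a hypothetical helper.

```ruby
# Nearest-rank percentile of a list of per-IP requests-per-minute counts.
def percentile(counts, pct)
  raise ArgumentError, 'counts must not be empty' if counts.empty?

  sorted = counts.sort
  rank = (pct * sorted.length / 100.0).ceil # 1-based rank of the percentile value
  sorted[[rank, 1].max - 1]
end

# Hypothetical sample: 19 ordinary clients and one outlier.
counts = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 10, 12, 150]

p95 = percentile(counts, 95) # heaviest normal users
max = counts.max             # potential abuser
```

In this sample the P95 is 12 and the maximum is 150, so a limit anywhere in the gap between them would stop the outlier without affecting normal users.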
Step 3: Start Permissive, Then Tighten
Phase | Limit | Purpose |
1. Monitor only | None | Add logging/metrics to track what would be rate limited |
2. High limit | 100/min | Catch only obvious abuse |
3. Tighten | 30–50/min | Based on observed normal traffic |
4. Final | 10–20/min | If needed, based on upstream limits |
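To make the "monitor only" phase concrete, the sketch below models a fixed-window counter that never blocks and only records which requests would have been rejected. It is a stand-alone illustration with a hypothetical class name, not Rack::Attack's implementation. Rack::Attack also provides a `track` rule that can serve this purpose; check the gem's README for current usage.

```ruby
# Fixed-window counter for "monitor only" mode: it never blocks a request,
# it only reports whether a real throttle would have rejected it.
class DryRunThrottle
  attr_reader :would_block

  def initialize(limit:, period:)
    @limit = limit
    @period = period      # window length in seconds
    @counts = Hash.new(0) # [key, window index] => requests seen (never evicted in this sketch)
    @would_block = 0      # total requests a real throttle would have rejected
  end

  # Records one request for `key` at time `now`; returns true if it exceeds the limit.
  def record(key, now: Time.now)
    window = now.to_i / @period
    count = (@counts[[key, window]] += 1)
    over = count > @limit
    @would_block += 1 if over
    over
  end
end
```

Comparing `would_block` against total traffic for a candidate limit shows how many requests the next phase would reject before any user is affected.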
Step 4: Consider Upstream Service Limits
If your endpoint calls an external service (PPMS, Lighthouse, etc.):
What are their rate limits?
Your limit should be lower than theirs to protect the upstream service
Step 5: Environment-Specific Limits
You can exclude non-production environments from rate limiting:
throttle('your_endpoint/ip', limit: 10, period: 1.minute) do |req|
  req.remote_ip if req.path.starts_with?('/your/endpoint') &&
                   !Settings.vsp_environment.match?(/local|development|staging/)
end
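When several rules share the same environment check, it may be cleaner to pull it into a named predicate. The following is a minimal sketch; `throttle_enabled?` is a hypothetical helper, and in vets-api the argument would be Settings.vsp_environment.

```ruby
# Environments where throttling stays disabled (same pattern as the rule above).
NON_PRODUCTION = /local|development|staging/

# True when rate limiting should apply in the given environment name.
# The pattern is unanchored, so names such as "localhost" also count as non-production.
def throttle_enabled?(environment)
  !environment.to_s.match?(NON_PRODUCTION)
end

# Inside a throttle block this could read:
#   req.remote_ip if req.path.starts_with?('/your/endpoint') &&
#                    throttle_enabled?(Settings.vsp_environment)
```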
Part 4: Reference
Safe Starting Points
Endpoint Type | Safe Starting Limit | Rationale |
Read-only GET | 30–60/min | Users may browse/search repeatedly |
Form submission POST | 15–20/min | Deliberate actions, allow for retries |
File upload | 10/5min | Heavy operations, natural user throttling |
Shared with other apps | Coordinate with teams first | Avoid breaking partner integrations |
Existing Rate Limits in rack_attack.rb
Endpoint | Limit | Period | Notes |
facilities_api/v2/va | 30 | 1 min | Added after DoS incident |
facilities_api/v2/ccp/provider | 8 | 1 min | PPMS protection |
vic/profile_photo_attachments (GET) | 8 | 5 min | Download limit |
vic/profile_photo_attachments (POST) | 8 | 5 min | Upload limit |
vic/supporting_documentation_attachments | 8 | 5 min | Upload limit |
vic/vic_submissions | 10 | 1 min | Form submission |
check_in | 10 | 1 min | Excludes local/dev/staging |
medical_copays (GET) | 20 | 1 min | Read operations |
education_benefits_claims (POST) | 15 | 1 min | Form submission |
form214192 (POST) | 30 | 1 min | Form submission |
form21p530a (POST) | 30 | 1 min | Form submission |
form210779 (POST) | 30 | 1 min | Form submission |
form212680 (POST) | 30 | 1 min | Form submission |
vaos/v2/appointments (GET/POST/PUT) | 30 | 1 min | VAOS appointments |
vaos/v2/providers (GET) | 30 | 1 min | VAOS providers |
vaos/v2/locations (GET) | 30 | 1 min | VAOS clinics |
vaos/v2/community_care/eligibility (GET) | 30 | 1 min | VAOS CC eligibility |
vaos/v2/eligibility (GET) | 30 | 1 min | VAOS patient eligibility |
vaos/v2/scheduling/configurations (GET) | 30 | 1 min | VAOS scheduling |
vaos/v2/facilities (GET) | 30 | 1 min | VAOS facilities |
vaos/v2/relationships (GET) | 30 | 1 min | VAOS relationships |
ask_va_api/v0/zip_state_validation (POST) | 60 | 1 min | Production only |
ask_va_api/v0/diagnostics (GET) | 30 | 1 min | |
representation_management/v0/next_steps_email (POST) | 5 | 1 min | Per IP; prevents open relay |
representation_management/v0/next_steps_email (POST) | 3 | 1 hour | Per destination email address |
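The last two rows show that a throttle key does not have to be an IP address. The sketch below illustrates how one endpoint could be keyed two ways. It is not the actual code in rack_attack.rb: the full path, the `email_address` parameter name, and the `FakeEmailRequest` struct are all assumptions made for illustration.

```ruby
# Stand-in for the request object Rack::Attack yields (hypothetical, for illustration).
FakeEmailRequest = Struct.new(:path, :request_method, :remote_ip, :params) do
  def post?
    request_method == 'POST'
  end
end

# Assumed path; confirm against the real route before reuse.
NEXT_STEPS_PATH = '/representation_management/v0/next_steps_email'

def next_steps_request?(req)
  req.post? && req.path == NEXT_STEPS_PATH
end

# Key 1: the sender's IP, which limits how fast one client can send.
PER_IP_KEY = ->(req) { req.remote_ip if next_steps_request?(req) }

# Key 2: the normalized destination address, which limits how often one inbox is hit.
PER_EMAIL_KEY = lambda do |req|
  next unless next_steps_request?(req)

  email = req.params['email_address'].to_s.strip.downcase # assumed parameter name
  email unless email.empty?
end
```

Normalizing the address before using it as a key prevents trivial bypasses through changes in case or surrounding whitespace.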
Monitoring After Deployment
Set up a Datadog dashboard to track:
429 responses — How often is the limit being hit?
Unique IPs hitting limits — Is it one bad actor or many users?
Requests just below limit — Are legitimate users getting close?
Datadog query examples:
# Count of 429 responses
sum:vets_api.response{status:429,path:/your/endpoint/*}.as_count()
# Unique IPs hitting rate limits
count_distinct:vets_api.requests{status:429,path:/your/endpoint/*} by {client_ip}
Safe Rollout Strategy
Start with a limit of 2–3x the request rate you expect from a heavy user (e.g., if you expect at most 10 requests per minute, set the limit to 30)
Deploy to production with monitoring enabled
Watch Datadog for 1–2 weeks to observe actual traffic patterns
Tighten the limit based on observed data
Document your rationale in the PR for future reference
Additional Resources
Sidekiq Enterprise Rate Limiting (for worker-level rate limiting)
Questions?
Reach out in #vfs-platform-support on Slack.