Platform Support Incident Report - May 15, 2026 (All Incidents)
PLATFORM-SUPPORT INCIDENT/OOB TICKETS RESEARCH REPORT - COMPREHENSIVE
Repository: va.ghe.com/software/va.gov-team
Report Period: November 15, 2024 - May 15, 2026 (18 months)
Report Generated: May 15, 2026
EXECUTIVE SUMMARY
| Metric | Value |
|---|---|
| Total Incidents | 100 |
| Open Incidents | 18 |
| Closed Incidents | 82 |
| Closure Rate | 82% |
| Avg Resolution (Closed) | 15.7 days |
| Oldest Open | 100.9 days (#131972) |
| Most Recent | 1.0 day (#142367) |
OPEN INCIDENTS TABLE (18 Total)
| # | Issue | Status | Title | Created | Days Open | Team | Slack |
|---|---|---|---|---|---|---|---|
| 1 | #142367 | 🟢 NEW | Application onboarding workflow failed | 2026-05-14 | 1.0d | Tier 1 | |
| 2 | #142338 | 🟢 NEW | Revert PR needed for prod deploy | 2026-05-14 | 1.1d | Tier 1 | |
| 3 | #142234 | 🟢 NEW | MHV Medical Records error spike | 2026-05-13 | 2.0d | Tier 1 | |
| 4 | | 🟢 NEW | Hosted runners Terraform error | 2026-05-12 | 3.1d | Tier 1 | |
| 5 | | 🔵 RECENT | MHV Tier 3 support ticket issue | 2026-05-08 | 7.2d | Tier 1 | |
| 6 | | 🔵 RECENT | Vets-api local bundle install error | 2026-05-08 | 7.2d | Tier 1 | |
| 7 | | 🔵 RECENT | MEB sign-in with test users | 2026-05-07 | 7.9d | Tier 1 | |
| 8 | #140878 | 🟡 ACTIVE | Hosted runner cert issue | 2026-05-01 | 14.1d | Tier 1 | |
| 9 | #140877 | 🟡 ACTIVE | Pipeline check failing on PR | 2026-05-01 | 14.1d | DevOps | |
| 10 | #140842 | 🟡 ACTIVE | PR ESLint check failure post-GHE | 2026-05-01 | 14.2d | Tier 1 | |
| 11 | #140841 | 🟡 ACTIVE | Staging rake task repo access | 2026-05-01 | 14.2d | Tier 1 | |
| 12 | | 🟡 ACTIVE | Production Rails console access | 2026-05-01 | 14.2d | Frontend | |
| 13 | #140394 | 🟡 ACTIVE | EventBus build failure AWS ECR denied | 2026-04-28 | 17.1d | Tier 1 | |
| 14 | #140367 | 🟡 ACTIVE | All va.gov-team PRs link validation fail | 2026-04-28 | 17.2d | Tier 1 | |
| 15 | | 🟠 URGENT | Alert noise - Synthetic & PGS alerts | 2026-04-07 | 38.0d | Tier 1 | |
| 16 | #137391 | 🔴 CRITICAL | Flipper sandbox redirect_uri error | 2026-03-24 | 51.8d | Backend | |
| 17 | #134545 | 🔴 CRITICAL | PingWind BIO staging performance | 2026-02-26 | 78.0d | Tier 1 | |
| 18 | #131972 | 🔴 CRITICAL | Facility Locator traffic spike | 2026-02-03 | 100.9d | Tier 1 | |
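The "Days Open" column is a fractional day count, which suggests it is derived from full timestamps rather than calendar dates. A minimal sketch of that calculation follows. The function name `days_open` and both times of day are assumptions for illustration; the table records only the creation date.

```python
from datetime import datetime, timezone

def days_open(created_at: datetime, as_of: datetime) -> float:
    """Elapsed time between creation and the report cutoff, in fractional days."""
    return round((as_of - created_at).total_seconds() / 86400, 1)

# Hypothetical timestamps: the report gives only the date 2026-02-03 for the
# oldest open incident, so the times of day here are illustrative assumptions.
created = datetime(2026, 2, 3, 2, 24, tzinfo=timezone.utc)
report_cutoff = datetime(2026, 5, 15, 0, 0, tzinfo=timezone.utc)

age = days_open(created, report_cutoff)
print(f"{age}d open")
```

Using timezone-aware timestamps avoids off-by-one-day drift when creation and cutoff times are recorded in different zones.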
CLOSED INCIDENTS TABLE (82 Total, 30 Detailed Rows)
| # | Issue | Title | Created | Closed | Days | Team |
|---|---|---|---|---|---|---|
| 1 | | PII spill to Datadog - 401 errors | 2026-01-15 | 2026-05-12 | 116.9d | Backend |
| 2 | | MAP integrations error rates | 2026-05-02 | 2026-05-03 | 0.1d | Tier 1 |
| 3 | | Vets-api down - api.va.gov unresponsive | 2026-04-07 | 2026-04-28 | 20.7d | Backend |
| 4 | | OOB request - vets-website revert | 2026-03-10 | 2026-04-28 | 48.7d | Frontend |
| 5 | | (Archived) Historic incident tracking | 2025-12-20 | 2026-04-28 | 128.8d | Tier 1 |
| 6 | | Eventbus-gateway service errors | 2026-01-20 | 2026-03-21 | 61.2d | Backend |
| 7 | | Homepage returning 404 errors | 2026-01-21 | 2026-03-25 | 63.7d | Frontend |
| 8 | | Brief vets-api outage | 2025-12-10 | 2026-02-16 | 68.9d | Backend |
| 9 | | Vets-api errors spike | 2026-03-30 | 2026-04-01 | 1.8d | Backend |
| 10 | | PII incident in Datadog RUM action | 2025-09-22 | 2025-10-08 | 16.1d | Security |
| 11 | | Vets-website prod CD deploy issue | 2026-01-29 | 2026-02-05 | 7.1d | Frontend |
| 12 | | Flipper 500 error | 2026-02-09 | 2026-02-10 | 1.2d | Backend |
| 13 | | External service request decrease | 2026-02-09 | 2026-02-18 | 8.8d | Backend |
| 14 | | CCD/DICOM downloads failing | 2026-01-30 | 2026-02-05 | 6.2d | Backend |
| 15 | | PagerDuty license request | 2026-03-05 | 2026-03-05 | 0.2d | DevOps |
| 16 | | Allergies Model API calls failing | 2026-01-16 | 2026-02-13 | 28.0d | Backend |
| 17 | | Lighthouse change undo request | 2026-01-27 | 2026-02-03 | 7.3d | Ops |
| 18 | | Veteran feedback issue | 2025-11-20 | 2025-11-21 | 1.0d | Tier 1 |
| 19 | | Shai-Hulud service account incident | 2025-12-16 | 2025-12-29 | 13.0d | Backend |
| 20 | | Incident in progress tracking | 2025-05-23 | 2025-05-29 | 6.1d | Tier 1 |
| 21 | | SiS success down to zero | 2025-07-30 | 2025-07-31 | 1.4d | Backend |
| 22 | #109387 | Bad representative persistence issue | 2025-05-08 | 2025-10-09 | 155.2d | Backend |
| 23 | | Possible production incident | 2025-05-09 | 2025-05-12 | 3.2d | Tier 1 |
| 24 | | Incident reporting access | 2025-04-16 | 2025-04-16 | 0.0d | Tier 1 |
| 25 | | PII incident resolution info | 2025-01-24 | 2025-02-03 | 10.2d | Backend |
| 26 | | Historic incident info request | 2025-01-06 | 2025-01-09 | 3.2d | Backend |
| 27 | | Not really an incident | 2025-02-14 | 2025-02-20 | 6.4d | Tier 1 |
| 28 | | Search service incident | 2024-09-23 | 2024-09-25 | 1.8d | Backend |
| 29 | | Related to recent incident | 2025-03-11 | 2025-03-14 | 3.1d | Backend |
| 30 | | Service issue spike | 2025-03-11 | 2025-03-11 | 0.0d | Tier 1 |
| 31-82 | (Additional) | (52 more closed incidents) | (Various) | (Various) | (1-90d) | |
KEY FINDINGS
Critical Open Issues (Action Required)
🔴 #131972 - 100.9 days open
Facility Locator API is receiving traffic from fake bot accounts, driving a spike in 404 errors
🔴 #134545 - 78.0 days open
PingWind BIO staging performance issues (intermittent, hard to reproduce)
🔴 #137391 - 51.8 days open
Flipper sandbox redirect_uri GitHub OAuth error
Production Impact
🔴 #142234 - MHV Medical Records endpoints DOWN (2 days)
🔴 #142338 - Production deploy BLOCKED by required revert (1 day)
Post-GHE Migration Cluster (April 28 - May 14)
7 incidents concentrated around the GHEC migration, including:
#140367: Link validation failures
#140394: AWS ECR build denial
#140841: Repository access issues
#140842: ESLint CI failures
#140877: Pipeline check failures
#140878: Certificate on hosted runners
Resolution Metrics
Fastest: 0.0d (#104917, #107733)
Slowest: 155.2d (#109387)
Average: 15.7d
Closure Rate: 82%
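The metrics above reduce to simple arithmetic over the incident counts and the per-incident "Days" values. A minimal sketch, assuming hypothetical helper names `closure_rate` and `resolution_summary`. The sample uses only six values copied from the closed incidents table, so its average is not expected to match the report-wide 15.7d.

```python
from statistics import mean

def closure_rate(closed: int, total: int) -> float:
    """Percentage of incidents that are closed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * closed / total

def resolution_summary(days_to_close: list) -> dict:
    """Fastest, slowest, and average resolution time, in days."""
    if not days_to_close:
        raise ValueError("at least one closed incident is required")
    return {
        "fastest": min(days_to_close),
        "slowest": max(days_to_close),
        "average": round(mean(days_to_close), 1),
    }

# Counts from the executive summary: 82 closed out of 100 total.
rate = closure_rate(closed=82, total=100)

# A partial sample of "Days" values from the closed incidents table.
sample = [116.9, 0.1, 20.7, 1.8, 0.0, 155.2]
summary = resolution_summary(sample)

print(f"Closure rate: {rate:.0f}%")
print(summary)
```

A median alongside the average may be worth reporting too, since a few long-running incidents (for example the 155.2d case) pull the mean upward.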
RECOMMENDATIONS
IMMEDIATE (24 Hours)
Escalate #131972, #134545, #137391 to leadership
Run incident response for the MHV Medical Records outage (#142234)
Unblock production deploy for #142338
SHORT-TERM (Week)
Root cause analysis (RCA) for all incidents open longer than 30 days
Post-migration remediation (GHE issues)
Access/permission audit
MEDIUM-TERM (Month)
SLA implementation (target: 15.7d)
Escalation process (7, 14, 30 day triggers)
Incident dashboard & automation
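The proposed 7, 14, and 30 day escalation triggers could be automated with a small lookup. This is a sketch under stated assumptions, not an existing tool: the function name `escalation_trigger` is hypothetical, and an incident is assumed to reach a trigger once its age is at or above that threshold.

```python
from bisect import bisect_right
from typing import Optional

# Day thresholds taken from the escalation recommendation.
TRIGGERS = (7, 14, 30)

def escalation_trigger(days_open: float) -> Optional[int]:
    """Return the highest trigger (in days) an incident has reached, or None."""
    reached = bisect_right(TRIGGERS, days_open)
    return TRIGGERS[reached - 1] if reached else None

# "Days Open" values copied from the open incidents table.
open_incidents = {
    "Application onboarding workflow failed": 1.0,
    "MEB sign-in with test users": 7.9,
    "Hosted runner cert issue": 14.1,
    "Alert noise - Synthetic & PGS alerts": 38.0,
    "Facility Locator traffic spike": 100.9,
}

for title, days in open_incidents.items():
    print(f"{title}: {days}d -> trigger {escalation_trigger(days)}")
```

Keeping the thresholds in one sorted tuple means a later change (for example adding a 60 day trigger) needs no change to the lookup logic.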
Data Source: GitHub API (va.ghe.com/software/va.gov-team)
Total Incidents: 100 (18 open, 82 closed)
Report Period: Nov 15, 2024 - May 15, 2026
Last Updated: May 15, 2026