Incident management
This document explains how team members working on VA.gov report incidents to the Platform, and what information to provide so that reported issues can be routed to the appropriate team. Read it carefully to determine the right path for your incident.
Is this a Severity 1 Issue? - Notify On-Call Immediately
If possible, determine whether the issue is a major outage of va.gov or a limited-impact incident.
Examples:
http://va.gov is unresponsive
Platform errors causing extreme delays for users
Login to http://va.gov is unavailable
Alert the Incident Commander.
In the #vfs-platform-support Slack channel, trigger an alert with the command:
/pd trigger
Treat an incident as Severity 1 when it meets criteria such as the following (the list is not exhaustive):
Issue resolution is required within a 24-hour period
Part of (or all of) va.gov is severely disrupted
In these cases, immediately escalate to #oncall.
When to Report an Issue to be Triaged
General Rule: As stated below, VFS teams are responsible for finding and fixing bugs in the products within their jurisdiction. However, report issues if they meet any of the following criteria:
Seems systemic
Seems related to a product or service provided by the Platform
Seems related to a product managed by another VFS team
Has an unknown source and is causing problems for VA.gov users
Examples:
An internal load testing tool is broken
Mock data is not working or is out of date
Metrics are being reported incorrectly or not reported at all
You are the Global UX team and you learn in research sessions that a lot of Veterans are having trouble accessing their education benefits
How to Report the Issue to be Triaged:
Choose one of the following:
If you know which team the issue should be routed to, reach out to their point of contact to confirm and directly assign the issue to that team.
If you aren't sure which team owns the issue and would like to send it directly to them without the assistance of Triage, the Product Directory can help guide you in the right direction.
If you aren't sure which team owns the issue and want to submit it to the Platform for triage, submit a GitHub issue using our Triage Incident Template.
NOTE: Is there already a GitHub Issue? Add the following labels and post in #vfs-platform-support for visibility:
triage
triage-incident
We will not resolve every issue reported using the Triage Incident Template. We use the information in the ticket to investigate and determine which team is best placed to resolve it.
When to NOT report the issue to be Triaged
Bugs in products under your team's jurisdiction, including endpoints and integrations with APIs
Feature improvements that belong to your own team
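The reporting criteria above can be read as a simple "any of these is true" check. The following is only an illustrative sketch, not a Platform tool: the Issue class, its field names, and the should_report_to_triage function are all hypothetical and exist only to restate the rule in code.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    # Hypothetical flags, one per reporting criterion listed in this document.
    seems_systemic: bool = False
    related_to_platform_product: bool = False
    related_to_other_vfs_team_product: bool = False
    unknown_source_affecting_users: bool = False


def should_report_to_triage(issue: Issue) -> bool:
    """Return True if the issue meets at least one reporting criterion.

    Issues that meet none of the criteria (for example, a bug or feature
    improvement in your own team's product) stay with your team.
    """
    return any(
        [
            issue.seems_systemic,
            issue.related_to_platform_product,
            issue.related_to_other_vfs_team_product,
            issue.unknown_source_affecting_users,
        ]
    )


# A bug confined to your own product meets none of the criteria.
own_bug = Issue()
# Broken mock data is related to a Platform-provided service.
mock_data_broken = Issue(related_to_platform_product=True)
```

Under these assumptions, should_report_to_triage(own_bug) is False and should_report_to_triage(mock_data_broken) is True.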
Still not sure?
If you are still unsure where to report your incident, reach out in the #vsp-triage Slack channel and we will be happy to assist you.
When in doubt, submit the issue through our Triage Incident Template and we’ll ensure it gets through the process correctly!
How Reported Issues will be Triaged
The Platform
We will resolve issues with products/systems that fall under Platform ownership. See the Product Directory to learn which products and systems the Platform owns.
VFS teams
VFS teams will resolve all issues with Veteran-facing Services (including endpoints and integrations with APIs). Triage assigns a ZenHub issue to the POC of the VFS team whose product is experiencing issues, per the ownership indicated in the Product Directory.
If the Product Directory does not indicate a VFS owner for a service, Triage will assign the issue to Chris Johnston.
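The routing rule described above amounts to a lookup with a fallback. The sketch below is a hedged illustration only: the route_issue function, the dictionary standing in for the Product Directory, and the example product and team names are all hypothetical. Only the fallback assignee comes from the text above.

```python
from typing import Mapping, Optional

# Fallback assignee named in this document for services with no listed owner.
FALLBACK_ASSIGNEE = "Chris Johnston"


def route_issue(product: str, directory: Mapping[str, Optional[str]]) -> str:
    """Return who should receive the issue for the given product.

    `directory` is a hypothetical stand-in for the Product Directory:
    it maps a product name to its owning team ("Platform" or a VFS team),
    or to None when no owner is indicated.
    """
    owner = directory.get(product)
    if owner is None:
        # No owner indicated (or product not listed): use the fallback.
        return FALLBACK_ASSIGNEE
    return owner


# Hypothetical directory entries, for illustration only.
example_directory = {
    "example-build-tool": "Platform",
    "example-benefits-form": "Example VFS Team",
    "example-unowned-service": None,
}
```

With this example data, route_issue("example-benefits-form", example_directory) returns "Example VFS Team", while an unowned or unlisted product falls through to the fallback assignee.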
Help and feedback
Get help from the Platform Support Team in Slack.
Submit a feature idea to the Platform.