
Incident management

This document helps team members working on VA.gov understand how to report incidents to the Platform and what information to provide so that reported issues can be routed to the appropriate teams. Read it carefully to determine the appropriate path for your incident.

Is this a Severity 1 Issue? - Notify On-Call Immediately

  1. Determine if the issue is a major outage of va.gov or a limited impact incident, if possible.

    1. Examples:

  2. Alert the Incident Commander.

    1. In the Slack channel #vfs-platform-support, trigger an alert with the command:
      /pd trigger

      PagerDuty Trigger Screen


An incident is considered Severity 1 when it meets criteria such as the following (the list is not exhaustive):

  • Issue resolution is required within a 24-hour period

  • Part of (or all of) va.gov is severely disrupted

In these cases, immediately escalate to #oncall.

When to Report an Issue to be Triaged

General Rule: As stated below, VFS teams are responsible for finding and fixing bugs in the products within their jurisdiction. However, report issues if they meet any of the following criteria:

  • Seems systemic

  • Seems related to a product or service provided by the Platform

  • Seems related to a product managed by another VFS team

  • Has an unknown source and is causing problems for VA.gov users

Examples:

  • An internal load testing tool is broken

  • Mock data is not working or is out of date

  • Metrics are being reported incorrectly or not reported at all

  • You are the Global UX team and you learn in research sessions that a lot of Veterans are having trouble accessing their education benefits

How to Report the Issue to be Triaged:

Choose one of the following:

  • If you know which team the issue should be routed to, reach out to their point of contact to confirm and directly assign the issue to that team.

  • If you aren't sure which team owns the issue and would like to send it directly to them without the assistance of Triage, the Product Directory can help guide you in the right direction.

  • If you aren't sure which team owns the issue and want to submit it to the Platform for triage, submit a GitHub issue using our Triage Incident Template.

    • NOTE: Is there already a GitHub Issue? Add the following labels and post in #vfs-platform-support for visibility:

      • triage

      • triage-incident

The Platform will not itself resolve every issue reported using the Triage Incident Template. We use the information in the ticket to investigate and determine which team is best placed to resolve it.

When to NOT report the issue to be Triaged

  • Bugs with products under your team's jurisdiction, including endpoints and integrations with APIs

  • Feature Improvements that belong within your own team
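
Taken together, the reporting criteria and the exclusions above amount to a simple check: an issue goes to Triage when it meets any one of the four criteria, and otherwise stays with the owning team. The sketch below is only an illustration of that rule. The `Issue` class and its field names are hypothetical and do not correspond to any Platform tool or template.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical summary of an issue; field names are illustrative only."""
    seems_systemic: bool = False
    seems_platform_related: bool = False
    seems_owned_by_other_vfs_team: bool = False
    unknown_source_affecting_users: bool = False


def should_report_to_triage(issue: Issue) -> bool:
    """Return True when the issue meets at least one reporting criterion."""
    return any([
        issue.seems_systemic,
        issue.seems_platform_related,
        issue.seems_owned_by_other_vfs_team,
        issue.unknown_source_affecting_users,
    ])
```

An issue that matches none of the criteria, such as a bug or feature improvement within your own team's product, returns `False` and is handled by your team directly.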

Still not sure?

If you are still unsure where to report your incident, reach out in the #vsp-triage Slack channel and we will be happy to assist you.

When in doubt, submit any issue through our Triage Incident Template and we’ll ensure it gets through the process correctly!

How Reported Issues will be Triaged

The Platform

We will resolve issues with products/systems that fall under Platform ownership. See the Product Directory to learn which products and systems the Platform owns.

VFS teams

VFS teams will resolve all issues with Veteran-facing Services (including endpoints and integrations with APIs). Triage will assign a ZenHub issue to the POC of the VFS team whose product is experiencing issues, per the ownership indicated in the Product Directory.

If the Product Directory does not indicate a VFS owner for a service, Triage will assign the issue to Chris Johnston.
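
The assignment rules in this section can be sketched as a lookup with a fallback. This is a minimal illustration, assuming the Product Directory can be represented as a mapping from product name to owner. That mapping, the `route_issue` function, and the sample product names are all hypothetical; only the fallback assignee comes from the text above.

```python
from typing import Dict, Optional

PLATFORM = "Platform"
FALLBACK_ASSIGNEE = "Chris Johnston"  # fallback named in this document


def route_issue(product: str, directory: Dict[str, Optional[str]]) -> str:
    """Return who should receive the issue for the given product.

    `directory` is a hypothetical stand-in for the Product Directory:
    product name -> owning team ("Platform" or a VFS team), or None
    when no owner is listed.
    """
    owner = directory.get(product)
    if owner == PLATFORM:
        return PLATFORM            # Platform resolves its own products
    if owner:
        return owner               # assign to the owning VFS team's POC
    return FALLBACK_ASSIGNEE       # no VFS owner indicated


# Illustrative data only; these are not real directory entries.
sample_directory = {
    "example-platform-tool": "Platform",
    "example-benefits-form": "Example VFS Team",
    "example-orphaned-service": None,
}
```

With the sample data, `route_issue("example-orphaned-service", sample_directory)` falls through to the fallback assignee, as does any product missing from the mapping.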

