The development and deployment cycle for VA.gov makes use of various environments.

Development

  • A place to build and test code changes without breaking anything in the live production environment

  • Used by developers

  • Not connected to live end-user data

  • vets-api URL: dev-api.va.gov

  • vets-website URL: https://dev.va.gov

Sandbox

  • A copy of the development environment typically used to define Service Level Agreements (SLAs) for clients developing against the VA.gov API (Lighthouse)

  • Used by developers

  • Not connected to live end-user data

  • vets-api URL: sandbox-api.va.gov

  • vets-website URL: none

Staging

  • Where code is deployed before it goes to production. Often used for user acceptance testing or to verify before releasing to production.

  • Used by developers, QA engineers, other VFS team members. Sometimes used by end-users for User Acceptance Testing (UAT).

  • vets-api URL: staging-api.va.gov

  • vets-website URL: https://staging.va.gov

Production

  • Live code used on VA.gov

  • Used by end-users visiting VA.gov (and other Veteran systems).

  • Full production data.

  • Not for experimentation or testing.

  • vets-api URL: api.va.gov

  • vets-website URL: https://www.va.gov/
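
The hostnames listed above can be kept in a small lookup so that scripts do not hard-code them. This is only a sketch: the constant and helper names are hypothetical, and the https:// scheme on the API hosts is an assumption, since the list gives bare hostnames.

```ruby
# Hostnames copied from the environment list above. The https:// scheme on the
# API hosts is an assumption; the source lists bare hostnames.
ENVIRONMENTS = {
  development: { api_host: 'dev-api.va.gov',     website: 'https://dev.va.gov' },
  sandbox:     { api_host: 'sandbox-api.va.gov', website: nil },
  staging:     { api_host: 'staging-api.va.gov', website: 'https://staging.va.gov' },
  production:  { api_host: 'api.va.gov',         website: 'https://www.va.gov/' }
}.freeze

# Hypothetical helper: build the base API URL for an environment name.
def api_base_url(env)
  config = ENVIRONMENTS.fetch(env.to_sym) do
    raise ArgumentError, "unknown environment: #{env}"
  end
  "https://#{config[:api_host]}"
end

puts api_base_url(:staging)
```

Raising on an unknown name, rather than falling back to a default, avoids accidentally pointing a script at production.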

Where non-production environments live

Each environment lives in its own Virtual Private Cloud (VPC). Each environment also has its own jumpbox, an EC2 instance that sits on the firewall border of the VPC and allows access to servers within the VPC. Our provisioning configuration deploys the API as vets-api, which runs an HTTP server, and vets-api-worker, which processes background jobs.

The dev, staging and sandbox environments live in AWS Elastic Kubernetes Service (EKS) where vets-api is deployed as pods.

Builds and deploys

For dev, staging, and sandbox, merging to master in the vets-api GitHub repository automatically kicks off a set of GitHub Actions. These actions update the special branch (k8s) that we use for Kubernetes-based deployment specifications, build and push a new container image to AWS Elastic Container Registry (ECR), and update the vsp-infra-application-manifests repository with the new image tag. From there, our continuous delivery tool, ArgoCD, sees the updated image tag and updates the EKS deployment by standing up new pods that use the new image and taking down the old ones.

In the production environment, code is deployed daily, Monday through Friday, by a cron-based deployment pipeline managed within Jenkins.

For details on the daily production deployment, see Deployment process.

Background job processing

There are a number of Sidekiq jobs defined in sidekiq_scheduler.yml in the vets-api repository that run on a time-based cron job schedule. If an issue occurs with a job, or to view which Sidekiq jobs have been queued or completed, see this Grafana chart (requires SOCKS proxy). There are worker instances for each environment which process the background jobs.

For help troubleshooting, see Sidekiq jobs.
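
To give a feel for what a cron-style schedule file contains, here is a minimal sketch that parses a schedule written in the style sidekiq-scheduler commonly uses. The job names, classes, and cron expressions are made up for illustration and are not taken from the vets-api repository; the real sidekiq_scheduler.yml may be laid out differently.

```ruby
require 'yaml'

# Hypothetical schedule in the sidekiq-scheduler style. The job names, classes,
# and cron expressions are made up for illustration.
SCHEDULE_YAML = <<~YAML
  NightlyCleanup:
    cron: "0 3 * * *"
    class: NightlyCleanupJob
    description: "Remove expired records every day at 03:00"
  HourlyReport:
    cron: "0 * * * *"
    class: HourlyReportJob
    description: "Send an hourly summary"
YAML

# List each scheduled job with its cron expression and worker class.
def scheduled_jobs(yaml_text)
  YAML.safe_load(yaml_text).map do |name, options|
    { name: name, cron: options.fetch('cron'), job_class: options.fetch('class') }
  end
end

scheduled_jobs(SCHEDULE_YAML).each do |job|
  puts "#{job[:name]} runs #{job[:job_class]} on '#{job[:cron]}'"
end
```

A listing like this can be a quick way to check which jobs should have run before looking for them in Grafana.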

Unreleased feature testing

Our deployment process assumes that master should always be deployable. If you need finer control over the release of your feature, you can use feature toggles. Powered by the flipper gem, feature toggles are used to manage and preview unreleased features, and they can be switched without a redeploy. In any environment, teams can enable or disable a feature for:

  • all users

  • a percentage of all users

  • a percentage of all logged-in users

  • a list of users

  • users defined in a method

For information on using feature toggles, see Feature toggles.
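
The targeting options listed above can be pictured with a toy model. The class below is not the flipper gem's API; every name in it is hypothetical, and it only illustrates how a toggle might decide whether a given user sees a feature. It hashes the user ID so that a percentage rollout is stable for each user.

```ruby
require 'zlib'

# Toy model of a feature toggle; it is NOT the flipper gem's API.
class ToyToggle
  def initialize
    @enabled_for_all = false
    @percentage = 0   # percentage of users, 0..100
    @user_ids = []    # explicit list of users
    @predicate = nil  # stands in for "users defined in a method"
  end

  def enable_for_all
    @enabled_for_all = true
  end

  def enable_percentage(percentage)
    @percentage = percentage
  end

  def enable_users(*ids)
    @user_ids.concat(ids)
  end

  def enable_when(&block)
    @predicate = block
  end

  def enabled?(user_id)
    return true if @enabled_for_all
    return true if @user_ids.include?(user_id)
    return true if @predicate&.call(user_id)

    # Hash the id so a given user always lands in the same bucket (0..99).
    Zlib.crc32(user_id.to_s) % 100 < @percentage
  end
end

toggle = ToyToggle.new
toggle.enable_users('test-user-1')
puts toggle.enabled?('test-user-1')
```

Because the state lives in the toggle object rather than in the code path, changing who sees a feature does not require a redeploy, which is the property the section describes.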

Test users

A variety of test users are available for the staging and development environments with various credentials and levels of assurance. 

ID.me provides authentication for users accessing services through VA.gov. Users don’t have a VA.gov-specific account. Instead, users sign in to ID.me and their credentials are passed to VA.gov. VA.gov uses this information to make additional requests for authorization within VA systems.

For more information about ID.me, see the FAQs on the ID.me support website. (You’ll need to create your own ID.me account to access their support information.)

Monitoring and error tracking

We use Sentry, Grafana, and Datadog for performance metrics and error tracking. (Sentry and Grafana require the SOCKS proxy. For Datadog access, see Monitor Applications and Infrastructure with Datadog.)

  • A unique project exists in Sentry for each environment, including development.

  • Grafana includes a collection of resources including dashboards with a variety of metrics and visualization options. The “production” Data Source can be selected for the ability to view metrics in the development environment.

  • Datadog has similar features to Grafana but also includes some monitoring and alerting capabilities.

Database

Each environment is configured with a Postgres instance to store relational data. The database is managed by AWS RDS.

External service calls and mocked data

The data source for external service calls depends on the environment:

Development

  • Access to external services is blocked in development. Instead, responses come from the Vets API mockdata repo through the use of the betamocks gem. Information about betamocks can be found here.

Staging

  • In the staging environment, some responses are mocked and some live requests go through the forward proxy and connect to lower-environment backends. Mock data comes from the Vets API mockdata repo through the use of the betamocks gem.

Sandbox

  • Access to external services is blocked in sandbox. Instead, responses come from the Vets API mockdata repo through the use of the betamocks gem. Information about betamocks can be found here.

Production

  • Most of the data that is displayed on or stored from VA.gov lives in external services. These external services store Veterans' health, benefits, education, and other records.

  • Vets API connects to these external services via the forward proxy. In the production environment, live requests are made to external services. Where appropriate, vets-api caches responses from external services. When there is no cache record, a request is made through the forward proxy.
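
The caching behavior described for production follows a cache-aside shape: check the cache first, and only make the live request on a miss. The sketch below shows that flow under simplifying assumptions. A plain Hash stands in for the real cache store, the block stands in for a request through the forward proxy, the class name is hypothetical, and expiry is omitted.

```ruby
# Cache-aside sketch. A Hash stands in for the real cache store, and the
# block stands in for a request sent through the forward proxy.
class CachedLookup
  def initialize
    @cache = {}
  end

  # Return the cached response for key, or run the block and cache its result.
  def fetch(key)
    return @cache[key] if @cache.key?(key)

    @cache[key] = yield
  end
end

calls = 0
lookup = CachedLookup.new
2.times { lookup.fetch('profile:123') { calls += 1; { status: 'ok' } } }
puts calls # the block ran once; the second read was served from the cache
```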

Breakers

Vets API uses the breakers gem to implement the circuit breaker pattern using Faraday middleware. Breakers is used in conjunction with external services to determine whether a service may be down.

Breakers records successful and failed requests in Redis and uses them to decide whether to mark a service as up or down, which is how an external service outage is detected. Instrumentation for breakers and external services can be found in Grafana.

Breakers is not used in the development or sandbox environments because no HTTP calls are made in those environments.
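
For readers new to the circuit breaker pattern, here is a minimal in-memory sketch. It is not the breakers gem's API: the class name and the failure threshold are hypothetical, counters live in memory rather than in Redis, and the recovery step that a real breaker uses to close the circuit again is left out.

```ruby
# Minimal circuit breaker sketch; NOT the breakers gem's API. Counters live in
# memory here, whereas the text above says breakers keeps them in Redis.
class ToyCircuitBreaker
  class OpenError < StandardError; end

  def initialize(failure_threshold: 3)
    @failure_threshold = failure_threshold
    @failures = 0
  end

  # The circuit is open (service treated as down) once failures reach the threshold.
  def open?
    @failures >= @failure_threshold
  end

  # Run the block unless the circuit is open; track success and failure.
  def call
    raise OpenError, 'service marked down' if open?

    result = yield
    @failures = 0 # a success resets the failure count
    result
  rescue OpenError
    raise # do not count our own short-circuit as a service failure
  rescue StandardError
    @failures += 1
    raise
  end
end
```

Once the circuit is open, callers fail fast instead of waiting on a service that is likely down.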

Logs

Logs for each environment are sent to Loki. For a more granular level of analysis, per-instance logs can be analyzed. Log events can be queried via Loki, leveraging the Grafana interface. This can also be done in Datadog. Vets API server and worker instances have their own logs for server or background job-related information. 

PII and sensitive data

We expect developers to use the minimum amount of Personally Identifiable Information (PII) required by their application. Be aware of where and how that data is stored throughout the web request lifecycle. Most PII in logs and Sentry errors is filtered out automatically. vets-api stores very little PII, but if there is an absolute need to log PII, a PersonalInformationLog is available.

Do not store any PII in the development or sandbox environments.

For more information on PII, see Personal Identifiable Information (PII) guidelines.
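
As an illustration of what filtering PII out of a log payload can look like, the sketch below redacts values by key name before anything is written. It is not how vets-api or Sentry filtering is implemented; the key list and helper name are hypothetical.

```ruby
# Hypothetical list of keys to treat as PII; a real application would tune it.
PII_KEYS = %w[ssn email birth_date].freeze

# Return a copy of the payload with PII values replaced, recursing into
# nested hashes and arrays.
def redact_pii(value)
  case value
  when Hash
    value.to_h do |key, inner|
      [key, PII_KEYS.include?(key.to_s) ? '[FILTERED]' : redact_pii(inner)]
    end
  when Array
    value.map { |item| redact_pii(item) }
  else
    value
  end
end

puts redact_pii({ 'ssn' => '000-00-0000', 'form' => 'example' })
```

Key-based filtering only catches fields it knows about, so it complements, rather than replaces, the guidance to collect as little PII as possible.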

How migrations are run

In each environment, Rails database migrations for vets-api are triggered by a manual process.

For more information about database migrations, see Database migrations.

Settings

Define config settings whose values vary by environment in the relevant private repositories: prod-settings.local.yml.j2 in the devops repository, and the YAML configurations in the dev, staging, and sandbox directories of the application in the vsp-infra-application-manifests repository.

Defaults for secrets that developers can use locally are also defined in the vets-api config/settings.yml and must be safe to provide to the public. For settings used in the development environment, get help in Slack from the on-call DevOps engineer to arrange secure delivery and configuration via AWS Parameter Store. This step is moving toward a more self-service model in which teams manage their own secrets.
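
The relationship between public-safe defaults and per-environment values can be pictured as a layered merge. The sketch below is an assumption about the general shape only, not the mechanism vets-api uses to load settings; the keys, values, and helper name are hypothetical.

```ruby
# Deep-merge environment overrides on top of public-safe defaults.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical keys and values; the defaults must be safe to publish.
DEFAULTS = {
  'example_service' => { 'url' => 'http://localhost:3000', 'api_key' => 'fake-key' }
}.freeze
STAGING_OVERRIDES = {
  'example_service' => { 'url' => 'https://example.invalid' }
}.freeze

settings = deep_merge(DEFAULTS, STAGING_OVERRIDES)
puts settings['example_service']['url']
```

In this picture, a real secret would arrive only through the override layer, so the checked-in defaults never need to contain one.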