To ensure code stability and quality, code goes through several steps to get to production.
The same deployment process applies for both content and code changes.
Automated Deployment Schedule
The main branch of vets-api follow the same daily (Mon - Fri) deployment schedule:
Changes in by
2 p.m. ET
3 p.m. ET
Varies. Typically complete by 3:45 p.m. ET
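As a rough illustration of how the cut-off in the schedule plays out, the sketch below works out which daily deployment a merge would land in. It assumes the 2 p.m. ET "changes in by" time from the schedule, takes a timestamp already expressed in Eastern Time, and ignores holidays and out-of-band deployments. The function name is hypothetical and is not part of any VA.gov tooling.

```python
from datetime import date, datetime, time, timedelta

# Assumption: cut-off taken from the schedule above (2 p.m. ET, Mon - Fri).
CUTOFF_ET = time(14, 0)


def next_deploy_date(merged_at_et: datetime) -> date:
    """Return the date of the first daily deployment that would include a merge.

    `merged_at_et` is a naive datetime already expressed in Eastern Time.
    Holidays and out-of-band deployments are deliberately ignored.
    """
    day = merged_at_et.date()
    # A merge after the cut-off misses that day's deployment.
    if merged_at_et.time() > CUTOFF_ET:
        day += timedelta(days=1)
    # Skip Saturday (5) and Sunday (6).
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return day


if __name__ == "__main__":
    # Merged on a Friday afternoon, after the cut-off.
    print(next_deploy_date(datetime(2024, 1, 12, 15, 0)))
```

Under these assumptions, a merge after the Friday cut-off would not reach production until the following Monday's deployment.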
For feature and hotfix development (code changes and content changes), the process looks like:
Dev/Content: Create feature branch from main
Dev/Content: Commit changes to feature branch
Dev/Content: Feature branch merged to the main branch (via Pull Request)
Automatic: Build run from main branch to create an artifact
Automatic: Deploy newly created artifact to dev and staging
Automatic: Create a release in GitHub from main, tag artifacts of that commit with release name
Automatic: Deploy to production using artifacts
1. Dev: Create feature branch from main
2. Dev: Commit changes to feature branch
3. Dev: Feature branch merged to the main branch (via Pull Request)
Code is reviewed
GitHub Actions runs unit tests, linting, and security scans.
The committing and reviewing developers are responsible for running automated and manual integration tests locally before closing the pull request.
4. Automatic: Build run from main branch to create an artifact
5. Automatic: Deploy newly created artifact to development and staging environments
All VA.gov GitHub repos are set up to do squash merges (via the GitHub PR interface), leaving behind a clean, feature-based revision history.
The main branch should always be deployable. As such, deployment to the staging environment happens automatically, and staging can be used to see how a change behaves in a production-like environment for any kind of manual testing or verification.
Because main is designed to always be deployable, long-running features that should not yet be deployed should use feature toggles in the code that disable the feature in the production environment. Notifying the DevOps team which feature toggles should be enabled or disabled in the staging and production environments is an important part of this process. Even so, breakages in staging are likely, and surfacing them there is necessary before anything moves on to the production steps.
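The feature-toggle idea can be sketched as follows. This is a minimal illustration, not the toggle mechanism the VA.gov codebase actually uses: the toggle table, feature names, and function names are all hypothetical.

```python
# Hypothetical toggle table: feature name -> environments where it is switched on.
FEATURE_TOGGLES = {
    "new_claims_form": {"development", "staging"},  # still in progress
    "profile_redesign": {"development", "staging", "production"},
}


def is_enabled(feature: str, environment: str) -> bool:
    """Unknown features are treated as off, the safe default for production."""
    return environment in FEATURE_TOGGLES.get(feature, set())


def render_page(environment: str) -> str:
    # The unfinished code ships with every deploy but only runs where toggled on.
    if is_enabled("new_claims_form", environment):
        return "new claims form"
    return "existing claims form"


if __name__ == "__main__":
    for env in ("staging", "production"):
        print(env, "->", render_page(env))
```

Because the unfinished code path is merged but switched off in production, main stays deployable while the feature is still being built.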
6. Create a release in GitHub from main, tag artifacts of that commit with release name
Every weekday, at the beginning of the daily deployment, a Jenkins or GitHub Actions (for content-build) automerge job sends a link to the #vfs-platform-builds Slack channel with a diff between the last release and the most recent changes in main. This commit reference is stored to ensure that the diff and the released version are deterministic.
After a set time has elapsed (currently 60 minutes), a release is created at the stored commit reference.
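The reason for storing the commit reference can be shown with a small sketch. Only the 60-minute wait comes from the description above. The class, function, and commit hashes are hypothetical and do not reflect how the Jenkins or GitHub Actions jobs are actually written.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# The 60-minute wait comes from the text; everything else here is hypothetical.
RELEASE_DELAY = timedelta(minutes=60)


@dataclass(frozen=True)
class PendingRelease:
    commit_ref: str         # commit on main captured when the diff was posted
    announced_at: datetime  # when the diff link was sent to Slack

    def is_ready(self, now: datetime) -> bool:
        return now - self.announced_at >= RELEASE_DELAY


def ref_to_release(pending: PendingRelease, current_main_head: str,
                   now: datetime) -> Optional[str]:
    """Return the ref to release, or None while the wait is still running."""
    if not pending.is_ready(now):
        return None
    # current_main_head is ignored on purpose: the release uses the stored ref,
    # so commits merged during the wait cannot slip into it.
    return pending.commit_ref


if __name__ == "__main__":
    pending = PendingRelease("abc123", datetime(2024, 1, 8, 14, 0))
    print(ref_to_release(pending, "def456", datetime(2024, 1, 8, 15, 0)))
```

Pinning the reference at announcement time means the release matches the diff that was posted, even if main has moved on during the wait.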
7. Automated: Deploy to production using artifacts
From here, Jenkins or GitHub Actions can kick off a production deployment. After the deployment occurs, the normal site monitoring infrastructure is used to validate that it is working. Because this process is automatic, any new features should have monitoring in place before, or as part of, their deployment.
The code that appears in the main branch actually gets deployed to both dev and staging environments. This is done to support different configurations for the DevOps team as they work to support any configuration changes (i.e. in dev first).
If a production deployment introduces issues that affect Service Level Objectives (SLOs) established for the project, the DevOps team may restore service to users by rolling back the deployment. This is accomplished by triggering a new deploy job in Jenkins or GitHub Actions using a previous release tag or commit.
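Picking the rollback target amounts to choosing the release that preceded the one currently deployed. The sketch below assumes a list of release tags ordered oldest to newest. The tag names are made up, and triggering the actual deploy job is left out because its interface is not described here.

```python
def rollback_target(release_tags: list[str], deployed_tag: str) -> str:
    """Return the release tag that preceded `deployed_tag`.

    `release_tags` is assumed to be ordered oldest to newest.
    """
    index = release_tags.index(deployed_tag)  # raises ValueError if the tag is unknown
    if index == 0:
        raise ValueError("no earlier release to roll back to")
    return release_tags[index - 1]


if __name__ == "__main__":
    # Hypothetical tag names, for illustration only.
    tags = ["release-101", "release-102", "release-103"]
    print(rollback_target(tags, "release-103"))
```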
The use of hotfixes is discouraged, but may be useful in an emergency situation when main has significantly deviated from the release and a fix to the failed production release is critical. To create a hotfix, create a branch from the last stable release tag, make the necessary changes (with review), create a new release tag following the correct naming scheme, and trigger a deploy in Jenkins with the release name as a parameter.
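The Git side of the hotfix steps could look roughly like the command sequence below, built as argument lists rather than executed. The branch and tag names are placeholders, since the naming scheme is not given here. The final Jenkins trigger is omitted because its job parameters are not described in this document.

```python
import shlex


def hotfix_git_commands(stable_tag: str, hotfix_branch: str,
                        new_release_tag: str) -> list[list[str]]:
    """Build (but do not run) the Git commands for a hotfix release."""
    return [
        # Make sure the last stable release tag is available locally.
        ["git", "fetch", "--tags", "origin"],
        # Branch from the last stable release tag rather than from main.
        ["git", "checkout", "-b", hotfix_branch, stable_tag],
        # ... commit the reviewed fix on the hotfix branch here ...
        # Tag the fixed commit and publish the tag.
        ["git", "tag", new_release_tag],
        ["git", "push", "origin", new_release_tag],
    ]


if __name__ == "__main__":
    # Placeholder names; follow the project's real naming scheme in practice.
    for cmd in hotfix_git_commands("LAST_STABLE_TAG", "hotfix/example", "NEW_RELEASE_TAG"):
        print(shlex.join(cmd))
```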
If SLOs are not affected and a fix is not critical, no rollback will be issued. Instead, the fix should be applied through the standard development workflow.
Background and context for deployment process
The creation of this deployment process was triggered and influenced by the following problems (expected or currently experienced) and feedback (based on previous drafts).
Complexity of branching strategy
This concern hovers around two things:
Will people be able to do it without much learning or frequent git mistakes?
Is our approach an industry standard such that it has a low learning curve?
The process originally followed most closely aligned with GitHub Flow, but this process proved to be overly simple for multiple projects being committed to our
Other popular flows such as Git Flow and GitLab Flow were suggested, but have the opposite problem. While being industry standards (like GitHub Flow), they tend to be more complex than our team's needs demand. They are complex enough that they require expert Git knowledge, merging mistakes are commonly made by teams that use them, and they require more active management to maintain clean merge trees. Git Flow and GitLab Flow do some things really well, though: they give clear guidance on what is being developed and what is in which environment.
While it's not ideal to create our own strategy, none of the strategies seemed to fit the bill with priority being on simplicity. Instead this one was written and does the following things to meet its goals:
Stays very close to GitHub Flow, the simplest of the industry standards and the easiest to learn/manage
Uses Git Flow's emphasis on main always being deployable, which eases deployment and rollback
Just seven repeatable steps for any kind of development (feature or hotfix)
Integrates our team's people workflow (i.e. interactions between development and DevOps teams)
During a quick unscientific survey of the team, most had experience with GitHub Flow, while only a few folks had worked with Git Flow or GitLab Flow in practice (esp. GitLab Flow). Most folks had heard of Git Flow, but only a small subset had used it explicitly. Keeping the chosen flow close to GitHub Flow (and close to what the team has been doing) means it likely has the least cognitive overhead.
Confusion about what has to happen during code reviews / pull requests
Unfortunately, it's hard to codify this beyond running some things automatically, such as the tests and scanners. However, it's really important that teams give each other clear feedback, because code reviews are the single most important quality gate for the code.
Lack of clear process for getting code changes from development to production
Before, we didn't have anything written down that described how things went to production. This process is different from the other Git* Flows in that it says who performs each step. This means that teams know exactly who to involve and who to delegate responsibility to for each part of getting things to go live.
Different flows between features and hotfixes
This is one of the weaknesses with Git Flow, which requires additional management. We've simplified this by creating a process that is the same for both features and hotfixes. It's expected that things will go to production at least weekly, meaning that there shouldn't be any feature or hotfix branches to manage for long periods of time.
For out-of-band deployment information, see our Deployment Policies.