Load Tests

Run load tests before launching new endpoints or substantial updates to existing endpoints to ensure the stability of the API. Static endpoints have no explicit need for load testing, but the loadtest directory of the devops repo may contain examples that exercise infrastructure components to find bottlenecks, with the same goals outlined here (for example, while tuning at the revproxy layer).

The goals of a load test may differ slightly when launching Platform-level features. In general, keep the following goals in mind while writing and conducting a test of your new feature:

Identifying bottlenecks and operational concerns
Validating that any SLA given for a new service is achievable with the current implementation
Estimating the upper bounds of traffic for a given service

Before You Begin

To load test you need access to

Load Test Tools

There are two main tools that we use for load testing:

Locust lets you define test behavior in Python code and has a web interface for interactive testing. It also supports creating a swarm of test instances, and it breaks down latency measurements by individual request path. In the current version, its text output format is somewhat lacking, which makes scripted testing and report generation difficult.
wrk2 claims to have more accurate latency measurements through the use of HdrHistogram and by accounting for coordinated omission at the limits of system throughput. Simple tests against a single endpoint can be triggered easily, or test behavior can be scripted in Lua. It has a well-defined report format, and reporting behavior can be overridden via the scripting API. Install instructions are found on the wiki.
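To illustrate what "coordinated omission" means, here is a minimal Python sketch. It is not how wrk2 is implemented. It simulates a hypothetical single-connection client that tries to send one request every 10 ms, and compares the latency measured from the actual send time with the latency measured from the intended send time. The function name simulate_latencies and the sample service times are assumptions made for this illustration.

```python
def simulate_latencies(service_times, interval):
    """Simulate a single-connection client on a fixed-rate schedule.

    Returns (naive, corrected):
      naive     - latency measured from the moment each request was actually sent
      corrected - latency measured from the moment each request was *intended*
                  to be sent on the fixed-rate schedule
    All times are in the same unit (milliseconds in the example below).
    """
    naive, corrected = [], []
    free_at = 0  # time at which the connection is free to send again
    for i, service in enumerate(service_times):
        intended = i * interval
        # The client cannot send until the previous response has arrived.
        sent = max(intended, free_at)
        done = sent + service
        naive.append(done - sent)
        corrected.append(done - intended)
        free_at = done
    return naive, corrected


# Illustrative data: the third request stalls for 100 ms.
naive, corrected = simulate_latencies([1, 1, 100, 1, 1], interval=10)
print(naive)      # [1, 1, 100, 1, 1]
print(corrected)  # [1, 1, 100, 91, 82]
```

In this sketch the naive measurement records a single slow request, while the corrected measurement also shows the delay suffered by the requests that were queued behind the stall. That queued delay is the effect coordinated omission hides.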

Dependencies

Navigate to the loadtest folder of the devops repo and install the dependencies in one of two ways:

Using pipenv: pipenv install
Directly with pip: pip install -r requirements.txt

Load Testing

Create a folder inside devops/loadtest to hold scripts that will load test your application. Look to existing scripts for examples or visit the Locust docs. There are two ways to run your load tests: via the command line or (if you’re using Locust) through a web interface.

Load Testing with the Command Line

A small script at loadtest/loadtest_runner allows you to run a load test and record information to make report generation easier. It uses the following arguments:

-d, --dir: Directory that contains the load test files, which the runner changes into (for example, search)
-t, --test_type: The load testing tool to run. Defaults to 'locust'; set to 'wrk2' to run wrk2 tests
argument_input: A file of command line arguments to pass to the load testing tool

An example

❗️Warning: setting -c higher than 10 can exhaust the available staging connections.

# appealsv2/test_args

-H https://staging-api.va.gov -c 10 -r 2 -t 10m --no-web -f appealsv2_locust.py --only-summary

# running the loadtest

./loadtest_runner -d appealsv2/ -t locust appealsv2/test_args

The runner executes the test and produces an output file named {start_timestamp}-output.loadtest, which can be passed to the report generation tool.
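The runner's interface described above could be sketched roughly as follows. This is a hypothetical re-creation for illustration, not the actual loadtest_runner source: the function names parse_runner_args and build_command are invented here, and treating lines that start with # in the argument file as comments is an assumption.

```python
import argparse
import shlex


def parse_runner_args(argv):
    """Parse the three documented arguments (hypothetical re-creation)."""
    parser = argparse.ArgumentParser(prog="loadtest_runner")
    parser.add_argument("-d", "--dir",
                        help="directory that contains the load test files")
    parser.add_argument("-t", "--test_type", default="locust",
                        choices=["locust", "wrk2"],
                        help="load testing tool to run")
    parser.add_argument("argument_input",
                        help="file of arguments to pass to the load testing tool")
    return parser.parse_args(argv)


def build_command(test_type, argument_text):
    """Turn the contents of an argument file into a command list."""
    # Assumption: '#' starts a comment, so a label line such as
    # '# appealsv2/test_args' is dropped rather than passed to the tool.
    return [test_type] + shlex.split(argument_text, comments=True)


args = parse_runner_args(["-d", "appealsv2/", "-t", "locust", "appealsv2/test_args"])
command = build_command(args.test_type, "-H https://staging-api.va.gov -c 10 -r 2")
print(command)
# ['locust', '-H', 'https://staging-api.va.gov', '-c', '10', '-r', '2']
```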

The reporting function was previously tied to Prometheus, which has since been replaced with Datadog. See the following documents for more information about Datadog:

Get Initial Access to Datadog
Get Acquainted with Datadog
If you have any issues or questions after checking the above documents, contact #dots-dsva.

Load Testing with Web Interface

Locust has a web interface for running your load testing scripts. After writing your script and making sure that Locust is installed, change to the directory containing your script and do the following:

Run locust -f my_script.py
Visit http://0.0.0.0:8089/ (this address is only reachable once Locust is installed and running).

You will see something like the following:

[Figure: Start a new load test with the Locust web interface]

Use the following formula to determine the Number of Users:

(Peak Hourly Users * Average Session in seconds) / 3600

Example: suppose you expect 1,000 users per hour to access your product during its peak usage, and the average session length for those users is 5 minutes (300 seconds). The number of users is then (1000 * 300) / 3600 = ~83 users.
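The formula above can be sketched as a small Python helper. The function name concurrent_users is an assumption chosen for this illustration; the inputs are the values from the example.

```python
def concurrent_users(peak_hourly_users, avg_session_seconds):
    """Estimate the number of simultaneous users to simulate.

    Implements (Peak Hourly Users * Average Session in seconds) / 3600.
    """
    return peak_hourly_users * avg_session_seconds / 3600


# 1,000 users per hour at peak, with 5-minute (300-second) sessions.
users = concurrent_users(1000, 300)
print(round(users))  # 83
```

Rounding to the nearest whole user matches the worked example. Rounding up instead gives a slightly more conservative test.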

For the host, enter the endpoint your test will hit (for example, staging-api.va.gov).

After you click the Start swarming button you will see a result like the following:

[Figure: Results after running a Locust load test]

Click the Download tab to download the results as CSV data.
