You have a Docker Swarm cluster consisting of n nodes (without auto-scaling). A backend API is running on it, which can be accessed at some address(es). Now, here’s the million-dollar question! Which tool would you use to conduct load testing on the API, considering the context, to determine the expected allowable load (number of requests per second) at which this cluster will operate efficiently, and at what load the whole setup will crash?
It would be interesting to hear from specialists about which tools to use and what approaches exist for this purpose.
The original post was written by my former colleague Alyaksandr Makeenak on LinkedIn, tagged #задачка #qa #docker #swarm #кластер (roughly: #puzzle #qa #docker #swarm #cluster). It was in Russian, so I translated it to English. I'm not an experienced performance-testing expert, so I can only offer fairly general suggestions here (and I did), but maybe someone has insights for this particular situation.
I wouldn’t consider myself a specialist, but I do have some experience testing in a Docker Swarm setup.
Selecting a tool is the easy part, to be honest. Most tools, like k6 or Gatling, will let you ramp the throughput until failure. Failure can be defined as breaching a threshold on response time or on the error rate.
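As a rough illustration (not a recommendation of specific numbers), here is a minimal k6 sketch of that approach: a ramping arrival-rate scenario that pushes requests per second upward in stages, with thresholds on latency and error rate acting as the failure criteria. The URL, target rates, and threshold values are placeholders you would tune for your own API.

```typescript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    ramp_rps: {
      executor: 'ramping-arrival-rate', // drives a fixed request rate, not fixed VUs
      startRate: 50,                    // starting requests per second (placeholder)
      timeUnit: '1s',
      preAllocatedVUs: 200,
      maxVUs: 2000,
      stages: [
        { target: 100, duration: '2m' },   // ramp to 100 rps
        { target: 500, duration: '5m' },   // ramp to 500 rps
        { target: 1000, duration: '5m' },  // keep ramping until something breaks
      ],
    },
  },
  thresholds: {
    // failure criteria: p95 latency and error rate
    // (add `abortOnFail: true` to stop the test as soon as a threshold is breached)
    http_req_duration: ['p(95)<500'], // 95% of requests under 500 ms
    http_req_failed: ['rate<0.01'],   // less than 1% of requests may fail
  },
};

export default function () {
  // hypothetical endpoint on the cluster's published address
  const res = http.get('https://api.example.com/health');
  check(res, { 'status is 200': (r) => r.status === 200 });
}
```

The idea is simply to find the highest sustained rate at which the thresholds still hold (the "efficient" load) and the rate at which they start failing consistently (the breaking point).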
There is no mention of a load balancer or database, which would likely affect the approach I'd take. Also, what kind of technology is the backend API using, e.g. REST, GraphQL, WebSockets?
For failover, Docker Swarm uses leader election among its manager nodes, which means you want an odd number of managers to maintain a quorum (a majority of managers must be reachable for the cluster to keep managing itself). So for a resilient setup you would ideally want 3 or 5 managers: 3 tolerate the loss of one, 5 tolerate the loss of two.
You will also want some way of monitoring the cluster during the tests (node and container CPU, memory, and network) so you can determine what actually caused the failure.