I want to test a site to see if it is ready to go live. There are a few million image URLs and I want to hit a lot of them simultaneously. I have been told that these URLs must not be cached, so each time a page is hit, it must be unique (within a margin of error). This is to simulate the situation when the site first goes live: no cached pages and probably a higher than average user count due to advertising.
My question is about how to best achieve this. I am using k6 but my question is fairly general, so mostly tool-agnostic.
How do I make sure that every virtual user uses a unique URL every time? Let’s say there are 100 VUs and the sleep is set to 1 second. I want each VU to hit a new URL every second (or I suppose every 1.5 seconds, because each request will take some time too). And these should be unique across all users.
I have written a few smaller k6 load tests but nothing quite so complex. I am open to general feedback but a few of my more concrete questions are:
How do I ensure that all URLs are unique? Do I have to create a different set of URLs for each VU? They will pretty much all access my list simultaneously, so I don’t know how else to make sure there is no overlap.
What would be a sensible format to have my URLs in? I am creating the list, so I can freely choose: JSON, text file, list, map, something else?
Is this even the best approach to achieve my goal of hitting a few thousand un-cached URLs in one load test? Or is there a better way that I have not thought of?
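In short, you can retrieve unique rows from your data set. So you probably need to capture that list of possible URLs (full or a sample set) and use scenario.iterationInTest as the index into it, so that each value is only used once. That counter is zero-based and numbers iterations across the whole scenario rather than per VU, so no two VUs get the same index and you do not need a separate list for each VU.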
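To make the indexing idea concrete, here is a minimal sketch in plain JavaScript that runs outside k6. The helper name urlForIteration, the example.com URLs, and the simulated shared counter are all made up for illustration; in a real k6 script the index would come from scenario.iterationInTest (imported from k6/execution) and the list would typically be loaded once via SharedArray (from k6/data).

```javascript
// Hypothetical helper: map a test-wide iteration counter to exactly one URL.
// Returns null once the list is exhausted, so no URL is requested twice.
function urlForIteration(urls, iterationInTest) {
  return iterationInTest < urls.length ? urls[iterationInTest] : null;
}

// Placeholder data (assumption); the real list would come from your own file.
const urls = Array.from(
  { length: 1000 },
  (_, i) => `https://example.com/img/${i}.jpg`
);

// Simulate 100 VUs drawing from one shared counter, which is how
// scenario.iterationInTest numbers iterations across the whole scenario.
const vus = 100;
const seen = new Set();
let nextIteration = 0;
let duplicates = 0;

while (nextIteration < urls.length) {
  for (let vu = 0; vu < vus && nextIteration < urls.length; vu++) {
    const url = urlForIteration(urls, nextIteration++);
    if (seen.has(url)) duplicates++;
    seen.add(url);
  }
}

console.log(`requested ${seen.size} URLs, ${duplicates} duplicates`);
```

In k6 itself, one way to wire this up would be a shared-iterations scenario whose iterations option is set to the length of the URL list, so the test ends when the list runs out. With that setup a JSON array of URL strings is probably the simplest file format, since it can be parsed directly into the array.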
I think (and I’m not a web tester by any means) that the point is being missed. Sometimes people are a bit vague when they ask us to test things, and they come up with this kind of vague test “desire” to test-all-of-the-things.
What kind of defect are we looking for, or is this just a performance test? Do we want to load as hard as possible, or do we want to vary the number of simulated users continuously to achieve a specific, realistic load? And do we also want to specify a maximum resource load that the test will monitor and throttle back down to, and then measure the duration of the test workload instead?
I would use these questions to buy myself time and do some exploratory testing, and also ask for instrumentation help to be able to get some stats out of k6.