Testing / QA in Production

I saw this article on QA in Production and it made me wonder if there were any other resources out there on the topic.

There are a couple of TestBash talks here:

Some other article:

Anyone else have useful resources or stories on testing in production to add to this?

6 Likes

Netflix’s Chaos Monkey has always intrigued me. Imagine having bots kill elements in your production environment, on purpose, to test the resilience… “in the wild”. Related story:

Usually testing in production is labelled “shift right”.

At https://www.linkedin.com/pulse/8-reasons-shift-testing-right-lanette-creamer/ Lanette writes about Uptime Testing; Release Dress Rehearsal; End to End Workflow Testing; Collecting, Translating, and Isolating Customer Issues; Actionable Analytics; Security Bug Bounty Program; More Automation; and Customer Visits.

5 Likes

Katrina Clokie’s book “A Practical Guide to Testing in DevOps” contains great insights about this subject: see https://leanpub.com/testingindevops

4 Likes

…And “Testing in Production - Quality Software Faster” by Michael Bryzek: https://www.infoq.com/presentations/testing-software-production

2 Likes

A big +1 for Chaos Monkey. I discovered a bug in the load balancers at Rackspace (our hosting provider at the time) using it, and it was so serious that we ended up switching to another provider.

1 Like

You already linked to a blog post by Charity Majors, but she also gave a great presentation at Strange Loop this past year:

She’s a pretty entertaining speaker, and has an interesting focus on observability over metrics and dashboards. She also stresses that she’s not saying to replace testing in lower environments, just that at large scale, you can’t test for the situations you’re likely to encounter in prod.

2 Likes

This bliki entry on Synthetic Monitoring from Martin Fowler’s blog is also great :smiley:

2 Likes

I remember when I worked for a credit card company, and the sheer joy of “testing in production”, because it meant going around shops with several of our cards, looking at their payment terminal, and going “what’s the cheapest thing in the shop … I WILL BUY IT”.

For me, a huge part of it has always been monitoring - looking at what our users are doing, and understanding the scope of what they do. We use Piwik to help with this. We’ll perform biscuit factory testing (random sampling) to make sure things are as we expect, as well as build up general models of what people do.
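To make the random sampling idea concrete, here is a minimal sketch of pulling a small batch of production visits for manual review. The record shape (`visit_id`, `page`) and the function name are made up for illustration; they are not what Piwik exports.

```python
import random


def sample_for_review(records, sample_size, seed=None):
    """Pick up to sample_size records uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the batch reproducible
    records = list(records)
    k = min(sample_size, len(records))  # never ask for more than we have
    return rng.sample(records, k)


# Hypothetical visit log: 1000 visits, of which we review 20 by hand.
visits = [{"visit_id": i, "page": "/checkout"} for i in range(1000)]
batch = sample_for_review(visits, 20, seed=42)
```

Passing a seed is optional; it just means two people looking at "today's sample" see the same visits.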

We’d also develop methods and rules for doing our own testing in production. As you can imagine, when spending with credit cards, I’d come back with a lot of receipts which needed to be filed, and we also had a way of monitoring “cards for testing” for misuse.

BTW - best damn test I’ve ever done in production. I created credit cards for an 18-year-old and a 17-year-old. One would be able to purchase alcohol, the other would not. Sadly I was not allowed to keep the purchased alcohol.

2 Likes

For a client we built a “beta testers program” where users could sign up to access new features before they became generally available. They were also offered the option of being tracked during their visits to the web application, as a way to improve the service and discover flaws.

With only 8% of all users signing up for the program, we had detailed end-to-end flows through the whole web application. That gave us repeatable usage data that we could use in our automated test tools (like Selenium). Issues that these users triggered (404 and 500 errors) were added to the priority issue tracker with full detail on which pages they visited, which button or image they clicked, and which components were affected.

We discovered many small but high-impact issues that we would normally not find in our pre-production tests. We also learned that users don’t use an application the way developers, testers, business owners or project managers think they will. This kind of insight into user behaviour on the production web application taught us many things and made the overall application better for the end user.

Netflix Chaos Monkey was also mentioned in this thread. We consider that to be part of our resilience tests, which find flaws in our design and dependencies when parts of our application architecture are unavailable or only partially available. It’s a necessary step to ensure that we fail safely, or that we have mechanisms in place for when a dependency is not available. This has taught us to implement heartbeats and deferred execution solutions in our application architecture.
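For anyone wondering what heartbeats plus deferred execution can look like, here is a rough sketch, not our actual implementation. The class names, the `payments` dependency, and the 5 second timeout are all invented for the example: a dependency counts as alive only if it has sent a heartbeat recently, and work that needs a dead dependency is queued rather than failed.

```python
import time
from collections import deque


class HeartbeatMonitor:
    """Tracks when each dependency last reported in."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so tests can control time
        self.last_seen = {}

    def beat(self, name):
        self.last_seen[name] = self.clock()

    def is_alive(self, name):
        seen = self.last_seen.get(name)
        return seen is not None and self.clock() - seen <= self.timeout_seconds


class DeferredExecutor:
    """Runs tasks immediately if the dependency is alive, otherwise queues them."""

    def __init__(self, monitor, dependency):
        self.monitor = monitor
        self.dependency = dependency
        self.pending = deque()

    def submit(self, task):
        """Return True if the task ran now, False if it was deferred."""
        if self.monitor.is_alive(self.dependency):
            task()
            return True
        self.pending.append(task)
        return False

    def drain(self):
        """Run queued tasks while the dependency is alive; return how many ran."""
        ran = 0
        while self.pending and self.monitor.is_alive(self.dependency):
            self.pending.popleft()()
            ran += 1
        return ran
```

In a setup like this, a chaos experiment that kills the dependency should leave tasks sitting in `pending` instead of surfacing errors to users, which is exactly the behaviour you want to verify in production.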

One important piece of advice: when you want to implement such a “beta program”, especially with the upcoming GDPR and other privacy regulations, make sure you have explicit approval from the user, as you will gather a lot of information, often sensitive PII. And have people re-confirm their interest every 6 months or every year, as most people forget they signed up for the program.
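The re-confirmation rule can be as simple as treating consent as expired after a fixed period. A small sketch, assuming consent timestamps are stored as timezone-aware datetimes; `RECONFIRM_AFTER` and `tracking_allowed` are hypothetical names, and 182 days is just one way of reading “6 months”:

```python
from datetime import datetime, timedelta, timezone

# Assumed policy value: consent must be re-confirmed after roughly six months.
RECONFIRM_AFTER = timedelta(days=182)


def tracking_allowed(consented_at, now=None):
    """Allow tracking only with explicit consent that has not yet expired."""
    if consented_at is None:  # the user never opted in
        return False
    now = now or datetime.now(timezone.utc)
    return now - consented_at <= RECONFIRM_AFTER
```

Checking this on every tracked request means a user who never re-confirms simply drops out of the program instead of being tracked indefinitely.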

3 Likes

We at Springer Nature test a lot in production. We tried to write something up. It’s pretty high level. Let us know if it’s useful :slight_smile:

1 Like

Hi,

Whether it is performance testing services or web app testing, most of the time we only get a chance to test before release.
There are, however, certain advantages to testing in production.

  1. You get live feedback from customer reviews.
  2. You see the real-time user experience.
  3. The performance of the app can be monitored while it is live.
  4. Load testing can be executed better in production.
  5. Beta releases are the best way to get feedback on newly added features.
  6. Live traffic data helps you analyze the app and see which services are used most.

I once worked at a company that made printed products. The complete workflow, from order through to final printing, was only executed in production. The same went for real payment processing, as opposed to the test mode of payment processing invoked in test environments, or the test credit card numbers that only worked on fake orders in test environments.

As a result, some tests were run in production to regression-test the payment processing and the product printing, with the printed product shipped back to the company or to an employee’s home address. The test orders typically had random, meaningless text and images printed on the product; they were not customized orders for the employee to make use of while testing the system. I think any actual payments were refunded by the billing team for these orders, so the employer absorbed the cost of testing the system. Other times, special test promo codes were used to discount the product to zero so that the order did not need to be billed.
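I don’t have the original code, but the promo code approach boils down to something like the sketch below. The code `QA-ZERO-EXAMPLE` and the function name are invented here; a real system would also want to restrict who can redeem such codes.

```python
from decimal import Decimal

# Hypothetical test-only promo codes that zero out an order.
TEST_PROMO_CODES = {"QA-ZERO-EXAMPLE"}


def amount_to_bill(order_total, promo_code=None):
    """Return what to charge: zero for test promo codes, the full total otherwise."""
    if promo_code in TEST_PROMO_CODES:
        return Decimal("0.00")  # nothing to bill, so nothing to refund later
    return order_total
```

The order still flows through the real production pipeline, printing and shipping included; only the charge is skipped.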

1 Like