Have you ever seen the term "generative testing" before? If so, what do you understand it to mean?
"really what you want is to sample from a statistical estimator fit to real world log data but afaik nobody does this." I take this to mean you should track the real use of your app and build your tests from that info, but "statistical estimator" - ??? No clue on that piece; do you have any thoughts?
I was struck by the thread because I have also perceived, on some of my teams, a gap between "the tests that are created" and "the tests that I want to maintain because I think they are highly valuable."
TL;DR: instead of specifying all the details of each test case, you specify one or more properties that must always be true at the end of a test case, and maybe some constraints on the inputs (e.g. input string X must have at least 1 character) then the test framework repeatedly: generates its own inputs, runs the test, and sees if the properties are true.
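To make that loop concrete, here's a minimal hand-rolled sketch in Python (the function names and the reverse-twice property are my own illustration, not from the thread; real frameworks like QuickCheck or Hypothesis add input shrinking and smarter generation on top of this basic idea):

```python
import random
import string

def run_generative_test(property_fn, gen_input, runs=200, seed=0):
    """Repeatedly generate inputs and check that the property holds for each."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen_input(rng)
        assert property_fn(value), f"property failed for input: {value!r}"

# Constraint on inputs: strings with at least 1 character.
def random_nonempty_string(rng):
    length = rng.randint(1, 20)
    return "".join(rng.choice(string.printable) for _ in range(length))

# Property that must always be true: reversing a string twice
# gives back the original.
run_generative_test(lambda s: s[::-1][::-1] == s, random_nonempty_string)
```

The test author writes only the property and the input constraints; the framework owns generating the concrete cases.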
As to the statistical estimator bit - I think it's making the generated test data have a similar distribution of values to those experienced in production. I.e. if the input has a field X which can be an integer, and in production a quarter of inputs have X > 3000, then the generated test data inputs should also have X > 3000 a quarter of the time.
I'm not sure I agree with this (but then I'm no expert in this kind of testing). Knowing that your test data matches historical production usage is good, but it won't help you identify bugs that users haven't stumbled on yet. So I'd suggest that there's benefit in having all possibilities equally likely, at least for some test runs. (This is so that you can generate values that haven't been encountered yet in production, because otherwise these could have 0% probability and so never get generated.)
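A quick sketch of what the two strategies could look like side by side in Python (the field X and the 25% / 3000 figures come from the example above; the 0-10,000 domain and the `mixed_x` compromise are my own hypothetical additions):

```python
import random

# Prod-shaped generator: roughly 25% of values exceed 3000, mirroring
# the (hypothetical) observed production distribution for field X.
def production_like_x(rng):
    if rng.random() < 0.25:
        return rng.randint(3001, 10_000)
    return rng.randint(0, 3000)

# Uniform generator: every value in the domain equally likely, so inputs
# never yet seen in production still get generated.
def uniform_x(rng):
    return rng.randint(0, 10_000)

# Compromise: mostly prod-shaped data, with some uniform draws mixed in
# to probe unexplored corners of the input space.
def mixed_x(rng, prod_weight=0.8):
    gen = production_like_x if rng.random() < prod_weight else uniform_x
    return gen(rng)

rng = random.Random(0)
samples = [production_like_x(rng) for _ in range(10_000)]
frac_big = sum(x > 3000 for x in samples) / len(samples)  # roughly 0.25
```

The weighting is the whole debate in miniature: `prod_weight` near 1 matches history, near 0 explores the unseen.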
I think generative/property-based testing can be beneficial, but the original Twitter poster seems to think that it's an either/or. Traditional unit tests are the first line of defense, and you add other layers on top of them, which can include generative/property-based testing.
As for pulling data from prod to make the generative testing more reflective of the real world, that doesn't seem like a great idea. I'd rather define my input/output criteria based on what I expect from the method, not from what I see in prod. There's also the issue that you're several degrees removed from unit tests once you've gotten to prod/functional test data, and mapping that back to unit tests seems cumbersome at best.
The big argument for property-based testing is that it can find new bugs, whereas unit tests verify existing behavior. I think in general, if you find a new bug via property-based testing, you're likely going to add a unit test for it.
We are talking about data-driven testing, but at a complexity level where simple domain analysis is not going to find the defect because the code itself is genuinely complex. That's probably an argument for using production data as an input, but I agree with @bobs: production data will probably not find the really costly or fatal bugs that only 2% of your customers may ever encounter. What if that 2% of your customers who hit it happen to be the ones who send you the most cash for their single-instance license? Most of us hard-code some basic regression test input "sets," and to me that is always a red flag. Often we only test a small subset of the real boundaries as they change over time, and because the code you are testing is waaaay "smarter" than you are (it's hiding this bug from you, so it has to be), dumb arrays of data are costing you time.
I've only ever seen this done in a security testing suite, where the adversarial data sets were generated using a small "set of rules" and then randomly thrown at a component. Would love to see a Domain Specific Language, perhaps, that helps the tester find good (evil) input candidates.
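Short of a full DSL, that "small set of rules" approach can be sketched as a list of generator functions in Python (the payloads here are illustrative placeholders I made up, not a real adversarial corpus, and `component_under_test` is a stand-in name):

```python
import random

# Each rule is a function from an RNG to one hostile input.
RULES = [
    lambda rng: "",                                      # empty input
    lambda rng: "A" * rng.choice([255, 256, 4096]),      # length-boundary blobs
    lambda rng: "'; DROP TABLE users; --",               # injection-shaped text
    lambda rng: "\x00\x01\xff",                          # control/binary chars
    lambda rng: str(rng.choice([-1, 0, 2**31, 2**63])),  # numeric edge values
]

def adversarial_inputs(n, seed=0):
    """Yield n inputs, each produced by a randomly chosen rule."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.choice(RULES)(rng)

def component_under_test(s):
    # Stand-in for the real component: the property being checked is
    # simply that it never raises on hostile input.
    return len(s)

for payload in adversarial_inputs(20):
    component_under_test(payload)
```

A real DSL would mostly be a nicer syntax for composing and parameterizing rules like these.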
Oh, your data-driven/sampling comments reminded me that I meant to say: depending on your tolerance for "bugs" in production, there's probably more bang for the buck in increasing the observability and alerting of your system than in trying to wire up property-based testing and using stats/prod data to find good ways to bound the properties...