Skills required for testing machine learning systems

I came across this short piece and wondered what others testing machine learning systems thought, especially anyone testing reinforcement learning or probabilistic models.

What skills do you need when you are testing policies or models?

On the article:
It was nice to see an attempt at discussing testing in ML, but it did not quite match my understanding (maybe it was classification-oriented). The focus also seemed a bit waterfall, separating testing from development.

For example, for me points 1 and 2 would be the job of the researcher or ML engineer, based on what policy or model they are building.

“Test all the possible combinations” seemed a strong statement. That is hard enough in a deterministic system, let alone for a stochastic model or policy.

However, the points about validation and not expecting a binary answer from testing ring very true for me, although this is not new to machine learning: performance testing has always had that problem.
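To make the "no binary answer" point concrete, here is a minimal sketch of how a test for a stochastic policy might assert on a confidence bound rather than on a single run. The `run_episode` function, the 90% success rate and the thresholds are all hypothetical stand-ins, not anything from the article.

```python
import math
import random


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion."""
    if trials == 0:
        return 0.0
    p_hat = successes / trials
    denom = 1 + z * z / trials
    centre = p_hat + z * z / (2 * trials)
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    )
    return (centre - margin) / denom


def run_episode(rng: random.Random) -> bool:
    """Hypothetical stand-in for one rollout of a stochastic policy."""
    # Assumption: pretend the policy succeeds roughly 90% of the time.
    return rng.random() < 0.9


def policy_meets_threshold(trials: int, threshold: float, seed: int) -> bool:
    """Pass only if we are reasonably confident the success rate >= threshold."""
    rng = random.Random(seed)  # seeded so the test itself is repeatable
    successes = sum(run_episode(rng) for _ in range(trials))
    return wilson_lower_bound(successes, trials) >= threshold
```

The design choice here is to fail the test only when the evidence says the policy is below the bar, so the number of trials and the threshold become explicit decisions rather than something hidden behind a pass/fail flag.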

On skills:
It depends on what stage you are talking about.

Our test professionals working on the core model or policy need to have a strong mathematical background and need to get stuck into reading the papers and books that the algorithms are based on. This gives them the understanding to test, probe and build confidence in what is being produced.

Those in product teams can treat things more as a black box but, as the article mentions, need a good understanding of statistics, a bit of data science, and the technical skills to build good datasets and understand the metrics used to assess model or policy performance.

I think it is going to be really important to have people who can come up with diverse datasets to train the AI model. The more diverse the training dataset, the better the model will be able to generalize to data it has not seen. For example, say you are training an AI model to recognize buttons in images. We need a dataset that includes round, diamond, square and rectangular buttons, and also some images that are not buttons. Putting together a good training dataset like this takes a skilled person.
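As a rough illustration of the button example, a tester could start with a simple coverage check over the dataset labels. This is only a sketch: the label names and the 5% minimum share are assumptions made up for the example, and label counts alone say nothing about variety within each class.

```python
from collections import Counter

# Hypothetical label names for the button example above.
REQUIRED = ("round", "diamond", "square", "rectangle", "not_button")


def coverage_gaps(labels, required=REQUIRED, min_share=0.05):
    """Return required classes whose share of the dataset is below min_share."""
    counts = Counter(labels)
    total = len(labels)
    gaps = {}
    for name in required:
        share = counts[name] / total if total else 0.0
        if share < min_share:
            gaps[name] = share  # missing classes show up with a share of 0.0
    return gaps
```

For instance, a dataset with no diamond buttons and only a handful of non-button images would report both of those classes as gaps.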

We need people to constantly monitor the learning of the AI model and guard against overfitting (when a model learns the detail and noise in the training data so well that its performance on new data suffers) and underfitting (when a model is too simple, or has not trained enough, to capture the underlying pattern, so it performs poorly even on the training data). For this we would ideally have people with some data science and statistical modeling background.
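One simple way to monitor for this is to compare training and validation loss over time. The sketch below assumes you already have per-epoch loss values; the function name and both thresholds are illustrative assumptions and would need tuning for a real model.

```python
def diagnose_fit(train_loss, val_loss, gap_tol=0.1, high_loss=0.5):
    """Label a training run from its final losses.

    Thresholds are illustrative assumptions, not recommended values.
    """
    final_train = train_loss[-1]
    final_val = val_loss[-1]
    # Still poor on the data it was trained on: likely underfitting.
    if final_train > high_loss:
        return "underfitting"
    # Good on training data but much worse on held-out data: likely overfitting.
    if final_val - final_train > gap_tol:
        return "overfitting"
    return "ok"
```

For example, `diagnose_fit([1.0, 0.4, 0.1], [1.0, 0.5, 0.7])` flags overfitting, because validation loss ends well above training loss. In practice people look at the whole curve rather than just the final epoch, so treat this as a starting point.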

Finally, we need people who think outside the box and consider the different scenarios in which the AI system could compromise consumers' security or privacy, or raise ethical concerns. This is very important in this day and age, when so much data is being collected.

In summary, the same skills we have always needed as testers hold good for testing AI/ML systems as well, i.e. curiosity, critical thinking, open-mindedness, technical knowledge and thinking like an end user. I wrote about the role of testers in testing AI-based systems here, for more info -