How-To Test Third Party Generative AI Models?

sharmon · 7 November 2023 14:37

With the big push for generative AI models, and companies pushing to include them in workflows, I find myself lost on how to approach what is mostly a black box, especially when it is a third-party AI model.

user actions → ai black box → outputs.

How does someone actually test this?
How can we automate this for regression tests?
What strategies should there be around this work?
How do you test something that is primarily out of your hands and black box?

kinofrost · 7 November 2023 16:31

It’s as big a problem as you imagine it is. A third-party AI model is a big black box. You probably don’t even have access to the training data, so you can’t look for biases and mistakes. The model is unlikely to defend or explain its output, so you have no idea why it came to a given conclusion.

You can research AI attacks (e.g. the WIRED article "To Break a Hate-Speech Detection Algorithm, Try 'Love'") to see if that inspires anything.

But yes, there is a lot of trust involved. One thing you can do is imagine the worst thing the AI system could output and test that you handle it properly. At some point it will surprise you and provide output that nobody can explain.
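To make the "handle the worst output" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `call_model` stands in for whatever client calls the third-party model, and the blocked patterns, length cap, and fallback text are hypothetical placeholders for rules your own stakeholders would define. The point is only that the wrapper around the black box is code you control and can test deterministically.

```python
import re
from typing import Callable

# Hypothetical fallback shown to the user when the model output is rejected.
FALLBACK = "Sorry, I can't help with that request."
# Hypothetical cap on how much text we are willing to pass through.
MAX_LEN = 2000

# Hypothetical examples of output that must never reach the user.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # looks like a US social security number
    re.compile(r"<script\b", re.IGNORECASE),   # markup injection
]


def guarded_reply(call_model: Callable[[str], str], prompt: str) -> str:
    """Call the black box, then refuse to pass on anything we cannot accept."""
    try:
        raw = call_model(prompt)
    except Exception:
        # Timeouts, rate limits, vendor outages: all treated as "no usable answer".
        return FALLBACK
    if not isinstance(raw, str) or not raw.strip():
        return FALLBACK
    if len(raw) > MAX_LEN:
        return FALLBACK
    if any(pattern.search(raw) for pattern in BLOCKED_PATTERNS):
        return FALLBACK
    return raw
```

In tests you can pass a stub in place of the real model, for example `guarded_reply(lambda p: "SSN 123-45-6789", "hi")`, and assert that the fallback comes back. That checks your handling of bad output without depending on the vendor's model at all.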

The intrinsic testability of a system relies on our ability to observe and control it, and to assess the relationships between inputs and outputs. If we cannot model the system we cannot properly test it. So instead we treat it as dangerous and unpredictable.

Either move the legal and moral responsibility elsewhere or don’t use AI black box systems for anything that matters. I’m sure you’ve seen people get in trouble for it all the time in the news.

sharmon · 8 November 2023 12:55

Thank you for the reply! That did put a lot into a new perspective and helps me understand that I’m not totally off the mark with the worries I have.

It does make sense that if I don’t have control, then I need to accommodate the worst case scenarios. So that will help inform some testing decisions. Thanks for the article! I’ll for sure check it out!

conrad.connected · 8 November 2023 15:11

Yeah, basically only test for outputs that would definitely get the company sued or that actually breach the TOS. You will find that stakeholders can only provide a very small number of these on paper in black and white, and that will be to your advantage. Do not try to solve the world.
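As a rough sketch of what that short list could look like as an automated regression check (Python, with everything hypothetical): `PROHIBITED` stands in for the handful of rules stakeholders actually sign off on, `call_model` is whatever function calls the model, and each prompt is sampled a few times because the output is not deterministic.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Hypothetical stakeholder-approved rules: rule name -> pattern that must never appear.
PROHIBITED = {
    "medical_dosage": re.compile(r"\btake \d+\s?mg\b", re.IGNORECASE),
    "guaranteed_returns": re.compile(r"\bguaranteed (profit|return)s?\b", re.IGNORECASE),
}


def find_violations(
    call_model: Callable[[str], str],
    prompts: Iterable[str],
    samples: int = 3,
) -> List[Tuple[str, str, str]]:
    """Return (prompt, rule name, output) for every prohibited output seen."""
    violations = []
    for prompt in prompts:
        # Sample each prompt several times since the same prompt can yield different text.
        for _ in range(samples):
            output = call_model(prompt)
            for name, pattern in PROHIBITED.items():
                if pattern.search(output):
                    violations.append((prompt, name, output))
    return violations
```

A regression run then reduces to asserting that `find_violations(call_model, known_risky_prompts)` is empty. A clean run only shows that none of the sampled outputs matched a rule. It does not prove the model can never produce one.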

sharmon · 10 November 2023 14:01

That does make sense. Thanks for your perspective on it! That will really help narrow down the approach so we aren’t spinning our wheels on something we don’t really have control over.

kristof · 14 November 2023 07:29

Very good question! I’ve worked on some projects where they used machine learning and such to predict a certain outcome.

So it totally depends on what it does or should do.
I’ve used probability theory testing and metamorphic testing before for ML models.

How to test Machine Learning Models? Metamorphic testing
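For anyone who hasn't seen metamorphic testing: you don't check whether a single output is "correct", you check that a known relationship holds between the outputs for related inputs. Below is a minimal Python sketch of one such relation (invariance under a change that should not matter). The `toy_model` classifier and the `add_noise` transform are made-up stand-ins, not a real model or library. With a generative model you would compare some property extracted from the output (a label, a number, a refusal flag) rather than the raw text.

```python
from typing import Callable, Iterable, List, Tuple


def check_invariance(
    model: Callable[[str], str],
    inputs: Iterable[str],
    transform: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Metamorphic relation: model(transform(x)) should equal model(x).

    Returns (input, output before, output after) for every input that breaks it.
    """
    failures = []
    for text in inputs:
        before = model(text)
        after = model(transform(text))
        if before != after:
            failures.append((text, before, after))
    return failures


def add_noise(text: str) -> str:
    """Hypothetical meaning-preserving change: casing, padding, punctuation."""
    return "  " + text.upper() + "!!  "


def toy_model(text: str) -> str:
    """Stand-in for the real model: a trivial keyword sentiment classifier."""
    return "positive" if "good" in text.lower() else "negative"
```

Here `check_invariance(toy_model, ["good service", "bad service"], add_noise)` returns an empty list, because the toy classifier ignores case. A version that checked `"good" in text` without lowercasing would fail on "good service", which is exactly the kind of inconsistency this technique is meant to surface without needing a known-correct answer.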

What’s also interesting is to check out these top 10 lists, because testing doesn’t always mean validating the output:

sharmon · 14 November 2023 12:36

Interesting! I’ll gladly add those to my reading lists! Thank you!
