Heya everyone
Been a while, but great to be back in the club!
AI-based apps are everywhere, and companies have incorporated AI into their existing offerings one way or another, in small and big ways.
Ignoring all the hyped apps and looking only at use cases that truly add value, even in the smallest of ways, testing them has been such a precarious thing.
The fact that the input drastically affects the output is now actually a feature and not a bug!
How is everyone approaching testing them? The obvious thing that everyone has started adopting is evals.
But the non-determinism is such a contrast to what we're used to that it's close to impossible to cover even 80% of the cases.
So my question is: how have you changed your mindset to test AI-based features that are non-deterministic, inherently biased, and easily manipulated through prompt injection and the like? How has your thought process changed?