Hey guys, I’ve been working at a company that builds a tool centered around a chatbot backed by an LLM agent. We use a logging tool to look back at previous conversations in our test environment and look for ways we could improve the experience. The traffic has grown too large for manual review, and so far we haven’t been able to isolate anomalies and potentially misleading responses. How do you think about anomaly tracking with LLMs?
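To make the question a bit more concrete, here’s the rough shape of what I’ve been imagining for a first automated pass: embed each logged conversation and flag statistical outliers for human review. This is just a sketch to anchor the discussion; the embedding model, the outlier method, the contamination rate, and the log format are all placeholders, not what we actually run.

```python
# Sketch: embed logged conversations and flag statistical outliers for human review.
# Assumes sentence-transformers and scikit-learn are installed; the model name,
# contamination rate, and the shape of the exported logs are placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import IsolationForest


def flag_outlier_conversations(conversations: list[str], contamination: float = 0.02) -> list[int]:
    """Return indices of conversations that look unusual relative to the rest of the traffic."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
    embeddings = model.encode(conversations, show_progress_bar=False)

    # IsolationForest labels each point 1 (inlier) or -1 (outlier) based on how
    # easily it can be isolated from the rest of the embedding cloud.
    detector = IsolationForest(contamination=contamination, random_state=0)
    labels = detector.fit_predict(embeddings)

    return [i for i, label in enumerate(labels) if label == -1]


# Usage idea: export flattened transcripts from the logging tool, then route only
# the flagged conversations to manual review.
# flagged = flag_outlier_conversations(transcripts)
```

This only catches conversations that are unusual relative to the bulk of the traffic, so it wouldn’t catch a misleading response that looks “typical”, which is partly why I’m asking how others frame the problem.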