Prompt / agent tooling for testing various LLMs and evals
Hey, it's been a long time since I posted here 😛
I’d love to be able to:
- Iterate on and validate my assistants' output more systematically, a bit like a QA engineer would
- Compare various models
I came across this demo of an open source project that does exactly that!
While I was building an agent and iterating on the prompt, I would have loved to have 3-4 basic scenarios where I expected a specific answer. It would have let me iterate with peace of mind, knowing I wasn't regressing to a previous state.
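For anyone reading along, here's a tiny sketch of what those golden scenarios could look like in Python. Everything here is hypothetical: `run_agent` is just a stand-in for however your agent is actually invoked, and the substring check is the crudest possible grader.

```python
# A rough sketch of the "golden scenarios" idea: a few fixed inputs with
# expected answers, re-run after every prompt tweak to catch regressions.

SCENARIOS = [
    {"input": "Say hello in French.", "expected": "bonjour"},
    {"input": "What is 2 + 2?", "expected": "4"},
    {"input": "Name the capital of Japan.", "expected": "tokyo"},
]


def run_agent(user_message: str) -> str:
    # Stub so the script runs as-is; replace with your real agent/LLM call.
    canned = {
        "Say hello in French.": "Bonjour!",
        "What is 2 + 2?": "2 + 2 equals 4.",
        "Name the capital of Japan.": "The capital of Japan is Tokyo.",
    }
    return canned.get(user_message, "")


def check_scenarios() -> bool:
    all_passed = True
    for scenario in SCENARIOS:
        answer = run_agent(scenario["input"])
        # Naive case-insensitive substring check; a real eval might use
        # fuzzy matching or an LLM judge instead.
        passed = scenario["expected"].lower() in answer.lower()
        all_passed = all_passed and passed
        print(f"{'PASS' if passed else 'FAIL'}: {scenario['input']}")
    return all_passed


if __name__ == "__main__":
    raise SystemExit(0 if check_scenarios() else 1)
```

Running this after every prompt change gives you a quick pass/fail signal, which is basically the regression safety net described above.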
Thank you both! Thê-Minh TRINH, always happy to have you hang out with us over here ✌️
That's great feedback. The tricky part is supporting this without adding too much complexity to the user experience. I'll definitely share it with the team!