Scale YouTube

InfoQ: Elena Samuylova on Large Language Model (LLM) Based Application Evaluation and LLM as a Judge

Elena Samuylova on LLM Evaluation and AI App Quality

In this laid-back InfoQ chat, Elena Samuylova from Evidently AI spills her go-to tips for making sure your large language model apps actually do what you think they do. From setting up clear metrics and stress-testing edge cases to picking the right monitoring tools, she walks through the playbook for keeping your AI honest and battle-ready.

Hungry for more? Catch the full interview transcript at bit.ly/4mHAKvN and sign up for InfoQ’s Software Architects’ Newsletter to stay in the loop on AI trends, must-attend events and real-world tips from the trenches.
