Scale Forem

Scale YouTube

InfoQ: Elena Samuylova on Large Language Model (LLM) Based Application Evaluation and LLM as a Judge

In this breezy InfoQ podcast, Elena Samuylova from Evidently AI spills the tea on how to properly evaluate, test, and monitor LLM-powered apps, complete with tips on letting the model play judge (the "LLM as a judge" approach). If you’ve ever wondered about best practices for keeping your AI projects honest and performant, she’s got you covered.

Hungry for more? Grab the full interview transcript, subscribe to the Software Architects’ Newsletter, and don’t miss InfoQ’s upcoming events (Dev Summit Munich, QCon SF, AI New York, London) or their lineup of weekly podcasts for a steady dose of cutting-edge insights.
