Scale YouTube
InfoQ: Elena Samuylova on Large Language Model (LLM) Based Application Evaluation and LLM as a Judge

What’s the scoop?

InfoQ sat down with Elena Samuylova from Evidently AI to chat about all things LLM evaluation, from setting up robust testing and monitoring pipelines to picking the right tools for your AI-powered apps. She walks through best practices that help you catch drift, bias, or performance hiccups before they hit production.

Bonus goodies:

You can dive into the full interview transcript online, and while you’re at it, subscribe to the Software Architects’ Newsletter for more insider tips. If you’re itching for in-person inspiration, check out InfoQ’s upcoming Dev Summit in Munich, QCon events in San Francisco, New York, and London, plus a whole lineup of podcasts and community channels to keep you ahead of the curve.

Watch on YouTube
