InfoQ: Elena Samuylova on Large Language Model (LLM) Based Application Evaluation and LLM as a Judge

#career #architecture

Elena Samuylova from Evidently AI joins InfoQ to talk through how to evaluate LLM-powered applications: testing strategies, monitoring tools, and real-world practices that keep your AI running smoothly in production.

She also covers using LLMs as judges: how one model can assess another's output, and how that can help you improve performance and reliability.

Watch on YouTube
