Beyond the Prompt: Evaluating, Testing, and Securing LLM Applications
At NDC Copenhagen, Mete Atamel shows you how to move past guesswork when tweaking prompts or RAG pipelines in your LLM apps. You’ll discover which metrics really matter, using evaluation frameworks like Vertex AI Evaluation, DeepEval and Promptfoo to turn vague changes into solid data.
But accuracy isn’t the only goal—this talk also covers testing and security tools (hello, LLM Guard) to block prompt injections and prevent harmful responses. Think of it as your go-to guide for measuring, validating and locking down any large-language-model project.
Watch on YouTube
Top comments (0)