Llmevaluation - DEV Community

Christopher Hoeben

Jul 14

LLM Evaluation System Prompts Scored Rubrics Runtime Guardrails: A Practical Guide for Production

#llmevaluation #systemprompts #scoredrubrics #runtimeguardrails

7 min read

Michael Tuszynski

Jul 13

Try It: A Working Assessment-First Course

#aieducation #llmevaluation #opensource #developertools

4 min read

Michael Tuszynski

Jul 8

Your LLM Judge Needs a Test Suite

#llmevaluation #aiengineering #softwaretesting #generativeai

4 min read

Jul 1

I reviewed six "operator-ready" checklists for AI agents. None of them define the problem correctly.

#agents #llmevaluation #mlops #agentreliability

5 min read

Jul 11

How to Add Evals to an LLM Feature

#llmevaluation #evals #llmfeatures #aitesting

5 min read

May 17

Beyond Scores: A Critical Review of Benchmark Reports for Evaluating Large Language Models

#llmevaluation #benchmarkcontamination #productiontesting #promptengineering

5 min read

May 17

Beyond Scores: A Critical Review of Benchmark Reports for Evaluating Large Language Models

#llmevaluation #benchmarkcontamination #reproducibility #llmasjudge

7 min read

May 16

Beyond Scores: A Critical Review of Benchmark Reports for Evaluating Large Language Models

#llmevaluation #benchmarks #machinelearning #productiondeployment

7 min read

Joyson Fernandes

May 31

Build a Production RAG System on AWS Bedrock from Scratch

#llmevaluation #llmasjudge #apigateway #bedrock

29 min read
