DEV Community
Close
# evals
Evaluate LLM code generation with LLM-as-judge evaluators
Scarlett Attensil for LaunchDarkly · Mar 26
#ai #evals #llm #agents
6 reactions · 12 min read
From zero evals to a working multimodal evaluation in 30 minutes using LangWatch Skills
Manouk Draisma for LangWatch · Mar 24
#ai #agents #evals #claudecode
7 min read
Self-improving Coding Agents
Raphael Porto · Mar 27
#agents #harness #ai #evals
1 reaction · 5 min read
Your coding agent already knows how to test your AI agent (we just turned it into a Skill)
Manouk Draisma · Mar 23
#agents #agentskills #evals #simulations
1 reaction · 4 min read
We're a blogging-forward open source social network where we learn from one another