DEV Community
#reproducibility
Posts
Beyond Scores: A Critical Review of Benchmark Reports for Evaluating Large Language Models
Ismail zamareh
May 17
#llmevaluation #benchmarkcontamination #reproducibility #llmasjudge
7 min read
When an AI Pipeline Passes — But One Path Still Must Be Held: EXP-034
Kwansub Yun
Apr 27
#bioinformatics #reproducibility #governance #ai
7 min read
Docker works; until it doesn't. Why I started using Nix for dev environments
Charalambos Emmanouilidis
Mar 20
#devops #nix #docker #reproducibility
2 reactions
9 min read
The Agent Reproducibility Paradox: Debugging Non-Determinism in Production
ArkForge
Mar 17
#agents #debugging #reproducibility #observability
2 reactions
6 min read