DEV Community
#evaluation
Posts
Go Ahead and Judge Me- Agent Evaluators in AWS AgentCore
mgbec for AWS Community Builders, Jan 25
#evaluation #agents #amazonbedrock
6 min read

Why Image Hallucination Is More Dangerous Than Text Hallucination
Priyam, Jan 6
#evaluation #ai #machinelearning #futureagi
1 min read

The Self-Evolving Agent (Part 3): The Human in the Loop
Imran Siddique, Jan 1
#architecture #aigovernance #evaluation #engineeringleadershi
4 min read

Exploring the Benefits of Synthetic Data Generation for AI Agent Evaluation
Kamya Shah, Nov 21 '25
#agents #ai #evaluation
6 min read

[Free eBook] How We Built a Practical Framework for Evaluating AI Agents in Production
Karthik Avinash, Nov 12 '25
#ai #llm #evaluation #agents
1 min read

Machine Learning Zoomcamp Week 4
Shukurat Bello, Oct 22 '25
#machinelearning #evaluation #programming #python
1 min read

LLM Experimentation: Optimizing My Journaling Agent
Margarita Sliachina, Oct 30 '25
#ai #llm #evaluation #langfuse
5 reactions, 1 comment, 15 min read

How to build a self-improving agent that updates your UI in real time
Oliver S, Aug 7 '25
#ai #observability #evaluation #autonomousaifixes
11 reactions, 9 min read

Evaluating AI Agents: Performance, Reliability, and Real-World Impact
Aun Raza, Jul 31 '25
#technology #ai #evaluation #agents
4 min read

Debiasing LLM Judges: Understanding and correcting AI Evaluation Bias
gyani sinha, Jul 3 '25
#llm #ai #bias #evaluation
1 reaction, 5 min read

Case Study: How Junie Uses TeamCity to Evaluate Coding Agents
JetBrains TeamCity, Jun 3 '25
#jetbrains #teamcity #agents #evaluation
5 min read

Retrieval Metrics Demystified: From BM25 Baselines to EM@5 & Answer F1
Shamsuddin Ahmed, Apr 29 '25
#ragevaluation #rag #evaluation #bm25
4 min read

Evaluation Metrics for Summarization
Espoir Murhabazi, May 26 '25
#summarization #evaluation #ai #nlp
1 reaction, 6 min read

Top Open Source Tools for LLM Observability in 2025
Sarah Welsh, May 1 '25
#opensource #evaluation #observability
2 reactions, 11 min read

Code evaluation as a debugging tool
Anatolii Kozlov, Oct 7 '22
#java #groovy #evaluation #spring
4 reactions, 3 min read