DEV Community
AI Agent Evaluation Series' Articles
How to use System prompts as Ground Truth for Evaluation
shashank agarwal
Dec 10 '25
#testing #agents #llm #ai
1 reaction
1 min read
Stop Evaluating AI Agents Like ML Models: A Paradigm Shift for Developers
shashank agarwal
Dec 12 '25
#ai #llm #agents #machinelearning
1 reaction
3 min read
Your System Prompt is Your Ground Truth: Ditch Manual Labeling for AI Agent Evaluation
shashank agarwal
Dec 15 '25
#ai #programming #tutorial #agents
1 reaction
3 min read
Beyond Accuracy: The 73+ Dimensions of AI Agent Quality
shashank agarwal
Dec 17 '25
#ai #agents #machinelearning #programming
3 min read
How to Analyze AI Agent Traces Like a Detective
shashank agarwal
Dec 19 '25
#ai #testing #agents #webdev
3 min read
5 Types of AI Hallucinations (And How to Detect Them)
shashank agarwal
Dec 22 '25
#discuss #ai #machinelearning #agents
1 reaction
3 min read
The Hidden Costs of Inefficient AI Agents (And How to Fix Them)
shashank agarwal
Dec 24 '25
#webdev #ai #programming #devops
1 comment
2 min read
Is Your AI Agent a Compliance Risk? How to Find Violations Hidden in Traces
shashank agarwal
Dec 26 '25
#privacy #agents #security #ai
2 min read
How to Build an AI Agent Evaluation Framework That Scales
shashank agarwal
Dec 29 '25
#ai #webdev #programming #devops
3 min read
Monitoring vs. Evaluation: The Critical Distinction Most AI Devs Miss
shashank agarwal
Dec 31 '25
#ai #webdev #programming #devops
2 min read
The AI Agent Feedback Loop: From Evaluation to Continuous Improvement
shashank agarwal
Jan 1
#webdev #ai #programming #devops
3 min read