DEV Community
#interpretability
Posts
I Trained Probes to Catch AI Models Sandbagging
Subhadip Mitra
Dec 28 '25
#llm #interpretability #agents #machinelearning
6 min read
Peeking Under the Hood: Unlock AI Secrets Beyond Activations by Arvind Sundararajan
Arvind SundaraRajan
Oct 18 '25
#machinelearning #ai #xai #interpretability
2 min read
AI Frontiers: Advances in Efficient, Robust, and Universal Machine Learning – Synthesizing Key Themes from August 2025 a
Ali Khan
Aug 11 '25
#machinelearning #efficiency #robustness #interpretability
1 comment
8 min read
Frontiers in Computer Vision: Interpretability, Efficiency, Robustness, and Unified Learning in the Era of Deep AI Advan
Ali Khan
May 13 '25
#computervision #interpretability #neurosymbolicai #multimodallearning
8 min read
Klarity – Open-source tool to analyze uncertainty/entropy in LLM output (github.com/klara-research)
Giovanna
Feb 4 '25
#opensource #ai #deepseek #interpretability
7 reactions
1 comment
1 min read
Choosing a Suitable Model for Our Data within the Machine Learning Development Process
Ahsan Mangal 👨🏻‍💻
Apr 22 '23
#machinelearning #interpretability #modelselecting #dataanalysis
3 min read
Using explanations for finding bias in black-box models
Andreas Messalas for Code4Thought
Oct 3 '19
#machinelearning #interpretability #fairness #bias
6 reactions
6 min read