In a world increasingly shaped by algorithmic decisions, ensuring that artificial intelligence (AI) systems are explainable and trustworthy has never been more urgent. Dr. John Dorsch from CETE-P argues that we've been looking at this the wrong way.

In his recent presentation to our Responsible AI discussion group and in a research paper co-authored with Maximilian Moll, philosopher and AI ethicist John Dorsch introduces an original framework for understanding how AI can support, not replace, human judgment. The framework, called the theory of epistemic quasi-partnerships (EQP), is grounded in a practical “RCC approach”: Reasons, Counterfactuals, and Confidence.

The Problem: Explainability for Experts, Confusion for Users

Most AI explanations today are made by computer scientists for computer scientists. This leaves actual users facing complex visualizations and technical justifications that offer little meaningful insight. One of Dorsch’s central case studies involves AI decision support systems used by social workers to assess potential child abuse. These are scenarios where getting the decision right isn't just a matter of efficiency – it's a matter of life and safety.

Dorsch argues that AI outputs must go beyond numerical scores. They should include reasons, counterfactuals, and justifications that align with the knowledge and reasoning styles of human users. For example, instead of just stating that an intervention is recommended, the AI should answer questions such as: What would have changed the recommendation? Which inputs mattered most?

Dorsch emphasizes the need for standardized metrics that evaluate explanation quality from the user’s perspective, not just technical completeness. He also clarifies his stance on the often-misunderstood distinction between interpretability (understanding the model's inner workings) and explainability (making decisions clear to end users). The latter can often be misleading or superficial, failing to serve the user's actual needs.

Epistemic vs. Moral Trust

Dorsch emphasizes the importance of distinguishing between epistemic trust and moral trust. While people often speak of “trustworthy AI,” Dorsch warns against attributing moral agency to machines ("this AI is good"). AI can be epistemically trustworthy – reliable in the information it provides – but it is not a moral agent and should not be treated as such.

This distinction makes it clear that we shouldn't blindly follow AI just because it "seems confident." Instead, we should evaluate it as we do other information sources, much like we might trust a peer-reviewed article more than a social media post.

Explanations and Human Psychology

Why do we explain things? According to cognitive psychology, explanations serve two primary purposes: they help us predict behavior and justify actions. Dorsch argues that AI explanations should do both. When an AI system justifies a recommendation, it helps foster trust – not just by being right, but by being understandably right.

This is particularly important in domains like loan approval or legal sentencing, where users need context, not just outcomes. It's not enough to say "no loan approved"; people want to know why and what might change that outcome in the future.

AI as a Quasi-Partner – The RCC Approach

The central proposal in Dorsch’s paper is the concept of AI as a quasi-partner in decision-making. AI can process massive data sets and surface relevant insights, but lacks human judgment, ethical reasoning, and social accountability. In this sense, the AI can act like a partner, but cannot be a full one. This model holds particular promise for professionals like clinicians, legal analysts, engineers, and public health officials – those who navigate complex, high-stakes environments and can benefit from AI-supported insights without ceding human responsibility.

The goal, then, is not to replace human decision-makers, but to augment them. To work effectively as a quasi-partner, AI must engage in sound epistemic practices – the same practices that humans use to build understanding and trust in one another’s decisions. These include:

  • Providing Reasons: Clear, human-readable justifications for recommendations. Dorsch explores the nature of reasons, focusing on their stability and explanatory force, which allows for predictions in similar situations. He also notes that normative elements are largely absent from this discussion, which complicates our understanding of what constitutes a reason.
  • Exploring Counterfactuals: Explaining how a decision might change if key inputs were different. Dorsch introduces the concept of “nearby possible worlds.” This refers to small changes in circumstances that could alter outcomes – e.g., what if one more call had been made in a child welfare case? While powerful, this approach has limitations. How do we define what counts as “nearby”? And how do we know in advance what outcomes we’re aiming for? Dorsch acknowledges these challenges but argues that exploring such counterfactuals is essential for meaningful, human-grounded AI explanations.
  • Stating Confidence: Offering calibrated measures of certainty to help users assess when to trust or challenge AI outputs. Here, we point out that offering a specific confidence value for each individual prediction is often not feasible, as most evaluation metrics, such as accuracy, are calculated at a global level rather than on a per-instance basis.

This forms the Reasons, Counterfactuals, and Confidence (RCC) approach, a structured way to design AI explanations that align with human reasoning. The authors of this article advocate for the inclusion of a fidelity condition in AI explanation frameworks, ensuring that explanations genuinely reflect the model's actual reasoning, rather than generic or overly simplified interpretations.
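
To make the RCC structure concrete, here is a minimal sketch of our own (not code from Dorsch and Moll's paper): a hypothetical loan-approval score whose output is packaged as a decision, reasons, a counterfactual, and a confidence value. The scoring rule, feature names, and thresholds are illustrative assumptions.

```python
# A minimal sketch of an RCC-style explanation for a hypothetical loan-approval
# model. The scoring rule, feature names, and thresholds are illustrative
# assumptions, not taken from Dorsch and Moll's paper.
from pprint import pprint


def approve_loan(applicant):
    """Toy model: approve when the weighted score reaches 1.0."""
    score = (0.6 * applicant["income_to_debt_ratio"]
             + 0.4 * applicant["years_of_credit_history"] / 10)
    return score >= 1.0, score


def explain_rcc(applicant):
    approved, score = approve_loan(applicant)

    # Reasons: plain-language statements of how each input moved the score.
    reasons = [
        f"income-to-debt ratio of {applicant['income_to_debt_ratio']:.1f} "
        f"added {0.6 * applicant['income_to_debt_ratio']:.2f} points",
        f"{applicant['years_of_credit_history']} years of credit history "
        f"added {0.4 * applicant['years_of_credit_history'] / 10:.2f} points",
    ]

    # Counterfactual: the smallest single change to one input that flips the
    # decision -- a crude stand-in for searching "nearby possible worlds".
    counterfactual = None
    direction = -1 if approved else 1
    for step in range(1, 51):
        changed = dict(applicant)
        changed["income_to_debt_ratio"] += direction * step / 10
        if approve_loan(changed)[0] != approved:
            counterfactual = (
                f"If the income-to-debt ratio were "
                f"{changed['income_to_debt_ratio']:.1f} instead of "
                f"{applicant['income_to_debt_ratio']:.1f}, the loan would be "
                f"{'declined' if approved else 'approved'}."
            )
            break

    # Confidence: here just the distance from the decision threshold squashed
    # into (0.5, 1.0); a real system should report a calibrated probability.
    confidence = round(min(0.99, 0.5 + abs(score - 1.0) / 2), 2)

    return {
        "decision": "approved" if approved else "declined",
        "reasons": reasons,
        "counterfactual": counterfactual,
        "confidence": confidence,
    }


pprint(explain_rcc({"income_to_debt_ratio": 1.2, "years_of_credit_history": 4}))
```

For a declined applicant, the output pairs plain-language reasons with an “If X, then Y” counterfactual and a confidence value the user can weigh against their own judgment – the three elements the RCC approach asks for.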

What the Research Shows

Dorsch’s theory is backed by empirical studies in explainable AI (XAI). Key findings include:

  • Confidence Works – But With Limits: Users tend to trust AI more when it states how confident it is. But this can lead to overreliance unless paired with other forms of explanation.
  • Counterfactuals Enhance Understanding: Showing how slight changes in input would alter the outcome helps users grasp the logic behind a decision.
  • Example-Based Justifications Reduce Bias: Prototype explanations – like “this case is similar to that one” – help users think critically rather than blindly defer to the AI.

Crucially, methods like feature-importance bar charts (feature graphs) often fail because they require additional technical interpretation. In contrast, explanations that resemble natural human reasoning – statements like “If X, then Y” – are more effective, as the small sketch below illustrates.
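
As an illustration of that contrast (our own, reusing the hypothetical loan features from the sketch above rather than data from the cited studies), the same model output can be printed as a feature graph or as a single conditional sentence:

```python
# Two renderings of the same hypothetical model output (same illustrative
# loan features as above). Neither the numbers nor the phrasing come from
# the studies cited in the text.
attributions = {"income_to_debt_ratio": 0.72, "years_of_credit_history": 0.16}
counterfactual = ("income-to-debt ratio", 1.2, 1.4)

# Feature-graph style: faithful, but the user still has to interpret weights.
print("Feature contributions:")
for name, weight in sorted(attributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {name:<26} {'#' * int(abs(weight) * 10)} ({weight:+.2f})")

# "If X, then Y" style: the same information phrased as everyday reasoning.
feature, current, needed = counterfactual
print(f"\nIf the {feature} were {needed} instead of {current}, "
      "the loan would be approved.")
```

The first rendering is accurate but leaves the interpretive work to the user; the second states the decision logic in the conditional form people already use to reason about their own choices.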

Final Thoughts

As AI regulations like the EU AI Act push for transparency in “high-risk” systems, we urgently need ways to design AI explanations that non-experts can use responsibly. Dorsch’s EQP theory provides both ethical and practical guidance. It avoids the pitfall of anthropomorphizing AI – pretending it's human – and instead focuses on building useful, transparent partnerships between people and machines. 

On behalf of our RAI team, we’re excited to collaborate with Dorsch on bringing his ideas into real-world explanation methods and use cases. We agreed that grounding the theory in specific domains and model constraints will be essential for helping scientists and developers apply it effectively.

This article was written by Martin Krutský and Jakub Peleška (FEE CTU) in cooperation with John Dorsch (CETE-P), reflecting valuable inputs from the whole RAI discussion group.