Guest Column | January 27, 2025

Enhancing Trust, Safety, And Performance Through Explainability In AI-Enabled Medical Devices

By Yu Zhao, Bridging Consulting


AI-enabled medical devices have become a transformative force in healthcare, offering sophisticated data-driven insights that can enhance diagnostic accuracy, personalize treatment, and optimize clinical workflows. In the United States alone, the FDA has cleared or approved more than 1,000 AI-enabled medical devices.1 Many of these technologies leverage machine learning (ML) or deep learning (DL), which can introduce a “black box” challenge due to their complex, opaque decision-making processes.

This opacity raises questions2 about safety, efficacy, and accountability in clinical environments, where physicians and other clinical users of AI need to trust and understand the rationale behind diagnostic or therapeutic recommendations. Explainability has thus emerged as a critical strategy to illuminate how an AI model arrives at its conclusions, allowing clinicians, patients, and regulators to verify outputs, detect potential biases, and navigate complex decisions with greater confidence.

Although the FDA has not yet authorized any generative AI–enabled medical devices, the same black box challenge exists for generative AI models. The discussions throughout this article apply equally to generative AI systems, underscoring the importance of transparency and accountability across the full spectrum of AI-driven medical solutions.

This article explores why explainability is essential for AI-enabled medical devices, how it enhances trust, safety, and performance, and which available methods — such as LIME and SHAP — can best serve clinical and regulatory needs. It then describes how these considerations integrate into the ISO 14971 risk management framework, offering a structured road map to help device manufacturers decide when and how to implement explainability.

Foundations Of Explainable AI (XAI)

In the AI domain, interpretability usually refers to the internal transparency of simpler “white box” models, whereas explainability focuses on making more complex “black box” algorithms accessible through post-hoc methods.

In healthcare, explainability is not merely a technical nicety; it often serves as an ethical and clinical imperative. By exposing how a device evaluates patient data, clinicians can more confidently integrate AI recommendations into care pathways, ensuring they do not overlook crucial insights or biases embedded in the model. For example, an oncologist using an AI tool to plan chemotherapy regimens must know which patient characteristics or historical treatment responses drive the model’s suggestions, especially if they differ from conventional protocols.

Because explainability is essential to both ethical and clinical decision-making, various techniques have emerged to open the black box.

Overview Of Current Explainability Techniques

More than a dozen methods are currently available to explain AI model outputs. Two widely recognized tools are LIME3 (Local Interpretable Model-Agnostic Explanations) and SHAP4 (SHapley Additive exPlanations):

  1. LIME
    LIME creates small changes (perturbations) around the input data and observes how the model's prediction shifts. It's like playing a "What If?" game with the AI. Then, LIME creates a simpler local model (such as a small linear model) based on these newly generated samples. This simpler model helps explain the AI's decision for that specific prediction. Clinicians or device developers can review the coefficients of this local model to understand which input features most influenced a particular prediction. LIME is model-agnostic, making it very flexible, although the reliability of its explanations can vary depending on how those "What If?" scenarios are generated.
  2. SHAP
    Grounded in cooperative game theory, SHAP assigns an importance value to each model input feature to quantify how much it contributes to the final output. These importance scores indicate the magnitude and direction of each feature's impact on the model's prediction. SHAP values can be used for both global (overall model behavior) and local (individual prediction) interpretability, vital for detecting hidden biases or performance anomalies. However, SHAP can be computationally intensive, especially for high-dimensional data, and may be challenging to interpret in very large or complex data sets. While SHAP provides insights into how features influence predictions, interpreting the underlying reasons behind feature importance often requires domain expertise and additional analysis. A brief code sketch illustrating both techniques follows this list.
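To make the distinction concrete, the following minimal sketch applies both techniques to a hypothetical tabular classifier, using the open-source lime and shap Python packages that implement the methods described in references 3 and 4. The synthetic data, feature names (e.g., lab_marker), and random-forest model are illustrative placeholders, not a validated device model.

```python
# Illustrative sketch only: applying LIME and SHAP to a synthetic tabular
# classifier. Features, data, and model are hypothetical stand-ins for a
# real device's validated inputs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from lime.lime_tabular import LimeTabularExplainer
import shap

# Synthetic "patient" data: four hypothetical features, binary outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
feature_names = ["age_norm", "lab_marker", "bp_norm", "bmi_norm"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# --- LIME: perturb around one prediction and fit a local linear surrogate ---
lime_explainer = LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    class_names=["negative", "positive"],
    mode="classification",
)
lime_exp = lime_explainer.explain_instance(X_test[0], model.predict_proba, num_features=4)
print("LIME local feature weights:", lime_exp.as_list())

# --- SHAP: Shapley-value attributions for the same prediction ---
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X_test[:1])
print("SHAP values for one prediction:", shap_values)
```

In this toy setup, both tools should attribute most of the prediction to age_norm and lab_marker, mirroring how the synthetic outcome was generated; in a real device, such attributions would be reviewed against clinical expectations before being surfaced to users.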

Other techniques, including integrated gradients and Grad-CAM, are often used for neural networks, particularly in imaging applications. In some scenarios, manufacturers may choose inherently interpretable (white box) models5 if their clinical problem does not demand complex architectures. Regardless of the technique, it must deliver explanations that are both accurate (reflect the model’s real behavior) and comprehensible (meaningful to end users).
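As a companion to the tabular example above, the sketch below shows one way integrated gradients might be applied to a neural network, using the open-source Captum library for PyTorch. The tiny model, random input, and zero baseline are hypothetical stand-ins; an imaging device would attribute over pixels and typically visualize the result as a heat map, much as Grad-CAM does.

```python
# Illustrative sketch only: integrated gradients on a tiny, hypothetical
# PyTorch classifier, using the open-source Captum library.
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# A toy stand-in for an imaging or signal model (not a real device model).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

inputs = torch.randn(1, 8)      # one hypothetical input vector
baseline = torch.zeros(1, 8)    # reference point representing "absence" of signal

ig = IntegratedGradients(model)
# Attribute the class-1 score to each input feature by integrating gradients
# along the straight path from the baseline to the input.
attributions, delta = ig.attribute(
    inputs, baselines=baseline, target=1, return_convergence_delta=True
)
print("Feature attributions:", attributions)
print("Convergence delta (should be near zero):", delta)
```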

Most of the current explainability tools highlight influential features or factors in the input data, or approximate a model's reasoning in simpler terms, to help end users, whether clinicians or patients, understand and trust AI model outputs.

Applying The ISO 14971 Risk Management Framework6

To manage the potential hazards introduced by "black box" AI, many organizations leverage the ISO 14971 standard, which offers a structured approach to identifying, analyzing, and controlling risks across the medical device life cycle. The following describes how explainability can serve as a risk control within this framework.

Understanding Intended Use, Users, and Environment

The first step is clearly defining the device’s intended use, its intended users, and the intended use environment. Different clinical settings (e.g., emergency departments vs. outpatient clinics) have distinct requirements for how explanations are delivered and used. If clinicians have limited time or specialized expertise, a succinct high-level explanation may be more appropriate than a detailed feature importance map.

Hazard Identification for "Black Box" AI

Opaque AI models can introduce numerous hazards:

  • Biases or Algorithmic Errors: Without explainability, discriminatory or inaccurate predictions may persist unnoticed.
  • Over- or Under-reliance: Clinicians might follow an AI recommendation blindly, or dismiss it altogether, if they do not understand the rationale behind it.
  • Model Drift: Over time, real-world data may diverge from the training set, leading to performance degradation that can remain undetected in the absence of explainability.

Risk Estimation: Severity and Probability of Occurrence of the Harm

Each identified hazard is assessed for its potential harms, including their severity and probability of occurrence (likelihood). For instance, a diagnostic tool that incorrectly classifies a life-threatening condition could lead to high-severity harms. If the likelihood of such misclassification is more than remote, the pre-mitigation risk may be deemed unacceptable. In these cases, robust risk control measures, such as providing explanations of model predictions, are warranted to mitigate patient harm. One way to estimate the likelihood of harm is to use clinical performance metrics (e.g., sensitivity) from premarket validation studies.
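As a rough illustration of this step, the sketch below derives a probability-of-occurrence estimate for a missed-diagnosis harm from hypothetical validation counts. The confusion-matrix numbers and the qualitative probability bands are invented for illustration; in practice, they would be defined in the manufacturer's own risk management plan.

```python
# Illustrative sketch only: estimating the probability of a missed-diagnosis
# harm from premarket validation results. Counts and probability bands are
# hypothetical and would come from the device's own risk management plan.

# Hypothetical confusion-matrix counts from a validation study.
true_positives = 188
false_negatives = 12

sensitivity = true_positives / (true_positives + false_negatives)
false_negative_rate = 1.0 - sensitivity  # chance the condition is missed

# Example qualitative bands; actual bands are defined in the ISO 14971 plan.
def probability_level(p: float) -> str:
    if p < 0.001:
        return "remote"
    if p < 0.01:
        return "occasional"
    return "probable"

print(f"Sensitivity: {sensitivity:.3f}")
print(f"Estimated probability of missed diagnosis: {false_negative_rate:.3f} "
      f"({probability_level(false_negative_rate)})")
```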

Risk Control Measures

Explainability often emerges as a critical risk control measure:

  • Clear Outputs and Confidence Indicators: When the device highlights which input features were most influential and provides confidence levels for its predictions, clinicians are better able to cross-check or independently verify the AI’s outputs.
  • User Training: Even the best explanations have limited value if clinicians or patient users cannot interpret them. Training may be required to ensure that end users can benefit from these explanations effectively.
  • Escalation Triggers: Systems can prompt a “human-in-the-loop” review or a second opinion when the model’s confidence is low or its explanations appear inconsistent with known clinical norms (a minimal sketch of such a trigger follows this list).
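The sketch below shows one way such an escalation trigger might be expressed, assuming a hypothetical ModelOutput structure, an illustrative confidence threshold, and a simple consistency check against clinically expected drivers. A real device would derive these rules from its validated risk controls and usability data.

```python
# Illustrative sketch only: a simple escalation rule that routes low-confidence
# or inconsistent predictions to human review. The thresholds and consistency
# check are hypothetical placeholders, not a validated device design.
from dataclasses import dataclass

@dataclass
class ModelOutput:
    prediction: str
    confidence: float                 # model's reported probability
    top_features: dict[str, float]    # e.g., SHAP or LIME attributions

CONFIDENCE_THRESHOLD = 0.80
# Hypothetical clinical expectation: these features should dominate a
# "positive" call; if none of them do, flag the explanation as inconsistent.
EXPECTED_DRIVERS = {"lab_marker", "age_norm"}

def needs_human_review(output: ModelOutput) -> bool:
    """Return True when the output should be escalated for clinician review."""
    if output.confidence < CONFIDENCE_THRESHOLD:
        return True
    strongest = max(output.top_features, key=lambda k: abs(output.top_features[k]))
    return strongest not in EXPECTED_DRIVERS

example = ModelOutput("positive", 0.72, {"bmi_norm": 0.4, "lab_marker": 0.1})
print("Escalate to human-in-the-loop review:", needs_human_review(example))
```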

Additionally, the FDA’s Transparency of Machine Learning-Enabled Medical Devices: Guiding Principles7 underscores the role of transparent AI outputs as a core element of risk control, complementing the ISO 14971 framework.

Verifying the Effectiveness of Risk Controls

It is crucial to confirm that explainability truly enhances decision-making and mitigates risk. Performance and usability tests can reveal whether clinicians or patient users can effectively act on the explanations and whether explanations in fact reduce errors or improve safety.

Residual Risk Evaluation and Post-Market Surveillance

Even with explainability in place, some risks may remain. Overly detailed explanations could overwhelm users, while simplistic explanations might hide important nuances. Therefore, manufacturers must evaluate any residual risks to confirm acceptability. Manufacturers should also assess whether adding explanations introduces any new risks or adversely affects any existing risks.

During post-market surveillance, ongoing real-world data collection helps detect model drift, prompting timely updates or retraining. Importantly, explanations themselves may require recalibration as the model evolves or as medical practice shifts over time, ensuring clarity and accuracy are maintained.
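One simple way such drift monitoring might look in practice is sketched below: a two-sample Kolmogorov-Smirnov test compares the post-market distribution of a single, hypothetical input feature against its training-era reference, with an illustrative alert threshold. A production surveillance program would cover all relevant features as well as downstream performance metrics.

```python
# Illustrative sketch only: flagging potential data drift for one input feature
# by comparing its post-market distribution with the training-era reference.
# The feature, data, and alert threshold are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
reference_lab_values = rng.normal(loc=1.0, scale=0.3, size=5000)  # training era
recent_lab_values = rng.normal(loc=1.2, scale=0.3, size=800)      # post-market

statistic, p_value = ks_2samp(reference_lab_values, recent_lab_values)

ALERT_P_VALUE = 0.01  # example threshold from the surveillance plan
if p_value < ALERT_P_VALUE:
    print(f"Possible drift detected (KS statistic={statistic:.3f}, p={p_value:.3g}); "
          "trigger review of model performance and explanation calibration.")
else:
    print("No statistically significant drift detected for this feature.")
```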

When, How, And Why To Implement Explainability

Explainability is not universally mandatory; its priority corresponds to the risk profile of the device. In low-risk scenarios, such as a noncritical patient monitoring app, extensive explanations may be less necessary and can even confuse users. However, for medium- and high-risk devices — like those used for diagnosing serious conditions or recommending treatments — the cost of an incorrect prediction can be significant, making the transparency offered by explainability indispensable.

Selecting an explainability technique depends on model complexity, data type, and user expertise. A radiologist reviewing an imaging-based solution might benefit from Grad-CAM’s heat maps, whereas a pulmonologist using a tabular prediction model for assessing respiratory function might prefer numeric feature attributions via SHAP.

Different stakeholders often require tailored explanation strategies. Clinicians generally want localized, clinically actionable insights, such as feature attributions or confidence thresholds, so they can validate an AI output against their own expertise. Patients, on the other hand, may benefit from simpler, more intuitive visuals that provide reassurance without causing confusion. While regulators are not direct users of the device, they still need evidence that these explanation methods effectively mitigate risks and align with relevant standards.

Ultimately, measuring the effectiveness of explainability involves evaluating whether it improves user trust — for example, by collecting feedback on clinicians’ readiness to adopt the AI-enabled device — and whether it enhances patient safety or improves human-device performance. Common metrics might include decreased misdiagnoses or improved clinical outcomes.

Best Practices And Key Takeaways

  1. Integrate Explainability Early
    Incorporate explainability into device design from the start rather than adding it retroactively. Early collaboration with clinicians and regulators can avoid pitfalls in explaining complex models.
  2. Tailor Explanations To Intended Users
    Different users — clinicians and patients — require tailored levels of explanation to maximize trust and usability.
  3. Leverage The ISO 14971 Framework
    A structured risk management framework aligns explainability strategies with industry best practices, ensuring safety and effectiveness remain paramount.
  4. Validate And Communicate
    Conduct performance tests and usability studies under real-world conditions or simulated use environments. Keep transparent records to facilitate regulatory submissions and foster trust among stakeholders.
  5. Monitor And Adjust
    AI models can drift over time as patient populations or clinical practices evolve. Periodically reevaluate whether explanations remain clear and accurate. Update training data or refine explanations as needed.

Conclusion

As the landscape of AI-enabled medical devices continues to expand, questions surrounding transparency and accountability for "black box" predictions remain central. Explainability stands as a critical component in bridging the gap between powerful data-driven insights and the practical trust-based needs of clinical environments. When integrated into the ISO 14971 framework, explainability serves as an effective risk control measure, illuminating how an algorithm arrives at its conclusions and ensuring that misdiagnoses, biases, or uncertainties do not go unnoticed.

By prioritizing explainability — and tailoring it appropriately to the device’s risk profile, clinical setting, and user population — device manufacturers can foster greater trust among regulators, healthcare providers, and patient users. This transparency ultimately guides better patient care and safer medical interventions. As AI technologies, including ML, DL, and generative AI, advance and become more deeply embedded in clinical workflows, the principles of explainability will remain paramount for maintaining high standards of patient safety, therapeutic effectiveness, and ethical responsibility.

Ongoing dialogue among device manufacturers, clinicians, patients, regulators, and research bodies will be essential to refine explainability techniques, harmonize standards, and ensure that AI-enabled devices continue to serve patients safely and effectively.

References

  1. FDA. (2024). Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices.
  2. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G., & King, D. (2019). Key challenges for delivering clinical impact with artificial intelligence. BMC Medicine, 17(1), 1-9.
  3. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135-1144.
  4. Lundberg, S. M., & Lee, S. I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems, 30, 4765-4774.
  5. Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206-215.
  6. ISO 14971:2019. Medical devices — Application of risk management to medical devices. International Organization for Standardization.
  7. FDA. (2024). Transparency of Machine Learning-Enabled Medical Devices: Guiding Principles. https://www.fda.gov/media/179269/download?attachment.

About The Author:

In 2020, Yu Zhao founded Bridging Consulting LLC, a consultancy dedicated to helping AI startups and medical device companies achieve innovation and regulatory compliance. He has more than two decades of experience in the life sciences and technology space and specializes in medical device regulatory, quality, and clinical affairs. His clients range from startups to large, established companies. During his tenure at Medtronic, he led regulatory affairs departments for multi-billion-dollar business units.