Interpretability and Transparency in Artificial Intelligence

Publication type: Book Chapter
Publication date: 2022-10-20
Abstract

Artificial Intelligence (AI) systems are frequently thought of as opaque, meaning their performance or logic is inaccessible or incomprehensible to human observers. Models can consist of millions of features connected in a complex web of dependent behaviours. Conveying these internal states and dependencies in a humanly comprehensible way is extremely challenging. Explaining the functionality and behaviour of AI systems in a meaningful and useful way to the people designing, operating, regulating, or affected by their outputs is a complex technical, philosophical, and ethical project. Despite this complexity, principles citing ‘transparency’ or ‘interpretability’ are commonly found in ethical and regulatory frameworks addressing technology. This chapter provides an overview of these concepts and of methods designed to explain how AI works. After reviewing key concepts and terminology, two sets of methods are examined: (1) interpretability methods designed to explain and approximate AI functionality and behaviour; and (2) transparency frameworks meant to help assess and provide information about the development, governance, and potential impact of training datasets, models, and specific applications. These methods are analysed in the context of prior work on explanations in the philosophy of science. The chapter closes by introducing a framework of criteria to evaluate the quality and utility of methods in explainable AI (XAI) and to clarify the open challenges facing the field.
