As AI systems become increasingly complex it may become unclear, even to the designer of a system, why exactly a system does what it does. This leads to a lack of trust in AI systems. To solve this, the field of explainable AI has been working on ways to produce explanations of these systems’ behaviors. Many methods in explainable AI, such as LIME, offer only a statistical argument for the validity of their explanations. However, some methods instead study the internal structure of the system an…
Read moreAs AI systems become increasingly complex it may become unclear, even to the designer of a system, why exactly a system does what it does. This leads to a lack of trust in AI systems. To solve this, the field of explainable AI has been working on ways to produce explanations of these systems’ behaviors. Many methods in explainable AI, such as LIME, offer only a statistical argument for the validity of their explanations. However, some methods instead study the internal structure of the system and try to find components which can be assigned an interpretation. I believe that these methods provide more valuable explanations than those statistical in nature. I will try to identify which explanations can be considered internal to the system using the Chomskyan notion of tacit knowledge. I argue that each explanation expresses a rule, and through the localization of this rule in the system internals, we can take a system to have tacit knowledge of the rule. I conclude that the only methods which are able to sufficiently establish this tacit knowledge are those along the lines of Olah : 4901–4911, 2017), and therefore they provide explanations with unique strengths.