Three laws of model explanation

Harshal Soni
May 20, 2022

Most of us have heard of the Three Laws of Robotics, formulated by Isaac Asimov in his 1942 story “Runaround” and popularized in film and books:

  1. a robot may not injure a human being,
  2. a robot must obey the orders given it by human beings, and
  3. a robot must protect its own existence.

Today’s robots, like cleaning robots, robotic pets, or autonomous cars, are far from being conscious enough to fall under Asimov’s ethics. However, we are increasingly surrounded by complex predictive models and algorithms used for decision-making. Artificial-intelligence models are used in health care, politics, education, justice, and many other areas. These models and algorithms have a far larger influence on our lives than physical robots. Yet their applications remain largely unregulated despite examples of their potential harmfulness. An excellent overview of selected issues is offered in the book by O’Neil (2016).

It is now becoming clear that we need to control the models and algorithms that may affect us. Asimov’s laws are often referred to in the discussion around the ethics of artificial intelligence (https://en.wikipedia.org/wiki/Ethics_of_artificial_intelligence). Initiatives to formulate principles for artificial-intelligence development have been undertaken, for instance, in the UK (Olhede and Wolfe 2018). Following Asimov’s approach, there are three requirements that any predictive model should fulfil (a minimal worked sketch follows the list):

  • Prediction’s validation: For every prediction of a model, one should be able to verify how strong the evidence is that supports the prediction.
  • Prediction’s justification: For every prediction of a model, one should be able to understand which variables affect the prediction and to what extent.
  • Prediction’s speculation: For every prediction of a model, one should be able to understand how the prediction would change if the values of the variables included in the model changed.
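
To make these requirements concrete, here is a minimal sketch using a simple linear model on synthetic data; the feature names and values are purely illustrative assumptions. The model’s fit score stands in for validation, per-variable contributions for justification, and a what-if change for speculation.

```python
# Minimal sketch (hypothetical data and feature names) of the three requirements.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                     # features: "age", "dose", "weight" (illustrative)
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=200)
model = LinearRegression().fit(X, y)

x_new = np.array([[0.5, 1.0, -0.3]])              # a new observation to explain

# Validation: how strong is the evidence behind the model? Here, its in-sample R^2.
print("R^2:", model.score(X, y))

# Justification: which variables affect the prediction and to what extent?
contributions = model.coef_ * x_new[0]            # per-variable contribution to the prediction
print("contributions:", dict(zip(["age", "dose", "weight"], contributions)))

# Speculation: how would the prediction change if a variable's value changed?
x_what_if = x_new.copy()
x_what_if[0, 1] += 1.0                            # increase "dose" by one unit
print("prediction shift:", model.predict(x_what_if) - model.predict(x_new))
```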

There are two ways to comply with these requirements. One is to use only models that fulfil these conditions by design. These so-called “interpretable-by-design” models include linear models, rule-based models, and classification trees with a small number of parameters (Molnar 2019). However, the price of this transparency may be a reduction in performance. The other way is to use tools that can “explain” predictions of any model, perhaps by using approximations or simplifications. Modern packages such as SHAP and LIME attempt to solve this problem.
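
As a hedged sketch of the second, model-agnostic route, the snippet below applies the shap package’s TreeExplainer to a random forest fitted on synthetic data; the data, model, and feature count are assumptions made for illustration, not a prescription.

```python
# Sketch of post-hoc explanation with shap, assuming a fitted tree-based model
# and a synthetic feature matrix (all names and shapes here are illustrative).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] ** 2 + X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values for tree ensembles: each value is one
# feature's contribution to pushing a single prediction away from the baseline.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])        # explanations for five predictions

print("baseline (expected value):", explainer.expected_value)
print("per-feature contributions for the first prediction:", shap_values[0])
```

LIME pursues the same goal in a different way: instead of computing Shapley values, it fits a simple local surrogate model around the prediction being explained.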

