I'm using the Python packages `lime` and `shap` to explain single (test-set) predictions that a basic, trained model makes on new, tabular data. In some cases, the explanations generated by both methods do not agree with user intuition.
For example, when applying the methods in a healthcare setting, they might list the presence of a comorbidity (a disease that frequently co-occurs with the outcome disease of interest) as a factor that decreases a patient's risk of an adverse event.
Intuitively, such behavior is incorrect. We shouldn't see that a history of heart attacks lowers the risk of adverse events, for example. What are some reasons why we might see these inconsistencies?
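For concreteness, here is roughly the pattern I'm following. The dataset, model, and feature names below are synthetic placeholders, not my actual pipeline:

```python
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in imbalanced tabular data and a basic model (placeholders for my real setup).
X, y = make_classification(n_samples=4000, n_features=10, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
feature_names = [f"f{i}" for i in range(X.shape[1])]

# LIME: the training set supplies the statistics used to perturb/discretize features.
lime_explainer = LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    class_names=["no_event", "event"],
    mode="classification",
)
lime_exp = lime_explainer.explain_instance(X_test[0], model.predict_proba, num_features=5)
print(lime_exp.as_list())

# SHAP (KernelExplainer): the background sample defines the baseline the
# attributions are measured against.
background = shap.sample(X_train, 100)
shap_explainer = shap.KernelExplainer(model.predict_proba, background)
print(shap_explainer.shap_values(X_test[:1]))
```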
Some of my ideas:
- Class label imbalance: I tried balancing the dataset, but it did not solve the issue (what I mean by "balancing" is sketched after this list).
- Kernel width for LIME: I'm still tuning this, but a cursory sweep showed no benefit (see the kernel-width sweep after this list).
- Relationship to training data: for tabular data, both `lime` and `shap` require the training dataset as input to build the explainer class. If there are instances in which a feature such as history of heart attacks is associated with a no-adverse-event outcome, such instances would "confuse" the methods, so to speak. However, I'm not sure I have the intuition correct there (the checks I had in mind are sketched after this list).
- Error in understanding on my part: there may be nuances in intuition here that I've missed. Specifically, I am trying to make sure I correctly understand the relationship between the generated explanations and the training dataset used to build them.
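For the class-imbalance point, this is the kind of balancing I tried, continuing the synthetic setup above (naive random undersampling of the majority class; my actual step may differ):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Undersample the majority (no-event) class to match the minority class size.
rng = np.random.default_rng(0)
pos_idx = np.where(y_train == 1)[0]
neg_idx = np.where(y_train == 0)[0]
keep = np.concatenate([pos_idx, rng.choice(neg_idx, size=len(pos_idx), replace=False)])

X_bal, y_bal = X_train[keep], y_train[keep]
model_bal = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
# Re-running the explainers against model_bal still produced the counterintuitive signs.
```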
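For the kernel-width point, this is the sweep I'm running, again on the synthetic setup (the specific widths are arbitrary; `kernel_width=None` falls back to LIME's default of 0.75 * sqrt(n_features)):

```python
from lime.lime_tabular import LimeTabularExplainer

# Check whether the sign of the suspect feature's weight is stable across kernel widths.
for kw in [None, 0.5, 1.0, 2.0, 5.0]:
    expl = LimeTabularExplainer(
        X_train,
        feature_names=feature_names,
        class_names=["no_event", "event"],
        mode="classification",
        kernel_width=kw,
    )
    exp = expl.explain_instance(X_test[0], model.predict_proba, num_features=5)
    print(kw, exp.as_list())
```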
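For the training-data point, these are the two checks I had in mind, still on the synthetic setup (`f3` is a stand-in for the hypothetical heart-attack-history feature): how often the feature co-occurs with each outcome in the training data, and whether the choice of SHAP background sample changes the attribution.

```python
import pandas as pd
import shap

# 1) Empirical event rate with vs. without the "history" feature in the training data.
df = pd.DataFrame(X_train, columns=feature_names)
df["outcome"] = y_train
has_history = df["f3"] > df["f3"].median()  # placeholder definition of "history present"
print(df.groupby(has_history)["outcome"].mean())

# 2) SHAP values are relative to the background sample handed to the explainer,
#    so attributions (including their sign) can shift with the background choice.
for bg in (shap.sample(X_train, 100), shap.sample(X_train[y_train == 0], 100)):
    e = shap.KernelExplainer(model.predict_proba, bg)
    print(e.shap_values(X_test[:1]))
```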