One difficulty (out of several) with far outliers in what
ought to be normal data is that the null hypothesis (no differences)
may be rejected too often, leading to false discovery.
pv.r = replicate(105, t.test(rank(c(rexp(20,1),rexp(20,1)))~g,
var.eq=T)$p.val)
mean(pv.r <= .05)
[1] 0.01904762
In this case, the true rejection rate when the two populations are the same is about 2%. Granted, about 5% would be better, but doing the pooled 2-sample t test on ranked data
is better than ignoring the skewness of exponential data and the resulting
outliers.
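To make the comparison concrete, here is a minimal sketch that runs the pooled t test on raw and on ranked values from the same simulated samples. It assumes the grouping variable `g` splits the 40 observations into the first 20 and the last 20; the seed and the replication count `m` are my own choices, not taken from the run above.

```r
set.seed(2024)                       # arbitrary seed, for reproducibility only
m <- 5000                            # replications (an assumption; larger m means less noise)
g <- rep(1:2, each = 20)             # assumed grouping: first 20 vs. last 20 values

# For each simulated dataset, record the p-value of the pooled t test
# on the raw data and on the ranks of the same data.
pv <- replicate(m, {
  x <- c(rexp(20, 1), rexp(20, 1))   # both groups drawn from the same Exp(1) population
  c(raw  = t.test(x ~ g, var.equal = TRUE)$p.value,
    rank = t.test(rank(x) ~ g, var.equal = TRUE)$p.value)
})

rowMeans(pv <= 0.05)                 # estimated rejection rates when H0 is true
```

The value 0.01904762 shown earlier equals 2/105, so it rests on two rejections in 105 runs; a larger number of replications, as in this sketch, should give a steadier estimate of the rejection rate.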