Questions tagged [outliers]
An outlier is an observation that appears to be unusual or not well described relative to a simple characterization of a dataset. A discomfiting possibility is that these data come from a different population than the one intended to be studied.
1,386 questions
3
votes
2
answers
100
views
Outlier tests for time-series data: difference among methods?
I'm working with big data time-series and am trying to detect outliers. In my research I've come across a variety of different simple methods (e.g. here and here) and I'm trying to understand the ...
3
votes
2
answers
38
views
How to handle heteroskedasticity when detecting anomalies using Z-scores on growth rates?
Originally asked this question elsewhere (https://stackoverflow.com/questions/79839128) but was directed here.
I’m trying to detect anomalies in a dataset using Z-scores based on the logged index ...
8
votes
3
answers
751
views
Extreme outlier in real data
I'm looking at the amount of carbon in seven forest pools. For dead trees left on the landscape across many locations and over several harvest retention (logging) treatments, there is an extreme value ...
0
votes
0
answers
51
views
Outlier detection in many short time series
I have a dataset with ~20,000 entries containing mean values for different groups. The groups are defined with 4 categorical columns and I have the week number, the number of samples per week and the ...
0
votes
0
answers
53
views
Winsorizing outliers across multiple analyses: once or multiple times? (SPSS)
I have a 2×2 experimental design with four conditions and eight outcome variables. I’m supposed to winsorize outliers, but I’m confused about how many times this needs to be done because I’m ...
2
votes
2
answers
287
views
Outlier detection in classification
I am curious if there are any methods of outlier detection [read: NOT high leverage point detection] that can be used in classification problems without fitting a model.
As I understand it, some commonly ...
1
vote
0
answers
35
views
How to assign an observation to a group but include an out-group option?
I have collected data from a number of known groups, and from individuals that I would like to assign to a group but who may be from an unknown group.
For simplicity's sake, I have created an example with ...
5
votes
3
answers
543
views
How to handle outliers when some predictors perform better with them and others without
I’m working on a project where I need to build a predictive model for wine quality based on its chemical properties. The goal is to find which features best explain or predict the quality score.
I’ve ...
8
votes
4
answers
2k
views
Should I transform my data before or after removing outliers? (Highly skewed cortisol example)
I am analyzing cortisol data collected over multiple days, with three samples per day (Cortisol_1, Cortisol_2, Cortisol_3). My data are extremely skewed:
Skewness of Cortisol_1: 26.3
Skewness of ...
2
votes
0
answers
32
views
Hypothesis testing for a weekly seasonal effect in the presence of outliers
Suppose that I have a time series where the mean usually changes smoothly over time, and I want a hypothesis test for whether there is a weekly seasonal pattern to the data. The time series also ...
0
votes
0
answers
66
views
A simple-ish way of estimating the number of modes, and the 'pronounced'-ness of said modes of a discrete, finite distribution
Intuitively, let's say we're given a price $p$ for some product, and we want to compare the prices with what's available on the market (e.g., to determine if we're being ripped off or not).
We come back ...
0
votes
0
answers
71
views
What does iteration in sigma clipping do?
If I only want the high-SNR data, I apply sigma clipping to an array.
As this link says
Suppose you have a set of data. Compute its median m and its standard deviation ...
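The quoted description is cut off, so the following is only a minimal sketch of what an iterated sigma clip commonly looks like, not the procedure from the linked source. The function name `sigma_clip`, the threshold `n_sigma`, the iteration cap `max_iter`, and the sample data are all assumptions made for illustration. It shows why iteration matters: a gross outlier inflates the standard deviation on the first pass, so a milder outlier is only rejected once the spread is recomputed without it.

```python
import statistics


def sigma_clip(values, n_sigma=3.0, max_iter=10):
    """Repeatedly drop points farther than n_sigma standard deviations
    from the median, until nothing more is dropped or max_iter is reached.
    (Hypothetical helper; threshold and stopping rule are assumptions.)"""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        center = statistics.median(kept)
        spread = statistics.pstdev(kept)  # population standard deviation
        if spread == 0:
            break
        survivors = [v for v in kept if abs(v - center) <= n_sigma * spread]
        if len(survivors) == len(kept):
            break  # converged: this pass rejected nothing
        kept = survivors
    return kept


# Made-up data: eight values near 10, one mild outlier (12.0), one gross outlier (50.0).
data = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 50.0, 12.0]

one_pass = sigma_clip(data, n_sigma=2.0, max_iter=1)   # single pass
converged = sigma_clip(data, n_sigma=2.0)              # iterate to convergence
```

With this data, a single pass rejects only 50.0, because that point dominates the standard deviation; the second pass recomputes the spread without it and then rejects 12.0 as well.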
8
votes
1
answer
395
views
Does the presence of outliers always mean that robust regression analysis should be used?
I revised my question to be more specific, as suggested by the community. Since my knowledge of statistics is limited, I'm not entirely sure what it means to specialize in this subject—but I'll give ...
3
votes
2
answers
128
views
How to test if a single value in a set of values is higher than the remaining values
I have a set of $8$ participants $P_1, \ldots, P_8$. Each participant takes two tasks $A$ and $B$, and each task results in an ordered vector of $6$ positive values. I'll denote the vector recorded ...
0
votes
0
answers
69
views
Should varIdent be used in a linear model with outliers in nlme in R?
I am unsure whether/how to use varIdent from the nlme package to allow different variances across factor levels when analysing a dataset which has outliers.
I am specifically interested in mixed ...