1

I'm trying to generate a contour plot with ggplot in R, but keeping only the top layers of the plot. So for instance, with the following toy data:

df <- data.frame(x=rnorm(2000, 10, 3), y=rnorm(2000, 10, 3))
stat_density_plot <- ggplot(df, aes(x, y)) +
                      geom_point() +
                      stat_density_2d(aes(fill = ..level..), geom = "polygon", bins=15) 
geom_density_plot <- ggplot(df, aes(x, y)) +
                      geom_point() +
                      geom_density_2d(bins = 15, color = "red") 

I would want to plot only the top 4 levels in stat_density_plot, and only the innermost 4 contours in the geom_density_plot.
I've been toying with the idea of generating the kernel density estimation myself (MASS::kde2d(df$x, df$y)) and manually removing all the rest of the layers, but I still don't know how to plot the result with ggplot.
Any insight about how to generate any of the two plots will be most welcome.

2 Answers 2

0

You can user layer_data() to get the actual data used by ggplot to create the polygons / lines, and focus on the levels you want.

# original
geom_density_plot <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_density_2d(bins = 15, color = "red", linewidth = 2) # thicker for better visibility

# filtered for desired rings (group numbering goes from outermost to innermost
# so we reverse that before filtering for the first four groups, which now
# correspond to the innermost rings)
layer_data(geom_density_plot, 2L) %>% 
  mutate(group = forcats::fct_rev(group)) %>%
  filter(as.integer(group) <= 4) %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(data = df) +
  geom_path(aes(group = group), color = "red", linewidth = 2)

result 1

# original
stat_density_plot <- ggplot(df, aes(x, y)) +
  geom_point() +
  stat_density_2d(aes(fill = after_stat(level)), # after_stat is used in more recent versions of ggplot; the `...level...` syntax is considered old now
                  geom = "polygon", bins=15) 

# set transparency of unwanted levels to zero (we don't filter out the unwanted
# levels here, as the full range of levels is required to match the colour palette
# of the original)
layer_data(stat_density_plot, 2L) %>% 
  mutate(group1 = forcats::fct_rev(group)) %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(data = df) +
  geom_polygon(aes(group = group, fill = level, 
                   alpha = ifelse(as.integer(group1) <= 4, 1, 0))) +
  scale_alpha_identity()

result 2

Sign up to request clarification or add additional context in comments.

Comments

0

Disclaimer: This answer is based on the answers in this question How to plot a contour line showing where 95% of values fall within, in R and in ggplot2

You can manually calculate the kernel density esitmate and use the resulting data for better control over what you display on the plot by specifying the probability-breaks.

# calculate kde
kde <- MASS::kde2d(df$x, df$y, n = 100)

# process kde
dx <- diff(kde$x[1:2])
dy <- diff(kde$y[1:2])
sz <- sort(kde$z)
c1 <- cumsum(sz) * dx * dy
dimnames(kde$z) <- list(kde$x, kde$y)
dc <- melt(kde$z)
dc$prob <- approx(sz, 1 - c1, dc$value)$y

# set probability levels for plot
binwidth <- 1/15
prob <- c(0, binwidth, binwidth * 2, binwidth * 3, binwidth * 4)
# or prob <- c(0, 0.25, 0.5) for example


# plot with discrete levels
ggplot(dc, aes(x = Var1, y = Var2)) +
  geom_point(data = df, aes(x = x, y = y), alpha = 0.2) +
  geom_contour_filled(aes(z = prob, fill = after_stat(level)),
                      breaks = prob,
                      alpha = 0.9) +
  geom_contour(aes(z = prob),
               breaks = prob,
               color = "red",
               alpha = 0.9) +
  scale_fill_brewer(palette = "Blues",
                    direction = -1,
                    name = "probability") +
  labs(x = "x", y = "y")

# plot with continuous levels using level_low
ggplot(dc, aes(x = Var1, y = Var2)) +
  geom_point(data = df, aes(x = x, y = y), alpha = 0.2) +
  geom_contour_filled(aes(z = prob, fill = after_stat(level_high)),
                      breaks = prob,
                      alpha = 0.9) +
  geom_contour(aes(z = prob),
               breaks = prob,
               color = "red",
               alpha = 0.9) +
  labs(x = "x", y = "y", fill = "probability")

You had 15 bins in your original plot, so that is the base for the breaks in this example (1/15). The resulting plot will only show the top 4 contours.

enter image description here

enter image description here

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.