Raincloud Plots with ggplot2

Cédric Scherer https://cedricscherer.com
2021-03-23

Packages
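
A minimal setup sketch, assuming the package set reported in the session info at the end of this post. Only ggplot2 functions are called without a prefix in the examples; ggforce, ggdist, gghalves and ggbeeswarm are always addressed via `::`, so they need to be installed but not necessarily attached. The names `pkgs` and `is_installed` are just illustrative.

```r
# Packages used in this post (taken from the session info at the end).
pkgs <- c("ggplot2", "ggforce", "ggdist", "gghalves", "ggbeeswarm")

# Check which of them are installed without attaching them.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (!all(is_installed)) {
  message("Missing packages: ", paste(pkgs[!is_installed], collapse = ", "))
}

# Only ggplot2 functions are used without a `pkg::` prefix below.
if (is_installed[["ggplot2"]]) library(ggplot2)
```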

Plots

Violin-boxplot-combination

g1 <- 
  ggplot(iris, aes(Species, Sepal.Width)) + 
  geom_violin(fill = "grey90") + 
  geom_boxplot(width = .2, outlier.shape = NA, coef = 0)

g1

Better: add raw data

as jitter

g1 + geom_point(alpha = .7, position = position_jitter(seed = 1))

… which looks messy. One can play around with the jitter width, but it will likely never look clean.

g1 + geom_point(alpha = .7, position = position_jitter(width = .1, seed = 1))
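
One detail that is easy to miss: the amount of jitter is controlled by position_jitter(), not by geom_point(), and by default position_jitter() also adds a little vertical noise. A small sketch, assuming we want the points to keep their measured Sepal.Width, sets height = 0 so only the horizontal position is randomised (`p_jitter` is just an illustrative name):

```r
library(ggplot2)

# `width` and `height` are arguments of position_jitter(), not of geom_point().
# height = 0 switches off vertical jitter, so the y values stay exact.
p_jitter <-
  ggplot(iris, aes(Species, Sepal.Width)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = .2, outlier.shape = NA, coef = 0) +
  geom_point(
    alpha = .7,
    position = position_jitter(width = .1, height = 0, seed = 1)
  )

p_jitter
```

The seed keeps the jitter reproducible, so the plot looks the same every time the document is rendered.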

as sina

A sina ensures that the points are placed inside the violin. The ggforce package comes with a geom_sina() function.

g1 + ggforce::geom_sina(method = "counts", alpha = .5)

as quasirandom distribution

Similarly, one can use the quasirandom geom from the ggbeeswarm package to reduce overplotting within categories:

g1 + ggbeeswarm::geom_quasirandom(width = .3, alpha = .5, varwidth = TRUE, size = 2)

as beeswarm

… or true beeswarms from the same package, but note that they bin the data.

g1 + ggbeeswarm::geom_beeswarm(alpha = .5, cex = 1.2, size = 2)

Raincloud plots

We can combine a (half) violin with a summary plot (usually parts of a boxplot) and at the same time show the raw data. This chart is called a raincloud plot.

with linerange + dotplot

It is straightforward to build a version of a raincloud plot with the ggdist package. Here we use the stat_halfeye() function, which draws a (half) violin and an interval slab to highlight important summary stats. I like the appearance a lot, but note that the error-bar style might be confusing (it was actually built to display confidence intervals).

We can combine this geom, scaled to show median, IQR and full range of the data (via .width), with stat_dots that draws a dot plot showing the raw data.

ggplot(iris, aes(Species, Sepal.Width)) + 
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = c(0.5, 1)) + 
  ggdist::stat_dots(side = "left", dotsize = .4, justification = 1.05, binwidth = .1)
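
To see which numbers the point and the two intervals stand for, here is a small base R sketch. It assumes that the intervals are quantile-based (the median-and-quantile summary that, as far as I know, stat_halfeye() uses by default), so .width = 0.5 spans the 25% to 75% quantiles and .width = 1 spans minimum to maximum. The name `summary_stats` is just illustrative.

```r
# Per species: minimum, lower quartile, median, upper quartile, maximum.
# Assumption: quantile-based intervals, i.e. .width = 0.5 -> 25%/75%
# and .width = 1 -> 0%/100%.
summary_stats <- sapply(
  split(iris$Sepal.Width, iris$Species),
  quantile,
  probs = c(0, .25, .5, .75, 1)
)

# Rows are the quantiles, columns are the three species.
round(summary_stats, 2)
```

The "50%" row is the point, the "25%" and "75%" rows delimit the thick interval, and the "0%" and "100%" rows delimit the thin one.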

with boxplot + dotplot

We can replace the interval slab by a boxplot. We add some justification to the halfeye as well so the boxplot gets some space to fit in-between the cloud and the rain.

ggplot(iris, aes(Species, Sepal.Width)) + 
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) + 
  geom_boxplot(width = .1, outlier.shape = NA) +
  ggdist::stat_dots(side = "left", dotsize = .3, justification = 1.1, binwidth = .1)

(You can also use gghalves::geom_half_dotplot(stackdir = "down") to draw the dotplot.)

with boxplot + jitter (on top)

Of course, if you don’t like the binning of the data points, you can use a classical jitter.

ggplot(iris, aes(Species, Sepal.Width)) + 
  ggdist::stat_halfeye(adjust = .5, width = .7, .width = 0, justification = -.2, point_colour = NA) + 
  geom_boxplot(width = .2, outlier.shape = NA) + 
  geom_jitter(width = .05, alpha = .3)

However, as far as I know, there is no way to justify the jitter so that it is shown next to the boxplot.

with boxplot + jitter (side by side)

We can plot a side-by-side combination of jitter and boxplot with the help of the gghalves package.

ggplot(iris, aes(Species, Sepal.Width)) + 
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) + 
  geom_boxplot(width = .1, outlier.shape = NA) +
  gghalves::geom_half_point(side = "l", range_scale = .4, alpha = .5)

with boxplot + barcode (side by side)

By replacing the shape and restricting the range of the jitter, we can also turn it into a barcode plot:

ggplot(iris, aes(Species, Sepal.Width)) + 
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) + 
  geom_boxplot(width = .1, outlier.shape = NA) +
  gghalves::geom_half_point(side = "l", range_scale = 0, shape = 95, size = 15, alpha = .3)
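
The side-by-side variants share the same cloud and boxplot and differ only in the last layer. Since ggplot2 accepts a list of layers on the right-hand side of `+`, the shared part can be kept in one place. The following is only a sketch: `raincloud_base()` and `p_base` are hypothetical names of my own, not functions or objects from any of the packages.

```r
library(ggplot2)

# Hypothetical helper (not part of any package): the half violin and the
# boxplot shared by the side-by-side variants, returned as a list of layers.
raincloud_base <- function() {
  list(
    ggdist::stat_halfeye(
      adjust = .5, width = .3, .width = 0,
      justification = -.3, point_colour = NA
    ),
    geom_boxplot(width = .1, outlier.shape = NA)
  )
}

p_base <- ggplot(iris, aes(Species, Sepal.Width)) + raincloud_base()

# Jittered points on the left-hand side ...
p_base + gghalves::geom_half_point(side = "l", range_scale = .4, alpha = .5)

# ... or the barcode variant.
p_base + gghalves::geom_half_point(
  side = "l", range_scale = 0, shape = 95, size = 15, alpha = .3
)
```

Keeping the shared layers in one function makes it harder for the variants to drift apart when a width or justification value is tweaked.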


Session Info
[1] "2021-03-23 11:24:54 CET"
git2r::repository()
Local:    main C:/Users/DataVizard/Google Drive/Work/Programing/R/raincloud-plots
Remote:   main @ origin (https://github.com/Z3tt/Rainclouds.git)
Head:     [1f5efcd] 2021-03-23: add files
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    
system code page: 65001

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] gghalves_0.1.1 ggdist_2.4.0   ggforce_0.3.2  ggplot2_3.3.3 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6           git2r_0.28.0         vipor_0.4.5         
 [4] highr_0.8            pillar_1.5.1         compiler_4.0.4      
 [7] forcats_0.5.1        tools_4.0.4          digest_0.6.27       
[10] downlit_0.2.1        evaluate_0.14        lifecycle_1.0.0     
[13] tibble_3.1.0         gtable_0.3.0         debugme_1.1.0       
[16] pkgconfig_2.0.3      rlang_0.4.10         DBI_1.1.1           
[19] distill_1.2          yaml_2.2.1           beeswarm_0.2.3      
[22] xfun_0.20            withr_2.4.1          stringr_1.4.0       
[25] dplyr_1.0.4          knitr_1.31           generics_0.1.0      
[28] vctrs_0.3.6          grid_4.0.4           tidyselect_1.1.0    
[31] glue_1.4.2           R6_2.5.0             fansi_0.4.2         
[34] distributional_0.2.2 ggbeeswarm_0.6.0     rmarkdown_2.6       
[37] polyclip_1.10-0      tidyr_1.1.2          farver_2.1.0        
[40] tweenr_1.0.1         purrr_0.3.4          magrittr_2.0.1      
[43] MASS_7.3-53          scales_1.1.1         ellipsis_0.3.1      
[46] htmltools_0.5.1.1    assertthat_0.2.1     colorspace_2.0-0    
[49] labeling_0.4.2       utf8_1.2.1           stringi_1.5.3       
[52] munsell_0.5.0        crayon_1.4.1