library(ggplot2)

# Base plot: violin with a slim boxplot (box and median only) on top
g1 <-
  ggplot(iris, aes(Species, Sepal.Width)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = .2, outlier.shape = NA, coef = 0)
g1
g1 + geom_point(alpha = .7, position = position_jitter(seed = 1))
… does look messy. One can play around with the jitter width, but it will likely never look clean.
g1 + geom_point(alpha = .7, position = position_jitter(width = .1, seed = 1))
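Both the width and the height of the jitter are arguments of position_jitter(), and by default the points are displaced in both directions. The following is a minimal sketch, assuming that only horizontal displacement is wanted so that every point stays at its measured Sepal.Width; the object name g_jitter is made up for this example.

```r
library(ggplot2)

# Same base plot as above, repeated so the example runs on its own
g1 <-
  ggplot(iris, aes(Species, Sepal.Width)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = .2, outlier.shape = NA, coef = 0)

# Assumption: horizontal jitter only; height = 0 leaves the y values untouched
g_jitter <- g1 +
  geom_point(
    alpha = .7,
    position = position_jitter(width = .1, height = 0, seed = 1)
  )
g_jitter
```

A smaller width pulls the points towards the centre line of each violin, but the jitter remains uniform and does not follow the shape of the density.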
A sina plot ensures that the points are placed inside the violin. The ggforce package comes with a geom_sina() function.
g1 + ggforce::geom_sina(method = "counts", alpha = .5)
Similarly, one can use the quasirandom geom from the ggbeeswarm package to reduce overplotting within categories:
g1 + ggbeeswarm::geom_quasirandom(width = .3, alpha = .5, varwidth = TRUE, size = 2)
… or true beeswarms from the same package, but note that geom_beeswarm() bins the data.
g1 + ggbeeswarm::geom_beeswarm(alpha = .5, cex = 1.2, size = 2)
We can combine a (half) violin with a summary plot (usually parts of a boxplot) and at the same time show the raw data. This chart is called a raincloud plot.
It is straightforward to build a version of a raincloud plot with the ggdist package. Here we use the stat_halfeye() function, which draws a (half) violin and an interval slab to highlight important summary stats. I like the appearance a lot, but note that the error-bar style might be confusing (it was actually built to display confidence intervals).
We can combine this geom, scaled to show the median, IQR, and full range of the data (via .width), with stat_dots(), which draws a dot plot showing the raw data.
ggplot(iris, aes(Species, Sepal.Width)) +
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = c(0.5, 1)) +
  ggdist::stat_dots(side = "left", dotsize = .4, justification = 1.05, binwidth = .1)
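To make concrete what the point and the two intervals encode, the values can be computed by hand. This is a sketch in base R, assuming that stat_halfeye() is left at its default of a median point with quantile intervals, so that .width = 0.5 spans the 25% to 75% quantiles and .width = 1 spans minimum to maximum; the name summary_stats is made up for this example.

```r
# One row per species, columns are the 0%, 25%, 50%, 75% and 100% quantiles
# of Sepal.Width (minimum, lower quartile, median, upper quartile, maximum)
summary_stats <- t(sapply(
  split(iris$Sepal.Width, iris$Species),
  quantile,
  probs = c(0, .25, .5, .75, 1)
))
summary_stats
```

The "50%" column corresponds to the point, the "25%" and "75%" columns to the thick interval, and the "0%" and "100%" columns to the thin interval.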
We can replace the interval slab with a boxplot. We also add some justification to the halfeye so that the boxplot has room to fit in between the cloud and the rain.
ggplot(iris, aes(Species, Sepal.Width)) +
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) +
  geom_boxplot(width = .1, outlier.shape = NA) +
  ggdist::stat_dots(side = "left", dotsize = .3, justification = 1.1, binwidth = .1)
(You can also use gghalves::geom_half_dotplot(stackdir = "down") to draw the dot plot.)
Of course, if you don’t like the binning of the data points, you can use a classical jitter.
ggplot(iris, aes(Species, Sepal.Width)) +
  ggdist::stat_halfeye(adjust = .5, width = .7, .width = 0, justification = -.2, point_colour = NA) +
  geom_boxplot(width = .2, outlier.shape = NA) +
  geom_jitter(width = .05, alpha = .3)
However, as far as I know, there is no way to justify the jitter so that it is shown next to the boxplot.
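A possible workaround that stays within ggplot2 is to shift the points by hand rather than through a justification argument. The sketch below is an illustration only: it assumes Species is a factor, so that as.numeric() returns the position of each category on the x axis, and the offset of .25 and the name p_shifted are made up for this example. The half violin is left out to keep the block free of extra packages.

```r
library(ggplot2)

p_shifted <-
  ggplot(iris, aes(Species, Sepal.Width)) +
  geom_boxplot(width = .2, outlier.shape = NA) +
  geom_point(
    # Assumption: move each point .25 units to the left of its category;
    # the offset is hypothetical and needs tuning by eye
    aes(x = as.numeric(Species) - .25),
    alpha = .3,
    position = position_jitter(width = .05, height = 0, seed = 1)
  )
p_shifted
```

The discrete layer has to come first so that the x axis is set up as a discrete scale before the numeric positions are added.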
We can plot a side-by-side combination of jitter and boxplot with the help of the gghalves package.
ggplot(iris, aes(Species, Sepal.Width)) +
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) +
  geom_boxplot(width = .1, outlier.shape = NA) +
  gghalves::geom_half_point(side = "l", range_scale = .4, alpha = .5)
By replacing the shape and restricting the range of the jitter, we can also turn it into a barcode plot:
ggplot(iris, aes(Species, Sepal.Width)) +
  ggdist::stat_halfeye(adjust = .5, width = .3, .width = 0, justification = -.3, point_colour = NA) +
  geom_boxplot(width = .1, outlier.shape = NA) +
  gghalves::geom_half_point(side = "l", range_scale = 0, shape = 95, size = 15, alpha = .3)
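The shape value 95 used for the barcode is not one of the standard point symbols. Shape codes from 32 to 127 are drawn as the corresponding ASCII character, which can be checked in base R; the name barcode_glyph is made up for this example.

```r
# Look up the character behind shape code 95
barcode_glyph <- intToUtf8(95)
barcode_glyph
# [1] "_"
```

Each observation is therefore drawn as an underscore, and with size = 15 and no jitter the overlapping horizontal dashes form the bars of the barcode.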
Sys.time()
[1] "2021-03-23 11:24:54 CET"
git2r::repository()
Local: main C:/Users/DataVizard/Google Drive/Work/Programing/R/raincloud-plots
Remote: main @ origin (https://github.com/Z3tt/Rainclouds.git)
Head: [1f5efcd] 2021-03-23: add files
sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
system code page: 65001
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] gghalves_0.1.1 ggdist_2.4.0 ggforce_0.3.2 ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 git2r_0.28.0 vipor_0.4.5
[4] highr_0.8 pillar_1.5.1 compiler_4.0.4
[7] forcats_0.5.1 tools_4.0.4 digest_0.6.27
[10] downlit_0.2.1 evaluate_0.14 lifecycle_1.0.0
[13] tibble_3.1.0 gtable_0.3.0 debugme_1.1.0
[16] pkgconfig_2.0.3 rlang_0.4.10 DBI_1.1.1
[19] distill_1.2 yaml_2.2.1 beeswarm_0.2.3
[22] xfun_0.20 withr_2.4.1 stringr_1.4.0
[25] dplyr_1.0.4 knitr_1.31 generics_0.1.0
[28] vctrs_0.3.6 grid_4.0.4 tidyselect_1.1.0
[31] glue_1.4.2 R6_2.5.0 fansi_0.4.2
[34] distributional_0.2.2 ggbeeswarm_0.6.0 rmarkdown_2.6
[37] polyclip_1.10-0 tidyr_1.1.2 farver_2.1.0
[40] tweenr_1.0.1 purrr_0.3.4 magrittr_2.0.1
[43] MASS_7.3-53 scales_1.1.1 ellipsis_0.3.1
[46] htmltools_0.5.1.1 assertthat_0.2.1 colorspace_2.0-0
[49] labeling_0.4.2 utf8_1.2.1 stringi_1.5.3
[52] munsell_0.5.0 crayon_1.4.1