Designing Charts in R

Reproducible Graphic Design with {ggplot2}

Cédric Scherer // Data Visualization Society Masterclass // March 9, 2023

Welcome!

The ggplot2 Package

The ggplot2 hex logo.


{ggplot2} is a system for declaratively creating graphics,
based on “The Grammar of Graphics” (Wilkinson, 2005).

You provide the data, tell {ggplot2} how to map variables to aesthetics,
what graphical primitives to use, and it takes care of the details.

Advantages of {ggplot2}

  • code-first approach → reproducible and transparent workflow
  • consistent underlying “grammar of graphics”
  • very flexible, layered plot specification
  • theme system for polishing plot appearance
  • lots of additional functionality thanks to extensions
  • active and helpful community

Allison Horst's monster illustration of exploratory plotting with ggplot2.

Illustration by Allison Horst

A collection of the versatility of ggplot2 to create basic graphs. All of them use the default grey ggplot2 theme.

ggplot2 Examples featured on ggplot2.tidyverse.org

Allison Horst's monster illustration of building a data masterpiece with ggplot2, featuring a little Picasso monster :)

Illustration by Allison Horst


A multi-plot panel of various data visualizations created by the BBC teams.



A collection of more advanced graphics created 100% with ggplot2.

Selection of visualizations created 100% with ggplot2 by Thomas Lin Pedersen,
Georgios Karamanis, Timo Grossenbacher, Torsten Sprenger, Jake Kaupp, Jack Davison, and myself.

A Motivational Example

Setup

The ggplot2 Package


… is an R package for visualizing data, created by Hadley Wickham in 2005

# install.packages("ggplot2")
library(ggplot2)


… is part of the {tidyverse}

# install.packages("tidyverse")
library(tidyverse)

The Grammar of {ggplot2}

The Grammar of {ggplot2}


Component          Function                Explanation
Data               ggplot(data)            The raw data that you want to visualise.
Aesthetics         aes()                   Aesthetic mappings between variables and visual properties.
Geometries         geom_*()                The geometric shapes representing the data.
Statistics         stat_*()                The statistical transformations applied to the data.
Scales             scale_*()               Maps between the data and the aesthetic dimensions.
Facets             facet_*()               The arrangement of the data into a grid of plots.
Coordinate System  coord_*()               Maps data into the plane of the data rectangle.
Visual Themes      theme() and theme_*()   The overall visual defaults of a plot.
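
As a minimal sketch of how the components in the table combine, the chunk below builds one plot that touches each of them. It uses the `mpg` data set that ships with {ggplot2} rather than the bike data, so the variable names (`displ`, `hwy`, `drv`, `year`) are stand-ins for illustration only.

```r
library(ggplot2)

p_grammar <-
  ggplot(mpg) +                                    ## Data
  aes(x = displ, y = hwy, color = drv) +           ## Aesthetics
  geom_point(alpha = .5) +                         ## Geometries
  stat_smooth(method = "lm", formula = y ~ x) +    ## Statistics
  scale_color_brewer(palette = "Dark2") +          ## Scales
  facet_wrap(vars(year)) +                         ## Facets
  coord_cartesian(expand = FALSE) +                ## Coordinate system
  theme_minimal()                                  ## Visual theme

p_grammar
```

Only the data, the aesthetics, and one geometry are required; all other components fall back to defaults when omitted.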

The Data

Bike sharing counts in London, UK, powered by TfL Open Data

  • covers the years 2015 and 2016
  • incl. weather data acquired from freemeteo.com
  • prepared by Hristo Mavrodiev for Kaggle
  • further modification by myself


bikes <- readr::read_csv(
  "./data/london-bikes-custom.csv",
  ## or: "https://cedricscherer.com/data/london-bikes-custom.csv"
  col_types = "Dcfffilllddddc"
)

bikes$season <- forcats::fct_inorder(bikes$season)

Variable       Description                                             Class
date           Date encoded as `YYYY-MM-DD`                            date
day_night      `day` (6:00am–5:59pm) or `night` (6:00pm–5:59am)        character
year           `2015` or `2016`                                        factor
month          `1` (January) to `12` (December)                        factor
season         `winter`, `spring`, `summer`, or `autumn`               factor
count          Sum of reported bikes rented                            integer
is_workday     `TRUE` being Monday to Friday and no bank holiday       logical
is_weekend     `TRUE` being Saturday or Sunday                         logical
is_holiday     `TRUE` being a bank holiday in the UK                   logical
temp           Average air temperature (°C)                            double
temp_feel      Average feels like temperature (°C)                     double
humidity       Average air humidity (%)                                double
wind_speed     Average wind speed (km/h)                               double
weather_type   Most common weather type                                character
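
The compact `col_types` string used in the import assigns one letter per column, in column order. The sketch below illustrates this on a tiny inline CSV with four of the columns; the values are invented and are not taken from the bike data.

```r
library(readr)

## a tiny inline CSV with four of the columns described above (invented values)
csv_text <- "date,day_night,season,count
2015-01-04,day,winter,6830
2015-01-04,night,winter,2404"

demo <- read_csv(
  I(csv_text),
  ## one letter per column: D = date, c = character, f = factor, i = integer
  col_types = "Dcfi"
)

## first class of each column
sapply(demo, function(x) class(x)[1])
```

The remaining letters in the full string are `l` for logical and `d` for double.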

Fundamentals of {ggplot2}

ggplot2::ggplot()

The help page of the ggplot() function.

Data

ggplot(data = bikes)

Aesthetic Mapping


= link variables to graphical properties

  • positions (x, y)
  • colors (color, fill)
  • shapes (shape, linetype)
  • size (size)
  • transparency (alpha)
  • groupings (group)

Aesthetic Mapping


= link variables to graphical properties

  • feels-like temperature ⇄ x
  • reported bike shares ⇄ y
  • season ⇄ color
  • year ⇄ shape

Aesthetic Mapping

ggplot(data = bikes) +
  aes(x = temp_feel, y = count)

Aesthetic Mapping

ggplot(
  data = bikes,
  mapping = aes(x = temp_feel, y = count)
)

Aesthetic Mapping

ggplot(
  bikes,
  aes(x = temp_feel, y = count)
)

Geometrical Layers

Geometries


= interpret aesthetics as graphical representations

  • points
  • lines
  • polygons
  • text labels

Geometries

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point()

Geometries

ggplot(
    bikes,
    aes(x = humidity)
  ) +
  geom_histogram()

Geometries

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot()

Visual Properties of Layers

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    color = "dodgerblue",
    alpha = .5,
    shape = "X",
    stroke = 1,
    size = 4
  )

Setting vs Mapping of Visual Properties

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    color = "dodgerblue",
    alpha = .5
  )

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    aes(color = season),
    alpha = .5
  )
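
A common pitfall when moving between setting and mapping is to place a constant inside `aes()`. The sketch below uses the `mpg` data set that ships with {ggplot2} as a stand-in for the bike data and inspects which color actually reaches the points.

```r
library(ggplot2)

## setting: the constant goes outside aes() and is used as is
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "dodgerblue")

## mapping a constant by mistake: "dodgerblue" is treated as a data value,
## so the points get a color from the default palette and a legend appears
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = "dodgerblue"))

unique(layer_data(p_set)$colour)
unique(layer_data(p_mapped)$colour)
```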

Mapping Expressions

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    aes(color = temp_feel > 20),
    alpha = .5
  )

Your Turn

  • Create a chart showing a time series of reported bike shares
    • What is the difference between geom_line() and geom_path()?
    • Map the variable day_night to the color of the lines.
    • Add points for each observation, encoded by the same variable.
    • Turn the points into diamonds.

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count)
  ) +
  geom_line()

Solution Exercise

ggplot(
    bikes,
    aes(x = count, y = date)
  ) +
  geom_line()

ggplot(
    bikes,
    aes(x = count, y = date)
  ) +
  geom_path()
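
The difference between the two geoms is the order in which observations are connected: `geom_line()` connects them along the x axis, `geom_path()` in the order they appear in the data. A minimal sketch with three invented observations stored in unsorted order:

```r
library(ggplot2)

## three observations stored in a non-sorted order (invented values)
df <- data.frame(x = c(3, 1, 2), y = c(1, 2, 3))

p_line <- ggplot(df, aes(x = x, y = y)) + geom_line()
p_path <- ggplot(df, aes(x = x, y = y)) + geom_path()

## x positions in the order in which each geom connects them
layer_data(p_line)$x
layer_data(p_path)$x
```

For the flipped time series above this means that `geom_line()` connects the days in order of their counts, while `geom_path()` follows the rows of the data set.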

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count)
  ) +
  geom_line(
    aes(color = day_night)
  )

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count)
  ) +
  geom_line(
    aes(color = day_night)
  ) +
  geom_point()

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count)
  ) +
  geom_line(
    aes(color = day_night)
  ) +
  geom_point(
    aes(color = day_night)
  )

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count,
        color = day_night)
  ) +
  geom_line() +
  geom_point()

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count,
        color = day_night)
  ) +
  geom_line() +
  geom_point(
    shape = "diamond",
    size = 3
  )

Solution Exercise

ggplot(
    bikes,
    aes(x = date, y = count,
        color = day_night)
  ) +
  geom_line() +
  geom_point(
    shape = 18,
    size = 3
  )

An overview of a set of available shapes, ordered by their type of shape (e.g. points, triangles etc).

Source: Albert’s Blog
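
A similar overview can be drawn directly in {ggplot2}. The sketch below plots the numeric shape codes 0 to 25 with their numbers; the grid layout and the fill color are arbitrary choices for illustration.

```r
library(ggplot2)

## one row per shape code, arranged on a grid with seven columns
shapes <- data.frame(
  shape = 0:25,
  x = 0:25 %% 7,
  y = -(0:25 %/% 7)
)

p_shapes <- ggplot(shapes, aes(x = x, y = y)) +
  ## fill only affects the shapes 21 to 25
  geom_point(aes(shape = shape), size = 5, fill = "orange") +
  geom_text(aes(label = shape), nudge_y = -.35, size = 3) +
  ## use the values of `shape` directly as shape codes
  scale_shape_identity() +
  theme_void()

p_shapes
```

The code `18` and the name `"diamond"` used in the solutions refer to the same symbol.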

Local vs. Global Encoding

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    aes(color = season),
    alpha = .5
  )

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season)
  ) +
  geom_point(
    alpha = .5
  )

Adding More Layers

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season)
  ) +
  geom_point(
    alpha = .5
  ) +
  geom_smooth(
    method = "lm"
  )

Global Color Encoding

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season)
  ) +
  geom_point(
    alpha = .5
  ) +
  geom_smooth(
    method = "lm"
  )

Local Color Encoding

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    aes(color = season),
    alpha = .5
  ) +
  geom_smooth(
    method = "lm"
  )
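
The practical consequence of global versus local encoding shows up in the smoothing layer: a globally mapped color is inherited and splits the fit into one line per category, while a locally mapped color leaves a single fit. A sketch that counts the fitted groups, again with the `mpg` data as a stand-in:

```r
library(ggplot2)

## global: color is inherited by geom_smooth(), one fit per drv level
p_global <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(alpha = .5) +
  geom_smooth(method = "lm", formula = y ~ x)

## local: color only applies to the points, so there is a single fit
p_local <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv), alpha = .5) +
  geom_smooth(method = "lm", formula = y ~ x)

## number of separately fitted smooths in the second layer
n_global <- length(unique(layer_data(p_global, 2)$group))
n_local  <- length(unique(layer_data(p_local, 2)$group))
c(global = n_global, local = n_local)
```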

The `group` Aesthetic

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point(
    aes(color = season),
    alpha = .5
  ) +
  geom_smooth(
    aes(group = day_night),
    method = "lm"
  )

Set Both as Global Aesthetics

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season,
        group = day_night)
  ) +
  geom_point(
    alpha = .5
  ) +
  geom_smooth(
    method = "lm"
  )

Overwrite Global Aesthetics

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season,
        group = day_night)
  ) +
  geom_point(
    alpha = .5
  ) +
  geom_smooth(
    method = "lm",
    color = "black"
  )

Store a ggplot as Object

g <-
  ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season,
        group = day_night)
  ) +
  geom_point(
    alpha = .5
  ) +
  geom_smooth(
    method = "lm",
    color = "black"
  )
class(g)
[1] "gg"     "ggplot"

Add More Layers

g +
  geom_rug(
    alpha = .2
  )

Remove a Layer from the Legend

g +
  geom_rug(
    alpha = .2,
    show.legend = FALSE
  )

Add More Layers

g +
  geom_rug(
    alpha = .2,
    show.legend = FALSE
  ) +
  geom_linerange(
    aes(ymin = 0, ymax = count)
  )

A Polished ggplot Example

Add Labels

g +
  labs(
    x = "Feels-like temperature (°C)",
    y = "Reported bike shares",
    title = "TfL bike sharing trends"
  )

Add Labels

g <- g +
  labs(
    x = "Feels-like temperature (°C)",
    y = "Reported bike shares",
    title = "TfL bike sharing trends",
    color = NULL
  )

g

Add Labels

g +
  labs(
    subtitle = "Reported bike rents versus feels-like temperature in London",
    caption = "Data: TfL",
    tag = "A)",
    color = "Season:"
  )

Add Labels

g +
  labs(
    x = "",
    caption = "Data: TfL"
  )

g +
  labs(
    x = NULL,
    caption = "Data: TfL"
  )
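
The two variants look similar but are not equivalent: an empty string still draws an (empty) axis title and keeps its space, whereas `NULL` removes the title element altogether. A minimal sketch on the `mpg` data, which serves as a stand-in here:

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

## empty string: the title is drawn as empty text, its space is kept
p_empty <- p + labs(x = "")

## NULL: the title element is dropped and the panel can use that space
p_null <- p + labs(x = NULL)

p_empty
p_null
```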

Themes

g + theme_light()

g + theme_minimal()

Themes

g + ggthemes::theme_excel()

g + tvthemes::theme_rickAndMorty()

Change the Theme Base Settings

g + theme_light(
  base_size = 14,
  base_family = "Asap Condensed"
)

Set a Theme Globally

theme_set(theme_light())

g

Change the Theme Base Settings

theme_set(theme_light(
  base_size = 14,
  base_family = "Asap Condensed"
))

g

{systemfonts}

# install.packages("systemfonts")
library(systemfonts)

system_fonts() %>%
  filter(stringr::str_detect(family, "Asap")) %>%
  pull(family) %>%
  unique() %>% 
  sort()
[1] "Asap"               "Asap Condensed"     "Asap Expanded"      "Asap SemiCondensed" "Asap SemiExpanded" 

{systemfonts}

register_variant(
  name = "Cabinet Grotesk Black",
  family = "Cabinet Grotesk",
  weight = "heavy",
  features = font_feature(letters = "stylistic")
)

{systemfonts} + {ggplot2}

g +
  theme_light(
    base_size = 18,
    base_family = "Cabinet Grotesk Black"
  )

Overwrite Specific Theme Settings

g +
  theme(
    panel.grid.minor = element_blank()
  )

Overwrite Specific Theme Settings

g +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold")
  )

Overwrite Specific Theme Settings

g +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    legend.position = "top"
  )

Overwrite Specific Theme Settings

g +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    legend.position = "none"
  )

Overwrite Specific Theme Settings

g +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    legend.position = "top",
    plot.title.position = "plot"
  )

Overwrite Theme Settings Globally

theme_update(
  panel.grid.minor = element_blank(),
  plot.title = element_text(face = "bold"),
  legend.position = "top",
  plot.title.position = "plot"
)

g

Export Your Graphic

Save the Graphic

ggsave(g, filename = "my_plot.png")
ggsave("my_plot.png")
ggsave("my_plot.png", width = 8, height = 5, dpi = 600)
ggsave("my_plot.pdf", width = 20, height = 12, units = "cm", device = cairo_pdf)
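
When `plot` is not given, `ggsave()` saves the last plot that was displayed, which is easy to get wrong in longer scripts. The sketch below passes the plot explicitly and writes to a temporary file; the file location and the `mpg` example plot are placeholders.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

## write to a temporary file so nothing in the project is overwritten
out_file <- tempfile(fileext = ".pdf")

## passing `plot` explicitly avoids depending on the last displayed plot
ggsave(out_file, plot = p, width = 20, height = 12, units = "cm")

file.exists(out_file)
```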


A comparison of vector and raster graphics.

Modified from canva.com

Scales

Scales


= translate between variable and property ranges

  • feels-like temperature ⇄ x
  • reported bike shares ⇄ y
  • season ⇄ color
  • year ⇄ shape

Scales

The scale_*() components control the properties of all the
aesthetic dimensions mapped to the data.

Consequently, there are scale_*() functions for all aesthetics such as:

  • positions via scale_x_*() and scale_y_*()
  • colors via scale_color_*() and scale_fill_*()
  • sizes via scale_size_*() and scale_radius_*()
  • shapes via scale_shape_*() and scale_linetype_*()
  • transparency via scale_alpha_*()

… with tons of options for * such as continuous, discrete, manual, log10, gradient, and many more

Scales

The scale_*() components control the properties of all the
aesthetic dimensions mapped to the data.

Use scales to:

  • set breaks (tick marks)
  • overwrite axis / legend labels
  • adjust axis / legend title
  • modify colors and palettes
  • define shapes and line types
  • scale sizes (e.g. bubbles)

Allison Horst's illustration of the correct use of continuous versus discrete; however, in {ggplot2} these are interpreted in a different way: as quantitative and qualitative.

Illustration by Allison Horst

Continuous vs. Discrete in {ggplot2}

Continuous:
quantitative or numerical data

  • height
  • weight
  • age
  • counts

Discrete:
qualitative or categorical data

  • species
  • sex
  • study sites
  • age group

Continuous vs. Discrete in {ggplot2}

Continuous:
quantitative or numerical data

  • height (continuous)
  • weight (continuous)
  • age (continuous or discrete)
  • counts (discrete)

Discrete:
qualitative or categorical data

  • species (nominal)
  • sex (nominal)
  • study site (nominal or ordinal)
  • age group (ordinal)
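
Which interpretation {ggplot2} picks depends on the class of the vector, not on the statistical nature of the variable. The sketch below queries the default scale type for a few invented vectors with `scale_type()`.

```r
library(ggplot2)

## default scale type {ggplot2} picks for different vector classes
scale_type(c(1.5, 2.7))            ## double
scale_type(c(10L, 20L))            ## integer, e.g. counts
scale_type(factor(c("a", "b")))    ## factor
scale_type(c("day", "night"))      ## character
scale_type(c(TRUE, FALSE))         ## logical

## a numeric variable wrapped in factor() is treated as discrete
is_discrete <- scale_type(factor(mtcars$cyl)) == "discrete"
is_discrete
```

Counts stored as integers are therefore handled as continuous, and wrapping a numeric variable in `factor()` is the usual way to request a discrete scale.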

Aesthetics + Scales

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = season)
  ) +
  geom_point() +
  scale_x_continuous() +
  scale_y_continuous() +
  scale_color_discrete()

Aesthetics + Scales

ggplot(
    bikes,
    aes(x = season, y = temp_feel)
  ) +
  geom_boxplot() +
  scale_x_discrete() +
  scale_y_continuous()

Aesthetics + Scales

g +
  scale_x_continuous() +
  scale_y_continuous() +
  scale_color_discrete()

Overwrite Scales

g +
  scale_x_binned() +
  scale_y_log10() +
  scale_color_viridis_d()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0)
  ) +
  scale_y_continuous() +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5)
  ) +
  scale_y_continuous() +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C")
  ) +
  scale_y_continuous() +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous() +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    limits = c(-20000, NA)
  ) +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    limits = c(5000, 20000)
  ) +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = c(100, 10000, 20000, 50000)
  ) +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000
  ) +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_discrete()

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous( 
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_discrete(
    type = c("#3c89d9", "#1ec99b", "#f7b01b", "#a26e7c")
  )

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_manual(
    values = c("#3c89d9", "#1ec99b", "#f7b01b", "#a26e7c")
  )

Modify Scales

colors_sorted <- c(
  autumn = "#a26e7c",
  spring = "#1ec99b",
  summer = "#f7b01b",
  winter = "#3c89d9"
)

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_manual(
    values = colors_sorted
  )

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_brewer(
    palette = "Dark2"
  )

{RColorBrewer}

RColorBrewer::display.brewer.all()

{RColorBrewer}

RColorBrewer::display.brewer.all(colorblindFriendly = TRUE)

{rcartocolor}

# install.packages("rcartocolor")
rcartocolor::display_carto_all()

{rcartocolor}

# install.packages("rcartocolor")
rcartocolor::display_carto_all(colorblind_friendly = TRUE)

{rcartocolor}

g +
  scale_x_continuous(
    expand = c(mult = 0.02, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    expand = c(mult = 0, add = 1500), 
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  rcartocolor::scale_color_carto_d(
    palette = "Bold"
  )

{scico}

# install.packages("scico")
scico::scico_palette_show()

{scico}

g +
  scale_x_continuous(
    expand = c(mult = 0.02, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    expand = c(mult = 0, add = 1500), 
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scico::scale_color_scico_d(
    palette = "hawaii"
  )

{scico}

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = humidity)
  ) +
  geom_point(alpha = .7) +
  scico::scale_color_scico(
    palette = "davos",
    direction = -1,
    end = .8
  )

{scico}

ggplot(
    bikes,
    aes(x = temp_feel, y = count,
        color = humidity)
  ) +
  geom_point(alpha = .7) +
  scale_color_viridis_c(
    direction = -1
  )

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous( 
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_manual(
    values = colors_sorted,
    name = NULL
  )

Modify Scales

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5), 
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000, 
    labels = scales::label_comma()
  ) +
  scale_color_manual(
    values = colors_sorted,
    name = NULL,
    labels = stringr::str_to_title
  )

Your Turn

  • Style the time series of reported bike shares.
    • Add a plot title and meaningful axis and legend titles.
    • Use a custom set of colors for day and night.
    • Explore complete themes and pick your favorite.
    • Bonus: Modify the x axis to show every four months along with the year.

Solution Exercise

(p <- 
   ggplot(
    bikes,
    aes(x = date, y = count,
        color = day_night)
  ) +
  geom_line() +
  geom_point()
)

Solution Exercise

p + 
  labs(
    title = "Most bikes are rented during summer days",
    subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
    x = NULL, 
    y = "Reported bike shares", 
    color = NULL
  )

Solution Exercise

p + 
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)

Solution Exercise

p + 
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)

Solution Exercise

p + 
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.title.position = "plot",
    legend.position = "top"
  ) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)

Solution Exercise

p + 
  scale_x_date(
    date_breaks = "4 months",
    date_labels = "%m/'%y"
  ) +
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.title.position = "plot",
    legend.position = "top"
  ) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)
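
The `date_labels` argument takes the same format codes as base R's `strftime()`, so a label format can be tried on a single date before it goes into the scale. A small sketch with an arbitrary date:

```r
## format codes are those of base R's strftime()
d <- as.Date("2015-05-01")

format(d, "%m/'%y")   ## two-digit month, apostrophe, two-digit year
format(d, "%Y-%m")    ## four-digit year and two-digit month
format(d, "%b %Y")    ## abbreviated month name (depends on the locale)
```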

Facets

Facets


= split variables into multiple panels

Facets are also known as:

  • small multiples
  • trellis graphs
  • lattice plots
  • conditioning

Wrapped Facets

g +
  facet_wrap(
    vars(day_night)
  )

Wrapped Facets

g +
  facet_wrap(
    ~ day_night
  )

Facet Multiple Variables

g +
  facet_wrap(
    ~ is_workday + day_night
  )

Gridded Facets

g +
  facet_grid(
    rows = vars(day_night),
    cols = vars(is_workday)
  )

Gridded Facets

g +
  facet_grid(
    day_night ~ is_workday
  )

Facet Multiple Variables

g +
  facet_grid(
    day_night ~ is_workday + season
  )

Facet Options: Free Scaling

g +
  facet_grid(
    day_night ~ is_workday,
    scales = "free"
  )

Facet Options: Proportional Spacing

g +
  facet_grid(
    day_night ~ is_workday,
    scales = "free",
    space = "free"
  )

Facet Options: Proportional Spacing

g +
  facet_grid(
    day_night ~ is_workday,
    scales = "free_y",
    space = "free_y"
  )

Facet Labellers

g +
  facet_wrap(
    ~ day_night,
    labeller = label_both
  )

Facet Labellers

g +
  facet_wrap(
    ~ is_workday + day_night,
    labeller = labeller(
      day_night = stringr::str_to_title
    )
  )

Facet Labellers

codes <- c(
  `TRUE` = "Workday",
  `FALSE` = "Weekend or Holiday"
)

g +
  facet_wrap(
    ~ is_workday + day_night,
    labeller = labeller(
      day_night = stringr::str_to_title,
      is_workday = codes
    )
  )

Coordinate Systems

Coordinate Systems


= interpret the position aesthetics

  • linear coordinate systems: preserve the geometrical shapes
    • coord_cartesian()
    • coord_fixed()
    • coord_flip()
  • non-linear coordinate systems: likely change the geometrical shapes
    • coord_polar()
    • coord_map() and coord_sf()
    • coord_trans()
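
To illustrate the distinction, the sketch below draws the same bar layer once in a linear and once in a non-linear coordinate system. It uses the `mpg` data set as a stand-in for the bike data.

```r
library(ggplot2)

bars <- ggplot(mpg, aes(x = class)) + geom_bar()

## linear: bars stay rectangles
p_cartesian <- bars + coord_cartesian()

## non-linear: the same bars are drawn as wedges around a circle
p_polar <- bars + coord_polar()

p_cartesian
p_polar
```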

Cartesian Coordinate System

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot() +
  coord_cartesian()

Cartesian Coordinate System

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot() +
  coord_cartesian(
    ylim = c(NA, 15000)
  )

Changing Limits

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot() +
  coord_cartesian(
    ylim = c(NA, 15000)
  )

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot() +
  scale_y_continuous(
    limits = c(NA, 15000)
  )
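
The two approaches are not interchangeable: limits in `coord_cartesian()` only zoom into the panel, while limits in a scale turn values outside the range into `NA` before any statistics are computed. A sketch with invented data shows the effect on the median of a box plot:

```r
library(ggplot2)

## nine moderate values and one extreme value (invented data)
df <- data.frame(group = "a", value = c(1:9, 100))

base <- ggplot(df, aes(x = group, y = value)) + geom_boxplot()

## zooming: all ten values enter the box plot statistics
p_zoom <- base + coord_cartesian(ylim = c(0, 20))

## scale limits: the value 100 is turned into NA before the statistics
p_limit <- base + scale_y_continuous(limits = c(0, 20))

median_zoom  <- layer_data(p_zoom)$middle
## building this plot warns about the removed value
median_limit <- suppressWarnings(layer_data(p_limit)$middle)
c(zoom = median_zoom, limit = median_limit)
```

Zooming via the coordinate system is usually the safer choice for box plots and smoothings, because the summaries still describe the full data.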

Clipping

ggplot(
    bikes,
    aes(x = season, y = count)
  ) +
  geom_boxplot() +
  coord_cartesian(
    ylim = c(NA, 15000),
    clip = "off"
  ) +
  theme(plot.margin = margin(300, 5, 5, 5))

Remove All Padding

ggplot(
    bikes,
    aes(x = temp_feel, y = count)
  ) +
  geom_point() +
  coord_cartesian(
    expand = FALSE,
    clip = "off"
  )

Fixed Coordinate System

ggplot(
    bikes,
    aes(x = temp_feel, y = temp)
  ) +
  geom_point() +
  coord_fixed()

ggplot(
    bikes,
    aes(x = temp_feel, y = temp)
  ) +
  geom_point() +
  coord_fixed(ratio = 4)

Flipped Coordinate System

ggplot(
    bikes,
    aes(x = weather_type)
  ) +
  geom_bar() +
  coord_cartesian()

ggplot(
    bikes,
    aes(x = weather_type)
  ) +
  geom_bar() +
  coord_flip()

Flipped Coordinate System

ggplot(
    bikes,
    aes(y = weather_type)
  ) +
  geom_bar() +
  coord_cartesian()

ggplot(
    bikes,
    aes(x = weather_type)
  ) +
  geom_bar() +
  coord_flip()

Reminder: Sort Your Bars!

library(forcats)

ggplot(
    filter(bikes, !is.na(weather_type)),
    aes(y = fct_infreq(weather_type))
  ) +
  geom_bar()

Reminder: Sort Your Bars!

library(forcats)

ggplot(
    filter(bikes, !is.na(weather_type)),
    aes(y = fct_rev(
      fct_infreq(weather_type)
    ))
  ) +
  geom_bar()
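
What the two {forcats} helpers do to the factor levels can be checked without plotting. A small sketch with invented weather types:

```r
library(forcats)

## invented weather types with different frequencies
weather <- c("rainy", "clear", "clear", "cloudy", "clear", "cloudy")

## most frequent level first
levels(fct_infreq(weather))

## reversed, so the most frequent level comes last; on a discrete y axis
## the last level is drawn at the top
levels(fct_rev(fct_infreq(weather)))
```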

Wrap-Up

g1 <- 
  ## create scatter plot, encoded based on season
  ggplot(bikes, aes(temp_feel, count)) +
  geom_point(
    aes(color = season),
    size = 2.2, alpha = .55
  ) 

g1

g2 <- g1 +
  ## add a linear fitting for each time of the day
  geom_smooth(
    aes(group = day_night),
    method = "lm", color = "black"
  )

g2

g3 <- g2 +
  ## add titles + labels
  labs(
    x = "Feels-Like Temperature", y = NULL,
    caption = "Data: Transport for London (TfL), Jan 2015–Dec 2016",
    title = "Reported TfL bike rents versus feels-like temperature in London, 2015–2016"
  )

g3

g4 <- g3 + 
  theme_light(base_size = 18, base_family = "Spline Sans")

g4

codes <- c(
  `TRUE` = "Workday",
  `FALSE` = "Weekend or Holiday"
)

g5 <- g4 +
  facet_grid(
    day_night ~ is_workday,
    scales = "free_y", space = "free_y",
    labeller = labeller(
      day_night = stringr::str_to_title,
      is_workday = codes
    )
  )

g5

g6 <- g5  +
  ## adjust labels x-axis
  scale_x_continuous(
    expand = c(mult = 0, add = 1),
    breaks = 0:6*5, labels = function(x) paste0(x, "°C")
  ) +
  ## adjust labels y-axis
  scale_y_continuous(
    expand = c(mult = .05, add = 0), limits = c(0, NA),
    breaks = 0:5*10000, labels = scales::label_comma()
  ) +
  ## modify colors + legend
  scale_color_manual(
    values = c("#3c89d9", "#1ec99b", "#F7B01B", "#a26e7c"), 
    name = NULL, labels = stringr::str_to_title
  ) 

g6

g6 + 
  ## theme modifications
  theme(
    plot.title.position = "plot",
    plot.caption.position = "plot",
    plot.title = element_text(face = "bold", size = rel(1.5)),
    axis.text = element_text(family = "Spline Sans Mono", color = "grey45"),
    axis.title.x = element_text(hjust = 0, margin = margin(t = 12), color = "grey25"),
    strip.text = element_text(face = "bold", size = rel(1.1)),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    legend.text = element_text(size = rel(.9)),
    legend.position = "top",
    legend.justification = "left"
  )

That’s it Folks…
— Thank you! —

Appendix

Statistical Layers

geom_*() and stat_*()

ggplot(bikes, aes(x = temp_feel, y = count)) +
  geom_smooth(stat = "smooth")

ggplot(bikes, aes(x = temp_feel, y = count)) +
  stat_smooth(geom = "smooth")

geom_*() and stat_*()

ggplot(bikes, aes(x = date, y = temp_feel)) +
  geom_point(stat = "identity")

ggplot(bikes, aes(x = date, y = temp_feel)) +
  stat_identity(geom = "point")

geom_*() and stat_*()

ggplot(bikes, aes(x = is_weekend)) +
  geom_bar(stat = "count")

ggplot(bikes, aes(x = is_weekend)) +
  stat_count(geom = "bar")
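
Each pair above describes the same layer from two directions: a geom with its stat, or a stat with its geom. The sketch below checks this for the bar chart by comparing the computed counts, using the `mpg` data as a stand-in.

```r
library(ggplot2)

p_geom <- ggplot(mpg, aes(x = drv)) + geom_bar(stat = "count")
p_stat <- ggplot(mpg, aes(x = drv)) + stat_count(geom = "bar")

## both layers compute the same counts per category
layer_data(p_geom)$count
layer_data(p_stat)$count
```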

Statistical Summaries

ggplot(
    bikes, 
    aes(x = season, y = temp_feel)
  ) +
  stat_summary() 

Statistical Summaries

ggplot(
    bikes, 
    aes(x = season, y = temp_feel)
  ) +
  stat_summary(
    fun.data = mean_se, ## the default
    geom = "pointrange"  ## the default
  ) 
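
The default summary function `mean_se()` returns the mean together with the mean minus and plus one standard error. The sketch below compares its result with the same numbers computed by hand for three invented values.

```r
library(ggplot2)

temps <- c(10, 12, 14)  ## invented values

## mean_se() returns the mean (y) and mean -/+ one standard error (ymin, ymax)
res <- mean_se(temps)
res

## the same numbers computed by hand
se <- sd(temps) / sqrt(length(temps))
by_hand <- c(
  y = mean(temps),
  ymin = mean(temps) - se,
  ymax = mean(temps) + se
)
by_hand
```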

Statistical Summaries

ggplot(
    bikes, 
    aes(x = season, y = temp_feel)
  ) +
  geom_boxplot() +
  stat_summary(
    fun = mean,
    geom = "point",
    color = "#28a87d",
    size = 3
  ) 

Statistical Summaries

ggplot(
    bikes, 
    aes(x = season, y = temp_feel)
  ) +
  stat_summary() +
  stat_summary(
    fun = mean,
    geom = "text",
    aes(label = after_stat(y))
  )

Statistical Summaries

ggplot(
    bikes, 
    aes(x = season, y = temp_feel)
  ) +
  stat_summary() +
  stat_summary(
    fun = mean,
    geom = "text",
    aes(label = after_stat(
      paste0(round(y, 2), "°C"))
    ),
    hjust = -.2,
    size = 3.5
  )

Aspect Ratios

How to Work with Aspect Ratios

  • don’t rely on the RStudio viewer pane!
  • once you have an “it’s getting close” prototype, settle on a plot size

  • Approach 1: save the file to disk and inspect it; go back to your IDE
    • tedious and time-consuming…

  • Approach 2: use a qmd or rmd with inline output and chunk settings
    • set fig.width and fig.height per chunk or globally

  • Approach 3: use our {camcorder} package
    • saves output from all ggplot() calls and displays it in the viewer pane
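
Approaches 2 and 3 could be set up roughly as sketched below. The plot size of 8 x 5 inches and the recording directory are placeholder values, and both packages are only used if they are installed.

```r
## placeholder plot size in inches
fig_width  <- 8
fig_height <- 5

## Approach 2: global chunk defaults, placed in the setup chunk of an Rmd or qmd
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(fig.width = fig_width, fig.height = fig_height)
}

## Approach 3: record every plot at a fixed size with {camcorder}
## (only in interactive sessions; the directory is a temporary placeholder)
if (interactive() && requireNamespace("camcorder", quietly = TRUE)) {
  camcorder::gg_record(
    dir = file.path(tempdir(), "recording"),
    device = "png",
    width = fig_width,
    height = fig_height,
    units = "in",
    dpi = 300
  )
}
```

Individual chunks can still override the global defaults with their own `fig.width` and `fig.height` settings.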

Setting Plot Sizes in Rmd’s

A screenshot of an exemplary Rmd file with two chunks with different settings of fig.width and fig.height.

Setting Plot Sizes via {camcorder}


A screenshot of an exemplary R script with a plot automatically saved and displayed in the correct aspect ratio thanks to the camcorder package.