Reproducible Graphic Design with {ggplot2}
{ggplot2} is a system for declaratively creating graphics, based on “The Grammar of Graphics” (Wilkinson, 2005). You provide the data, tell {ggplot2} how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.
… is an R package for visualizing data, created by Hadley Wickham in 2005
… is part of the {tidyverse}
Component | Function | Explanation |
---|---|---|
Data | `ggplot(data)` | The raw data that you want to visualise. |
Aesthetics | `aes()` | Aesthetic mappings between variables and visual properties. |
Geometries | `geom_*()` | The geometric shapes representing the data. |
Statistics | `stat_*()` | The statistical transformations applied to the data. |
Scales | `scale_*()` | Maps between the data and the aesthetic dimensions. |
Facets | `facet_*()` | The arrangement of the data into a grid of plots. |
Coordinate System | `coord_*()` | Maps data into the plane of the data rectangle. |
Visual Themes | `theme()` and `theme_*()` | The overall visual defaults of a plot. |
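As a rough illustration of how the components in the table stack up, the following sketch builds one plot that touches each of them once. The data frame `toy` is made up for this example (it only borrows column names from the bike sharing data described below), and the specific geoms, stats, and theme are arbitrary choices, not the ones used in the course.

```r
library(ggplot2)

# Toy stand-in data (made up for illustration, not the TfL bike data)
toy <- data.frame(
  temp_feel = c(2, 8, 14, 21, 5, 11, 17, 24),
  count     = c(9000, 15000, 24000, 33000, 3000, 5000, 9000, 12000),
  day_night = rep(c("day", "night"), each = 4)
)

p <- ggplot(toy) +                                      # data
  aes(x = temp_feel, y = count, color = day_night) +    # aesthetics
  geom_point() +                                        # geometry
  stat_smooth(method = "lm", se = FALSE) +              # statistic
  scale_y_continuous(labels = scales::label_comma()) +  # scale
  facet_wrap(~ day_night) +                             # facets
  coord_cartesian(clip = "off") +                       # coordinate system
  theme_minimal()                                       # visual theme

p
```

Each `+` adds one component to the plot object; nothing is drawn until the object is printed.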
Bike sharing counts in London, UK, powered by TfL Open Data
Variable | Description | Class |
---|---|---|
date | Date encoded as `YYYY-MM-DD` | date |
day_night | `day` (6:00am–5:59pm) or `night` (6:00pm–5:59am) | character |
year | `2015` or `2016` | factor |
month | `1` (January) to `12` (December) | factor |
season | `winter`, `spring`, `summer`, or `autumn` | factor |
count | Sum of reported bikes rented | integer |
is_workday | `TRUE` being Monday to Friday and no bank holiday | logical |
is_weekend | `TRUE` being Saturday or Sunday | logical |
is_holiday | `TRUE` being a bank holiday in the UK | logical |
temp | Average air temperature (°C) | double |
temp_feel | Average feels-like temperature (°C) | double |
humidity | Average air humidity (%) | double |
wind_speed | Average wind speed (km/h) | double |
weather_type | Most common weather type | character |
ggplot2::ggplot()
Aesthetics = link variables to graphical properties
positions (`x`, `y`)
colors (`color`, `fill`)
shapes (`shape`, `linetype`)
size (`size`)
transparency (`alpha`)
groupings (`group`)
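To make the idea of linking variables to graphical properties concrete, here is a minimal sketch contrasting an aesthetic that is mapped to a variable inside `aes()` with a property that is set to a constant outside of it. The data frame `toy` is invented for this example; only its column names follow the data dictionary above.

```r
library(ggplot2)

# Toy stand-in data (made up; column names mirror the bikes data dictionary)
toy <- data.frame(
  date      = rep(as.Date("2015-01-01") + 0:3, times = 2),
  count     = c(9000, 11000, 10500, 12000, 3000, 3500, 3200, 4100),
  day_night = rep(c("day", "night"), each = 4)
)

# Mapped inside aes(): the line color varies with the variable day_night
p_mapped <- ggplot(toy, aes(x = date, y = count, color = day_night)) +
  geom_line()

# Set outside aes(): one constant color; group keeps the two series apart
p_fixed <- ggplot(toy, aes(x = date, y = count, group = day_night)) +
  geom_line(color = "grey40")
```

In the first plot the color is data-driven and gets a legend; in the second it is a fixed styling choice and no legend is drawn.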
Geometries = interpret aesthetics as graphical representations
What is the difference between `geom_line()` and `geom_path()`?
Map `day_night` to the color of the lines.
Scales = translate between variable and property ranges
The `scale_*()` components control the properties of all the aesthetic dimensions mapped to the data.
Consequently, there are `scale_*()` functions for all aesthetics, such as:
`scale_x_*()` and `scale_y_*()`
`scale_color_*()` and `scale_fill_*()`
`scale_size_*()` and `scale_radius_*()`
`scale_shape_*()` and `scale_linetype_*()`
`scale_alpha_*()`
… with tons of options for `*` such as `continuous`, `discrete`, `manual`, `log10`, `gradient`, and many more
Use scales to:
Continuous: quantitative or numerical data
Discrete: qualitative or categorical data
g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  ## custom colors passed via the type argument
  scale_color_discrete(
    type = c("#3c89d9", "#1ec99b", "#f7b01b", "#a26e7c")
  )
g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  ## same colors, now set with a manual scale
  scale_color_manual(
    values = c("#3c89d9", "#1ec99b", "#f7b01b", "#a26e7c")
  )
## named vector: colors are matched to the season levels by name
colors_sorted <- c(
  autumn = "#a26e7c",
  spring = "#1ec99b",
  summer = "#f7b01b",
  winter = "#3c89d9"
)

g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  scale_color_manual(
    values = colors_sorted
  )
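The difference between a named and an unnamed color vector in `scale_color_manual()` can be checked directly. The sketch below uses a made-up four-row data frame `toy` (not the bike data) and inspects the colors that ggplot2 assigns with `layer_data()`: named values are matched to factor levels by name, while unnamed values are assigned in level order.

```r
library(ggplot2)

# Toy stand-in data (made up); one point per season, levels in calendar order
toy <- data.frame(
  temp_feel = c(3, 12, 22, 10),
  count     = c(8000, 20000, 35000, 16000),
  season    = factor(
    c("winter", "spring", "summer", "autumn"),
    levels = c("winter", "spring", "summer", "autumn")
  )
)

colors_sorted <- c(
  autumn = "#a26e7c", spring = "#1ec99b",
  summer = "#f7b01b", winter = "#3c89d9"
)

base <- ggplot(toy, aes(x = temp_feel, y = count, color = season)) +
  geom_point()

# Named vector: matched by name, so the order of the vector does not matter
built_named <- layer_data(base + scale_color_manual(values = colors_sorted))

# Unnamed vector: assigned in level order (winter, spring, summer, autumn)
built_unnamed <- layer_data(
  base + scale_color_manual(values = unname(colors_sorted))
)

# Color assigned to the winter point (x = 3) in each case
built_named$colour[built_named$x == 3]
built_unnamed$colour[built_unnamed$x == 3]
```

With the named vector, winter keeps its intended blue; with the unnamed one, winter receives the first entry of the vector, which here is the autumn color.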
g +
  scale_x_continuous(
    expand = c(mult = 0.02, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    expand = c(mult = 0, add = 1500),
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  ## discrete palette from the {rcartocolor} package
  rcartocolor::scale_color_carto_d(
    palette = "Bold"
  )
g +
  scale_x_continuous(
    expand = c(mult = 0.02, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    expand = c(mult = 0, add = 1500),
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  ## discrete palette from the {scico} package
  scico::scale_color_scico_d(
    palette = "hawaii"
  )
g +
  scale_x_continuous(
    expand = c(mult = 0, add = 0),
    breaks = seq(0, 30, by = 5),
    labels = function(x) paste0(x, "°C"),
    name = "Feels-like temperature"
  ) +
  scale_y_continuous(
    breaks = 0:5*10000,
    labels = scales::label_comma()
  ) +
  ## drop the legend title and capitalize the legend labels
  scale_color_manual(
    values = colors_sorted,
    name = NULL,
    labels = stringr::str_to_title
  )
p +
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)
p +
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.title.position = "plot",
    legend.position = "top"
  ) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)
p +
  ## one axis break every four months, labelled as month/'year
  scale_x_date(
    date_breaks = "4 months",
    date_labels = "%m/'%y"
  ) +
  scale_color_manual(
    values = c("#98730F", "#44458e"),
    labels = c("Day (6am-6pm)", "Night (6pm-6am)")
  ) +
  theme_minimal(base_family = "Spline Sans", base_size = 15) +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.title.position = "plot",
    legend.position = "top"
  ) +
  labs(title = "Most bikes are rented during summer days",
       subtitle = "Reported rents of TfL bikes for 2015 and 2016.",
       x = NULL, y = "Reported bike shares", color = NULL)
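The `date_labels` string uses the usual `strftime`-style codes, so its effect can be previewed with base R's `format()` without building a plot. The break dates below are hypothetical and chosen by hand; the breaks that `scale_x_date()` actually picks depend on the range of the data.

```r
# Hypothetical break dates, spaced four months apart
example_breaks <- seq(as.Date("2015-01-01"), as.Date("2016-12-31"), by = "4 months")

# %m = two-digit month, %y = two-digit year; the apostrophe is a literal character
example_labels <- format(example_breaks, "%m/'%y")
print(example_labels)
# "01/'15" "05/'15" "09/'15" "01/'16" "05/'16" "09/'16"
```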
Facets = split variables to multiple panels
Facets are also known as small multiples, trellis graphs, or lattice plots.
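A small sketch of the two common faceting functions, assuming a made-up data frame `toy` whose column names follow the bikes data dictionary: `facet_wrap()` splits by one variable into a ribbon of panels, and `facet_grid()` crosses two variables into rows and columns.

```r
library(ggplot2)

# Toy stand-in data (made up): 2 seasons x 2 times of day x 3 temperatures
toy <- data.frame(
  temp_feel = rep(c(5, 15, 25), times = 4),
  count     = c(8000, 15000, 21000, 2500, 4000, 6000,
                12000, 24000, 36000, 3500, 6000, 9000),
  season    = rep(c("winter", "summer"), each = 6),
  day_night = rep(rep(c("day", "night"), each = 3), times = 2)
)

base <- ggplot(toy, aes(x = temp_feel, y = count)) +
  geom_point()

# One panel per season
p_wrap <- base + facet_wrap(~ season)

# Rows = day_night, columns = season
p_grid <- base + facet_grid(day_night ~ season)

# Number of panels in each layout
length(unique(layer_data(p_wrap)$PANEL))
length(unique(layer_data(p_grid)$PANEL))
```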
Coordinate Systems = interpret the position aesthetics
`coord_cartesian()`
`coord_fixed()`
`coord_flip()`
`coord_polar()`
`coord_map()` and `coord_sf()`
`coord_trans()`
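One practical consequence of the coordinate system being a separate component: limiting the view with `coord_cartesian()` is not the same as limiting the scale. The sketch below, on a made-up five-point data frame, compares the two by counting how many y values are still available after the plot is built. With the default out-of-bounds handling, scale limits discard values outside the range, whereas the coordinate system only zooms.

```r
library(ggplot2)

# Toy stand-in data (made up)
toy <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

base <- ggplot(toy, aes(x = x, y = y)) +
  geom_point()

# Zoom: all five values stay in the data, two are simply out of view
p_zoom <- base + coord_cartesian(ylim = c(0, 35))

# Scale limits: values above 35 are treated as out of bounds
p_limit <- base + scale_y_continuous(limits = c(0, 35))

# Usable y values in each built plot
n_zoom  <- sum(!is.na(layer_data(p_zoom)$y))
n_limit <- sum(!is.na(layer_data(p_limit)$y))
c(zoom = n_zoom, limit = n_limit)
```

This matters most when a statistic such as a smoother or a boxplot is computed, because it only sees the values that survive the scale.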
g6 <- g5 +
  ## adjust labels x-axis
  scale_x_continuous(
    expand = c(mult = 0, add = 1),
    breaks = 0:6*5, labels = function(x) paste0(x, "°C")
  ) +
  ## adjust labels y-axis
  scale_y_continuous(
    expand = c(mult = .05, add = 0), limits = c(0, NA),
    breaks = 0:5*10000, labels = scales::label_comma()
  ) +
  ## modify colors + legend
  scale_color_manual(
    values = c("#3c89d9", "#1ec99b", "#F7B01B", "#a26e7c"),
    name = NULL, labels = stringr::str_to_title
  )

g6
g6 +
  ## theme modifications
  theme(
    plot.title.position = "plot",
    plot.caption.position = "plot",
    plot.title = element_text(face = "bold", size = rel(1.5)),
    axis.text = element_text(family = "Spline Sans Mono", color = "grey45"),
    axis.title.x = element_text(hjust = 0, margin = margin(t = 12), color = "grey25"),
    strip.text = element_text(face = "bold", size = rel(1.1)),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    legend.text = element_text(size = rel(.9)),
    legend.position = "top",
    legend.justification = "left"
  )
`fig.width` and `fig.height` per chunk or globally
… `ggplot()` calls and displays it in the viewer pane
Cédric Scherer // DVS Masterclass // March 9, 2023