Commit 2c5a78c

Merge pull request #5132 from kevinxperese/kevinxperese-ggplot-docstring
Improve clarity of docstring for ggplot()
2 parents 35e146b + fa2b597 commit 2c5a78c

2 files changed: +143 -87 lines changed

R/plot.r

+73 -45
@@ -6,26 +6,35 @@
 #' subsequent layers unless specifically overridden.
 #'
 #' `ggplot()` is used to construct the initial plot object,
-#' and is almost always followed by `+` to add component to the
-#' plot. There are three common ways to invoke `ggplot()`:
+#' and is almost always followed by a plus sign (`+`) to add
+#' components to the plot.
 #'
-#' * `ggplot(df, aes(x, y, other aesthetics))`
-#' * `ggplot(df)`
+#' There are three common patterns used to invoke `ggplot()`:
+#'
+#' * `ggplot(data = df, mapping = aes(x, y, other aesthetics))`
+#' * `ggplot(data = df)`
 #' * `ggplot()`
 #'
-#' The first method is recommended if all layers use the same
+#' The first pattern is recommended if all layers use the same
 #' data and the same set of aesthetics, although this method
-#' can also be used to add a layer using data from another
-#' data frame. See the first example below. The second
-#' method specifies the default data frame to use for the plot,
-#' but no aesthetics are defined up front. This is useful when
-#' one data frame is used predominantly as layers are added,
-#' but the aesthetics may vary from one layer to another. The
-#' third method initializes a skeleton `ggplot` object which
-#' is fleshed out as layers are added. This method is useful when
+#' can also be used when adding a layer using data from another
+#' data frame.
+#'
+#' The second pattern specifies the default data frame to use
+#' for the plot, but no aesthetics are defined up front. This
+#' is useful when one data frame is used predominantly for the
+#' plot, but the aesthetics vary from one layer to another.
+#'
+#' The third pattern initializes a skeleton `ggplot` object, which
+#' is fleshed out as layers are added. This is useful when
 #' multiple data frames are used to produce different layers, as
 #' is often the case in complex graphics.
 #'
+#' The `data =` and `mapping =` specifications in the arguments are optional
+#' (and are often omitted in practice), so long as the data and the mapping
+#' values are passed into the function in the right order. In the examples
+#' below, however, they are left in place for clarity.
+#'
 #' @param data Default dataset to use for plot. If not already a data.frame,
 #'   will be converted to one by [fortify()]. If not specified,
 #'   must be supplied in each layer added to the plot.
@@ -36,42 +45,61 @@
 #'   evaluation.
 #' @export
 #' @examples
-#' # Generate some sample data, then compute mean and standard deviation
-#' # in each group
+#' # Create a data frame with some sample data, then create a data frame
+#' # containing the mean value for each group in the sample data.
 #' set.seed(1)
-#' df <- data.frame(
-#'   gp = factor(rep(letters[1:3], each = 10)),
-#'   y = rnorm(30)
+#'
+#' sample_df <- data.frame(
+#'   group = factor(rep(letters[1:3], each = 10)),
+#'   value = rnorm(30)
 #' )
-#' ds <- do.call(rbind, lapply(split(df, df$gp), function(d) {
-#'   data.frame(mean = mean(d$y), sd = sd(d$y), gp = d$gp)
-#' }))
-#'
-#' # The summary data frame ds is used to plot larger red points on top
-#' # of the raw data. Note that we don't need to supply `data` or `mapping`
-#' # in each layer because the defaults from ggplot() are used.
-#' ggplot(df, aes(gp, y)) +
+#'
+#' group_means_df <- setNames(
+#'   aggregate(value ~ group, sample_df, mean),
+#'   c("group", "group_mean")
+#' )
+#'
+#' # The following three code blocks create the same graphic, each using one
+#' # of the three patterns specified above. In each graphic, the sample data
+#' # are plotted in the first layer and the group means data frame is used to
+#' # plot larger red points on top of the sample data in the second layer.
+#'
+#' # Pattern 1
+#' # Both the `data` and `mapping` arguments are passed into the `ggplot()`
+#' # call. Those arguments are omitted in the first `geom_point()` layer
+#' # because they get passed along from the `ggplot()` call. Note that the
+#' # second `geom_point()` layer re-uses the `x = group` aesthetic through
+#' # that mechanism but overrides the y-position aesthetic.
+#' ggplot(data = sample_df, mapping = aes(x = group, y = value)) +
 #'   geom_point() +
-#'   geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
-#'
-#' # Same plot as above, declaring only the data frame in ggplot().
-#' # Note how the x and y aesthetics must now be declared in
-#' # each geom_point() layer.
-#' ggplot(df) +
-#'   geom_point(aes(gp, y)) +
-#'   geom_point(data = ds, aes(gp, mean), colour = 'red', size = 3)
-#'
-#' # Alternatively we can fully specify the plot in each layer. This
-#' # is not useful here, but can be more clear when working with complex
-#' # mult-dataset graphics
+#'   geom_point(
+#'     mapping = aes(y = group_mean), data = group_means_df,
+#'     colour = 'red', size = 3
+#'   )
+#'
+#' # Pattern 2
+#' # Same plot as above, passing only the `data` argument into the `ggplot()`
+#' # call. The `mapping` arguments are now required in each `geom_point()`
+#' # layer because there is no `mapping` argument passed along from the
+#' # `ggplot()` call.
+#' ggplot(data = sample_df) +
+#'   geom_point(mapping = aes(x = group, y = value)) +
+#'   geom_point(
+#'     mapping = aes(x = group, y = group_mean), data = group_means_df,
+#'     colour = 'red', size = 3
+#'   )
+#'
+#' # Pattern 3
+#' # Same plot as above, passing neither the `data` or `mapping` arguments
+#' # into the `ggplot()` call. Both those arguments are now required in
+#' # each `geom_point()` layer. This pattern can be particularly useful when
+#' # creating more complex graphics with many layers using data from multiple
+#' # data frames.
 #' ggplot() +
-#'   geom_point(data = df, aes(gp, y)) +
-#'   geom_point(data = ds, aes(gp, mean), colour = 'red', size = 3) +
-#'   geom_errorbar(
-#'     data = ds,
-#'     aes(gp, mean, ymin = mean - sd, ymax = mean + sd),
-#'     colour = 'red',
-#'     width = 0.4
+#'   geom_point(mapping = aes(x = group, y = value), data = sample_df) +
+#'   geom_point(
+#'     mapping = aes(x = group, y = group_mean), data = group_means_df,
+#'     colour = 'red', size = 3
 #'   )
 ggplot <- function(data = NULL, mapping = aes(), ...,
                    environment = parent.frame()) {

man/ggplot.Rd

+70 -42
Some generated files are not rendered by default.
