Today
ggplot
xvar to be time
group = to Make Multiple Lines
You need long data:
df <- data.frame(year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 type = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 outcome = c(1, 2, 3, 2, 3, 3.1, 3, 4, 5))
df
  year type outcome
1    1    1     1.0
2    2    1     2.0
3    3    1     3.0
4    1    2     2.0
5    2    2     3.0
6    3    2     3.1
7    1    3     3.0
8    2    3     4.0
9    3    3     5.0
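A minimal sketch of how group = might be used with the long data above. It assumes the ggplot2 package is installed; the object name p is only illustrative.

```r
library(ggplot2)

# same long data as above: one row per year-type combination
df <- data.frame(year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 type = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 outcome = c(1, 2, 3, 2, 3, 3.1, 3, 4, 5))

# group = type asks for one line per value of type
p <- ggplot(data = df,
            mapping = aes(x = year, y = outcome, group = type)) +
  geom_line()
p
```

Without group = type, geom_line() would connect all nine points as a single line, which is usually not what you want.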
Were the hurricane data long?
color = to color lines
What is very annoying about the legend?
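A sketch of color = on the same toy data. The slide's plot is not reproduced here, so the legend issue shown is an assumption: when the coloring variable is numeric, ggplot2 draws a continuous color bar rather than one entry per line. Wrapping the variable in factor() is one common fix. The names p_numeric and p_factor are illustrative.

```r
library(ggplot2)

df <- data.frame(year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 type = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 outcome = c(1, 2, 3, 2, 3, 3.1, 3, 4, 5))

# numeric type: color is treated as continuous, so the legend is a gradient bar
p_numeric <- ggplot(df, aes(x = year, y = outcome, group = type, color = type)) +
  geom_line()

# factor(type): color is discrete, so the legend has one entry per type
p_factor <- ggplot(df, aes(x = year, y = outcome, color = factor(type))) +
  geom_line() +
  labs(color = "type")  # cleaner legend title than "factor(type)"
```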
In today’s tutorial, you’ll use bikeshare data (group_by first)
Remember that group_by and then summarize take you from one unit of observation to another.
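A small sketch of the change in unit of observation. The rides data frame and its column names are hypothetical stand-ins, not the tutorial's actual bikeshare file; it assumes the dplyr package is installed.

```r
library(dplyr)

# hypothetical ride-level data: one row per ride
rides <- data.frame(month = c(1, 1, 1, 2, 2),
                    station = c("A", "A", "B", "A", "B"),
                    minutes = c(10, 20, 30, 40, 50))

# group_by() then summarize(): from one row per ride to one row per month
by_month <- rides %>%
  group_by(month) %>%
  summarize(n_rides = n(),
            mean_minutes = mean(minutes))
by_month
```

The input has five rows (rides); the output has two rows (months).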
General logic is
What you add is in geom = [name of geom]
which can be
  year tree growth
1    1    1      5
2    2    1      6
3    3    1      7
4    1    2      8
5    2    2      8
6    3    2      9
Is this long or wide?
ggplot() prefers long data
To think about this we will
wide <- data.frame(state = c("6", "36", "48"),
                   female_pop = c("10", "12", "14"),
                   male_pop = c("11", "13", "12"))
wide
  state female_pop male_pop
1     6         10       11
2    36         12       13
3    48         14       12
What does the long version of this look like?
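One possible way to build the long version, sketched with the tidyr package (assumed installed). The new column names sex and pop, the use of substr() to drop the "_pop" suffix, and the conversion of pop to numeric are illustrative choices, not necessarily the ones used in class.

```r
library(tidyr)

wide <- data.frame(state = c("6", "36", "48"),
                   female_pop = c("10", "12", "14"),
                   male_pop = c("11", "13", "12"))

# one row per state-sex combination instead of one row per state
long <- pivot_longer(wide,
                     cols = c(female_pop, male_pop),
                     names_to = "sex",
                     values_to = "pop")

# substr() keeps everything except the last four characters ("_pop")
long$sex <- substr(long$sex, 1, nchar(long$sex) - 4)
long$pop <- as.numeric(long$pop)
long

# pivot_wider() goes the other way: back to one row per state
back <- pivot_wider(long, names_from = sex, values_from = pop)
```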
substr() command
pivot_longer() itself
pivot_wider() for going the other way
Write a minimal reproducible example
Doing this frequently solves your problem
Two basic methods
Taken largely from Stack Overflow’s advice. For Hadley Wickham’s official advice, see here.
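The two methods are not listed in the text above, so the following is an assumption about what they might be: two approaches commonly recommended for R questions are (1) typing a tiny data frame by hand and (2) using dput() to print code that recreates an existing object. A base-R sketch, with illustrative object names:

```r
# Method 1 (assumed): build a tiny data frame by hand that still shows the problem
small <- data.frame(year = c(1, 2, 3),
                    outcome = c(1, 2, 3))

# Method 2 (assumed): dput() prints R code that recreates an object,
# so a reader can paste it and work with exactly your data
dput(head(small, 2))

# check that the code dput() prints really rebuilds the same data
rebuilt <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
all.equal(rebuilt, small)
```

Either way, the goal is that someone else can run your code from a fresh session and see the same problem you see.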
Problem: Former student Jasmine can get a bar graph
Ack! A lot of stuff together
(i) Do the data prep before the graph, so the graph command has just the graphing, and not the summarizing or slicing or any of that. So prep the data and then show me the head of the dataframe, with all the variables that go into the graph.
(ii) Then run the code for both geom_col and geom_line that provides different estimates.
It’s possible that just doing (i) will figure out your problem, but we’ll see.
LB
→ (i) did figure out the problem: data not formatted correctly
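A sketch of the prep-first workflow described above. The data and column names are hypothetical and are not Jasmine's actual data; it assumes dplyr and ggplot2 are installed.

```r
library(dplyr)
library(ggplot2)

# hypothetical raw data: several rows per year
raw <- data.frame(year = c(2001, 2001, 2002, 2002, 2003, 2003),
                  value = c(1, 3, 2, 6, 5, 7))

# all data prep happens here, before any graphing
plot_data <- raw %>%
  group_by(year) %>%
  summarize(mean_value = mean(value))

# look at exactly what goes into the graph
head(plot_data)

# the graph commands now contain only graphing, and both use the same data
bar_plot <- ggplot(plot_data, aes(x = year, y = mean_value)) +
  geom_col()
line_plot <- ggplot(plot_data, aes(x = year, y = mean_value)) +
  geom_line()
```

Because both plots read from the same prepared data frame, any difference between the bar and line versions has to come from the geoms rather than from the summarizing step.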