---
title: Cats!
author: CRS
date: 26 August 2019 (36-462, Lecture 1)
---
Let's play with cats!
```{r}
library(MASS)
data(cats)
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
col=ifelse(cats$Sex=="F","red","blue"))
```
Our first **code chunk** loads a library, loads some data, makes a plot,
_and embeds the plot in our document_. If we change the data, or change
what we do to it, the output changes along with it.
(The default is for each code chunk to also show the code --- you can turn that
off if you like.)
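As a minimal sketch of turning the code display off: knitr's `echo` chunk option controls whether a chunk's source is printed, so a chunk headed `{r, echo=FALSE}` shows only its output. The name `sex.counts` is ours, introduced purely for illustration, and the library and data are loaded again only so the chunk can run by itself.

```{r, echo=FALSE}
library(MASS)  # repeated only so this chunk is self-contained
data(cats)
# Count the cats of each sex; with echo=FALSE only the table appears in the output
sex.counts <- table(cats$Sex)
sex.counts
```

Calling `knitr::opts_chunk$set(echo=FALSE)` in an early chunk should apply the same choice to every later chunk.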
We can fit a linear model by ordinary least squares here:
```{r}
cats.lm <- lm(Hwt ~ Bwt, data=cats)
```
**Notice**: we don't have to re-load the data. R Markdown runs all the chunks
of a document in one R session, so objects created in one chunk stay available
in later ones.
But how do we see the results?
```{r}
summary(cats.lm)
```
This looks ugly --- there must be a better way!
```{r}
library(knitr) # For some R Markdown-related utilities
kable(summary(cats.lm)$coefficients)
```
That many decimal places is distracting _and_ nonsensical, so round
everything to three significant figures.
```{r}
kable(signif(summary(cats.lm)$coefficients, 3))
```
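`kable()` also takes a `digits` argument, which rounds to a fixed number of decimal places rather than to significant figures. A sketch of that alternative follows; the name `coefs` is ours, and the model is re-fit only so the chunk can run by itself.

```{r}
library(MASS)
library(knitr)
data(cats)
coefs <- summary(lm(Hwt ~ Bwt, data=cats))$coefficients
# digits=3 rounds every column to three decimal places before formatting
kable(coefs, digits=3)
```

With decimal-place rounding a very small p-value prints as zero, which is one reason to prefer `signif()` for a table like this.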
Add the regression line to the plot:
```{r}
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
col=ifelse(cats$Sex=="F","red","blue"))
abline(cats.lm)
```
Not bad, but not outstanding. Let's try something else: a regression tree.
```{r}
library(tree)
cats.tree <- tree(Hwt ~ Bwt+Sex, data=cats)
```
```{r}
plot(cats.tree)
text(cats.tree)
```
The tree has a root-mean-squared error (RMSE) of
`r signif(sqrt(mean(residuals(cats.tree)^2)), 4)` g, which is (a little bit) smaller than the linear
model's RMSE of `r signif(sqrt(mean(residuals(cats.lm)^2)),4)` g.
**Notice**: The previous paragraph had **in-line** R code, which embedded
the output into text.
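Because the same RMSE formula is applied to both models, it could be wrapped in a small helper function. This is only a sketch: `rmse` is a name we made up, and the data and linear model are re-created here so the chunk can run by itself.

```{r}
library(MASS)
data(cats)
cats.lm <- lm(Hwt ~ Bwt, data=cats)
# Root-mean-squared error: square root of the mean squared residual
rmse <- function(model) { sqrt(mean(residuals(model)^2)) }
rmse(cats.lm)
```

The helper relies only on `residuals()`, so the same call should work on `cats.tree` as well.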
Here are the predictions of the tree against the data:
```{r}
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
col=ifelse(cats$Sex=="F","red","blue"))
points(x=cats$Bwt, y=predict(cats.tree), pch=16)
```
EXERCISE: Write a code chunk that plots the data, the predictions
of the tree, _and_ the predictions of the linear model, all in one figure.
```{r}
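plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
     col=ifelse(cats$Sex=="F","red","blue"))
# Tree predictions as filled black dots
points(x=cats$Bwt, y=predict(cats.tree), pch=16)
# Linear-model predictions as a line; the commented-out call would instead
# show them as brown dots at each observed body weight
# points(x=cats$Bwt, y=predict(cats.lm), pch=16, col="brown")
abline(cats.lm)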
```