---
title: Cats!
author: CRS
date: 26 August 2019 (36-462, Lecture 1)
---

Let's play with cats!

```{r}
library(MASS)
data(cats)
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
     col=ifelse(cats$Sex=="F","red","blue"))
```

Our first **code chunk** loads a library, loads some data, makes a plot, _and embeds the plot in our document_. If we change the data, or change what we do to it, the output changes along with it. (The default is for each code chunk to also show the code --- you can turn that off if you like.)

We can fit a linear model by ordinary least squares here:

```{r}
cats.lm <- lm(Hwt ~ Bwt, data=cats)
```

**Notice**: we don't have to re-load the data --- R Markdown keeps its own history/environment from chunk to chunk.

But how do we see the results?

```{r}
summary(cats.lm)
```

This looks ugly --- there must be a better way!

```{r}
library(knitr) # For some R Markdown-related utilities
kable(summary(cats.lm)$coefficients)
```

That many decimal places is distracting _and_ nonsensical --- turn that down.

```{r}
kable(signif(summary(cats.lm)$coefficients, 3))
```

Add the regression line to the plot:

```{r}
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
     col=ifelse(cats$Sex=="F","red","blue"))
abline(cats.lm)
```

Not bad, but not outstanding. Try something else:

```{r}
library(tree)
cats.tree <- tree(Hwt ~ Bwt+Sex, data=cats)
```

```{r}
plot(cats.tree)
text(cats.tree)
```

The tree has a root-mean-squared-error of `r signif(sqrt(mean(residuals(cats.tree)^2)), 4)` g, which is (a little bit) smaller than the linear model's RMSE of `r signif(sqrt(mean(residuals(cats.lm)^2)),4)` g.

**Notice**: The previous paragraph had **in-line** R code, which embedded the output into text.
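The two in-line expressions above repeat the same formula, so it may be tidier to wrap it in a small function. The chunk below is only a sketch: `rmse()` is a hypothetical helper written for this example, not a function from any package, and the chunk re-loads the data and re-fits the linear model purely so that it can be run on its own.

```{r}
library(MASS)                         # provides the cats data
data(cats)
cats.lm <- lm(Hwt ~ Bwt, data=cats)   # same fit as in the earlier chunk

# Hypothetical helper: in-sample root-mean-squared error of a fitted model
rmse <- function(model) {
  sqrt(mean(residuals(model)^2))
}

lm.rmse <- rmse(cats.lm)
signif(lm.rmse, 4)                    # same value as the in-line expression
```

Since the in-line code already calls `residuals()` on `cats.tree`, `rmse(cats.tree)` should work the same way, and either call could be used in-line in place of the longer expression.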
Here are the predictions of the tree against the data:

```{r}
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
     col=ifelse(cats$Sex=="F","red","blue"))
points(x=cats$Bwt, y=predict(cats.tree), pch=16)
```

EXERCISE: Write out a code chunk which would plot the data, the predictions of the tree, _and_ the predictions of the linear model, all in one figure.

```{r}
plot(Hwt ~ Bwt, data=cats, xlab="Body weight (kg)", ylab="Heart weight (g)",
     col=ifelse(cats$Sex=="F","red","blue"))
# Tree predictions as solid black points
points(x=cats$Bwt, y=predict(cats.tree), pch=16)
# Alternative for the linear model: its predictions as brown points
# points(x=cats$Bwt, y=predict(cats.lm), pch=16, col="brown" )
# Linear model predictions as a line
abline(cats.lm)
```
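The exercise chunk shows the linear model in two ways: `abline(cats.lm)` and the commented-out `points()` call. A quick check, sketched below, suggests these are the same predictions. The name `by.hand` is introduced only for this example, and the chunk re-fits the model so that it can be run on its own.

```{r}
library(MASS)                         # provides the cats data
data(cats)
cats.lm <- lm(Hwt ~ Bwt, data=cats)   # same fit as in the earlier chunk

# Intercept + slope * body weight, computed directly from the coefficients;
# this is the line that abline(cats.lm) draws, evaluated at each cat's weight
by.hand <- coef(cats.lm)[1] + coef(cats.lm)[2] * cats$Bwt

# Compare with the fitted values that the points() call would plot
same.predictions <- all.equal(unname(by.hand), unname(predict(cats.lm)))
same.predictions
```

If the check comes back `TRUE`, the brown points would all sit on the regression line, so drawing either one is enough to show the linear model's predictions.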