Concepts Review for the Final Exam

Below is information to keep track of for the concepts portion of the final exam. You will be expected to be familiar with these terms.

R for Data Science

  1. What does it mean to have a tidy dataset?

  2. In ggplot2, what do aes, geom, facet_wrap, and jitter refer to, and what do they accomplish?

  3. The dplyr package provides several functions for data wrangling. Know the difference between filter, select, mutate, summarize, and arrange.

  4. The tidyr package contains functions for wrangling datasets. Describe the difference between spread, gather, separate, and unite, and also the purpose of join.

  5. How do we use %>%?

Basics of R and the Course Setup

  1. What does an RProject do (i.e., *.Rproj)?

  2. What is this operator used for: <-

  3. In addition to the instructor's pleasure in causing you distress, what is the point of using Git and GitHub for assignments?

  4. What’s the difference between .r and .Rmd files?

  5. What pointers do you have for making a script more readable and reproducible? How can you improve your code writing?

DSUR Material

  1. How do you interpret a standardized beta coefficient? How is it different from an unstandardized, raw regression coefficient?

  2. What are three ways that allow you to assess the overall fit of your regression model?

  3. What is our rule of thumb for interpreting the significance of a regression coefficient?

  4. Practice interpreting a confidence interval for an unstandardized regression coefficient.

  5. Practice interpreting all the regression coefficients for a multiple regression with and without an interaction term.

  6. What are the three functions to calculate a correlation coefficient in R? How are they different?

  7. How do we test the following assumptions in R: homoscedasticity, normality of errors, and independence of errors?

  8. How is logistic regression different from a linear regression?

  9. Be able to interpret the coefficients for a multiple logistic regression model with and without an interaction term.

  10. What is complete separation in a logistic regression and why is it a problem?


