36-601: Perspectives on Statistical Practice: Intro to SAS Programming
Documents
601 Syllabus
Lectures
Lecture 1: Introduction
Lecture 2: Inputting Data
Lecture 2 (Cont'd): The Data Step
Lecture 3: Procedures (PROCs)
Lecture 4: Macros
Lecture 5: Graphing
Lecture 6: Reports
Lecture 7: Logistic Regression
Lecture 8: Data Management
Lecture 9: SQL
Lecture 10: PROC COMPARE
Lecture 11: ANOVA
Lecture 12: ANOVA and Power
Lecture 13: Simulation and Power
Lecture 13: In-class assignment solution
Lecture 14: The Bootstrap
Lecture 15: Advanced Macros
Lab 1: Regression
Lab 1 Instructions
Blood Level Data
Sample Code
Lab 2: Data Management
Lab 2 Instructions
Lab 2 Data
My Code
Lab 3: Data Management with PROC SQL
Lab 3 Instructions
Lab 3 Data
My Code
Lab 4: Principal Components/Factor Analysis
protein.dat
track.dat
- Data: USArrests.dat
- Standardize the dataset BEFORE doing PCA using the appropriate PROC
- Perform PCA
- Plot each state by the first two principal components using appropriate labels
- Interpret results
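The standardization step in the list above amounts to centering each variable at 0 and scaling it to unit standard deviation before PCA. The lab expects this to be done with a SAS PROC; the following is only a minimal Python sketch of the arithmetic. The column names and values are hypothetical placeholders, not the USArrests data, and the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 divisor), an assumption
    return [(v - m) / s for v in values]


# Hypothetical columns; the real lab data are in USArrests.dat.
columns = {
    "var_a": [1.0, 2.0, 3.0],
    "var_b": [10.0, 20.0, 60.0],
}

# Standardize every column independently so no variable dominates the PCA
# simply because it is measured on a larger scale.
standardized = {name: standardize(col) for name, col in columns.items()}
```

After this step every column has mean 0 and standard deviation 1, so the principal components reflect correlations rather than raw measurement units.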
Lab 5: Logistic Regression/LDA/QDA
spamclass.sas7bdat
emails.sas7bdat
Homework
Hw 1:
- Read in the keyboard study data for the four schools (will be emailed to you)
- Create one large dataset with all four schools
- Add a variable signifying the school
- Create four school dummy variables
- Create numeric pre-test and post-test variables from the character variables (use the numeric part for observations with < and >)
- Create two variables, one with the change in score from pre to post, the other with percent change in score
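The variable-creation steps above can be sketched as follows. The homework is meant to be done in a SAS DATA step; this Python version only illustrates the logic. The school names, record layout, and helper names (`to_numeric`, `change_and_percent`, `school_dummies`) are hypothetical, and treating a zero pre-test score as an undefined percent change is an assumption.

```python
SCHOOLS = ["A", "B", "C", "D"]  # hypothetical school labels


def to_numeric(score):
    """Convert a character score such as '<5' or '>60' to a number,
    keeping only the numeric part as the assignment describes."""
    return float(score.strip().lstrip("<>").strip())


def change_and_percent(pre, post):
    """Return (change, percent change) from pre-test to post-test."""
    change = post - pre
    # Percent change is undefined when the pre-test score is 0 (assumption).
    percent = 100.0 * change / pre if pre != 0 else None
    return change, percent


def school_dummies(school):
    """Return one 0/1 indicator per school."""
    return {f"school_{s}": int(school == s) for s in SCHOOLS}


# Hypothetical records: (school, pre-test as text, post-test as text).
raw = [("A", "<5", "12"), ("B", "40", ">60")]

rows = []
for school, pre_text, post_text in raw:
    pre, post = to_numeric(pre_text), to_numeric(post_text)
    change, percent = change_and_percent(pre, post)
    rows.append({"school": school, "pre": pre, "post": post,
                 "change": change, "pct_change": percent,
                 **school_dummies(school)})
```

Each row of `rows` then carries the school label, the four dummy variables, the numeric scores, and both change variables.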
Hw 2:
- Perform EDA on your selected tests for all four schools, but do NOT do any formal statistical analyses.
- Write up the results of your EDA. Include a short summary of the study and the specific test(s) you analyzed, along with a summary of your EDA and the relevant tables and figures. Also state what you would expect to find from a formal statistical analysis, based on the results of the EDA. The report should be no longer than 2 pages, including tables and figures.
Hw 3: Fully comment the following code: hw3.sas
Hw 4:
- Part I.
- Write and call a macro that reads in the keyboard study data for each school. In addition to reading in the data, the macro should create a variable SCHOOL that takes on the value of the school from which the data came and also create a dummy variable for that school. HINT: The EXCEL workbook sheet, the value of SCHOOL, and the dummy variable should all be the same.
- Create one large dataset with all four schools. "Fix" the school dummy variables created by the macro.
- Create numeric pretest, posttest, change, and percent change variables as in Hw 1.
- Read in the demographics data and MERGE it with the test data.
- Create an AGE variable equal to the age of the student in years on January 15, 2010.
- Part II.
- Perform a one-sample t-test for each of the four schools on the CHANGE variable.
- Perform a paired t-test of the PRETEST and POSTTEST variables for each of the four schools.
- Perform a two-sample t-test comparing the appropriate schools.
- Perform two linear regressions, regressing CHANGE on the appropriate paired schools using dummy variables. Make sure to perform the appropriate diagnostics and interpret your results.
- Repeat the above analyses using covariates. You may need to convert character variables to numeric dummy variables.
- Compare the results of all your analyses.
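For the AGE variable in Part I, one reading is age in completed years as of January 15, 2010. The assignment does not say whether fractional years are wanted, so the whole-year convention below is an assumption, and the function name is hypothetical. This is a Python sketch of the date logic only, not the SAS solution.

```python
from datetime import date

REFERENCE_DATE = date(2010, 1, 15)  # the date named in the assignment


def age_in_years(birth_date, on=REFERENCE_DATE):
    """Age in completed years on the reference date (assumed convention)."""
    # Has the birthday already occurred in the reference year?
    had_birthday = (on.month, on.day) >= (birth_date.month, birth_date.day)
    return on.year - birth_date.year - (0 if had_birthday else 1)
```

A student born on January 15, 2000 is counted as 10 under this convention, while one born a day later is still 9.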
Data
smoker.sas7bdat
smoker.csv
smoker.txt
smoker_flat.txt
smoker.xls
Prien-logit.dat
p_dat.csv
p_gender.csv
Readings
SQL Paper
Final Project
Instructions: final-project.pdf
Data for the project:
arm.sas7bdat
schedule.sas7bdat
demog.sas7bdat
treatment.sas7bdat