Analyzing FCQ Results Using Advanced Data Analytics Haoyue Zhang, Lilong Wang, Aaron Nielsen 1st December 2016
Preview

1 Introduction

2 Data Collection and Cleaning

3 Exploratory Data Analysis

4 Least Squares Regression

5 Cluster Analysis

6 Conclusions

INTRODUCTION

Introduction: What is an FCQ?

Students at the University of Colorado campuses evaluate each of their courses and instructors every semester

This survey is called the Faculty Course Questionnaire (FCQ for short)

Evaluations can be administered in-class or online near the end of the term

FCQ results can impact:

Tenure eligibility for tenure-track faculty
Retention of adjunct instructors
Funding for teaching assistants

Quantitative and qualitative data is gathered from FCQs

Quantitative data: Students rate instructors and courses in eight categories (1 = lowest, 6 = highest) and report how many hours per week they spent on course-related work
Qualitative data: Students are allowed to comment on the most effective and least effective aspects of the course

Introduction: What can we learn from individual FCQs?

Individual numerical results aren’t particularly useful

Individual comments are potentially very helpful

Introduction: What can we learn from course-aggregated FCQ results?

Course-aggregated data is certainly easier to interpret than individual results

Many questions still remain!

Do recitations typically get higher or lower scores than lectures?
Are summer classes rated higher than fall/spring courses?
Does the level of the course affect scores?
How much do factors like availability and challenge of the course affect overall course scores?
Do tenure-track professors score higher than teaching assistants?
Are easy classes rated higher?

Introduction: Goals of Project

Statistics and Modeling (Lilong and Haoyue):

Collect and clean FCQ data from Math courses from the previous two years
Conduct Tukey-inspired Exploratory Data Analysis
Use ggplot2 package in R to effectively display features of the data set
Use Regression Analysis to analyze the factors that contribute most to overall course scores
Perform cluster analysis to determine if certain factors are associated with higher overall course scores

Management (Aaron):

Organize project
Set project deadlines and benchmarks
Conduct weekly meetings
Mentor on various statistical methods
Troubleshoot when issues arise

Introduction: Some Disclaimers

Higher FCQ scores are not necessarily indicative of better teaching!

Names of instructors were immediately removed from data set

We intentionally did not analyze any instructor-level results

Some studies suggest course evaluations are biased against women and minorities

Demographic information about the instructors was, in general, not included in this analysis
One exception: we did include whether each class was taught by permanent faculty, adjunct faculty, or a teaching assistant

DATA COLLECTION AND CLEANING

Data Collection

https://fcq.colorado.edu/ucddata.htm (or google “ucd fcq”)

Aaron’s FCQ results

FCQ results from the University of Colorado Denver’s Mathematics Department were collected

Data was obtained for Fall 2014 – Spring 2016

Results from 510 classes/sections were collected

This includes lectures, recitations, online classes, in-person classes, classes at the Denver campus, and classes at the Beijing campus

Each class/section is represented with one row in our data set

Class averages are given for each question, as opposed to individual level results

Data Cleaning

All classes with no returned forms were removed from the data set (N/A results)

Some recitations weren’t coded properly, so this had to be fixed

Some recitation results weren’t properly exported from UCD database

Results from recitations and courses in Beijing were removed after a preliminary analysis
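
The cleaning steps above amount to two row filters. A minimal Python sketch follows, assuming one dictionary per class/section. The field names (`kind`, `campus`, `forms_returned`, `course_rating`) and the four rows are made up for illustration; they are not the layout of the real FCQ export, and the project itself lists R among its tools.

```python
# Hypothetical rows: one dictionary per class/section (all values invented).
sections = [
    {"course": "A", "kind": "lecture", "campus": "Denver", "forms_returned": 25, "course_rating": 4.7},
    {"course": "B", "kind": "recitation", "campus": "Denver", "forms_returned": 18, "course_rating": 4.5},
    {"course": "C", "kind": "lecture", "campus": "Beijing", "forms_returned": 30, "course_rating": 5.3},
    {"course": "D", "kind": "lecture", "campus": "Denver", "forms_returned": 0, "course_rating": None},
]


def drop_unreturned(rows):
    """Remove sections with no returned forms (their ratings are N/A)."""
    return [r for r in rows if r["forms_returned"] > 0 and r["course_rating"] is not None]


def drop_recitations_and_beijing(rows):
    """Remove recitations and sections taught at the Beijing campus."""
    return [r for r in rows if r["kind"] != "recitation" and r["campus"] != "Beijing"]


rated = drop_unreturned(sections)
analysis_set = drop_recitations_and_beijing(rated)
print(len(sections), len(rated), len(analysis_set))
```

Keeping the two filters separate mirrors the slide: the first runs before any analysis, the second only after the preliminary look at the data.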

EXPLORATORY DATA ANALYSIS

Exploratory Data Analysis

Exploratory data analysis (EDA) is a method for analyzing data sets to summarize their main features or characteristics, typically with visual methods

EDA is useful for determining what the data can tell us without using formal hypothesis testing

John Tukey championed EDA as a method for statisticians to explore data and potentially formulate hypotheses that can lead to new experiments
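
One common Tukey-style summary is the five-number summary behind a boxplot. The Python sketch below computes it with Tukey's hinges (the medians of the lower and upper halves of the sorted data). The ratings are invented, and this is not necessarily the summary the authors produced.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) using Tukey's hinges."""
    if not values:
        raise ValueError("need at least one value")
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # when n is odd, the middle value belongs to both halves
    lower, upper = xs[:half], xs[n - half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


# Invented class-average ratings on the 1 to 6 FCQ scale.
ratings = [3.2, 4.1, 4.4, 4.6, 4.7, 4.8, 5.0, 5.3, 5.6]
print(five_number_summary(ratings))
```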

Initial Exploratory Data Analysis Results

FCQ data comes from all types of classes including:

Academic year and summer classes
On-campus and online
Lectures and recitations
Classes in Denver and classes in Beijing

Course Ratings

n       506
Mean    4.640
Median  4.700
SD      0.711
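
The four statistics reported here can be computed with Python's standard library. This is a sketch on invented class averages, assuming the SD is the sample (n − 1) standard deviation; the slides do not say which form was used.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return n, mean, median, and sample SD, rounded to three decimals."""
    return {
        "n": len(values),
        "Mean": round(mean(values), 3),
        "Median": round(median(values), 3),
        "SD": round(stdev(values), 3),  # stdev divides by n - 1
    }


# Invented class-average course ratings, one value per class/section.
class_averages = [4.2, 4.7, 5.1, 3.9, 4.8, 5.4]
print(summarize(class_averages))
```

The same helper can be applied to any subset of rows (for example, online classes only) to build the group comparisons on the following slides.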

Course Ratings: In-class vs. Online

Statistic  In-Class  Online
n          478       28
Mean       4.659     4.311
Median     4.700     4.300
SD         0.695     0.899

Course Ratings: Lecture vs. Recitation

Statistic  Lecture  Recitation
n          478      28
Mean       4.649    4.486
Median     4.700    4.500
SD         0.721    0.480

Course Ratings: Denver campus vs. Beijing campus

Statistic  Denver  Beijing
n          484     22
Mean       4.614   5.218
Median     4.700   5.300
SD         0.705   0.584
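
A quick arithmetic check on the three two-group tables: each splits the same 506 classes, so the size-weighted average of the two group means should reproduce the overall mean of 4.640. The sketch below uses only the n and Mean values reported on the slides; the helper name `pooled_mean` is introduced here for illustration.

```python
# (n, mean) pairs copied from the three two-group tables.
splits = {
    "In-class vs. online": [(478, 4.659), (28, 4.311)],
    "Lecture vs. recitation": [(478, 4.649), (28, 4.486)],
    "Denver vs. Beijing": [(484, 4.614), (22, 5.218)],
}


def pooled_mean(groups):
    """Size-weighted average of group means."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


for name, groups in splits.items():
    total_n = sum(n for n, _ in groups)
    print(f"{name}: n = {total_n}, pooled mean = {pooled_mean(groups):.3f}")
```

Medians and standard deviations cannot be recombined this way from group summaries alone, which is why only the means are checked.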

Course Ratings

Since we are mostly interested in on-campus lectures at the Denver campus, we have excluded recitations, online classes, and Beijing classes from our analysis.

This leaves 428 classes of FCQ results to analyze from Fall 2014 to Spring 2016.

Course Ratings: Fall, Spring, Summer

Statistic  Fall   Spring  Summer
n          220    183     25
Mean       4.616  4.674   4.632
Median     4.700  4.800   4.800
SD         0.651  0.753   0.720

Course Ratings: Academic Year vs. Summer

Statistic  AY     Summer
n          403    25
Mean       4.642  4.632
Median     4.700  4.800
SD         0.700  0.720

Course Ratings: 2014, 2015, 2016

Statistic  2014   2015   2016
n          100    238    90
Mean       4.675  4.574  4.783
Median     4.700  4.700  4.850
SD         0.616  0.713  0.732

Course Ratings: Undergraduate vs. Graduate Courses

Statistic  Undergrad  Grad
n          386        42
Mean       4.59       5.10
Median     4.70       5.25
SD         0.68       0.67

Course Ratings: By Level

Statistic  Lower UG  Upper UG  Grad
n          265       121       42
Mean       4.60      4.58      5.10
Median     4.60      4.70      5.25
SD         0.62      0.81      0.67

Course Ratings: By Title

Statistic   Professor   Adjunct     TA
n                  98       247     83
Mean             4.85      4.48   4.57
Median           5.00      4.70   4.60
SD               0.73      0.70   0.63

Course Ratings Summary

Spring classes may receive slightly higher course scores than fall and summer classes

There is not much of a difference in course scores between academic year and summer classes

While 2014, 2015, and 2016 have slightly different average course scores, this may be influenced by the semesters included

Graduate classes score much higher than undergraduate classes

No apparent difference in scores between lower and upper level undergraduate courses

Tenure-track faculty score higher than teaching assistants. Teaching assistants score higher than adjunct faculty

Instructor Ratings

n 428

Mean 4.957

Median 5.200

SD 0.786

Other FCQ Results

Correlation Pairs Plot

LEAST SQUARES REGRESSION

Ordinary Least Squares Regression

Our first attempt at modeling involved using least squares regression

The goal of this model was to predict overall course scores based on the other predictor variables:

Availability
Challenge
How much did you learn
Instructor’s respect
Prior interest in subject
Instructor’s effectiveness at encouraging interest
Number of Forms Returned
Percent of Forms Returned (new variable)

Ordinary Least Squares Regression

Ordinary Least Squares Regression (OLS) is a statistical method for estimating unknown parameters in a linear regression model

The goal of OLS is to minimize the sum of squared differences between observed values and predicted values

Ordinary Least Squares Regression

$$y_i = x_i^T \beta + \varepsilon_i$$

$y_i$ is the $i$th average course rating

$x_i$ is a vector of ratings (availability, challenge, etc.) for course $i$

$\beta$ is the vector of unknown parameters

Goal: Minimize $\mathrm{SSE} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} (y_i - x_i^T \beta)^2$
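The minimization above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`solve`, `ols_fit`, `sse`) and the data are made up here, and the slides do not say which software produced the actual fit. The sketch solves the normal equations $(X^T X)\beta = X^T y$, which is one standard way to find the minimizer of SSE.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fit(rows, y):
    """Return beta minimizing SSE; an intercept column of ones is prepended."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(x[j] * x[k] for x in X) for k in range(p)] for j in range(p)]
    xty = [sum(x[j] * yi for x, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def sse(rows, y, beta):
    """Sum of squared errors for a given coefficient vector."""
    total = 0.0
    for r, yi in zip(rows, y):
        fitted = beta[0] + sum(b * v for b, v in zip(beta[1:], r))
        total += (yi - fitted) ** 2
    return total


# Made-up ratings (two predictors per course), generated exactly from
# y = 1.0 + 0.5 * x1 - 0.25 * x2, so the fit should recover those values.
rows = [(4.0, 3.0), (5.0, 2.5), (3.0, 4.0), (4.5, 1.0), (2.0, 5.0), (3.5, 3.5)]
y = [1.0 + 0.5 * a - 0.25 * b for a, b in rows]
beta = ols_fit(rows, y)
print([round(b, 6) for b in beta])
```

Because the made-up responses lie exactly on a plane, the recovered coefficients match the generating values and the SSE is essentially zero; with real FCQ data the SSE would be positive.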

Ordinary Least Squares Regression

The following is the result of fitting an ordinary least squares regression model

Coefficient        Estimate   Std. Error   p-value
(Intercept)           0.323        0.153   0.035
PriorInterest         0.091        0.020   6.3 × 10^-6
InstrEffective        0.362        0.032   < 10^-6
Availability          0.102        0.035   0.004
Challenge            -0.276        0.036   < 10^-6
HowMuchLearned        0.665        0.042   < 10^-6
FormsReturned       -0.0002        0.002   0.903

Number of forms returned wasn’t statistically significant (α = 0.05) and was removed from the model
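To show where a standard error and a p-value of this kind come from, here is a minimal sketch for the single-predictor case. It is not the authors' computation: the function name and data are hypothetical, only one predictor is used, and the p-value uses a normal approximation in place of the t distribution (a rough approximation for the tiny made-up sample here, much closer for a sample of several hundred courses).

```python
import math


def slope_significance(x, y):
    """Fit y = b0 + b1 * x by least squares and test H0: b1 = 0.

    Returns (b1, standard error of b1, approximate two-sided p-value).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                     # residual variance estimate
    se = math.sqrt(s2 / sxx)               # standard error of the slope
    z = b1 / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return b1, se, p


# Made-up data: scores that do not trend with the number of forms returned.
forms_returned = [5, 10, 15, 20, 25, 30, 35, 40]
scores = [5.0, 4.0, 5.0, 4.0, 5.0, 4.0, 5.0, 4.0]
b1, se, p = slope_significance(forms_returned, scores)
print(p < 0.05)   # False: not significant at alpha = 0.05

# Made-up data: scores that rise clearly with a "how much learned" rating.
learned = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 3.2, 4.8]
noise = [0.05, -0.05, 0.02, -0.02, 0.03, -0.03, 0.01, -0.01]
scores2 = [0.5 + 0.8 * v + e for v, e in zip(learned, noise)]
b1_2, se_2, p_2 = slope_significance(learned, scores2)
print(p_2 < 0.05)  # True: significant at alpha = 0.05
```

A predictor whose p-value exceeds the chosen α, as in the first call, is the kind of variable that would be dropped from the model.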

Ordinary Least Squares Regression

Percent of forms returned was added to the model

Coefficient              Estimate   Std. Error   p-value
(Intercept)                 0.342        0.148   0.022
PriorInterest               0.095        0.020   2.1 × 10^-6
InstrEffective              0.366        0.032   < 10^-6
Availability                0.102        0.035   0.004
Challenge                  -0.278        0.036   < 10^-6
HowMuchLearned              0.664        0.041   < 10^-6
PercentFormsReturned      -0.0007       0.0006   0.256

Percent of forms returned wasn’t statistically significant (α = 0.05) and was removed from the model

Ordinary Least Squares Regression

Ordinary least squares regression model:

Coefficient        Estimate   Std. Error   p-value
(Intercept)           0.318        0.147   0.031
PriorInterest         0.091        0.020   4.0 × 10^-6
InstrEffective        0.362        0.032   < 10^-6
Availability          0.103        0.035   0.004
Challenge            -0.275        0.036   < 10^-6
HowMuchLearned        0.664        0.041   < 10^-6

OLS Model Diagnostics

No apparent non-linearity

OLS Model Diagnostics

Potentially heavy tails on the errors

OLS Model Diagnostics

Entry #250 has high leverage and a large residual

Should we remove it?
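As a rough illustration of what "high leverage" means, the sketch below computes leverages for a regression on a single predictor, where $h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}$. The data are invented, the model in the slides has several predictors rather than one, and the cutoff $2p/n$ is a common rule of thumb rather than anything stated in the slides.

```python
def leverages(x):
    """Leverage h_i of each observation in a simple linear regression on x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]


# Made-up predictor values: six typical ratings and one far from the rest.
ratings = [4.8, 5.0, 5.1, 4.9, 5.2, 5.0, 1.0]
h = leverages(ratings)

p = 2                              # fitted parameters: intercept and slope
threshold = 2 * p / len(ratings)   # rule-of-thumb cutoff (an assumption here)
flagged = [i for i, hi in enumerate(h) if hi > threshold]
print(flagged)                     # [6]
```

An observation far from the bulk of the predictor values, like the last one here, can pull the fitted line toward itself, which is why it deserves a closer look before deciding whether to keep it.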

OLS Model Diagnostics

Let’s examine entry #250 a little more closely

It’s an online summer class for which only two people filled out the FCQ

Should we remove it now?

Weighted Least Squares Regression

Instead of removing entry #250, we decided that we should use weighted least squares to account for the different number of forms returned

Weighted Least Squares Regression

$$y_i = x_i^T \beta + \varepsilon_i$$

$y_i$ is the $i$th average course rating

$x_i$ is a vector of ratings (availability, challenge, etc.) for course $i$

$\beta$ is the vector of unknown parameters

Goal: Minimize $\mathrm{SSE} = \sum_{i=1}^{N} w_i (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} w_i (y_i - x_i^T \beta)^2$

Weights, $w_i$, are assigned to be proportional to the number of forms returned
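The effect of the weights can be seen in a small single-predictor sketch. Everything here is illustrative: the function name `wls_line`, the ratings, and the form counts are assumptions, and the real model uses several predictors. With one predictor, the weighted criterion has a closed-form solution built from weighted means.

```python
def wls_line(x, y, w):
    """Fit y = b0 + b1 * x minimizing sum_i w_i * (y_i - b0 - b1 * x_i)^2."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w   # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up data: five courses lying on y = 1 + 0.8 * x, plus one unusual
# course (last entry) for which only two forms were returned.
x = [3.0, 3.5, 4.0, 4.5, 5.0, 1.0]
y = [3.4, 3.8, 4.2, 4.6, 5.0, 5.5]
forms = [30, 25, 40, 35, 20, 2]

ols_b0, ols_b1 = wls_line(x, y, [1.0] * len(x))   # equal weights reproduce OLS
wls_b0, wls_b1 = wls_line(x, y, forms)            # weights proportional to forms

print(round(ols_b1, 3))   # -0.125
print(round(wls_b1, 2))   # 0.54
```

In this toy data the unweighted fit is dragged to a negative slope by the two-response course, while the weighted fit stays much closer to the slope of 0.8 followed by the well-sampled courses.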

Weighted Least Squares Regression

Weighted least squares regression model:

Coefficient        Estimate   Std. Error   p-value
(Intercept)           0.323        0.126   0.011
PriorInterest         0.056        0.018   0.002
InstrEffective        0.361        0.027   < 10^-6
Availability          0.139        0.029   2.9 × 10^-6
Challenge            -0.253        0.032   < 10^-6
HowMuchLearned        0.631        0.037   < 10^-6

$$\hat{y} = 0.323 + 0.056 \cdot \text{PriorInterest} + 0.361 \cdot \text{InstrEffective} + 0.139 \cdot \text{Availability} - 0.253 \cdot \text{Challenge} + 0.631 \cdot \text{HowMuchLearned}$$

$R^2 = 0.84$
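The fitted equation can be applied directly to a set of ratings. The coefficients below are the ones reported in the table above; the function name and the input ratings are hypothetical and serve only to show the arithmetic.

```python
# Coefficients copied from the weighted least squares table.
WLS_INTERCEPT = 0.323
WLS_COEFFICIENTS = {
    "PriorInterest": 0.056,
    "InstrEffective": 0.361,
    "Availability": 0.139,
    "Challenge": -0.253,
    "HowMuchLearned": 0.631,
}


def predict_course_score(ratings):
    """Predicted overall course score for a dict of average ratings."""
    return WLS_INTERCEPT + sum(
        coef * ratings[name] for name, coef in WLS_COEFFICIENTS.items()
    )


# Made-up course: every predictor rated 5.0 on average.
example = {name: 5.0 for name in WLS_COEFFICIENTS}
base = predict_course_score(example)

# Same course, but rated one point more challenging.
harder = predict_course_score({**example, "Challenge": 6.0})

print(round(base, 3))            # 4.993
print(round(harder - base, 3))   # -0.253
```

Holding the other ratings fixed, a one-point increase in Challenge lowers the predicted score by the size of its coefficient, which is how each entry in the table is read.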

WLS Model Diagnostics

No apparent non-linearity

WLS Model Diagnostics

Potentially heavy tails on the errors

WLS Model Diagnostics

Entry #250 no longer has high leverage (since we decreased its weight)

Least Squares Regression Summary

Weighted Least Squares Regression is preferred to Ordinary Least Squares Regression since we were given aggregated data (each course score is an average over a different number of returned forms)

The largest coefficients in the WLS Regression model corresponded to the variables HowMuchLearned ($\beta_5 = 0.631$) and InstrEffective ($\beta_2 = 0.361$)

While challenge was positively correlated with overall course scores, after accounting for the other variables challenge had a negative coefficient in our model ($\beta_4 = -0.253$)
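A predictor can be positively correlated with the response on its own and still receive a negative coefficient once a related predictor is in the model. The sketch below reproduces that pattern on invented data in which "challenge" moves together with "learned"; the variable names echo the slides, but the numbers and helper functions are assumptions made for illustration.

```python
def centered_sum(a, b):
    """Sum of products of deviations of a and b from their means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def two_predictor_slopes(x1, x2, y):
    """Least squares slopes (b1, b2) for y = b0 + b1 * x1 + b2 * x2."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Made-up ratings: challenge tracks learned closely, and the score is built
# as 1.0 + 0.6 * learned - 0.25 * challenge.
learned = [3.0, 4.0, 5.0, 6.0, 3.5, 4.5, 5.5]
offsets = [0.2, -0.2, 0.1, -0.1, 0.3, -0.3, 0.0]
challenge = [l + d for l, d in zip(learned, offsets)]
score = [1.0 + 0.6 * l - 0.25 * c for l, c in zip(learned, challenge)]

marginal = centered_sum(challenge, score)   # sign matches the raw correlation
b_learned, b_challenge = two_predictor_slopes(learned, challenge, score)

print(marginal > 0)            # True: challenge and score rise together
print(round(b_challenge, 3))   # -0.25: negative once learned is held fixed
```

Here the positive raw association arises only because challenging courses also tend to be ones where more is learned; the coefficient isolates the effect of challenge with that held fixed.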

CLUSTER ANALYSIS

Cluster Analysis

Using weighted least squares regression, we were able to explore how overall course scores were affected by other factors (challenge, instructor availability, etc.)

Next, we will use cluster analysis in an attempt to determine whether certain types of classes typically get higher FCQ scores.

Cluster Analysis (or clustering) is often used in statistical data analysis where objects are to be placed into subgroups with similar characteristics

Cluster Analysis: Iris Example

R.A. Fisher famously published a study on Irises in 1936

Petal length and petal width from 150 Irises are plotted

Cluster analysis can help us investigate underlying characteristics in each cluster

How many clusters are there? 2? 3?

Cluster Analysis: Iris Example

It turns out that there are three different species of Irises in the data set, so it would make sense if there are three clusters

Cluster Analysis: FCQ Scores

In the plot to the left, overall course scores and overall instructor scores are given in a scatter plot

What is the “optimal” number of clusters in this data set?

Are there common characteristics (besides course and instructor scores) in the clusters?

K-Means Clustering

k-means clustering is one popular method used to separate data points into $k$ subgroups, that is, to determine the partition $S = \{S_1, S_2, \ldots, S_k\}$

k-means clustering determines the partition by minimizing the within-cluster sum of squares (WSS)

K-Means Clustering

Let $(x_1, x_2, \ldots, x_n)$ be a set of $n$ observations, where each $x_j$ is a $d$-dimensional vector of real numbers

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2,$$ where $\mu_i$ is the $i$th cluster mean

In other words, we want to partition our observations into $k$ subsets in a manner that minimizes the sum of squared distances from each observation to its respective cluster mean
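One common way to search for such a partition is Lloyd's algorithm, which alternates between assigning each point to its nearest center and recomputing each center as its cluster mean. The slides do not say which algorithm or software was used, so the sketch below is an assumption for illustration only; the points are invented (course score, instructor score) pairs, and the procedure finds a local minimum of WSS, not necessarily the global one.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (labels, centers, wss)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)   # k distinct observations as starting centers
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:      # assignments stopped changing
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:               # keep the old center if a cluster empties
                centers[i] = mean_point(members)
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wss


# Made-up (course score, instructor score) pairs forming two obvious groups.
points = [(4.0, 4.2), (4.1, 4.0), (3.9, 4.1), (5.5, 5.6), (5.6, 5.4), (5.4, 5.5)]
labels, centers, wss = kmeans(points, k=2)
print(labels)
print(round(wss, 3))
```

Because the result depends on the starting centers, it is usual to run the procedure from several random starts and keep the partition with the smallest WSS.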

Determining Number of Clusters

The k-means clustering algorithm requires that we know the number of clusters

One method for determining the “optimal” number of clusters is called the Elbow Method

Within-cluster sum of squares (WSS) is calculated for each value of k over a specified range

The Elbow Method is a heuristic method that selects the “optimal” number of clusters k by choosing the value of k such that WSS doesn’t decrease much if the number of clusters is increased

Look for an “elbow” in a plot of WSS versus k
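The elbow is usually picked by eye, but the idea can be sketched numerically. The version below scores each candidate k by the second difference of the WSS curve, which is large where a steep drop turns into a shallow one. This scoring rule is one possible automation chosen for illustration, not the procedure used in the slides, and the WSS values are invented.

```python
def elbow_k(wss_by_k):
    """Pick k at the sharpest bend of the WSS curve.

    wss_by_k maps consecutive integer values of k to their WSS.
    The bend at k is measured by WSS(k-1) - 2 * WSS(k) + WSS(k+1).
    """
    ks = sorted(wss_by_k)
    best_k, best_bend = None, float("-inf")
    for k in ks[1:-1]:   # the endpoints lack a neighbor on one side
        bend = wss_by_k[k - 1] - 2 * wss_by_k[k] + wss_by_k[k + 1]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Invented WSS values: large drops up to k = 3, small drops afterwards.
wss_by_k = {1: 120.0, 2: 60.0, 3: 25.0, 4: 21.0, 5: 18.5, 6: 17.0}
print(elbow_k(wss_by_k))   # 3
```

A numerical rule like this is best treated as a cross-check on the plot, since WSS curves from real data often bend gradually and admit more than one reasonable choice.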

Determining Number of Clusters

In the plot to the left, the elbow method suggests k = 3, as that is approximately where the “elbow” occurs

Clustering Results

Clustering Results: Instructor’s Respect

Clustering Results: Instructor’s Respect

Haoyue Zhang, Lilong Wang, Aaron Nielsen Analyzing FCQ Results Using Advanced Data Analytics 1st December 2016 71/91

Page 73: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Prior Interest

Page 74: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Prior Interest

Page 75: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Challenge

Page 76: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Challenge

Page 77: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: How Much Learned

Page 78: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: How Much Learned

Page 79: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Instructor Encourages Interest

Page 80: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Instructor Encourages Interest

Page 81: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Availability

Page 82: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Availability

Page 83: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Results: Average Number of Hours for Course

Page 84: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Summary

The elbow method indicated that k = 3 was the optimal number of clusters

The clusters were named “low”, “mid”, and “high”

Low instructor-respect scores were primarily associated with the “low” cluster

Prior-interest scores were similar across the “high”, “mid”, and “low” clusters

Low challenge scores were primarily associated with the “low” cluster
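
The elbow check can be sketched in a few lines. This is a minimal illustration in Python, not the authors' code: the data points, the farthest-point starting centers, and the names `POINTS`, `init_centers`, and `kmeans_wss` are all assumptions made for the example. It runs k-means for k = 1 to 6 and prints the within-cluster sum of squares (WSS) for each k.

```python
# Hypothetical (course score, instructor score) pairs forming three tight groups.
CENTERS = [(2.5, 2.8), (4.0, 4.2), (5.4, 5.5)]
OFFSETS = [(-0.1, 0.0), (0.1, 0.0), (0.0, -0.1), (0.0, 0.1), (0.0, 0.0)]
POINTS = [(cx + dx, cy + dy) for cx, cy in CENTERS for dx, dy in OFFSETS]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def init_centers(points, k):
    """Deterministic farthest-point start (an illustrative choice, not from the slides)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_wss(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


wss = {k: kmeans_wss(POINTS, k) for k in range(1, 7)}
for k, value in wss.items():
    print(k, round(value, 3))
```

On this toy data the WSS falls sharply up to k = 3 and barely changes afterwards; that bend is the "elbow" one would look for when plotting WSS against k.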

Page 85: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Clustering Summary

Courses where students reported learning a lot were most associated with the “high” cluster

High scores for instructor encouragement of interest were most associated with the “high” cluster

High availability scores were most associated with the “high” cluster

Most students reported studying 7-9 hours regardless of cluster; however, the “low” cluster contains many students who spent more time studying than students in the other two clusters

Page 86: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Conclusions

During the Exploratory Data Analysis phase, the following were noted:

Overall course scores are, on average, about 0.3 points lower than overall instructor scores

Recitations, online classes, and Beijing classes appeared to be distributed differently than standard, on-campus lectures

There doesn’t appear to be much of a seasonal effect on overall scores

Tenure-track professors typically score higher on overall course scores than teaching assistants and adjunct faculty

Graduate courses tend to receive higher scores than undergraduate courses

Page 87: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Conclusions

Least squares regression was used in an attempt to model overall course scores as a function of other FCQ questions (availability, challenge, etc.)

Weighted least squares regression is preferred to ordinary least squares regression for this dataset because the data are aggregated

The largest coefficients in the weighted least squares regression model corresponded to how much the student reported learning and how effective the instructor was at encouraging interest

While challenge was positively correlated with overall course score, it had a negative regression coefficient after accounting for other factors
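
One common reason to weight aggregated data: if each row is the mean of n_i student responses, its variance shrinks like 1/n_i, so rows built from more responses deserve more weight. The sketch below assumes weights equal to the number of responses per section, which the slides do not state; the section data and the function name `wls_fit` are hypothetical, and only a single predictor is used to keep the arithmetic visible.

```python
def wls_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical sections: (responses, mean "how much learned", mean course score).
sections = [(40, 4.0, 4.1), (35, 4.5, 4.6), (50, 5.0, 5.1), (3, 5.5, 3.0)]
n = [s[0] for s in sections]
learned = [s[1] for s in sections]
course = [s[2] for s in sections]

# Equal weights reproduce ordinary least squares.
ols_intercept, ols_slope = wls_fit(learned, course, [1.0] * len(sections))
# Weights proportional to the number of responses (assumed weighting scheme).
wls_intercept, wls_slope = wls_fit(learned, course, n)

print("OLS slope:", round(ols_slope, 3))
print("WLS slope:", round(wls_slope, 3))
```

In this toy data the single three-response section drags the unweighted slope negative, while the weighted fit stays close to the trend of the larger sections.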

Page 88: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Conclusions

Clustering was used to group course and instructor scores into three subgroups. These subgroups were studied for common characteristics

High instructor and course scores are associated with high scores on instructor respect, how much was learned, and instructor encouraging interest

Low instructor and course scores are associated with low scores in instructor respect, challenge, and how much was learned

Amount of time spent on course work was similarly distributed for all three clusters

Page 89: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Conclusions

If an instructor is aiming for high scores on FCQ results, the following are suggested:

Be available to help students as much as possible

Always be respectful of students

Teach students as much as possible without making the course too challenging

If all else fails, teach a graduate course!

Page 91: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Possible Future Work

It could be helpful to have the number of each grade given in a class (how many A’s does each teacher give?)

Investigate potential bias against women and minorities

Investigate if other departments follow a similar pattern

Page 93: Analyzing FCQ Results Using Advanced Data Analytics - Using Advanced Data Analyticsnielsen/analyzing-fcq-results.… ·  · 2017-02-22Exploratory Data Analysis Course Ratings Instructor

Questions?
