The Q-matrix method: A new artificial intelligence tool for data mining
Dr. Tiffany Barnes
Kennedy 213, [email protected]
PhD - North Carolina State University
Sep 10, 2004

Overview
  Introduction
  Adaptive Teaching and Data Mining
  Student Model Extraction
  Conclusions & Future Work

Research challenge
  Turn the computer into a private tutor
    Diagnose and correct misconceptions
    Diagnosis tolerates careless errors & guesses
  Build a scientific approach to improving computer-based education
    Build in fault tolerance, robustness
    Optimize for student performance
    Optimize teaching strategies for effectiveness

The Problem
  Students take a tutorial and quiz online
  Determine what students know
  Redirect students to new/repeat material

Adaptive Tutorial Flow
  [Figure: flow diagram. Components: student, Question Engine, Diagnostic Engine, Concept Model, Teaching Strategy. Labeled steps: ask questions, student responds, determine concept state, determine learning path, select new material.]

Data mining for knowledge
  Assume contents affect behavior
  [Figure: student. Contents: unknown. Behavior: known.]

Knowledge & student model
  [Figure: diagram relating concepts, tutorial questions, student responses, and student concepts.]
  Goal: Mine to extract student concepts

Data mining & adaptive teaching
  Problem understanding
    Effective direction of student learning
  Data understanding
    Data from online tutorials
  Data preparation
    Select relevant variables
  Modeling: Q-matrix, cluster, factor
  Evaluation of results
    Misconceptions diagnosed?
  References: Data Mining Server @ http://dms.irb.hr/tutorial

How the model works
  Student response: 11100
  Predicted responses:
    01100  Err: 1
    01101  Err: 2
    11100  Err: 0
    11111  Err: 2
  Match: 11100  Err: 0
  Q-matrix (tutorial & questions):
    0 0 0 1 1
    1 0 0 1 0
  Student understands Concept 2 but not Concept 1.
  Teaching Strategy

How the model works-2
  Concept state: a bit string that describes understanding
    Concept state 01: understands concept 2 but not concept 1
  Q-matrix: concepts v. questions
  Each state has an “ideal response vector” (IDR) computed from the Q-matrix

Binary Q-matrix example

        q1  q2  q3  q4  q5
  Con1   0   0   0   1   1
  Con2   1   0   0   1   0

  Concept State   IDR
  00              01100
  01              11100
  10              01101
  11              11111
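
The IDR table above can be reproduced with a short sketch. It assumes the usual binary reading of the Q-matrix: a question is answered correctly only when the state contains every concept the question requires. The names `Q_MATRIX`, `ideal_response`, and `idr_table` are illustrative, not from the slides.

```python
from itertools import product

# Binary Q-matrix from the example: rows are concepts, columns are q1..q5.
Q_MATRIX = [
    [0, 0, 0, 1, 1],  # Con1
    [1, 0, 0, 1, 0],  # Con2
]


def ideal_response(q_matrix, state):
    """Return the ideal response vector (IDR) for a binary concept state.

    Assumption: a question is answered correctly (1) only if the state
    contains every concept that the Q-matrix marks for that question.
    """
    return [
        int(all(s >= q for s, q in zip(state, column)))
        for column in zip(*q_matrix)  # iterate over questions
    ]


def idr_table(q_matrix):
    """Map each concept state (bit string) to its IDR (bit string)."""
    table = {}
    for state in product([0, 1], repeat=len(q_matrix)):
        key = "".join(map(str, state))
        table[key] = "".join(map(str, ideal_response(q_matrix, state)))
    return table
```

Calling `idr_table(Q_MATRIX)` yields one IDR per concept state, with the first bit of the state standing for Con1 and the second for Con2.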

Research questions
  Are Q-matrix models interpretable?
  What factors affect Q-matrix extraction?
  How well does the Q-matrix method compare with other data mining methods?

Results on simulated students
  Brewer tested 2 Q-matrix extraction methods based on ideal students + noise in ideal response vectors
  The Q-matrix method needs few students for high noise tolerance; factor analysis needs many more
  References: Brewer 1996. NCSU Masters Thesis.

Student model extraction
  Q-matrix, factor, and cluster models
  Compared for error on student data sets
  Q-matrix and cluster also compared by maps and by cluster convergence

Q-matrix model
  Assumes concepts underlie questions
  Students are in “concept states” C:
    C1 = 1: understands concept 1
    C2 = 0: doesn’t get concept 2
  For each state, compute IDR
  Assign students to state with closest IDR
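
As a minimal sketch of the assignment step, the code below places a student in the concept state whose IDR is closest to the observed response. It assumes a binary Q-matrix and Hamming distance as the error measure, which matches the "Err" counts in the earlier example; the function names are illustrative.

```python
from itertools import product


def ideal_response(q_matrix, state):
    """IDR for a binary state: 1 where the state covers all required concepts."""
    return [
        int(all(s >= q for s, q in zip(state, column)))
        for column in zip(*q_matrix)
    ]


def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_state(q_matrix, response):
    """Return (state, idr, error) for the concept state with the closest IDR.

    Ties are broken in favor of the state enumerated first (an assumption).
    """
    best = None
    for state in product([0, 1], repeat=len(q_matrix)):
        idr = ideal_response(q_matrix, state)
        err = hamming(idr, response)
        if best is None or err < best[2]:
            best = (state, idr, err)
    return best
```

With the example Q-matrix, the response 11100 lands in state 01 with zero error.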

Q-matrix creation
  Until convergence criterion met:
    1. Increment number of concepts
    2. Create random q-matrix
    3. Fill concept states & compute error
    4. Vary q-matrix
    5. Fill concept states & compute error
    6. Repeat steps 4-5 until error not improving
    7. Repeat steps 2-6 to avoid local minima
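
Steps 2-7 can be sketched as a hill climb with random restarts for a fixed number of concepts. This is an illustration under assumptions, not the extraction code used in the study: the q-matrix is binary, "vary" means flipping one bit at a time, and error is the summed Hamming distance from each response to its nearest IDR. Step 1 (the outer loop that increments the number of concepts) is left out, and every function name here is hypothetical.

```python
import random
from itertools import product


def ideal_response(q_matrix, state):
    """IDR for a binary state: 1 where the state covers all required concepts."""
    return [
        int(all(s >= q for s, q in zip(state, column)))
        for column in zip(*q_matrix)
    ]


def total_error(q_matrix, responses):
    """Sum over students of the Hamming distance to the nearest IDR (steps 3, 5)."""
    idrs = [
        ideal_response(q_matrix, state)
        for state in product([0, 1], repeat=len(q_matrix))
    ]
    return sum(
        min(sum(a != b for a, b in zip(idr, r)) for idr in idrs)
        for r in responses
    )


def hill_climb(q_matrix, responses):
    """Flip single bits for as long as some flip lowers the error (steps 4-6)."""
    q = [row[:] for row in q_matrix]
    best = total_error(q, responses)
    improved = True
    while improved:
        improved = False
        for c in range(len(q)):
            for j in range(len(q[0])):
                q[c][j] ^= 1
                err = total_error(q, responses)
                if err < best:
                    best, improved = err, True
                else:
                    q[c][j] ^= 1  # undo a flip that did not help
    return q, best


def extract_q_matrix(responses, num_concepts, restarts=10, seed=0):
    """Restart from random q-matrices and keep the best result (steps 2, 7)."""
    rng = random.Random(seed)
    num_questions = len(responses[0])
    best_q, best_err = None, None
    for _ in range(restarts):
        start = [
            [rng.randint(0, 1) for _ in range(num_questions)]
            for _ in range(num_concepts)
        ]
        q, err = hill_climb(start, responses)
        if best_err is None or err < best_err:
            best_q, best_err = q, err
    return best_q, best_err
```

The climb always terminates because each accepted flip strictly lowers a non-negative integer error. The restarts address local minima only heuristically.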

Factor analysis model
  Each tutorial question is a variable
  Create covariance matrix for the variables
  Derive eigenvectors/values to explain most of the variance in the covariance matrix
  Assumes that linear combinations of the variables will be able to explain the variables
  Eigenvectors ROTATED

Cluster analysis model
  Answer vectors as points in plane
  Iterate until convergence:
    Choose random seeds from data set
    Assign vectors to nearest seed
    Set new seeds to cluster medians
  Similar to the q-matrix method, except that there the seeds are Ideal Response Vectors
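
The clustering loop can be sketched as a small k-medians routine. The details below are assumptions for illustration: distance is Hamming distance, the median is taken per question with `statistics.median_low`, initial seeds are drawn from the distinct answer vectors, and an empty cluster keeps its old seed. `k` must not exceed the number of distinct vectors.

```python
import random
from statistics import median_low


def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def k_medians(vectors, k, seed=0, max_iter=100):
    """Cluster answer vectors around k seeds; return (seeds, labels)."""
    rng = random.Random(seed)
    # Choose the initial seeds at random from the distinct vectors in the data.
    seeds = rng.sample(sorted(set(map(tuple, vectors))), k)
    labels = []
    for _ in range(max_iter):
        # Assign each vector to its nearest seed (ties go to the lowest index).
        labels = [
            min(range(k), key=lambda i: hamming(v, seeds[i])) for v in vectors
        ]
        new_seeds = []
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:
                # New seed: the median answer for each question in the cluster.
                new_seeds.append(tuple(median_low(col) for col in zip(*members)))
            else:
                new_seeds.append(seeds[i])  # keep the seed of an empty cluster
        if new_seeds == seeds:
            break  # converged: the seeds no longer move
        seeds = new_seeds
    return seeds, labels
```

Unlike IDRs, these seeds are free to settle anywhere the data pulls them, since no q-matrix constrains them.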

Q-matrix vs. Factor Analysis
  CFA generated 4 factors/matrix
  Compared to q-matrix with 4 concepts
  Factor matrix converted to 0/1
    Threshold of 0.3 -> 1, less -> 0
  Factor matrix used as q-matrix
  Error computed for both
  Q-matrix performed significantly better (at least 19% less error/student) on all 14 problems
  Smallest difference in performance when there is a large amount of variance in student answers
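
The 0/1 conversion is simple enough to show directly. In this sketch the loadings are made-up numbers, and treating a value of exactly 0.3 as 1 is an assumption, since the slide does not say how the boundary or negative loadings were handled.

```python
def factor_to_q_matrix(loadings, threshold=0.3):
    """Convert a factor matrix to a binary q-matrix.

    Values at or above the threshold become 1 and all others become 0.
    Including the boundary value itself is an assumption.
    """
    return [
        [1 if value >= threshold else 0 for value in row]
        for row in loadings
    ]


# Hypothetical loadings for 2 factors over 3 questions (not data from the study).
example_loadings = [
    [0.71, 0.05, 0.30],
    [-0.40, 0.29, 0.90],
]
example_q = factor_to_q_matrix(example_loadings)
```

The resulting binary matrix can then be scored against student responses in the same way as an extracted q-matrix.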

  [Figure: Q-matrix and factor errors per student. Errors/student (axis 0 to 3) for the Factor and Q-matrix models on Binq1, Binq2, Binq3, Count, and Pf1 through Pf10.]

  [Figure: Ratio of q-matrix to factor error and relative # of distinct observations. Series "# diff ans/max" and "ratio q/fac" (axis 0 to 1.2) on Binq1, Binq2, Binq3, Count, and Pf1 through Pf10.]

Q-matrix vs. Cluster Analysis
  Cluster Analysis does not map to a q-matrix as factor analysis does
  However, q-matrices do form clusters of students in the same concept state
  Ran Cluster Analysis with same number of clusters as q-matrix
  Similar clusters generated by both

Clustering comparisons
  Determine equivalent concept state & cluster groupings (by largest overlap)
    These are in BOLD
  Count elements NOT in overlaps
  Overall diff = total NOT overlapping / total elements
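
The overall diff can be sketched as follows. The slide does not say how equivalent groupings were paired beyond "largest overlap", so the greedy one-to-one matching here is an assumption, and the student IDs in the usage note are invented.

```python
def overlap_difference(state_groups, cluster_groups):
    """Return (elements not in matched overlaps) / (total elements).

    Both arguments map a group label to a set of student IDs and are
    assumed to be non-empty partitions of the same students.
    """
    # Score every (concept state, cluster) pair by the size of its overlap.
    pairs = sorted(
        (
            (len(s_members & c_members), s, c)
            for s, s_members in state_groups.items()
            for c, c_members in cluster_groups.items()
        ),
        key=lambda item: item[0],
        reverse=True,
    )
    used_states, used_clusters, overlapping = set(), set(), 0
    for size, s, c in pairs:
        # Greedy one-to-one matching: take the largest remaining overlap.
        if s not in used_states and c not in used_clusters:
            used_states.add(s)
            used_clusters.add(c)
            overlapping += size
    total = len(set().union(*state_groups.values()))
    return (total - overlapping) / total
```

For instance, with concept states `{"00": {1, 2, 3}, "01": {4, 5}}` and clusters `{"A": {1, 2}, "B": {3, 4, 5}}`, only student 3 falls outside the matched overlaps, so one of five elements differs.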

  [Figure: Proof 8 Q-matrix Cluster Comparison, showing concept-state and cluster groupings of students. 6/15 clus different.]

  [Figure: Differences in cluster overlap. Ratio of different to total cluster assignments (axis 0 to 0.6) for b1, b2, b3, ct, and p1 through p10.]

Q-matrix vs. Cluster Analysis 2
  Each cluster has a “seed”
  Distances from seeds determine cluster membership
  For each cluster, summed differences between seeds & answer vectors
  Total error less than that of q-matrix clusters for all experiments

Q-matrix vs. Cluster Analysis 3
  Why is total error less for clusters?
  Because we force the IDRs in the q-matrix method to be based on concepts
  This yields higher errors but more help in directing teaching strategies

Q-matrix v. Clusters Summary
  If we used cluster results, how would we determine what to do for each student after the analysis?
  Cluster and q-matrix analyses could be used together for large data sets.
  Important: student outcomes

Conclusions
  Full automation of economically expandable adaptive teaching system
  Method for diagnosis of misconceptions
  Q-matrix model interpretable by humans
  Q-matrix outperforms factor analysis in student modeling
  Q-matrix forms clusters similar to those in cluster analysis

Future Work
  Any lesson can be augmented with diagnostic engine
  Different teaching strategies can be compared
  Apply Q-matrix method to benchmark data mining datasets
  Perform detailed time analysis and determine improvements
  Cross-validation tests to determine accuracy of model
  Missing data adaptations
Thank you!
Email: [email protected]
This work was partially supported by NSF grants #9813902 and #0204222.

How the model works-2
  Student takes quiz
  Assigned to state with nearest IDR
  Error determined from difference between IDR & response, Q-matrix
  Q-matrices varied until error over all students is minimized

Manual concept mapping
  Expert analysis of algebra tasks into rules
  Evolved into Q-matrix
    Relationship between questions & concepts
  Applications:
    Student assessment
    Group performance measure
    Finding new rules (student innovations)
  References: Birenbaum, et al. 1993, Tatsuoka 1983.

Prediction of student data
  Hubal found that randomly generated rules were better predictors of student data than Tatsuoka’s Q-matrix
  This suggests that student data should be used to generate dynamic Q-matrices
  Mining for what the students know!
  References: Hubal 1992. NCSU Masters Thesis.

Knowledge Assessment
  Comparison with expert models
  Remediation
  Tutorial effectiveness

Remediation
  Analyze student states and apply a teaching strategy to direct next step
  Process: Find the least-understood concept, and have student retake the first lesson related to that concept
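
The remediation process might look like the sketch below. It assumes a concept state holds one understanding value per concept (0/1 bits work, as do fractional values) and that a hypothetical `lessons_by_concept` table lists the lessons for each concept in tutorial order. Neither the table nor the function names come from the slides.

```python
def least_understood_concept(concept_state):
    """Return the index of the concept with the lowest understanding value.

    Ties go to the lowest concept index (an assumption).
    """
    return min(range(len(concept_state)), key=lambda c: concept_state[c])


def next_lesson(concept_state, lessons_by_concept):
    """Pick the first lesson related to the least-understood concept."""
    concept = least_understood_concept(concept_state)
    return lessons_by_concept[concept][0]


# Hypothetical lesson table: concept index -> lessons in tutorial order.
example_lessons = {0: ["lesson-3", "lesson-5"], 1: ["lesson-1", "lesson-4"]}
```

A student in state `[1, 0]` (understands the first concept but not the second) would be sent back to `"lesson-1"` under this table.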