Date post: 03-Jan-2016
Data Shop Introduction Ken Koedinger & Alida Skogsholm Human-Computer Interaction Institute Carnegie Mellon University
Transcript
Page 1: Data Shop Introduction Ken Koedinger & Alida Skogsholm Human-Computer Interaction Institute Carnegie Mellon University.

Data Shop Introduction

Ken Koedinger & Alida Skogsholm

Human-Computer Interaction Institute

Carnegie Mellon University

Page 2:

3(2x - 5) = 9

6x - 15 = 9     2x - 5 = 3     6x - 5 = 9

Cognitive Model drives behavior of intelligent tutoring systems …

Cognitive Model: expert component of intelligent tutors that models how students solve problems

If goal is solve a(bx+c) = d, then rewrite as abx + ac = d

If goal is solve a(bx+c) = d, then rewrite as abx + c = d

If goal is solve a(bx+c) = d, then rewrite as bx+c = d/a

Model Tracing: Follows each student through their individual approach to a problem -> context-sensitive instruction
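
A minimal Python sketch of model tracing with the three rewrite rules above. The tuple encodings, function names, and the `trace` helper are hypothetical and only illustrate the idea; an actual tutor's production system is much richer. The second rule is treated as a buggy rule because it does not multiply c by a.

```python
from fractions import Fraction

# A problem a(bx + c) = d is stored as (a, b, c, d).
# A step ("linear", k, m, r) stands for k*x + m = r (hypothetical encoding).

def distribute(a, b, c, d):
    return ("linear", a * b, a * c, d)        # abx + ac = d (correct)

def divide_both_sides(a, b, c, d):
    return ("linear", b, c, Fraction(d, a))   # bx + c = d/a (correct)

def distribute_bug(a, b, c, d):
    return ("linear", a * b, c, d)            # abx + c = d (buggy)

# (rule, is_correct, feedback shown when the rule matches)
RULES = [
    (distribute, True, None),
    (divide_both_sides, True, None),
    (distribute_bug, False, "You need to multiply c by a also."),
]

def trace(problem, student_step):
    """Return (rule name, is_correct, feedback) for the first rule that explains the step."""
    for rule, is_correct, feedback in RULES:
        if rule(*problem) == student_step:
            return rule.__name__, is_correct, feedback
    return None, False, None  # no rule explains this step

problem = (3, 2, -5, 9)  # 3(2x - 5) = 9
for step in [("linear", 6, -15, 9), ("linear", 2, -5, 3), ("linear", 6, -5, 9)]:
    print(step, "->", trace(problem, step))
```

Matching a step against both correct and buggy rules is what lets the tutor respond differently to each student path.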

Page 3:

3(2x - 5) = 9

6x - 15 = 9     2x - 5 = 3     6x - 5 = 9

Cognitive Model drives behavior of intelligent tutoring systems …

Cognitive Model: expert component of intelligent tutors that models how students solve problems

If goal is solve a(bx+c) = d, then rewrite as abx + ac = d
Hint message: “Distribute a across the parentheses.”

If goal is solve a(bx+c) = d, then rewrite as abx + c = d
Bug message: “You need to multiply c by a also.”

Model Tracing: Follows each student through their individual approach to a problem -> context-sensitive instruction

Knowledge Tracing: Assesses student's knowledge growth -> individualized activity selection and pacing

Known? = 85% chance     Known? = 45%
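
The slide does not give the update rule behind the "Known?" estimates. As an assumption, the sketch below uses the standard Bayesian knowledge tracing update (in the style of Corbett & Anderson); the slip, guess, and learning-rate values are made up for illustration.

```python
def update_known(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One knowledge-tracing step: condition on the observed response, then allow learning."""
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # The student may also learn the skill at this opportunity.
    return posterior + (1 - posterior) * p_learn

p = 0.45
for outcome in [True, True, False, True]:
    p = update_known(p, outcome)
    print(f"correct={outcome}: Known? = {p:.2f}")
```

A tutor could compare the running estimate with a mastery threshold to decide whether to give more practice on the skill.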

Page 4:

The Student Modeling Challenge

Problem: Intelligent Tutoring Systems depend on the Cognitive Model, which is hard to get right
- It is technically hard, but more importantly it requires a deep understanding of student thinking
- Cognitive Models created by intuition are often wrong (e.g., Koedinger & Nathan, 2004)

Page 5:

Significance of improving a cognitive model

A better cognitive model means:
- better feedback & hints (model tracing)
- better problem selection & pacing (knowledge tracing)

Making cognitive models better advances basic cognitive science

Page 6:

Learning events over time

[Figure: duration of each learning event (axis ticks from 5 sec. to 25 sec.), first through fifth event]

First: While studying an example, tries to self-explain; fails; looks in text; succeeds
Second: While solving a problem, looks up example; recalls explanation; maps it to problem
Third: Recalls explanation; slips; corrects
Fourth: Solves without slips
Fifth: Solves without slips

Page 7:

Student Performance As They Practice with the LISP Tutor

Page 8:

Production Rule Analysis

[Figure: learning curve. x-axis: Opportunity to Apply Rule (Required Exercises), 0 to 14; y-axis: Error Rate, 0.0 to 0.5]

Confirms Production Rule as an appropriate unit of knowledge acquisition

Page 9:

Production Rule Analysis “Cleans Up”

[Figure: learning curve. x-axis: Opportunity to Apply Rule (Required Exercises), 0 to 14; y-axis: Error Rate, 0.0 to 0.5]

Learning?

Yes! At the production rule level.
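
A learning curve of this kind can be computed by grouping step attempts by opportunity number and averaging errors. The sketch below assumes a hypothetical record format of (opportunity, is_correct) pairs for one production rule; the sample records are invented.

```python
from collections import defaultdict

def learning_curve(records):
    """records: iterable of (opportunity, is_correct). Returns {opportunity: error rate}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for opportunity, is_correct in records:
        totals[opportunity] += 1
        if not is_correct:
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}

# Invented attempts by several students on one rule.
sample = [(1, False), (1, False), (1, True), (2, False), (2, True), (3, True), (3, True)]
for opportunity, error_rate in learning_curve(sample).items():
    print(opportunity, round(error_rate, 2))
```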

Page 10:

Using learning curves to evaluate a cognitive model

Lisp Tutor Model
- Learning curves used to validate cognitive model
- Fit better when organized by knowledge = productions rather than surface forms = programming language terms

But, curves not smooth for some production rules
- “Blips” in learning curves indicate the knowledge representation may not be right
- Corbett, Anderson, O’Brien (1995)

Let me illustrate …

Page 11:

Curve for “Declare Parameter” production rule

Page 12:

Curve for “Declare Parameter” production rule

What’s happening on the 6th & 10th opportunities?

Page 13:

Curve for “Declare Parameter” production rule

What’s happening on the 6th & 10th opportunities?

How are steps with blips different from others? What’s the unique feature or factor explaining these blips?

Page 14:

Can modify cognitive model using unique factor present at “blips”
- Blips occur when to-be-written program has 2 parameters
- Split Declare-Parameter by parameter-number factor:
  - Declare-first-parameter
  - Declare-second-parameter

Page 15:

Pittsburgh Science of Learning Center provides datasets, see http://learnlab.org

Page 16:

Data sets in PSLC’s DataShop

Page 17:

Geometry Cognitive Tutor screen shot example

Page 18:

TWO_CIRCLES_IN_SQUARE Example 1

Page 19:

TWO_CIRCLES_IN_SQUARE Example 2

Page 20:

TWO_CIRCLES_IN_SQUARE Example 3

Page 21:

Learning Curves

Meta-data: “knowledge components” or “skills” labels

See learning (or not) over time

Can view consequences of alternative cognitive models or “knowledge component models”

Page 22:

Example Domain

Area unit of the Geometry Cognitive Tutor

Original cognitive model: 15 knowledge components or “skills”

1. Circle-area
2. Circle-circumference
3. Circle-diameter
4. Circle-radius
5. Compose-by-addition
6. Compose-by-multiplication
7. Parallelogram-area
8. Parallelogram-side
9. Pentagon-area
10. Pentagon-side
11. Trapezoid-area
12. Trapezoid-base
13. Trapezoid-height
14. Triangle-area
15. Triangle-side

Page 25:

Other DataShop Features

Error Reports
- Identify misconceptions by looking for common student errors
- When do students ask for hints?
- Are there alternative correct strategies?

Export Data
- Get all or part of the data in a tab-delimited file
- Use your favorite analysis tools …

More DataShop features in the making …
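
As one example of "your favorite analysis tools", a tab-delimited export can be read with Python's standard `csv` module. The column names and sample rows below are hypothetical; a real export will have its own header row.

```python
import csv
import io

# Hypothetical stand-in for the contents of an exported file.
SAMPLE_EXPORT = (
    "Student\tStep\tSkill\tOutcome\n"
    "A\tp1s1\tCircle-area\tCORRECT\n"
    "A\tp2s1\tCircle-area\tINCORRECT\n"
)

def read_export(handle):
    """Read tab-delimited rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))

rows = read_export(io.StringIO(SAMPLE_EXPORT))
for row in rows:
    print(row["Student"], row["Step"], row["Skill"], row["Outcome"])
```

For a file on disk, the same function could be given a handle from `open(path, newline="")`.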

Page 26:

Data Shop Demo …

Page 27:

Exported File Loaded into Excel

Page 28:

Using Excel Example

Get file (later!) from http://ctat.pact.cs.cmu.edu/downloads
- Click on geometry-area.xls

For now, watch me! And be ready to ask (& answer!) questions!

Page 29:

Data Mining-Data Shop Offerings Tomorrow
- Learning from Learning Curves
- Difficulty Factors Assessment (DFA) & Learning Factors Analysis (LFA)
- Data Mining Project Examples

Do you know which offering you will go to tomorrow?

Any conflicts -- two you want to go to that are at the same time?

Page 30:

END

Page 31:

Log Data -- Skills in the Base Model

Student  Step  Skill                Opportunity
A        p1s1  Circle-area          1
A        p2s1  Circle-area          2
A        p2s2  Parallelogram-area   1
A        p2s3  Compose-by-addition  1
A        p3s1  Circle-area          3
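
The Opportunity column in the table above is a running count per student and skill. A minimal sketch of that bookkeeping, using the table's rows; the function name and tuple layout are hypothetical.

```python
from collections import defaultdict

def add_opportunities(steps):
    """steps: (student, step, skill) tuples in time order.

    Returns the same rows with a 1-based opportunity count per (student, skill).
    """
    seen = defaultdict(int)
    result = []
    for student, step, skill in steps:
        seen[(student, skill)] += 1
        result.append((student, step, skill, seen[(student, skill)]))
    return result

LOG = [
    ("A", "p1s1", "Circle-area"),
    ("A", "p2s1", "Circle-area"),
    ("A", "p2s2", "Parallelogram-area"),
    ("A", "p2s3", "Compose-by-addition"),
    ("A", "p3s1", "Circle-area"),
]
for row in add_opportunities(LOG):
    print(*row)
```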

Page 32:

Main Point of Talk

Problem: Need better methods to create & refine student models = “cognitive models”

Key opportunities:
- Good cognitive model => smooth learning curve
- Mine accumulating student interaction data

Solution: Learning Factors Analysis
- Hypothesize factors that may affect learning
- Use factors to pose alternative cognitive models
- Automate using AI search & statistical techniques

Page 34:

Learning Factors Analysis

Difficulty Factors: a set of factors that may make a problem-solving step harder

Statistics: Logistic regression, model scoring to fit statistical models to student log data

Combinatorial Search: A* search algorithm with “smart” operators for proposing new cognitive models based on the factors

Page 35:

The Statistical Model

Generalized Power Law to fit learning curves
- Logistic regression (Draney, Wilson, Pirolli, 1995)

Assumptions
- Different students may initially know more or less => use an intercept parameter for each student
- Students learn at the same rate => no slope parameters for each student
- Some productions may be more known than others => use an intercept parameter for each production
- Some productions are easier to learn than others => use a slope parameter for each production

These assumptions are reflected in detailed math model …

Page 36:

The Statistical Model

The log odds of getting a step correct, ln(p / (1 - p)), is a sum of terms:
- if student i performed this step (X_i = 1), add overall “smarts” of that student = \alpha_i
- if skill j is needed for this step (Y_j = 1), add easiness of that skill = \beta_j
- add product of number of opportunities to learn = T_j & amount gained for each opportunity = \gamma_j

\ln\left(\frac{p}{1-p}\right) = \sum_i \alpha_i X_i + \sum_j \beta_j Y_j + \sum_j \gamma_j Y_j T_j

Use logistic regression because response is discrete (correct or not)
- Probability (p) is transformed by “log odds”
- “stretched out” with “s curve” to not bump up against 0 or 1
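
To make the model concrete, the sketch below evaluates the predicted probability for the simple case of one student and a step that needs a single skill, so the log odds reduce to student intercept + skill intercept + skill slope × opportunities. The function name and the parameter values in the loop are illustrative assumptions, not fitted values.

```python
import math

def p_correct(student_intercept, skill_intercept, skill_slope, opportunities):
    """Predicted probability of a correct step, assuming one skill per step."""
    log_odds = student_intercept + skill_intercept + skill_slope * opportunities
    return 1.0 / (1.0 + math.exp(-log_odds))  # inverse of ln(p / (1 - p))

# Made-up parameters: a hard skill (low intercept) that is learned quickly (positive slope).
for t in range(0, 6):
    print(t, round(p_correct(0.5, -2.0, 0.45, t), 2))
```

Because the slope multiplies the opportunity count inside the log odds, predicted success rises along an S-shaped curve and never reaches exactly 0 or 1.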

Page 37:

Results of model fit

Regression coefficients

Skill               Intercept  Slope  Avg Opportunities  Initial Probability  Avg Probability  Final Probability
Parallelogram-area   2.14      -0.01  14.9               0.95                 0.94             0.93
Pentagon-area       -2.16       0.45   4.3               0.2                  0.63             0.84

Student   Intercept
student0  1.18
student1  0.82
student2  0.21

Higher intercept of skill -> easier skill

Higher slope of skill -> students learn it faster

Higher intercept of student -> student initially knew more

Page 39:

Difficulty Factors

Difficulty Factors -- a property of the problem that causes student difficulties
- Like first vs. second parameter in LISP example above

Four factors in this study
- Embed: alone, embed
- Backward: forward, backward
- Repeat: initial, repeat
- FigurePart: area, area-difference, area-combination, diameter, circumference, radius, side, segment, base, height, apothem

Embed factor: Whether figure is embedded in another figure or by itself (alone)

Example for skill Circle Area:

Q: Given AB = 2, find circle area in the context of the problem goal to calculate the shaded area

[Figure: two diagrams, each with points labeled A and B]

Page 41:

Generate new models by splitting on difficulty factors

Model 1
1. Circle-area
2. Circle-circum
3. Circle-diameter
4. ….

Model 2: Split Circle-area by embed
1. Circle-area*alone
2. Circle-area*embed
3. Circle-circumference
4. Circle-diameter
5. ….

Model 3: Split Circle-circumference by embed
1. Circle-area
2. Circle-circum*alone
3. Circle-circum*embed
4. Circle-diameter
5. ….

Model N

Page 42:

New skill labels & opportunity counts are computed

Binary Split -- splits a skill into a skill with a factor value & a skill without the factor value.

Before splitting:

Student  Step  Skill                Opportunity  Factor-Embed
A        p1s1  Circle-area          1            alone
A        p2s1  Circle-area          2            embed
A        p2s2  Rectangle-area       1
A        p2s3  Compose-by-addition  1
A        p3s1  Circle-area          3            alone

After Splitting Circle-area by Embed:

Student  Step  Skill                Opportunity
A        p1s1  Circle-area-alone    1
A        p2s1  Circle-area-embed    1
A        p2s2  Rectangle-area       1
A        p2s3  Compose-by-addition  1
A        p3s1  Circle-area-alone    2
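
The relabeling shown in the tables above can be sketched as follows. This is an illustration under assumptions: the function name and row layout are hypothetical, and the split simply appends the factor value to the skill label (for a two-valued factor such as Embed this yields the two skills shown), then recounts opportunities.

```python
from collections import defaultdict

def split_skill(rows, skill, factor_values):
    """Relabel `skill` by factor value and recount opportunities.

    rows: (student, step, skill, factor_value_or_None) tuples in time order.
    """
    counts = defaultdict(int)
    relabeled = []
    for student, step, row_skill, factor in rows:
        if row_skill == skill and factor in factor_values:
            row_skill = f"{skill}-{factor}"
        counts[(student, row_skill)] += 1
        relabeled.append((student, step, row_skill, counts[(student, row_skill)]))
    return relabeled

ROWS = [
    ("A", "p1s1", "Circle-area", "alone"),
    ("A", "p2s1", "Circle-area", "embed"),
    ("A", "p2s2", "Rectangle-area", None),
    ("A", "p2s3", "Compose-by-addition", None),
    ("A", "p3s1", "Circle-area", "alone"),
]
for row in split_skill(ROWS, "Circle-area", {"alone", "embed"}):
    print(*row)
```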

Page 43:

Measuring the quality of a model

Good model captures sufficient variation in data but is not overly complicated
- balance between model fit & complexity, minimizing prediction risk (Wasserman 2005)

AIC and BIC
- two estimators for prediction risk
- select models that fit well without being too complex

AIC = -2*log-likelihood + 2*number of parameters
BIC = -2*log-likelihood + number of parameters * log(number of observations)
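
The two scores are easy to compute once a model has been fit. The sketch below uses the standard definitions, in which the BIC penalty is the number of parameters times the natural log of the number of observations; the function names and input values are illustrative.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_observations):
    """Bayesian information criterion (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_observations)

# Made-up fit: log-likelihood -100 with 5 parameters on 1000 observations.
print(aic(-100.0, 5))
print(round(bic(-100.0, 5, 1000), 3))
```

With more than about 7 observations, ln(n) exceeds 2, so BIC penalizes each extra parameter more heavily than AIC and tends to prefer models with fewer skills.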

Page 45:

Combinatorial A* Search

Goal: Do model selection within the logistic regression model space

Steps:
1. Start from an initial “node” in search graph
2. Iteratively create new child nodes by splitting a model using covariates or “factors”
3. Employ a heuristic, like AIC, to rank each node
4. Pick the best node and return to step 2

Perform pre-specified # of iterations
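
The steps above can be sketched as a plain best-first search. This is a simplification rather than the A* search with “smart” operators that the talk describes. A model is represented as a frozenset of split names, and the split names and `toy_score` function are invented stand-ins for fitting the logistic regression and computing AIC.

```python
import heapq
import itertools

def best_first_search(initial, expand, score, iterations):
    """Expand the lowest-scoring open model each iteration; return the best model seen."""
    tie = itertools.count()  # tie-breaker so the heap never compares models directly
    frontier = [(score(initial), next(tie), initial)]
    seen = {initial}
    best_score, best = frontier[0][0], initial
    for _ in range(iterations):
        if not frontier:
            break
        _, _, model = heapq.heappop(frontier)
        for child in expand(model):
            if child in seen:
                continue
            seen.add(child)
            child_score = score(child)
            heapq.heappush(frontier, (child_score, next(tie), child))
            if child_score < best_score:
                best_score, best = child_score, child
    return best, best_score

# Invented example: each split changes the score by a fixed amount.
TOY_GAIN = {"split-a": 10.0, "split-b": 5.0, "split-c": -2.0}

def expand(model):
    for split in TOY_GAIN:
        if split not in model:
            yield model | {split}

def toy_score(model):
    return 100.0 - sum(TOY_GAIN[s] for s in model)

best, best_score = best_first_search(frozenset(), expand, toy_score, iterations=3)
print(sorted(best), best_score)
```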

Page 46:

Example of search process

Page 47:

System: A* Search

[Figure, built up over Pages 47 to 51: search tree rooted at the Original Model (AIC = 5328). Operators shown include Split by Embed, Split by Backward, and Add Formula, and the first level of children carries a “50+” label. Child AIC values shown include 5301, 5312, 5320, and 5322; later expansions add nodes with AIC values 5313, 5322, 5324, and 5325. 15 expansions later, a node with AIC = 5248 appears.]

Page 52:

Example in Area domain …

Page 53:

Best fitting (by BIC) alternative models

Model 1
- Number of Splits: 3
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat repeat
  3. Binary split compose-by-addition by backward backward
- Number of Skills: 18
- AIC: 3,888.67; BIC: 4,248.86; MAD: 0.071

Model 2
- Number of Splits: 3
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat repeat
  3. Binary split compose-by-addition by figurepart area-difference
- Number of Skills: 18
- AIC: 3,888.67; BIC: 4,248.86; MAD: 0.071

Model 3
- Number of Splits: 2
  1. Binary split compose-by-multiplication by figurepart segment
  2. Binary split circle-radius by repeat repeat
- Number of Skills: 17
- AIC: 3,897.20; BIC: 4,251.07; MAD: 0.075

All best fitting models have a split of Compose-by-multiplication by the figure-part factor
- 2 new skills: CM-area & CM-segment that distinguish which geometric quantity is being multiplied

Page 54:

Evaluating Learning Factors Analysis (LFA)
- Might a simpler, less “split”, model provide a better fit?
- Will LFA reproduce original model?

To perform test, start with a simpler model & run LFA
- What’s a reasonable simpler model?

Page 55:

Simpler Model

Create by
- Merge skills in original model to remove forward vs. backward distinction
- Add new difficulty factor for “direction”: forward vs. backward

Naïve model reduces original 15 skill model to 8 skills
1. Circle-area, Circle-radius => Circle
2. Circle-circumference, Circle-diameter => Circle-CD
3. Parallelogram-area, Parallelogram-side => Parallelogram
4. Pentagon-area, Pentagon-side => Pentagon
5. Trapezoid-area, Trapezoid-base, Trapezoid-height => Trapezoid
6. Triangle-area, Triangle-side => Triangle
7. Compose-by-addition
8. Compose-by-multiplication
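
The merge listed above is a many-to-one relabeling of skills. A small sketch with the mapping taken from the list; the dictionary and function names are hypothetical.

```python
# Merged-skill labels for the 13 skills that are folded together.
MERGE = {
    "Circle-area": "Circle", "Circle-radius": "Circle",
    "Circle-circumference": "Circle-CD", "Circle-diameter": "Circle-CD",
    "Parallelogram-area": "Parallelogram", "Parallelogram-side": "Parallelogram",
    "Pentagon-area": "Pentagon", "Pentagon-side": "Pentagon",
    "Trapezoid-area": "Trapezoid", "Trapezoid-base": "Trapezoid",
    "Trapezoid-height": "Trapezoid",
    "Triangle-area": "Triangle", "Triangle-side": "Triangle",
}

ORIGINAL_SKILLS = list(MERGE) + ["Compose-by-addition", "Compose-by-multiplication"]

def merge_skill(skill):
    """Map an original skill label to its merged label; unlisted skills are unchanged."""
    return MERGE.get(skill, skill)

merged = sorted({merge_skill(s) for s in ORIGINAL_SKILLS})
print(len(ORIGINAL_SKILLS), "->", len(merged), merged)
```

Applying `merge_skill` to the skill column of the log data, and then recounting opportunities, would give the starting model for the recovery test.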

Page 56:

Does direction factor matter?

Results of running LFA starting with Simpler model

Model 1
- Number of Splits: 4
- Number of skills: 12
- Skills listed: Circle*area, Circle*radius*initial, Circle*radius*repeat, Compose-by-addition, Compose-by-addition*area-difference, Compose-by-multiplication*area-combination, Compose-by-multiplication*segment
- AIC: 3,884.95; BIC: 4,169.315; MAD: 0.075

Model 2
- Number of Splits: 3
- Number of skills: 11
- All skills are the same as those in model 1 except that
  1. Circle is split into Circle*backward*initial, Circle*backward*repeat, Circle*forward
  2. Compose-by-addition is not split
- AIC: 3,893.477; BIC: 4,171.523; MAD: 0.079

Model 3
- Number of Splits: 4
- Number of skills: 12
- All skills are the same as those in model 1 except that
  1. Circle is split into Circle*backward*initial, Circle*backward*repeat, Circle*forward
  2. Compose-by-addition is split into Compose-by-addition and Compose-by-addition*segment
- AIC: 3,887.42; BIC: 4,171.786; MAD: 0.077

Page 57:

Results of “recovery” evaluation

In best fitting “recovered” models:
- Direction factor matters for three skills: Circle, Parallelogram, Triangle
- Sometimes matters for two skills: Trapezoid, Pentagon
- Other factors, like “initial vs. repeat”, appear
- Did not matter for one skill: Circle-CD

Thus, this forward-backward distinction seems more critical for some figures than for others

LFA results appear “sensible”

Page 59:

What can we do with these results?

Can we use LFA to improve tutor hint messages or curriculum? Yes!
- Parameter fits suggest curriculum improvements
- LFA search suggests distinctions to address in instruction & assessment

Page 60:

Parameter fit implications for curriculum revision

Some skills are over-taught
- Example: Parallelogram-area
  - high intercept (2.06), low slope (-.01)
  - initial success probability = .94 (mastery threshold = .8 - .95)
  - average opportunities per student = 15

Some skills are under-taught
- Example: Trapezoid-height
  - low intercept (-1.55), positive slope (.27)
  - final success probability = .69
  - average opportunities per student = 4

Clear redesign implications!
- Reduce opportunities on over-taught skills
- Increase opportunities on under-taught skills

Page 61:

Learning Factors Analysis Tutor Implications

LFA search suggests distinctions to address in instruction & assessment

With these new distinctions, tutor can
- generate better hints
- do better problem selection for cognitive mastery

Example: Consider Compose-by-multiplication before LFA

Skill  Intercept  Slope  Avg Practice Opportunities  Initial Probability  Avg Probability  Final Probability
CM     -.15       .1     10.2                        .65                  .84              .92

With final probability .92, many students are short of .95 mastery threshold

Page 62:

Making a distinction changes assessment decision

However, after split:
- CM-area and CM-segment look quite different
- CM-area is now above .95 mastery threshold (at .96)
- But CM-segment is only at .60

Implications:
- Original model penalizes students who have key idea about composite areas
- Should CM-segment be an instructional objective or not; if so, need to give more practice opportunities

Skill      Intercept  Slope  Avg Practice Opportunities  Initial Probability  Avg Probability  Final Probability
CM         -.15       .1     10.2                        .65                  .84              .92
CMarea     -.009      .17    9                           .64                  .86              .96
CMsegment  -1.42      .48    1.9                         .32                  .54              .60
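
The assessment decision described above amounts to comparing each skill's final probability with the mastery threshold. A tiny sketch using the final probabilities from the table and the .95 threshold; the names are hypothetical, and a real tutor makes this decision per student rather than on averages.

```python
MASTERY_THRESHOLD = 0.95

# Final probabilities from the table (average over students).
FINAL_PROBABILITY = {"CM": 0.92, "CMarea": 0.96, "CMsegment": 0.60}

def is_mastered(skill, threshold=MASTERY_THRESHOLD):
    """True when the skill's final probability meets the mastery threshold."""
    return FINAL_PROBABILITY[skill] >= threshold

for skill in FINAL_PROBABILITY:
    print(skill, "mastered" if is_mastered(skill) else "needs more practice")
```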

Page 63:

Conclusion

Learning Factors Analysis combines statistics, human expertise, & combinatorial search to evaluate & improve a cognitive model

System evaluates a model in seconds; searches 100s of models in 4-5 hours
- Model statistics are meaningful
- Improved models are interpretable & suggest tutor improvement
- This fall: Modify Area Unit and compare to existing tutor

Go to LearnLab.org!
- Get data to mine yourself
- Get LFA to apply to your own data

Page 64:

Acknowledgements

This research is sponsored by a National Science Foundation grant to the Pittsburgh Science of Learning Center.

Thanks to Joseph Beck, Albert Colbert, & Ruth Wylie for their comments.

Page 65:

Questions?

Thanks and Questions

Page 66:

Why no slope (learning rate) parameters for students?

Good question!
- Main focus of Learning Factors Analysis is on refining cognitive model
- By adding a slope parameter for each student, model may get unnecessarily complex

But, we could add …
- Might first try for groups of students: Is learning rate faster for students in
  - experimental group vs. control group?
  - girls vs. boys?
  - high visual skill vs. low visual skill?

Would you like to try it?

Page 67:

Results using AIC

In best fitting models: Circle-Area gets split by “embed”
- 2 new skills: Circle-Area-alone and Circle-Area-embed

Page 68:

Include ad for PSLC …

Get from AERA or APS talk or NSF site visit talk …

Page 69:

Collapse next 3 slides into 1 (or 0)

Page 70:

Approach 1: Learning curve analysis

Learning curve analysis
- Identify blips by hand & eye
- Manually create a new model
- Qualitative judgment

Need to automatically:
- Identify blips by system
- Propose alternative cognitive models
- Evaluate each model quantitatively

Page 71:

Approach 2: Simulated students

Find incorrect rules & learn new rules via human tutor intervention (Ur & VanLehn 1995)

Theory refinement using example-based machine learning (Baffes & Mooney 1996)

Issues
- Requires building a simulated student
- Depends on accuracy of learning theory
- May over-fit data

Page 72:

Approach 3: Rule Space & Q-matrix

Discover knowledge structure from student response data, automatically extract features in the problem set (Tatsuoka 1983, Barnes 2005)

Somewhat similar to Learning Factors Analysis, but:
- Features are unlabeled feature vectors -- hard to interpret
  - Like exploratory factor analysis
- Search process is unprincipled, features are proposed by random tweaking of feature vectors
- Uses item difficulty data, does not use learning data
  - Can’t model change in student performance over time

