Search/Retrieval Functions
Q550: Models in Cognitive Science
Lecture 15
Homework Assignment #2:
Cumulative Recall Functions
http://www.indiana.edu/~clcl/Q550/Homework_2.pdf
Cumulative Recall Function
We see this in almost all situations of free retrieval from memory: a law of diminishing returns. But why?
Cumulative Recall Function
• Bousfield originally proposed a simple process of sampling with replacement from memory
• But it is debated whether the likelihood of retrieving an item decreases as an exponential or power function of time:
Exponential: p(t) = e^(−λt)
Power law: p(t) = t^(−λ)
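To see how differently the two forms decay, here is a minimal MATLAB sketch that plots both retrieval probabilities over the 300-second window used in the task below (lambda = 0.05 is an arbitrary illustrative value, not a fitted one):

lambda = 0.05;              % arbitrary illustrative value, not fitted
t = 1:300;                  % one probe per second, as in the task below
p_exp = exp(-lambda*t);     % exponential: p(t) = e^(-lambda*t)
p_pow = t.^(-lambda);       % power law:   p(t) = t^(-lambda)
plot(t, p_exp, t, p_pow);
legend('Exponential', 'Power law');
xlabel('Time (s)'); ylabel('p(t)');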
Cumulative Recall Function
You can download data from one subject doing this “name all the animals you can in X seconds” task here: http://www.indiana.edu/~clcl/Q550/HW2.txt
Time (s)  Prod (0/1)  Cumulative
1 1 1
2 1 2
3 1 3
4 1 4
5 1 5
6 1 6
7 1 7
8 1 8
9 1 9
10 1 10
11 1 11
12 1 12
13 1 13
14 1 14
15 1 15
16 1 16
17 1 17
18 0 17
[Figure: the subject's cumulative recall (Cum_Recall) plotted as a function of Time (s)]
This subject’s data were produced with either the exponential or the power function in the following way:

for i = 1:300
    p = exp(-lambda*i);     % or i^(-lambda) for the power model
    if (p > rand)
        produce_item = 1;   % an item is produced at second i
    end
end
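For instance, you could regenerate a synthetic subject and its cumulative recall curve with a script like the following (the variable names and the lambda value are mine, not part of the assignment):

lambda = 0.05;                      % illustrative value only
Data = zeros(300,1);
for i = 1:300
    p = exp(-lambda*i);             % swap in i^(-lambda) for the power model
    if (p > rand)
        Data(i) = 1;                % an item is produced at second i
    end
end
Cumulative = cumsum(Data);          % the cumulative recall function
plot(1:300, Cumulative);
xlabel('Time (s)'); ylabel('Cum\_Recall');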
You can make predictions for a particular model as follows:

for i = 1:300
    p = exp(-lambda*i);      % or i^(-lambda) for the power model
    if (Data(i) == 1)
        Predicted(i) = p;    % probability of producing an item now
    else
        Predicted(i) = 1-p;  % probability of producing nothing
    end
end
Log_Likelihood = sum(log(Predicted));
G_squared = -Log_Likelihood;
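One convenient way to use this with fminsearch (step 1 below) is to wrap the loop in a function that returns G² for a given lambda. This is a sketch, saved as g_squared.m; the function name and the clamping of p are my additions, not part of the assignment:

function G2 = g_squared(lambda, Data, model)
% G^2 = -Log_Likelihood of the data under one model.
% model is 'exp' or 'pow'; Data is the 0/1 production vector.
    Predicted = zeros(size(Data));
    for i = 1:length(Data)
        if strcmp(model, 'exp')
            p = exp(-lambda*i);
        else
            p = i^(-lambda);
        end
        p = min(max(p, eps), 1-eps);   % keep log() finite if lambda wanders
        if (Data(i) == 1)
            Predicted(i) = p;
        else
            Predicted(i) = 1-p;
        end
    end
    G2 = -sum(log(Predicted));
end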
Your job is to find the optimum value of the parameter lambda for each model; that is, the best value assuming this is the model that actually produced the data. Then determine the objective fit (G²) for each model at its optimum parameter setting, and perform a quantitative model comparison to determine which of the two models I used to generate the data.
Here are the specific questions:
1) Which model gives a better fit to the data?
2) What are the optimum parameters for the two models?
3) What are the odds that the better-fitting model is the one (of the two) that generated these data?
You can do this however you please. But here are some suggested steps:
1. Use fminsearch() to fit each model to the data (i.e., find the optimum value for lambda); a full sketch follows this list.
2. For your objective function, minimize G² (-Log_Likelihood here). Remember that each time an event happens (1 word produced, or 0 words produced), compute the model’s prediction of that observation, then sum the log(predictions) for Log_Likelihood. Make sure that you are minimizing -Log_Likelihood, though, or fminsearch will just find the worst possible parameters to produce -inf.
3. Let’s call your better-fitting model A and your worse-fitting one B (the better-fitting one will have a lower G²). Then compute BIC (both models have only a single parameter, so I’ve simplified):

BIC = (G²_B − G²_A) / 2
And then the Bayes Factor, which is how much more likely it is that Model A
produced the data vs. Model B:
Bayes Factor = e^BIC

Due March 28th
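Putting the pieces together, a sketch of the whole comparison might look like this. It assumes the g_squared() helper sketched above, and that HW2.txt loads as three numeric columns (Time, Prod, Cumulative); if the file has a header row, strip it first:

raw  = load('HW2.txt');        % assumed columns: Time, Prod, Cumulative
Data = raw(:,2);               % the 0/1 production events

[lam_exp, G2_exp] = fminsearch(@(l) g_squared(l, Data, 'exp'), 0.1);
[lam_pow, G2_pow] = fminsearch(@(l) g_squared(l, Data, 'pow'), 0.1);

BIC = abs(G2_exp - G2_pow) / 2;    % (G²_B - G²_A)/2, A = better model
BayesFactor = exp(BIC);            % odds that Model A generated the data

The starting value 0.1 for lambda is arbitrary; it is worth trying a few starting points to make sure fminsearch has not stalled in a poor region.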
A Challenge…
• If you’re interested, a good project would be to actually implement the sampling-with-replacement model with vector representations to see what predictions it naturally makes
• Also, this has not yet been demonstrated with a model that actually makes assumptions about the structure of semantic memory…
• Does the task require memory?
What do we know about category fluency?
• We see bursts of items, and long lags at random intervals
Ø Modeled as a unitary process, or as the sum of two non-stationary processes? (Rhodes & Turvey, 2007)
Ø Lévy distributions (long tail, but predominantly short RTs)
• Estes (1975): Structure and function
Zippy the Goldfish:
Using a semantic space model for structure
• Semantic memory is not a random pool of items; it has structure
• Semantic space models (e.g., HAL, LSA) learn semantic representations for words from statistical redundancies in text
• Use an SSM to represent the structure of semantic memory, and then explore what process models of retrieval would explain how humans sample/produce items
Ø cf. Batchelder, but on a much larger scale
Ø foraging in external information spaces (Pirolli, 2007)