CMU SCS
Tensor Analysis – Applications and Algorithms
Christos Faloutsos CMU
Roadmap
• Applications – pattern discovery
  – Brain scans – coupled matrix-tensor factorization
  – Power grid
• Applications – anomaly detection
• Algorithms
• Conclusions
SIAM, July 2017 (c) C. Faloutsos, 2017 2
Neuro-semantics
Figure S1. Presentation and set of exemplars used in the experiment. Participants were presented 60 distinct word-picture pairs describing common concrete nouns. These consisted of 5 exemplars from each of 12 categories, as shown above. A slow event-related paradigm was employed, in which the stimulus was presented for 3 s, followed by a 7 s fixation period during which an X was presented in the center of the screen. Images were presented as white lines and characters on a dark background, but are inverted here to improve readability. The entire set of 60 exemplars was presented six times, randomly permuting the sequence on each presentation.
To fully specify a model within this computational modeling framework, one must first define a set of intermediate semantic features f1(w), f2(w), …, fn(w) to be extracted from the text corpus. In this paper, each intermediate semantic feature is defined in terms of the co-occurrence statistics of the input stimulus word w with a particular other word (e.g., “taste”) or set of words (e.g., “taste,” “tastes,” or “tasted”) within the text corpus. The model is trained by the application of multiple regression to these features fi(w) and the observed fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi (26). Once trained, the computational model can be evaluated by giving it words outside the training set and comparing its predicted fMRI images for these words with observed fMRI data.
This computational modeling framework is based on two key theoretical assumptions. First, it assumes the semantic features that distinguish the meanings of arbitrary concrete nouns are reflected in the statistics of their use within a very large text corpus. This assumption is drawn from the field of computational linguistics, where statistical word distributions are frequently used to approximate the meaning of documents and words (14–17). Second, it assumes that the brain activity observed when thinking about any concrete noun can be derived as a weighted linear sum of contributions from each of its semantic features. Although the correctness of this linearity assumption is debatable, it is consistent with the widespread use of linear models in fMRI analysis (27) and with the assumption that fMRI activation often reflects a linear superposition of contributions from different sources. Our theoretical framework does not take a position on whether the neural activation encoding meaning is localized in particular cortical regions. Instead, it considers all cortical voxels and allows the training data to determine which locations are systematically modulated by which aspects of word meanings.
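The weighted-linear-sum assumption and the multiple-regression training step can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's code: the voxel count, the random feature values, and the names `fit_linear_model` and `predict_image` are assumptions made here, and ordinary least squares stands in for the regression step. Only the 25 features and the 58 training words are taken from the Fig. 2 caption.

```python
import numpy as np

def fit_linear_model(F, Y):
    """Least-squares fit of Y ~ F @ C.

    F: (n_words, n_features) semantic feature values f_i(w)
    Y: (n_words, n_voxels) observed fMRI images, one row per word
    Returns C: (n_features, n_voxels); C[i, v] plays the role of c_vi.
    """
    C, *_ = np.linalg.lstsq(F, Y, rcond=None)
    return C

def predict_image(C, f_w):
    """Predicted image for one word: weighted linear sum of feature signatures."""
    return f_w @ C

rng = np.random.default_rng(0)
n_words, n_features, n_voxels = 58, 25, 200   # n_voxels is a hypothetical size

# Synthetic stand-ins for corpus features and fMRI images (assumption).
F = rng.random((n_words, n_features))
C_true = rng.normal(size=(n_features, n_voxels))
Y = F @ C_true + 0.001 * rng.normal(size=(n_words, n_voxels))

C_hat = fit_linear_model(F, Y)
f_new = rng.random(n_features)        # features of a word outside the training set
y_pred = predict_image(C_hat, f_new)  # predicted image, shape (n_voxels,)
```

Each voxel gets its own regression, but because all voxels share the same feature matrix, a single `lstsq` call solves them together.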
Results. We evaluated this computational model using fMRI data from nine healthy, college-age participants who viewed 60 different word-picture pairs presented six times each. Anatomically defined regions of interest were automatically labeled according to the methodology in (28). The 60 randomly ordered stimuli included five items from each of 12 semantic categories (animals, body parts, buildings, building parts, clothing, furniture, insects, kitchen items, tools, vegetables, vehicles, and other man-made items). A representative fMRI image for each stimulus was created by computing the mean fMRI response over its six presentations, and the mean of all 60 of these representative images was then subtracted from each [for details, see (26)].
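The preprocessing described above (average over the six presentations, then subtract the mean of the 60 representative images) might look like this in outline. The array layout, the voxel count, and the name `representative_images` are assumptions for illustration, and the data are random.

```python
import numpy as np

def representative_images(scans):
    """scans: (n_stimuli, n_presentations, n_voxels).

    Average each stimulus over its presentations, then subtract the
    mean of all representative images from each one.
    """
    per_stimulus = scans.mean(axis=1)        # (n_stimuli, n_voxels)
    grand_mean = per_stimulus.mean(axis=0)   # (n_voxels,)
    return per_stimulus - grand_mean

rng = np.random.default_rng(0)
# 60 stimuli and 6 presentations as in the text; 100 voxels is hypothetical.
scans = rng.normal(size=(60, 6, 100))
images = representative_images(scans)
```

After this step every voxel has zero mean across the 60 stimuli, so the model only has to explain how each word differs from the average response.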
To instantiate our modeling framework, we first chose a set of intermediate semantic features. To be effective, the intermediate semantic features must simultaneously encode the wide variety of semantic content of the input stimulus words and factor the observed fMRI activation into more primitive com-
[Figure 2 panels: (A) Predicted “celery” = 0.84 × “eat” + 0.35 × “taste” + 0.32 × “fill” + …; (B) predicted and observed images for “celery” and “airplane”. Color scale: high, average, below average.]
Fig. 2. Predicting fMRI images for given stimulus words. (A) Forming a prediction for participant P1 for the stimulus word “celery” after training on 58 other words. Learned cvi coefficients for 3 of the 25 semantic features (“eat,” “taste,” and “fill”) are depicted by the voxel colors in the three images at the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” is shown to the left of their respective images [e.g., the value for “eat (celery)” is 0.84]. The predicted activation for the stimulus word [shown at the bottom of (A)] is a linear combination of the 25 semantic fMRI signatures, weighted by their co-occurrence values. This figure shows just one horizontal slice [z = –12 mm in Montreal Neurological Institute (MNI) space] of the predicted three-dimensional image. (B) Predicted and observed fMRI images for “celery” and “airplane” after training that uses 58 other words. The two long red and blue vertical streaks near the top (posterior region) of the predicted and observed images are the left and right fusiform gyri.
[Figure 3 panels: (A) and (B) participant P5; (C) mean over participants.]
Fig. 3. Locations of most accurately predicted voxels. Surface (A) and glass brain (B) rendering of the correlation between predicted and actual voxel activations for words outside the training set for participant P5. These panels show clusters containing at least 10 contiguous voxels, each of whose predicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout the cortex and located in the left and right occipital and parietal lobes; left and right fusiform, postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anterior cingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nine participants. This panel represents clusters containing at least 10 contiguous voxels, each with average correlation of at least 0.14.
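The per-voxel score behind Fig. 3 is a correlation between predicted and actual activations across held-out words. A rough sketch of that score and of the 0.28 threshold follows. The data are synthetic, the sizes and the name `voxelwise_correlation` are assumptions, and the contiguity requirement (clusters of at least 10 voxels) is deliberately not implemented here.

```python
import numpy as np

def voxelwise_correlation(pred, actual):
    """Pearson correlation per voxel, computed across words.

    pred, actual: (n_words, n_voxels). Returns an array of shape (n_voxels,).
    """
    p = pred - pred.mean(axis=0)
    a = actual - actual.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (a ** 2).sum(axis=0))
    return (p * a).sum(axis=0) / denom

rng = np.random.default_rng(0)
n_words, n_voxels = 60, 100                      # hypothetical sizes
actual = rng.normal(size=(n_words, n_voxels))
pred = rng.normal(size=(n_words, n_voxels))      # unrelated to actual by default
# Make the first 50 voxels well predicted: actual signal plus modest noise.
pred[:, :50] = actual[:, :50] + 0.5 * rng.normal(size=(n_words, 50))

r = voxelwise_correlation(pred, actual)
well_predicted = r >= 0.28                       # threshold quoted for participant P5
```

With only 60 words, a few unrelated voxels can cross the threshold by chance, which is one reason to require contiguous clusters as well.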
• Brain Scan Data*
  • 9 persons
  • 60 nouns
• Questions
  • 218 questions
  • ‘is it alive?’, ‘can you eat it?’
*Mitchell et al. Predicting human brain activity associated with the meanings of nouns. Science, 2008. Data: www.cs.cmu.edu/afs/cs/project/theo-73/www/science2008/data.html
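One way to picture the data for coupled matrix-tensor factorization: the brain scans form a nouns × voxels × persons tensor, the questions form a nouns × questions matrix, and the two share the noun mode. The sketch below only builds arrays of those shapes and evaluates a commonly used squared-error objective. The mode ordering, the voxel count, the rank, the objective itself, and the name `cmtf_loss` are assumptions, not details given on the slides, and no fitting is performed.

```python
import numpy as np

def cmtf_loss(X, Y, A, B, C, D):
    """Squared reconstruction error of a coupled matrix-tensor model.

    X ~ sum_r A[:, r] (outer) B[:, r] (outer) C[:, r]   tensor, CP/PARAFAC form
    Y ~ A @ D.T                                          matrix sharing factor A
    """
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    Y_hat = A @ D.T
    return np.sum((X - X_hat) ** 2) + np.sum((Y - Y_hat) ** 2)

rng = np.random.default_rng(0)
# 60 nouns, 9 persons, 218 questions as on the slide; 50 voxels and rank 3 are hypothetical.
n_nouns, n_voxels, n_persons, n_questions, rank = 60, 50, 9, 218, 3

A = rng.normal(size=(n_nouns, rank))      # noun factors, shared by X and Y
B = rng.normal(size=(n_voxels, rank))     # voxel factors
C = rng.normal(size=(n_persons, rank))    # person factors
D = rng.normal(size=(n_questions, rank))  # question factors

# Synthetic data generated exactly from the factors.
X = np.einsum('ir,jr,kr->ijk', A, B, C)
Y = A @ D.T

loss_exact = cmtf_loss(X, Y, A, B, C, D)            # true factors reproduce the data
loss_perturbed = cmtf_loss(X, Y, A + 0.1, B, C, D)  # shifting A hurts both terms
```

Because `A` appears in both terms, each latent component ties a group of nouns to both a brain-activity pattern and a set of question answers.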
Neuro-semantics
[Figure S1, repeated: the 60 word-picture exemplars used in the experiment.]
Patterns?
To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.
This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected
in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.
Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].
To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-
Predicted“celery” = 0.84
“celery” “airplane”
Predicted:
Observed:
A B
+.. .
high
average
belowaverage
Predicted “celery”:
+ 0.35 + 0.32
“eat” “taste” “fill”
Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =
–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.
A
B
C
Mean over
participants
Participant P
5
Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-
ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.
30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192
RESEARCH ARTICLES
on
May
30,
200
8 w
ww
.sci
ence
mag
.org
Dow
nloa
ded
from
To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.
This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected
in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.
Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].
To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-
Predicted“celery” = 0.84
“celery” “airplane”
Predicted:
Observed:
A B
+.. .
high
average
belowaverage
Predicted “celery”:
+ 0.35 + 0.32
“eat” “taste” “fill”
Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =
–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.
A
B
C
Mean over
participants
Participant P
5
Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-
ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.
30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192
RESEARCH ARTICLES
on
May
30,
200
8 w
ww
.sci
ence
mag
.org
Dow
nloa
ded
from
To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.
This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected
in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.
Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].
To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-
Predicted“celery” = 0.84
“celery” “airplane”
Predicted:
Observed:
A B
+.. .
high
average
belowaverage
Predicted “celery”:
+ 0.35 + 0.32
“eat” “taste” “fill”
Fig. 2. Predicting fMRI images for given stimulus words. (A) Forming a prediction for participant P1 for the stimulus word “celery” after training on 58 other words. Learned cvi coefficients for 3 of the 25 semantic features (“eat,” “taste,” and “fill”) are depicted by the voxel colors in the three images at the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” is shown to the left of their respective images [e.g., the value for “eat (celery)” is 0.84]. The predicted activation for the stimulus word [shown at the bottom of (A)] is a linear combination of the 25 semantic fMRI signatures, weighted by their co-occurrence values. This figure shows just one horizontal slice [z = –12 mm in Montreal Neurological Institute (MNI) space] of the predicted three-dimensional image. (B) Predicted and observed fMRI images for “celery” and “airplane” after training that uses 58 other words. The two long red and blue vertical streaks near the top (posterior region) of the predicted and observed images are the left and right fusiform gyri.
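Written out, the linear combination in the caption takes the following form. The symbol \hat{y}_v for the predicted activation at voxel v is notation introduced here for illustration; the weight 0.84 for “eat” is stated in the caption, while 0.35 and 0.32 are read off the figure labels in order and assumed to belong to “taste” and “fill”.

```latex
\begin{align}
\hat{y}_v(w) &= \sum_{i=1}^{n} c_{vi}\, f_i(w), \qquad n = 25 \\
\hat{y}_v(\text{celery}) &= 0.84\, c_{v,\text{eat}} + 0.35\, c_{v,\text{taste}} + 0.32\, c_{v,\text{fill}} + \dots
\end{align}
```

The remaining 22 terms, elided in the figure, have the same form.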
[Fig. 3 panels. (A), (B): Participant P5. (C): Mean over participants.]
Fig. 3. Locations of most accurately predicted voxels. Surface (A) and glass brain (B) rendering of the correlation between predicted and actual voxel activations for words outside the training set for participant P5. These panels show clusters containing at least 10 contiguous voxels, each of whose predicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout the cortex and located in the left and right occipital and parietal lobes; left and right fusiform, postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anterior cingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nine participants. This panel represents clusters containing at least 10 contiguous voxels, each with average correlation of at least 0.14.
30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org 1192 RESEARCH ARTICLES
Downloaded from www.sciencemag.org on May 30, 2008
CMU SCS
Neuro-semantics
• Brain Scan Data*
– 9 persons
– 60 nouns
• Questions
– 218 questions
– ‘is it alive?’, ‘can you eat it?’
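One way to picture how the two data sets fit together is a coupled model in which both share the same noun factors. The sketch below is an assumption-laden illustration, not the method from the talk: it assumes the brain scans form a nouns × voxels × persons tensor and the questions form a nouns × questions matrix, and the voxel count, rank, and all factor names are hypothetical. Only the counts 60, 9, and 218 come from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nouns, n_persons, n_questions = 60, 9, 218  # counts from the slide
n_voxels, rank = 50, 3                        # hypothetical sizes

A = rng.random((n_nouns, rank))      # noun factors, shared by both data sets
B = rng.random((n_voxels, rank))     # voxel factors
C = rng.random((n_persons, rank))    # person factors
D = rng.random((n_questions, rank))  # question factors

# Tensor part: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
X = np.einsum("ir,jr,kr->ijk", A, B, C)
# Matrix part, coupled through the same noun factors A
Y = A @ D.T

def coupled_loss(X, Y, A, B, C, D):
    """Sum of squared errors over both the tensor and the matrix."""
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    Y_hat = A @ D.T
    return np.sum((X - X_hat) ** 2) + np.sum((Y - Y_hat) ** 2)
```

A factorization algorithm would search for A, B, C, and D that minimize `coupled_loss` on real data; here the data are generated from the factors, so the loss is zero by construction.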
[Figure: data cube with axes labeled nouns (e.g., “airplane”, “dog”, …), questions, and voxels]
Patterns?
To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.
This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected
in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.
Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].
To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-
Predicted“celery” = 0.84
“celery” “airplane”
Predicted:
Observed:
A B
+.. .
high
average
belowaverage
Predicted “celery”:
+ 0.35 + 0.32
“eat” “taste” “fill”
Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =
–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.
A
B
C
Mean over
participants
Participant P
5
Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-
ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.
30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org 1192
RESEARCH ARTICLES
Downloaded from www.sciencemag.org on May 30, 2008
To fully specify a model within this computational modeling framework, one must first define a set of intermediate semantic features f1(w) f2(w) … fn(w) to be extracted from the text corpus. In this paper, each intermediate semantic feature is defined in terms of the co-occurrence statistics of the input stimulus word w with a particular other word (e.g., “taste”) or set of words (e.g., “taste,” “tastes,” or “tasted”) within the text corpus. The model is trained by the application of multiple regression to these features fi(w) and the observed fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi (26). Once trained, the computational model can be evaluated by giving it words outside the training set and comparing its predicted fMRI images for these words with observed fMRI data.
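The training and evaluation loop described above can be sketched in a few lines. This is a hedged illustration on synthetic data, not the authors' implementation: ordinary least squares stands in for the multiple regression (it is the maximum-likelihood estimate under Gaussian noise), the cosine-similarity comparison is an assumed choice of metric, and all names (`fit_voxel_weights`, `predict_image`, `cosine`) and the voxel count are hypothetical. The 60 words, 25 features, and 58 training words follow the text and the Fig. 2 caption.

```python
import numpy as np


def fit_voxel_weights(F, Y):
    """Least-squares fit of Y ~ F @ C.

    F: (n_words, n_features) semantic feature values f_i(w).
    Y: (n_words, n_voxels) observed fMRI images, one row per word.
    Returns C of shape (n_features, n_voxels); C[i, v] plays the role of c_vi.
    """
    C, _, _, _ = np.linalg.lstsq(F, Y, rcond=None)
    return C


def predict_image(f_w, C):
    """Predicted image for one word: sum_i f_i(w) * c_vi at every voxel v."""
    return f_w @ C


def cosine(a, b):
    """Cosine similarity between two images (assumed comparison metric)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in data: 60 words, 25 features, 200 voxels (illustrative).
rng = np.random.default_rng(0)
n_words, n_features, n_voxels = 60, 25, 200
F = rng.random((n_words, n_features))
C_true = rng.normal(size=(n_features, n_voxels))
Y = F @ C_true + 0.01 * rng.normal(size=(n_words, n_voxels))

# Hold out two words, train on the other 58.
held_out = np.array([0, 1])
train = np.setdiff1d(np.arange(n_words), held_out)
C_hat = fit_voxel_weights(F[train], Y[train])
pred = np.stack([predict_image(F[w], C_hat) for w in held_out])

# Does each prediction match its own observed image better than the other's?
match = cosine(pred[0], Y[0]) + cosine(pred[1], Y[1])
mismatch = cosine(pred[0], Y[1]) + cosine(pred[1], Y[0])
print(match > mismatch)
```

Because every voxel shares the same feature matrix F, a single `lstsq` call fits all voxels at once; each column of the result is the weight vector for one voxel.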
This computational modeling framework is based on two key theoretical assumptions. First, it assumes the semantic features that distinguish the meanings of arbitrary concrete nouns are reflected in the statistics of their use within a very large text corpus. This assumption is drawn from the field of computational linguistics, where statistical word distributions are frequently used to approximate the meaning of documents and words (14–17). Second, it assumes that the brain activity observed when thinking about any concrete noun can be derived as a weighted linear sum of contributions from each of its semantic features. Although the correctness of this linearity assumption is debatable, it is consistent with the widespread use of linear models in fMRI analysis (27) and with the assumption that fMRI activation often reflects a linear superposition of contributions from different sources. Our theoretical framework does not take a position on whether the neural activation encoding meaning is localized in particular cortical regions. Instead, it considers all cortical voxels and allows the training data to determine which locations are systematically modulated by which aspects of word meanings.
Results. We evaluated this computational model using fMRI data from nine healthy, college-age participants who viewed 60 different word-picture pairs presented six times each. Anatomically defined regions of interest were automatically labeled according to the methodology in (28). The 60 randomly ordered stimuli included five items from each of 12 semantic categories (animals, body parts, buildings, building parts, clothing, furniture, insects, kitchen items, tools, vegetables, vehicles, and other man-made items). A representative fMRI image for each stimulus was created by computing the mean fMRI response over its six presentations, and the mean of all 60 of these representative images was then subtracted from each [for details, see (26)].
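The preprocessing step in the last sentence (average the six presentations per stimulus, then subtract the mean of the 60 representative images) can be sketched as below. This is an assumed array layout on synthetic data; the function name `representative_images` and the voxel count of 50 are hypothetical.

```python
import numpy as np


def representative_images(scans):
    """Build mean-centered representative images.

    scans: (n_stimuli, n_presentations, n_voxels) raw fMRI responses.
    Returns (n_stimuli, n_voxels): the mean over presentations for each
    stimulus, minus the mean of those per-stimulus images.
    """
    per_stimulus = scans.mean(axis=1)                      # mean over six presentations
    grand_mean = per_stimulus.mean(axis=0, keepdims=True)  # mean of all 60 images
    return per_stimulus - grand_mean


# Synthetic stand-in data: 60 stimuli, 6 presentations, 50 voxels.
rng = np.random.default_rng(0)
scans = rng.normal(size=(60, 6, 50))
images = representative_images(scans)
```

After this step every voxel has zero mean across the 60 stimuli, so a linear model fit to these images does not need a separate intercept term for the average response.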
To instantiate our modeling framework, we first chose a set of intermediate semantic features. To be effective, the intermediate semantic features must simultaneously encode the wide variety of semantic content of the input stimulus words and factor the observed fMRI activation into more primitive com-
CMU SCS
Neuro-semantics
Dog
Airplane
Puppy
To fully specify a model within this computational modeling framework, one must first define a set of intermediate semantic features f1(w) f2(w)…fn(w) to be extracted from the text corpus. In this paper, each intermediate semantic feature is defined in terms of the co-occurrence statistics of the input stimulus word w with a particular other word (e.g., “taste”) or set of words (e.g., “taste,” “tastes,” or “tasted”) within the text corpus. The model is trained by the application of multiple regression to these features fi(w) and the observed fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi (26). Once trained, the computational model can be evaluated by giving it words outside the training set and comparing its predicted fMRI images for these words with observed fMRI data.
This computational modeling framework is based on two key theoretical assumptions. First, it assumes the semantic features that distinguish the meanings of arbitrary concrete nouns are reflected in the statistics of their use within a very large text corpus. This assumption is drawn from the field of computational linguistics, where statistical word distributions are frequently used to approximate the meaning of documents and words (14–17). Second, it assumes that the brain activity observed when thinking about any concrete noun can be derived as a weighted linear sum of contributions from each of its semantic features. Although the correctness of this linearity assumption is debatable, it is consistent with the widespread use of linear models in fMRI analysis (27) and with the assumption that fMRI activation often reflects a linear superposition of contributions from different sources. Our theoretical framework does not take a position on whether the neural activation encoding meaning is localized in particular cortical regions. Instead, it considers all cortical voxels and allows the training data to determine which locations are systematically modulated by which aspects of word meanings.
Results. We evaluated this computational model using fMRI data from nine healthy, college-age participants who viewed 60 different word-picture pairs presented six times each. Anatomically defined regions of interest were automatically labeled according to the methodology in (28). The 60 randomly ordered stimuli included five items from each of 12 semantic categories (animals, body parts, buildings, building parts, clothing, furniture, insects, kitchen items, tools, vegetables, vehicles, and other man-made items). A representative fMRI image for each stimulus was created by computing the mean fMRI response over its six presentations, and the mean of all 60 of these representative images was then subtracted from each [for details, see (26)].
To instantiate our modeling framework, we first chose a set of intermediate semantic features. To be effective, the intermediate semantic features must simultaneously encode the wide variety of semantic content of the input stimulus words and factor the observed fMRI activation into more primitive com-
[Fig. 2 panels: (A) Predicted “celery” = 0.84 “eat” + 0.35 “taste” + 0.32 “fill” + ...; (B) Predicted and observed images for “celery” and “airplane”. Color scale: high, average, below average.]
Fig. 2. Predicting fMRI images for given stimulus words. (A) Forming a prediction for participant P1 for the stimulus word “celery” after training on 58 other words. Learned cvi coefficients for 3 of the 25 semantic features (“eat,” “taste,” and “fill”) are depicted by the voxel colors in the three images at the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” is shown to the left of their respective images [e.g., the value for “eat (celery)” is 0.84]. The predicted activation for the stimulus word [shown at the bottom of (A)] is a linear combination of the 25 semantic fMRI signatures, weighted by their co-occurrence values. This figure shows just one horizontal slice [z = –12 mm in Montreal Neurological Institute (MNI) space] of the predicted three-dimensional image. (B) Predicted and observed fMRI images for “celery” and “airplane” after training that uses 58 other words. The two long red and blue vertical streaks near the top (posterior region) of the predicted and observed images are the left and right fusiform gyri.
[Fig. 3 panels: A, B: Participant P5; C: Mean over participants.]
Fig. 3. Locations of most accurately predicted voxels. Surface (A) and glass brain (B) rendering of the correlation between predicted and actual voxel activations for words outside the training set for participant P5. These panels show clusters containing at least 10 contiguous voxels, each of whose predicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout the cortex and located in the left and right occipital and parietal lobes; left and right fusiform, postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anterior cingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nine participants. This panel represents clusters containing at least 10 contiguous voxels, each with average correlation of at least 0.14.
RESEARCH ARTICLES
1192 30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org
Downloaded from www.sciencemag.org on May 30, 2008
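The linear model in the excerpt (predicted activation at voxel v is the sum over features i of cvi times fi(w), with cvi obtained by multiple regression) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code. The data are synthetic, and the names `fit_coefficients`, `predict_image`, `features`, and `images` are hypothetical. Ordinary least squares is used on the assumption of i.i.d. Gaussian noise, in which case it coincides with the maximum-likelihood estimate.

```python
import numpy as np

def fit_coefficients(features, images):
    """Least-squares estimate of c in images ~= features @ c.

    features: (n_words, n_features) co-occurrence values f_i(w)
    images:   (n_words, n_voxels)   one fMRI image per stimulus word
    returns:  (n_features, n_voxels) coefficients, entry [i, v] playing the role of c_vi
    """
    coef, *_ = np.linalg.lstsq(features, images, rcond=None)
    return coef

def predict_image(coef, feature_vector):
    """Predicted activation for one word: y_v = sum_i c_vi * f_i(w)."""
    return feature_vector @ coef

# Synthetic stand-in data (NOT real fMRI data): 60 words, 25 features, 200 voxels.
rng = np.random.default_rng(0)
n_words, n_features, n_voxels = 60, 25, 200
features = rng.random((n_words, n_features))
true_coef = rng.normal(size=(n_features, n_voxels))
images = features @ true_coef + 0.01 * rng.normal(size=(n_words, n_voxels))

# Train on 58 words, hold out 2, mirroring the "58 other words" setup in Fig. 2.
train = np.arange(58)
held_out = np.array([58, 59])
coef = fit_coefficients(features[train], images[train])
predicted = np.stack([predict_image(coef, features[k]) for k in held_out])

# Compare each predicted image with the observed one via Pearson correlation.
correlations = [
    np.corrcoef(predicted[j], images[k])[0, 1] for j, k in enumerate(held_out)
]
```

On this synthetic data the held-out correlations are close to 1 because the data were generated from an exactly linear model with little noise. Real fMRI data would be far noisier, as the 0.28 and 0.14 thresholds in Fig. 3 suggest.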
To fully specify a model within this computational modeling framework, one must first define a set of intermediate semantic features f_1(w) f_2(w) … f_n(w) to be extracted from the text corpus. In this paper, each intermediate semantic feature is defined in terms of the co-occurrence statistics of the input stimulus word w with a particular other word (e.g., “taste”) or set of words (e.g., “taste,” “tastes,” or “tasted”) within the text corpus. The model is trained by the application of multiple regression to these features f_i(w) and the observed fMRI images, so as to obtain maximum-likelihood estimates for the model parameters c_vi (26). Once trained, the computational model can be evaluated by giving it words outside the training set and comparing its predicted fMRI images for these words with observed fMRI data.
This computational modeling framework is based on two key theoretical assumptions. First, it assumes the semantic features that distinguish the meanings of arbitrary concrete nouns are reflected in the statistics of their use within a very large text corpus. This assumption is drawn from the field of computational linguistics, where statistical word distributions are frequently used to approximate the meaning of documents and words (14–17). Second, it assumes that the brain activity observed when thinking about any concrete noun can be derived as a weighted linear sum of contributions from each of its semantic features. Although the correctness of this linearity assumption is debatable, it is consistent with the widespread use of linear models in fMRI analysis (27) and with the assumption that fMRI activation often reflects a linear superposition of contributions from different sources. Our theoretical framework does not take a position on whether the neural activation encoding meaning is localized in particular cortical regions. Instead, it considers all cortical voxels and allows the training data to determine which locations are systematically modulated by which aspects of word meanings.
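The linear model described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes ordinary least squares as the regression step, uses synthetic noise-free data, and the names `fit_signatures` and `predict_image` as well as the voxel count are hypothetical. Only the numbers 58 (training words) and 25 (features) come from the text.

```python
import numpy as np

def fit_signatures(F, Y):
    """Least-squares fit of Y ≈ F @ C.

    F: (n_words, n_features) feature values f_i(w) for the training words.
    Y: (n_words, n_voxels) observed images, one row per training word.
    Returns C with shape (n_features, n_voxels); C[i, v] plays the role of c_vi.
    """
    C, *_ = np.linalg.lstsq(F, Y, rcond=None)
    return C

def predict_image(f_w, C):
    """Predicted image for one word: weighted sum of the feature signatures."""
    return f_w @ C

# Synthetic demo; the voxel count is made up for illustration.
rng = np.random.default_rng(0)
n_train, n_features, n_voxels = 58, 25, 200
C_true = rng.normal(size=(n_features, n_voxels))
F_train = rng.random(size=(n_train, n_features))
Y_train = F_train @ C_true               # noise-free "observed" images

C_hat = fit_signatures(F_train, Y_train)
f_new = rng.random(n_features)           # features of a held-out word
y_pred = predict_image(f_new, C_hat)     # one predicted value per voxel
```

Because the synthetic data is noise-free and there are more training words than features, the fitted signatures match the generating ones; with real fMRI data the fit would only be approximate.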
Results. We evaluated this computational model using fMRI data from nine healthy, college-age participants who viewed 60 different word-picture pairs presented six times each. Anatomically defined regions of interest were automatically labeled according to the methodology in (28). The 60 randomly ordered stimuli included five items from each of 12 semantic categories (animals, body parts, buildings, building parts, clothing, furniture, insects, kitchen items, tools, vegetables, vehicles, and other man-made items). A representative fMRI image for each stimulus was created by computing the mean fMRI response over its six presentations, and the mean of all 60 of these representative images was then subtracted from each [for details, see (26)].
To instantiate our modeling framework, we first chose a set of intermediate semantic features. To be effective, the intermediate semantic features must simultaneously encode the wide variety of semantic content of the input stimulus words and factor the observed fMRI activation into more primitive com-
[Figure 2 panels: (A) Predicted “celery” = 0.84 “eat” + 0.35 “taste” + 0.32 “fill” + ...; (B) Predicted and observed images for “celery” and “airplane”; color scale: high, average, below average]
Fig. 2. Predicting fMRI images for given stimulus words. (A) Forming a prediction for participant P1 for the stimulus word “celery” after training on 58 other words. Learned c_vi coefficients for 3 of the 25 semantic features (“eat,” “taste,” and “fill”) are depicted by the voxel colors in the three images at the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” is shown to the left of their respective images [e.g., the value for “eat (celery)” is 0.84]. The predicted activation for the stimulus word [shown at the bottom of (A)] is a linear combination of the 25 semantic fMRI signatures, weighted by their co-occurrence values. This figure shows just one horizontal slice [z = –12 mm in Montreal Neurological Institute (MNI) space] of the predicted three-dimensional image. (B) Predicted and observed fMRI images for “celery” and “airplane” after training that uses 58 other words. The two long red and blue vertical streaks near the top (posterior region) of the predicted and observed images are the left and right fusiform gyri.
[Figure 3 panels: A and B (Participant P5); C (Mean over participants)]
Fig. 3. Locations of most accurately predicted voxels. Surface (A) and glass brain (B) rendering of the correlation between predicted and actual voxel activations for words outside the training set for participant P5. These panels show clusters containing at least 10 contiguous voxels, each of whose predicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout the cortex and located in the left and right occipital and parietal lobes; left and right fusiform, postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anterior cingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nine participants. This panel represents clusters containing at least 10 contiguous voxels, each with average correlation of at least 0.14.
30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org, p. 1192, RESEARCH ARTICLES
[Figure: matrix Y (check marks) and tensor X]
CMU SCS
Coupled Matrix-Tensor Factorization (CMTF)
$X \approx a_1 \circ b_1 \circ c_1 + \dots + a_F \circ b_F \circ c_F$
$Y \approx a_1 d_1^T + \dots + a_F d_F^T$
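The coupled model on this slide, in which the tensor X and the matrix Y share the factor vectors a_1 … a_F, can be fitted with alternating least squares. The following is a minimal dense sketch under that assumption; it is not the Turbo-SMT or CMTF-OPT implementation, and the function names, sizes, and iteration count are hypothetical.

```python
import numpy as np

def cmtf_loss(X, Y, A, B, C, D):
    """Squared error of X ≈ sum_f a_f∘b_f∘c_f plus that of Y ≈ A D^T."""
    X_hat = np.einsum('if,jf,kf->ijk', A, B, C)
    return float(np.sum((X - X_hat) ** 2) + np.sum((Y - A @ D.T) ** 2))

def cmtf_als(X, Y, F, n_iter=30, seed=0):
    """Alternating least squares for a 3-way tensor X coupled with matrix Y on mode 1."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.normal(size=(n, F)) for n in X.shape)
    D = rng.normal(size=(Y.shape[1], F))
    history = [cmtf_loss(X, Y, A, B, C, D)]
    for _ in range(n_iter):
        # A is shared, so its update uses both X and Y.
        rhs = np.einsum('ijk,jf,kf->if', X, B, C) + Y @ D
        A = rhs @ np.linalg.pinv((B.T @ B) * (C.T @ C) + D.T @ D)
        # B and C only see the tensor; D only sees the matrix.
        B = np.einsum('ijk,if,kf->jf', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,if,jf->kf', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        D = Y.T @ A @ np.linalg.pinv(A.T @ A)
        history.append(cmtf_loss(X, Y, A, B, C, D))
    return A, B, C, D, history

# Small synthetic example with F = 2 shared components (sizes are made up).
rng = np.random.default_rng(42)
A0, B0, C0, D0 = (rng.normal(size=(n, 2)) for n in (8, 7, 6, 5))
X_demo = np.einsum('if,jf,kf->ijk', A0, B0, C0)
Y_demo = A0 @ D0.T
A, B, C, D, history = cmtf_als(X_demo, Y_demo, F=2, n_iter=20, seed=1)
```

Each update solves its least-squares subproblem exactly, so the recorded loss should not increase from one sweep to the next. A practical implementation for data of the size discussed in the talk would work on sparse storage instead of dense arrays.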
CMU SCS
Neuro-semantics
[Figure: brain-activation maps for Group 1, Group 2, Group 3, and Group 4, each shown with its nouns and questions; one map is labeled "Premotor Cortex"]
Nouns: beetle, pants, bee. Questions: can it cause you pain? do you see it daily? is it conscious?
Nouns: bear, cow, coat. Questions: does it grow? is it alive? was it ever alive?
Nouns: glass, tomato, bell. Questions: can you pick it up? can you hold it in one hand? is it smaller than a golfball?
Nouns: bed, house, car. Questions: does it use electricity? can you sit on it? does it cast a shadow?
Figure 4: Turbo-SMT finds meaningful groups of words, questions, and brain regions that are (both negatively and positively) correlated. For instance, Group 3 refers to small items that can be held in one hand, such as a tomato or a glass, and its activation pattern is very different from that of Group 1, which mostly refers to insects, such as bee or beetle. Additionally, Group 3 shows high activation in the premotor cortex, which is associated with the concepts of that group.
… $v_1$ and $v_2$ which were withheld from the training data, the leave-two-out scheme measures prediction accuracy by the ability to choose which of the observed brain images corresponds to which of the two words. After mean-centering the vectors, this classification decision is made according to the following rule:
$\|v_1 - \hat{v}_1\|_2 + \|v_2 - \hat{v}_2\|_2 < \|v_1 - \hat{v}_2\|_2 + \|v_2 - \hat{v}_1\|_2$
Although our approach is not designed to make predictions, preliminary results are very encouraging: using only F=2 components, for the noun pair closet/watch we obtained mean accuracy of about 0.82 for 5 out of the 9 human subjects. Similarly, for the pair knife/beetle, we achieved accuracy of about 0.8 for a somewhat different group of 5 subjects. For the rest of the human subjects, the accuracy is considerably lower; however, it may be the case that brain activity predictability varies between subjects, a fact that requires further investigation.
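The leave-two-out decision rule can be written out directly. This sketch assumes that "mean-centering" means subtracting each vector's own mean, which the excerpt does not spell out; the function names and the toy vectors are hypothetical.

```python
import math

def center(v):
    """Subtract the vector's own mean (an assumption about the centering step)."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def leave_two_out_correct(v1, v2, v1_hat, v2_hat):
    """True if the matched assignment is closer than the swapped one."""
    v1, v2, v1_hat, v2_hat = (center(v) for v in (v1, v2, v1_hat, v2_hat))
    matched = math.dist(v1, v1_hat) + math.dist(v2, v2_hat)
    swapped = math.dist(v1, v2_hat) + math.dist(v2, v1_hat)
    return matched < swapped

# Toy observed images (v1, v2) and predictions (v1_hat, v2_hat).
v1, v2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
v1_hat, v2_hat = [0.9, 0.1, 0.0], [0.1, 0.8, 0.1]
good = leave_two_out_correct(v1, v2, v1_hat, v2_hat)
bad = leave_two_out_correct(v1, v2, v2_hat, v1_hat)
```

Accuracy for a pair of held-out words is then the fraction of trials in which the rule returns True.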
5 Experiments
We implemented Turbo-SMT in Matlab. Our implementation of the code is publicly available (footnote 5). For the parallelization of the algorithm, we used Matlab’s Parallel Computing Toolbox. For tensor manipulation, we used the Tensor Toolbox for Matlab [7], which is optimized especially for sparse tensors (but works very well for dense ones too). We use the ALS and the CMTF-OPT [5] algorithms as baselines, i.e., we compare Turbo-SMT when using one of those algorithms as its core CMTF implementation, against the plain execution of those algorithms. We implemented our version of the ALS algorithm, and we used the CMTF Toolbox (footnote 6) implementation of CMTF-OPT. We use CMTF-OPT for higher ranks, since that particular algorithm is more accurate than ALS, and is the state of the art. All experiments were carried out on a machine with 4 Intel Xeon E74850 2.00GHz, and 512Gb of RAM. Whenever we conducted multiple iterations of an experiment (due to the randomized nature of Turbo-SMT), we report error-bars along the plots. For all the following experiments we used either portions of the BrainQ dataset, or the whole dataset.
5.1 Speedup As we have already discussed in the Introduction and shown in Fig. 1, Turbo-SMT achieves a speedup of 50-200 on the BrainQ dataset. For all cases, the approximation cost is either the same as the baselines, or is larger by a small factor, indicating that Turbo-SMT is both fast and accurate. Key facts that …
Footnote 5: http://www.cs.cmu.edu/~epapalex/src/turbo_smt.zip
Footnote 6: http://www.models.life.ku.dk/joda/CMTF_Toolbox
CMU SCS Neuro-semantics
Small items -> Premotor cortex
✔ Unsupervised
✔ Matches intuition
CMU SCS Neuro-semantics
Evangelos Papalexakis, Tom Mitchell, Nicholas Sidiropoulos, Christos Faloutsos, Partha Pratim Talukdar, Brian Murphy, Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200x, SDM 2014
Small items -> Premotor cortex
CMU SCS
Roadmap
• Applications – pattern discovery
  – Brain scans – coupled matrix-tensor factorization
  – Power grid
• Applications – anomaly detection
• Algorithms
• Conclusions
CMU SCS
PowerCast
Mining electric power data
Hyun Ah Song, Bryan Hooi, Marko Jereminov, Amritanshu Pandey, Larry Pileggi, and Christos Faloutsos, PowerCast: Mining and Forecasting Power Grid Sequences, PKDD’17, Skopje, FYROM
CMU SCS
Problem definition
• Given: real and imaginary current and voltage
• Forecast: power demand in the future, and
• Guess: how the forecasts will change under various scenarios (e.g., population drops in half)
[Figure: the given sequences and the PowerCast forecasts for three scenarios: "What-if: normal condition?", "What-if: population drops?", "What-if: more factories are built?"]
CMU SCS
Domain knowledge
[Figure: current drawn with real part Ir and imaginary part Ii; labels: resistors, coils/capacitors]
CMU SCS
PowerCast
DETAILS
[Figure: the PowerCast pipeline]
• Tensorize: (Ir, Ii, Vr, Vi) arranged by time of the day × days × …
• Transfer to expert domain: (Ir, Ii, Vr, Vi) -> (G, B, αr, αi), with
  $I_r(t) = G(t) V_r(t) - B(t) V_i(t) + \alpha_r(t)$
  $I_i(t) = B(t) V_r(t) + G(t) V_i(t) + \alpha_i(t)$
• Tensor decomposition
• Tensor extension (forecast): extension by AR or SAR or …; forecast (days)
• Back to original form (from BIG domain): (G, B, αr, αi) -> (Ir, Ii, Vr, Vi)
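The "transfer to expert domain" step relates current and voltage through the parameters G, B and two offset terms. The sketch below shows one way to estimate those parameters and to map them back to currents. It assumes the parameters are constant over a short window of samples and are fitted by least squares, which is an assumption of this example and not necessarily what PowerCast does; all names and numbers are hypothetical.

```python
import numpy as np

def fit_big(Vr, Vi, Ir, Ii):
    """Least-squares estimate of (G, B, alpha_r, alpha_i) over one window.

    Model, per sample t:
        Ir = G*Vr - B*Vi + alpha_r
        Ii = B*Vr + G*Vi + alpha_i
    """
    Vr, Vi, Ir, Ii = (np.asarray(x, dtype=float) for x in (Vr, Vi, Ir, Ii))
    ones, zeros = np.ones_like(Vr), np.zeros_like(Vr)
    rows_r = np.column_stack([Vr, -Vi, ones, zeros])   # equations for Ir
    rows_i = np.column_stack([Vi, Vr, zeros, ones])    # equations for Ii
    design = np.vstack([rows_r, rows_i])
    target = np.concatenate([Ir, Ii])
    params, *_ = np.linalg.lstsq(design, target, rcond=None)
    return tuple(params)

def big_to_current(G, B, alpha_r, alpha_i, Vr, Vi):
    """Back to the original form: currents implied by the parameters."""
    return G * Vr - B * Vi + alpha_r, B * Vr + G * Vi + alpha_i

# Synthetic window of 24 samples (values are made up).
rng = np.random.default_rng(0)
Vr = rng.uniform(0.9, 1.1, size=24)
Vi = rng.uniform(-0.1, 0.1, size=24)
true_params = (0.5, -0.2, 1.0, 0.3)
Ir, Ii = big_to_current(*true_params, Vr, Vi)
estimated = fit_big(Vr, Vi, Ir, Ii)
```

With noise-free synthetic data the estimate recovers the generating parameters; on measured data the fit would be approximate and the window length becomes a tuning choice.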
CMU SCS
Tensor factors (concepts)
[Figure: the 1st and 2nd components of each factor, for G and B; plot annotations: weekends, 9am, 4pm, 6pm]
• Long-term-concepts (u_r)
• Daily-concepts (v_r)
• User-profile-concepts (w_r)
• Component labels: “background”, “human-activities”
CMU SCS
Forecast
PowerCast forecasts 24 steps (1 day) ahead on CMU data more accurately than its competitors.
CMU SCS
Anomaly detection & explanation
Anomaly!
Q: What happened?
A: ‘background’ component increased!
[Figure: 1st and 2nd components (“background”, “human-activities”) for G and B; Case 1? Case 2?]
CMU SCS
Anomaly detection & explanation
Anomaly!
Q: What happened?
A: ‘background’ component increased!
Case 1? Case 2?
80ºF 88ºF
CMU SCS
Roadmap
• Applications – pattern discovery
• Applications – anomaly detection
  – Phone-call data
  – Intrusion detection
• Algorithms
• Conclusions
CMU SCS
Anomalies in phone-call data
• PARAFAC decomposition
• Results for who-calls-whom-when
  – 4M x 15 days
[Figure: the caller × callee tensor over time written as a sum of components]
CMU SCS
Anomaly detection in time-evolving graphs
• Anomalous communities in phone call data:
  – European country, 4M clients, data over 2 weeks
1 caller, 5 receivers, 4 days of activity
~200 calls to EACH receiver on EACH day!
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.
CMU SCS
Roadmap
• Applications – pattern discovery
• Applications – anomaly detection
  – Phone-call data
  – Intrusion detection
• Algorithms
• Conclusions
CMU SCS
ParCube at Work: Port-Scanning
LBNL Network Traffic. This dataset consists of (source, destination, port #) triplets, where each value of the corresponding tensor is the number of packets sent. The snapshot of the dataset we used formed a 65170 × 65170 × 65327 tensor of 27269 non-zeros. We ran Algorithm 3 using s = 5 and r = 10 and we were able to identify what appears to be a port-scanning attack: the component shown in Fig. 9 contains only one source address (addr. 29571), contacting one destination address (addr. 30483) using a wide range of near-consecutive ports (while sending the same amount of packets to each port), a behaviour which should certainly raise a flag to the network administrator, indicating a possible port-scanning attack.
[Figure: three panels (Src, Dst, Port) titled "Port Scanning Attack"]
Fig. 9. Anomaly on the LBNL data: We have one source address (addr. 29571), contacting one destination address (addr. 30483) using a wide range of near-consecutive ports, possibly indicating a port scanning attack.
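To make the data layout concrete, the (source, destination, port) count tensor can be stored sparsely as a dictionary of non-zeros. The sketch below builds such a tensor from packet triplets and counts distinct ports per (source, destination) pair. It only illustrates the input and the kind of pattern the decomposition surfaced; it is not ParCube, and the port numbers and the "normal" traffic are made up. Only the two addresses come from the text.

```python
from collections import Counter, defaultdict

def build_tensor(packets):
    """Sparse count tensor: {(src, dst, port): number of packets}."""
    tensor = Counter()
    for src, dst, port in packets:
        tensor[(src, dst, port)] += 1
    return tensor

def ports_per_pair(tensor):
    """Number of distinct ports used by each (src, dst) pair."""
    ports = defaultdict(set)
    for src, dst, port in tensor:
        ports[(src, dst)].add(port)
    return {pair: len(p) for pair, p in ports.items()}

# One source probing 50 consecutive ports, plus a little ordinary traffic.
scan = [(29571, 30483, port) for port in range(1000, 1050)]
normal = [(1, 2, 80), (1, 2, 80), (3, 4, 443)]
tensor = build_tensor(scan + normal)
fan_out = ports_per_pair(tensor)
most_ports = max(fan_out, key=fan_out.get)
```

In this toy input the scanning pair stands out by its port fan-out alone; the tensor decomposition in the slide finds the same kind of structure without being told which pair to look at.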
Facebook Wall posts. This dataset (footnote 5) first appeared in [25]; the specific part of the dataset we used consists of triplets of the form (Wall owner, Poster, day), where the Poster created a post on the Wall owner’s Wall on the specified timestamp. By choosing daily granularity, we formed a 63891 × 63890 × 1847 tensor, comprised of 737778 non-zero entries; subsequently, we ran Algorithm 3 using s = 100 and r = 10. In Figure 10 we present our most surprising findings: on the left subfigure, we demonstrate what appears to be the Wall owner’s birthday, since many posters posted on a single day on this person’s Wall; this event may well be characterized as an “anomaly”. On the right subfigure, we demonstrate what “normal” Facebook activity looks like.
NELL. This dataset consists of triplets of the form (noun-phrase, noun-phrase, context), which form a tensor with assorted modes of size 14545 × 14545 × 28818 and 76879419 non-zeros, and as values the number of occurrences of each triplet. The context phrase may be just a verb or a whole sentence. After computing the PARAFAC decomposition of the tensor using ParCube with s = 500, and r = 10 repetitions, we computed the noun-phrase similarity matrix $AA^T + BB^T$ and …
Footnote 5: Download Facebook at http://socialnetworks.mpi-sws.org/data-wosn2009.html
[Figure: LBNL Network Data tensor (Src × Dst × Port) ≅ a sum of components]
Papalexakis et al. ECML-PKDD 2012
CMU SCS
Roadmap
• Applications – pattern discovery
• Applications – anomaly detection
• Algorithms
  – ParCube (and TurboSMT)
  – S-HOT for higher order Tucker
• Conclusions
CMU SCS
ParCube/Turbo-SMT: Triple-sparse Parallel Tensor & Coupled Decomposition
[Figure: from tensor X, smaller tensors X1 … Xr are formed; each is decomposed (≈), and the resulting factors are combined by FACTORMERGE]
Papalexakis et al. ECML-PKDD 2012 / SDM 2014
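The first stage of this pipeline, forming a smaller tensor from X, can be sketched as biased sampling of indices in every mode. The details below are assumptions of this example, not a statement of the published algorithm: indices are drawn without replacement with probability proportional to the sum of absolute values along that index, and roughly a 1/s fraction of each mode is kept. The function name is hypothetical, a dense array stands in for the sparse tensor, and the decomposition and FACTORMERGE steps are not shown.

```python
import numpy as np

def biased_sample(X, s, rng):
    """Return a sub-tensor of X and the sampled index set of each mode."""
    index_sets = []
    for mode in range(X.ndim):
        other_axes = tuple(a for a in range(X.ndim) if a != mode)
        weight = np.abs(X).sum(axis=other_axes)        # marginal "mass" per index
        n_keep = min(max(1, X.shape[mode] // s), np.count_nonzero(weight))
        chosen = rng.choice(X.shape[mode], size=n_keep, replace=False,
                            p=weight / weight.sum())
        index_sets.append(np.sort(chosen))
    return X[np.ix_(*index_sets)], index_sets

# Toy dense tensor; the sampling factor s = 5 matches one value used in the slides.
rng = np.random.default_rng(0)
X_full = rng.random((20, 30, 40))
X_small, kept = biased_sample(X_full, s=5, rng=rng)
```

Repeating the sampling r times gives r small tensors that can be decomposed independently, which is where the parallelism in the slide comes from.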
CMU SCS
Speedup
[Figure: relative error and relative run time for sample sizes 1/20, 1/10, 1/5, 1/2 with 8 workers; Baseline (ALS): ~1 day; 100x faster]
Machine: 4 Intel Xeon E74850, 512Gb RAM, Fedora 14
Data size: ~500Mb
CMU SCS
Roadmap
• Applications – pattern discovery
• Applications – anomaly detection
• Algorithms
  – ParCube (and TurboSMT)
  – S-HOT for higher order Tucker
• Conclusions
CMU SCS
S-HOT: Scalable High-Order Tucker Decomposition
• Tucker Decomposition:
[Figure: tensor X, core tensor C, and factor matrices A(1), A(2), …]
Jinoh Oh, Kijung Shin, Evangelos E. Papalexakis, Christos Faloutsos, and Hwanjo Yu, S-HOT: Scalable High-Order Tucker Decomposition, WSDM 2017
CMU SCS
High-Order Tucker Decomposition (Example)
• Input tensor
  – X: a 5-way sparse tensor (size: 1M × … × 1M, #non-zeros: 100M)
• Output tensors:
  – C: a 5-way core tensor (size: 10 × … × 10, #non-zeros: ~100K)
  – A(1), …, A(5): factor matrices (size: 1M × 10, #non-zeros: ~10M)
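The sizes in this example follow from simple arithmetic on the dimensionality I = 1M, the rank J = 10, and the order N = 5. The sketch below reproduces them. The third quantity, an I × J^(N-1) intermediate obtained by multiplying X with all factor matrices except one, is an assumption of this example about why naive Tucker-ALS runs out of memory; the function name is hypothetical.

```python
def tucker_sizes(I, J, N):
    """Entry counts for an N-way Tucker model with mode size I and rank J."""
    core = J ** N                    # core tensor C: J x ... x J
    factor = I * J                   # one factor matrix A(n): I x J
    intermediate = I * J ** (N - 1)  # assumed dense intermediate in naive ALS
    return core, factor, intermediate

core, factor, intermediate = tucker_sizes(I=10**6, J=10, N=5)
```

Under these assumptions the core holds 100K entries and each factor matrix 10M, matching the slide, while the intermediate would hold 10^10 entries, far more than the 100M non-zeros of the input.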
CMU SCS
High-Order Tucker Decomposition (Main idea)
Huge & dense -> DON’T materialize it
CMU SCS
Scalability of S-HOT
• Baselines: Standard ALS (Naïve), & (opt)
• I = 1M; J=10; M=1B; N=5 modes
CMU SCS
Discovery using S-HOT
• Microsoft Academic Graph (42M papers × 25K venues × 115M authors × 54K keywords)
CMU SCS
Roadmap
• Applications – pattern discovery
• Applications – anomaly detection
• Algorithms
  – ParCube (and TurboSMT)
  – S-HOT for higher order Tucker
• Conclusions
CMU SCS
Thanks
Thanks to: NSF IIS-0705359, IIS-0534205, CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, Google, INTEL, HP, iLab
Disclaimer: All opinions are mine; not necessarily reflecting the opinions of the funding agencies
CMU SCS
CONCLUSION#1
• MANY applications for tensors
CMU SCS
CONCLUSION#2
• Domain-experts: valuable
Q: What happened?
Case 1? Case 2?
80ºF 88ºF
Thank you!