Post on 03-Nov-2014
November 18th, 2010, Frankfurt, Germany
Latent Semantics and Social Interaction
Fridolin Wild
KMi, The Open University
<2>
Outline
Context & Framing Theories
Latent Semantic Analysis (LSA)
(Social) Network Analysis (S/NA)
Meaningful Interaction Analysis (MIA)
Outlook: Analysing Practices
<3>
Context & Theories
<4>
what is meaning
Information
Meaning could be the quality of a certain signal.
Meaning could be a logical abstractor = a release mechanism.
<5>
meaning is social
Network effects make a network of shared understandings more valuable as it grows, allowing e.g. ‘distributed cognition’.
Developing a shared understanding is a natural and necessary process, because language underspecifies meaning: future understanding builds on it.
At the same time: linguistic relativity (Sapir-Whorf hypothesis): our own language culture restricts our thinking.
<6>
<7>
Concepts & Competence
things we can (not) construct from language
Tying shoelaces
Douglas Adams’ ‘The Meaning of Liff’:
Epping: The futile movements of forefingers and eyebrows used when failing to attract the attention of waiters and barmen.
Shoeburyness: The vague uncomfortable feeling you get when sitting on a seat which is still warm from somebody else's bottom.
I have been convincingly Sapir-Whorfed by this book.
<8>
A “Semantic Community”
Associative Closeness
Concept (disambiguated term)
Person
Social relation
LSA SNA
<9>
LSA
<10>
Latent Semantic Analysis
Two-mode factor analysis of the co-occurrences in the terminology
Results in a latent-semantic vector space
“Humans learn word meanings and how to combine them into passage meaning through experience with ~paragraph unitized verbal environments.”
“They don’t remember all the separate words of a passage; they remember its overall gist or meaning.”
“LSA learns by ‘reading’ ~paragraph unitized texts that represent the environment.”
“It doesn’t remember all the separate words of a text; it remembers its overall gist or meaning.”
-- Landauer, 2007
<11>
latent-semantic space
Singular values (factors, dims, …)
Term Loadings
Document Loadings
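The three labelled parts correspond to the factors of a singular value decomposition of the text matrix. As a sketch in the deck's own M = TSD’ notation (the subscript 2 stands for keeping only the two largest singular values, as in the reduced matrix used later in the deck):

```latex
M = T \, S \, D^{\top}, \qquad M_2 = T \, S_2 \, D^{\top}
```

Here T holds the term loadings, D the document loadings, and S is the diagonal matrix of singular values; S_2 is S with all but the two largest values set to zero.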
<12>
The meaning of "life" =
0.0465 -0.0453 -0.0275 -0.0428 0.0166 -0.0142 -0.0094 0.0685 0.0297 -0.0377
-0.0166 -0.0165 0.0270 -0.0171 0.0017 0.0135 -0.0372 -0.0045 -0.0205 -0.0016
0.0215 0.0067 -0.0302 -0.0214 -0.0200 0.0462 -0.0371 0.0055 -0.0257 -0.0177
-0.0249 0.0292 0.0069 0.0098 0.0038 -0.0041 -0.0030 0.0021 -0.0114 0.0092
-0.0454 0.0151 0.0091 0.0021 -0.0079 -0.0283 -0.0116 0.0121 0.0077 0.0161
0.0401 -0.0015 -0.0268 0.0099 -0.0111 0.0101 -0.0106 -0.0105 0.0222 0.0106
0.0313 -0.0091 -0.0411 -0.0511 -0.0351 0.0072 0.0064 -0.0025 0.0392 0.0373
0.0107 -0.0063 -0.0006 -0.0033 -0.0403 0.0481 0.0082 -0.0587 -0.0154 -0.0342
-0.0057 -0.0141 0.0340 -0.0208 -0.0060 0.0165 -0.0139 0.0060 0.0249 -0.0515
0.0083 -0.0303 -0.0070 -0.0033 0.0408 0.0271 -0.0629 0.0202 0.0101 0.0080
0.0136 -0.0122 0.0107 -0.0130 -0.0035 -0.0103 -0.0357 0.0407 -0.0165 -0.0181
0.0369 -0.0295 -0.0262 0.0363 0.0309 0.0180 -0.0058 -0.0243 0.0038 -0.0480
0.0008 -0.0064 0.0152 0.0470 0.0071 0.0183 0.0106 0.0377 -0.0445 0.0206
-0.0084 -0.0457 -0.0190 0.0002 0.0283 0.0423 -0.0758 0.0005 0.0335 -0.0693
-0.0506 -0.0025 -0.1002 -0.0178 -0.0638 0.0513 -0.0599 -0.0456 -0.0183 0.0230
-0.0426 -0.0534 -0.0177 0.0383 0.0095 0.0117 0.0472 0.0319 -0.0047 0.0534
-0.0252 0.0266 -0.0210 -0.0627 0.0424 -0.0412 0.0133 -0.0221 0.0593 0.0506
0.0042 -0.0171 -0.0033 -0.0222 -0.0409 -0.0007 0.0265 -0.0260 -0.0052 0.0388
0.0393 0.0393 0.0652 0.0379 0.0463 0.0357 0.0462 0.0747 0.0244 0.0598
-0.0563 0.1011 0.0491 0.0174 -0.0123 0.0352 -0.0368 -0.0268 -0.0361 -0.0607
-0.0461 0.0437 -0.0087 -0.0109 0.0481 -0.0326 -0.0642 0.0367 0.0116 0.0048
-0.0515 -0.0487 -0.0300 0.0515 -0.0312 -0.0429 -0.0582 0.0730 -0.0063 -0.0479
0.0230 -0.0325 0.0240 -0.0086 -0.0401 0.0747 -0.0649 -0.0658 -0.0283 -0.0184
-0.0297 -0.0122 -0.0883 -0.0138 -0.0072 -0.0250 -0.1139 -0.0172 0.0507 0.0252
0.0307 -0.0821 0.0328 0.0584 -0.0216 0.0117 0.0801 0.0186 0.0088 0.0224
-0.0079 0.0462 -0.0273 -0.0792 0.0127 -0.0568 0.0105 -0.0167 0.0923 -0.0843
0.0836 0.0291 -0.0201 0.0807 0.0670 0.0592 0.0312 -0.0272 -0.0207 0.0028
-0.0092 0.0385 0.0194 -0.0451 0.0002 -0.0041 0.0203 0.0313 -0.0093 -0.0444
0.0142 -0.0458 0.0223 -0.0688 -0.0334 -0.0361 -0.0636 0.0217 -0.0153 -0.0458
-0.0322 -0.0615 -0.0206 0.0146 -0.0002 0.0148 -0.0223 0.0471 -0.0015 0.0135
(Landauer, 2007)
<13>
Associative Closeness
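$$\cos\theta = \frac{\sum_{i=1}^{m} a_i b_i}{\sqrt{\sum_{i=1}^{m} a_i^2}\,\sqrt{\sum_{i=1}^{m} b_i^2}}$$
[Figure: Term 1, Document 1 and Document 2 drawn as vectors over the X dimension and Y dimension, with Angle 1 and Angle 2 marked]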
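The cosine between two vectors, which serves here as the measure of associative closeness, can be computed as in the following minimal sketch. The coordinates are hypothetical stand-ins for a term and two documents in a two-dimensional space; they are not taken from the slides.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical coordinates (X dimension, Y dimension), for illustration only.
term_1 = [1.0, 0.0]
document_1 = [1.0, 1.0]
document_2 = [0.0, 2.0]

print(cosine(term_1, document_1))  # about 0.707 (45 degree angle)
print(cosine(term_1, document_2))  # 0.0 (orthogonal vectors)
```

A smaller angle gives a cosine nearer to 1, so in this toy case Term 1 is closer to Document 1 than to Document 2.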
You need factor stability?
> Project using fold-ins!
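Folding in means projecting a new document into an existing space instead of recomputing the decomposition, so the factors stay fixed. A minimal sketch using the standard fold-in formula d̂ = vᵀ T_k S_k⁻¹; the function name `fold_in` and the toy values are assumptions made for illustration.

```python
def fold_in(doc_counts, term_loadings, singular_values):
    """Project a new document's term counts into an existing space.

    Computes d_hat = v^T * T_k * S_k^{-1} without recomputing the SVD.
    term_loadings has one row per term and one column per factor.
    """
    if len(doc_counts) != len(term_loadings):
        raise ValueError("need one count per term of the existing space")
    coords = []
    for j, singular_value in enumerate(singular_values):
        projection = sum(v * row[j] for v, row in zip(doc_counts, term_loadings))
        coords.append(projection / singular_value)
    return coords


# Toy space (assumed values): 3 terms, 2 factors.
# This is the exact SVD of M = [[2, 0], [0, 1], [0, 0]].
T_k = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
S_k = [2.0, 1.0]

new_doc = [2.0, 1.0, 5.0]  # the third term carries no weight in this space
print(fold_in(new_doc, T_k, S_k))  # [1.0, 1.0]
```

Folding in the first column of the toy M, `[2.0, 0.0, 0.0]`, returns `[1.0, 0.0]`, which is that document's own row of loadings.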
<14>
Example: Classic Landauer
{ M } =
Deerwester, Dumais, Furnas, Landauer, and Harshman (1990): Indexing by Latent Semantic Analysis, In: Journal of the American Society for Information Science, 41(6):391-407
Only the red terms appear in more than one document, so strip the rest.
term = feature
vocabulary = ordered set of features
TEXTMATRIX
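Building such a text matrix can be sketched as follows: count terms per document, then keep only the terms that occur in more than one document. The three short documents are hypothetical, and the sketch simply splits on whitespace without stop-word removal or stemming.

```python
from collections import Counter


def text_matrix(docs):
    """Build a term-by-document count matrix.

    Keeps only terms that occur in more than one document and returns
    (vocabulary, matrix) with one matrix row per vocabulary term.
    """
    tokenised = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vocabulary = sorted(term for term, df in doc_freq.items() if df > 1)
    matrix = [[tokens.count(term) for tokens in tokenised] for term in vocabulary]
    return vocabulary, matrix


# Hypothetical mini-corpus, for illustration only.
docs = [
    "graph minors survey",
    "graph of paths in trees",
    "random binary trees",
]
vocabulary, M = text_matrix(docs)
print(vocabulary)  # ['graph', 'trees']
print(M)           # [[1, 1, 0], [0, 1, 1]]
```

The vocabulary is the ordered set of features, and each row of `M` is one feature's counts across the documents.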
<15>
Reconstructed, Reduced Matrix
m4: Graph minors: A survey
<16>
doc2doc - similarities
Unreduced = pure vector space model
- Based on M = TSD’
- Pearson correlation over document vectors
Reduced
- Based on M2 = TS2D’
- Pearson correlation over document vectors
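The document-to-document comparison can be sketched as a Pearson correlation over the columns of the matrix. The same function applies whether the input is the unreduced M or the reconstructed, reduced M2; the small matrix below is a made-up example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def doc2doc(matrix):
    """Pairwise Pearson correlations between the document columns."""
    columns = [list(col) for col in zip(*matrix)]
    return [[pearson(a, b) for b in columns] for a in columns]


# Hypothetical term-by-document matrix: 4 terms (rows), 3 documents (columns).
M = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
]
similarities = doc2doc(M)
print(similarities[0][2])  # -1.0: documents 1 and 3 share no terms here
```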
<17>
(S)NA
<18>
Social Network Analysis
Existing for a long time (term coined 1954)
Basic idea: Actors and Relationships between them (e.g. Interactions)
Actors can be people (groups, media, tags, …)
Actors and Ties form a Graph (edges and nodes)
Within that graph, certain structures can be investigated
• Betweenness, Degree of Centrality, Density, Cohesion
• Structural Patterns can be identified (e.g. the Troll)
<19>
Constructing a network from raw data
forum postings
incidence matrix IM
adjacency matrix AM = IM x IM^T
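The step from incidence matrix to adjacency matrix is a product of the incidence matrix with its own transpose. A minimal sketch, assuming rows are authors and columns are forum threads (the data is invented for illustration):

```python
def adjacency_from_incidence(im):
    """Return AM = IM x IM^T for an actor-by-event incidence matrix."""
    n = len(im)
    return [
        [sum(a * b for a, b in zip(im[i], im[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: 3 authors (rows) x 3 forum threads (columns),
# 1 = the author posted in that thread.
IM = [
    [1, 1, 0],  # author A
    [1, 0, 0],  # author B
    [0, 1, 1],  # author C
]
AM = adjacency_from_incidence(IM)
print(AM)  # [[2, 1, 1], [1, 1, 0], [1, 0, 2]]
```

Each off-diagonal entry counts the threads two authors share. The diagonal counts an author's own threads and is usually set to zero before the matrix is read as a graph.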
<20>
Visualization: Sociogramme
<21>
Measuring Techniques (Sample)
Degree Centrality: number of (in/out) connections to others
Closeness: how close to all others
Betweenness: how often intermediary
Components: e.g. k-means cluster (k=3)
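The first three measures can be sketched for a small undirected, unweighted graph as follows. This is an illustrative implementation, not the tooling used for the slides: closeness is taken as (reachable nodes) / (sum of shortest-path distances), betweenness uses Brandes' algorithm, and the three-node graph is hypothetical.

```python
from collections import deque


def degree_centrality(graph):
    """Number of connections per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """(number of reachable nodes) / (sum of shortest-path distances)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Brandes' algorithm: how often a node lies on shortest paths."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order, dist = [], {source: 0}
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}  # number of shortest paths
        sigma[source] = 1
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):  # accumulate dependencies back to source
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from either end.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical network: B sits between A and C.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(degree_centrality(graph))       # {'A': 1, 'B': 2, 'C': 1}
print(closeness_centrality(graph)["B"])  # 1.0
print(betweenness_centrality(graph))  # {'A': 0.0, 'B': 1.0, 'C': 0.0}
```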
<22>
Example: Joint virtual meeting attendance (Flashmeeting co-attendance in the Prolearn Network of Excellence)
<23>
Example: Subscription structures in a blogging network (2nd trial of the iCamp project)
<24>
MIA
<25>
Meaningful Interaction Analysis (MIA)
Combines latent semantics with the means of network analysis
Allows for investigating associative closeness structures at the same time as social relations
In latent-semantic spaces only, or in spaces with additional and different (!) relations
<26>
The mathemagics behind Meaningful Interaction Analysis
<27>
Contextualised Doc & Term Vectors
T_k = left-hand sided matrix = ‘term loadings’ on the singular value
D_k = right-hand sided matrix = ‘document loadings’ on the singular value
Multiply them into the same space: V_T = T_k S_k and V_D = D_k^T S_k
Cosine Closeness Matrix over V_T and V_D = adjacency matrix = a graph
More: e.g. add author vectors V_A through cluster centroids or vector addition of their publication vectors; the closeness matrix is then taken over V_T, V_D and V_A
latent-semantic space
Speed: use existing space and fold in e.g. author vectors
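The steps on this slide (scale the term and document loadings by the singular values, add an author vector as a centroid, and turn cosine closeness into a graph) can be sketched as below. All numbers are made up, the helper names are hypothetical, and the 0.8 cut-off for drawing an edge is an assumption; the slide does not state a threshold.

```python
import math


def scale(loadings, singular_values):
    """Multiply each row of loadings by diag(S_k)."""
    return [[x * s for x, s in zip(row, singular_values)] for row in loadings]


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def centroid(rows):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def closeness_graph(vectors, threshold):
    """Adjacency matrix with an edge where cosine >= threshold."""
    names = list(vectors)
    adjacency = [
        [1 if a != b and cosine(vectors[a], vectors[b]) >= threshold else 0
         for b in names]
        for a in names
    ]
    return names, adjacency


# Assumed toy space with k = 2 factors.
S_k = [2.0, 1.0]
T_k = [[1.0, 0.0], [0.0, 1.0]]        # one row per term
D_k_T = [[0.8, 0.6], [0.6, -0.8]]     # one row per document (rows of D_k^T)

V_T = scale(T_k, S_k)                 # V_T = T_k S_k
V_D = scale(D_k_T, S_k)               # V_D = D_k^T S_k
V_A = centroid(V_D)                   # hypothetical author of both documents

vectors = {
    "term_1": V_T[0], "term_2": V_T[1],
    "doc_1": V_D[0], "doc_2": V_D[1],
    "author_1": V_A,
}
names, adjacency = closeness_graph(vectors, threshold=0.8)  # assumed cut-off
for name, row in zip(names, adjacency):
    print(name, row)
```

The resulting adjacency matrix can be passed to any network analysis routine, so terms, documents and authors appear as nodes of one graph.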
<28>
Network Analysis
<29>
MIA of the classic Landauer
<30>
Influencing Parameters (LSA)
Pearson(eu, österreich)
Pearson(jahr, wien)
<31>
<32>
<33>
<34>
Capturing traces in text: medical student case report
<35>
Internal latent-semantic graph structure (MIA output)
<36>
<37>
Outlook: Practices?
<38>
<39>
Mash-Up Personal Learning Environment
<40>
<41>
<42>
Conclusion
<43>
Conclusion
Both LSA and SNA alone are not sufficient for a modern representation theory
MIA provides one possible bridge between them
It is a powerful technique
And it is simple to use (in R)
<44>
#eof.