Timo Honkela in Metalithicum #5: Self-Organizing Map as a Means for Gaining Perspectives, Einsiedeln, 23rd of May, 2014
Self-Organizing Map as a Means for Gaining Perspectives
Timo Honkela
Metalithicum #5, Computation as literacy: Self-Organizing Maps
Einsiedeln, 23 May 2014
Part I:
The Self-Organizing Map
Teuvo Kohonen before the SOM
● School time interest in mathematics, physics, chemistry, psychology, radio technology, etc.
● Studies at Helsinki University of Technology in theoretical physics, PhD in 1962, Professor 1963-
● First designer of a computer in Finland (REFLAC), mid-1960s; keen interest in analog computers
● Visiting professor, University of Washington, 1968-69
● Research professor (funded by the Academy of Finland), 1975-
● Book “Associative Memory: A Systems-Theoretical Approach”, 1978
Anderson, James A., and Edward Rosenfeld, eds. Talking Nets: An Oral History of Neural Networks. MIT Press, 2000.
First SOM publications
Kohonen, T. (1981). Self-organized formation of generalized topological maps of observations in a physical system. Report TKK-F-A450, Helsinki University of Technology, Espoo, Finland.
Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics 43 (1): 59–69.
SOM introduction (Honkela 1997)
Milos Manic
“Poverty map” (Kaski & Kohonen)
“Pockets Full of Memories” (Legrady, Honkela et al.)
André Skupin
“Map of Mozart” (Rauber, Lidy & Mayer)
“WEBSOM” (Honkela, Kaski, Kohonen & Lagus)
Variants of the SOM
● Input
● Network structure
● Learning rule
– Information-theoretical
– Probabilistic
● Recurrent and recursive versions
● Operator maps for dynamic phenomena
● Output presentation and postprocessing (clustering, coloring, etc.)
● Etc.
Views into the SOM
● Vector quantization
● Dimensionality reduction (visualization)
● (Clustering)
● Cortical modeling
● Conceptualization (“semantification”)
● Cognitive function modeling
● Antidote against categorical thinking
● ...
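The vector quantization and dimensionality reduction views listed above can be made concrete with a very small SOM. The following is a minimal sketch, not the implementation behind any of the cited work: the function names `train_som` and `best_matching_unit`, the rectangular grid, the Gaussian neighbourhood and the linear decay schedules are all assumptions chosen for brevity.

```python
import math
import random


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns {(row, col): model vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # Initialise every map unit with a random model vector (assumed scheme).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Move the unit towards the input, scaled by grid proximity to the BMU.
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def best_matching_unit(weights, x):
    """Vector quantization step: grid position of the model vector closest to x."""
    return min(weights,
               key=lambda pos: sum((a - b) ** 2 for a, b in zip(weights[pos], x)))
```

Calling `best_matching_unit` on a trained map assigns each input to one grid position, which is the sense in which the map both quantizes the data and projects it onto two dimensions.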
Different kinds of input
Somervuo & Kohonen (1999): Self-organizing maps and learning vector quantization for feature sequences. Neural Processing Letters.
Different kinds of map structures
● Fixed topology (rectangular, hexagonal)
● Fixed unusual topology (e.g. portrait of Mozart)
● Different dimensionalities (1-, 2-, 3-,..., mixed)
● Growing neural gas
● Hierarchical maps
● Etc. etc.
Some other Kohonen algorithms
● Correlation matrix memories (1972)
● Median strings (1985)
● Learning Vector Quantization (1986)
● Dynamically Expanding Context (1986)
● Self-learning musical grammar (1989)
● Adaptive Subspace SOM (1996)
● Symbol string SOM (1998)
● Evolutionary SOM (1999)
● Self-organizing neural projections (2006)
Years are partly approximate
Part II:
Perspectives on language, cognition and human knowing
Classical example: Learning meaning from context:
Maps of words in Grimm fairy tales
Honkela, Pulkki & Kohonen 1995
Automated learning of word relations using self-organizing map on text context data
Map of Finnish Science
Chemistry
Physics and engineering
Biosciences
Medicine
Culture and society
A fully automated process from terminology extraction (Likey) to semantic space construction (SOM) without any manually constructed resources.
You can measure things that were not measurable before
A. Measuring meaning
Challenges:
“Language is BIG”
“Human INTERPRETATION is inherently involved”
Texts as input instead of measurements
Example:
Complexity of Finnish at the level of word forms
Kimmo Koskenniemi (2013): Johdatus kieliteknologiaan, sen merkitykseen ja sovelluksiin (Introduction to language technology, its significance and applications)
https://helda.helsinki.fi/bitstream/handle/10138/38503/kt-johd.pdf?sequence=1
More than 6000 languages, many more dialects
Billions of people
A large number of different cultures
A vast number of ways to relate language, concepts and the world to each other
(Image sources: blogs.state.gov, en.wikipedia.org)
Simulating processes of language emergence and communication 22
Language as a system
● Considering natural language as a signal and dynamic system at cognitive and social levels (also in its written form) rather than a symbolic and logical system
● Importance of embodiment (cf. e.g. Harnad) and embeddedness (cf. e.g. Edelman)
● Learning and pattern recognition processes are essential (as opposed to the theories presented e.g. by Chomsky, Fodor, Pinker); much of the learning is bound to be unsupervised
Predicate logic is not about meaning
● Formalisms like first-order predicate logic have widely been used as a basis for theories of meaning; consider also contemporary efforts such as Semantic Web
● These formalisms provide only limited means for creating in-depth theories of how language is understood
● Traditional logic provides means e.g. for modeling quantification, connectives, analytical truths and conceptual hierarchies
● However, many semantic phenomena are matters of degree. Various proposals that apply Bayesian probability theory or fuzzy sets deal with this.
Traditional AI & logic viewpoint
[Diagram: Agents, Language, Model of the world and World, joined by equality signs]
Pattern recognition
● Even these methodological extensions do not suffice if the pattern recognition processes are not taken into account
● The world is not straightforwardly experienced as discrete objects and events but there are complex underlying cognitive processes involved
Emergentist viewpoint (importance of pattern recognition and learning)
[Diagram: Agents, Language, World, Model of the world]
General communication system and measuring information (Shannon & Weaver)
[Diagram: INFORMATION SOURCE → MESSAGE → TRANSMITTER → SIGNAL → (NOISE SOURCE) → RECEIVED SIGNAL → RECEIVER → MESSAGE → DESTINATION]
Noisy channel model
$H = -\sum_i p_i \log p_i$
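As a small worked illustration of the entropy formula on this slide, the sketch below computes H for a given probability distribution and for the empirical symbol frequencies of a sequence. The function names and the choice of base-2 logarithms (bits) are assumptions; the slide does not fix a base.

```python
import math
from collections import Counter


def shannon_entropy(probabilities, base=2.0):
    """H = -sum_i p_i log p_i; outcomes with p_i = 0 contribute nothing."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


def empirical_entropy(symbols, base=2.0):
    """Entropy of the relative frequencies of the symbols in a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return shannon_entropy([n / total for n in counts.values()], base)
```

A fair coin gives one bit per symbol, while a source that always emits the same symbol gives zero; this quantity measures only Weaver's technical level, not meaning.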
Weaver on Shannon
● “Relative to the broad subject of communication, there seem to be problems at three levels. [...]
– LEVEL A. How accurately can the symbols of communication be transmitted? (The technical problem)
– LEVEL B. How precisely do the transmitted symbols convey the desired meaning? (The semantic problem)
– LEVEL C. How effectively does the received meaning affect conduct in the desired way? (The effectiveness problem)”
● “The semantic problems are concerned with the identity, or satisfactorily close approximation, in the interpretation of meaning by the receiver, as compared with the intended meaning of the sender.” (1949, p. 4)
Distributional hypothesis
● Two words are semantically similar to the extent that their contextual representations are similar (Miller & Charles 1991)
● The meaning of words is in their use (Wittgenstein)
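The distributional hypothesis can be tried out directly on a toy corpus. The sketch below is an illustration under stated assumptions, not a method from the talk: it represents each word by counts of the words seen within a small symmetric window (the window size and the names `context_vectors` and `cosine` are hypothetical) and compares words by cosine similarity.

```python
import math
from collections import Counter, defaultdict


def context_vectors(tokens, window=2):
    """For every word, count the words occurring within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0
```

Words that occur in similar contexts, such as two animal names in parallel sentences, end up with a higher cosine than words that share no contexts.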
Context is concretely relevant in physics
Meaning is contextual
red wine, red skin, red shirt
Gärdenfors: Conceptual Spaces
Hardin: Color for Philosophers
Meaning is contextual
SNOW -WHITE?
WHITE
Complex challenge: different contexts and cultures
“Shall I compare thee to a summer's day?”
Modeling distributional similarity: word space models
● Word space models represent meaning as points or areas in a high-dimensional vector space
– Self-Organizing Semantic Maps (Ritter and Kohonen 1989)
– LSA (Landauer & Dumais 1997)
– HAL (Lund & Burgess 1996)
– Conceptual spaces (Gärdenfors 2000)
– Word ICA (Honkela, Hyvärinen & Väyrynen 2004)
– etc. etc.
Language as dimensionality reduction?
ICA of word contexts; nonlinearity through thresholding
Comparison with SVD/LSA
Effect of sparseness and meaningful emergent components
Data: TOEFL tests
(Väyrynen, Lindqvist, Honkela 2007)
[Plot: precision as a function of active dimensions, ICA vs. SVD]
Point of view from cognitive linguistics
● The meaning of linguistic symbols in the mind of the language users derives from the users' sensory perceptions and their interactions with the world and with each other.
● For example: the meaning of the word 'walk' involves
– what walking looks like
– what it feels like to walk and after having walked
– how the world looks when walking (e.g. objects approach at a certain speed, etc.)
– ...
Abstract vs concrete grounding
Ronald Langacker
[Diagram: research topics, tools and collaborators]
Topics: Motion capture, Animation, Image analysis, Video analysis, Robotics, Machine learning, Language learning, Socio-cognitive modeling, Symbol grounding, Reinforcement learning, Learning relations
Tools: Kinect, OptiTrack
People: Jorma Laaksonen, Tapio Takala, Klaus Förger, Harri Valpola, Oskar Kohonen, Paul Wagner, Markus Koskela, Xi Chen, Timo Honkela
goo.gl/UZnvH
Förger & Honkela, 2013
[Figure labels: WALKING, RUNNING]
Consider how different languages divide the conceptual space in different ways (cf. e.g. Melissa Bowerman et al.)
B. Measuring (inter)subjectivity
“Einsiedeln Abbey is a Benedictine monastery in the town of Einsiedeln in the Canton of Schwyz, Switzerland. The abbey is dedicated to Our Lady of the Hermits, the title being derived from the circumstances of its foundation, for the first inhabitant of the region was Saint Meinrad, a hermit. It is a territorial abbey and, therefore, not part of a diocese, subject to a bishop. It has been a major resting point on the Way of St. James for centuries.” (Wikipedia)
Objective facts?
Other points of view?
Non-linear projections next to Hotel Drei Könige
Meaning is subjective
● Good
● Fair
● Useful
● Scientific
● Democratic
● Sustainable
● etc.
A proper theory of meaning has to take this into account
Experiential grounding of human knowledge
Human understanding of the world, and of the relationship between language use and perception and action within the world, is based on a long, active and interactive learning process. The genotype gives a certain basis for this process, but it is mainly determined by the individual's interaction with the world, including other human beings and the social and cultural context.
Concept Formation and Communication - General Theory
Timo Honkela, Ville Könönen, Tiina Lindh-Knuutila, and Mari-Sanna Paukkeri. Simulating processes of concept formation and communication. Journal of Economic Methodology, 15(3):245–259, 2008.
$C_i$: $N$-dimensional metric concept space of agent $i$
$\lambda: C_i \times C_j \to \mathbb{R}$, $i \neq j$: a distance between two points in the concept spaces of different agents
$S_i$: symbol space, the vocabulary of agent $i$, consisting of discrete symbols
$\xi_i: s \in S_i \to C_i$: an individual mapping function from symbols to concepts
$\phi_i: S_i \to D$: an individual mapping from agent $i$'s vocabulary to the signal space $D$, with an inverse mapping $\phi_i^{-1}$ from the signal space to the symbol space
Observing $f_1$, and after a symbol selection process, agent 1 communicates a symbol $s^*$ to agent 2 as signal $d$. When agent 2 observes $d$, it maps it to some $s_2 \in S_2$ by using the function $\phi_2^{-1}$. Then it maps the symbol to some point in its concept space by using $\xi_2$. If this point is close to its observation $f_2$ in the sense of $\lambda$, the communication process has succeeded.
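The communication step described above can be sketched in a few lines. This is an illustrative toy, not the simulation of the cited paper. Several choices are assumptions: the class and function names, the use of dictionaries for the mappings, Euclidean distance as the distance between concept spaces, a nearest-concept rule for symbol selection, and a fixed threshold for deciding that a point is "close".

```python
import math


class Agent:
    def __init__(self, xi, phi):
        self.xi = xi                                    # symbol -> point in the concept space
        self.phi = phi                                  # symbol -> signal
        self.phi_inv = {d: s for s, d in phi.items()}   # inverse mapping: signal -> symbol

    def select_symbol(self, observation):
        """Assumed selection rule: the symbol whose concept lies nearest to the observation."""
        return min(self.xi, key=lambda s: math.dist(self.xi[s], observation))


def concept_distance(point_a, point_b):
    """Distance between points of two agents' concept spaces (Euclidean by assumption)."""
    return math.dist(point_a, point_b)


def communicate(sender, receiver, f1, f2, threshold=0.25):
    """Return (success, distance) for one communication act; distance is None if decoding fails."""
    s_star = sender.select_symbol(f1)       # agent 1 selects a symbol for its observation
    d = sender.phi[s_star]                  # the symbol is transmitted as a signal
    s2 = receiver.phi_inv.get(d)            # agent 2 decodes with its own inverse mapping
    if s2 is None:
        return False, None
    distance = concept_distance(receiver.xi[s2], f2)
    return distance <= threshold, distance
```

Because each agent carries its own symbol-to-concept mapping, the same signal can land at different points for different agents, which is the source of the communication failures the model is meant to study.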
GICA: Grounded Intersubjective Concept Analysis
Publication:
Timo Honkela, Juha Raitio, Krista Lagus, Ilari T. Nieminen, Nina Honkela, and Mika Pantzar. Subjects on objects in contexts: Using GICA method to quantify epistemological subjectivity. Proceedings of IJCNN 2012, International Joint Conference on Neural Networks, pp. 2875-2883, 2012.
Case: State of the Union Addresses
● Text mining is used to populate a Subject-Object-Context tensor
● The tensor entries are frequencies: how often a subject uses an object word in the context of a context word
– Context window of 30 words
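Populating such a tensor can be sketched as a sparse count table. The code below is a hedged illustration rather than the GICA implementation: the function name `soc_tensor`, the dictionary-of-counts representation, and the reading of the 30-word window as centred on the object word (15 tokens on each side) are all assumptions, since the slide does not specify how the window is placed.

```python
from collections import defaultdict


def soc_tensor(texts_by_subject, object_words, context_words, window=30):
    """Return a sparse tensor {(subject, object, context): count}.

    texts_by_subject maps each subject (e.g. a speaker) to a list of tokens.
    """
    half = window // 2  # assumed: the window is centred on the object word
    tensor = defaultdict(int)
    for subject, tokens in texts_by_subject.items():
        for i, token in enumerate(tokens):
            if token not in object_words:
                continue
            lo, hi = max(0, i - half), min(len(tokens), i + half + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in context_words:
                    tensor[(subject, token, tokens[j])] += 1
    return tensor
```

Fixing the object word (for example 'health') yields a subject-by-context matrix, which is the kind of slice that can then be fed to a SOM or another projection method.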
Analysis of the word 'health'
This is why unsupervised learning is, in most cases, better than supervised learning
Human-made categories cannot simply be taken as a ground truth
There are even a large number of well-grounded category systems, none of which has an objective status
Kuhn
Local … global
Relevance?
● A large proportion of modern human activity in its different forms (science, industry, society, culture, etc.) is based on the use of language
● There are at least 6000 languages in the world and many more dialects
● Each language has on the order of $10^5$ to $10^{10}$ different word forms
● Each word is understood differently by each speaker of that language at least to some degree
Relevance, cont'd
● The formal basis of practically all information systems does not take this basic phenomenon into account
● The assumption of shared meanings is simply not adequate
● Socio-cognitive modeling is needed
Language use and theory formation as social phenomena
[Diagram: theories and language use, each exhibiting regularity and variation, connected by the processes of data collection and generalization, description and harmonization, producing/creating, and learning/observing]
Emergence of a coherent lexicon in a community of interacting SOM-based agents
(Lindh-Knuutila, Lagus & Honkela, SAB'06)
Related to e.g. Steels and Vogt on language games
Survival and reinforcement learning in conceptual system evolution
(Honkela & Winter 2003)
Practical consequences
● The traditional notion of uncertainty in decision making does not cover the uncertainties caused by differences in conceptual systems of individual agents within a community
● In many transactions, including symbolic/linguistic communication, the differences in the underlying conceptual systems play an important role
● Serious efforts have been made to harmonize or to standardize the classification systems or ontologies used by agents
● Even if standardization is conducted, there cannot be any true guarantee that all participating agents would share the meaning of all the expressions used in the transactions in various contexts
Quantifying the effect of “semantic noise”
● Sintonen, Raitio & Honkela: “Quantifying the effect of meaning variation in survey analysis”, forthcoming in ICANN 2014
Part III:
Closing remarks on digital humanities
Digital humanities
● Research within the humanities with the help of computers
– Digital resources
– Computational models
● Basic motivation
– One can already fly to the moon and build sophisticated factory products
– The most important open questions in the world are related to the humanities and social sciences
Digital Humanities: content storage and transfer
Computational Humanities: content analysis
Societal and Cultural Text Mining
Honkela, Korhonen, Lagus & Saarinen: Five-dimensional sentiment analysis of corpora, documents and words, forthcoming in WSOM 2014
Project ఠ (ttha, Telugu)
Science, Society, Culture
Thank you for your attention!
Danke schön! Kiitos! Tack! Merci! 謝謝! Σας ευχαριστούμε! ¡Gracias!