Page 1: La fisica delle parole - Roma Tre Universitywebusers.fis.uniroma3.it/fisincitta/2013/materiale1.pdf · Creoles languages A creole language is a stable natural language developed from

La fisica delle parole: alla ricerca delle origini del linguaggio
(The physics of words: in search of the origins of language)

Vittorio Loreto

Sapienza Università di Roma, Dipartimento di Fisica & Fondazione ISI, Torino

Language dynamics

Language dynamics is an emerging field that focuses on all processes related to the emergence, evolution and extinction of languages.

How did language emerge in our species? Emergence of conventions on: names, categories, syntax structures ...

View of language as an evolving and self-organizing system

Time-scales

Cultural time-scale: horizontal transmission

Biological time-scale: vertical transmission


Emergence of conventions in Twitter

F. Kooti, M. Cha, K. P. Gummadi and W. Mason, “The Emergence of Conventions in Online Social Networks”, Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM), Dublin, Ireland (2012).

[Diagram: traditional linguistics versus “in silico” linguistics. Labels: field work, new ICT tools, theoretical modeling, simulations, robotic experiments, field simulations, web gaming.]


The “Talking Heads” experiment

L. Steels, The Talking Heads Experiment. Vol.1 - Words and Meanings, Antwerpen, (1999)

Robotic experiments

Grounded Naming Game

Language Games

Let us imagine a language ... The language is meant to serve for communication between a builder A and an assistant B. A is building with building-stones; there are blocks, pillars, slabs and beams. B has to pass the stones, and that in the order in which A needs them. For this purpose they use a language consisting of the words 'block', 'pillar', 'slab', 'beam'. A calls them out; --B brings the stone which he has learnt to bring at such-and-such a call. -- Conceive of this as a complete primitive language.

(L. Wittgenstein)

Communication acts of increasing complexity: names, categories, syntax structures ...


Theoretical challenges

• What are the minimal requirements for a shared linguistic feature to emerge?

• What is the asymptotic state (absorbing state, stationary state, slow dynamics)?

• Which features lead to efficiency?

• What is the role of the system size?

• What is the role of topology?

Understand how global behaviors emerge out of local interactions

The Naming Game

How does a population of individuals bootstrap a shared name?

A. Baronchelli, A. Barrat, E. Caglioti, L. Dall’Asta, M. Felici, V. Servedio, L. Steels, F. Tria

The Naming Game

• Population of N agents

• Each agent is characterized by its inventory (or lexicon), i.e. a list of name-object associations

• Agents want to build a shared lexicon

• Homonymy is discarded: one single object

• Peer-to-peer negotiation. At each time step two agents (speaker and hearer) are selected

Local (dyadic) interactions: speaker and hearer


negotiation + memory + dynamic inventories

The Naming Game

[1] Baronchelli et al. J. Stat. Mech. P06014 (2006)
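
The negotiation rules themselves are not spelled out in this transcript. The sketch below assumes the minimal Naming Game rules usually associated with ref. [1]: a speaker with an empty inventory invents a word; if the hearer already knows the uttered word, both agents keep only that word (success); otherwise the hearer adds it to its inventory (failure). The function name `naming_game`, its parameters and the fully connected pairing are illustrative assumptions, not the authors' code.

```python
import random


def naming_game(n_agents=100, max_steps=200_000, seed=0):
    """Minimal Naming Game on a fully connected population (illustrative sketch).

    Returns (steps_to_consensus, inventories); steps is None if no consensus
    was reached within max_steps.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    next_word = 0  # words are just integers; each invention uses a fresh one

    for step in range(1, max_steps + 1):
        # Pick two distinct agents: the first speaks, the second listens.
        speaker, hearer = rng.sample(range(n_agents), 2)

        # Invention: a speaker with an empty inventory creates a new word.
        if not inventories[speaker]:
            inventories[speaker].add(next_word)
            next_word += 1

        word = rng.choice(sorted(inventories[speaker]))

        if word in inventories[hearer]:
            # Success: both agents keep only the winning word.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            if all(inv == {word} for inv in inventories):
                return step, inventories
        else:
            # Failure: the hearer memorizes the new word.
            inventories[hearer].add(word)

    return None, inventories


if __name__ == "__main__":
    steps, final = naming_game(n_agents=100, seed=1)
    print("consensus after", steps, "interactions")
```

Tracking the total number of words and the number of distinct words at each step, rather than only the final state, would be a natural extension of this sketch.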

Basic quantities

[Figure, built up over four slides: time evolution of the basic quantities for N=1000, with reference levels N, N/2 and 1 marked on the axes. Three successive regimes are annotated: invention, building of correlations, convergence.]

The communication system is efficient


Scaling relations

The role of topology

Interactions among individuals create complex networks: a population can be represented as a graph (nodes are agents, links are interactions).

Fully connected networks, d-dimensional lattices, small-world networks, random graphs, scale-free networks ...

The system always converges, though with different modalities (role of the hubs) and time-scales.

Regular lattices

• Fast local consensus

• Convergence through coarsening

• Finite memory

Small-world networks

With probability p each link is rewired → shortcuts. Small distance among any pair of nodes.

Finite connectivity (as in lattices) ⇒ small memory

Small-world (as in mean-field) ⇒ fast convergence

Short-time coarsening, then mean-field behavior
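
The rewiring step can be illustrated with a short sketch. It assumes a Watts-Strogatz-style construction (a ring where each node is linked to its k nearest neighbours on each side, followed by rewiring of each link with probability p); the slides do not say which construction was actually used, and the name `small_world_ring` and its parameters are hypothetical.

```python
import random


def small_world_ring(n, k, p, seed=0):
    """Ring lattice with rewiring (illustrative sketch, requires n > 2 * k).

    Returns a dict mapping each node to the set of its neighbours.
    """
    rng = random.Random(seed)
    neighbours = {i: set() for i in range(n)}

    # Regular ring: link every node to its k nearest neighbours on each side.
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            neighbours[i].add(j)
            neighbours[j].add(i)

    # Rewire each original link (i, i + d) with probability p.
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            if rng.random() < p and j in neighbours[i]:
                # New endpoint: any node that is neither i nor already linked to i.
                candidates = [c for c in range(n)
                              if c != i and c not in neighbours[i]]
                if candidates:
                    new_j = rng.choice(candidates)
                    neighbours[i].discard(j)
                    neighbours[j].discard(i)
                    neighbours[i].add(new_j)
                    neighbours[new_j].add(i)

    return neighbours
```

Because every rewiring removes one link and adds one, the total number of links (and hence the average connectivity) stays fixed while shortcuts appear, which matches the "finite connectivity" remark above.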

Role of topology: summary

[Figure: convergence time and maximum memory compared for small-world networks, d=1 lattices and the complete graph.]

Regular lattices: fast local consensus, coarsening

Small-world networks: short-time coarsening, then mean-field behavior

Is consensus always reached?

Rules as before, with a new parameter β: inclination to trust other agents (usual rules: β=1)

Consensus/Fragmentation phase transition

A. Baronchelli, L. Dall'Asta, A. Barrat and V. Loreto, “Non-equilibrium phase transition in negotiation dynamics”, Phys. Rev. E 76, 051102 (2007)

β > 1/3: stable consensus

β < 1/3: stable fragmentation
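
One way β could enter the pairwise update is sketched below. It assumes that, after a successful exchange, the two agents collapse their inventories onto the winning word only with probability β and otherwise leave them unchanged, so that β=1 recovers the usual rules. This reading of the rule and the helper name `negotiate` are assumptions, since the slide only names the parameter.

```python
import random


def negotiate(speaker_inv, hearer_inv, word, beta, rng=random):
    """One speaker-hearer exchange with trust parameter beta (illustrative sketch).

    Returns the updated (speaker, hearer) inventories as new sets.
    """
    if word not in hearer_inv:
        # Failure: the hearer learns the word, the speaker is unchanged.
        return set(speaker_inv), set(hearer_inv) | {word}
    if rng.random() < beta:
        # Success accepted with probability beta: both keep only this word.
        return {word}, {word}
    # Success not trusted (probability 1 - beta): nothing changes.
    return set(speaker_inv), set(hearer_inv)
```

With β=1 the condition `rng.random() < beta` always holds, and with β=0 it never does, so the two limiting cases can be checked deterministically.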

Consensus/Fragmentation phase transition

β < 1/3: a hierarchy of transitions (fully connected graph)

[Figure: states with 2, 3, 4 and 5 words.]

Evolutionary time-scales

At each time step one individual is substituted with a blank slate with probability r

r = inverse of the average lifetime

Adult individuals are substituted with blank-slate individuals

Emergence of creole languages

with: S. Mufwene, V.D.P. Servedio, F. Tria

Creole languages

A creole language is a stable natural language developed from the mixing of parent languages.

Salikoko Mufwene

sa ka pèmet - vou konpwann --> ça te permet de comprendre (it allows you to understand)
ka pèmet klèsi teks-li --> ça lui permet d’éclaircir le texte (it allows him or her to clarify the text)
Ça qui rivé-yo --> ce qui leur est arrivé (what happened to them)

The vocabulary of a creole language is largely supplied by the parent languages.

The grammar often has original features that may differ substantially from those of the parent languages.

Europeans (free whites)

Bozals (slaves)

Mulattos (free blacks)

Emergence of a new language (creole) from the contact of two other languages

E: European language
A: African language
C: Creole language

[Diagram: Naming Game (NG) interactions between hearers whose inventories contain A and E, A, E and C, or C only; the branches are labeled with probabilities γ and 1 − γ.]


Census data

[Figure, shown on two slides: scatter plot of census data. Horizontal axis: N_C/(N_C+N_B), also labeled NM/(NM + NB), from 0 to 1. Vertical axis: (N_C+N_B)/(N_E+N_C+N_B), also labeled (NM+NB)/(NEu+NM+NB), from 0 to 1. Legend: states with creole, states without creole.

Territories plotted: Louisiana (1850), Bahamas (1774), South Carolina (1790), Georgia (1790), St Lucia (1776), St Vincent (1787), Granada (1785), Martinique (1776), Barbados (1786), Guadalupe (1779), Jamaica (1787), St Domingue (1779), Dominica (1788), Isle de Bourbon (1776), Antigua (1774), St Christopher (1774), Nevis (1774), Cayenne (1780), Monserrat (1774), Virgin Islands (1774), Alabama (1820), Virginia (1790), Mississippi (1800), Maryland (1790), North Carolina (1790), Kentucky (1790), Arkansas (1820), Tennessee (1790), New Jersey (1790), Missouri (1810), Pennsylvania (1790), Delaware (1810).]

Louisiana: French creole in the sugarcane plantations.

Alabama: cotton plantations, smaller than sugarcane and rice plantations.

Caribbean sea

United States

The Category Game

How does a population of agents establish and share an effective set of categories?

with: A. Baronchelli, A. Puglisi and T. Gong, A. Mukherjee, F. Tria

Where do linguistic categories come from?

Categories:

• allow one to quickly point out something without giving too many details (lossy compression)

• are well calibrated to avoid confusion, i.e. to discriminate something among different things

• in brief: must be neither too large nor too small

Examples: common names (glass), color names (red, blue, green)


World color survey (WCS)

110 “preindustrialized” languages
24 “monolingual” speakers

Speakers were asked to:
1. name each of the 330 Munsell chips
2. indicate the best example(s) of each of their basic color terms

Basic Color Terms name all the colors:

English (11 words): Blue, Purple, Pink, Yellow, Brown, Green, Orange, White, Black, Gray, Red

Courtesy of Lindsey & Brown (2006). PNAS, 102.

Testing universality of color naming

Paul Kay and Terry Regier, “Resolving the question of color naming universals”, Proc. Natl. Acad. Sci. USA (PNAS) 100, 9085 (2003).

(either real or hypothetical), we found the closest term c* in each language l* in the BK data set and added up those distances to obtain the sum S.

$$S = \sum_{l \in \mathrm{WCS},\; l^{*} \in \mathrm{BK}} \; \sum_{c \in l} \; \min_{c^{*} \in l^{*}} \mathrm{distance}(c, c^{*}) \qquad [2]$$

Comparing the value for S observed in the WCS data set to the distribution of values obtained in 1,000 hypothetical randomizations of that data set, Fig. 3b shows that the value of S for the actual WCS data is well below the lower limit of the hypothetical distribution. Thus, the WCS data are significantly closer to the BK data than expected by chance, P < 0.001. We then removed from the BK data set the only unwritten languages of nonindustrialized societies in that data set (Ibibio, Pomo, and Tzeltal), reran this test, and obtained the same qualitative result, P < 0.001. This finding indicates a similarity in color naming across languages of industrialized and nonindustrialized societies.

These universal tendencies are shown in Fig. 4a. The floor plane of this display corresponds to the 320 chromatic (non-neutral) colors in the stimulus array of Fig. 1, and the height of the surface at each position represents the number of WCS speaker centroids falling at that point in color space [MacLaury (23) displays a comparable histogram, restricted to the hue dimension]. This distribution of color terms from nonindustrialized languages is shown from above in the contour plot of Fig.

Fig. 3. Monte Carlo tests. (a) Clustering within the WCS. The distribution of dispersion values shown in gray was obtained from 1,000 randomized data sets. The arrow indicates the dispersion value obtained from the WCS data. (b) Comparing the WCS with BK. The distribution of separation values shown in gray was obtained from 1,000 randomized data sets. The arrow indicates the separation value obtained by comparing the WCS data with BK data (1).

Table 2. Languages studied by BK (1)

Index  Language                      Where spoken
1      Arabic (Lebanese colloquial)  Lebanon
2      Bahasa Indonesia              Indonesia
3      Bulgarian                     Bulgaria
4      Cantonese                     China
5      Catalan                       Spain
6      (American) English            United States
7      Hebrew                        Israel
8      Hungarian                     Hungary
9      Ibibio                        Nigeria
10     Japanese                      Japan
11     Korean                        Korea
12     Mandarin                      China
13     (Mexican) Spanish             Mexico
14     Pomo                          United States
15     Swahili                       Tanzania
16     Tagalog                       Philippines
17     Thai                          Thailand
18     Tzeltal                       Mexico
19     Urdu                          Pakistan
20     Vietnamese                    Vietnam

Data reported from one subject per language.

Fig. 4. Distribution of color terms from nonindustrialized languages. (a) The floor plane corresponds to the chromatic (non-neutral) portion of the color stimulus array. The height of the surface at each point in the plane denotes the number of speaker centroids in the WCS data set that fall at that position in color space. (b) The distribution of a is viewed from above by a contour plot. The outermost contour represents a height of 100 centroids, and each subsequent contour represents an increment in height of 100 centroids. English color terms fall near the peaks of the WCS distribution.

9088 | www.pnas.org/cgi/doi/10.1073/pnas.1532837100 Kay and Regier

We approach the issue of whether there are universal tendencies in color naming by asking two questions:

(i) Do color terms from different languages in the WCS cluster together in color space to a degree greater than chance?

(ii) Do WCS color terms, all from unwritten languages of nonindustrialized societies, fall near color terms of written languages from industrialized societies, as represented by the BK sample?

To test for clustering, we represented color terms as points in color space, and then tested for clustering of those points. Because the idea of clustering depends essentially on the concept of distance, we required a color space in which psychologically meaningful distances can be calculated. Consequently we transformed our 330 color stimuli from Munsell space, which lacks such a distance metric, to CIEL*a*b* space, which has one (22). CIEL*a*b* is a 3D color space, in which the L* dimension represents lightness, and the two remaining dimensions, a* and b*, define a plane orthogonal to L*, such that angle in that plane represents hue, and radius represents saturation. We represented each color term T in each language L by its centroid in this space. This was computed by first finding, for each speaker of L who used term T, the centroid in CIEL*a*b* space of the chips named T by that speaker. These speaker centroids were then averaged together to yield an overall term centroid for T. Finally, that term centroid was coerced back to the chip most similar to it in the stimulus array, so that our overall representation of the term resided within the set of points out of which it was constructed. This coercion was done by first selecting that row of the array with L* value nearest that of the centroid [L* values are constant within each value (i.e., lightness) row of the stimulus array]. We then examined two chips, the chromatic (colored) chip in that row with hue angle in the a*b* plane closest to the centroid, and the neutral chip in that row, and selected the one that had hue radius in the a*b* plane closest to the average radius of the chips represented by the centroid. This selected chip was our point representation of the color term.

Given such point representations of all color terms, we tested whether these points were more clustered across languages than would be expected by chance, through a Monte Carlo test. This required first a measure of color-term clustering and then an indication of how clustered one might expect color terms to be by chance.

We defined a measure D of the dispersion of the terms in the WCS data set: for each color term c in each language l, we found the closest term c* in each other language l*, and added up those distances. Distance between terms was defined as CIEL*a*b* distance between their point representations.

$$D = \sum_{l,\, l^{*} \in \mathrm{WCS}} \; \sum_{c \in l} \; \min_{c^{*} \in l^{*}} \mathrm{distance}(c, c^{*}) \qquad [1]$$

Because D is a measure of dispersion, low values of D indicate clustering.
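
As an illustration of the dispersion measure D described in this excerpt, the sketch below sums, for every term in every language, the distance to the closest term in each other language. The data layout (a dict from language name to a list of (L*, a*, b*) points), the function name `dispersion` and the toy values are assumptions made for the example; Euclidean distance in CIEL*a*b* is used as the distance.

```python
import math


def dispersion(languages):
    """Dispersion D over a set of languages (illustrative sketch).

    `languages` maps a language name to a list of (L, a, b) term positions.
    """
    total = 0.0
    for name, terms in languages.items():
        for other, other_terms in languages.items():
            if other == name:
                continue  # "each other language": skip the language itself
            for c in terms:
                # Distance from term c to its closest term in the other language.
                total += min(math.dist(c, c_star) for c_star in other_terms)
    return total


if __name__ == "__main__":
    toy = {
        "lang_a": [(50.0, 0.0, 0.0)],
        "lang_b": [(50.0, 3.0, 4.0)],
    }
    print(dispersion(toy))  # each direction contributes a distance of 5
```

Skipping the case l = l* does not change the value, because a term's closest term within its own language is itself, at distance zero.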

To determine how much dispersion one would expect by chance, we created a set of randomized hypothetical datasets through computer simulation and measured dispersion in them. Our randomization method was informed by the observation that general principles of categorization operating within a given language can be expected to produce a certain amount of dispersion in any natural system of categories. We wanted to be certain that our randomized data sets obeyed such within-language principles of categorization. To this end, we started with the actual WCS data set and rotated each language’s term centroids in the a*b* (hue) plane by a random amount, the same random amount for all terms within a language, but different random amounts for different languages, as shown in Fig. 2. These rotated centroids were then coerced back to the WCS color array in the manner described above. This process produced one hypothetical data set, which preserved within-language structure while randomizing cross-language structure, appropriately, as the latter is the central focus of this study.
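
The randomization step can be sketched as a rotation in the a*b* plane, with one random angle per language. This is a simplified illustration: the coercion of rotated centroids back to the chips of the stimulus array is omitted, and the names `rotate_hue` and `randomized_dataset` and the data layout are hypothetical.

```python
import math
import random


def rotate_hue(terms, angle):
    """Rotate (L, a, b) points by `angle` radians in the a*b* plane; L is unchanged."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(L, a * cos_a - b * sin_a, a * sin_a + b * cos_a)
            for (L, a, b) in terms]


def randomized_dataset(languages, rng):
    """One hypothetical data set: the same random rotation for all terms of a
    language, and an independent rotation for each language."""
    return {name: rotate_hue(terms, rng.uniform(0.0, 2.0 * math.pi))
            for name, terms in languages.items()}
```

Since a rotation preserves lightness and the radius in the a*b* plane, the relative layout of terms within a language is unchanged, while the alignment between languages is randomized.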

The process creating a randomized data set was repeated independently 1,000 times, and the D dispersion measure was calculated for each hypothetical data set. Fig. 3a shows the distribution of D in the 1,000 hypothetical data sets compared with D in the actual WCS data. The actual WCS D value is well below the lower boundary of the hypothetical distribution.

Fig. 2. Creating a randomized data set.

Fig. 1. Color array from the WCS. For the Munsell notations of the colors in this stimulus array see ref. 1.

Human Case

We approach the issue of whether there are universal ten-dencies in color naming by asking two questions:

(i) Do color terms from different languages in the WCS clustertogether in color space to a degree greater than chance?

(ii) Do WCS color terms, all from unwritten languages ofnonindustrialized societies, fall near color terms of writtenlanguages from industrialized societies, as represented by the BKsample?

To test for clustering, we represented color terms as pointsin color space, and then tested for clustering of those points.Because the idea of clustering depends essentially on theconcept of distance, we required a color space in whichpsychologically meaningful distances can be calculated. Con-sequently we transformed our 330 color stimuli from Munsellspace, which lacks such a distance metric, to CIEL*a*b* space,which has one (22). CIEL*a*b* is a 3D color space, in whichthe L* dimension represents lightness, and the two remainingdimensions, a* and b*, define a plane orthogonal to L*, suchthat angle in that plane represents hue, and radius representssaturation. We represented each color term T in each languageL by its centroid in this space. This was computed by firstfinding, for each speaker of L who used term T, the centroidin CIEL*a*b* space of the chips named T by that speaker.These speaker centroids were then averaged together to yieldan overall term centroid for T. Finally, that term centroid wascoerced back to the chip most similar to it in the stimulus array,so that our overall representation of the term resided withinthe set of points out of which it was constructed. This coercionwas done by first selecting that row of the array with L* valuenearest that of the centroid [L* values are constant within eachvalue (i.e., lightness) row of the stimulus array]. We thenexamined two chips, the chromatic (colored) chip in that rowwith hue angle in the a*b* plane closest to the centroid, andthe neutral chip in that row, and selected the one that had hueradius in the a*b* plane closest to the average radius of thechips represented by the centroid. This selected chip was ourpoint representation of the color term.

Given such point representations of all color terms, we testedwhether these points were more clustered across languages thanwould be expected by chance, through a Monte Carlo test. Thisrequired first a measure of color-term clustering and then anindication of how clustered one might expect color terms to beby chance.

We defined a measure D of the dispersion of the terms in theWCS data set: for each color term c in each language l, we foundthe closest term c* in each other language l*, and added up those

distances. Distance between terms was defined as CIEL*a*b*distance between their point representations.

D ! !l,l*!WCS

!c!l

minc*!l*

distance!c, c*". [1]

Because D is a measure of dispersion, low values of D indicateclustering.

To determine how much dispersion one would expect bychance, we created a set of randomized hypothetical datasetsthrough computer simulation and measured dispersion inthem. Our randomization method was informed by the obser-vation that general principles of categorization operatingwithin a given language can be expected to produce a certainamount of dispersion in any natural system of categories. Wewanted to be certain that our randomized data sets obeyedsuch within-language principles of categorization. To this end,we started with the actual WCS data set and rotated eachlanguage’s term centroids in the a*b* (hue) plane by a randomamount, the same random amount for all terms within alanguage, but different random amounts for different lan-guages, as shown in Fig. 2. These rotated centroids were thencoerced back to the WCS color array in the manner describedabove. This process produced one hypothetical data set, whichpreserved within-language structure while randomizing cross-language structure, appropriately, as the latter is the centralfocus of this study.

The process creating a randomized data set was repeated independently 1,000 times, and the D dispersion measure was calculated for each hypothetical data set. Fig. 3a shows the distribution of D in the 1,000 hypothetical data sets compared with D in the actual WCS data. The actual WCS D value is well below the lower boundary of the hypothetical distribution.
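The randomization test described above can be sketched as follows. The sketch rests on several assumptions: all names are hypothetical, distance is Euclidean, rotation angles are drawn uniformly from 0 to 2π, and the step that coerces rotated centroids back to the WCS array is left out because it requires the chip table.

```python
import math
import random

def dispersion(languages):
    """Dispersion D: nearest-term distances summed over pairs of distinct languages."""
    total = 0.0
    for name, terms in languages.items():
        for other, other_terms in languages.items():
            if other != name and other_terms:
                total += sum(min(math.dist(c, cs) for cs in other_terms)
                             for c in terms)
    return total

def rotate_language(terms, angle):
    """Rotate all terms of one language by the same angle in the a*b* plane;
    L* and the radius (saturation) of each term are unchanged."""
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    return [(L, a * cos_t - b * sin_t, a * sin_t + b * cos_t)
            for (L, a, b) in terms]

def monte_carlo(languages, n_sets=1000, seed=0):
    """Return the actual D, the D values of n_sets randomized data sets,
    and the fraction of randomized sets with D at or below the actual one."""
    rng = random.Random(seed)
    actual = dispersion(languages)
    simulated = []
    for _ in range(n_sets):
        randomized = {name: rotate_language(terms, rng.uniform(0.0, 2.0 * math.pi))
                      for name, terms in languages.items()}
        simulated.append(dispersion(randomized))
    fraction = sum(1 for d in simulated if d <= actual) / n_sets
    return actual, simulated, fraction

# Toy data, made up for illustration: two languages with identical terms.
toy = {"A": [(50, 20, 0), (50, 0, 20)], "B": [(50, 20, 0), (50, 0, 20)]}
actual, simulated, fraction = monte_carlo(toy, n_sets=200)
```

Because every term in a language is rotated by the same angle, distances between terms of the same language are preserved, which is what keeps the within-language structure intact while the cross-language alignment is randomized.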

Fig. 2. Creating a randomized data set.

Fig. 1. Color array from the WCS. For the Munsell notations of the colors in this stimulus array see ref. 1.

9086 | www.pnas.org/cgi/doi/10.1073/pnas.1532837100 Kay and Regier

The category game

N individuals performing binary language games
Individual task: discriminate stimuli from a continuous [0:1] perceptual space

Real values on the interval [0, 1]

N = 50, dmin = 0.01

N = 50, dmin = 0.02

A. Puglisi, A. Baronchelli and VL, “Cultural route to the emergence of linguistic categories”, Proc. Natl. Acad. Sci. USA (PNAS) 105, 7936 (2008).

Human eye discrimination ability: Just Noticeable Difference (JND)

Non-uniform across the spectrum

[Figure: dmin across the perceptual space. From Long et al. 2006.]

Long PH, Yang ZY, Purves D. 2006. Spectral statistics in natural scenes predict hue, saturation, and brightness. PNAS, 103(15): 6013-6018.


Dhuman Dneutral

Individual

Population

World

“In silico” version of the WCS

“In silico” version of the WCS

[Figure: frequency vs. normalized dispersion; legend: human, neutral, randomized WCS, simulations. Inset: JND vs. stimulus.]

A. Baronchelli, T. Gong, A. Puglisi and VL, Modeling the emergence of universality in color naming patterns, PNAS, 107, 2403 (2010).


Hierarchies of colors

with: A. Mukherjee, F. Tria

Color implicational hierarchy

If a language had a color term with a prototype at any point in the hierarchy, then it would also have color terms with prototypes at all the colors to the left of that color.

Berlin, B., & Kay, P. (1969). Basic color terms. Berkeley: University of California Press.
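The implicational rule can be read as a simple check: once a color is missing from a language, no color further to the right may be present. The sketch below is only an illustration. The function name is hypothetical, and the strictly linear ordering in `EXAMPLE_HIERARCHY` is a simplified placeholder, not the ordering given by Berlin and Kay, whose hierarchy groups some colors at the same level.

```python
def satisfies_hierarchy(terms, hierarchy):
    """Return True if the colors of `hierarchy` (ordered left to right)
    that appear in `terms` form an unbroken run starting from the left."""
    seen_missing = False
    for color in hierarchy:
        if color in terms:
            if seen_missing:
                return False  # a color is present although one to its left is absent
        else:
            seen_missing = True
    return True

# Simplified linear ordering, assumed for illustration only.
EXAMPLE_HIERARCHY = ["white", "black", "red", "green", "yellow", "blue", "brown"]

ok = satisfies_hierarchy({"white", "black", "red"}, EXAMPLE_HIERARCHY)
broken = satisfies_hierarchy({"white", "black", "blue"}, EXAMPLE_HIERARCHY)
```

Terms that do not appear in the hierarchy list are ignored by the check.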


Evolutionary stages for basic color terms

Kay, P. & McDaniel, K. (1978). The Linguistic Significance of the Meanings of Basic Color Terms. Language, 54 (3): 610-646.

Stage I: Dani (New Guinea)

mili: cool/dark shades such as blue, green, and black
mola: warm/light colours such as red, yellow, and white

Stage V: Kalam (Papua New Guinea)

mosimb, tund, likañ, muk, minj-kimemb, walin


Hierarchy in the Category Game

[red, (magenta)-red], [violet], [green/yellow], [blue, blue (dark)], [orange] and [cyan]

VL, A. Mukherjee and F. Tria, On the origin of the hierarchy of color names, PNAS 109, 2819 (2012).

Summary

NAMING GAME: naming an object

CATEGORY GAME: categorizing and naming the color space

World Color Survey

Hierarchy of basic color names

BLENDING GAME: naming related objects

emergence of creoles

combinatoriality & compositionality

emergence of syntax


Ongoing activities

syntax features: combinatoriality, compositionality

grammar: e.g. numeral systems

interplay of cultural and evolutionary time scales

complexity and regularization

...and perspectives

web-based experiments

social computation

field simulations

Field simulations

Put two (or more) humans into a virtual environment that requires them to coordinate their individual actions and… neutralize the use of pre-established communication systems (e.g., speech, writing, body language…)

B. Galantucci and S. Garrod (eds.), Social Behaviour and Communication in Biological and Artificial Systems, Interaction Studies, volume 115 issue 12 (2010)

Galantucci, B. (2005). An experimental study of the emergence of human communication systems. Cognitive Science, 29 (5), 737–67.

Social computation

Populations of users facing collectively difficult problems using a small cognitive overhead

http://www.espgame.org/

[Figure: diagram of the experiment, with System and User. Labels: word to use; how (question); outcome (answer or feedback); legitimacy score; words; connections; association network; link.]

A new platform for web-based experiments

http://www.xtribe.eu/

with: S. Caminiti, C. Cicali, P. Gravino, V.D.P. Servedio, A. Sirbu, F. Tria

Recent publications

VL and L. Steels, Emergence of Language, Nature Phys., Vol. 3, 758-760 (2007).

A. Puglisi, A. Baronchelli and VL, Cultural route to the emergence of linguistic categories, Proc. Natl. Acad. Sci. USA, 105, 7936 (2008).

C. Castellano, S. Fortunato and VL, Statistical physics of social dynamics, Rev. Mod. Phys., 81, 591-645 (2009).

A. Baronchelli, T. Gong, A. Puglisi and VL, Modeling the emergence of universality in color naming patterns, Proc. Natl. Acad. Sci. USA, 107, 2403 (2010).

A. Mukherjee, F. Tria, A. Baronchelli, A. Puglisi and VL, Aging in language dynamics, PLoS ONE, 6, e16677 (2011).

VL, A. Mukherjee and F. Tria, On the origin of the hierarchy of color names, Proc. Natl. Acad. Sci. USA, 109, 2819 (2012).

F. Tria, B. Galantucci and VL, Naming a structured world: a cultural route to duality of patterning, PLoS ONE, 7(6), e37744 (2012).

http://samarcanda.phys.uniroma1.it/vittorioloreto/


Thank you

Category Game: Andrea Baronchelli, Tao Gong, Animesh Mukherjee, Andrea Puglisi, Francesca Tria

Naming Game: Andrea Baronchelli, Alain Barrat, Emanuele Caglioti, Luca Dall’Asta, Maddalena Felici, Salikoko Mufwene, Martina Pugliese, Vito D.P. Servedio, Luc Steels, Francesca Tria

