Department of Communication Science, VU University Amsterdam
Semantic NETwork analysis
Manual and automatic content analysis of source: agent / predicate / target relationships
Jan Kleinnijenhuis / Wouter van Atteveldt
Department of Communication Science, The Network Institute, VU University Amsterdam
Topics
1. Introduction: Semantic NETwork Content Analysis
2. Human coding, using CETA, iNET
3. Automatic coding
   1. Extraction of source: agent / (predicate) / target quadruples
   2. Sentiment analysis: (predicate) association .. dissociation
4. Discussion: extraction of issue positions
1. Introduction: Semantic NETwork Content Analysis
Background
o subject / predicate / object
  o Early Greeks, both semantic and syntactic; agent / predicate / target
o "Names are like points, propositions like arrows" (1921)
  o xRy propositions, Ludwig Wittgenstein (Tractatus 3.144)
o Evaluative assertion analysis (1956)
  o Heider (1946): balance theory => cognitive consistency theory
  o Charles Osgood, Nunnally, Saporta (1956)
o Automatic content analysis, (co-)occurrence of keywords (1960s - ..)
  o Stone et al. (1965), The General Inquirer
  o Efficient indexing algorithms, e.g. Google, Lucene
o Semantic Network Analysis, relational content analysis (1980s - ..)
  o Van Cuilenburg et al. 1986; de Ridder 1994; Kleinnijenhuis et al. 1997, 2001; Van Atteveldt, forthcoming; also labeled the NET method
o Semantic Web (1990s - …): xRy + logic => inferences
Definitions of key concepts
o (meaning) object: entity
  o Actor, Issue, Ideal or UnspecifiedReality
    o Actor: animate, e.g. person, group, organization
    o Issue: inanimate, e.g. employment, health care
    o Ideal: value criterion for evaluating an actor or issue (e.g. the referent of "entrancing" in "Obama's smile is entrancing")
    o UnspecifiedReality (e.g. the referent of "it" in "it's lucky for Bush")
  o appearing as subjects (agents) and/or objects (targets, recipients) in texts
o ontology
  o A priori knowledge of relationships between meaning objects
    Politician => Person => Actor; Politician [period] => Party => Actor; Politician [period] => PolFunction
    BarackObama => Democrats [1994..?]; BarackObama => PresCandidate [2007..?]
  o operationalized with an ontology dictionary: a set of (linguistically or statistically enhanced) queries to search for occurrences of separate meaning objects in natural language
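A minimal sketch of how such an ontology and ontology dictionary could be represented. The `Relation` class, the helper names, the regular expressions and the links `BarackObama => Politician`, `Democrats => Party` and `PresCandidate => PolFunction` are assumptions made for illustration; only the period-bound Obama relations come from the slide.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    child: str
    parent: str
    start: Optional[int] = None  # first year the relation holds (None = unbounded)
    end: Optional[int] = None    # last year the relation holds (None = unbounded)


# Toy ontology; entries not on the slide are illustrative assumptions.
ONTOLOGY = [
    Relation("Politician", "Person"),
    Relation("Person", "Actor"),
    Relation("Party", "Actor"),
    Relation("Democrats", "Party"),             # assumption
    Relation("PresCandidate", "PolFunction"),   # assumption
    Relation("BarackObama", "Politician"),      # assumption
    Relation("BarackObama", "Democrats", start=1994),
    Relation("BarackObama", "PresCandidate", start=2007),
]


def ancestors(obj, year):
    """Return every class that obj belongs to, directly or indirectly, in a given year."""
    found = set()
    frontier = [obj]
    while frontier:
        current = frontier.pop()
        for rel in ONTOLOGY:
            if rel.child != current or rel.parent in found:
                continue
            if rel.start is not None and year < rel.start:
                continue
            if rel.end is not None and year > rel.end:
                continue
            found.add(rel.parent)
            frontier.append(rel.parent)
    return found


# Toy ontology dictionary: one search query (here a regular expression) per meaning object.
DICTIONARY = {
    "BarackObama": re.compile(r"\b(barack\s+)?obama\b", re.IGNORECASE),
    "Democrats": re.compile(r"\bdemocrat(s|ic party)?\b", re.IGNORECASE),
}


def find_objects(text):
    """Return the meaning objects whose query matches the text."""
    return {name for name, pattern in DICTIONARY.items() if pattern.search(text)}
```

With these toy entries, `ancestors("BarackObama", 2000)` includes `Party` but not `PresCandidate`, because the candidacy relation only starts in 2007.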
Definition: Semantic NETwork Analysis
Extraction from texts of source: agent / predicate / target quadruples so as to infer conclusions from their network representation
o subject / agent {actors, issues; default = UnspecifiedReality} = meaning object directing action or energy as described by the
o predicate {dissociations .. associations}
  Thesaurus: (possibly context-specific) synsets of words and multi-word units (MWUs) whose conjugations and combinations amount predictably to a value on the dissociation..association scale (e.g. cooperate +1; bomb -1)
o towards the object / target / recipient {actors, issues; default = Ideal}
o according to a (quoted or paraphrased) source {actors; default = author}
(cf. R.M.W. Dixon, 1992, A new approach to English grammar, on semantic principles)
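The quadruple and its defaults can be written down as a small data structure. This is a sketch under stated assumptions: the class name `Statement`, its field names and the placeholder actors are hypothetical, and the two-entry thesaurus only repeats the example values from the definition above.

```python
from dataclasses import dataclass

# Toy thesaurus: predicate lemma -> value on the dissociation (-1) .. association (+1) scale.
THESAURUS = {"cooperate": 1.0, "bomb": -1.0}


@dataclass(frozen=True)
class Statement:
    """One source: agent / predicate / target quadruple with the defaults from the definition."""
    predicate: str
    agent: str = "UnspecifiedReality"
    target: str = "Ideal"
    source: str = "author"

    def value(self):
        """Look up the predicate's scale value; None if the thesaurus has no entry."""
        return THESAURUS.get(self.predicate)


# Hypothetical statements; "ActorA" and "ActorB" are placeholders, not coded data.
full = Statement("bomb", agent="ActorA", target="ActorB", source="ActorC")
partial = Statement("cooperate", agent="ActorA")
```

In `partial` the target falls back to `Ideal` and the source to `author`, which is how an evaluation without an explicit object or quoted speaker would be stored under these defaults.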
NET relation types with prototype examples
Issue positions: often ends, means and a causal expectation in one sentence
The CDA continues its interventions in health care in order to limit government spending (Dutch original: "Het CDA gaat door met ingrepen in de zorg om de overheidsuitgaven te beperken")
o Issue position, means: CDA / continues with (+) / interventions in health care
o Issue position, end: CDA / wants to limit (-) / government spending
o Causal relationship: CDA: interventions in health care / in order to limit (-) / government spending
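The three statements coded from this one sentence can be stored as quadruples and aggregated into a small signed network. The sketch below is illustrative only: the tuple layout, the function names and the rule of multiplying values along a path are assumptions, not a description of the CETA or iNET implementation.

```python
from collections import defaultdict
from statistics import mean

# (source, agent, value, target); values on the dissociation (-1) .. association (+1) scale.
statements = [
    ("author", "CDA", +1.0, "interventions in health care"),               # means
    ("author", "CDA", -1.0, "government spending"),                        # end
    ("CDA", "interventions in health care", -1.0, "government spending"),  # causal claim
]


def build_network(statements):
    """Average the values of all statements that share the same agent and target."""
    values = defaultdict(list)
    for _source, agent, value, target in statements:
        values[(agent, target)].append(value)
    return {edge: mean(vals) for edge, vals in values.items()}


def path_value(network, path):
    """Multiply edge values along a path (assumed inference rule for indirect links)."""
    product = 1.0
    for agent, target in zip(path, path[1:]):
        product *= network[(agent, target)]
    return product


network = build_network(statements)
direct = network[("CDA", "government spending")]
indirect = path_value(
    network, ["CDA", "interventions in health care", "government spending"]
)
```

Here the indirect path (support for a means that reduces spending) has the same sign as the direct issue position, so the coded sentence is internally consistent under this rule.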
2. Human coding, using CETA / iNET
NET by human coders using CETA (Jan A. de Ridder), iNET (Wouter van Atteveldt)
SNA by human coders using iNET, ontology lookup
SNA by human coders using iNET, network lookup
SNA by human coders using iNET, 3 more networks
3.1 Automatic coding: source: subject / pred / object extraction
Tools for ontology construction: co-occurrence analysis
Amos Tversky (1977): features of similarity
Islam*, terror* and immig* in NRC, AD 2004-2006
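A rough sketch of such a co-occurrence analysis, using Tversky's ratio model on the sets of documents matched by each wildcard query. The four example headlines are invented for illustration and are not taken from NRC or AD; the function names and the default weights `alpha` and `beta` are likewise assumptions.

```python
import re


def matching_docs(prefix, docs):
    """Indices of documents containing a word that starts with the prefix (query 'prefix*')."""
    pattern = re.compile(r"\b" + re.escape(prefix), re.IGNORECASE)
    return {i for i, doc in enumerate(docs) if pattern.search(doc)}


def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky ratio model: common features relative to common plus weighted distinctive ones."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0


# Invented toy documents.
docs = [
    "Debate on Islam and immigration policy",
    "Terrorism suspects arrested",
    "Islamic terror cell and immigrant integration",
    "Immigration figures rise",
]

islam = matching_docs("islam", docs)
terror = matching_docs("terror", docs)
immig = matching_docs("immig", docs)

symmetric = tversky(islam, immig)                          # alpha = beta = 1
islam_to_immig = tversky(islam, immig, alpha=1.0, beta=0.0)
immig_to_islam = tversky(immig, islam, alpha=1.0, beta=0.0)
```

Setting `beta=0` makes the measure asymmetric: in the toy data every document on Islam also mentions immigration, but not the other way round, which is the kind of asymmetry Tversky's feature model allows for.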
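Tools for ontology construction 2: syntactic profiling
http://www.let.rug.nl/gosse/bin/verwant.py
BOLKESTEIN 1994-1995
Verbs with which Bolkestein is associated as direct object: kapittel (rebuke), beticht (accuse), haal uit naar (lash out at), zet af tegen (set off against), besta tussen (exist between), beweer (claim), sla aan (rate), citeer (quote), typeer (characterize), kritiseer (criticize), beschuldig (accuse), waarschuw (warn), bedien (serve), verwijt (reproach), vergelijk (compare), val aan (attack), leg voor aan (put to), roep (call), verras (surprise), overtuig (convince)
Verbs with which Bolkestein is associated as subject: schop-in de war (mess up), moraliseer (moralize), zoek aan (approach), matig (moderate), scherts (jest), overspeel (overplay), zwengel aan (stir up), verkwansel (squander), nuanceer (qualify), snoer-de mond (silence), herroep (retract), kom-in botsing (clash), zwalk (drift), neem terug (take back), krab (scratch), trek-van leer (lash out), vier feest (celebrate), maak-korte metten (make short work of), opteer (opt), belijd (profess)
Adjectives with which Bolkestein is associated: negentiende-eeuws (nineteenth-century)
BRINKMAN 1994-1995
Verbs with which Brinkman is associated as direct object: licht-beentje (trip up), sta-terzijde (stand by), tik-op de vingers (rap on the knuckles), fluit terug (call back), reken aan (hold against), stem op (vote for), ondervraag (question), interview (interview), adviseer (advise), wijs aan (designate), schrijf af (write off), prijs aan (recommend), eer (honor), corrigeer (correct), sta bij (assist), kritiseer (criticize), houd-in de gaten (keep an eye on), confronteer (confront), schuif (shove), spreek aan (address)
Verbs with which Brinkman is associated as subject: diskwalificeer (disqualify), paai (placate), bijt vast (dig in), nuanceer (qualify), volhard (persevere), draag mee (carry along), sta-te woord (speak to), bezin (reflect), baal (be fed up), profileer (profile), blijf aan (stay on), leun (lean), herzie (revise), poseer (pose), speculeer (speculate), beraad (deliberate), leg neer (lay down), bid (pray), heb-de tijd (have the time), betreur (regret)
Adjectives with which Brinkman is associated: gereformeerd (Reformed), kil (chilly), arm (poor), ander (other)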
© Gosse Bouma, RUG
Automation NET
Alpino-tree ==> source: subject / pred / object
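The mapping from a dependency tree to a quadruple can be sketched with a few rules. The nested-dictionary tree, the example sentence, the English lemmas and the list of speech verbs are all hypothetical; the relation labels `su`, `hd`, `obj1` and `vc` follow Alpino's dependency labels, but real Alpino output is XML with additional intermediate nodes that this sketch ignores.

```python
# Verbs whose subject is treated as the source of the embedded clause (assumed list).
SPEECH_VERBS = {"say", "state", "claim"}


def child(node, rel):
    """Return the first child of node with the given dependency relation, or None."""
    for candidate in node.get("children", []):
        if candidate["rel"] == rel:
            return candidate
    return None


def extract(clause, source="author"):
    """Map one clause to (source, subject, predicate, object), or None if a role is missing."""
    head = child(clause, "hd")
    subject = child(clause, "su")
    if head is None or subject is None:
        return None
    complement = child(clause, "vc")
    # Rule 1: a speech verb with a clausal complement makes its subject the source.
    if head["lemma"] in SPEECH_VERBS and complement is not None:
        return extract(complement, source=subject["lemma"])
    # Rule 2: otherwise subject / head verb / direct object form the statement.
    obj = child(clause, "obj1")
    if obj is None:
        return None
    return (source, subject["lemma"], head["lemma"], obj["lemma"])


# Hypothetical parse of "Brinkman says that Bolkestein criticizes the cabinet".
tree = {
    "rel": "top",
    "children": [
        {"rel": "su", "lemma": "Brinkman"},
        {"rel": "hd", "lemma": "say"},
        {"rel": "vc", "children": [
            {"rel": "su", "lemma": "Bolkestein"},
            {"rel": "hd", "lemma": "criticize"},
            {"rel": "obj1", "lemma": "cabinet"},
        ]},
    ],
}

quadruple = extract(tree)
```

A fuller rule set would also need to handle passives, modifiers and anaphora before the extracted lemmas are looked up in the ontology dictionary.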
Concurrent validity of extraction, source: subject / (pred) / object
Predictive validity, Sources and Subjects (acting actors)
3.2 Automatic coding: sentiment analysis, assoc .. dissoc
Sentiment analysis: decomposition F1 performance
F1 per relation type, elections 2006 manual corpus
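For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it per class by comparing automatic labels with manual codings; the label names and the five example codings are made up and do not reproduce the 2006 results.

```python
def f1_per_class(gold, predicted):
    """Return {label: F1}, with manual codings (gold) as the reference."""
    scores = {}
    for label in set(gold) | set(predicted):
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


# Invented example: manual versus automatic polarity of five statements.
gold = ["pos", "pos", "neg", "neg", "neu"]
predicted = ["pos", "neg", "neg", "neg", "pos"]
scores = f1_per_class(gold, predicted)
```

Reporting F1 per class rather than overall accuracy keeps a rare relation type from being hidden by good performance on the frequent ones.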
Sentiment analysis: aggregate performance 2006 campaign
Extraction aggregate issue positions, 2006 campaign
Relative performance 2006 campaign full ‘grammar’ model
Cell entries represent correlation coefficients with manual content analysis
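Such a correlation can be computed by pairing, for each party and issue combination, the automatically extracted position with the manually coded one. The sketch assumes Pearson's product-moment correlation, which the slide does not specify, and the function name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)
```

Called as `pearson(manual_positions, automatic_positions)`, a value near +1 means the automatic coding orders the aggregate issue positions in the same way as the human coders, even if the absolute values differ.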
4. Discussion: Content Analysis of Issue Positions
Automatic extraction of issue positions presupposes
o Manual codings (machine learning; validity tests)
o Ontology of meaning objects (actors, issues, values, reality)
  o Ontology dictionary
o Linguistic preprocessing:
  o tokenization, lemmatizing, parsing => syntax graph
  o e.g. Van Noord, Bouma: ALPINO
o Identification
  o Syntax graph + rules => semantic roles of source, agent, target
  o Semantic roles + ontology dictionary + anaphora resolution + post hoc extraction => source: agent / pred(assoc..dissoc) / target
o Sentiment analysis => pred(assoc..dissoc)
Discussion: prospects for advances
o Ontology, ontology dictionary
  o e.g. more subissues, context-specific synonyms
o Rules to transform syntax graph => semantic roles
  o e.g. rules dealing with different modifier types
o Sentiment analysis
  o e.g. multi-word unit recognition; informed features
o Combining rule-based with statistical approaches (e.g. LSA), starting from high-order linguistic features
o Error analysis
  o e.g. more specific validity tests starting from manual coding
o Other languages .. new language domains ..
Literature
Antoniou, G., & van Harmelen, F. (2004). A semantic web primer. Cambridge: MIT Press.
Bouma, G. (2005). Zoek verwante woorden, uitgaande van Algemeen Dagblad en het NRC Handelsblad van 1994 en 1995 (80 miljoen woorden). Rijksuniversiteit Groningen / NWO: http://www.let.rug.nl/gosse/bin/verwant.py.
Bouma, G., & van Noord, G. (2005). ALPINO: automatisch ontleden van het Nederlands. RUG Alfa Informatica / NWO: http://www.let.rug.nl/~kleiweg/alpino/index1.html; http://www.let.rug.nl/~vannoord/alp/.
de Ridder, J. A. (1994). Van Tekst naar informatie: ontwikkeling en toetsing van een inhoudsanalyse-instrument. Amsterdam: Universiteit van Amsterdam (proefschrift).
Dixon, R. M. W. (1992). A new approach to English grammar, based on semantic principles. Oxford: Clarendon.
Dixon, R. M. W. (2005). A semantic approach to English grammar. Oxford: Clarendon.
Kleinnijenhuis, J., de Ridder, J. A., & Rietberg, E. M. (1997). Reasoning in economic discourse: an application of the network approach to the Dutch press. In C. W. Roberts (Ed.), Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Texts and Transcripts (pp. 191-207). New York: Erlbaum.
Kleinnijenhuis, J., Scholten, O., van Atteveldt, W. H., van Hoof, A. M. J., Krouwel, A. P., Oegema, D., et al. (2007). Nederland vijfstromenland: de rol van media en stemwijzers bij de verkiezingen in 2006. Amsterdam: Bert Bakker.
Kleinnijenhuis, J., & van Atteveldt, W. H. (2006). Geautomatiseerde inhoudsanalyse, met de berichtgeving over het EU-referendum als voorbeeld. In F. Wester (Ed.), Inhoudsanalyse: theorie en praktijk. Dordrecht: Kluwer.
Kleinnijenhuis, J., van Hoof, A. M. J., Oegema, D., & de Ridder, J. A. (2007). A test of rivaling hypotheses to explain news effects: news on issue positions of parties, real world developments, support and criticism, and success and failure. Journal of Communication, 57(2), 366-384.
Krippendorff, K. (2004). Content Analysis. Thousand Oaks: Sage.
Osgood, C. E., Saporta, S., & Nunnally, J. C. (1956). Evaluative assertion analysis. Litera, 3, 47-102.
van Atteveldt, W. H. (2008, forthcoming). Extracting and Representing Semantic Networks from Textual Sources (working title). Amsterdam: PhD thesis, VU University Amsterdam.
van Atteveldt, W. H., Kleinnijenhuis, J., & Ruigrok, P. C. (submitted for publication 2007). Parsing, Semantic Networks and Political Authority: using syntactic analysis to extract semantic relations from Dutch newspaper articles.
van Atteveldt, W. H., Kleinnijenhuis, J., Ruigrok, P. C., & Schlobach, S. (2008, forthcoming). Good News or Bad News? Conducting sentiment analysis on Dutch text to distinguish between positive and negative relations. Resubmitted for publication in Journal of Information Technology and Politics; presented as paper at Etmaal van de Communicatiewetenschap, Amsterdam, February 2008.
van Atteveldt, W. H., Schlobach, S., & van Harmelen, F. (2007). Media, Politics and the Semantic Web. In Proceedings of the European Semantic Web Conference 2007 (pp. 205-219). Berlin: Springer.
van Cuilenburg, J. J., Kleinnijenhuis, J., & de Ridder, J. A. (1986). Towards a graph theory of journalistic texts. European Journal of Communication, 1, 65-96.
van der Beek, L., Bouma, G., & van Noord, G. (2002). Een brede computationele grammatica voor het Nederlands. Nederlandse Taalkunde; downloaded from http://www.let.rug.nl/~vannoord/papers/taalkunde.pdf.