Europeana Newspapers German infoday - Digital Newspaper Archives as Sources for (Digital)...

Post on 11-May-2015


Digital Newspaper Archives as Sources for (Digital) Historical Research

Dr. Pim Huijnen

Universität Utrecht

p.huijnen@uu.nl

Berlin, 28.02.2014

www.translantis.nl

Translantis

Digital Humanities Approaches to Reference Cultures; The Emergence of the United States in Public Discourse in the Netherlands, 1890-1990

“…uses digital technologies to analyze the role of reference cultures in debates about social issues and collective identities, looking specifically at the emergence of the United States in public discourse in the Netherlands from the end of the nineteenth century to the end of the Cold War.”

The United States as a reference culture

Business

Society

Consumption

Media

Crime

Health

Americanization

Business/economy: Americanization

1870-1914 - 1918-1940 - 1945-1989

Fordism Taylorism

Professionalization Managerism

Productivity

Rationalisation Efficiency

Standardization Mass production

Mass market Consumer society

Credit

Consultancy Accountancy

Rejection, appropriation, entanglement

Leeuwarder Courant, 27 October 1950

The USA as a reference culture


[Chart: United States in Dutch news media. Series: ‘Vereenigde Staten’ and ‘Amerikaansch’. x-axis: 1850-1943; y-axis: 0-8000.]

Text mining for historical research

National Library, The Hague: ~9,000,000 digitized pages from Dutch news media, 1618-1995

Opportunities for comparative and transnational historical research

(esp. history of mentalities / history of ideas)

Development of a digital text mining tool


Digital research on public debates

Servers needed for storage (500 GB of data)

Computers needed for computational processing (memory)

Sustainability needed in storage and file formats (min. 5 years, but preferably indefinitely)

Management needed (manpower)

Programming knowledge needed

Big Data?

The change of scale has led to a change of state. The quantitative change has led to a qualitative one. […]

[B]ig data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value

Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston 2013) 13.

Big Data

“Letting the data speak”

Top-down vs. bottom-up

Bob Nicholson, ‘The Digital Turn’, Media History 19 (2013) 59-73.

Query: ‘Standard oil’ <1900 (1030 hits)

Word cloud

Word cloud ‘manager’ 1910-1920

(3437 hits)

Word cloud ‘manager’ 1945-1950

(1173 hits)

Voyant word cloud ‘efficiency’ 1945-1960 (46040 hits)

Voyant word cloud ‘efficiëntie’ 1945-1990 (2861 hits)
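A word cloud is, at its core, a display of term frequencies in the texts returned by a query. The sketch below shows that counting step only. It is not the Voyant implementation; the function name `term_frequencies` and the sample sentences are invented for illustration.

```python
import re
from collections import Counter

def term_frequencies(articles, top_n=10):
    """Count how often each word occurs across a list of article texts."""
    counts = Counter()
    for text in articles:
        # Lowercase, then keep runs of letters only (accented letters such as ë included).
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return counts.most_common(top_n)

# Invented sample hits for the query 'efficiency'; not taken from the corpus.
sample_hits = [
    "American efficiency in the factory",
    "Efficiency and rationalisation in the office",
]
top_terms = term_frequencies(sample_hits, top_n=3)
print(top_terms)
```

In this toy sample the function words ‘in’ and ‘the’ rank as high as the query term itself, which is why word clouds are usually built after filtering out very common words.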

Histogram

Query: ‘consultancy’ (2167 hits)
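A histogram of this kind can be derived from nothing more than the publication year of each hit. The following is a minimal sketch under that assumption; the helper names and the sample years are hypothetical and do not reproduce the tool or the data shown in the talk.

```python
from collections import Counter

def hits_per_period(hit_years, bin_size=10):
    """Group hit years into bins of `bin_size` years and count the hits per bin."""
    bins = Counter((year // bin_size) * bin_size for year in hit_years)
    return dict(sorted(bins.items()))

def print_histogram(bins, width=40):
    """Print a text histogram, scaled so the largest bin fills `width` characters."""
    largest = max(bins.values())
    for start, count in bins.items():
        bar = "#" * max(1, round(count / largest * width))
        print(f"{start} {bar} {count}")

# Invented publication years of articles matching a query; not real corpus data.
sample_years = [1952, 1967, 1968, 1971, 1975, 1979, 1983, 1984, 1988, 1989]
decades = hits_per_period(sample_years)
print_histogram(decades)
```

Each bin is labelled by its starting year, so with the default `bin_size=10` the label 1970 covers 1970-1979.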

Histogram (SPSS)

Query: ‘manager’ (191,710 hits)

BILAND

Query: ‘Heredity’ (1876) (22/1465 hits)

BILAND

Query: ‘Heredity’ (1935) (1465 hits)

BILAND

Query: ‘Hygiene’ (87/41 hits)

‘Typically American’

Topic modeling

SPSS

Translantis

Query: ‘manager’ (191,710 hits)

Translantis

Query: ‘manager’ in advertisements (82,695 hits)

Empirical cycle

Query

Quantitative analysis

Qualitative analysis

Insight

Digital research on public debates

No limitation on source material

No selection issues

No representativeness issues

Enabling research on hidden debates, mentalities, implicit notions

Reproducibility of research, from various perspectives

Source criticism: data

representativeness

internal coherence

(OCR) quality

[Chart: United States in Dutch news media. Series: ‘Vereenigde Staten’ and ‘Amerikaansch’. x-axis: 1850-1943; y-axis: 0-8000.]

Representative?


Libraries, archives, museums and other collection institutions have now been digitising corpora of material for many years, but with a very few exceptions, it is still quite rare for an entire run of primary sources to be digitised and made available online. This means that there are gaps within the digital record. Yet it is unusual for online resources to actively demonstrate these gaps; resources may be advertised as a growing corpus, but when searching through or downloading a digital resource there is rarely any indication of what has not been digitised. This skews the sense of the nature of the collection the scholar is working with and erodes trust.

Abstract submitted to DH2014 by Alastair Dunning (The European Library) and Clemens Neudecker (KB National Library of the Netherlands).

See: http://availableonline.wordpress.com/
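One common way to make uneven digitization visible in the numbers is to report hits relative to the amount of material digitized for each year, rather than raw counts. The sketch below illustrates that idea; it is not a method described in the talk, and the function name and all figures are invented.

```python
def relative_frequency(hits_by_year, pages_by_year, per=1000):
    """Express raw hit counts as hits per `per` digitized pages for each year."""
    rates = {}
    for year, hits in hits_by_year.items():
        pages = pages_by_year.get(year, 0)
        # Skip years without digitized pages: no rate can be computed for them.
        if pages > 0:
            rates[year] = hits * per / pages
    return rates

# Invented numbers for illustration only.
hits = {1900: 50, 1920: 200, 1940: 100}
pages = {1900: 10_000, 1920: 80_000, 1940: 0}
rates = relative_frequency(hits, pages)
print(rates)
```

In this toy example 1920 has four times as many raw hits as 1900 but only half the rate per 1,000 pages, and 1940 drops out entirely because nothing was digitized for it. A rate of this kind only corrects for the volume of digitized material; it says nothing about which titles or regions are missing.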

Source criticism: comparison

Source criticism: Press history

[O]ne of the biggest challenges facing press historians will be to ensure that the historical agency and complex materiality of newspapers are not forgotten in a rush to mine their contents.

Bob Nicholson, ‘The Digital Turn’, Media History 19 (2013) 59-73, on p. 67.

Source criticism (interpretation)

Newspapers = public debate?

What newspapers write = what public thinks?

How to interpret results?

What are stopwords? (“staat”)
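The Dutch word ‘staat’ is both a verb form (‘stands’, ‘is’) and a noun (‘state’), which is presumably why it serves as the example here: treating it as a stopword removes both uses at once. The sketch below makes that effect concrete. The stopword list, function name, and sentence are invented for illustration and are not taken from any particular tool.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are longer and are a research decision.
STOPWORDS = {"de", "het", "een", "in", "van", "staat"}

def count_terms(text, stopwords=frozenset()):
    """Count lowercase word tokens, skipping any token in `stopwords`."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Invented sentence: 'staat' occurs once as a verb and once as a noun ('state').
sentence = "Het bericht staat in de krant: de staat New York"
with_all = count_terms(sentence)
filtered = count_terms(sentence, STOPWORDS)

print(with_all["staat"])   # both occurrences are counted
print(filtered["staat"])   # both are dropped, including the noun 'state'
```

A plain word list cannot tell the two uses apart, so the choice of stopwords directly shapes what a frequency count or word cloud can show.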

Mining for meaning?