Date post: 21-Dec-2015
Using Search Engines and Web Crawlers in Social Science Research Mike Thelwall Head, Statistical Cybermetrics Research Group University of Wolverhampton, UK http://linkanalysis.wlv.ac.uk RC33 August 2004
Transcript
Page 1:

Using Search Engines and Web Crawlers in Social Science Research

Mike Thelwall

Head, Statistical Cybermetrics Research Group

University of Wolverhampton, UK

http://linkanalysis.wlv.ac.uk

RC33 August 2004

Page 2:

Link Analysis in Social Science Research

Use to study web phenomena
  E.g. NGO web site interlinking
  E.g. university web site interlinking

Use to study offline phenomena with web aspects
  E.g. scholarly communication
  E.g. the perception of news events

The web is a free, accessible, massive data source for information about many aspects of life

Page 3:

What use is hyperlink data to qualitative researchers?

Part of a mixed methodology
Numbers to back up theories
To obtain samples of types of Web pages for qualitative analyses
Background information on how the Web is used

Page 4:

Quick example 1:

UK university interlinking with geographic clusters indicated

Page 5:

Quick example 2:

Asia-Pacific university interlinking.

{Research with Alastair Smith, VUW, NZ}

Page 6:

Quick example 3:

Geographic interlinking trends for UK universities.

Page 7:

Talk overview

A social science approach for link analysis
Data collection with commercial search engines
Data collection and analysis with SocSciBot

Page 8:

A social science approach for link analysis 1: Preliminary steps

1. Formulate an appropriate research question, taking into account existing knowledge of web structure
2. Conduct a pilot study
3. Identify web pages or sites that are appropriate to address a research question
4. Collect link data from a commercial search engine or a personal crawler, taking appropriate safeguards to ensure that the results obtained are accurate

Page 9:

A social science approach for link analysis 2: Validation

5. Partially validate the link count results through correlation tests

6. Partially validate the interpretation of the results through a link classification exercise or web author interviews
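As an illustration of step 5, a rank correlation such as Spearman's rho is one possible choice of test for comparing link counts with an independent measure; the slides do not say which test was used. The sketch below uses only the Python standard library, and the two data lists are made-up placeholders rather than results from any study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical placeholder data: inlink counts for five sites and an
# independent score for the same five sites, in the same order.
inlink_counts = [120, 45, 300, 80, 15]
external_scores = [3.1, 2.0, 4.5, 1.0, 1.2]

rho = spearman(inlink_counts, external_scores)
print(f"Spearman's rho: {rho:.2f}")
```

A rank-based measure is used here because link counts are typically very skewed; a significance test would still be needed before reading anything into the coefficient.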

Page 10:

A social science approach for link analysis 3: Reporting

8. Report results with an interpretation consistent with the link classification exercise
   Include either a detailed description of the classification or exemplars to illustrate the categories
9. Report the limitations of the study and parameters used in data collection and processing

Page 11:

Link data from commercial search engines

Commercial search engines can give information about the existence of links in the web
Can be used for data collection
Advanced interfaces are usually needed, or special commands

Page 12:

Google

Can find all links to a given web page with the link: command
  E.g. link:http://www.siswo.uva.nl/rc33/

Page 13:

Yahoo! site-specific searches

Yahoo! allows searching for links between pairs of web sites/web spaces
  E.g. linkdomain:db.dk +site:ac.uk returns web pages in the ac.uk domain that link to the db.dk site

…ac.uk/… …db.dk/…
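When many sites are compared, one query is needed for each ordered pair. A minimal sketch of generating those query strings follows, using the linkdomain: and site: syntax exactly as it appears in the example above. The function names are hypothetical, and the code only builds the strings: it does not submit them to any search engine or assume the operators are currently supported.

```python
from itertools import permutations


def link_query(target_domain, source_domain):
    """Build a query for pages in source_domain that link to target_domain,
    following the linkdomain:/site: pattern shown on the slide."""
    return f"linkdomain:{target_domain} +site:{source_domain}"


def pairwise_queries(domains):
    """Return {(source, target): query} for every ordered pair of distinct domains."""
    return {
        (source, target): link_query(target, source)
        for source, target in permutations(domains, 2)
    }


# The two domains from the slide's example.
domains = ["ac.uk", "db.dk"]
queries = pairwise_queries(domains)

for (source, target), query in queries.items():
    print(f"{source} -> {target}: {query}")
```

Ordered pairs are used because links are directional: pages in ac.uk linking to db.dk are a different set from pages in db.dk linking to ac.uk.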

Page 14:

SocSciBot

Personal crawler for link research
Available free at socscibot.wlv.ac.uk
Crawls sets of web sites and analyses the links between them, producing:
  Link lists
  Link counts
  Network diagrams
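To show how a link list relates to link counts, here is a rough sketch that tallies links between sites from (source URL, target URL) pairs. It is not SocSciBot's implementation or file format. The site names, URLs, and function names are all hypothetical, and matching a URL to a site by domain suffix is a simplifying assumption.

```python
from collections import Counter
from urllib.parse import urlparse


def site_of(url, sites):
    """Return the site in `sites` whose domain matches the URL's host, else None."""
    host = (urlparse(url).hostname or "").lower()
    for site in sites:
        if host == site or host.endswith("." + site):
            return site
    return None


def count_intersite_links(link_list, sites):
    """Count links between distinct sites from (source_url, target_url) pairs."""
    counts = Counter()
    for source_url, target_url in link_list:
        source = site_of(source_url, sites)
        target = site_of(target_url, sites)
        # Skip links within one site and links to or from sites outside the set.
        if source and target and source != target:
            counts[(source, target)] += 1
    return counts


# Hypothetical link list; the .example domains are placeholders.
sites = ["a.example", "b.example", "c.example"]
link_list = [
    ("http://www.a.example/index.html", "http://www.b.example/"),
    ("http://www.a.example/staff.html", "http://lib.b.example/catalogue"),
    ("http://www.b.example/links.html", "http://www.c.example/"),
    ("http://www.a.example/index.html", "http://www.a.example/about.html"),
    ("http://www.c.example/", "http://elsewhere.example.org/"),
]

counts = count_intersite_links(link_list, sites)
for (source, target), n in sorted(counts.items()):
    print(f"{source} -> {target}: {n}")
```

The resulting (source, target) counts form a weighted edge list, which is the kind of input a network diagram could be drawn from.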

Page 21:

Reprise: Link Analysis in Social Science Research

Use to study web phenomena
  E.g. NGO web site interlinking
  E.g. university web site interlinking

Use to study offline phenomena with web aspects
  E.g. scholarly communication
  E.g. the perception of news events

The web is a free, accessible, massive data source for information about many aspects of life

But don’t forget the need for validation!

