8/6/2019 Bsc Thesis part1 - WEB BASED INFORMATION RETRIEVAL
WEB BASED INFORMATION
RETRIEVAL
by Theodoros Pitikaris
A thesis submitted in partial fulfillment of the requirements for the
degree of:
BSc in Computing and
Information Technology
Department of Computing
University of Surrey
UNIVERSITY OF SURREY
ABSTRACT
WEB BASED INFORMATION RETRIEVAL
by
Theodoros Pitikaris
Supervisory Committee: Dr. Bogdan Vrusias
Department of Computing
Dr. Nick Antonopoulos
Department of Computing
The World Wide Web contains large sets of information. This characteristic
of the Web, however, can become a real pain for users who seek sources that
are both of high quality and relevant to their information needs. In this
Final Year Project we examine some methods for retrieving information
stored on the web. The main focus is on whether and how software agents
could enhance the information retrieval process.
Another topic that we examine in this Final Year Project is the
requirements, phases and evaluation process that are necessary in the
software design and production process.
  Project development process .......................................... 54
Chapter 5. Software Development PHASES in Details ...................... 58
  Design Overview ...................................................... 58
  Facilities ........................................................... 58
  The core system ...................................................... 59
  Software development platform ........................................ 59
  Integrated Development Environment Development ....................... 60
  System Design ........................................................ 61
  Unit Testing ......................................................... 69
  Integration Testing .................................................. 70
Chapter 6. DISCUSSION .................................................. 72
  Interesting parts during development process ......................... 72
  Prototype evaluation ................................................. 72
  Comments on the evaluation results and related work .................. 74
  Overall project Evaluation ........................................... 75
Chapter 7. Conclusions ................................................. 77
  Future work .......................................................... 78
INDEX .................................................................. 83
LIST OF TABLES
Table 1 Agile vs Waterfall methodology (available from
        http://en.wikipedia.org/wiki/Agile_software_development) ....... 29
Table 2 Development Phases ............................................. 57
Table 3 Sample of a Matrix candidate for SVD ........................... 64
LIST OF FIGURES
Figure 1 Google database development .................................... 6
Figure 2 The Waterfall Model ........................................... 26
Figure 4 Waterfall vs. Agile ........................................... 28
Figure 5 System Use Case ............................................... 62
Figure 6 System State Diagram .......................................... 63
Figure 7 Users' opinion about the system ............................... 74
Acknowledgments
The author wishes to express sincere appreciation to Mr Staurakakis
Emanuel and Mr Tsagatsakis John for their assistance in the preparation of
this Final Year Project report.
INTRODUCTION
In 2001 the Bank of Sweden Prize in Economic Sciences in Memory
of Alfred Nobel was awarded to George Akerlof, Michael Spence and
Joseph Stiglitz for their analyses of markets with asymmetric
information.
With their work
(http://www.nobel.se/economics/laureates/2001/ecoadv.pdf)
they demonstrated not only the importance of information but
also the importance of access to that information.
Nowadays everyone in the West, especially since the development of the
internet, has access to large amounts of data, in electronic or paper
form. The main problem that we usually face is that the volume of
this information is so large that we cannot easily handle it or, worse,
that it is of no use.
In order to take advantage of this information we need to categorize
it by theme and thus turn it into manageable data. A few decades
ago this was the librarians' job, but as already mentioned the volume of
data has increased so dramatically (Society, 2004)
that the traditional methods of indexing are no longer able to meet
this new challenge.
The problem gets bigger when we need to categorize new documents
based on their content. Of course, many documents have an abstract
at the top; but in fact only documents written for a special purpose,
such as scientific papers, take this form. For example, an abstract is
essential for a paper but not for a newspaper or a magazine.
Some people believe that when we talk about retrieving dat