Date post: | 12-Apr-2016 |
Upload: | ujjwal-anand |
A Software Requirement Specification
for “IET SEARCH”
in partial fulfillment for the award of the Degree of
Bachelor of Technology in the Department of Computer Science and Engineering
Submitted To:
Mohit Khandelwal
Project In-charge
CS & IT Department
IET Alwar
Submitted By:
Abhishek Pal
Ujjwal Anand
Akshat Patel
Department of Computer Science and Engineering
Institute of Engineering and Technology
Alwar
September 2015
Candidate’s Declaration
We hereby declare that the work presented in this report, entitled “IET SEARCH”, in partial fulfillment for the award of the Degree of “Bachelor of Technology” in the Department of Computer Science and Engineering, Institute of Engineering and Technology, affiliated to Rajasthan Technical University, is a record of our own investigations carried out under the guidance of Mr. Vinit Bhargava and Mrs. Anupma Mathur, Department of Computer Science and Engineering, IET Alwar.
We have not submitted the matter presented in this report anywhere for the award of any other Degree.
Abhishek Pal (12EIACS702)
Ujjwal Anand (12EIACS110)
Akshat Patel (12EIACS703)
Computer Science and Engineering,
Counter Signed by
Mr. Vinit Bhargava Mrs. Anupma Mathur
Preface
The aim of this project is to develop a simple web-based search engine that demonstrates the main features of a search engine (web crawling,
indexing and ranking) and the interaction between them.
Acknowledgement
We would like to thank our advisors Mr. Vinit Bhargava and Mrs. Anupma Mathur for providing their valuable time, constant guidance
and support throughout this project. We appreciate and thank our project in-charges Mr. Mohit Khandelwal and Mr. Nitin Sharma for their time
and suggestions. We would also like to thank our H.O.D. Mr. Rohit Singhal and our friends for their moral support during the project.
Table of Contents
Candidate’s Declaration
Table of Contents
1. Introduction
1.1 Background Study
1.2 Project Scope
2. Overall Description
2.1 Product Perspective
2.2 Product Features
2.3 User Classes and Characteristics
2.4 Operating Environment
2.5 Design and Implementation Constraints
2.6 Assumptions and Dependencies
3. External Interface Requirements
3.1 User Interfaces
3.2 Hardware Interfaces
3.3 Software Interfaces
3.4 Communications Interfaces
4. Other Nonfunctional Requirements
4.1 Performance Requirements
4.2 Safety Requirements
4.3 Security Requirements
4.4 Software Quality Attributes
5. Design Specifications
5.1 Assumptions
5.2 Constraints
5.3 Design Methodology
5.4 Risk and Volatile areas
6. Architecture
7. Database Schema
7.1 Tables, Fields and Relationships
7.1.1 Databases
7.1.2 New Tables
8. Cost Estimation
Appendix A: Glossary
Appendix B: References
1. Introduction
1.1 Background Study
In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2, 1993. The web's second search engine, Aliweb, appeared in November 1993. One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Google adopted the idea of selling search terms in 1998 from a small search engine company named goto.com. Around 2000, Google's search engine rose to prominence. The company achieved better results for many searches with an innovation called PageRank. By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! later switched to Google's search engine, which it used until 2004, when it launched its own search engine based on the combined technologies of its acquisitions. Microsoft's rebranded search engine, Bing, was launched on June 1, 2009.
With the passing of time, the use of search engines keeps increasing. In response to this increased use, a system has been developed that helps users to search for information. When a person wants to search for anything, he or she simply enters the words in the search engine. The search engine then returns information relevant to those words, based on many criteria. The user still has to analyze the results to extract the necessary information, because a search engine cannot pinpoint the exact information on its own. Search engines use many criteria, such as SEO (Search Engine Optimization), when searching for and returning information, but we rely primarily on the words given in the search query.
1.2 Project Scope
With a 2-month time constraint, we students have looked into the analysis of the search engine and its design and implementation (including integration of modules). A college-information-based search engine is the specific area we will be dealing with over the next 7 months, prior to the implementation details. To gain an insight into how existing search engines work, a comparative study of the features that several engines offer has been made. A survey of existing search engines has also been conducted in order to understand the additional expectations from a current search engine. The planning and requirement gathering stages form the base work for further analysis and design. Hence the planning and requirement gathering stages have also been allotted a time period of 2 months.
2. Overall Description
2.1 Product Perspective
IET Search Engine is a standalone system. It provides modules for crawling, indexing, sorting and searching web pages.
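To illustrate how the indexing, sorting and searching modules might interact, here is a minimal Python sketch. It is not the project's implementation (which targets .NET): the function names, the in-memory inverted index, the sample pages and the ranking by raw word count are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexing: map each word to {url: occurrences}. `pages` maps url -> text."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][url] = count
    return index


def search(index, query):
    """Searching and sorting: URLs containing every query word, best match first."""
    words = tokenize(query)
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    common = set(postings[0]).intersection(*postings[1:])

    def score(url):
        # Assumed ranking rule: total occurrences of the query words on the page.
        return sum(posting[url] for posting in postings)

    return sorted(common, key=lambda url: (-score(url), url))


# Hypothetical pages standing in for the output of the crawling module.
pages = {
    "http://example.edu/admissions": "Admissions open for B.Tech admissions 2015",
    "http://example.edu/library": "Library timings and B.Tech syllabus",
}
index = build_index(pages)
print(search(index, "admissions"))
```

A production index would live in the database rather than in memory, but the division of work between the modules stays the same.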
2.2 Product Features
The main function of IET Search Engine is to allow its users to search for college details pages throughout the WWW. It also allows users to refine a query using filters and to search for phrases.
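Phrase search with a filter could be sketched as below. This is a hedged illustration in Python: the specification does not say which filters are supported, so the `url_filter` substring filter, the function names and the sample pages are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(text, phrase):
    """True if the words of `phrase` occur consecutively and in order in `text`."""
    words, target = tokenize(text), tokenize(phrase)
    if not target:
        return False
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def phrase_search(pages, phrase, url_filter=""):
    """Return URLs whose text contains the phrase and whose URL passes the filter."""
    return sorted(
        url
        for url, text in pages.items()
        if url_filter in url and contains_phrase(text, phrase)
    )


# Hypothetical sample pages.
pages = {
    "http://example.edu/cse": "Department of Computer Science and Engineering",
    "http://example.edu/blog": "The science of computer repair",
    "http://example.org/cse": "Computer Science news",
}
print(phrase_search(pages, "computer science", url_filter="example.edu"))
```

The second page contains both words but not as a consecutive phrase, so it is not returned.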
2.3 User Classes and Characteristics
The major user classes that are expected to use this product are as follows:
2.3.1. Students and Teachers
The IET Search Engine has been developed for college search. This project is used by students to view animations on various topics belonging to physics, mathematics, networking etc. Hence they can use this Search Engine to efficiently search for animations on a required topic from OSCAR as well as from the WWW.
2.3.2. General Users
General users who are interested in viewing college details on different topics can use IET Search Engine to search for them.
2.3.3. Website Developer
A person who develops a website containing college information can submit it to the Search Engine so that the website is crawled and indexed in future.
2.4 Operating Environment
2.4.1. Client Side Requirements
OS : Linux, Windows
Software Packages : Browser
2.4.2. Server Side Requirements
OS : Linux, Windows
Software Packages : VB.NET
2.5 Design and Implementation Constraints
The biggest implementation constraint will be imposed by browser incompatibility.
2.6 Assumptions and Dependencies
1. Pages that contain the query term inside the webpage are more relevant to the query, and hence are given more importance in the final ranking process.
2. The HTML Parser package is used to parse the HTML pages.
3. The Fast MD5 package is used for creating hash values of files, words and URLs.
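The hashing dependency in item 3 can be illustrated with a short Python sketch. Python's standard `hashlib` module stands in here for the Fast MD5 package named above; the helper names and the duplicate-URL check are assumptions made for illustration.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 digest of a word, URL or byte string as 32 hex characters."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    return hashlib.md5(data).hexdigest()


def md5_file(path, chunk_size=65536):
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed use: remember URL hashes so the crawler visits each URL only once.
seen_hashes = set()


def is_new_url(url):
    """Return True the first time a URL is seen and False on every repeat."""
    key = md5_hex(url)
    if key in seen_hashes:
        return False
    seen_hashes.add(key)
    return True


print(md5_hex("hello"))  # 5d41402abc4b2a76b9719d911017c592
```

Fixed-length hashes make comparisons and database lookups cheaper than comparing full URLs or file contents. MD5 is adequate for this kind of de-duplication but should not be relied on for security purposes.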
3. External Interface Requirements
3.1 User Interfaces
There will be two user interfaces provided by the software:
1. A screen containing a search panel that provides an area for the user to input a search query.
2. A results page which lists the links of the documents relevant to the given query.
3.2 Hardware Interfaces
The device should have Internet access.
Name        Minimum Requirements    Optimal Requirements
Processor   1800 MHz                2400 MHz
RAM         128 MB                  1 GB
3.3 Software Interfaces
The software will require the following libraries:
Microsoft Visual Basic .NET
HTML Parser package
Fast MD5 package
3.4 Communications Interfaces
The crawler module of the search engine software uses the HTTP protocol to download pages from the WWW. The user accesses the search engine through a browser.
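As a rough illustration of the crawler's download-and-parse step, the Python sketch below fetches a page over HTTP and extracts its outgoing links. Python's standard library stands in for the HTML Parser package; the class and function names and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


def download(url, timeout=10):
    """Download a page over HTTP and return its text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkExtractor(HTMLParser):
    """Collect the absolute http(s) links found in <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop the #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute.startswith(("http://", "https://")) and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return the de-duplicated list of links in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real crawler would pass the result of download(url).
sample = '<a href="/about.html">About</a> <a href="http://example.org/x#top">X</a> <a href="mailto:a@b.c">Mail</a>'
print(extract_links("http://example.edu/index.html", sample))
```

Each extracted link would then be queued for crawling, and the page text handed to the indexing module.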
4. Other Nonfunctional Requirements
4.1 Performance Requirements
The number of crawlers running at a time is set dynamically depending on the available bandwidth. The average response time for a user is 0.36 sec. The expected accuracy of output is 90%.
4.2 Safety Requirements
If the speed of the crawler is higher than the web server can handle, it may lead the web server to crash. Hence a website developer should specify the speed supported.
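The specification does not say how a website developer would state the supported speed. One common convention, assumed here, is a `Crawl-delay` entry in the site's robots.txt file. The Python sketch below reads such an entry with the standard `urllib.robotparser` module; the user-agent name `IETSearchBot`, the helper names and the sample rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content published by a website developer.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


def load_rules(robots_text):
    """Parse robots.txt text into a rule set the crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def delay_for(parser, user_agent, default_delay=1.0):
    """Seconds to wait between requests; falls back to an assumed default."""
    delay = parser.crawl_delay(user_agent)
    return default_delay if delay is None else float(delay)


rules = load_rules(ROBOTS_TXT)
print(delay_for(rules, "IETSearchBot"))
print(rules.can_fetch("IETSearchBot", "http://example.edu/private/a.html"))
```

A crawler would sleep for the returned delay between successive requests to the same server, and skip any URL for which `can_fetch` returns `False`.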
4.3 Security Requirements
Information should be transmitted securely to the server without any changes to the information.
4.4 Software Quality Attributes
All the software modules are developed in ASP.NET, which makes the system platform independent and robust. Secondly, the system will provide the user with an easy-to-use and understandable GUI.
5. Design Specifications
5.1 Assumptions
The code should be free of compilation errors and syntax errors. The product must have an interface which is simple enough to understand.
5.2 Constraints
This product is a web-based application, hence a major constraint on performance will be the bandwidth of the server’s web connection. Higher bandwidth will result in faster crawling of web pages.
5.3 Design Methodology
Modular Design
The whole system is divided into two parts, i.e. the user section and the admin section. That is why the modular design of the system is also divided into two modular diagrams.
Figure: Modular diagram for user
Figure: Modular diagram for admin
Use-case Diagram
Figure: Use-case Diagram for Admin and Use (legend: Actor, Use-case)
Data Flow Diagram (DFD)
Figure: Level-0 Data Flow Diagram
Figure: Level-1 Data Flow Diagram
Figure: Level-2 Data Flow Diagram
Relational Diagram
Figure: Relational Diagram
Activity Diagram
Figure: Activity Diagram
5.4 Risk and Volatile areas
There are no risks and volatile areas involved in this project.
6. Architecture
7. Database Schema
7.1 Tables, Fields and Relationships
Database Name : iet_data
Table Names : admin_detail, user_info, missing_keyword, keywords
7.1.1 Databases
Database Name : iet_data
7.1.2 New Tables
Table Name        Field Name           Data Type   Allow Nulls   Field Description
admin_detail      user_id              int         Not null
                  user_name            varchar
                  user_password        varchar
                  last_login           varchar
user_info         id_keyword           int         Not null
                  url                  varchar
                  title                varchar
                  matches              varchar
                  discription          varchar
missing_keyword   id_search            int         Not null
                  keyword              varchar
                  date_of_search       varchar
keywords          id_keyword           int         Not null
                  User_id              varchar
                  keyword              varchar
                  date_of_creation     varchar
                  date_of_lastupdate   varchar
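The tables above could be created as in the following sketch. It uses Python's built-in `sqlite3` module with an in-memory database purely for illustration. The specification does not name the database engine, so SQLite, the `create_schema` helper and the sample row are assumptions; table and field names are copied from the schema as written, and no keys or relationships are added because none are specified.

```python
import sqlite3

SCHEMA = """
CREATE TABLE admin_detail (
    user_id            INTEGER NOT NULL,
    user_name          VARCHAR,
    user_password      VARCHAR,
    last_login         VARCHAR
);
CREATE TABLE user_info (
    id_keyword         INTEGER NOT NULL,
    url                VARCHAR,
    title              VARCHAR,
    matches            VARCHAR,
    discription        VARCHAR
);
CREATE TABLE missing_keyword (
    id_search          INTEGER NOT NULL,
    keyword            VARCHAR,
    date_of_search     VARCHAR
);
CREATE TABLE keywords (
    id_keyword         INTEGER NOT NULL,
    User_id            VARCHAR,
    keyword            VARCHAR,
    date_of_creation   VARCHAR,
    date_of_lastupdate VARCHAR
);
"""


def create_schema(connection):
    """Create the four iet_data tables on the given connection."""
    connection.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Assumed usage: record a search term for which no result was found.
conn.execute(
    "INSERT INTO missing_keyword (id_search, keyword, date_of_search) VALUES (?, ?, ?)",
    (1, "hostel fees", "2015-09-01"),
)
print(conn.execute("SELECT keyword FROM missing_keyword").fetchall())
```

Parameterized queries (the `?` placeholders) keep user-supplied search terms from being interpreted as SQL.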
8. Cost Estimation
We have made use of the COCOMO model to estimate the cost of this project. The Constructive Cost Model (COCOMO) is an algorithmic software cost estimation model developed by Barry W. Boehm. The model uses a basic regression formula with parameters that are derived from historical project data and current as well as future project characteristics.
COCOMO applies to three classes of software projects:
Organic projects - "small" teams with "good" experience working with "less than rigid" requirements.
Semi-detached projects - "medium" teams with mixed experience working with a mix of rigid and less than rigid requirements.
Embedded projects - developed within a set of "tight" constraints (hardware, software, operational, ...). It is also a combination of organic and semi-detached projects.
The basic COCOMO equations take the form
Effort Applied: $E = a_b \cdot (\mathrm{KLOC})^{b_b}$ [person-months]
Development Time: $D = c_b \cdot E^{d_b}$ [months]
People Required: $P = E / D$ [count]
where KLOC is the estimated number of delivered lines of code for the project (expressed in thousands). The coefficients $a_b$, $b_b$, $c_b$ and $d_b$ are given in the following table:
Software project   a_b    b_b    c_b    d_b
Organic            2.4    1.05   2.5    0.38
Semi-detached      3.0    1.12   2.5    0.35
Embedded           3.6    1.20   2.5    0.32
Basic COCOMO is good for a quick estimate of software costs. However, it does not account for differences in hardware constraints, personnel quality and experience, use of modern tools and techniques, and so on. Our project is of the semi-detached type, so the values of the four coefficients are 3.0, 1.12, 2.5 and 0.35. To estimate the KLOC, we first have to estimate the number of classes, which are as follows:
Main Class
Web Crawling
Indexing
Searching
DBconnection Class
Database Scanning
Searched Code
So these are some of the classes which have to be incorporated in the project. On this basis, the code size is estimated at around 1500-2000 lines. The estimate may grow in future, but for now let us assume the midpoint, i.e. 1750 lines, giving KLOC = 1.750.
Effort = 3.0 * (1.750)^1.12 = 5.61, i.e. approximately 6 person-months
Development Time = 2.5 * (6)^0.35 = 4.68, i.e. approximately 5 months
The team involved in our project has three members. If the salary of one developer is 5,000 per month, then over 5 months the salary per developer would be 25,000. So the overall cost of the project will be 3 * 25,000 = 75,000/-
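The estimate can be re-computed with a few lines of Python. This sketch follows the report's own rounding (effort rounded up to 6 person-months before the development time is computed); the function names and the `math.ceil` rounding step are assumptions made for illustration.

```python
import math

# Basic COCOMO coefficients (a, b, c, d) per project class, from the table above.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def effort_person_months(kloc, project_class):
    """Effort Applied = a * KLOC^b."""
    a, b, _c, _d = COEFFICIENTS[project_class]
    return a * kloc ** b


def development_months(effort, project_class):
    """Development Time = c * Effort^d."""
    _a, _b, c, d = COEFFICIENTS[project_class]
    return c * effort ** d


kloc = 1.750                      # midpoint of the 1500-2000 line estimate
effort = effort_person_months(kloc, "semi-detached")
effort_rounded = math.ceil(effort)
months = development_months(effort_rounded, "semi-detached")
months_rounded = math.ceil(months)

team_size = 3                     # developers on the project
salary_per_month = 5000           # salary of one developer per month
total_cost = team_size * salary_per_month * months_rounded

print(f"Effort: {effort:.2f} person-months (rounded up to {effort_rounded})")
print(f"Development time: {months:.2f} months (rounded up to {months_rounded})")
print(f"Total cost: {total_cost}")
```

Rounding up at each step gives a slightly conservative figure. Without the intermediate rounding, the development time would come out a little lower, at roughly 4.6 months.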
Appendix A: Glossary
SRS - Software Requirement Specification
Web crawler - Generic term applied to any program which visits websites and systematically retrieves information from them.
HTML - Hypertext Markup Language; the format in which an article is written so as to be readable by a web browser.
Database - A collection of electronically stored data or unit records (facts, bibliographic data, texts) with a common user interface and software for the retrieval and manipulation of data.
Web Host - An intermediary online service which stores items that can be downloaded by the user.
SEO - Search Engine Optimization
WWW - World Wide Web
Appendix B: References
[1] http://en.wikipedia.org/wiki/Web_search_engine
[2] http://en.wikipedia.org/wiki/Web_crawler
[3] http://www.brightplanet.com/2012/11/deep-web-search-engines-vs-web-harvest-engines-finding-intel-in-a-growing-internet/
[4] Roger S. Pressman, Software Engineering