
Search Engine Seminar

Date post: 30-May-2018
Upload: sidgrab

  • 8/14/2019 Search Engine Seminar


    SIDHARTHA SARANGI, Roll: 064021, Reg. no.: 0601292094



    WEB SEARCH ENGINE

    A web search engine is a tool designed to search for information on the World Wide Web.

    The search results may consist of web pages, images, information and other types of files. Search engines work algorithmically or through a mixture of algorithmic and human input.

    According to Netcraft, there are around 240,000,000 web domains globally.



    Current search engines:




    First generation search engine:

    Search results depended on what was on the web page. Factors included keyword density, title, and where in the document keywords appeared.

    The first generation also added relevancy for META tags, keywords in the domain name, and a few bonus points for having keywords in the URL.
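    The on-page factors above can be illustrated with a minimal sketch. The function names and the bonus values are hypothetical and are not taken from any real search engine; only the idea of combining keyword density with title and URL bonuses comes from the text.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_density(text, keyword):
    """Fraction of the words in `text` that are equal to `keyword`."""
    words = tokenize(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def first_generation_score(title, body, url, keyword):
    """Toy on-page score: density plus assumed bonuses for title and URL."""
    score = keyword_density(body, keyword)
    if keyword.lower() in tokenize(title):
        score += 0.5   # assumed bonus for a keyword in the title
    if keyword.lower() in url.lower():
        score += 0.25  # assumed bonus for a keyword in the URL
    return score
```

    Because such a score depends only on the page itself, a page author can raise it simply by repeating keywords, which is one reason later generations looked at signals from outside the page.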

    Second generation search engine:

    These engines track clicks, link popularity and link quality. They then added context, in which two-word keyword pairs are extracted from a page to better categorize it.

    Google's PageRank system and the use of visit length are evidence of a 2nd generation search engine.
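    Link popularity can be sketched with a simplified PageRank iteration. This is an illustration of the general idea only, not Google's production system; the damping value of 0.85, the iteration count and the toy link graph in the usage note are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

    For a hypothetical graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up with the highest rank because two pages link to it, and `b` the lowest because nothing links to it.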

    Third generation search engine:

    It adds word stemming to keep a search in context. Automatic extraction of keyword pairs helps categorize a page. It extracts data about your individual searching habits. It adds Web maps, which are a useful filtering tool for getting rid of duplicate sites.
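    Word stemming can be sketched as simple suffix stripping. The rule list below is a deliberately crude assumption for illustration; real engines use more careful algorithms (the Porter stemmer is a well-known example), and this sketch will mis-stem many words.

```python
# Suffixes are tried in this order, so "es" is checked before "s".
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Different surface forms of a query word collapse to one stem.
stems = {naive_stem(w) for w in ["search", "Searching", "searched", "searches"]}
```

    With stemming applied to both the query and the indexed text, a search for "searching" can also match pages that only contain "searched".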



    How a search engine works: A search engine operates in the following order:

    1. Web crawling
    2. Indexing
    3. Searching




    Web crawling

    Web search engines work by storing information about many web pages, which they retrieve from the WWW using a Web crawler (spider), an automated Web browser that follows every link it sees.

    Googlebot is Google's web crawling robot. It functions like a web browser: it sends a request to a web server for a web page, downloads the entire page, then hands it off to Google's indexer.

    Search engine spiders do not read pages the way a human does. Instead, they tend to see only particular content and are blind to many extras (Flash, JavaScript, images) that are intended for humans.
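    The crawl loop described above (request a page, download it, follow its links) can be sketched as a breadth-first traversal. This is a minimal illustration, not how Googlebot is implemented; the `fetch` callback and the in-memory `FAKE_WEB` pages are hypothetical stand-ins so that the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; `fetch(url)` returns HTML or None if unavailable."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: nothing to store or follow
        pages[url] = html  # the full page would be handed to the indexer
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

# Hypothetical in-memory "web" used instead of real HTTP requests.
FAKE_WEB = {
    "/home": '<a href="/about">About</a> <a href="/news">News</a>',
    "/about": '<a href="/home">Home</a>',
    "/news": '<a href="/missing">Broken link</a>',
}

crawled = crawl("/home", FAKE_WEB.get)
```

    The `seen` set keeps the crawler from requesting the same page twice when pages link back to each other, as `/about` and `/home` do here.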



    Spider simulator: Bput.org

    As we can see, script content such as VBScript does not have any impact on the web spider.

    The only things that matter are text, inbound and outbound links, meta keywords, etc.
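    What a spider simulator reports can be approximated with a small parser that keeps text, links and meta keywords and skips script content. This is a sketch under assumptions: the `SpiderView` class and the sample `PAGE` are hypothetical and do not reproduce the simulator used on the slide.

```python
from html.parser import HTMLParser

class SpiderView(HTMLParser):
    """Keep visible text, links and meta keywords; skip script and style."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.links = []
        self.meta_keywords = []
        self._skip_depth = 0  # greater than zero while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.meta_keywords += [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside script or style blocks is ignored by the spider.
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())

PAGE = """
<html><head>
<meta name="keywords" content="search, engine">
<script type="text/vbscript">MsgBox "hidden from the spider"</script>
</head><body>
<h1>Search Engine Seminar</h1>
<a href="/index.html">Home</a>
</body></html>
"""

view = SpiderView()
view.feed(PAGE)
```

    After parsing, `view.text`, `view.links` and `view.meta_keywords` hold everything this simplified spider "sees"; the script body never reaches them.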



    Indexing

    The Web crawler gives the indexer the full text of the pages it finds. These pages are stored in Google's index database by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.

    To improve search performance, Google ignores stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters).

    The indexer also ignores some punctuation and multiple spaces, and converts all letters to lowercase, to improve Google's performance.
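    The data structure described above is commonly called an inverted index. The following is a minimal sketch, not Google's implementation: the stop-word list contains only the examples from the text, and dropping every single-character token is a simplifying assumption (the text says only certain single digits and letters are ignored).

```python
import re
from collections import defaultdict

# Small illustrative stop-word list taken from the examples in the text.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}

def normalize(text):
    """Lowercase, drop punctuation and extra spaces, and return the words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each term to {doc_id: [word positions]} for a dict of documents."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        # Positions are counted before stop words are removed.
        for position, word in enumerate(normalize(text)):
            if word in STOP_WORDS or len(word) == 1:
                continue  # skip stop words, single letters and single digits
            index[word].setdefault(doc_id, []).append(position)
    return dict(index)

docs = {
    "d1": "The crawler gives the indexer the full text.",
    "d2": "How the Indexer stores   text, by search term!",
}
index = build_index(docs)
```

    Looking up `index["indexer"]` returns both documents together with the word position in each, without scanning the documents themselves.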



    Searching: The query processor has several parts, including the user interface (search box), the engine that evaluates queries and matches them to relevant documents, and the results formatter.
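    Two of these parts, the query evaluator and the results formatter, can be sketched as follows. The sketch assumes that a document matches only if it contains every query term; the toy `INDEX` and `TITLES` are hypothetical, and real engines also rank the matches.

```python
def evaluate_query(query, index, stop_words=frozenset()):
    """Return the ids of documents that contain every term of the query."""
    terms = [t for t in query.lower().split() if t not in stop_words]
    if not terms:
        return set()
    result = set(index.get(terms[0], ()))
    for term in terms[1:]:
        result &= set(index.get(term, ()))  # keep documents that have all terms
    return result

def format_results(doc_ids, titles):
    """Results formatter: one numbered line per matching document."""
    return [f"{n}. {titles[d]}" for n, d in enumerate(sorted(doc_ids), start=1)]

# Hypothetical toy index (term -> document ids) and document titles.
INDEX = {
    "web": {"d1", "d2"},
    "crawler": {"d1"},
    "index": {"d2", "d3"},
}
TITLES = {"d1": "Web crawling", "d2": "Web indexing", "d3": "Index design"}

hits = evaluate_query("the web crawler", INDEX, stop_words={"the"})
lines = format_results(hits, TITLES)
```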



    engine: 3D search engine

    Theme search engine

    Meta search engine

    It's time to look beyond Google.



    3D search engine: A search engine that can mine catalogs of three-dimensional objects and lets users create images as queries for searches.

    Query formulation: Users can select objects from a catalog of images based on product groupings, or they can draw a 2D or 3D representation of the object they want to find.

    Search process: It uses algorithms to convert the selected or drawn image-based query into a mathematical model. The search system then compares the mathematical description of the drawn or selected object to those of 3D objects stored in a database, looking for similarities in the described features.

    Ex: Princeton 3D Model Search Engine, http://shape.cs.princeton.edu/search.html
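    The comparison step can be sketched by treating each mathematical description as a list of numbers and ranking stored objects by similarity. This is an assumption made for illustration: the slide does not say which descriptors or similarity measure are used, and the three-number descriptors and cosine similarity below are hypothetical choices.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two equal-length descriptors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search_shapes(query_descriptor, database, top_k=3):
    """Return the names of the stored objects most similar to the query."""
    scored = [
        (cosine_similarity(query_descriptor, descriptor), name)
        for name, descriptor in database.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical descriptors; a real system would compute them from 3D models.
SHAPES = {
    "chair": [0.9, 0.1, 0.3],
    "table": [0.8, 0.2, 0.1],
    "sphere": [0.1, 0.9, 0.9],
}

best = search_shapes([0.85, 0.15, 0.25], SHAPES, top_k=2)
```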



    Theme search engine: It is also called "in context" or "on topic" searching.

    What you say your page is about, what the search engine calculates your page to be about, and what the rest of the Internet thinks your page is about must all match, according to the engine's mathematical formulas.

    The 2nd and 3rd generation search engines are examples of theme search engines.



    Meta search engine: A meta-search engine is a search tool that sends user requests to several other search engines and/or databases and aggregates the results into a single list or displays them according to their source.

    The Web is too large for any one search engine to index it all, and more comprehensive search results can be obtained by combining the results from several search engines. This may also save the user from having to use multiple search engines separately. It also helps in deep web searching.

    Metasearch engines create what is known as a virtual database. They take a user's request, pass it to several other heterogeneous search engines and then compile the results.
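    Compiling the results into a single list can be sketched as follows. The merging rule (summing reciprocal ranks) is an assumed choice for illustration, and `engine_a` and `engine_b` are hypothetical stand-ins for real search engines; the sketch also records each result's sources so they can be displayed by source.

```python
from collections import defaultdict

def metasearch(query, engines):
    """Send `query` to every engine and merge the ranked URL lists."""
    scores = defaultdict(float)
    sources = defaultdict(list)
    for name, search in engines.items():
        for rank, url in enumerate(search(query), start=1):
            scores[url] += 1.0 / rank  # reciprocal-rank score (an assumed choice)
            sources[url].append(name)
    # Highest combined score first; ties are broken alphabetically by URL.
    ranked = sorted(scores, key=lambda url: (-scores[url], url))
    return [(url, sources[url]) for url in ranked]

# Hypothetical stand-ins for real search engines.
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/"]

def engine_b(query):
    return ["http://shared.example/", "http://b.example/1"]

merged = metasearch("search engine", {"A": engine_a, "B": engine_b})
```

    A URL returned by both engines appears once in the merged list and rises to the top, since it collects a score from each source.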



    Search engine optimization

    Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a web site from search engines.

    Current Optimization Strategies

    1. Cloaking: Hide it from the spider's eye.
    2. Keyword Weight: Use proper keywords.
    4. Stop Words: Be careful with stop words.
    5. Redundancy: Don't use the same pages again.
    6. Lengthy Pages: Focus on one topic.


