Date posted: 12-Apr-2017
Category: Data & Analytics
Uploaded by: ajaybabu1314
WEB MINING
Contents
The Web
Web mining
Data mining vs web mining
Why mine the web
Web mining taxonomy
Applications of web mining
Conclusion
The Web
The Web is a collection of inter-related files on one or more Web servers.
Wealth of information: present everywhere.
Structure: graph structure with links between pages.
Access: hundreds of millions of requests per day.
Web Mining
Web mining is the use of data mining techniques to automatically discover and extract information from Web documents.
It means discovering useful information from the World-Wide Web and its usage patterns.
Data Mining vs Web Mining
Traditional data mining
Data is structured and relational.
Well-defined tables, columns, rows, keys, and constraints.
Web data
Semi-structured and unstructured.
Rich in features and patterns.
Why Mine the Web?
Enormous wealth of information on the Web: financial information, book/CD/video stores, restaurant information, car prices.
Lots of data on user access patterns: Web logs contain the sequence of URLs accessed by users.
Why is Web Mining Different?
The Web is a huge collection of documents, but unlike a plain document collection it also carries hyperlink information and access and usage information.
The Web is very dynamic: new pages are constantly being generated.
Web Mining Taxonomy
Web Mining
  Content Mining
    Text
    Image
    Video
    Audio
    Structured Record
  Structure Mining
    Hyperlink
      Inter-Document Hyperlink
      Intra-Document Hyperlink
    Document Structure
  Usage Mining
    Web Server Log
    Application Server Log
    Application Level Log
Web Content Mining
This is the process of mining useful information from the contents of Web pages and Web documents,
which are mostly text, images and audio/video files.
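As a rough illustration of content mining on text, the sketch below strips the markup from one page and counts how often each word occurs. The sample page, the `TextExtractor` class and the `term_frequencies` helper are hypothetical names made up for this example; real content mining would normally add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def term_frequencies(html):
    """Return a Counter of lower-cased words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))


# Hypothetical sample page, invented for this sketch.
SAMPLE_PAGE = """
<html><head><title>Car prices</title><style>p { color: red; }</style></head>
<body><h1>Used car prices</h1>
<p>Compare car prices before you buy a car.</p></body></html>
"""

freqs = term_frequencies(SAMPLE_PAGE)
print(freqs.most_common(2))
```

The most frequent terms give a crude hint of what the page is about; images, audio and video would need entirely different feature extraction.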
Web Structure Mining
Web structure mining is the process of discovering structure information from the Web.
This type of mining can be performed either at the document level or at the hyperlink level.
Web structure mining can be divided into two kinds:
1. Hyperlink: A hyperlink is a structural unit that connects a location in a Web page to a different location, either within the same Web page or on a different Web page.
2. Document structure: The content within a Web page can also be organized in a tree-structured format, based on the various HTML and XML tags within the page.
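Both kinds of structure can be read off a single page, as the following sketch suggests. It separates intra-document links (same page, different location) from inter-document links, and measures how deeply the tags nest as a simple stand-in for the document tree. The page URL, the sample HTML, the `StructureParser` class and the short `VOID_TAGS` list are assumptions made for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Tags that never have a closing tag; this short list is an assumption
# that is sufficient for the sample page below.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class StructureParser(HTMLParser):
    """Classifies hyperlinks and tracks the nesting depth of the tag tree."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.intra_links = []  # targets inside the same page
        self.inter_links = []  # targets on a different page
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._classify(href)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def _classify(self, href):
        # Resolve relative links, then compare pages with the fragment removed.
        target = urljoin(self.page_url, href)
        target_page, _fragment = urldefrag(target)
        base_page, _ = urldefrag(self.page_url)
        if target_page == base_page:
            self.intra_links.append(target)
        else:
            self.inter_links.append(target)


# Hypothetical page and URL, invented for this sketch.
PAGE_URL = "http://example.com/index.html"
PAGE = """
<html><body>
<h1 id="top">Home</h1>
<p><a href="#top">Back to top</a> <a href="about.html">About</a></p>
<p><a href="http://example.org/">Partner site</a></p>
</body></html>
"""

parser = StructureParser(PAGE_URL)
parser.feed(PAGE)
parser.close()
print("intra-document:", parser.intra_links)
print("inter-document:", parser.inter_links)
print("maximum tag depth:", parser.max_depth)
```

Collecting the inter-document links of many pages yields the edges of the Web graph, which is the input for hyperlink-level analysis.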
Web Usage Mining
Web usage mining is the application of data mining techniques to discover interesting usage patterns from Web data.
Usage data captures the identity or origin of Web users along with their browsing behavior at a Web site.
Web usage mining itself can be classified further depending on the kind of usage data considered:
Web Server Data: The user logs are collected by the Web server. Typical data includes the IP address.
Application Server Data: Commercial application servers have significant features that enable e-commerce applications to be built on top of them with little effort.
Application Level Data: New kinds of events can be defined in an application, and logging can be turned on for them, thus generating histories of these specially defined events.
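For Web server data, a first step is usually to turn raw log lines into per-user sequences of URLs. The sketch below assumes the server writes the Common Log Format and treats each IP address as one user, which is a simplification (proxies and shared machines break it). The sample log lines and the `url_sequences` helper are made up for this example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout: Common Log Format, e.g.
# 192.0.2.1 - - [12/Apr/2017:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical log excerpt, invented for this sketch.
SAMPLE_LOG = """\
192.0.2.1 - - [12/Apr/2017:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043
192.0.2.7 - - [12/Apr/2017:10:00:05 +0000] "GET /cars.html HTTP/1.1" 200 2310
192.0.2.1 - - [12/Apr/2017:10:00:09 +0000] "GET /books.html HTTP/1.1" 200 870
192.0.2.1 - - [12/Apr/2017:10:00:30 +0000] "GET /cars.html HTTP/1.1" 200 2310
"""


def url_sequences(log_text):
    """Group requests by IP address and return each user's URLs in time order."""
    visits = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        visits[match["ip"]].append((when, match["url"]))
    return {ip: [url for _, url in sorted(hits)] for ip, hits in visits.items()}


sequences = url_sequences(SAMPLE_LOG)
for ip, urls in sequences.items():
    print(ip, "->", " > ".join(urls))
```

These sequences are the raw material for pattern discovery, such as finding pages that are frequently visited together.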
Applications of Web Mining
Information retrieval on the Web
Network management
E-commerce
Conclusion
As the Web and its usage continue to grow, web mining has emerged over the past five years as a rapidly growing area, due to the efforts of the research community as well as the various organizations that are practicing it.
Thank you