ORACLE DATA MINING

TOPICS

1 DEVELOPER

2 INTRODUCTION

3 HISTORY

4 FUNCTIONS

DEVELOPER

Oracle Data Mining

Developer(s) : Oracle Corporation

Stable release : 11gR2 / September, 2009

Type : data mining and analytics

License : proprietary

INTRODUCTION

Oracle Data Mining (ODM) is an option of Oracle Corporation's Relational

Database Management System (RDBMS) Enterprise Edition (EE).

It contains several data mining and data analysis algorithms

for classification, prediction, regression, associations, feature selection,

anomaly detection, feature extraction, and specialized analytics.

It provides means for the creation, management and operational deployment

of data mining models inside the database environment.

Oracle Data Mining (ODM) provides powerful data mining functionality as native

SQL functions within the Oracle Database.

Oracle Data Mining enables users to discover new insights hidden in data and to

leverage investments in Oracle Database technology.

With Oracle Data Mining, you can build and apply predictive models that help you

target your best customers, develop detailed customer profiles, and find and prevent

fraud.

Oracle Data Mining, a component of the Oracle Advanced Analytics Option, helps

companies better "compete on analytics."

The workflow-based Oracle Data Miner GUI, an extension to SQL Developer,

allows data analysts to explore their data, build and evaluate models, apply them

to new data and save and share their analytical methodologies.

Data analysts and application developers can use the SQL APIs to build next-

generation applications that automatically mine star schema data to build and

deploy predictive models that deliver real-time results and predictions throughout

the enterprise.

Because the data, models and results remain in the Oracle Database, data

movement is eliminated, information latency is minimized and security is

maintained.

Additionally, Oracle Data Mining models can be included in SQL queries and

embedded in applications to offer improved business intelligence.

Data analysts can quickly access their Oracle data using the Oracle Data Miner

11g Release 2 graphical user interface and explore their data to find patterns,

relationships, and hidden insights.

Oracle Data Mining provides a collection of in-database data mining

algorithms that solve a wide range of business problems.

Anyone who can access data stored in an Oracle Database can access Oracle

Data Mining results (predictions, recommendations, and discoveries)

using Oracle Business Intelligence Solutions.

HISTORY

Oracle Data Mining was first introduced in 2002 and its releases

are named according to the corresponding Oracle database

release:

– Oracle Data Mining 9iR2 (9.2.0.1.0 - May 2002)

– Oracle Data Mining 10gR1 (10.1.0.2.0 - February 2004)

– Oracle Data Mining 10gR2 (10.2.0.1.0 - July 2005)

– Oracle Data Mining 11gR1 (11.1 - September 2007)

– Oracle Data Mining 11gR2 (11.2 - September 2009)

FUNCTIONS

As of release 11gR1, Oracle Data Mining contains the

following data mining functions:

Data transformation and model analysis:

• Data sampling, binning, discretization, and other data

transformations.

• Model exploration, evaluation and analysis.

Feature selection (Attribute Importance):

• Minimum description length (MDL).

Classification:

• Naive Bayes (NB).

• Generalized linear model (GLM) for Logistic regression.

• Support Vector Machine (SVM).

• Decision Trees (DT).

Regression:

• Support Vector Machine (SVM).

• Generalized linear model (GLM) for Multiple regression.

Anomaly detection:

• One-class Support Vector Machine (SVM).

Feature extraction:

• Non-negative matrix factorization (NMF).

Text and spatial mining:

• Combined text and non-text columns of input data.

• Spatial/GIS data.

Clustering:

• Enhanced k-means (EKM).

• Orthogonal Partitioning Clustering (O-Cluster).[2][3]

Association rule learning:

• Item sets and association rules (AM).
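
As a rough illustration of the item sets and association rules named in the last entry above, the sketch below computes support and confidence, two common measures of rule strength, in plain Python. It is not Oracle's implementation. The transaction data and the function names (support, confidence, frequent_pairs) are assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data, not from the slides).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    # Raises ZeroDivisionError if the antecedent never occurs in the data.
    return support(both, transactions) / support(antecedent, transactions)


def frequent_pairs(transactions, min_support):
    """All 2-item sets whose support is at least min_support."""
    items = sorted(set().union(*transactions))
    return [pair for pair in combinations(items, 2)
            if support(pair, transactions) >= min_support]


print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
print(frequent_pairs(transactions, 0.5))
```

A rule such as "bread implies milk" is usually kept only when both its support and its confidence exceed thresholds chosen by the analyst.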

DEFINITIONS

• Data mining (the advanced analysis step of the "Knowledge Discovery in

Databases" process, or KDD), an interdisciplinary subfield of computer

science,[2][3][4] is the computational process of discovering patterns in

large data sets involving methods at the intersection of artificial

intelligence, machine learning, statistics, and database systems.[2] The overall

goal of the data mining process is to extract information from a data set and

transform it into an understandable structure for further use.

• Analysis of data is a process of inspecting, cleaning, transforming, and

modeling data with the goal of highlighting useful information, suggesting

conclusions, and supporting decision making.

• In machine learning and statistics, classification is the problem of identifying

to which of a set of categories (sub-populations) a new observation belongs,

on the basis of a training set of data containing observations (or instances)

whose category membership is known.

• A prediction (Latin præ-, "before," and dicere, "to say") or forecast is a

statement about the way things will happen in the future, often but not always

based on experience or knowledge. While there is much overlap

between prediction and forecast, a prediction may be a statement that some

outcome is expected, while a forecast is more specific, and may cover a range

of possible outcomes.

• In statistics, regression analysis is a statistical technique for estimating the

relationships among variables. It includes many techniques for modeling and

analyzing several variables, when the focus is on the relationship between

a dependent variable and one or more independent variables.

• Association rule learning is a popular and well researched method for

discovering interesting relations between variables in large databases. It is

intended to identify strong rules discovered in databases using different

measures of interestingness.

• In machine learning and statistics, feature selection, also known

as variable selection, attribute selection or variable subset selection, is

the process of selecting a subset of relevant features for use in model

construction. The central assumption when using a feature selection

technique is that the data contains many redundant or irrelevant features.

• Anomaly detection, also referred to as outlier detection, refers to detecting

patterns in a given data set that do not conform to an established normal

behavior. The patterns thus detected are called anomalies and often translate

to critical and actionable information in several application domains.

Anomalies are also referred to as outliers, changes, deviations, surprises,

aberrations, peculiarities, intrusions, etc.

• In pattern recognition and in image processing, feature extraction is a

special form of dimensionality reduction.
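
To make the regression definition above more concrete, here is a minimal sketch of ordinary least squares with one independent variable, written in plain Python. It only illustrates the idea of estimating the relationship between a dependent and an independent variable. It is not the GLM or SVM algorithm used by Oracle Data Mining, and the function name fit_line and the sample data are assumptions for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```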

INPUT SOURCES AND DATA PREPARATION

• Most Oracle Data Mining functions accept as input one relational table or

view.

• Flat data can be combined with transactional data through the use of nested

columns, enabling mining of data involving one-to-many relationships (e.g.

a star schema).

• The full functionality of SQL can be used when preparing data for data

mining, including dates and spatial data.

• Oracle Data Mining distinguishes numerical, categorical, and unstructured

(text) attributes.

• The product also provides utilities for data preparation steps prior to model

building such as outlier treatment, discretization, normalization and binning

(grouping values into a small number of ranges).
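
Two of the preparation steps listed above, normalization and binning, can be sketched in a few lines of plain Python. This is an illustrative assumption of how such transformations work in general (min-max scaling and equal-width bins), not the behavior of Oracle's own data preparation utilities. The function names and the sample ages are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width ranges."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numerical attribute.
ages = [18, 25, 32, 47, 60]
print(min_max_normalize(ages))
print(equal_width_bins(ages, 3))
```

Equal-width binning is only one option. Quantile-based bins, which place roughly the same number of rows in each bin, are a common alternative when the values are heavily skewed.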

GRAPHICAL USER INTERFACE: ORACLE DATA MINER

Oracle Data Mining can be accessed using Oracle Data Miner, a GUI "client"

that provides access to the data mining functions and structured templates

called Mining Activities that automatically prescribe the order of operations,

perform required data transformations, and set model parameters.

The user interface also allows the automated generation of Java and/or SQL

code associated with the data mining activities.

The Java Code Generator is an extension to Oracle JDeveloper.

There is also an independent interface: the Spreadsheet Add-In for Predictive

Analytics, which enables access to the Oracle Data Mining Predictive

Analytics PL/SQL package from Microsoft Excel.

Thank You !!!

Date post:	12-Aug-2015
Upload:	gowripriya12