Date posted: 12-Aug-2015
Category: Documents
Uploaded by: gowripriya12
ORACLE DATA MINING
TOPICS
1 DEVELOPER
2 INTRODUCTION
3 HISTORY
4 FUNCTIONS
DEVELOPER
Oracle Data Mining
Developer(s) : Oracle Corporation
Stable release : 11gR2 / September, 2009
Type : data mining and analytics
License : proprietary
INTRODUCTION
Oracle Data Mining (ODM) is an option of Oracle Corporation's Relational
Database Management System (RDBMS) Enterprise Edition (EE).
It contains several data mining and data analysis algorithms
for classification, prediction, regression, associations, feature selection,
anomaly detection, feature extraction, and specialized analytics.
It provides means for the creation, management and operational deployment
of data mining models inside the database environment.
Oracle Data Mining (ODM) provides powerful data mining functionality as native
SQL functions within the Oracle Database.
Oracle Data Mining enables users to discover new insights hidden in data and to
leverage investments in Oracle Database technology.
With Oracle Data Mining, you can build and apply predictive models that help you
target your best customers, develop detailed customer profiles, and find and prevent
fraud.
Oracle Data Mining, a component of the Oracle Advanced Analytics Option, helps
companies better "compete on analytics."
The Oracle Data Miner workflow-based GUI, an extension to SQL Developer,
allows data analysts to explore their data, build and evaluate models, apply them
to new data, and save and share their analytical methodologies.
Data analysts and application developers can use the SQL APIs to build next-
generation applications that automatically mine star schema data to build and
deploy predictive models that deliver real-time results and predictions throughout
the enterprise.
Because the data, models and results remain in the Oracle Database, data
movement is eliminated, information latency is minimized and security is
maintained.
Additionally, Oracle Data Mining models can be included in SQL queries and
embedded in applications to offer improved business intelligence.
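The idea of calling a model from inside a SQL query can be sketched in Python. This is only an analogy under stated assumptions: SQLite stands in for Oracle Database, and `churn_score`, the `customers` table, and its coefficients are hypothetical names and values invented for illustration. Oracle itself exposes scoring through its own SQL functions (for example `PREDICTION`), which this sketch does not reproduce.

```python
import sqlite3


def churn_score(tenure_months: float, support_calls: float) -> float:
    """Hypothetical hand-written linear score standing in for a trained model."""
    return 0.1 * support_calls - 0.01 * tenure_months


# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Register the Python function so SQL statements can call it by name.
conn.create_function("churn_score", 2, churn_score)

conn.execute(
    "CREATE TABLE customers (id INTEGER, tenure_months REAL, support_calls REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 24, 1), (2, 3, 6), (3, 60, 0)],
)

# The score is computed row by row inside the query; the data never
# leaves the database engine to be scored elsewhere.
rows = conn.execute(
    "SELECT id, churn_score(tenure_months, support_calls) AS score "
    "FROM customers ORDER BY score DESC"
).fetchall()

for customer_id, score in rows:
    print(customer_id, round(score, 2))
```

Because the score is just another SQL expression, it can be filtered, sorted, or joined like any other column, which is the property the slide describes.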
Data analysts can quickly access their Oracle data using the Oracle Data Miner
11g Release 2 graphical user interface and explore their data to find patterns,
relationships, and hidden insights.
Oracle Data Mining provides a collection of in-database data mining
algorithms that solve a wide range of business problems.
Anyone who can access data stored in an Oracle Database can access Oracle
Data Mining results (predictions, recommendations, and discoveries)
using Oracle Business Intelligence Solutions.
HISTORY
Oracle Data Mining was first introduced in 2002 and its releases
are named according to the corresponding Oracle database
release:
– Oracle Data Mining 9iR2 (9.2.0.1.0 - May 2002)
– Oracle Data Mining 10gR1 (10.1.0.2.0 - February 2004)
– Oracle Data Mining 10gR2 (10.2.0.1.0 - July 2005)
– Oracle Data Mining 11gR1 (11.1 - September 2007)
– Oracle Data Mining 11gR2 (11.2 - September 2009)
FUNCTIONS
As of release 11gR1, Oracle Data Mining contains the
following data mining functions:
Data transformation and model analysis:
• Data sampling, binning, discretization, and other data
transformations.
• Model exploration, evaluation and analysis.
Feature selection (Attribute Importance):
• Minimum description length (MDL).
Classification:
• Naive Bayes (NB).
• Generalized linear model (GLM) for logistic regression.
• Support Vector Machine (SVM).
• Decision Trees (DT).
Regression:
• Support Vector Machine (SVM).
• Generalized linear model (GLM) for multiple regression.
Anomaly detection:
• One-class Support Vector Machine (SVM).
Feature extraction:
• Non-negative matrix factorization (NMF).
Text and spatial mining:
• Combined text and non-text columns of input data.
• Spatial/GIS data.
Clustering:
• Enhanced k-means (EKM).
• Orthogonal Partitioning Clustering (O-Cluster).[2][3]
Association rule learning:
• Item sets and association rules (AM).
• Data mining (the advanced analysis step of the "Knowledge Discovery in
Databases" process, or KDD), an interdisciplinary subfield of computer
science,[2][3][4] is the computational process of discovering patterns in
large data sets involving methods at the intersection of artificial
intelligence, machine learning, statistics, and database systems.[2] The overall
goal of the data mining process is to extract information from a data set and
transform it into an understandable structure for further use.
• Analysis of data is a process of inspecting, cleaning, transforming, and
modeling data with the goal of highlighting useful information, suggesting
conclusions, and supporting decision making.
• In machine learning and statistics, classification is the problem of identifying
to which of a set of categories (sub-populations) a new observation belongs,
on the basis of a training set of data containing observations (or instances)
whose category membership is known.
• A prediction (Latin præ-, "before," and dicere, "to say") or forecast is a
statement about the way things will happen in the future, often but not always
based on experience or knowledge. While there is much overlap
between prediction and forecast, a prediction may be a statement that some
outcome is expected, while a forecast is more specific, and may cover a range
of possible outcomes.
• In statistics, regression analysis is a statistical technique for estimating the
relationships among variables. It includes many techniques for modeling and
analyzing several variables, when the focus is on the relationship between
a dependent variable and one or more independent variables.
• Association rule learning is a popular and well-researched method for
discovering interesting relations between variables in large databases. It is
intended to identify strong rules discovered in databases using different
measures of interestingness.
• In machine learning and statistics, feature selection, also known
as variable selection, attribute selection or variable subset selection, is
the process of selecting a subset of relevant features for use in model
construction. The central assumption when using a feature selection
technique is that the data contains many redundant or irrelevant features.
• Anomaly detection, also referred to as outlier detection, refers to detecting
patterns in a given data set that do not conform to an established normal
behavior. The patterns thus detected are called anomalies and often translate
to critical and actionable information in several application domains.
Anomalies are also referred to as outliers, changes, deviations, surprises,
aberrations, peculiarities, intrusions, etc.
• In pattern recognition and in image processing, feature extraction is a
special form of dimensionality reduction.
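The "measures of interestingness" mentioned in the association rule definition above can be made concrete with the two most common ones, support and confidence. The following is a minimal Python sketch over a toy list of shopping baskets. The data and the function names are invented for illustration, and the code is not how Oracle Data Mining implements its association function.

```python
from typing import FrozenSet, List

# Toy transactions (hypothetical data): one set of items per basket.
BASKETS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: FrozenSet[str], baskets: List[FrozenSet[str]]) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    baskets: List[FrozenSet[str]],
) -> float:
    """Support of (antecedent and consequent together) divided by support of antecedent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


# Rule under test: "customers who buy bread also buy milk".
rule_support = support(frozenset({"bread", "milk"}), BASKETS)
rule_confidence = confidence(frozenset({"bread"}), frozenset({"milk"}), BASKETS)

print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Here two of the four baskets contain both bread and milk, and two of the three bread baskets also contain milk. A rule is usually reported as "strong" only when both measures exceed user-chosen thresholds.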
INPUT SOURCES AND DATA PREPARATION
• Most Oracle Data Mining functions accept as input one relational table or
view.
• Flat data can be combined with transactional data through the use of nested
columns, enabling mining of data involving one-to-many relationships (e.g.
a star schema).
• The full functionality of SQL can be used when preparing data for data
mining, including dates and spatial data.
• Oracle Data Mining distinguishes numerical, categorical, and unstructured
(text) attributes.
• The product also provides utilities for data preparation steps prior to model
building, such as outlier treatment, discretization, normalization, and binning
(grouping values into a small number of ranges).
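Two of the preparation steps listed above, normalization and binning, can be sketched in a few lines of Python. This is an illustrative sketch assuming min-max normalization and equal-width bins. The function names and the sample `ages` values are hypothetical, and Oracle Data Mining's own transformation utilities are not shown here.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values: List[float], n_bins: int) -> List[int]:
    """Assign each value the index (0 .. n_bins - 1) of its equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18.0, 25.0, 40.0, 58.0]  # hypothetical numerical attribute
normalized = min_max_normalize(ages)
bins = equal_width_bin(ages, 4)

print(normalized)
print(bins)
```

Binning replaces each numerical value with the index of the range it falls into, which turns a numerical attribute into a categorical one. It does not reorder the rows.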
GRAPHICAL USER INTERFACE: ORACLE DATA MINER
Oracle Data Mining can be accessed using Oracle Data Miner, a GUI "client"
that provides access to the data mining functions and to structured templates
called Mining Activities, which automatically prescribe the order of operations,
perform required data transformations, and set model parameters.
The user interface also allows the automated generation of Java and/or SQL
code associated with the data mining activities.
The Java Code Generator is an extension to Oracle JDeveloper.
There is also an independent interface: the Spreadsheet Add-In for Predictive
Analytics, which enables access to the Oracle Data Mining Predictive
Analytics PL/SQL package from Microsoft Excel.
Thank You !!!