A Short PMML Tutorial by LatentView

Date post: 20-Jan-2015
Page 1: A Short PMML Tutorial by LatentView

Ramesh Hariharan

PMML Tutorial

www.LatentView.com

This presentation is solely for the use of LatentView. No part of this presentation may be circulated, quoted, or reproduced for distribution without prior written approval from LatentView.

12-Feb-2009

www.latentview.com/blog

Page 2: A Short PMML Tutorial by LatentView

Agenda

• PMML Overview
• Constructing a PMML
• XSD Overview
• Reading the PMML Specification
• Next Steps…

LatentView Analytics Pvt. Ltd (Confidential)

Page 4: A Short PMML Tutorial by LatentView

PMML Overview

PMML – Predictive Model Markup Language
• Used for model scoring
• XML document
• Owned by the DMG (Data Mining Group), a consortium led by SPSS, SAS, IBM, Microsoft, Oracle and others
• Currently in version 3.2

Advantages of PMML
• Portability of models
• Metadata standardization
• Model once, score anywhere (MOSA ☺)

Drawbacks of PMML
• Least common denominator
• Potential loss of precision
• Lack of support for complex transformations
• Lack of support from tools

Some of the Model Types Supported
• Association Rules, Clustering, General Regression, Naïve Bayes, Neural Networks, Support Vector Machines

Capabilities of PMML
• Model composition – model sequencing & model selection
• Built-in and user-defined functions
• Usual data types – date, numbers, category
• Model verification – sample results for testing
• Output field – create output tables based on the models
• Extension mechanisms

Page 5: A Short PMML Tutorial by LatentView

PMML in the Decision Management Architecture

[Architecture diagram; only its labels survive in this transcript: Enterprise Data, Product Data, Channel Data, Customer Data, Payment History Data, Interaction Data, Analytics Data Backbone, Analytic Modeling, Model Development, LatentView Analysts, Model Repository, Enterprise Decision Engine, Business Rules, Decision Models, Client Managers, Create Rules, Business Rules formulation, Requests, Scores and Decisions, Operational Systems, Sales & Marketing, Customer Management, Risk Management, Other Applications.]

Page 7: A Short PMML Tutorial by LatentView

Constructing a PMML

<?xml version="1.0"?>
<PMML version="3.2"
      xmlns="http://www.dmg.org/PMML-3_2"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Header copyright="Example.com"/>
  <DataDictionary> ... </DataDictionary>
  ... a model ...
</PMML>

www.dmg.org
http://dmg.org/v3-2/GeneralStructure.html
http://dmg.org/v3-2/pmml-3-2.xsd
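The skeleton on this slide can be loaded with any XML parser. The following is a minimal Python sketch using only the standard library; the function name `top_level_tags` and the empty `DataDictionary` are assumptions made for illustration, and the document it parses is a skeleton, not a scoreable model.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-3_2"  # namespace URI shown on the slide

# Hypothetical minimal document: the DataDictionary is left empty and the
# model is omitted, so this is a skeleton rather than a scoreable PMML file.
SKELETON = """<?xml version="1.0"?>
<PMML version="3.2"
      xmlns="http://www.dmg.org/PMML-3_2"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Header copyright="Example.com"/>
  <DataDictionary/>
</PMML>"""


def top_level_tags(pmml_text):
    """Return (version, local names of the root's children)."""
    root = ET.fromstring(pmml_text)
    # ElementTree spells namespaced tags as "{uri}local".
    if root.tag != "{%s}PMML" % PMML_NS:
        raise ValueError("root element is not a PMML 3.2 element")
    children = [child.tag.split("}", 1)[1] for child in root]
    return root.get("version"), children


version, children = top_level_tags(SKELETON)
print(version, children)
```

Because the root declares a default namespace, every child element inherits it; that is why the sketch strips the `{uri}` prefix before reporting the tag names.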

Page 10: A Short PMML Tutorial by LatentView

XSD Overview

XSD – XML Schema Definition

The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.

An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes

Page 11: A Short PMML Tutorial by LatentView

A First Example

Look at this simple XML document called "note.xml":

<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Look at the XML Schema for the same document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
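To make the role of xs:sequence concrete, here is a small Python sketch that reads the child order from the schema and compares it with a document. It is a hand-rolled check rather than a schema validator: the helper names `declared_sequence` and `follows_sequence` are hypothetical, the targetNamespace is dropped for simplicity, and occurrence counts and data types are ignored.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Simplified copy of the slide's schema (assumption: no targetNamespace).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

NOTE = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def declared_sequence(schema_text, element_name):
    """Return the child names listed in the element's xs:sequence."""
    schema = ET.fromstring(schema_text)
    for element in schema.findall(XS + "element"):
        if element.get("name") == element_name:
            complex_type = element.find(XS + "complexType")
            sequence = complex_type.find(XS + "sequence")
            return [child.get("name") for child in sequence.findall(XS + "element")]
    raise KeyError(element_name)


def follows_sequence(doc_text, schema_text):
    """True if the root's children appear exactly in the declared order."""
    root = ET.fromstring(doc_text)
    return [child.tag for child in root] == declared_sequence(schema_text, root.tag)


print(follows_sequence(NOTE, SCHEMA))
```

Swapping two children of the note, for example putting from before to, makes the check fail, which is the behaviour the sequence indicator is meant to enforce.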

Page 12: A Short PMML Tutorial by LatentView

Simple Elements

<xs:element name="xxx" type="yyy"/>

where xxx is the name of the element and yyy is the data type of the element. XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time

Example

<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>

<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>

Page 13: A Short PMML Tutorial by LatentView

XSD Attributes

Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.

<xs:attribute name="xxx" type="yyy"/>

where xxx is the name of the attribute and yyy specifies the data type of the attribute. XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time

Example

<lastname lang="EN">Smith</lastname>

<xs:attribute name="lang" type="xs:string"/>

Page 14: A Short PMML Tutorial by LatentView

Simple Elements: Restrictions

Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets.

Restrictions on Values

<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Restrictions on a Set of Values

<xs:element name="car" type="carType"/>

<xs:simpleType name="carType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>
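The two kinds of facet above can be read as simple checks on a candidate value. The Python sketch below illustrates this; the function name `satisfies` is hypothetical, and it handles only the minInclusive, maxInclusive and enumeration facets for the integer and string cases shown on the slide.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

AGE_TYPE = """<xs:simpleType xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="120"/>
  </xs:restriction>
</xs:simpleType>"""

CAR_TYPE = """<xs:simpleType name="carType" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>"""


def satisfies(simple_type_text, raw_value):
    """Check raw_value (a string) against the facets of one xs:restriction."""
    restriction = ET.fromstring(simple_type_text).find(XS + "restriction")
    allowed = [e.get("value") for e in restriction.findall(XS + "enumeration")]
    if allowed and raw_value not in allowed:
        return False
    low = restriction.find(XS + "minInclusive")
    high = restriction.find(XS + "maxInclusive")
    # Compare with "is not None": an element without children is falsy.
    if low is not None and int(raw_value) < int(low.get("value")):
        return False
    if high is not None and int(raw_value) > int(high.get("value")):
        return False
    return True


print(satisfies(AGE_TYPE, "36"), satisfies(CAR_TYPE, "Volvo"))
```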

Page 15: A Short PMML Tutorial by LatentView

Complex Elements

<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>

The element can refer to a named complex type:

<xs:element name="employee" type="personinfo"/>

<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

Or the complex type can be declared inline:

<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Page 16: A Short PMML Tutorial by LatentView

More Complex Elements

You can also base a complex element on an existing complex element and add some elements, like this:

<xs:element name="employee" type="fullpersoninfo"/>

<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="fullpersoninfo">
  <xs:complexContent>
    <xs:extension base="personinfo">
      <xs:sequence>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
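The effect of xs:extension is that the derived type carries the base type's elements first, followed by its own. A short Python sketch of that resolution follows; the function name `element_names` is hypothetical, and it covers only named complex types built from xs:sequence, as on this slide.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="fullpersoninfo">
    <xs:complexContent>
      <xs:extension base="personinfo">
        <xs:sequence>
          <xs:element name="address" type="xs:string"/>
          <xs:element name="city" type="xs:string"/>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""


def element_names(schema, type_name):
    """List child element names of a named complex type, base type first."""
    types = {t.get("name"): t for t in schema.findall(XS + "complexType")}
    complex_type = types[type_name]
    content = complex_type.find(XS + "complexContent")
    if content is None:
        inherited = []
        sequence = complex_type.find(XS + "sequence")
    else:
        extension = content.find(XS + "extension")
        # Resolve the base type recursively, then append the extension's own elements.
        inherited = element_names(schema, extension.get("base"))
        sequence = extension.find(XS + "sequence")
    own = [] if sequence is None else [e.get("name") for e in sequence.findall(XS + "element")]
    return inherited + own


print(element_names(ET.fromstring(SCHEMA), "fullpersoninfo"))
```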

Page 17: A Short PMML Tutorial by LatentView

XSD Indicators

There are seven indicators:

Order indicators:
• All
• Choice
• Sequence

Occurrence indicators:
• maxOccurs
• minOccurs

Group indicators:
• Group name
• attributeGroup name

Page 18: A Short PMML Tutorial by LatentView

Complex Type: Example

Let's have a look at this XML document called "shiporder.xml":

<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>

Page 19: A Short PMML Tutorial by LatentView

Complex Type: Example Solution

The XSD for the file:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:simpleType name="stringtype">
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<xs:simpleType name="inttype">
  <xs:restriction base="xs:positiveInteger"/>
</xs:simpleType>

<xs:simpleType name="dectype">
  <xs:restriction base="xs:decimal"/>
</xs:simpleType>

<xs:simpleType name="orderidtype">
  <xs:restriction base="xs:string">
    <xs:pattern value="[0-9]{6}"/>
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="shiptotype">
  <xs:sequence>
    <xs:element name="name" type="stringtype"/>
    <xs:element name="address" type="stringtype"/>
    <xs:element name="city" type="stringtype"/>
    <xs:element name="country" type="stringtype"/>
  </xs:sequence>
</xs:complexType>

…continued on the next slide

Page 20: A Short PMML Tutorial by LatentView

Complex Type: Example Solution

The XSD for the file:

…continued from the previous slide

<xs:complexType name="itemtype">
  <xs:sequence>
    <xs:element name="title" type="stringtype"/>
    <xs:element name="note" type="stringtype" minOccurs="0"/>
    <xs:element name="quantity" type="inttype"/>
    <xs:element name="price" type="dectype"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="shipordertype">
  <xs:sequence>
    <xs:element name="orderperson" type="stringtype"/>
    <xs:element name="shipto" type="shiptotype"/>
    <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
  </xs:sequence>
  <xs:attribute name="orderid" type="orderidtype" use="required"/>
</xs:complexType>

<xs:element name="shiporder" type="shipordertype"/>

</xs:schema>
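A few of the constraints in this schema can be restated as plain checks, which may help when reading it: the orderid pattern, the occurrence indicators on item and note, and the positiveInteger type of quantity. The Python sketch below is a hand-written illustration, not a schema validator; `check_order` and its messages are hypothetical, and the shipto block is omitted from the sample for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated copy of the ship order from the earlier slide (shipto omitted).
ORDER = """<shiporder orderid="889923">
  <orderperson>John Smith</orderperson>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>"""


def check_order(order_text):
    """Return a list of problems; an empty list means these checks passed."""
    root = ET.fromstring(order_text)
    problems = []
    # orderidtype: xs:pattern "[0-9]{6}" must match the whole attribute value.
    if not re.fullmatch(r"[0-9]{6}", root.get("orderid", "")):
        problems.append("orderid must be exactly six digits")
    # item: maxOccurs="unbounded"; minOccurs is not given, so it defaults to 1.
    items = root.findall("item")
    if len(items) < 1:
        problems.append("at least one item is required")
    for item in items:
        # note: minOccurs="0"; maxOccurs is not given, so it defaults to 1.
        if len(item.findall("note")) > 1:
            problems.append("an item may carry at most one note")
        # quantity: xs:positiveInteger (simplified lexical check, digits only).
        quantity = item.findtext("quantity", "")
        if not re.fullmatch(r"[0-9]+", quantity) or int(quantity) < 1:
            problems.append("quantity must be a positive integer")
    return problems


print(check_order(ORDER))
```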

Page 22: A Short PMML Tutorial by LatentView

PMML: Headers

<Header copyright="Copyright (c) 2009 LatentView" description="LatentView Logit Model v1.0">
  <Extension name="timestamp" value="2009-01-19 19:38:13" extender="Rattle"/>
  <Extension name="description" value="Administrator" extender="Rattle"/>
  <Application name="Rattle/PMML" version="1.2.0"/>
</Header>

Page 23: A Short PMML Tutorial by LatentView

PMML: Data Dictionary

<DataDictionary numberOfFields="23">
  <DataField name="ind_Sale" optype="continuous" dataType="double"/>
  …
  <DataField name="STATE" optype="categorical" dataType="string"/>
</DataDictionary>
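A scoring engine typically starts by reading the data dictionary into a lookup of field names and types. The Python sketch below shows one way this might look; the two-field dictionary is a hypothetical stand-in (the slide's dictionary declares 23 fields and elides most of them), and `read_fields` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-field dictionary built from the two fields the slide shows.
DICTIONARY = """<DataDictionary numberOfFields="2">
  <DataField name="ind_Sale" optype="continuous" dataType="double"/>
  <DataField name="STATE" optype="categorical" dataType="string"/>
</DataDictionary>"""


def read_fields(dictionary_text):
    """Map each field name to (optype, dataType), checking numberOfFields."""
    dictionary = ET.fromstring(dictionary_text)
    fields = {
        field.get("name"): (field.get("optype"), field.get("dataType"))
        for field in dictionary.findall("DataField")
    }
    declared = dictionary.get("numberOfFields")
    # If the count is present it should agree with the number of DataField elements.
    if declared is not None and int(declared) != len(fields):
        raise ValueError("numberOfFields=%s but %d fields found" % (declared, len(fields)))
    return fields


fields = read_fields(DICTIONARY)
print(fields)
```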

Page 24: A Short PMML Tutorial by LatentView

PMML Transformations

PMML defines various kinds of simple data transformations:
• Normalization: map values to numbers; the input can be continuous or discrete.
• Discretization: map continuous values to discrete values.
• Value mapping: map discrete values to discrete values.
• Functions: derive a value by applying a function to one or more parameters.
• Aggregation: summarize or collect groups of values, e.g., compute average.

Value Mapping

<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original"/>
    <InlineTable>
      <row><original>02</original><derived>1</derived></row>
    </InlineTable>
  </MapValues>
</DerivedField>

Built-in Function

<DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
  <Apply function="sum">
    <FieldRef field="I1EXACTAGE"/>
    <FieldRef field="I1ESTIMATEDAGE"/>
  </Apply>
</DerivedField>
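To show how the value mapping above is meant to be evaluated, here is a minimal Python sketch. It assumes a single FieldColumnPair and an InlineTable, as on the slide; `apply_map_values` is a hypothetical helper, and results are returned as strings without converting to the declared dataType.

```python
import xml.etree.ElementTree as ET

MAP_VALUES = """<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original"/>
    <InlineTable>
      <row><original>02</original><derived>1</derived></row>
    </InlineTable>
  </MapValues>
</DerivedField>"""


def apply_map_values(derived_field_text, record):
    """Evaluate a single-input MapValues against a record (a dict of strings)."""
    map_values = ET.fromstring(derived_field_text).find("MapValues")
    pair = map_values.find("FieldColumnPair")
    value = record.get(pair.get("field"))
    if value is None:
        # Input is missing: use mapMissingTo.
        return map_values.get("mapMissingTo")
    for row in map_values.find("InlineTable").findall("row"):
        # Match the input against the column named in the FieldColumnPair.
        if row.findtext(pair.get("column")) == value:
            return row.findtext(map_values.get("outputColumn"))
    # No row matched: fall back to defaultValue.
    return map_values.get("defaultValue")


print(apply_map_values(MAP_VALUES, {"ETHNICGROUPCODE": "02"}))
```

In effect this DerivedField is a dummy variable: it is 1 when ETHNICGROUPCODE equals "02" and 0 for any other or missing value.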

Page 26: A Short PMML Tutorial by LatentView

PMML: Mining Schema

<MiningSchema>
  <MiningField name="ind_Sale" usageType="predicted" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1ESTIMATEDAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I2ESTIMATEDAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  …
  <MiningField name="I1EXACTAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
</MiningSchema>
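The mining schema tells the scorer which fields are inputs and what to substitute when an input is missing. The Python sketch below illustrates that reading under two assumptions: all active fields are numeric, and the schema is shortened to three of the slide's fields. The helper name `prepare_inputs` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's mining schema (elided fields left out).
MINING_SCHEMA = """<MiningSchema>
  <MiningField name="ind_Sale" usageType="predicted" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1ESTIMATEDAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
  <MiningField name="I1EXACTAGE" usageType="active" missingValueReplacement="-1" missingValueTreatment="asValue"/>
</MiningSchema>"""


def prepare_inputs(schema_text, record):
    """Return the active fields of record, replacing missing values."""
    prepared = {}
    for field in ET.fromstring(schema_text).findall("MiningField"):
        # usageType defaults to "active" when the attribute is absent.
        if field.get("usageType", "active") != "active":
            continue  # the predicted field is the target, not an input
        name = field.get("name")
        value = record.get(name)
        if value is None:
            replacement = field.get("missingValueReplacement")
            # Assumption: active fields are numeric, so the replacement is a number.
            value = None if replacement is None else float(replacement)
        prepared[name] = value
    return prepared


print(prepare_inputs(MINING_SCHEMA, {"I1ESTIMATEDAGE": 42.0}))
```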

Page 28: A Short PMML Tutorial by LatentView

Next Steps

• Create a PMML file from your models – one each for Logistic, Clustering and Decision Tree models

• Build PMML manually, and validate it using an XML editor such as XMLFox (a syntactically valid PMML may not be logically valid)
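One example of the gap between syntactic and logical validity is a mining field that names a field the data dictionary never declares; an XML editor that only checks element structure would not flag it. The Python sketch below checks for this one inconsistency. The document is a deliberately incomplete, hypothetical fragment (the model element is a stub), and `undeclared_fields` is an illustrative helper, not part of any PMML tool.

```python
import xml.etree.ElementTree as ET

NS = {"p": "http://www.dmg.org/PMML-3_2"}

# Hypothetical fragment: well-formed XML, but REGION is never declared
# in the data dictionary. The model element is a stub for illustration.
DOC = """<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2">
  <Header copyright="Example.com"/>
  <DataDictionary numberOfFields="1">
    <DataField name="STATE" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="STATE"/>
      <MiningField name="REGION"/>
    </MiningSchema>
  </RegressionModel>
</PMML>"""


def undeclared_fields(pmml_text):
    """Return mining field names that have no matching DataField."""
    root = ET.fromstring(pmml_text)
    declared = {f.get("name") for f in root.findall(".//p:DataField", NS)}
    used = {f.get("name") for f in root.findall(".//p:MiningField", NS)}
    return sorted(used - declared)


print(undeclared_fields(DOC))
```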

Page 29: A Short PMML Tutorial by LatentView

Thank You!

JVL Plaza, Ground Floor, 626 Anna Salai, Teynampet, Chennai – 600 018
Phone: +91-44-4509 4039/40

80, Broad Street, 5th Floor, New York, NY 10004
Phone: +1-212-837-7874

