Date post: | 20-Jun-2015 |
Category: | Technology |
Upload: | kevin-mccarthy |
Not for Public Use
3rd Party Data
Who has it, how to get it and use it
Pittsburgh, PA
October 10th, 2012
2
Hi, I’m Kevin
compdata
I live here
I like this
I work on this
3
Focus @ Risk Metrics
Data
Cloud
Mobile
4
Data
¨ Average 1 Terabyte net new monthly
¨ 209 million records processed YTD 2012
¨ Manage data on 4+ million businesses
¨ 16 people
5
“Cloud”
.0064%
6
Mobile
7
Second Day Presenter
8
Dealing with Big Data through Analogies
9
Who do we have in today?
Show of hands…
¨ Marketing
¨ Distribution
¨ Business Intelligence
¨ Underwriting
¨ Claims
¨ Data Scientists?
10
Pioneers
11
3rd Party Data
Who has it, how to get it and how to use it.
¨ You have internal data
¨ What else is out there?
¨ How do you get access to it?
¨ Acquire, integrate and leverage?
12
3rd Party Data
Why?
¨ Models are made up of variables likely to influence future
behavior or results.
¨ The more relevant data you collect of the right type and at
the right level, the more accurate the prediction.
¨ Accessible
¨ Structured
¨ Scalable
“We don’t win because we have better algorithms, we win because we have better data”
- Larry Page, Google
13
Application Potential
14
Who is 3rd Party
15
Data Markets & Data as a Service (DaaS)
¨ SaaS + PaaS + 3rd Party = DaaS
¨ Recent emergence
¨ Growing trend
¨ Not yet “one stop shop”
Recent emergence of data marketplaces.
16
Access and Delivery Methodologies
¨ Download
¨ Views
¨ Batch
¨ API
¨ Scrape
17
Normalization
Preparation for integration
Address Triangulation
CASS, NCOA and Lat/long
Initial Data Set
Name Standardization
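The name standardization step could be sketched roughly as below. This is only an illustration: the abbreviation table and the function name `standardize_name` are hypothetical, not the rules used in the presenter's pipeline, and address steps such as CASS or NCOA processing are not covered.

```python
import re

# Hypothetical abbreviation table; a production list would be far larger.
ABBREVIATIONS = {
    "rstrnt": "restaurant",
    "hmbrgrs": "hamburgers",
    "co": "company",
    "inc": "incorporated",
}


def standardize_name(name: str) -> str:
    """Lowercase, strip punctuation, and expand known abbreviations."""
    # Drop apostrophes entirely so "Wendy's" and "Wendys" become the same token.
    cleaned = name.lower().replace("'", "").replace("\u2019", "")
    # Replace any other punctuation with a space.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", cleaned)
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


if __name__ == "__main__":
    print(standardize_name("L. L. Vann Electric, Inc."))
    print(standardize_name("L L VANN ELECTRIC INC"))
```

Under these assumed rules, the business name and the property owner name from the same record reduce to the same string, which is what makes a later join possible.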
18
“Fuzzy Logic”
Mathematical processes that determine the similarities between data sets
19
“Fuzzy Logic”
Same, even though they look different:
¨ “McDonald’s Restaurant” and “Mc Donalds Family Rstrnt”
¨ “Starbucks Coffee Co” and “Starbucks”
¨ “Wendys Old Fashion Hmbrgrs” and “Wendy’s Restaurant”
Science + Art
Different, even though they look similar:
¨ “Jim S. Starbuck” (CPA) and “Jim St. Starbuck” (Starbucks on
Jim St.)
¨ “Wendy’s” vs. “Wendy’s Donuts” vs. “Nails by Wendy”
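A minimal sketch of why this is "Science + Art", assuming a plain character-level similarity score from Python's standard `difflib` module. This is a stand-in technique chosen for illustration, not the matching method used by the presenter, and the helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 character-level similarity score, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    pairs = [
        ("Starbucks Coffee Co", "Starbucks"),    # same business
        ("Jim S. Starbuck", "Jim St. Starbuck"),  # different businesses
    ]
    for left, right in pairs:
        print(f"{left!r} vs {right!r}: {similarity(left, right):.2f}")
```

With this score the "different" pair rates higher than the "same" pair, so a raw similarity threshold is not enough on its own. Extra evidence such as address, industry code, or manual review is needed to make the final call.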
20
API
So what's an API anyway?
¨ Application Programming Interface.
¨ Commands and formats for standardized program
communication
Providers
Analogies
21
APIs and Data-as-a-Service (DaaS)
Scalable access to data
URL API Query
sTe52D4spwCvnAX47RpBHhz608i
XML or JSON Response
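The query and response pattern above might look like this in code. Everything specific here is an assumption for illustration: the endpoint `https://api.example.com/v1/business`, the parameter names, the placeholder key `DEMO_KEY`, and the response fields do not come from any real provider. No network call is made; the sample response is a hard-coded string.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; a real provider documents its own URL and parameters.
BASE_URL = "https://api.example.com/v1/business"


def build_query_url(name: str, state: str, api_key: str) -> str:
    """Build a URL API query with URL-encoded parameters."""
    params = {"name": name, "state": state, "api_key": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_response(body: str) -> dict:
    """Parse a JSON response body into a Python dictionary."""
    return json.loads(body)


if __name__ == "__main__":
    url = build_query_url("L. L. Vann Electric", "NC", "DEMO_KEY")
    print(url)
    # Made-up response body standing in for what a provider might return.
    sample = '{"county": "WAKE", "year_started": 1987}'
    print(parse_response(sample)["county"])
```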
22
Property Example
Data and Match Overview
¨ 161,560 records in from client
¨ 9,719 unable to be standardized
¨ 2,052 not in coverage
¨ 94,181 address match (62.62%)
¨ 5,074 mailing address match (3.37%)
23
Workers Compensation Example
Data and Match Overview
¨ 50,691 policy matches (31.37%)
¨ 41,014 with incumbent carrier of record
¨ 38,513 with effective date
¨ High effective date correlation to Hanover effective date
¨ 6,982 picked up by RM n-gram match algorithm*
¨ 43,000 matches available via API
Correlation to other line X-Date
*For Discussion: RM File, Optimizer Mismatch…
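For readers unfamiliar with n-gram matching, here is a generic sketch: split each name into overlapping character trigrams and compare the two sets with Jaccard similarity. The RM algorithm itself is not described in these slides, so this is only an assumed textbook variant, and the function names are hypothetical.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of overlapping character n-grams in normalized text."""
    normalized = " ".join(text.lower().split())
    if not normalized:
        return set()
    if len(normalized) < n:
        return {normalized}
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two n-gram sets, from 0.0 to 1.0."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    score = ngram_similarity("Wendys Old Fashion Hmbrgrs", "Wendy's Restaurant")
    print(f"{score:.2f}")
```

Because n-grams tolerate small spelling and spacing differences, this kind of score can pick up matches that an exact join on the standardized name would miss.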
24
Auto Example
Data and Match Overview
¨ 31,560 current customers with 1 or more vehicles
¨ 293,100 Total Insurable Vehicles
Auto Outliers (Examples)
¨ Lease Plan USA, Alpharetta, GA (34,218)
¨ New Jersey Transit Corp, Newark, NJ (4,050)
¨ GSP Transportation, Greer, SC (1,429)
¨ Frac Tech Services, Cisco, TX (1,162)
¨ City of Houston, Houston, TX (1,125)
84%
16%
25
Example Record Set

Type: Base Record Information
Field                           Data
Unique ID                       NC428
Description                     L. L. Vann Electric, Inc.
Address                         833 Purser Dr
City                            Raleigh
State                           NC
ZIP                             27603
Telephone                       9197722567
County                          WAKE
Emp Total                       100-499
Year Started                    1987
SIC                             1731 - Electrical Contractor

Type: Work Comp
Field                           Data
Effective Date                  12/31
Effective Month                 12
NAIC Carrier Number             31325
NAIC Carrier Name               ACADIA INSURANCE COMPANY
NAIC Group Number               98
NAIC Group Name                 WR BERKLEY CORPORATION GROUP
WC Class Code                   3179 - Electrician

Type: Property Ownership Information
Field                           Data
Current Owner Name              L L VANN ELECTRIC INC
Street Address                  833 PURSER DR
Mailing City                    RALEIGH
Mailing State                   NC
Mailing Zip                     27603
Total Assessed Value            $715,482
Assessed Improvement Value      $539,064
Assessed Land Value             $176,418
Assessment Year                 2010
Total Market Value              $715,482
Market Value: Improvement       $539,064
Market Value: Land              $176,418
Market Value Year               2010
Original Date of Contract       12.02.1997
Sales Price                     $390,000
Year Built                      1987
Zoning                          SB
Original Lot Size or Area       1.62 AC
Building Area                   4,800
No of Buildings                 2
No of Stories                   1

Type: Commercial Auto Data
Field                           Data
Car Fleet                       2
Truck Fleet                     54
Total Fleet                     56
Lower Middle (Car)              1
Upper Middle (Car)              1
Heavy Duty Station Wagon        2
Window Van (Passenger)          2
Mini Sport Utility              2
Midsize Pickup                  45
Bus                             3
26
Sourcing
Continually gather insight
PROCESS
¨ Identification
¨ Evaluation
¨ Implementation
¨ Management
¨ Post Implementation
IMPORTANT QUESTIONS
¨ Source of data
¨ Supply chain
¨ Update frequency
¨ Distribution timeframe
¨ Test support
¨ Rent vs. Acquire
27
Closing
More analogies