Jim Czuprynski
Oracle ACE Director, ODTUG Database Community Lead + TechCeleration Editor
[email protected]/JimTheWhyGuy
What’s Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB
Future and past TechCasts:
Formerly called the BIWA Summit with the Spatial and Graph Summit
Same great technical content … new name!
www.AnalyticsandDataSummit.org
Submit a topic to share at https://analyticsanddatasummit.org/techcasts/
My Credentials
• 40 years of database-centric IT experience
• Oracle DBA since 2001
• Oracle 9i, 10g, 11g, 12c OCP and ADWC
• Oracle ACE Director since 2014
• ODTUG Database Committee Lead
• Editor of ODTUG TechCeleration
• Oracle-centric blog (Generally, It Depends)
• Regular speaker at Oracle OpenWorld, IOUG COLLABORATE, Hotsos Symposium, and Regional OUGs
➢ E-mail me at [email protected]
➢ Follow me on Twitter (@JimTheWhyGuy)
➢ Connect with me on LinkedIn (Jim Czuprynski)
Our Agenda
• Unlearn What You Have Learned: What Machine Learning (ML) Promises
• So … Just How Much ML Do I Really Need?
• Leveraging Built-In Autonomous DB ML Capabilities in Under 10 Minutes
• See the Difference? Using Visual Representation to Discover Hidden Patterns
• Meet OML’s Big Brother, Oracle Analytics Cloud (OAC)
If The Onion Recognizes a Trend … It’s Probably A Thing. Just Saying.
From The Onion, January 18, 2018
Let My DBAs Go: Freeing Your Team’s Best Data Experts From Tedium
DBAs are often the most knowledgeable resources about your organization’s data. Who better to lead the transition to a data-driven orientation?
Specialization Is Dead. Long Live the Generalist.
Compared to other scientists, Nobel laureates are at least twenty-two times more likely to
partake as an amateur actor, dancer, magician, or other type of performer.
- David J. Epstein. Range (p. 33).
“We now have the [enemy] exactly where we want them. We can now attack in any direction.”
- Brigadier General Anthony C. “Nuts” McAuliffe
Data is the new oil, and its miners are data scientists … but DBAs are uniquely positioned to support them
All images from images.google.com
APEX, AI, and ML: Where Analytic Magic Happens
Application Express (APEX) makes it trivial to import data and business applications directly into Oracle … even if the data resides only in a simple spreadsheet
Oracle’s REST API enables quick development of complex data entry and reporting applications within APEX in a low-code environment
Once relevant data is captured, Oracle’s built-in data mining tools make it simple to build data models, apply well-known algorithms, and obtain predictions for immediate business insights
From Nothing to OML, In Under Ten Minutes
• Creating an OML User
• Organizing Your OML Desktop
• Building Your First Zeppelin Notebook
OML: Where Do You Want To Go Today?
See the full “cheat sheet” here: https://www.oracle.com/a/tech/docs/oml4sql-algorithm-cheat-sheet.pdf
Configuring Your OML Environment
1. Request new ML User creation
2. Specify username, password, and details
3. Your user receives an e-mail confirmation
4. If necessary, specify a new password
Creating Your First Zeppelin Notebook
1. Welcome to the OML desktop …
2. … from where you can create Projects, Workspaces, and Notebooks
3. Create a new Project …
4. … specify your new Project’s details …
5. … and then create your first Notebook
Building an OML Analysis
1. First, let’s grant all necessary privileges to the TPC-E schema …
2. … and then build a query against several tables that capture Trade Details and ancillary information for a limited date range
3. … confirm retrieval execution …
4. … and then view a sample of the data retrieved from TPC-E tables prior to analyses
Leveraging OML Data Visualization
1. Graphing the resulting retrieved data is as simple as picking a chart format (e.g. a simple pie chart)
2. Changing the graph’s format is as simple as dragging and dropping desired attributes into appropriate analytic regions of the interface
3. Want to display results in a totally different graph format? Choose any one …
4. … and even choose different values to graph
Leveraging DBMS_DATA_MINING (1)
1. Access Machine Learning tools …
2. … and choose from a number of available data mining examples and templates
3. Here’s an example of how to implement a Time Series forecast and its corresponding exponential smoothing algorithm …
4. … included within DBMS_DATA_MINING to implement data capture and then application of the algorithm
Leveraging DBMS_DATA_MINING (2)
5. Here, we execute SQL statements necessary to create dependent objects for the Time Series analysis …
6. … using the following parameters specified for the algorithm
7. Now let’s actually execute the Time Series algorithm …
8. … and view the results of the Time Series forecast
Leveraging DBMS_DATA_MINING (3)
9. Almost there! Here are the actual results of the forecast …
10. … and the results of graphing them across time
11. Here’s the tabular detail behind the graph …
12. … and the result of applying key factor identification
13. Here’s the end result, in graphic format
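The walk-through above applies an exponential smoothing algorithm to a time series. As a rough illustration of the underlying idea only, here is a minimal sketch of simple (single) exponential smoothing in plain Python. It is not Oracle's implementation: the function names, the `alpha` parameter, and the sample values are assumptions made for this example, and the in-database algorithm is configured through model settings rather than called like this.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Simple exponential smoothing: the first level is the first observation,
    and each later level is a weighted blend of the new observation and the
    previous level. (Illustrative helper, not part of any Oracle API.)
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    level = series[0]
    smoothed = [level]
    for observation in series[1:]:
        # Higher alpha weights recent observations more heavily.
        level = alpha * observation + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


def forecast(series, alpha, steps):
    """Forecast `steps` future points as a flat line at the last level."""
    last_level = exponential_smoothing(series, alpha)[-1]
    return [last_level] * steps


# Hypothetical quarterly values, for illustration only.
history = [10, 20, 30]
levels = exponential_smoothing(history, alpha=0.5)
next_two = forecast(history, alpha=0.5, steps=2)
```

With simple smoothing every future step gets the same forecast value, which is why trend-aware and seasonal variants exist for data that rises, falls, or cycles over time.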
Oh, Just One More Thing …
• Leveraging Oracle Analytics Cloud (OAC) for deeper insights
• Examples: Analyzing Electoral Demographic Data
HOW? Analytical Data Mart: Business Managed Solution Architecture
[Architecture diagram; labels recovered from the slide:]
• Data sources: Personal or External Datasets; Enterprise Applications (ERP, CRM, HCM)
• Any Functional Data: Profitability, Attrition, Sentiment, Click Stream, Procurement, Usage Tracking
• Data Integration: OAC Dataflow, Manual or ETL (Ad hoc, Batch or Scheduled)
• Data Management: Autonomous Data Warehouse (makes data available for analytics)
• Business Analytics: Oracle Analytics Cloud (self-service analytics with ML)
• Audience: Business Leaders, Analysts, Data Scientists, Developers
Use Case: Managing a 2020 US Congressional Campaign
• Identify potential voters, including “flippability” since 2018 election
  • Gerrymandered district stretches several dozen miles across five different counties and 56 municipalities
  • Political landscape has changed dramatically in early months of 2020
• Analyze results of voter outreach
  • Record successful voter commitment to candidate
  • Identify possible new volunteers and contributors
  • Determine which messages resonate (retain!) … as well as which ones fall flat (discard!)
• Figure out how to best deploy multitudes of volunteers
  • Everyone is not suited to canvassing
  • Mundane but crucial tasks cannot be ignored!
• Handling requests for yard signs from motivated voters
  • Driving voter awareness through sign placement at high-traffic intersections
Realization: Perfect for the Cloud!
• Relatively simple data model and small data volume (<500M rows)
• Short development time frames demand quick orchestration for compute and storage
• Complex n-dimensional analytics may be required
  • Past voting record vs. strength of party affiliation vs. impact of current government initiatives vs. …
  • Determine “flippability” of voter before canvassing starts
• Security is a must … so encryption is crucial!
  • Most voter data is publicly available … but contributor and volunteer information as well as canvassing results are proprietary
  • Removes concerns about on-premises development and data storage
  • Eliminates risk of penetration by rival campaign
Preparing Grouping Levels for Voter Demographics
SELECT
     v_county_name
    ,v_situs_city
    ,v_situs_zip5
    ,v_age
    ,v_ethnicity
    ,v_income_level
    ,v_race_designation
    ,v_likely_group
    ,v_likely_party
    ,COUNT(*)
  FROM (
    SELECT
         v_county_name
        ,v_situs_city
        ,v_situs_zip5
        ,v_age
        ,v_race_designation
        -- Add some higher-level groupings for ethnicity …
        ,DECODE(v_race_designation
            ,'White (Low)'      , 'White'
            ,'White (Medium)'   , 'White'
            ,'White (High)'     , 'White'
            ,'Hispanic (Low)'   , 'Hispanic'
            ,'Hispanic (Medium)', 'Hispanic'
            ,'Hispanic (high)'  , 'Hispanic'
            ,'Asian (Low)'      , 'Asian'
            ,'Asian (medium)'   , 'Asian'
            ,'Asian (high)'     , 'Asian'
            ,'Black (Low)'      , 'Black'
            ,'Black (Medium)'   , 'Black'
            ,'Black (High)'     , 'Black'
            ,'Unspecified'
         ) AS v_ethnicity
        -- … income level …
        ,DECODE(v_race_designation
            ,'White (Low)'      , 'Low'
            ,'White (Medium)'   , 'Median'
            ,'White (High)'     , 'High'
            ,'Hispanic (Low)'   , 'Low'
            ,'Hispanic (Medium)', 'Median'
            ,'Hispanic (high)'  , 'High'
            ,'Asian (Low)'      , 'Low'
            ,'Asian (medium)'   , 'Median'
            ,'Asian (high)'     , 'High'
            ,'Black (Low)'      , 'Low'
            ,'Black (Medium)'   , 'Median'
            ,'Black (High)'     , 'High'
            ,'Unspecified'
         ) AS v_income_level
        -- … and political party / orientation
        ,v_likely_party
        ,DECODE(v_likely_party
            ,'LR', 'GOP', 'SR', 'GOP', 'NR', 'GOP'
            ,'LD', 'DEM', 'SD', 'DEM', 'ND', 'DEM'
            ,'I' , 'IND'
            ,'Unspecified'
         ) AS v_likely_group
      FROM vevo.voters
  )
 GROUP BY
     v_county_name, v_situs_city, v_situs_zip5
    ,v_age, v_ethnicity, v_income_level, v_race_designation
    ,v_likely_group, v_likely_party
 ORDER BY
     v_county_name, v_situs_city, v_situs_zip5
    ,v_age, v_ethnicity, v_income_level, v_race_designation
    ,v_likely_group, v_likely_party
;
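The query above derives three higher-level groupings (ethnicity, income level, and party group) from two raw columns and then counts voters per combination. The same lookup-with-default logic can be sketched outside the database; the Python below is only an illustration. The function names and sample rows are hypothetical, and the lookup keys assume consistently capitalized designations such as 'Asian (High)', which may not match how the values are stored in the real table.

```python
from collections import Counter

ETHNICITIES = ("White", "Hispanic", "Asian", "Black")
# Raw income tier -> label used in the query ('Medium' is reported as 'Median').
INCOME_LABELS = {"Low": "Low", "Medium": "Median", "High": "High"}

# Exact-match lookup tables, playing the role of the DECODE argument lists.
ETHNICITY_BY_DESIGNATION = {
    f"{eth} ({tier})": eth for eth in ETHNICITIES for tier in INCOME_LABELS
}
INCOME_BY_DESIGNATION = {
    f"{eth} ({tier})": label
    for eth in ETHNICITIES
    for tier, label in INCOME_LABELS.items()
}
PARTY_GROUP = {
    "LR": "GOP", "SR": "GOP", "NR": "GOP",
    "LD": "DEM", "SD": "DEM", "ND": "DEM",
    "I": "IND",
}


def derive_groupings(race_designation, likely_party):
    """Return (ethnicity, income_level, likely_group) for one voter row.

    Any value without an exact match falls back to 'Unspecified', which is
    how DECODE behaves when a default is supplied.
    """
    return (
        ETHNICITY_BY_DESIGNATION.get(race_designation, "Unspecified"),
        INCOME_BY_DESIGNATION.get(race_designation, "Unspecified"),
        PARTY_GROUP.get(likely_party, "Unspecified"),
    )


def count_by_group(rows):
    """Count voters per derived grouping, similar to GROUP BY with COUNT(*)."""
    return Counter(derive_groupings(race, party) for race, party in rows)


# Hypothetical (v_race_designation, v_likely_party) pairs, not real voter data.
sample_voters = [
    ("White (Low)", "LR"),
    ("White (Low)", "SR"),
    ("Asian (High)", "I"),
    ("Other", "XX"),
]
counts = count_by_group(sample_voters)
```

Because the lookups are exact matches, a value that differs only in letter case falls into 'Unspecified', so it is worth checking the distinct values in the source column before trusting the grouped counts.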
Exploring Data With Oracle Analytics Cloud (1)
1. Choose a data source …
2. … and select which schema to gather data from
3. Select which view to capture data from …
4. … and view the columns available
Exploring Data With Oracle Analytics Cloud (2)
5. Here’s a preview of all selected columns from the chosen view …
6. … and, if desired, a more complete look at all of the selected data from the view
7. Now let’s move on to visualizing the data retrieved. Here are the data elements and their datatypes …
8. … and here are the myriad different presentation types we can select from to visualize the data
Exploring Data With Oracle Analytics Cloud (3)
9. We’ll save the project for now …
10. … and finally we can visualize our data within discrete categorizations
Want To Get Off To A Quick Start? Check Out Our Brief Video Series.
Here’s a short primer on how to get started with Oracle Machine Learning (OML) and Zeppelin Notebooks in a matter of minutes: http://bit.ly/39FBVGX
And don’t miss this brief primer on how to leverage Autonomous DB and Oracle Analytics Cloud (OAC): http://bit.ly/2u9qF6K
Useful Resources and Documentation
• OML Web Page: https://www.oracle.com/database/technologies/datawarehouse-bigdata/machine-learning.html
• OML “Cheat Sheet”: https://www.oracle.com/a/tech/docs/oml4sql-algorithm-cheat-sheet.pdf
• OML Blog Post: https://blogs.oracle.com/datamining/introducing-oracle-machine-learning-sql-notebooks-for-the-oracle-autonomous-data-warehouse-cloud
• Corresponding ODTUG Article Series:
  • Part 1: https://www.odtug.com/p/bl/et/blogid=20&blogaid=940
  • Part 2: https://www.odtug.com/p/bl/et/blogid=20&blogaid=958
  • Part 3: Coming soon!
Other Helpful Links
ORACLE AUTONOMOUS CLOUD: https://cloud.oracle.com/tryit
ORACLE AUTONOMOUS HANDS ON LAB FOR DEVELOPERS: https://go.oracle.com/e/f2?LP=82486
ORACLE MACHINE LEARNING ON OTN: https://www.oracle.com/technetwork/database/options/oml/overview/index.html
OML TUTORIALS, basic getting started: https://docs.oracle.com/en/cloud/paas/autonomous-data-warehouse-cloud/omlug/get-started-oracle-machine-learning.html#GUID-2AEC56A4-E751-48A3-AAA0-0659EDD639BA
ORACLE ANALYTICS CLOUD, examples: https://www.oracle.com/solutions/business-analytics/data-visualization/examples.html