Date post: | 15-Apr-2017 |
Category: | Data & Analytics |
Upload: | jesse-wang |
What are you building?
A Platform or an IDE
Two approaches in big data analytics
Intro: Big Data Analytics is In
The biggest difference between traditional enterprises and internet-age enterprises lies in the data: everything is being converted to data. Competition between companies is really competition over how well they utilize data.
Data as Capital
In an age when everything is being digitized, data is everywhere, and winning in the virtual world translates directly into competitive advantage in the real world
How to properly value and utilize data is the key to capitalizing on big data
Steps to Achievement
Data Collection
• Identify datasets
• Collect data
• Join data
Basic Analytics
• Understand data
• Clean, enrich, transform data
• Simple discovery
Experimentation
• Apply data science
• Learn more about data
• Build and test models
Model Application
• Apply various models
• Get feedback, refine models
How to win at Big Data Analytics
Types of Users
Decision makers
Line-of-business workers
Business analysts
Data scientists / engineers
Domain experts
What needs to be built?
Data -> Information -> Knowledge -> Wisdom
Applications to convert raw data into wisdom
Two approaches
Applications
Analytic Platform
Integrated Development Environment
Do you have a big data analytics platform?
Many companies claim they have one
Do they really have a true platform, or are they building custom applications every time?
What is a Platform
In general, a piece of computer software designed to support applications: it provides fundamental functions, and applications obey its constraints and make use of its facilities
Different abstraction levels: hardware, OS, Runtime, Web, Cloud, Analytics …
A big data analytics platform allows people to build apps out of components that are hosted or provided by the providers with specific protocols linking them together
Features of a Computing Platform
Enables quick development of custom apps by providing prebuilt functionalities (not tools or usability enhancements)
Components can be independently applied and can communicate with each other, often with proprietary semantics and protocols
Usually results in lock-in due to data or protocol specifications; apps are hard to move away
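A minimal sketch of the idea above, assuming a toy platform where every component follows one shared protocol (a function from a list of records to a list of records). The `Platform` class, the component names, and the sample records are hypothetical and serve only to illustrate building an app out of hosted components.

```python
from typing import Callable, Dict, List

# Assumed shared protocol: each component maps a list of records to a list of records.
Record = Dict[str, object]
Component = Callable[[List[Record]], List[Record]]


class Platform:
    """Toy platform that hosts named components and chains them into apps."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        """Host a component under a name so apps can reference it."""
        self._components[name] = component

    def build_app(self, *names: str) -> Component:
        """Compose registered components, in order, into a single callable app."""
        steps = [self._components[n] for n in names]  # KeyError for unknown names

        def app(records: List[Record]) -> List[Record]:
            for step in steps:
                records = step(records)
            return records

        return app


def drop_missing_hours(records: List[Record]) -> List[Record]:
    """Hypothetical cleaning component: keep records that have an 'hours' value."""
    return [r for r in records if r.get("hours") is not None]


def flag_overtime(records: List[Record]) -> List[Record]:
    """Hypothetical enrichment component: add an 'overtime' flag above 40 hours."""
    return [{**r, "overtime": r["hours"] > 40} for r in records]


platform = Platform()
platform.register("clean", drop_missing_hours)
platform.register("overtime", flag_overtime)

app = platform.build_app("clean", "overtime")
result = app([
    {"name": "a", "hours": 45},
    {"name": "b", "hours": None},
    {"name": "c", "hours": 38},
])
```

Because the app only works with components that speak the platform's record protocol, the same sketch also hints at where lock-in comes from.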
Examples of a Platform
Intel, Microsoft Windows (Wintel)
Adobe AIR, Apple iOS, …
Java Platform (J2EE etc.), .Net Framework
Facebook/Twitter …
WordPress
What is an IDE
An Integrated Development Environment (IDE) is a software application that provides comprehensive facilities to developers for software development. An IDE normally contains a code/script editor, build automation, a file/item browser, and a debugger (profiler/monitor).
An IDE normally offers features like GUI, MDI, and RAD, and supports code generation, automation of execution (deployment), and revision control…
Modern features include intelligent code completion, visual browser, workflow manager and other productivity features
Examples of IDE
Microsoft Visual Studio, Delphi
Eclipse, IntelliJ IDEA, PyCharm
Xcode
WebStorm
Cloud9
…
Similarities Between a Platform and IDE
Both are software providing facilities to their users
Both can enable faster application development
Differences between Platform and IDE
Platform
Providing facilities in the form of functional components
Faster development speed via pre-packaged functionalities
Allowing users to build applications only with its functions*, can use multiple IDEs
Resulting in lock-in of applications
IDE
Providing facilities in the form of usability improvements
Faster development speed via streamlined operations
Allowing users to develop only within its environment, can support multiple platforms
Resulting in lock-in of project files
* Most platforms allow calling external components, but these still need to fit within the platform's own constraints
Why Not IDE
An IDE can help one type of user, most likely data scientists or software engineers
These users are usually not the majority of users in the company
The ROI is mainly usability: low
Vs. a platform, which produces applications that can multiply productivity
Why Platform
High reusability: Decreased time and cost to market
Supporting more customers: Higher value for customers
Built-in flexibility: Faster application development time
Component Marketplace (AppStore): Lower support cost, enable third-party contributions
Five-star Data
Accessible
Parse-able (structurizable)
With shared metadata
Identifiable
Connected with relations
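The five criteria above can be treated as a simple checklist. The sketch below is one possible reading, assuming each criterion is an independent yes/no flag and that the star rating is just the number of criteria met; the slides do not define a scoring rule, so `DatasetProfile`, `star_count`, and `missing_criteria` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DatasetProfile:
    """One flag per criterion from the Five-star Data list (assumed independent)."""
    accessible: bool
    parseable: bool
    shared_metadata: bool
    identifiable: bool
    connected: bool


def star_count(profile: DatasetProfile) -> int:
    """Count how many of the five criteria the dataset meets (assumed scoring rule)."""
    return sum(bool(getattr(profile, f.name)) for f in fields(profile))


def missing_criteria(profile: DatasetProfile) -> List[str]:
    """List the criteria not yet met, in the order they are declared."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Made-up example: a dataset that is reachable, parse-able, and identifiable,
# but has no shared metadata and no links to other datasets.
profile = DatasetProfile(
    accessible=True,
    parseable=True,
    shared_metadata=False,
    identifiable=True,
    connected=False,
)
stars = star_count(profile)
gaps = missing_criteria(profile)
```

A list of unmet criteria is often more actionable than the score itself, since it says what to fix next.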
Four Pillars
Knowledge-base: domain expertise, rules, data, metadata, etc.
Semantic data management system: manage all software artifacts including data sources, datasets, projects, users…
Function modules: parsers, algorithms, visualization modules, transformers, models, … to build apps
Infrastructure support: connect to proper infrastructure to run all the things
Custom Application Development Workflow
Example Application: HR Insights
Start with requirements and goals:
Overview of the whole company’s employees’ hours, time spent on which apps/sites, sentiments, average time to respond to emails/requests, and models to predict performance or attrition
The goals contain specific details on standards, conditions, environment, resources, and even methodologies
HR Insights Workflow Step #1
Find or Create Goals
Find similar goals
If none exist, specify details such as working hours, email/request response time, mapping out natural workgroups via communication patterns …
Collect data
Select from list of known Sources and Datasets
Or create new sources or datasets if necessary
HR Insights Workflow Step #2
Perform Ingestion, Pre-processing (Parsing), and Instant Analytics (basic stats and other quick insights)
To help understand the data better for further steps
Perform transformation to get more targeted datasets
SMEs can run a set of existing tools (apps, models, transformations) to get more insights
Including enriching, filtering, and linking to other datasets, then repeating as needed
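Step #2 can be sketched with the Python standard library alone. The example below is a rough illustration, not the platform's actual implementation: the CSV layout, the column names, and the helpers `ingest`, `instant_stats`, and `filter_rows` are all assumptions made up for the HR Insights scenario.

```python
import csv
import io
import statistics

# Hypothetical raw export; column names and values are invented for illustration.
RAW = """employee,hours,response_minutes
alice,42,15
bob,38,
carol,45,30
"""


def to_number(cell):
    """Convert a CSV cell to float; blank cells become None."""
    return float(cell) if cell else None


def ingest(text):
    """Ingestion and parsing: turn CSV text into a list of typed records."""
    return [
        {
            "employee": row["employee"],
            "hours": to_number(row["hours"]),
            "response_minutes": to_number(row["response_minutes"]),
        }
        for row in csv.DictReader(io.StringIO(text))
    ]


def instant_stats(rows, field):
    """Instant analytics: basic statistics for one numeric field, ignoring blanks."""
    values = [r[field] for r in rows if r[field] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


def filter_rows(rows, field, minimum):
    """Transformation: keep rows whose field is present and at least `minimum`."""
    return [r for r in rows if r[field] is not None and r[field] >= minimum]


rows = ingest(RAW)
hours_stats = instant_stats(rows, "hours")
response_stats = instant_stats(rows, "response_minutes")
long_hours = filter_rows(rows, "hours", 40)  # a more targeted dataset
```

Reporting the number of missing values alongside the basic stats is a cheap way to decide which cleaning or enrichment step to run next.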
HR Insights Workflow Step #3
Create Analytic Pipeline Requests to solicit data science experts
Data scientists can start doing experimentation and build models
Models reviewed and published as applications
End users can benefit from new models/capabilities
Thank you