Date post: | 29-Nov-2014 |
Category: | Technology |
Upload: | kognitio |
The Proven Analytical Platform for Big Data
September 2013
Michael Hiskey
Vice President, Marketing & Business Development
Kognitio is an in-memory analytical platform
Built from the ground-up to satisfy large and complex analytics on big data sets
A massively parallel, in-memory analytical engine that interoperates with your existing infrastructure
Kognitio
• Founded in 1987
• Privately held
• Dev Labs in the UK
• Leadership in US
• ~100 employees
Core product:
• MPP in-memory analytical platform
• Built from the ground up to satisfy large and complex analytics on big data sets
Focused on providing the premier high-performance analytical platform to power business insight around the world
[Figure: Price of RAM, log10 scale, 1987 to 2010]
Kognitio clients span the globe
*Some clients are under NDA
Analytical Platform Reference Architecture
• Application & Client Layer: All BI Tools, All OLAP Clients, Excel, Reporting
• Analytical Platform Layer, with optional Near-line Storage (Kognitio Storage)
• Persistence Layer: Hadoop Clusters, Enterprise Data Warehouses, Legacy Systems, Cloud Storage
Analytical Platform: Addressable Segments
Acceleration for Traditional BI
Data Science / Advanced Analytics
SQL on Hadoop... and everything else
• Improve performance of existing BI stack 10‐100x without re‐engineering
• Cost‐saving alternative to expanding large‐scale EDWs
• Enable tighter data security and BI Tool governance
• Plug‐and‐Play with Hadoop
• Analytical “Sandbox” for rapid Big Data projects
• MPP in‐memory code execution of standard languages (R, SAS, Python, Perl) in line with SQL
• Ability to simply embed Big Data analytics into existing BI/Dashboard Tools without disruption
• Ability to rapidly move discovery into production
• Tight Hadoop Integration
• In-memory over disk
• Seamless integration: SQL, ODBC, JDBC, MDX, ODBO, XML/A etc.
• Fast MPP data transfer
• High-throughput, high-concurrency, low-latency interactive analytics
• Core RDBMS architecture simplifies integration and brings ACID, DW qualities
• Data Virtualization - Platform for LDW
• Central shared controlled data models
create view image shopdata as
    select prod, store, cust, cost
    from "transactions"
    where date > date '2012-01-01';

select store,
       product_category,
       sum(cost) total_spend,
       customer_category customer_type,
       count(distinct cust) customers
from shopdata sd,
     product_info p,
     customers c
where sd.prod = p.prod_code
  and c.cust_id = sd.cust
group by store, product_category, customer_type;
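The view-plus-query pattern above can be tried out on a laptop. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, not Kognitio itself: SQLite has no "view image", so an ordinary view is used, the date column is renamed to the hypothetical `txn_date`, and all rows are made-up sample data.

```python
import sqlite3

# Stand-in for the analytical platform: an in-memory SQLite database.
# Table contents below are made-up sample rows, not data from the deck.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table transactions (prod text, store text, cust integer, cost real, txn_date text);
    create table product_info (prod_code text, product_category text);
    create table customers (cust_id integer, customer_category text);

    insert into transactions values
        ('p1', 's1', 1, 10.0, '2012-03-01'),
        ('p1', 's1', 2,  5.0, '2012-04-01'),
        ('p2', 's1', 1,  7.5, '2012-05-01'),
        ('p1', 's2', 1,  3.0, '2011-12-31');
    insert into product_info values ('p1', 'grocery'), ('p2', 'toys');
    insert into customers values (1, 'loyal'), (2, 'new');

    -- Plain view: SQLite has no equivalent of an in-memory "view image".
    create view shopdata as
        select prod, store, cust, cost
        from transactions
        where txn_date > '2012-01-01';
""")


def spend_by_segment(connection):
    """Join the shared view to the lookup tables and aggregate spend per segment."""
    query = """
        select store, product_category, sum(cost) as total_spend,
               customer_category as customer_type,
               count(distinct cust) as customers
        from shopdata sd, product_info p, customers c
        where sd.prod = p.prod_code and c.cust_id = sd.cust
        group by store, product_category, customer_category
        order by store, product_category, customer_category
    """
    return connection.execute(query).fetchall()


rows = spend_by_segment(conn)
for row in rows:
    print(row)
```

The point of the sketch is the division of labour: the view fixes which transactions are in scope once, and every downstream query works against that shared definition.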
Kognitio Hadoop Integration
• More than just a connector – tight integration*
– Hadoop does what it is good at – storing and filtering data
– Kognitio does what it is good at – complex analytics
Hadoop Cluster
Give me prod, store, cust, cost from “hdfs files” where date > 1/1/12
Transaction Data
*Developed in co-operation with Sears (Metascale)
Kognitio Hadoop Connectors
HDFS Connector – fast load of complete files
• Connector defines access to HDFS file system
• External table accesses row-based data in HDFS
• Dynamic access or “pin” data into memory
• HDFS file(s) loaded into memory
• Data filtering relies on data being partitioned into different directories/files within Hadoop
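The last bullet can be illustrated with a small sketch of directory-based filtering. This is an illustration of the general idea only, not the connector's implementation: it uses the local file system in place of HDFS, and the `date=YYYY-MM-DD` directory naming, the `load_after` helper and the sample rows are all assumptions.

```python
import csv
import os
import tempfile
from datetime import date


def partition_date(dirname):
    # Hypothetical layout: one directory per day, named "date=YYYY-MM-DD".
    return date.fromisoformat(dirname.split("=", 1)[1])


def load_after(root, cutoff):
    """Read only files in partitions dated after `cutoff`; skip the rest unopened."""
    rows = []
    for dirname in sorted(os.listdir(root)):
        if partition_date(dirname) <= cutoff:
            continue  # pruned: files in this directory are never opened
        part_dir = os.path.join(root, dirname)
        for filename in sorted(os.listdir(part_dir)):
            with open(os.path.join(part_dir, filename), newline="") as handle:
                rows.extend(csv.reader(handle))
    return rows


# Build a tiny made-up directory tree to exercise the function.
root = tempfile.mkdtemp()
sample = {
    "date=2011-12-31": [["p1", "s2", "1", "3.00"]],
    "date=2012-03-01": [["p1", "s1", "1", "10.00"], ["p2", "s1", "1", "7.50"]],
}
for dirname, records in sample.items():
    os.makedirs(os.path.join(root, dirname))
    with open(os.path.join(root, dirname, "part-0.csv"), "w", newline="") as handle:
        csv.writer(handle).writerows(records)

loaded = load_after(root, date(2012, 1, 1))
print(loaded)
```

Because whole files are loaded, the only filtering available is deciding which directories to read, which is why the partition layout matters for this connector.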
Map Reduce Connector – filter from large files
• Connector uploads Kognitio agent to Hadoop nodes
• Query passes selections and relevant predicates to agent
• Data filtering and projection takes place locally on each Hadoop node
• Data filtered as it is read from file(s)
• Only data of interest is transferred and loaded into memory via parallel load streams
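The flow described in these bullets (ship the selection and predicates to an agent, filter and project locally, return only the data of interest over parallel streams) can be sketched as follows. This is a simulation under stated assumptions, not the Kognitio agent: the nodes are plain in-process dictionaries, threads stand in for parallel load streams, and `agent_scan`, `pushdown_query` and the sample rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up rows held by two simulated Hadoop nodes.
NODE_DATA = {
    "node-a": [
        {"prod": "p1", "store": "s1", "cust": 1, "cost": 10.0, "date": "2012-03-01", "note": "x"},
        {"prod": "p1", "store": "s2", "cust": 1, "cost": 3.0, "date": "2011-12-31", "note": "y"},
    ],
    "node-b": [
        {"prod": "p2", "store": "s1", "cust": 2, "cost": 7.5, "date": "2012-05-01", "note": "z"},
    ],
}


def agent_scan(node, columns, predicate):
    """Runs 'on the node': filter rows as they are read, keep only requested columns."""
    return [
        tuple(row[col] for col in columns)
        for row in NODE_DATA[node]
        if predicate(row)
    ]


def pushdown_query(columns, predicate):
    """Send the projection and predicate to every node; gather only matching data."""
    with ThreadPoolExecutor() as pool:
        streams = pool.map(
            lambda node: agent_scan(node, columns, predicate),
            sorted(NODE_DATA),
        )
        return [row for stream in streams for row in stream]


result = pushdown_query(
    columns=("prod", "store", "cust", "cost"),
    predicate=lambda row: row["date"] > "2012-01-01",
)
print(result)
```

Note that the unwanted `note` column and the 2011 row never leave their node, which is the contrast with the HDFS connector's whole-file loads.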
MPP in-memory code execution
NoSQL external scripting function:
• SQL provides standard data access framework
– Open, adaptable framework; pass data to/from any executable or interpreter
– Fully flexible MPP execution of R, Python, Java, text parsing libraries etc.

create interpreter perlinterp
    command '/usr/bin/perl' sends 'csv' receives 'csv';

select top 1000 words, count(*)
from (external script using environment perlinterp
        receives (txt varchar(32000))
        sends (words varchar(100))
        script S'endofperl(
            while (<>) {
                chomp();
                s/[\,\.\!\_\\]//g;
                foreach $c (split(/ /)) {
                    if ($c =~ /^[a-zA-Z]+$/) { print "$c\n" }
                }
            }
        )endofperl'
      from (select comments from customer_enquiry)) dt
group by 1
order by 2 desc;
Example: This reads long comment text from the customer enquiry table. The in-line Perl converts each comment into an output stream of words (one word per row), and the query then selects the top 1000 words by frequency using standard SQL aggregation.
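Since the deck lists Python among the languages that can run as external scripts, the same word-splitting logic can be sketched in Python. This is a stand-alone illustration that mirrors the Perl above, not a tested Kognitio script: `words_from`, `top_words` and the sample comments are hypothetical, and the final aggregation is done with `collections.Counter` here rather than in SQL.

```python
import re
from collections import Counter

PUNCTUATION = re.compile(r"[,.!_\\]")  # same characters the Perl script strips
WORD = re.compile(r"[a-zA-Z]+")


def words_from(lines):
    """Yield one purely alphabetic word per output row, as the Perl script prints them."""
    for line in lines:
        cleaned = PUNCTUATION.sub("", line.rstrip("\n"))
        for token in cleaned.split(" "):
            if WORD.fullmatch(token):
                yield token


def top_words(lines, limit=1000):
    """Rough equivalent of: select top N words, count(*) ... group by 1 order by 2 desc."""
    return Counter(words_from(lines)).most_common(limit)


# Made-up sample input standing in for the comments column.
comments = ["good service, good price!\n", "price was too high.\n"]
print(top_words(comments, limit=2))
```

In the platform, only the splitting step would live in the script; the counting and ordering stay in SQL so that they run across all nodes.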
Using R code for ad-hoc external script

create script environment rsint command '/usr/bin/Rscript --vanilla --slave';
grant execute on script environment rsint to power-user;

select *
from (external script using environment rsint
        receives ( PRICE SMALLINT )
        sends ( PRICE INTEGER )
        script S'endofr(
            options(error = expression(q("no")))
            mydata <- read.csv(file=file("stdin"), header=FALSE)
            sink(, type="message")
            mydata$V1 <- mydata$V1 - 100
            write.table(mydata, row.names = FALSE, col.names = FALSE, sep = "," )
        )endofr'
      from (select price from ITEM_SALE)) dt;
MPP Execution of R
• Rows are read into data frame mydata
• Data frame vectors (columns) automatically named V1, V2 etc.
• Run math formula – in this case, simply subtract 100
• Data frame rows returned to Kognitio
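The read, transform, write-back cycle in these bullets is independent of R. As a rough sketch of the same contract in Python (CSV rows in, CSV rows out), assuming a single integer column as in the example above; `subtract_from_first_column` and the sample values are hypothetical.

```python
import csv
import io
import sys


def subtract_from_first_column(source, sink, amount=100):
    """Read CSV rows, subtract `amount` from column 1, write the rows back as CSV."""
    writer = csv.writer(sink, lineterminator="\n")
    for row in csv.reader(source):
        if not row:
            continue  # ignore blank lines
        row[0] = int(row[0]) - amount
        writer.writerow(row)


if __name__ == "__main__":
    # In an external-script setting the rows would arrive on stdin and the
    # results would go to stdout; a small in-memory sample is used here.
    demo = io.StringIO("250\n100\n75\n")
    subtract_from_first_column(demo, sys.stdout)
```

Taking the input and output streams as parameters keeps the transformation testable away from the database, while the script entry point only wires it to the standard streams.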
Kognitio Cloud
• Could be referred to as an “exclusive” hybrid cloud offering
• Heritage from “DaaS” managed services

PRIVATE CLOUD
Kognitio ‘hosted appliance’ (Kognitio & Partner operated)
• Exclusive – ‘bare metal’
• Monthly pricing
• Min. 1 year term
• Min. 256 GB RAM
• Notice required
• Multi-node
• Optimum configuration
• Limited customisation

PUBLIC CLOUD
AWS
• On-demand ‘hosted appliance’
• Multi-node
• Limited customisation
Marketplace
• On-demand ‘hosted server’
• Single node
• Not customisable
• Anonymous
• Ready-to-use in-memory analytical platform leveraging Amazon Web Services (AWS) Elastic Compute Cloud (EC2) infrastructure
• Hourly usage per CPU/server and TB of data (min. 7.5 GB RAM)
• Automatic provisioning - minutes with pre-installed servers
• Elastic scalability (up and down) to meet compute demand
[Diagram: Single node (Console / Services); Scale-out, multi-node (CloudFormation)]
Cloud provides an ideal deployment scenario
Cloud model can provide a way to quickly model, experiment, develop and build
• Deploy to existing reporting tools
• Pass ownership to IT
• Cloud instances can be “temporary”
• Repeatable framework
Business Analyst
Business User
IT Admin
Data Scientist
PRESS HERE
…and cool Big Data stuff happens!
Innovative client solutions
Orbitz leverages Kognitio Cloud to take large volumes of complex data (ingested in real time from web channels, demographic and psychographic data, customer segmentation and modeling scores) and turn it into actionable intelligence, allowing the company to think of new ways of offering the right products and services to its current and prospective client base.
PlaceIQ provides actionable hyper-local Mobile BI location intelligence. They leverage Kognitio to extract intelligence from large amounts of place, social and mobile location-based data to create hyper-local, targetable audience profiles, giving advertisers the power to connect with consumers at the right place, at the right time, with the right message.
Public Cloud
Private Cloud
Public Cloud
Software
Appliance
TiVo Research & Analytics uses 40 TB of RAM to perform complex media analytics, cross-correlating data from over 22 sources with set-top box data to allow advertisers, networks and agencies to analyze the ROI of creative campaigns while they are still in flight, and enabling self-service reporting for business users.
The VivaKi Nerve Center provides social media and other analytics for campaign monitoring and near real‐time advertising effectiveness. This enables agencies in the Publicis Global Network to provide deep‐dive analytics into TBs of data in seconds
AIMIA provides self-service customer loyalty analysis on over 24 billion transactions, with full volumes of POS data held live in memory. Retailers, consumer packaged goods companies and other service providers give merchandise managers “train-of-thought” analysis to better target customers.
Context for media analytics:
• In-memory analytical database for Big Data
• Correlate everything to everything
• MPP + Linear Scalability
• Predictable and ultra‐fast performance
• > 22 data sources
• Commodity servers/equipment
• Market‐available IT skills
• No solution re‐engineering
Solution Benefits
– Reports allow advertisers, networks and agencies to analyze the relative strengths and weaknesses of different creative executions, and how such variables as program environment, time slots, and pod position impact their ROI
– Enables self-service reporting for business users

Mars, Inc.: “By using TRA to improve media plans, creative and flighting, Mars has achieved a portfolio increase in ROI versus a year ago of 25% in one category and 35% in a second category.”

Challenges
– Expanding volumes of data
– Few opportunities for summarization (demographics, purchaser targets, etc.)
– Data too large/complex for traditional database systems
– Need for simple administration
Analytics on tens of billions of events in tens of seconds with NO DBA
Case Study: AIMIA
In-memory analytics enable market basket analysis with blazing speed

Background
Loyalty marketing company that provides marketing and consulting services to retailers, service providers, and consumer packaged goods companies. Their Self-Service application offers “train-of-thought” analysis with near real-time data processing, enabling clients to better target customers.

Challenge
• Offer a near-time analytical environment where all EPOS transactions, not just sampled data, could be analyzed (to improve statistical confidence)
• Enable analysts to write a query and have the DB execute it (no involvement from IT/DBAs)

Solution
AIMIA lands a Kognitio Analytical Appliance, which they re-sell to each of their end-user clients, with years of full-volume EPOS transactions plus customer and product data (over 24 billion transactions currently). All transactions are held in memory for complex basket analysis-type queries.

Results
The best-tuned Oracle RAC query ran in 25 minutes; the same query on Kognitio ran in 3 minutes. That was in the initial implementation, circa 2007. Today, an average bundle of 12-18 queries runs in 90 seconds.
Gartner: Kognitio is “visionary”
Strengths - Commentary
• Consistent leadership with innovative pricing models
• Pioneered data warehouse SaaS
• Kognitio Cloud "on demand" cloud offering key for growing clients
• Unique ability to switch between Cloud and Platform
• Meets Gartner Logical Data Warehouse concept
• Innovative Hadoop integration
• Great performance
• Consistently satisfied clients with its great performance
• Makes it easier to use and run ad hoc queries
• Recognized the shift from traditional warehousing
• New features have extended capabilities to manage external processes and data
What others say about Kognitio…
connect
www.kognitio.com
twitter.com/kognitio
linkedin.com/companies/kognitio
tinyurl.com/kognitio
youtube.com/kognitio
NA: +1 855 KOGNITIO
EMEA: +44 1344 300 770
The Kognitio Analytical Platform
• Why an “analytical platform”?
– In the burgeoning “big data” ecosystem, the volume, velocity and variety of data require a new approach
  • Disaggregation of persistent data storage and analytics
  • Variety of BI Tools (MicroStrategy, Tableau, MS Excel, etc.)
  • Introduce a new tier to accelerate, govern and increase flexibility
– Complement to Hadoop, EDWs, etc.
  • MPP in-memory structure enables fast ad-hoc reporting
  • Standard SQL, MDX, etc. to make Hadoop easy, consumable
  • Tight integration enables an “information anywhere” approach