Transcript
Page 1: Building a data driven search application with LucidWorks SiLK

Confidential and Proprietary © Copyright 2013

Building a Data-Driven Log Application with SiLK

April 21, 2014

Search | Discover | Analyze

Page 2: Building a data driven search application with LucidWorks SiLK

Agenda

• Introduction to LucidWorks
• The Continuum of Search
• LucidWorks SiLK
  – Enabling Big Data Search
  – 360-degree view of customers and systems
  – Breakthrough ROI
• Solution Components
• Demonstration
• Summary and Q&A

Page 3: Building a data driven search application with LucidWorks SiLK

Speakers

• Chief Product Officer at LucidWorks
• 15 years of product, marketing and BD experience
• Prior to LucidWorks, 8 years at Splunk (employee ~9)
• Proud Search Snob

• Leads LucidWorks’ newly created Solutions team
• 16-year track record of data-driven solutions
  – Customer analytics/nano-targeting
  – Improving product development operations
  – Video processing and transmission
• Establishing search as the paradigm for solving the "last mile problem" of big data

Page 4: Building a data driven search application with LucidWorks SiLK

Who is LucidWorks

Commercial entity behind Lucene/Solr, the industry-leading open source search engine:

• 300+ enterprise customers

• Consulting, training, SLAs and “Pro-Active Support” for open source

LucidWorks platform provides advanced search capabilities directly on Solr:

Connectors, Entity Extraction, Security, pipelines, rules and more…

Solutions (e.g. SiLK & LucidWorks App for Splunk) to help streamline use case adoption.

Page 5: Building a data driven search application with LucidWorks SiLK

Continuum of Search

‘Enterprise Search’
• Application Innovation: Intranet Search, Knowledge Base
• Index Characteristics: Gigabyte scale, Single instance, Full-text

‘Intelligent Search’
• Application Innovation: E-Discovery, E-Commerce
• Index Characteristics: Terabyte scale, Cluster-ready, Structured/Unstructured Data, Near real-time

‘Big Data Search’
• Application Innovation: Search on Hadoop, Log Analysis, Fraud Detection
• Index Characteristics: Unlimited scale, Cloud-ready, Handles any data type, Real-time, NoSQL Alternative

Page 6: Building a data driven search application with LucidWorks SiLK

Solr is the Choice

Creates the data access layer leveraged by best-in-class data-driven applications:

Solr is the choice of those building data-driven applications at massive scale

Page 7: Building a data driven search application with LucidWorks SiLK

Big Data Search

A Big Data Search index
• Unlimited Scale
• Cloud-ready
• Handles any data type
• Real-time
• NoSQL Alternative

Creates the data access layer
• Ad-Hoc Discovery
• Personalization
• Context

that developers & users demand in their Big Data applications

LucidWorks is the partner of choice of the leading Big Data vendors to deliver next-generation search

Page 8: Building a data driven search application with LucidWorks SiLK

Big Data Ecosystem WITHOUT LucidWorks Search

Input Data Stream

Traditional RDBMS/EDW, Doc Stores

Platform for Data Storage and Machine Learning

HDFS; NoSQL; Hadoop

Real-time Processing

Difficult Getting Value from Data

1. Opaque
2. Narrow views into data
3. Out-of-date
4. Not Actionable
5. Accessible mostly to expert users
6. Expensive, ineffective translation to broader set of users

• Data Scientist
• BI Analyst
• IT
• Product Mgrs
• Business Users
• Rest of Org

Page 9: Building a data driven search application with LucidWorks SiLK

Solving the Last Mile Problem of Big Data

Input Data Stream

Traditional RDBMS/EDW, Doc Stores

HDFS; NoSQL; Hadoop

Real-time Processing

Lucene/Solr

Directly Access Data and Insights to Drive Actions:

• Predictive
• Relevant
• Actionable
• Timely

Breakthrough ROI

Page 10: Building a data driven search application with LucidWorks SiLK

Solution Components

• Data Sources: Data Warehouse, Clickstream, Networking, Servers
• Connectors: JDBC Connector, Web/File System Crawl, Hadoop Connectors
• Gateway

Page 11: Building a data driven search application with LucidWorks SiLK

Events from App/Server/Web Logs, etc.

• Application Logs
  – 2013-12-18 01:37:20,637 INFO core.SolrCore - [collection1] webapp= path=/browse params={fl=lucid_facet&facet.query={!tag%3Done_day}dateCreated:[NOW-1DAY/DAY+TO+NOW/DAY]&facet.query={!tag%3Done_year}dateCreated:[NOW-365DAYS/DAY+TO+NOW/DAY]&start=260&q=faceting&f.project.facet.limit=20&role=DEFAULT&req_type=main&hl.simple.post=</span>&facet.field={!ex%3Dsource}source&facet.field={!ex%3Dsource}list_type&facet.field={!ex%3Dsource}issue_status&facet.field={!ex%3Dsource}lucid_facet&facet.field={!ex%3Dproject}project&facet.field={!ex%3Dauthor_display}author_display} hits=6761 status=0 QTime=14

• Firewall Logs
  – Apr 07 2014 10:14:56 eventid='1278457197410173971' severity=severe category="Penetrate/ArpPoisoning" hostId=r signature=3201-2 description="Unix Password File Access Attempt" attacker=110.236.0.15 target=27.96.128.0 target=141.146.8.66 gc_score="-5" gc_riskdelta="3" gc_riskrating="false" gc_deny_packet="true" gc_deny_attacker="false"

• Web Logs
  – 50.17.233.225 - - [09/Mar/2014:06:26:50 -0700] "GET / HTTP/1.1" 200 24442 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"

• Syslogs
  – Apr 17 07:00:42 Lucids-MacBook-Pro-25.local Microsoft Outlook[2461]: CGSCopyDisplayUUID: Invalid display 0x18d88a81

• Other—Database Logs, Click Data, Conversions, Social Media (Tweets…), Financial Data, Product Catalogs, Knowledge Base, etc.

• Volume, Variety and Velocity
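To make the application log sample above concrete, here is a minimal Python sketch of turning one Solr request log line into a structured event. The regular expression, the function name parse_solr_request and the output field names are assumptions made for illustration; they cover only the fields visible in the sample (with the params block shortened) and are not the grok pattern used in the demo.

```python
import re
from urllib.parse import parse_qs

# Sample shortened from the slide: the params block is abbreviated here.
SAMPLE = ("2013-12-18 01:37:20,637 INFO core.SolrCore - [collection1] "
          "webapp= path=/browse params={q=faceting&start=260} "
          "hits=6761 status=0 QTime=14")

# Hypothetical pattern: matches only the layout shown in the sample line.
SOLR_REQUEST = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>\w+)\s+(?P<logger>\S+) - \[(?P<collection>[^\]]+)\] "
    r"webapp=(?P<webapp>\S*) path=(?P<path>\S+) "
    r"params=\{(?P<params>.*)\} "
    r"hits=(?P<hits>\d+) status=(?P<status>\d+) QTime=(?P<qtime>\d+)$"
)

def parse_solr_request(line):
    """Return a dict of fields for a Solr request log line, or None if it does not match."""
    match = SOLR_REQUEST.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Numeric fields become integers so they can be aggregated later.
    for key in ("hits", "status", "qtime"):
        event[key] = int(event[key])
    # Pull the user's query text out of the request parameters.
    event["query"] = parse_qs(event["params"]).get("q", [""])[0]
    return event

event = parse_solr_request(SAMPLE)
print(event["query"], event["hits"], event["qtime"])
```

Lines in other formats (firewall, web, syslog) return None here; each would need its own pattern.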

Page 12: Building a data driven search application with LucidWorks SiLK

Application Development Process

• Understand your Users
• Know your Data
• Prepare and Ingest Data into Solr
• Build Visualizations
• Iterate

Page 13: Building a data driven search application with LucidWorks SiLK

Search Analytics—Understand your Users

• Who will use this application?
  – Business User (eCommerce or KM), IT and Search Administrators

• What are they interested in?
  – What are people searching for?
  – Which queries are returning zero hits?
  – Which searches are providing slow response times?
  – What is my memory & CPU usage, JVM metrics, etc.?
  – Is there a trend in my slow searches?
  – Is the cache warm-up time very large?

• The first three are of interest to the Business User; Search Admins/Developers are interested in all six.
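The first three questions can be answered from request log events alone. The following Python sketch assumes each event has already been parsed into a record with query, hits and qtime fields; those field names, the sample records and the 1000 ms threshold for "slow" are all hypothetical choices for illustration.

```python
from collections import Counter

def summarize(events, slow_ms=1000):
    """Answer the three business-user questions from parsed request events."""
    counts = Counter(e["query"] for e in events)
    return {
        # What are people searching for?
        "top_queries": counts.most_common(3),
        # Which queries are returning zero hits?
        "zero_hit_queries": sorted({e["query"] for e in events if e["hits"] == 0}),
        # Which searches are providing slow response times?
        "slow_queries": sorted({e["query"] for e in events if e["qtime"] >= slow_ms}),
    }

# Made-up events, for illustration only.
EVENTS = [
    {"query": "faceting", "hits": 6761, "qtime": 14},
    {"query": "faceting", "hits": 6761, "qtime": 9},
    {"query": "solr clowd", "hits": 0, "qtime": 3},
    {"query": "hadoop connector", "hits": 120, "qtime": 2500},
]

report = summarize(EVENTS)
print(report)
```

In a Solr-backed dashboard the same answers would typically come from facet and range queries over the indexed events rather than from in-memory aggregation.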

Page 14: Building a data driven search application with LucidWorks SiLK

Search Analytics–Know your Data

• Where is the data available?
  – Core Logs
  – Core Request Logs
  – Connector Logs
  – MBeans API
  – Log4j

• Data Connectors
  – LogStash (for this example)
  – Hadoop Job Jar

Page 15: Building a data driven search application with LucidWorks SiLK

Centralized Logging Infrastructures

• Can be built using a combination of LogStash, Apache Flume, Lumberjack, Rabbit MQ, Apache Kafka, etc.

• Today’s example uses LogStash—extensive documentation at http://logstash.net/docs/1.4.0

Pipeline: Shipper(s) → Broker → Indexer
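The shipper, broker and indexer roles can be sketched in a few lines of Python. This is a toy in-process model, not how LogStash is implemented: a queue.Queue stands in for a real broker such as RabbitMQ or Kafka, a plain list stands in for the search index, and the function names and host names are hypothetical.

```python
import queue

def ship(lines, source, broker):
    """Shipper: tag each raw log line with its source host and hand it to the broker."""
    for line in lines:
        broker.put({"source": source, "message": line.rstrip("\n")})

def index(broker, store):
    """Indexer: drain whatever is currently buffered in the broker into the store."""
    while True:
        try:
            event = broker.get_nowait()
        except queue.Empty:
            break
        store.append(event)

broker = queue.Queue()   # stand-in for the broker tier
indexed = []             # stand-in for the search index

# Two shippers on different (made-up) hosts feed the same broker.
ship(["GET / HTTP/1.1 200\n", "GET /silk HTTP/1.1 200\n"], "web-01", broker)
ship(["INFO core.SolrCore - hits=6761 QTime=14\n"], "app-01", broker)

index(broker, indexed)
print(len(indexed), indexed[0]["source"])
```

The broker decouples the two ends: shippers stay lightweight on the machines producing logs, while parsing and indexing can be scaled separately.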

Page 16: Building a data driven search application with LucidWorks SiLK

Solr/Solr Cloud

Search Analytics—Data Ingestion & Visualization

• Search Logs
• LogStash
• Hadoop Connector (GrokIngestMapper)
• Solr Output Writer for LogStash (HTTP)
• Gateway (Reverse Proxy)
• Visualization: Configurable Dashboards
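Writing to Solr over HTTP comes down to posting documents to a collection's update handler. The Python sketch below builds such a request with the standard library; it is not the LogStash output writer itself. The host, port, collection name and the field names (including the _i dynamic-field suffix) are assumptions, and the request is only constructed, not sent, so the example runs without a Solr server.

```python
import json
import urllib.request

def build_update_request(events, solr_url="http://localhost:8983/solr",
                         collection="collection1"):
    """Build a POST to Solr's JSON update handler for a list of documents."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url=f"{solr_url}/{collection}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical document: field names assume a schema with *_i and *_s dynamic fields.
request = build_update_request(
    [{"id": "evt-1", "query_s": "faceting", "hits_i": 6761, "qtime_i": 14}]
)
print(request.get_method(), request.full_url)

# To actually index, pass the request to urllib.request.urlopen(request)
# against a running Solr instance.
```

Committing on every request keeps the sketch simple; a real ingestion path would usually batch documents and rely on periodic or automatic commits instead.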

Page 17: Building a data driven search application with LucidWorks SiLK

DEMO

Search | Discover | Analyze

Page 18: Building a data driven search application with LucidWorks SiLK

Q & A

• Contacts
  – Will Hayes, Chief Product Officer, [email protected], twitter: @iamwillhayes
  – Ravi Krishnamurthy, Director of Solutions, [email protected]

• Links
  – http://www.lucidworks.com/silk

