Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Post on 05-Dec-2014


description

An introduction to Apache Solr, what it is and why we use it with TYPO3. Covers Solr, the old Indexed Search, and the new Solr extension.

transcript

Apache Solr for TYPO3

Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2

Saturday, 22 May 2010

mail: ingo@typo3.org

twitter: @ingorenner

Current Status

• First Prototype: Summer 2008
• Development Kickoff: February 2009
• Public Release v1.0: T3CON09
• v1.1 soon
• v2.0 later this year

Development Model

• Initial development by dkd
• Development Partnerships
• Early Access, Trunk Access
• Setup Support
• Development Support
• Development Priorities

Development Partners

d.k.d Internet Service GmbH
e-netconsulting KG
Cross Content Media
Marketing Factory Consulting GmbH
University of Hohenheim
Andreae-Noris Zahn AG
Deutsche Lufthansa AG
Eichborn AG
SEB Assetmanagement AG
MÜPRO GmbH
AOE media GmbH
Netcreators BV
marit AG
internezzo AG
Eventex

Indexed Search

• Indexing through Frontend / Crawler
• Respects access rights
• Respects languages
• Index in Database
• Totally OK for smaller websites

Slooooooooooooowww

Apache Solr

So what is Apache Solr?

• Enterprise Search Server
• Based on Lucene Index
• Apache Software Foundation Project
• Many powerful features
• Used by CNet, Netflix, ilocal.nl, Zappos.com

Solr Concepts

• Index = Collection of Documents
• Document = Data stored in Fields
• Field Type defines processing through Analyzers, Tokenizers, Filters
• Dynamic Fields
• Copy Fields

Flexibility
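
To make the concepts on this slide concrete, here is a minimal Python sketch (not Solr code) of how an analysis chain, a dynamic field pattern, and a copy field fit together. Every name in it (`analyze`, `STOPWORDS`, the `*_s` pattern, the catch-all field `text`) is an illustrative assumption; in Solr this behaviour is declared in the schema configuration rather than programmed.

```python
import fnmatch
import re

# Hypothetical stopword list; Solr reads its own from a configuration file.
STOPWORDS = {"the", "a", "for", "and"}

def tokenize(text):
    """Tokenizer: split raw text into word tokens."""
    return re.findall(r"\w+", text)

def lowercase_filter(tokens):
    """Filter: normalize case so 'Solr' and 'solr' match."""
    return [t.lower() for t in tokens]

def stopword_filter(tokens):
    """Filter: drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOPWORDS]

def analyze(text):
    """Analyzer: a tokenizer followed by a chain of filters."""
    return stopword_filter(lowercase_filter(tokenize(text)))

# Field name pattern -> processing. "*_s" plays the role of a dynamic field:
# any field ending in _s is accepted without being declared individually.
FIELD_TYPES = {
    "title": analyze,
    "content": analyze,
    "*_s": lambda value: [value],  # kept as one unanalyzed token
}

# Copy fields: the source field's tokens are also indexed into a catch-all field.
COPY_FIELDS = {"title": "text", "content": "text"}

def process_document(doc):
    """Return the indexed tokens per field for one document (a dict)."""
    indexed = {}
    for name, value in doc.items():
        for pattern, processor in FIELD_TYPES.items():
            if fnmatch.fnmatchcase(name, pattern):
                tokens = processor(value)
                indexed.setdefault(name, []).extend(tokens)
                if name in COPY_FIELDS:
                    indexed.setdefault(COPY_FIELDS[name], []).extend(tokens)
                break
    return indexed

doc = {"title": "Apache Solr for TYPO3", "author_s": "Ingo Renner"}
result = process_document(doc)
print(result)
```

The point of the sketch is the division of labour: the field type decides how a value is broken into searchable tokens, while dynamic and copy fields decide which fields exist and where their content ends up.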

Why Apache Solr?

• Speed: Many times faster than Indexed Search (IS)
• Better search results
• Faceted search
• Spellchecker: Did you mean ... ?
• Similarity search: More like this ...
• Editorial Content / paid search results
• Synonyms, Stopwords, Protected Words
• Boosting of specific index fields
• Replication, distributed search

Speed & Power
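
Several of these features are switched on per request through query parameters. The following Python sketch only builds such a request URL and does not send it. The host, port, and the field names `type`, `author`, `title`, and `content` are assumptions for illustration. The parameter names (`facet`, `facet.field`, `spellcheck`, `defType`, `qf`) are standard Solr request parameters, but the spellchecker only responds if the corresponding component is configured on the server.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; adjust to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_search_url(keywords, facet_fields=(), boosts=None, spellcheck=False):
    """Build a Solr select URL with optional faceting, boosting, spellcheck."""
    params = [("q", keywords), ("wt", "json")]
    if boosts:
        # DisMax query parser: "qf" lists the searched fields with ^boost factors.
        params.append(("defType", "dismax"))
        qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
        params.append(("qf", qf))
    if facet_fields:
        # Faceting: ask Solr to count matching documents per field value.
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    if spellcheck:
        # "Did you mean ...?" suggestions for misspelled keywords.
        params.append(("spellcheck", "true"))
    return SOLR_SELECT_URL + "?" + urlencode(params)

url = build_search_url(
    "typo3 serch",
    facet_fields=["type", "author"],
    boosts={"title": 5, "content": 1},
    spellcheck=True,
)
print(url)
```

Here a match in the hypothetical `title` field counts five times as much as one in `content`, which is what "boosting of specific index fields" amounts to in practice.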

How it works

• REST-like interface
• Indexing of XML Documents through HTTP POST
• Querying through HTTP GET
• Results as XML, JSON, PHP

Easy API
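
As a rough illustration of this interface, the Python sketch below builds the XML message for indexing one document and the URL for querying it, without contacting a server. The base URL, the document id `page-42`, and the field names are assumptions. The `<add><doc><field>` message format, the `/update` and `/select` paths, and the `wt` parameter follow Solr's standard HTTP API.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOLR_URL = "http://localhost:8983/solr"  # assumed local instance

def build_add_xml(doc):
    """Serialize a dict as a Solr <add><doc>...</doc></add> update message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")

def build_update_request(doc):
    """Indexing: an HTTP POST carrying the XML document."""
    return urllib.request.Request(
        SOLR_URL + "/update",
        data=build_add_xml(doc).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

def build_query_url(keywords, response_format="json"):
    """Querying: an HTTP GET; "wt" selects the response format (xml, json, php)."""
    return SOLR_URL + "/select?" + urlencode({"q": keywords, "wt": response_format})

page = {"id": "page-42", "title": "Apache Solr for TYPO3"}
add_xml = build_add_xml(page)
update_request = build_update_request(page)
query_url = build_query_url("solr")

print(add_xml)
print(update_request.get_method(), update_request.full_url)
print(query_url)
```

Against a running server, the request and the URL would be sent with `urllib.request.urlopen(...)`, followed by a POST of `<commit/>` to the same update URL so that the new document becomes searchable.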

Disadvantages

• Needs Java
• We don't want to deal with Java
• Solr shields us from Java once set up

Developers stay with PHP

Advantages

• Multiple times faster than Indexed Search
• No database queries
• Easy Installation / Configuration
• Respects access restrictions
• Respects languages
• Customizability

Fast
Easy to use
Powerful

Inner Workings

• Indexing of XML Documents
• Inverted ("reversed") index
• Access through GET and POST (REST-like)
• Results as XML, JSON, PHP
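
The "reversed" index mentioned on this slide is usually called an inverted index: instead of storing which words each document contains, it stores which documents each word occurs in, so a lookup does not have to scan every document. The toy Python version below shows only that idea; Lucene's real index structures are far more elaborate, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

documents = {
    1: "Apache Solr for TYPO3",
    2: "Indexed Search for TYPO3",
    3: "Apache Lucene index",
}
index = build_inverted_index(documents)

print(sorted(search(index, "typo3")))
print(sorted(search(index, "apache typo3")))
```

A query becomes a handful of dictionary lookups and set intersections, which is the basic reason this kind of index answers searches much faster than scanning rows in a database table.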

Inner Workings

[Diagram: the Solr Index contains Documents; each Document consists of Fields]

Inner Workings

[Architecture diagram; components shown:]
• HTTP Request Servlet, Update Servlet, Admin Interface
• Standard Request Handler, DisMaxRequestHandler, Custom Request Handler
• XML Response Writer, XML Update Interface
• Solr Core: Config, Schema, Analysis, Caching, Concurrency, Update Handler
• Replication
• Lucene

Apache Solr for TYPO3

+

EXT:solr


Features!

• FE Indexing
• Search
• Search Box
• More Like This
• Boosting
• Common Searches
• Faceted Search
• Hierarchical Facets
• Simple Form
• Last Searches
• Sorting
• Spellchecker / Did you mean ...
• Auto Suggest
• Index Queue
• Hooks, Interfaces
• Template Engine
• View Helper
• TYPO3 4.2
• TYPO3 4.3
• Scheduler
• Reports
• Access Rights
• Install Script
• Filter
• Page Browser
• Hit Highlighting
• Multi Language
• Backend Module
• Statistics
• File Indexing
• Backend Search
• Extbase / Fluid
• Score Analyzer
• Logging
• Content Elevation

Features!

[Feature cloud as on the previous slide, with each feature marked as part of release 1.0 or 2.0]

Current Status

• "Acts like Indexed Search"
• Indexing through Frontend / Crawler
• Search
• Search Word Highlighting
• Sorting
• Last and Common Searches

Current Status

• Spellchecker: Did you mean ... ?
• Similarity Search: More like this ...
• Faceted Search, Hierarchical Facets
• Suggest / Autocompletion
• Index Queue
• File Indexing

Outlook

• Backend Module
• Related Searches
• Editorial / Paid Search Results
• Editing of Stopwords, Synonyms
• Statistics
• Transition to Extbase / Fluid

Showcases


Making the sun shine on your search


Requirements, Setup

• Requires any J2EE container: Tomcat, Jetty, Resin, ...
• Run setup scripts provided with EXT:solr
• Copy provided configuration files to Solr
• Install EXT:solr, TypoScript
• config.index_enable = 1

Customization

• Indexing of additional data through hooks, interfaces, TypoScript (TS) configuration
• Individual index schema
• Enable / disable features through TS
• Individual, flexible rendering of results

Thank you for listening.

mail: ingo@typo3.org

twitter: @ingorenner