
B0F2 Text Search and Portal Integration - IBM Research | IBM · Lotus Software –WebSphere Portal...

Date post: 23-Jul-2020

Lotus Software – WebSphere Portal

B0F2 – Text Search and Portal Integration © 2004 IBM Corporation

IBM IT Training Services

IBM WebSphere Portal and Lotus Workplace technical symposium
Session Number: B0F2
Session Title: Text Search and Portal Integration
Speaker's e-mail: [email protected]

Aya Soffer, Manager, Search Technologies Dept.

Agenda

WebSphere Portal Search Engine (PSE)
o Overview and Architecture
o Main functions
o Usage Examples and Planning Guidelines

Common Components: Lotus Workplace Search Demo
Information Resources
Q & A

What is the Portal Search Engine? (PSE)

High level functional overview

Administrator: indexing / collecting content/documents
o HTTP crawler
o Indexer component
o Text analysis functions (taxonomy, categorizer, language tools, summarizer)
o Simple workflow to control what gets indexed and how

End-user: search
o web-style search
o high precision relevance ranking
o browse through the collection

General information

Originally developed by IBM Research in Israel

Proven technology base with emphasis on search quality

Backed by the joint Research and Software group program – Institute for Search and Text Analysis

Full-text search technology

100% pure Java implementation

Suitable for server as well as client environments

Emphasis on highly accurate results - continuously benchmarked and evaluated via official forums such as TREC and INEX

Internal interfaces allow for convenient integration in IBM products and solutions
o Rich set of APIs suitable for simple and complex implementations
o Easy to customize and extend - adapt ranking formulas, extend built-in methods, add new document types

IBM strategic component

Portal Search Engine – where used ....

Portal Search Engine portlet application:
o Administer multiple indexes (collections), where each may include multiple sites
o End-user search portlet for both handling search requests and browsing through the documents in the collection

Integrated with Portal Document Manager (PDM)

Integrated with Lotus Workplace 1.1

Integrated with WebSphere Portal Content Publisher (WPCP)

New key features with WebSphere Portal Version 5

Taxonomies and categorization

A taxonomy is a hierarchical representation of a set of categories
It includes rules per category that are applied to a document through a categorizer
Two types of taxonomies are available:
o A pre-defined taxonomy allowing for simple manipulation (such as renaming categories and defining new categories)
o A rules-based taxonomy which can be built and defined by the user
Categorization: the process of assigning a document to one or more categories

Summarization

The top 3 key sentences are extracted
The first 250 characters of text are used instead for CJK and BiDi languages

Document filters

Supports more than 250 document formats
Technology wrapped into the Document Conversion Services (DCS), which add support for additional document formats
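
The summarization behaviour described on this slide (three key sentences, or the first 250 characters for CJK and BiDi text) can be sketched as follows. This is only an illustration, not the PSE implementation: the slides do not say how key sentences are chosen, so the word-frequency scoring, the script-detection heuristic, and the names `summarize` and `needs_fallback` are all assumptions.

```python
import re
from collections import Counter

MAX_SENTENCES = 3      # "top 3 key sentences" (from the slide)
FALLBACK_CHARS = 250   # "first 250 characters" (from the slide)


def needs_fallback(text: str) -> bool:
    """Assumed heuristic: any Hebrew, Arabic, or CJK character triggers the fallback."""
    pattern = r"[\u0590-\u06FF\u3040-\u30FF\u4E00-\u9FFF\uAC00-\uD7AF]"
    return re.search(pattern, text) is not None


def summarize(text: str) -> str:
    """Return a short summary of the text."""
    if needs_fallback(text):
        return text[:FALLBACK_CHARS]

    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence: str) -> float:
        # Assumed scoring: average document-wide frequency of the sentence's words.
        tokens = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:MAX_SENTENCES])  # keep the original sentence order
    return " ".join(sentences[i] for i in chosen)
```

A real summarizer would use language-aware sentence splitting and a stronger ranking signal; the point here is only the two code paths named on the slide.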

Conceptual overview – Index build process

[Diagram. Labels: Content; Crawler; Filter; Text analysis components (Categorizer, Summarization, Document filters); Approval workflow; Indexer; Collection. Annotations: "Metadata injected into original content"; "Approved set of content (In-basket)"; step markers 1 and 2.]
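
One plausible reading of the index build diagram is a linear pipeline: crawled content passes a site filter, text analysis injects metadata, an approval step decides what is kept, and the indexer adds the approved documents to the collection. The sketch below illustrates that ordering only. The stage order is an assumption inferred from the diagram labels, and every name in it (`Document`, `analyze`, `build_collection`, the toy categorizer and inverted index) is hypothetical rather than part of the Portal Search Engine.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    text: str
    metadata: dict = field(default_factory=dict)


def analyze(doc: Document) -> Document:
    """Text analysis stage: inject metadata (toy categorizer and summarizer)."""
    doc.metadata["category"] = "news" if "news" in doc.text.lower() else "other"
    doc.metadata["summary"] = doc.text[:40]
    return doc


def build_collection(crawled, url_filter, approve):
    """Run crawled documents through filter, analysis, approval, and indexing."""
    collection = {}  # toy inverted index: term -> set of URLs
    for doc in crawled:
        if not url_filter(doc.url):   # site filter
            continue
        doc = analyze(doc)            # metadata injected into the content
        if not approve(doc):          # approval workflow
            continue
        for term in set(doc.text.lower().split()):  # indexer
            collection.setdefault(term, set()).add(doc.url)
    return collection


docs = [
    Document("http://example.com/a", "Competitor news update"),
    Document("http://example.com/private/b", "Internal news memo"),
    Document("http://example.com/c", "Cafeteria menu"),
]
index = build_collection(
    docs,
    url_filter=lambda url: "/private/" not in url,
    approve=lambda d: d.metadata["category"] == "news",
)
```

Here only the first document reaches the collection: the second is rejected by the URL filter and the third by the approval step.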

Creating a new document collection is easy

Select the ‘Manage search collections’ portlet

Create a new collection

Specify a web site to collect information/content from

Click on the ‘Start collecting’ icon/text to initiate the index build process

Processing status and status of the index are shown at the bottom of the portlet, for:
o the selected site
o the selected collection (index)

A look at the Manage Collections portlet

Manage Collections Portlet – Options and Status

Select ‘Portal Settings’ Manage Search Index

End user – Search portlet – detailed view

Administration – advanced options

Portlet for defining a new collection

Administration – advanced options

Portlet for defining a new site

Administration – advanced options

Portlet for defining a schedule for periodic indexing of a site

Administration – advanced options

Portlet for defining filters for sites

Administration – advanced options

Portlet for defining destination categories for the site

Administration – advanced options

‘Browse document’ portlet

Administration – advanced options

‘Search’ portlet

Administration – advanced options

‘Advanced search’ portlet

Usage example

Goal: provide a community of users with information about competitors in the market

How: catalog information such as news articles and related information from external websites

Additional steps to take:

When creating a collection, select “User-defined” from the taxonomy pull-down

From the main administration portlet choose “Category tree” in the Manage Collections frame

Category Tree portlet

• Build the taxonomy tree
• Then go to ‘Manage Rules’ to define rules for each category

What the rule set looks like .....

• A ‘rule’ is essentially a search query one would use to find such specific documents
• You can use ‘+’, ‘-’, ‘"’ and ‘*’ within the rule
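
Since a rule is essentially a search query, evaluating one against a document can be sketched in a few lines. The slides do not define what the operators mean, so this sketch assumes the usual web-search conventions: '+' marks a required term, '-' an excluded term, double quotes an exact phrase, and '*' a wildcard within or at the end of a word. The function names `parse_rule`, `term_matches`, and `rule_matches` are hypothetical and not part of any PSE API.

```python
import re


def parse_rule(rule: str):
    """Split a rule into (required, prohibited, optional) term lists."""
    required, prohibited, optional = [], [], []
    # Each token: optional +/- prefix, then a quoted phrase or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]*)"|(\S+))', rule):
        term = (phrase or word).lower()
        if not term:
            continue
        if sign == "+":
            required.append(term)
        elif sign == "-":
            prohibited.append(term)
        else:
            optional.append(term)
    return required, prohibited, optional


def term_matches(term: str, text: str) -> bool:
    """Match a whole word or phrase; '*' stands for any run of word characters."""
    pattern = r"\b" + re.escape(term).replace(r"\*", r"\w*") + r"\b"
    return re.search(pattern, text) is not None


def rule_matches(rule: str, text: str) -> bool:
    """Return True if the document text satisfies the rule."""
    text = text.lower()
    required, prohibited, optional = parse_rule(rule)
    if any(term_matches(t, text) for t in prohibited):
        return False
    if not all(term_matches(t, text) for t in required):
        return False
    # With no required terms, at least one optional term must match.
    return bool(required) or not optional or any(term_matches(t, text) for t in optional)
```

For example, under these assumed semantics the rule `+"market share" -rumor acqui*` accepts "Market share grew after the acquisition." and rejects any text containing the word "rumor".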

Last step – assign categories to each website

Result: search and browse

Planning numbers, performance

Index size information
o approximately 40% to 60% of the textual content size of indexed documents/pages

Indexing throughput
o crawling/indexing rate between 100 and 200 documents per minute

Search responsiveness
o typically a search result page is completed and ready for transmission in less than 0.5 seconds
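
These planning numbers translate directly into a back-of-the-envelope estimate. The sketch below applies the 40% to 60% index-size ratio and the 100 to 200 documents per minute rate from the slide. The function name `plan`, the example corpus (50,000 documents averaging 20 KB of text), and the 1 MB = 1024 KB conversion are illustrative assumptions, not figures from the presentation.

```python
def plan(num_docs: int, avg_text_kb: float) -> dict:
    """Estimate index size and crawl time as (low, high) ranges."""
    text_mb = num_docs * avg_text_kb / 1024  # total textual content in MB
    return {
        "text_mb": text_mb,
        # Index is roughly 40% to 60% of the textual content size.
        "index_mb": (0.40 * text_mb, 0.60 * text_mb),
        # 200 docs/min is the fast case, 100 docs/min the slow case.
        "crawl_minutes": (num_docs / 200, num_docs / 100),
    }


estimate = plan(num_docs=50_000, avg_text_kb=20)
print(estimate)
```

For this hypothetical corpus the estimate comes to roughly 390 to 586 MB of index and 250 to 500 minutes of crawling and indexing.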

Additional Information and Resources

IBM Resources:

WebSphere Portal - http://www-3.ibm.com/software/genservers/portal/
WebSphere Portal Catalog - http://www-3.ibm.com/software/genservers/portal/portlet/catalog
WebSphere Portal Developer’s Zone - http://www-106.ibm.com/developerworks/websphere/zones/portal/
WebSphere Portal Toolkit - http://www-3.ibm.com/software/info1/websphere/index.jsp?tab=products/portaltoolkit
Documentation - http://www-3.ibm.com/software/genservers/portal/library/
Education - http://www-3.ibm.com/software/genservers/portal/education/
WebSphere Commerce Portal - http://www-3.ibm.com/software/genservers/commerce/portal/
IBM Lotus Workplace - http://www.lotus.com/engine/jumpages.nsf/wdocs/ondemand

