Date post: | 05-Dec-2014 |
Category: | Technology |
Upload: | kroll-ontrack |
Discussion Overview
Case Law and Industry Guidance: The Role of Searching in Ediscovery
Back to the Basics: Keyword Searching Tips
Deep Dive: Advanced Searching Technologies
Judicial Viewpoints on Keyword Searching
Court required parties to “confer on the development of reasonable search terms” instead of compelling production without a list of proposed search terms provided by the requesting party
“Common practice governing the discovery of [ESI] requires the use of search terms . . . If the producing party generates the search terms on its own, the inevitable result will be complaints that the search terms were inadequate” EEOC v. McCormick & Schmick’s Seafood Restaurants, Inc., 2012 WL 380048 (D. Md. Feb. 3, 2012).
Keyword searching plays an important role in winnowing document sets for discovery
Analyzing Search Methods
Objective of search: high recall and precision
» Recall – fraction of all relevant documents in the collection that the search finds
» Precision – fraction of the documents the search identifies that are actually relevant
In this example, fruit is relevant; broccoli is not.
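The two measures can be worked through on a small example. The sketch below is illustrative only: the document names and the `recall_and_precision` helper are hypothetical, and the sets simply mirror the fruit-versus-broccoli idea above.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_hits = retrieved & relevant  # relevant documents the search found
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    precision = len(true_hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical collection: fruit is relevant, broccoli is not.
relevant = {"apple", "banana", "cherry", "grape"}
retrieved = {"apple", "banana", "broccoli"}

recall, precision = recall_and_precision(retrieved, relevant)
print(f"recall={recall:.2f} precision={precision:.2f}")
# recall=0.50 precision=0.67
```

Here the search found two of the four relevant items (recall 0.50), and two of its three hits were relevant (precision 0.67).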
Designing Effective Keyword Searches
1. Understand your search engine
» Learn how each operator works (OR, AND, PROXIMITY, etc.)
» Be aware of operator precedence (Boolean or left-to-right) and use parentheses to clarify
» Work with your ediscovery provider to create an alternative strategy for lengthy searches that may “time out”
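Why operator precedence matters can be shown with one query read two ways. This is a minimal sketch, not any particular search engine's behavior: the three documents and the helper functions are made up, and the query is `fraud OR bribe AND invoice`.

```python
# Hypothetical mini-collection keyed by document ID.
docs = {
    1: "fraud alert from compliance",
    2: "bribe mentioned on invoice",
    3: "invoice for office supplies",
}


def has(term, text):
    """True if the term appears as a whole word in the text."""
    return term in text.lower().split()


def boolean_precedence(text):
    # AND binds tighter than OR: fraud OR (bribe AND invoice)
    return has("fraud", text) or (has("bribe", text) and has("invoice", text))


def left_to_right(text):
    # Operators applied in the order written: (fraud OR bribe) AND invoice
    return (has("fraud", text) or has("bribe", text)) and has("invoice", text)


hits_boolean = sorted(d for d, t in docs.items() if boolean_precedence(t))
hits_ltr = sorted(d for d, t in docs.items() if left_to_right(t))
print(hits_boolean)  # [1, 2]
print(hits_ltr)      # [2]
```

The same query string returns different documents depending on how the engine groups it, which is why explicit parentheses are the safer choice.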
2. Develop a search strategy
» Run broad searches (date-range culling, etc.), then use the results as the scope for sub-level searches
» Save searches and search results for future use and reference
» Find on-point documents and use “similar” documents and concepts to provide additional key terms
» Know your universe (foreign language requires foreign keywords!)
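The first two points, scoping a narrower search to the results of a broad one and saving each result set, can be sketched as follows. The documents, field names, and the `saved_searches` dictionary are all hypothetical; a review platform would provide its own equivalents.

```python
from datetime import date

# Hypothetical documents with a sent date and extracted text.
documents = [
    {"id": 1, "sent": date(2011, 3, 14), "text": "Quarterly invoice attached"},
    {"id": 2, "sent": date(2012, 7, 2), "text": "Invoice dispute with vendor"},
    {"id": 3, "sent": date(2012, 9, 21), "text": "Lunch on Friday?"},
]


def cull_by_date(docs, start, end):
    """Broad first pass: keep documents sent within the date range."""
    return [d for d in docs if start <= d["sent"] <= end]


def keyword_search(docs, term):
    """Sub-level search run only against the documents passed in."""
    return [d for d in docs if term.lower() in d["text"].lower()]


# Save each result set under a name so it can be reused and referenced later.
saved_searches = {}
saved_searches["2012 date range"] = cull_by_date(
    documents, date(2012, 1, 1), date(2012, 12, 31)
)
saved_searches["2012 + invoice"] = keyword_search(
    saved_searches["2012 date range"], "invoice"
)

for name, hits in saved_searches.items():
    print(name, [d["id"] for d in hits])
# 2012 date range [2, 3]
# 2012 + invoice [2]
```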
3. Build smart keyword lists
Use a text editor to reduce errors
» Programs that format text can cause difficulty
» Use a program like Notepad and place each term on a separate line
» Spell check
» Be aware of commonly misspelled keywords or privilege terms
Understand the impact of your key terms
» Be flexible: account for word/phrase permutations – use a “Data Dictionary”
» Over-inclusive? Under-inclusive?
» “Noise words” increase likelihood of false hits
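A plain-text, one-term-per-line list is also easy to check automatically before it is submitted. The sketch below assumes a hypothetical `load_terms` helper and a deliberately tiny noise-word list; the noise words a real search engine ignores depend on that engine.

```python
NOISE_WORDS = {"the", "and", "of", "a", "to", "in"}  # illustrative list only


def load_terms(lines):
    """Clean a one-term-per-line keyword list and flag likely problems."""
    terms, warnings = [], []
    seen = set()
    for raw in lines:
        term = " ".join(raw.split())  # trim and collapse stray whitespace
        # Replace the curly quotes that formatting programs often insert.
        term = term.replace("\u201c", '"').replace("\u201d", '"')
        if not term:
            continue  # skip blank lines
        key = term.lower()
        if key in seen:
            warnings.append(f"duplicate: {term}")
            continue
        seen.add(key)
        if key in NOISE_WORDS:
            warnings.append(f"noise word: {term}")  # kept, but flagged
        terms.append(term)
    return terms, warnings


raw_lines = ["privileged\n", "  attorney  client \n", "\n", "the\n", "Privileged\n"]
terms, warnings = load_terms(raw_lines)
print(terms)     # ['privileged', 'attorney client', 'the']
print(warnings)  # ['noise word: the', 'duplicate: Privileged']
```

Flagged terms are reported rather than silently dropped, so the legal team still decides what stays on the list.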
Advanced Searching Technologies
What are some “new and evolving” search methods?
1. Concept Searching
2. Topic Grouping
3. Language Identification
4. Email Threading
5. Near De-Duplication
6. Sampling
1–6: Will cover in this presentation
**Technology-assisted Review: Will not cover in this presentation – hot, evolving topic!
1. Keyword Searching vs. Concept Searching
Keyword Searching
» Uses search terms to retrieve documents that contain those exact terms
» Standard practice; generally accepted in the courts
Concept Searching
» Allows reviewers to find documents with similar conceptual terms even if they do not contain the exact search terms
» Seldom used for filtering; increasingly used for review
» Emerging as a technology alternative
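The difference between the two approaches can be illustrated with a toy example. This is only a stand-in: the hand-written `CONCEPTS` map below is hypothetical, whereas commercial concept-search tools commonly derive term relationships from the document collection itself.

```python
# Hypothetical, hand-written concept map used purely for illustration.
CONCEPTS = {
    "payment": {"payment", "invoice", "remittance", "wire"},
}

docs = {
    1: "payment was late",
    2: "the wire went out friday",
    3: "see you at lunch",
}


def keyword_search(term):
    """Return documents that contain the exact term."""
    return sorted(d for d, text in docs.items() if term in text.lower().split())


def concept_search(term):
    """Return documents that contain the term or a conceptually related term."""
    related = CONCEPTS.get(term, {term})
    return sorted(
        d for d, text in docs.items() if related & set(text.lower().split())
    )


print(keyword_search("payment"))  # [1]
print(concept_search("payment"))  # [1, 2]
```

Document 2 never uses the word "payment", so only the concept search surfaces it.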
2. Topic Grouping
Documents automatically grouped by theme without human input
Topic grouping will group similar documents and label them for quick identification
Users do not need to “seed” the processing engine by providing keywords
3. Language Identification
This technology can identify all languages in a document as well as the primary language and pass this information along via a metadata field
A legal team needs to know what languages are in a collection, and the volume of foreign language documents
Reports can help determine whether to use machine translations, foreign language reviewers, or a combination
4. Email Threading
Identifies e-mail conversations based on content and groups them for review
Using actual content of the e-mails to identify e-mail threads is the most reliable method, as it will not fail to recognize a thread if the subject line changes or if e-mails are exchanged across different e-mail applications
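Content-based threading can be sketched in a few lines. The approach below is an assumption made for illustration, not a description of any vendor's method: a message joins a thread when its body contains the full text of an earlier message, and subject lines are ignored entirely. The sample e-mails and function names are hypothetical.

```python
import re


def normalize(body):
    """Lowercase, drop leading quote markers ('>') and collapse whitespace."""
    lines = [re.sub(r"^[>\s]+", "", line) for line in body.splitlines()]
    return " ".join(" ".join(lines).lower().split())


def group_threads(emails):
    """Group e-mails whose body contains the full text of an earlier e-mail."""
    norm = {eid: normalize(body) for eid, body in emails.items()}
    thread_of = {}
    # Shortest bodies first, so originals are seen before their replies.
    for eid in sorted(norm, key=lambda e: len(norm[e])):
        parent = next(
            (p for p in thread_of if norm[p] and norm[p] in norm[eid]), None
        )
        thread_of[eid] = thread_of[parent] if parent is not None else eid
    threads = {}
    for eid, root in thread_of.items():
        threads.setdefault(root, []).append(eid)
    return sorted(sorted(members) for members in threads.values())


emails = {
    "a": "Can you send the Q3 contract draft?",
    "b": "Attached.\n> Can you send the Q3 contract draft?",
    "c": "Thanks, got it.\n> Attached.\n> > Can you send the Q3 contract draft?",
    "d": "Lunch on Friday?",
}
threads = group_threads(emails)
print(threads)  # [['a', 'b', 'c'], ['d']]
```

Because grouping relies only on message bodies, a changed subject line or a different e-mail client does not split the thread.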
5. Near De-Duplication
Reviewers can quickly identify and compare documents that are very similar to one another but are not exact duplicates
The technology assesses similarities across the document set, identifying the most representative documents as “the core”
» All related documents are then grouped around the core
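One common way to approximate this grouping is to compare overlapping word sequences ("shingles") and cluster documents whose overlap exceeds a threshold. The sketch below makes several assumptions of its own: two-word shingles, Jaccard similarity, a 0.5 threshold, and a greedy rule in which the first document seen in each group serves as its core. Production tools choose cores and thresholds differently.

```python
def shingles(text, k=2):
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Share of shingles the two sets have in common."""
    return len(a & b) / len(a | b) if a | b else 1.0


def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: the first document seen in a group acts as its core."""
    sigs = {doc_id: shingles(text) for doc_id, text in docs.items()}
    cores = {}  # core document ID -> IDs grouped around it
    for doc_id in docs:
        for core_id in cores:
            if jaccard(sigs[doc_id], sigs[core_id]) >= threshold:
                cores[core_id].append(doc_id)
                break
        else:
            cores[doc_id] = [doc_id]  # no close match: start a new group
    return cores


docs = {
    "d1": "the merger agreement was signed on monday by both parties",
    "d2": "the merger agreement was signed on tuesday by both parties",
    "d3": "please review the attached expense report",
}
groups = group_near_duplicates(docs)
print(groups)  # {'d1': ['d1', 'd2'], 'd3': ['d3']}
```

The two merger sentences differ by one word and land in the same group, while the unrelated document forms its own.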
6. Sampling: Defensibility & Quality Control
Sampling is the practice of reviewing a set percentage of the documents in a data set or a particular folder of data
» Strengthens the defensibility of the process
» Helps validate what you have (and equally important, do not have) in your production set
» May take place iteratively throughout the review process or prior to production
– During ongoing quality control
– At the end to assess completeness of review
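Drawing a sample in a repeatable way supports the defensibility point above, since the same sample can be re-created later. This is a minimal sketch under stated assumptions: simple random sampling, a fixed seed, and a hypothetical `is_flagged` callback standing in for a reviewer's decision on each sampled document.

```python
import random


def draw_sample(doc_ids, percent, seed=0):
    """Return a reproducible simple random sample of about `percent` % of IDs."""
    ids = list(doc_ids)
    size = max(1, round(len(ids) * percent / 100))
    rng = random.Random(seed)  # fixed seed so the sample can be re-created
    return sorted(rng.sample(ids, size))


def observed_rate(sample, is_flagged):
    """Fraction of sampled documents that the reviewer flagged."""
    flagged = sum(1 for doc_id in sample if is_flagged(doc_id))
    return flagged / len(sample)


doc_ids = range(1, 1001)  # hypothetical folder of 1,000 documents
sample = draw_sample(doc_ids, percent=5)
print(len(sample))  # 50

# Stand-in for human review: pretend every tenth document was miscoded.
rate = observed_rate(sample, lambda doc_id: doc_id % 10 == 0)
print(f"observed error rate in sample: {rate:.1%}")
```

The same functions can be run at intervals for ongoing quality control or once at the end to assess completeness.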