
FreeEed - Open Source eDiscovery

Date post: 17-Dec-2014
Upload: markkerzner

FreeEed

Open source eDiscovery with Hadoop


Background of eDiscovery

• Preservation
• Discovery request
• Production


EDRM


What does FreeEed do now?

Processing: 

• Text extraction
• Metadata extraction
• Culling
• Deliver load file
• Deliver native documents


What will FreeEed do soon?

• Review
• Analysis
• Production
• Presentation


What else can FreeEed do?

• Preservation
• Collection


Why can FreeEed do all of that?

• Big Data technologies
   o Storage
   o Processing
• Open source tools
   o Text/metadata extraction
   o OCR


Advantages of open source approach

• Easy reach
• Modern technologies
• Sharing spirit
• Community support
• Integrate or use any way you want


Three ways to run FreeEed

1. Standalone on Linux
2. Private Linux cluster
3. Amazon cloud, controlled from your laptop (Windows, Mac, or Linux) - coming soon


FreeEed Architecture

• Staging (zip files)
• Text/metadata extraction
• Culling
• TIFFing or PDF
• Post-processing


Staging

 

• One zip file per node (computer/server)

• Size controls load balancing
• Big enough to make sense
• Small enough to tolerate failure
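
The sizing trade-off above can be illustrated with a small sketch. It is not FreeEed's staging code: it assumes a simple greedy rule, and the names `StagingPlanner`, `plan`, and `maxBytes` are hypothetical. A batch is closed as soon as the next file would push it past the cap, and each batch would then be written out as one zip file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of FreeEed: groups file sizes into staging
// batches, each of which would become one zip file.
final class StagingPlanner {

    // Greedy grouping: start a new batch when the next file would push the
    // current batch past maxBytes. A file larger than maxBytes ends up in a
    // batch of its own.
    static List<List<Long>> plan(List<Long> fileSizes, long maxBytes) {
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // The sizes and the cap of 100 are arbitrary example values.
        System.out.println(plan(List.of(60L, 30L, 50L, 120L, 10L), 100L));
    }
}
```

A smaller cap gives more batches, so a failed node loses less work; a larger cap gives fewer, bigger batches with less per-file overhead.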


Text and Metadata

• Tika
• Umbrella for extractors
• Hundreds of file formats
• Just one line of code:

        String text = tika.parseToString(inputStream, metadata);


Culling

 

• Selecting only responsive documents
• Lucene - open source search
• Flexible search queries
• Search in memory
• Two lines of code:

        Searcher searcher = new IndexSearcher(idx);
        isResponsive = search(searcher, queryString);
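
The slide does not show the `search` helper, so the sketch below only illustrates the shape of a responsive/non-responsive decision. It is a stand-in that uses plain keyword matching from the Java standard library instead of Lucene, and it assumes that a document is responsive when it contains every query term. The names `Culler`, `tokenize`, and `isResponsive` are hypothetical. Lucene queries are far more flexible than this.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Stand-in for the Lucene-based check on the slide, NOT FreeEed's code:
// a document is treated as responsive when it contains every query term.
final class Culler {

    // Lower-case the text and split on anything that is not a letter or digit.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    static boolean isResponsive(String documentText, String queryString) {
        Set<String> queryTerms = tokenize(queryString);
        // An empty query matches nothing, so no document is kept by accident.
        return !queryTerms.isEmpty() && tokenize(documentText).containsAll(queryTerms);
    }

    public static void main(String[] args) {
        System.out.println(isResponsive("Quarterly budget review attached.", "budget review"));
        System.out.println(isResponsive("Lunch at noon?", "budget review"));
    }
}
```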


TIFF/PDF

• OpenOffice
• LibreOffice
• Admittedly, TIFFing is hard
• Open source answer: it is what it is
• Perfectionist answer: commercial filters


Database use (HBase or Cassandra)

For example, find all authors

1. Document -> Author
   – Key = Author, Value = None
   – Author can be overwritten
   – The "Authors" row has all Authors
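
The reason overwriting is harmless can be shown with a small in-memory sketch. It is not an HBase or Cassandra client: a sorted map stands in for one wide row, and the names `AuthorsRow`, `record`, and `allAuthors` are hypothetical. Each author is stored as a key with an empty value, so writing the same author twice lands on the same key.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// In-memory stand-in for one wide "Authors" row, NOT a real database client.
final class AuthorsRow {

    // Key = author, value = empty placeholder ("None" on the slide).
    private final Map<String, String> columns = new TreeMap<>();

    // Writing the same author again overwrites the same key, so duplicates
    // collapse without any read-before-write check.
    void record(String author) {
        columns.put(author, "");
    }

    // Reading the row back yields every distinct author, sorted by name.
    Set<String> allAuthors() {
        return columns.keySet();
    }

    public static void main(String[] args) {
        AuthorsRow row = new AuthorsRow();
        // One call per processed document; the names are made-up examples.
        row.record("Bob");
        row.record("Alice");
        row.record("Bob");
        System.out.println(row.allAuthors());
    }
}
```

Because every write is a blind overwrite, many nodes can record authors in parallel without coordinating with each other.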


So, practically

Command-line

java -jar dist/FreeEed.jar -param_file my.freeeed.properties

or GUI


1-2-3

• Install Ubuntu
• Download FreeEed
• Run the program
• Ask for more features


Install Ubuntu

 


Download and unzip FreeEed

 


Enjoy

 

