
Almaden Research Center © 2006 IBM Corporation IOP 06 Open Source Intelligence Lesson Learned.

Date post: 27-Mar-2015
Upload: gabriel-hicks
Transcript
Page 1: Almaden Research Center © 2006 IBM Corporation IOP 06 Open Source Intelligence Lesson Learned.

Almaden Research Center

© 2006 IBM Corporation

IOP ’06 Open Source Intelligence Lesson Learned

Page 2:

Issues in using open source for intelligence

Growth and complexity of heterogeneous content

Not all open source data is equal – Quantitative vs. Qualitative

Requirements of Ecoinformatics Architectures

Page 3:

Digital content is growing at a dramatic rate

10^24 bytes = 1 trillion terabytes of data, which is equivalent to all the information consumed visually by all humans in a year

[Chart x-axis: Years]

Source: IBM 2005 GTO

Page 4:

The scale of open source data and its heterogeneous form increase the complexity of extracting intelligence

[Chart labels: Storage online; Medical data stored; Personal multimedia; Surveillance bytes; Photos multimedia; Structured data; Free-form text; Scalable; Heterogeneity; Intelligence. Scale markers: 10^9, 10^12, 10^15, 10^21, 10^24, 10^27]

Source: IBM 2005 GTO

Page 5:

Open Source Intelligence from the periphery requires an understanding of its topology, including strengths and weaknesses

[Chart: source types arranged between Quantitative and Qualitative; bracket label: sources in the periphery]

Industry Publication
Company Internal Content
Company Publication
Industry Journals
Conference Proceedings
NGO Publications
Website affiliated with an organization
User Groups / Forums
Newsletters
Content Aggregators
News & Press Releases
Legal Filings
Government Publications
Blogs / Weblogs
Non-affiliated Websites

Authoritative sources: the data is trusted and is defended

Credentialed opinions: the source is known and can be weighted

Open opinion: it is impossible to verify the authority of the source
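The three source classes can be turned into a simple weighting scheme. The sketch below is illustrative only: the slides say that credentialed sources "can be weighted" but give no numbers, so the weights, the `Mention` record, and the `weighted_support` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights per source class; the slides give no numeric values.
TIER_WEIGHTS = {
    "authoritative": 1.0,   # data is trusted and defended
    "credentialed": 0.5,    # source is known and can be weighted
    "open": 0.25,           # authority of the source cannot be verified
}


@dataclass
class Mention:
    source: str            # e.g. "Government Publications"
    tier: str              # one of the keys in TIER_WEIGHTS
    supports_claim: bool   # does this source support the claim under review?


def weighted_support(mentions):
    """Return the weighted fraction of mentions that support the claim."""
    total = sum(TIER_WEIGHTS[m.tier] for m in mentions)
    if total == 0:
        return 0.0
    agree = sum(TIER_WEIGHTS[m.tier] for m in mentions if m.supports_claim)
    return agree / total


mentions = [
    Mention("Government Publications", "authoritative", True),
    Mention("Blogs / Weblogs", "open", False),
    Mention("Non-affiliated Websites", "open", False),
]
score = weighted_support(mentions)
```

With these assumed weights, one authoritative source outweighs two open opinions, which a plain head count would not show.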

Page 6:

Ecoinformatics Architectures need to be multi-layered

[Diagram: layered architecture. Recoverable labels:]

Customer Applications: Performance Management, Drug Research, Business Insights Workbench
Applications: Network Associations, Search, Topic Tracking, Buzz Analysis
Cross-Page Annotators: Classification, Clustering, Communities, Ranking
Per-Page Annotators: Auto Entity Spotters, Auto Geography Spotter, Porn & Dup Detection, Customer Taxonomy Spotter
Other labels: Date Spotters, Language Spotters, Source Spotters
Other labels: Natural Clustering, Affinity Analysis, Snippet Analysis, Trending
Other labels: Parsing/Tokenizing, Annotation, Searching
Other labels: Index, Store
Data Acquisition (Unstructured Data, Structured Data): World Wide Web, Blogs, Newspapers, Licensed Feeds, Data Bases, Intranet Data, Taxonomies, Commercial Data Bases
Throughput labels (pages/second): 10's, 100's, 1000's
Axis labels: Relevancy, Volume
Products named: WebFountain, Business Insights Workbench, WS OmniFind II
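The split between per-page and cross-page annotators can be sketched in a few lines. This is a minimal illustration, not the WebFountain implementation: the spotter functions, their regular expressions, and the page dictionaries are hypothetical stand-ins. Per-page annotators look at one page at a time, while a cross-page annotator needs the whole annotated collection.

```python
import re
from collections import Counter


def date_spotter(page):
    """Per-page annotator (hypothetical): spot four-digit years."""
    return {"years": re.findall(r"\b(?:19|20)\d{2}\b", page["text"])}


def entity_spotter(page):
    """Per-page annotator (hypothetical): naive 'Firstname Lastname' spotting."""
    return {"entities": re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", page["text"])}


PER_PAGE_ANNOTATORS = [date_spotter, entity_spotter]


def annotate_pages(pages):
    """Layer 1: run every per-page annotator on each page independently."""
    annotated = []
    for page in pages:
        notes = {}
        for annotator in PER_PAGE_ANNOTATORS:
            notes.update(annotator(page))
        annotated.append({**page, "annotations": notes})
    return annotated


def rank_entities(annotated):
    """Layer 2: cross-page annotator ranking entities by pages that mention them."""
    counts = Counter()
    for page in annotated:
        counts.update(set(page["annotations"]["entities"]))
    return counts.most_common()


pages = [
    {"text": "Robert Steele spoke in 2005."},
    {"text": "In 2006 Robert Steele met Eliot Jardines."},
]
annotated = annotate_pages(pages)
ranking = rank_entities(annotated)
```

Because the per-page layer keeps no state between pages, it can be spread across many machines, while the cross-page layer runs over the stored annotations afterwards.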

Page 7:

Finding intelligence can require different views of the same information

Open Source Trend on Web

[Chart: # of Web Pages (000) by Year, 2001 to 2005; y-axis 0 to 140]

[Chart: # of Web Pages (000) by month, Jan to Dec 2005; y-axis 0 to 70. Annotation: Some event happened in August]

[Chart: % of OSI web documents (0.0% to 4.5%) for Congressman Rob Simmons, Douglas Rushkoff, Eliot Jardines, Major General Patrick Cammaert, Mr Arno Reuser, Robert Steele. Annotation: One dominant voice]
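The "one dominant voice" view amounts to computing each person's share of the documents on a topic. The sketch below assumes that reading. The counts are made-up placeholders, not the values behind the slide, and the rule that a voice is dominant when its share is at least twice the runner-up's is an arbitrary assumption.

```python
def share_of_voice(doc_counts, total_docs):
    """Fraction of all topic documents that mention each person."""
    return {name: count / total_docs for name, count in doc_counts.items()}


def dominant_voice(shares, factor=2.0):
    """Return the top name if its share is at least `factor` times the
    runner-up's share (hypothetical rule), otherwise None."""
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    top_name, top_share = ranked[0]
    runner_up_share = ranked[1][1]
    return top_name if top_share >= factor * runner_up_share else None


# Placeholder counts for illustration only.
counts = {"Person A": 40, "Person B": 10, "Person C": 5}
shares = share_of_voice(counts, total_docs=1000)
leader = dominant_voice(shares)
```

The same documents can feed a trend-over-time view or this per-person view, which is the point of looking at one data set in different ways.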

Page 8:

Context

Query | Count
Robert Steele | 6,440,000
"Robert Steele" | 170,000
"Robert Steele" and Open Source Intelligence | 2,400
"Robert David Steele" and "Open Source Intelligence" within 5 words | 73
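The narrowest query, two exact phrases within 5 words of each other, can be sketched as a proximity test over tokens. This is an illustration only: it assumes "within 5 words" means at most five words between the two phrases, and the tokenizer and function names are hypothetical rather than any search engine's actual behavior.

```python
import re


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(tokens, phrase_tokens):
    """Start indexes at which the exact phrase occurs in the token list."""
    n = len(phrase_tokens)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase_tokens]


def within_n_words(text, phrase_a, phrase_b, n=5):
    """True if both phrases occur with at most n words between them."""
    tokens = tokenize(text)
    a_tokens, b_tokens = tokenize(phrase_a), tokenize(phrase_b)
    for a in phrase_positions(tokens, a_tokens):
        for b in phrase_positions(tokens, b_tokens):
            if a <= b:
                gap = b - (a + len(a_tokens))   # words between A and B
            else:
                gap = a - (b + len(b_tokens))   # words between B and A
            if 0 <= gap <= n:
                return True
    return False


near = within_n_words(
    "Robert David Steele coined the term Open Source Intelligence years ago.",
    "Robert David Steele", "Open Source Intelligence")
far = within_n_words(
    "Robert David Steele gave a long talk yesterday about many things "
    "including Open Source Intelligence.",
    "Robert David Steele", "Open Source Intelligence")
```

Each added constraint (exact phrase, second phrase, proximity) discards documents where the words co-occur without being related, which is why the count falls from millions to 73.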

Network of Conference Attendees to auto-spotted Companies and Universities

In this network view we don’t care about association with “Open Source Intelligence” but about association with companies and universities

Page 9:

Conclusions on Open Source Intelligence

Computers don’t create intelligence, people do – computers enable smart people

Not all open source content is equal – know the sources

Not everything you see is right – it’s all about the CONTEXT

Ecoinformation architecture supports
- Large scale analytics of open source content
- Integration of content other than open source
- Powerful text analytics tools to support analysis of on-topic stores

