Building SkyNet for Science: Discovering New Frontiers Using Embedded Knowledge

Post on 17-Nov-2014

description

Discovery in the digital environment is primarily mediated by machines. Unfortunately, the machines don't speak our language. Therefore, we must find standard ways of representing and communicating our requests, and standards for embedding and exchanging knowledge about digital objects. With the rise of the machines, we need to consider what information encodings will allow them to most efficiently process and analyze the vast range of information that is available. We need to find ways to communicate human recommendations and preferences, and to enable people to successfully explore the new digital frontier.

transcript

Building SkyNet for Science

Discovering New Frontiers Using Embedded Knowledge

Richard Akerman

NISO Discovery Tools Forum

March 27, 2008

Stanley

How can we better serve the machines?

The machines don’t speak our language

We must become knowledge translators

To Serve Machine

• Produce information in formats that machines can understand, in parallel with formats that are human readable

• Every web resource its machine reader

• Have a limited number of formats, keep them simple, and enable easy interchange of information

• Save the time of the machine

Bibliographic Metadata as a First Class Citizen

• OpenURL (ANSI/NISO Z39.88-2004)

• COinS
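As a rough illustration of the two bullets above, the sketch below embeds an OpenURL ContextObject in a web page as a COinS span (an empty span with class "Z3988" whose title attribute carries the key/encoded-value pairs) and reads it back the way a machine would. The function names coins_span and read_coins and the sample article fields are hypothetical, chosen only for this example; the regular expression assumes the attribute order that coins_span itself produces.

```python
import re
from html import escape, unescape
from urllib.parse import parse_qsl, urlencode


def coins_span(fields):
    """Build a COinS span for a journal article (hypothetical helper)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    # Referent metadata keys carry the "rft." prefix, e.g. rft.atitle.
    pairs += [(f"rft.{key}", value) for key, value in fields.items()]
    kev = urlencode(pairs)  # URL-encode every key and value
    # The encoded string sits in an HTML attribute, so "&" becomes "&amp;".
    return f'<span class="Z3988" title="{escape(kev, quote=True)}"></span>'


def read_coins(html_text):
    """Recover the key/value pairs from the first COinS span, if any."""
    match = re.search(r'class="Z3988"\s+title="([^"]*)"', html_text)
    if match is None:
        return {}
    return dict(parse_qsl(unescape(match.group(1))))


if __name__ == "__main__":
    span = coins_span({
        "atitle": "An Example Article",   # made-up sample values
        "jtitle": "Journal of Examples",
        "date": "2008",
    })
    print(span)
    print(read_coins(span))
```

The human reader sees nothing, because the span is empty; a browser extension or crawler that knows the convention can turn the same page into a citation or a link-resolver request.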

Unique Identifiers

• authors

• institutions

• text content

• data

To Serve Human

• Delicious Library

• LibraryThing

• Machines can process and analyze information, but only humans can use and savour information (for now...)

The Social Life of Humans

• Formal categorization

• Reviews

• Ratings

• Connections / Relatedness

• Informal categorization (tags, folksonomies)

• Use (frequency, time...)

• Groups (colleagues, friends, work groups...)

The Social Life of Machines

• Feature extraction

• Similarity (count-based, vector-based)

• Impact factor / PageRank

• Context (location, others)

• Numbers numbers numbers

• Machines love unique identifiers
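The two kinds of similarity named above can be sketched in a few lines. This is one possible reading, not necessarily the one intended on the slide: "count-based" is taken here as the share of tags two records have in common (Jaccard overlap), and "vector-based" as the cosine of the angle between word-count vectors. The function names and sample inputs are made up for illustration.

```python
import math
from collections import Counter


def overlap_similarity(tags_a, tags_b):
    """Count-based: shared tags divided by all distinct tags (Jaccard)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(text_a, text_b):
    """Vector-based: cosine between simple word-count vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(overlap_similarity(["physics", "optics"], ["optics", "lasers"]))
    print(cosine_similarity("open data for science", "open access to science data"))
```

Both functions reduce a comparison to a single number between 0 and 1, which is what lets a machine rank candidates without understanding them.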

Use Case

• Find me the best relevant information

• Without me asking for it?

• Wherever and whenever?

Every Book Its Reader

• The WebOPAC is not a discovery interface

• Build a discovery layer over the catalogue metadata

Open Data

There is more to heaven and earth

• Licensed content and access

• Organization content

• The entire biblioverse and Internet

Is there “too much” information?

There is too much information poverty

Seeing the forest - licensed content

• Federated search

• Local indexing

I see... everything

• XML, RDF, RSS, GeoRSS...

• Microformats - Embedded knowledge

• Aggregators

• Recommender APIs
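To make the aggregator idea concrete, here is a minimal sketch of the machine side of an RSS 2.0 feed: it pulls the title and link out of each item using only the Python standard library. The feed content, the example.org URLs, and the name read_feed are invented for the example; a real aggregator would fetch feeds over HTTP and handle namespaces, dates, and malformed XML.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document standing in for a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example new titles</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>First example item</title>
      <link>http://example.org/items/1</link>
    </item>
    <item>
      <title>Second example item</title>
      <link>http://example.org/items/2</link>
    </item>
  </channel>
</rss>
"""


def read_feed(xml_text):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in read_feed(SAMPLE_FEED):
        print(title, link)
```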

Glen Newton

Free the Humans!

Richard Akerman

NRC-CISTI

http://www.connotea.org/user/scilib/tag/nisodiscovery2008

© 2008 Government of Canada

Licensed in the Creative Commons

http://creativecommons.org/licenses/by-nc-sa/2.5/ca/