Post on 20-Jan-2015
GoodRelations & RDFa for Deep
Comparison Shopping on a Web Scale
Can the Web of Data Reduce Price Competition
and Increase Customer Satisfaction?
Semantic Web Meetup Chicago
December 7, 2009,
Chicago, IL, USA
Prof. Dr. Martin Hepp http://www.unibw.de/ebusiness/
http://purl.org/goodrelations/
Twitter: mfhepp
Skype: mfhepp
2 Martin Hepp,
mhepp@computer.org
GoodRelations: A Unified View on
Commerce Data on the Web
[Figure: product model master data, shop offerings, auctions, spare parts & consumables, warranty, delivery, and payment data from retailers and manufacturers, open to arbitrary query, extraction, and reuse]
Part I: Diversity in Markets
The specificity of exchanged
goods has kept on growing...
Specificity
How much you lose when you can't
use a good for what it was designed for.
Growth in Specificity
Reason # 1: Division of Labor
Growth in Specificity
Reason # 2: Technical Advancement
and Innovation
Growth in Specificity
Reason # 3: Logistics
Temporal Constraints etc.
Growth in Specificity
Reason # 4: Wealth
Abraham H. Maslow (1908-1970)
A Theory of Human Motivation (1943)
Examples
Specificity Increases the
Search Space
Multi-Dimensional Trade-Off Problems
• Product Features
• Price
• Services
• Logistics
• Business Partners
• Etc.
Part II: E-Commerce on the Web
Search for Suppliers, 2009
Limitations of the Web, 2009
No Unified View: Jumping Back and Forth
Across Data Silos
[Figure: search engine results pointing to pages 1-8, which are spread across sites 1-3]
We know the best hits only when done.
[Figure: one search engine results list pointing to pages 1-8 on sites 1-3]
Specificity vs. Keyword-based Search
• Synonyms
• Homonyms
• Multiple languages
• No parametric
search
Limited Ability to Reuse Data
The Web: A Bottleneck for Sharing
Product Data
Challenge: Web-wide Product Search
• Find all MP3 players
that have a USB
interface and a color
display, and sort them
by weight (lightest
first).
...on a Web Scale!
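Once product features are available as structured data instead of page text, a query like the one above becomes a simple filter and sort. The following is a minimal Python sketch under stated assumptions: the `Product` record, its field names (`interfaces`, `display`, `weight_g`), and the sample catalog are all invented for illustration and do not come from GoodRelations or from any real shop.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    """Hypothetical product record; field names are illustrative only."""
    name: str
    category: str
    weight_g: float
    interfaces: set = field(default_factory=set)
    display: str = "none"  # e.g. "color", "monochrome", "none"


def find_mp3_players(products):
    """Return MP3 players with a USB interface and a color display, lightest first."""
    matches = [
        p for p in products
        if p.category == "MP3 player"
        and "USB" in p.interfaces
        and p.display == "color"
    ]
    return sorted(matches, key=lambda p: p.weight_g)


# Invented sample data, standing in for offers collected from many sites.
CATALOG = [
    Product("Player A", "MP3 player", 120.0, {"USB"}, "color"),
    Product("Player B", "MP3 player", 45.0, {"USB", "Bluetooth"}, "color"),
    Product("Player C", "MP3 player", 30.0, {"USB"}, "monochrome"),
    Product("Camera D", "Digital camera", 200.0, {"USB"}, "color"),
]

RESULT_NAMES = [p.name for p in find_mp3_players(CATALOG)]
```

Keyword search cannot express the weight ordering or the feature constraints; the sketch only works because every feature is a typed field rather than free text.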
Today: Loss of Variety and Detail
[Figure: manufacturers & retailers with many different products, consumers with variety in preferences, and Web search in between]
What’s the
Consequence?
Effect: Excessive Price Competition
Only 1 – 2 Product Models Considered
Comparison Shopping on the Small Subset
This will change soon.
Actually, very soon.
Deep Comparison Shopping
[Figure: search engine results covering pages 1-8 across sites 1-3]
Part III: E-Commerce on the Web of
Linked Data
E-Commerce on the Web of Linked Data
GoodRelations Principle: Small Data
Packets Inside Your Page
RDF2RDFa: RDFa in Snippet Style
RDF2RDFa: Turning RDF into Snippets for Copy-and-Paste
Independent RDFa Snippets
RDFa by design allows differences between the literals used as property values and the literals being displayed using the “content” attribute [1], e.g.
<body>
<div property="vcard:tel" datatype="xsd:string" content="+49-89-6004-0">+49-89-6004 ext. 0</div>
</body>
This is particularly useful if the formatting of the data for humans and machines differs, e.g. in the case of date and time information (“2009-04-24T00:00:00+01:00”). It is possible to exploit this to create XHTML snippets that just contain the meta-data and insert it for instance at the bottom of the page:
<body>
<!-- Content for humans -->
<div>+49-89-6004-0</div>
<!-- RDFa rich meta-data -->
<div property="vcard:tel" datatype="xsd:string" content="+49-89-6004-0"></div>
</body>
The potential advantages of this approach are that (1) we disentangle the markup and that (2) respective snippets for simple copy-and-paste can be provided by form-based tools like FOAF-a-Matic [2]. As compared to publishing a separate RDF/XML file on the server, the advantages are that (1) RDFa data is considered by Yahoo! SearchMonkey and other services, (2) one still has to maintain a single file only (reducing the likelihood of outdated, forgotten meta-data files), (3) the content creator does not require access beyond being able to edit the page. Also, note that literal values will often have to be encoded in RDFa “content” attributes anyway, because the string for the presentation is not suitable as meta-data content (e.g. dates or country codes).
In a nutshell, the proposed approach can be a powerful way of publishing non-trivial RDF meta-data suitable for broad audiences. Imagine e.g. if eBay sellers were able to put detailed GoodRelations [3] meta-data directly into the free markup part of their product description in the system.
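The snippet style described above lends itself to generation by a small tool. As a rough illustration only, and not the RDF2RDFa implementation, the Python sketch below builds one invisible div that carries a literal purely in its RDFa attributes. The function name `rdfa_literal_snippet` is hypothetical, and the sketch assumes that the prefixes used (here `vcard` and `xsd`) are declared elsewhere in the enclosing page.

```python
from html import escape


def rdfa_literal_snippet(prop, value, datatype="xsd:string"):
    """Build an empty div that carries a literal only in its RDFa attributes.

    Hypothetical helper: attribute values are HTML-escaped so that the
    result can be pasted into existing XHTML without breaking the markup.
    """
    return '<div property="{}" datatype="{}" content="{}"></div>'.format(
        escape(prop, quote=True),
        escape(datatype, quote=True),
        escape(value, quote=True),
    )


# Reproduces the meta-data div from the example above.
SNIPPET = rdfa_literal_snippet("vcard:tel", "+49-89-6004-0")
```

Because the div has no text content, it adds nothing to the rendered page, which is what makes the snippet safe to paste at the bottom of an existing document.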
Abstract
In this demo and poster, we show a conceptual approach and an on-line tool that allows the use of RDFa for embedding non-trivial RDF models in the form of invisible div/span elements into existing Web content. This simplifies the publication of sophisticated RDF data, i.e. data that goes beyond simple property-value pairs, by broad audiences. Also, it empowers users whose access is limited to inserting XHTML snippets within Web-based authoring systems to add fully-fledged RDF and even OWL. This is a frequent limitation for users of CMS systems or Wikis.
Example: N3 as RDFa Snippet
RDF2RDFa Tool
References
[1] RDFa in XHTML: Syntax and Processing. A collection of attributes and processing rules for extending XHTML to support RDF. W3C Recommendation 14 October 2008, available at http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014.
[2] FOAF-a-Matic, available at http://www.ldodds.com/foaf/foaf-a-matic.
[3] Hepp, M.: GoodRelations: An Ontology for Describing Products and Services Offers on the Web. 16th International Conference on Knowledge Engineering and Knowledge Management (EKAW2008), Acitrezza, Italy, Springer LNCS Vol. 5268, 2008, pp. 332-347.
[4] RDFa Distiller, available at http://www.w3.org/2007/08/pyRdfa
Martin Hepp
E-Business & Web Science Research Group,
Universität der Bundeswehr
Werner-Heisenberg-Weg 39
D-85579 Neubiberg, Germany
+49-89-6004-4217
mhepp@computer.org
Roberto García
Computer Science and
Engineering Department
Universitat de Lleida
Jaume II, 69, E-25001 Lleida, Spain
+34-973-702-742
roberto@rhizomik.net
Andreas Radinger
E-Business & Web Science Research Group,
Universität der Bundeswehr
Werner-Heisenberg-Weg 39
D-85579 Neubiberg, Germany
+49-89-6004-4218
andreas.radinger@unibw.de
Limitations of Popular RDFa Usage
With the RDFa syntax for embedding RDF data in XHTML attributes being a W3C Recommendation, there is a standard way of adding RDF to Web content by inserting additional mark-up [1]. However, the current usage of RDFa in the community is dominated by (1) using simple property-value pairs rather than complex graph structures and (2) a close coupling between page content for rendering and the literals attached to properties. For example, a typical recipe would be to augment a phone number in a page by making it the literal attached to the vcard:tel property:
<body>
<div property="vcard:tel" datatype="xsd:string">+49-89-6004-0</div>
</body>
The key reason for the popularity of this approach is that there is no data redundancy, i.e. what is shown in a browser is always identical to what an RDF-aware application will extract.
While this is appropriate for very lightweight annotations, it becomes very complicated if (1) more sophisticated RDF models are to be embedded or (2) the content or organization of the information for humans on one hand and for machines on the other hand differ. Also, the interweaving of existing Web content for humans with non-trivial RDF models requires a lot of expertise, in particular if many nodes in the RDF model have no visual counterparts. In those cases, the initial goal of avoiding data redundancy clashes with the goal of the separation of concerns, and the XHTML+RDFa markup gets hard to read and difficult to maintain because it closely couples presentation and data. For examples, see http://www.ebusiness-unibw.org/wiki/Rdfa4google. Most of all, it is not possible to provide users with XHTML snippets to be simply inserted into Web resources, without the need to manually integrate them with existing XHTML markup.
N3
foo:myCompany a gr:BusinessEntity ;
    gr:hasLegalName "Hepp Industries Ltd."^^xsd:string ;
    gr:hasDUNS "012345678"^^xsd:string ;
    gr:hasPOS foo:myShop ;
    rdfs:seeAlso <http://www.heppnetz.de> .

foo:myShop a gr:LocationOfSalesOrServiceProvisioning ;
    rdfs:seeAlso <http://www.heppnetz.de/shop> ;
    gr:hasOpeningHourSpecification foo:Workdays .

foo:Workdays a gr:OpeningHoursSpecification ;
    gr:opens "08:00:00"^^xsd:time ;
    gr:closes "18:00:00"^^xsd:time ;
    gr:hasOpeningHoursDayOfWeek gr:Monday, gr:Tuesday, gr:Wednesday, gr:Thursday, gr:Friday .
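To illustrate what a consuming application might do with such data, the Python sketch below answers whether the shop is open at a given moment. It assumes the opening-hours specification has already been extracted from the page into a plain dict; the dict layout and the function name `is_open` are hypothetical, and only the times and weekdays are taken from the N3 example.

```python
from datetime import datetime, time

# Values copied from the N3 example; the dict layout itself is an assumption.
WORKDAYS = {
    "opens": time(8, 0, 0),
    "closes": time(18, 0, 0),
    "days": {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday"},
}

# Index matches datetime.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def is_open(spec, moment):
    """Return True if `moment` falls on a listed day between opens and closes."""
    day = DAY_NAMES[moment.weekday()]
    return day in spec["days"] and spec["opens"] <= moment.time() <= spec["closes"]
```

For instance, `is_open(WORKDAYS, datetime(2009, 12, 7, 9, 30))` checks a Monday morning. Time zones and holidays are ignored in this sketch.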
Discovery Effort
Both Sides Can Help Build a Bridge
What Do We Need?
• Vocabularies
– Product or service
types
– Businesses
– Offerings
• Data Sets
– Product model data
– Businesses, contact
details, opening hours
– Offering data
• Tools
• Applications
Part IV: The GoodRelations
Vocabulary and Data Space
GoodRelations: A Unified View on
Commerce Data on the Web
[Figure: product model master data, shop offerings, auctions, spare parts & consumables, warranty, delivery, and payment data from retailers and manufacturers, open to arbitrary query, extraction, and reuse]
On the Shoulders of Giants
A Unified View of Commerce Data on the Web
The GoodRelations Vocabulary
• A universal and free Web vocabulary for adding product and offering data to your Web pages.
• Compatible with all relevant W3C standards and recommendations
– RDF
– OWL
http://purl.org/goodrelations/
GoodRelations Design Principles
• Keep simple things
simple and make
complex things possible
• Cater for LOD and OWL
DL worlds
• Academically sound
• Industry-strength
engineering
• Practically relevant
[Figure: Lightweight Web of Data (LOD, RDF + a little bit) and Heavyweight Web of Data (OWL DL)]
Others Do Care: Pick-up in Industry
• BestBuy
• Smart Information Systems
• ebSemantics
• Yahoo! SearchMonkey
• Virtuoso Sponger Cartridges for Amazon, eBay, and
• Major German mail order companies
• etc.
Yahoo Enhanced by SearchMonkey
BestBuy
Incredible Success
GoodRelations #2 of all Web Ontologies
…and this does not yet include the > 10 Mio. offers
from Amazon and eBay!
Albert Einstein on Schema Design
"Make everything as simple as possible, but not simpler."
Albert Einstein
Basic Structure of Offers
[Figure: Agent 1, Agent 2, Object or Happening, Promise, Compensation, Transfer of Rights]
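One possible reading of this structure is that an offer is a promise by one agent to transfer certain rights on an object or happening to another agent in return for compensation. The Python sketch below records that reading as a small data type. The class, its field names, and the sample values are assumptions made for illustration; they are not GoodRelations terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical record for the elements named on the slide."""
    offering_agent: str       # "Agent 1" on the slide
    receiving_agent: str      # "Agent 2" on the slide
    object_or_happening: str  # what the promise is about
    rights_transferred: str   # e.g. "ownership" or "temporary use"
    compensation: str         # what is expected in return

    def describe(self):
        """Spell the offer out as a single promise sentence."""
        return "{} promises {} {} of {} for {}".format(
            self.offering_agent,
            self.receiving_agent,
            self.rights_transferred,
            self.object_or_happening,
            self.compensation,
        )


# Invented sample values, for illustration only.
EXAMPLE = Offer("Shop X", "any customer", "a camera", "ownership", "EUR 200")
SENTENCE = EXAMPLE.describe()
```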
Data, Standards, Ontologies
GoodRelations: License
• Permanent, royalty-free access for commercial and non-commercial use.
http://purl.org/goodrelations/
Domain Structure and Use Cases
The Minimal Scenario
• Scope
– Business entity
– Points-of-sale
– Opening hours
– Payment options
• Suitable for
– Every business
– E-commerce and
brick-and-mortar
The Simple Scenario
• Scope: Minimal scenario plus
– Range of products or services
– Business functions
– Eligible regions or customer
types
– Delivery options
• Suitable for
– Any business: E-Commerce and
brick-and-mortar
– Specific products or services
GoodRelations Annotator
http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
GoodRelations in MediaWiki
http://www.ebusiness-unibw.org/wiki/RDFaInMediaWiki
The Comprehensive Scenario
• Scope: Simple scenario plus
– Individual products or services
– Product features
– Pricing, rebates, etc.
– Availability
• Suitable for
– Any business: E-commerce and brick-and-mortar
– Specific products or services
– Structured product database
GoodRelations CookBook
http://www.ebusiness-unibw.org/wiki/GoodRelations#Recipes_and_Examples
osCommerce Extension
http://code.google.com/p/goodrelations-for-oscommerce/
Joomla/VirtueMart Extension
http://code.google.com/p/goodrelations-for-joomla/
GoodRelations in Oxid eSales
• Popular shop
software
• Free recipe for adding
GoodRelations,
developed by Daniel
Bingel
http://www.ebusiness-unibw.org/wiki/GoodRelations4Oxid
GoodRelations in Magento
• Testshop
– http://www.la-mousson.de/
• Extension:
– Uwe Stoll, http://www.semantium.de/
Google Product Feed Converter
http://www.ebusiness-unibw.org/tools/google-product-feed-converter/
BMEcat2GoodRelations
• Converts complete catalogs from the popular
BMEcat XML Schema into GoodRelations
http://www.ebusiness-unibw.org/tools/bmecat2goodrelations/
Product Model Data Scenario
• Scope
– Individual product
models
– Quantitative and
qualitative features
• Suitable for
– Manufacturers of
commodities
Linked Open Commerce Dataspace
http://loc.openlinksw.com/sparql
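A SPARQL endpoint like this can be queried over HTTP by passing the query text in a `query` parameter, as defined by the SPARQL protocol. The Python sketch below only builds such a request URL and sends nothing. The query is an illustrative assumption that lists a few resources typed as `gr:Offering`, and whether the endpoint from the slide is still reachable is not something this sketch checks.

```python
from urllib.parse import urlencode

ENDPOINT = "http://loc.openlinksw.com/sparql"  # endpoint URL from the slide

# Illustrative query: list a few resources typed as gr:Offering.
QUERY = """
PREFIX gr: <http://purl.org/goodrelations/v1#>
SELECT ?offering
WHERE { ?offering a gr:Offering . }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET URL using the protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


URL = build_query_url(ENDPOINT, QUERY)
```

The resulting URL could then be fetched with any HTTP client; the result format depends on the endpoint and on the `Accept` header sent with the request.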
Conclusion
Today: Loss of Variety and Detail
[Figure: manufacturers & retailers with many different products, consumers with variety in preferences, and Web search in between]
2010: Point-to-Point Commerce
[Figure: manufacturers & retailers with many different products, and consumers with variety in preferences]
Why Should I Bother?
• Web Shops: Better visibility in latest generation
search engines (e.g. Yahoo)
– Same holds for any business that has a Web page, from A as in Amusement Park to Z as in Zoo.
• Manufacturers: Allow your retailers to reuse
product feature data with minimal overhead at
both ends.
• Software Developers: Help your customers to use and generate Semantic Web data. It’s easy!
What Should I Do?
• Web Shops: Create a GoodRelations data dump of
your range of offers (rather simple)
• Vendors of Web Shop Software: Create
GoodRelations import and export interfaces (we can
help you with that)
• Every Business: Ask your webmaster to create at
least a basic description of your range of products or
services
• Entrepreneurs: Invent new business models based
on GoodRelations data
Part V: The Sky Is the Limit
Semantics in Affiliate Models,
Serendipity, Matchmaking
http://igoogr.appspot.com/
GoodRelations as a Global Schema
Thank you!
http://purl.org/goodrelations/
Prof. Dr. Martin Hepp
Chair of General Management and E-Business
Universität der Bundeswehr München
Werner-Heisenberg-Weg 39
D-85579 Neubiberg, Germany
Phone: +49 89 6004-4217 Fax: +49 89 6004-4620
http://www.unibw.de/ebusiness/
http://purl.org/goodrelations/
mhepp@computer.org
Bonus Track: Tools and Resources
Additional Information
• Web Page – Ontology – Language Reference – Primer – Recipes – Wiki
http://purl.org/goodrelations/
GoodRelations User's Guide ("Primer")
http://www.heppnetz.de/projects/goodrelations/primer/
GoodRelations Cookbook:
Recipes & Examples
http://www.ebusiness-unibw.org/wiki/GoodRelations#Recipes_and_Examples
GoodRelations Annotator
http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
GoodRelations Validator
http://www.ebusiness-unibw.org/tools/goodrelations-validator/
RDF2dataRSS Tool
http://www.ebusiness-unibw.org/tools/rdf2datarss/
Growing Interest of Developers