Understanding the Hidden Web
Pierre Senellart
Journées GEMO, 2nd June 2005

Outline:
Introduction
Process description
Discovery
Wrappers
Semantic Analysis
Indexing and Querying
Summary
The Hidden Web
Definition (Hidden Web): The set of webpages (which may or may not be dynamically generated) not accessible from the hyperlinked structure of the World Wide Web.
Size estimate (2001): 500 times larger than the surface Web.
How to understand it and benefit from its content?
Web Service Semantic Interpretation Process
[Figure: processing pipeline, built up step by step]
World Wide Web → discovery → HTML forms, WSDL, UDDI → wrappers → WSDL → analysis → Analyzed Web Services → indexing → Web Services Index → querying (Queries in, Results out)
Web Service Discovery
Crawling the World Wide Web for:
- HTML forms implementing a Web Service
- UDDI registries
- WSDL descriptions
- Other resources (XML, HTML, Web as a full-text index...)
Only interested in Web Services with no side effects:
- OK: Yellow Pages, publication databases...
- Not OK: booking services, mailing list management...
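
As an illustration of the crawling step, here is a minimal sketch of how a crawler might sort fetched documents into the resource kinds listed above. It is not the method of the talk: the function name `classify_resource` and the three labels are assumptions made for this example, and it only recognizes WSDL 1.1 documents and pages containing an HTML form. Deciding whether a service has side effects is not attempted here.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace


class _FormFinder(HTMLParser):
    """Counts <form> elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1


def classify_resource(text):
    """Return 'wsdl', 'html-form' or 'other' (labels are hypothetical)."""
    try:
        root = ET.fromstring(text)
        if root.tag == f"{{{WSDL_NS}}}definitions":
            return "wsdl"
    except ET.ParseError:
        pass  # not well-formed XML; fall through to the HTML check
    finder = _FormFinder()
    finder.feed(text)
    return "html-form" if finder.forms else "other"
```

A real crawler would also need to handle UDDI registries, which are queried through their own API rather than recognized from a single document.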
Wrapping Web Service Descriptions
Analyzing the structure of:
- HTML forms
- Result webpages
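
To make "analyzing the structure of HTML forms" concrete, the following sketch extracts, for each form in a page, its action, its method and its named fields. This is only one possible starting point for a wrapper, under the assumption that field names and types are the structure of interest. The class and function names are invented for this example, and the analysis of result webpages is not covered.

```python
from html.parser import HTMLParser


class FormStructure(HTMLParser):
    """Collects the action, method and named fields of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # form being read, or None outside any form

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed controls (e.g. bare submit buttons) send no data
                kind = (attrs.get("type") or "text") if tag == "input" else tag
                self._current["fields"].append((name, kind))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def form_structure(html):
    """Return a list of dictionaries, one per form found in `html`."""
    parser = FormStructure()
    parser.feed(html)
    return parser.forms
```

The extracted description could then be turned into a WSDL-like interface, with one input parameter per named field.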
Conceptual Model
IsA ontology of concepts (simple DAG):
Thing
  Person
    Man, Woman
  Publication
    Proceedings, Article, Book
n-ary typed roles:
AuthorOf(Person, Publication)
HasName(Person, Name)
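
The conceptual model can be sketched as a small data structure. The IsA edges below are read off the layout of the diagram (Thing at the root), which is an assumption, and the helper names `is_a` and `well_typed` are invented for this example.

```python
# IsA edges: concept -> set of direct parents (a DAG, so several parents are allowed)
IS_A = {
    "Person": {"Thing"},
    "Publication": {"Thing"},
    "Man": {"Person"},
    "Woman": {"Person"},
    "Proceedings": {"Publication"},
    "Article": {"Publication"},
    "Book": {"Publication"},
}

# Typed roles: role name -> tuple of expected argument concepts
ROLES = {
    "AuthorOf": ("Person", "Publication"),
    "HasName": ("Person", "Name"),
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or reaches it through IsA edges."""
    if concept == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in IS_A.get(concept, ()))


def well_typed(role, *arg_concepts):
    """True if the argument concepts fit the role's signature."""
    signature = ROLES[role]
    return len(arg_concepts) == len(signature) and all(
        is_a(arg, expected) for arg, expected in zip(arg_concepts, signature)
    )
```

For instance, `well_typed("AuthorOf", "Woman", "Article")` holds because Woman IsA Person and Article IsA Publication.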
Services and queries
Example: Service giving authors from publication titles
A* ← WrittenBy(P,A), HasTitle(P,T), Input(T)
Query: Service with no input
Example:
<A,T*>* ← WrittenBy(P,A), Article(P), HasTitle(P,T), KeywordOf("xml",P)
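
The two rules above can be held in a simple structure: a head plus a list of atoms, where a query is recognized by the absence of an Input atom. This is only a sketch. The class names are assumptions, and the head is kept as an opaque string because the slides do not spell out the output notation in full.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    predicate: str
    args: tuple


@dataclass(frozen=True)
class Service:
    head: str    # output description, kept verbatim
    body: tuple  # conjunction of atoms

    def inputs(self):
        """Variables bound by Input(...) atoms."""
        return {a.args[0] for a in self.body if a.predicate == "Input"}

    def is_query(self):
        """A query is a service with no input."""
        return not self.inputs()


authors_by_title = Service(
    head="A*",
    body=(
        Atom("WrittenBy", ("P", "A")),
        Atom("HasTitle", ("P", "T")),
        Atom("Input", ("T",)),
    ),
)

xml_articles = Service(
    head="<A,T*>*",
    body=(
        Atom("WrittenBy", ("P", "A")),
        Atom("Article", ("P",)),
        Atom("HasTitle", ("P", "T")),
        Atom("KeywordOf", ('"xml"', "P")),
    ),
)
```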
Semantic Interpretation of a Service
How to analyze a Web Service into this formalism?
- Field labels and variable names
- Example requests
- Concrete type descriptions
- Linguistic analysis of plain text descriptions
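
As a toy illustration of the first clue (field labels and variable names), the sketch below splits a label into words and looks each one up in a hand-written lexicon. Both the lexicon and the concept "Title" are hypothetical, introduced only for this example. The slides do not say how labels are actually matched to concepts, and the other three clues are not covered.

```python
import re

# Hypothetical lexicon: keyword -> concept of the conceptual model
LEXICON = {
    "author": "Person",
    "name": "Name",
    "title": "Title",
    "book": "Book",
    "article": "Article",
}


def tokens(label):
    """Split a label or variable name such as 'authorName' or 'book_title'."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", label)  # break camelCase
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]


def candidate_concepts(label):
    """Concepts suggested by the words of `label`, in order of appearance."""
    return [LEXICON[t] for t in tokens(label) if t in LEXICON]
```

A label with no known word, such as a bare `q`, yields no candidate, which is where example requests or type descriptions would have to take over.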
Web Service Indexing and Querying
Given a query, represented as an Analyzed Web Service, how to know which known Web Services to query?
Issues:
- Subsumption of input/output parameters
- Missing input parameters
- Composition of Web Services
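
The first two issues can be illustrated with a small matching routine. It rests on assumptions that are not in the slides: each service is reduced to a list of input concepts and one output concept, a service is a candidate when its output is subsumed by the requested output, and an input counts as provided when the query supplies a value of the same or a more specific concept. Composition of services is not handled.

```python
# Single-parent IsA edges, enough for this example
IS_A = {
    "Article": "Publication",
    "Book": "Publication",
    "Publication": "Thing",
    "Person": "Thing",
}


def subsumes(general, specific):
    """True if `specific` is `general` or one of its descendants."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False


def candidate_services(query, services):
    """Return (name, missing_inputs) for each service that may answer `query`."""
    matches = []
    for name, svc in services.items():
        if not subsumes(query["output"], svc["output"]):
            continue  # the service returns something other than what is asked
        missing = [
            needed for needed in svc["inputs"]
            if not any(subsumes(needed, given) for given in query["inputs"])
        ]
        matches.append((name, missing))
    return matches


SERVICES = {
    "authors_of_publication": {"inputs": ["Publication"], "output": "Person"},
    "books_by_person": {"inputs": ["Person"], "output": "Book"},
}
```

A non-empty list of missing inputs marks the cases where another service would have to supply the value first, which is where composition comes in.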
Summary
Web Service Semantic Interpretation Process
[Figure: complete pipeline]
World Wide Web → discovery → HTML forms, WSDL, UDDI → wrappers → WSDL → analysis → Analyzed Web Services → indexing → Web Services Index → querying (Queries in, Results out)