libwww - The W3C Protocol Library

Libwww, the W3C protocol library 29.06.2004

"Großes Schwerpunktseminar WI"
University of Applied Sciences Gießen-Friedberg

Stefan Sabatzki

Contents

1. Introduction

2. Structure libwww

3. Programming with libwww

4. Conclusion

Contents

1. Introduction
    – What is libwww?
    – Why libwww?

2. Structure libwww

3. Programming with libwww

4. Conclusion

What is libwww?

• Generic framework for building web applications
• Written in C
• Pluggable modularity
• Provides the most common Internet access methods
• Transmits data in many different media formats
• Handles data flow to and from the server

What is libwww? (2)

• First version implemented in 1992 by Tim Berners-Lee
• Development at CERN
• In 1994 libwww moved from CERN to W3C
• Released as open source in 1998
• As of September 2003 W3C stopped work on libwww
• As of January 2004 libwww officially belongs to the "Open Source Community"

Why libwww?

• Experimenting and prototyping
• Performance, modularity and extensibility
• Free and open source code
• Mailing lists and active community

Contents

1. Introduction

2. Structure libwww
    – Design Model
    – Request/Response Paradigm
    – Data Flow
    – Threads, Event Loops and Filters
    – Modules as State Machines

3. Programming with libwww

4. Conclusion

Design Model

• Layering as design model

Design Model (2)

• A more illustrative view

Request/Response Paradigm

• Application issues a request
• Libwww fulfills the request
• Result is presented to the application on arrival
• Simultaneous requests are handled by the library core
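The request/response idea above can be sketched in plain C without libwww itself. This is only an illustration under stated assumptions: Core, core_issue, core_complete and remember_status are hypothetical names invented here, not libwww functions, and the "response" is delivered by hand rather than read from a socket.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 4

/* Hypothetical completion callback: called when a response has arrived. */
typedef void (*DoneCallback)(int id, int status, void *user);

typedef struct {
    int in_use;
    int id;
    DoneCallback done;
    void *user;
} Pending;

/* Hypothetical "library core": tracks several requests at the same time. */
typedef struct {
    Pending slots[MAX_PENDING];
    int next_id;
} Core;

void core_init(Core *core)
{
    assert(core != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++)
        core->slots[i].in_use = 0;
    core->next_id = 1;
}

/* The application issues a request and gets control back immediately. */
int core_issue(Core *core, DoneCallback done, void *user)
{
    assert(core != NULL && done != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        Pending *p = &core->slots[i];
        if (!p->in_use) {
            p->in_use = 1;
            p->id = core->next_id++;
            p->done = done;
            p->user = user;
            return p->id;
        }
    }
    return -1;                      /* no free slot */
}

/* Called when the response for `id` arrives; hands it to the application. */
int core_complete(Core *core, int id, int status)
{
    assert(core != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        Pending *p = &core->slots[i];
        if (p->in_use && p->id == id) {
            p->in_use = 0;
            p->done(id, status, p->user);
            return 0;
        }
    }
    return -1;                      /* unknown or already completed */
}

/* Example application callback: stores the status it was given. */
void remember_status(int id, int status, void *user)
{
    (void) id;
    *(int *) user = status;
}
```

Two requests can be issued back to back with core_issue and completed in any order with core_complete; each callback only sees the result of its own request.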

Data Flow

• Streams are used to transport data
• Derived from generic stream
    – Protocol streams
    – Converters
    – Presenters
    – I/O streams
    – Basic streams

Data Flow (2)

• Structured streams
    – Derived from generic stream
    – Accept a structured document
    – Ordered tree-structured arrangement of data
    – Each instance is associated with an SGML parser
    – Each instance is associated with a corresponding DTD

Data Flow (3)

• Cascaded streams
    – Stream chains
    – Setup before data arrives

Data Flow (4)

– Setup after data arrives
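A minimal sketch of a cascaded stream chain, written in plain C rather than against the real libwww stream interface. The Stream struct, BufferSink, sink_init and upper_init are hypothetical names chosen for this illustration. The chain (converter, then sink) is set up before any data arrives, which corresponds to the first of the two setup variants.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic stream: one method pointer plus an optional target. */
typedef struct Stream Stream;
struct Stream {
    int (*put_block)(Stream *self, const char *data, size_t len);
    Stream *target;               /* next stream in the chain, or NULL */
};

/* Basic stream: collects everything it receives in a fixed buffer. */
typedef struct {
    Stream base;                  /* first member, so BufferSink* casts to Stream* */
    char buf[256];
    size_t used;
} BufferSink;

static int sink_put(Stream *self, const char *data, size_t len)
{
    BufferSink *sink = (BufferSink *) self;
    if (len >= sizeof sink->buf - sink->used)
        return -1;                /* would overflow: report an error */
    memcpy(sink->buf + sink->used, data, len);
    sink->used += len;
    sink->buf[sink->used] = '\0';
    return 0;
}

void sink_init(BufferSink *sink)
{
    assert(sink != NULL);
    sink->base.put_block = sink_put;
    sink->base.target = NULL;
    sink->used = 0;
    sink->buf[0] = '\0';
}

/* Converter: upper-cases each block and passes it on to its target. */
static int upper_put(Stream *self, const char *data, size_t len)
{
    char tmp[64];
    while (len > 0) {
        size_t n = len < sizeof tmp ? len : sizeof tmp;
        for (size_t i = 0; i < n; i++)
            tmp[i] = (char) toupper((unsigned char) data[i]);
        if (self->target->put_block(self->target, tmp, n) != 0)
            return -1;
        data += n;
        len -= n;
    }
    return 0;
}

void upper_init(Stream *conv, Stream *target)
{
    assert(conv != NULL && target != NULL);
    conv->put_block = upper_put;
    conv->target = target;
}
```

The caller only ever writes to the head of the chain; each stream decides what to forward to its target, so converters can be inserted or removed without touching the code that produces the data.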

Threads, Event Loops and Filters

• Not thread-safe
• Implements pseudo-thread model
    – Uses non-blocking sockets
    – Based on callback functions
• Before/After filters
    – Global and local filters
    – Registered at runtime
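The before/after filter idea can be sketched as two lists of callbacks that are filled at runtime and run around the actual request. This is a plain C illustration, not the libwww filter API: FilterList, filter_add_before, filter_add_after, run_request and the sample callbacks are hypothetical. The rule that a failing before filter skips the request while after filters still run is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILTERS 8

/* Hypothetical request record; a real request carries far more state. */
typedef struct {
    const char *url;
    int status;                   /* set when the request is served */
    int trace;                    /* counts after filters, for demonstration */
} Request;

/* A filter returns 0 to continue and non-zero to stop. */
typedef int (*Filter)(Request *req);

typedef struct {
    Filter before[MAX_FILTERS];
    size_t n_before;
    Filter after[MAX_FILTERS];
    size_t n_after;
} FilterList;

/* Filters are registered at runtime by appending them to a list. */
int filter_add_before(FilterList *fl, Filter f)
{
    assert(fl != NULL && f != NULL);
    if (fl->n_before == MAX_FILTERS)
        return -1;
    fl->before[fl->n_before++] = f;
    return 0;
}

int filter_add_after(FilterList *fl, Filter f)
{
    assert(fl != NULL && f != NULL);
    if (fl->n_after == MAX_FILTERS)
        return -1;
    fl->after[fl->n_after++] = f;
    return 0;
}

static int run_list(const Filter *list, size_t n, Request *req)
{
    for (size_t i = 0; i < n; i++) {
        int rc = list[i](req);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Run before filters, then the request itself, then the after filters. */
int run_request(const FilterList *fl, Request *req, int (*serve)(Request *))
{
    assert(fl != NULL && req != NULL && serve != NULL);
    int rc = run_list(fl->before, fl->n_before, req);
    if (rc == 0)
        rc = serve(req);
    run_list(fl->after, fl->n_after, req);   /* always run, e.g. for logging */
    return rc;
}

/* Sample callbacks used to exercise the sketch. */
int reject_empty_url(Request *req)
{
    return (req->url == NULL || req->url[0] == '\0') ? -1 : 0;
}

int count_after(Request *req)
{
    req->trace++;
    return 0;
}

int fake_serve(Request *req)
{
    req->status = 200;
    return 0;
}
```

A global filter would live in one shared FilterList, while a local filter would be kept in a list attached to a single request; the sketch shows only the shared case.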

Threads, Event Loops and Filters (2)

Modules as State Machines

• Since libwww 3.0
• Protocol modules implemented as state machines
• Part of the thread model
• Keep track of the current state in the communication interface

Modules as State Machines (2)

Contents

1. Introduction

2. Structure libwww

3. Programming with libwww
    – C++ Simulation
    – APIs and Library Interfaces
    – Simple Example
    – More Complex Example

4. Conclusion

C++ Simulation

• Construction/destruction
    – *_new / *_delete (HTRequest_new / HTRequest_delete)
• Data hiding
• Inheritance
    – Explicit pointer casting
• PRIVATE, PUBLIC macros
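The C++-like conventions listed above can be illustrated with a small plain C sketch. Item and LinkItem are hypothetical types invented for this example, and the two macro definitions are an assumption about how such PUBLIC/PRIVATE markers are commonly defined, not a quotation from the libwww headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definitions: PUBLIC is a marker only, PRIVATE means file-local. */
#define PUBLIC
#define PRIVATE static

/* "Base class". In a real package only a forward declaration would be
   exported, which hides the fields from callers. */
typedef struct Item {
    char *name;
} Item;

/* "Derived class": the base is the first member, so a LinkItem* can be
   cast explicitly to an Item*. */
typedef struct LinkItem {
    Item base;
    int visits;
} LinkItem;

/* File-local helper, invisible to other modules. */
PRIVATE char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* "Constructor", following the *_new naming pattern. */
PUBLIC LinkItem *LinkItem_new(const char *name)
{
    assert(name != NULL);
    LinkItem *me = calloc(1, sizeof *me);
    if (me == NULL)
        return NULL;
    me->base.name = copy_string(name);
    if (me->base.name == NULL) {
        free(me);
        return NULL;
    }
    return me;
}

/* "Destructor", following the *_delete naming pattern. */
PUBLIC int LinkItem_delete(LinkItem *me)
{
    if (me == NULL)
        return 0;
    free(me->base.name);
    free(me);
    return 1;
}

/* Base-class "method": works on anything that starts with an Item. */
PUBLIC const char *Item_name(const Item *item)
{
    assert(item != NULL);
    return item->name;
}

/* Derived-class "method". */
PUBLIC int LinkItem_visit(LinkItem *me)
{
    assert(me != NULL);
    return ++me->visits;
}
```

A caller would write Item_name((Item *) link) to use a base-class function on a derived object; the explicit cast is the price of emulating inheritance in C.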

APIs and Library Interfaces

• Set of APIs called packages
• Win32: DLLs
• Unix: separate static libraries
• Package interface exported via single include file: WWW*.h
• Some important packages
    – Basic Utility Packages
    – Core Packages
    – Initialization Packages
    – Transport Packages
    – Protocol Packages
    – Parser Packages

Simple Example

• Displays all links in a document
• Applicable to text, HTML/XML tags, etc.

// snippet
...
HText_registerLinkCallback(foundLink);
...
HTEventList_loop(request);
...
foundLink (...) {
    HTAnchor * dest = HTAnchor_followMainLink(...);
    char * address = HTAnchor_address(dest);
    HTPrint("Found link `%s\'\n", address);
    HT_FREE(address);
}

More Complex Example

• Rudimentary command-line browser
• See project www.dsw

Contents

1. Introduction

2. Structure libwww

3. Programming with libwww

4. Conclusion
    – What's missing?
    – Facts about libwww
    – Personal Opinion

What's missing?

• Not thread-safe
• No cookie jar, only cookie parsing/generation
• Consistent usage of RegEx
• C++ representation

Facts about libwww

• Who uses libwww? No one?
• Sample applications on project homepage
• No reviews, benchmarks, comparisons
• Not 'bug free'
• 'Competitors' (mostly UNIX)
    – WinInet
    – Libghttp
    – Libcurl
    – Libhttp
    – Neon

Personal Opinion

• Typical open source project
• Tricky installation
• 'Feels' old <-> IS old
• Desperate attempt to emulate OOP
• Non-trivial usage, but very flexible and powerful

Thank you for your attention

?