Granularity: Subdocuments in Ensemble
Lois Delcambre, Dave Maier,
Dave Archer, Jeremy Steinhauer
Kelson Luc, Vikram Ramesh
with assistance from Va. Tech
Outline
1. A vision
2. What we have implemented
3. A demo
4. Questions/issues
The Authoring Experience
1. Find (in Ensemble)
2. Source
3. New work (mashup)
4. Save & ingest (into Ensemble)
Q: How many source types do we want to support?
Q: How many different “mashup” tools do we want?
Basic Authoring Needs
• Search/browse repository for documents
• Identify resources (docs/subdocs/mashups) of interest; place them in a “workspace”
• Create new works (mashups) from pre-existing documents or subdocuments and original material
• Store new works (mashups) in repository in appropriate format: .PPT, .DOC, .HTM, ...
  (Keep track of the subdocuments & how/where they were used.)
Features
With subdocuments in the repository, we can:
• Automatically generate citation lists
• Explain where information came from:
  – Show bibliographic details of sources
  – Show subdocuments in original context
• Explain how information is used by others: “Have I (or anyone else) used this question in an exam before?”
• Note: subdocuments can overlap; we need recursive processing.
Make this easy to use:
• Use familiar tools to create mashups
• Use copy-and-paste or other mashup-creation mechanisms
• Don’t introduce unnecessary additional mouse clicks
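The citation-list feature and the need for recursive processing can be illustrated with a minimal sketch. The `Resource` class, its fields, and the function names below are hypothetical and not part of the Ensemble implementation; the sketch assumes each subdocument records its parent and each mashup records the subdocuments it uses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    """A document, subdocument, or mashup (hypothetical model)."""
    ident: str
    title: str
    parent: Optional["Resource"] = None  # set when this is a subdocument
    references: list["Resource"] = field(default_factory=list)  # subdocs used by a mashup


def root_document(resource: Resource) -> Resource:
    """Follow parent links until reaching a top-level document or mashup."""
    while resource.parent is not None:
        resource = resource.parent
    return resource


def citation_list(mashup: Resource) -> list[str]:
    """Collect titles of all sources, recursing when a source is itself a mashup."""
    cited: dict[str, str] = {}  # ident -> title, in order of first use

    def visit(res: Resource) -> None:
        for sub in res.references:
            root = root_document(sub)
            if root.ident in cited:
                continue  # already cited; also prevents endless recursion
            cited[root.ident] = root.title
            visit(root)  # the source may have drawn on other sources

    visit(mashup)
    return list(cited.values())
```

In this sketch, an exam that reuses a question from a quiz, where the quiz in turn drew on lecture notes, would cite both the quiz and the lecture notes.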
For one Subdocument
“Where did this come from and where else did it go?”
Show this subdoc's:
• Metadata
• Provenance
• Parent context
• Other known uses
• Overlap with others
• Review details of the subdoc enclosing this selection
• Review details of parent/ancestor documents
• Show other documents that incorporate enclosing subdoc
• See enclosing subdoc in context of original/parent document
• Show overlapping selections in repository
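One way to find overlapping selections is to treat each subdocument as a character span within its parent document. This is a sketch under that assumption only; the `Selection` class and the offset-based model are hypothetical, and the slides do not say how subdocument boundaries are actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """A subdocument as a character span in a parent document (hypothetical model)."""
    subdoc_id: str
    parent_id: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def overlaps(a: Selection, b: Selection) -> bool:
    """Two selections overlap if they share a parent and their spans intersect."""
    return a.parent_id == b.parent_id and a.start < b.end and b.start < a.end


def overlapping_selections(target: Selection, repository: list[Selection]) -> list[Selection]:
    """Return every other selection in the repository that overlaps the target."""
    return [s for s in repository
            if s.subdoc_id != target.subdoc_id and overlaps(target, s)]
```

Selections that merely touch (one ends exactly where the other starts) are not counted as overlapping here, because the end offset is exclusive.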
For Resources (including mashups)
Show this document's:
• Parent doc
• Derived subdocs
• Referenced subdocs
• Mashups that this doc contributed to
Save & ingest into Ensemble: (save this document & referenced subdocs)
• Review details of parent document (if this doc is a subdocument)
• Show documents that use content from this one
• Show pieces of other docs included here
• Show the pieces of this doc that were used elsewhere
Added Value While Searching / Browsing Repository
Show doc/subdoc hierarchy in result:
• Subdocs
• Parents
• Instances of use
See document relationships in repository search results
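As a rough illustration of showing a doc/subdoc hierarchy in a search result, the sketch below renders an indented tree. The `children` mapping and the identifiers are hypothetical; a real result page would obtain this structure from the repository, and the sketch assumes the hierarchy contains no cycles.

```python
def render_hierarchy(doc_id, children, depth=0):
    """Return indented lines for a document followed by its nested subdocuments.

    `children` maps an identifier to the list of its direct subdocuments.
    """
    lines = ["  " * depth + doc_id]
    for child in children.get(doc_id, []):
        lines.extend(render_hierarchy(child, children, depth + 1))
    return lines


# Hypothetical search hit with two subdocuments, one of them nested further.
children = {"doc:1": ["sub:1", "sub:2"], "sub:1": ["sub:3"]}
print("\n".join(render_hierarchy("doc:1", children)))
```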
Relationships
[Figure: relationship diagrams using HasSubDoc, HasParent, References, and UsedIn]
• Types: doc/subdoc, mashup/subdoc
• Recursive: doc/subdoc, mashup/subdoc
• Using both relationships
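The four relationship names can be sketched as a small in-memory store. The pairing of HasSubDoc with HasParent and of References with UsedIn as inverses is an assumption read off the slide, and the `RelationshipStore` class is hypothetical rather than the Ensemble data model. The sketch also assumes each subdocument has at most one parent.

```python
from collections import defaultdict

# Assumed pairing: each relationship is recorded together with its inverse.
INVERSE = {
    "HasSubDoc": "HasParent",
    "HasParent": "HasSubDoc",
    "References": "UsedIn",
    "UsedIn": "References",
}


class RelationshipStore:
    def __init__(self):
        self._edges = defaultdict(set)  # (subject, relationship) -> set of objects

    def add(self, subject, relationship, obj):
        """Record a relationship and its inverse in one step."""
        self._edges[(subject, relationship)].add(obj)
        self._edges[(obj, INVERSE[relationship])].add(subject)

    def related(self, subject, relationship):
        """Return the sorted identifiers related to `subject`."""
        return sorted(self._edges.get((subject, relationship), ()))

    def ancestors(self, subdoc):
        """Follow HasParent links upward (recursive doc/subdoc case)."""
        chain, current = [], subdoc
        while True:
            parents = self.related(current, "HasParent")
            if not parents or parents[0] in chain:  # stop at the root or on a cycle
                return chain
            current = parents[0]
            chain.append(current)
```

With a store like this, "where did this come from" maps to `ancestors`, and "where else did it go" maps to `related(subdoc, "UsedIn")`.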
Tool Architecture
[Figure: tool architecture diagram]
Components: Authoring Application, Browsing Application, Source Document, Target Document, subdoc workspace, Ensemble Repository
Operations:
• Copy subdocument & metadata
• Paste subdocument
• GetInfo
• Ingest subdocuments
• Fetch relationship data on subdocument
• Fetch relationship data on pre-existing subdocs
Current Implementation
We have “assembly language” for our vision:
• subdocument selection using Copy
– In MSWord, OpenOffice Writer; creates 3 streams for Fedora (FOXML, text, subdoc)
• subdocument ingestion to Fedora including relationship creation
• Fedora search showing parent/child
• Note: we are using existing Fedora relationships IsPartOf, HasPart (we need subdocument-specific relationships for final implementation)
• No mashup creation (yet)
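As an illustration of relationship creation at ingest, the sketch below builds the part/whole statements for one subdocument as N-Triples-style strings. It assumes the Fedora external relations namespace and the `info:fedora/` URI form for object identifiers; the PIDs are hypothetical, and the actual ingest code may express these relationships differently.

```python
# Assumed namespace of the Fedora relationship ontology.
FEDORA_REL_NS = "info:fedora/fedora-system:def/relations-external#"


def fedora_uri(pid):
    """Turn a repository PID such as 'demo:doc1' into an info:fedora URI."""
    return f"info:fedora/{pid}"


def part_whole_triples(parent_pid, subdoc_pid):
    """Return the isPartOf/hasPart statements linking a subdocument and its parent."""
    parent, subdoc = fedora_uri(parent_pid), fedora_uri(subdoc_pid)
    return [
        f"<{subdoc}> <{FEDORA_REL_NS}isPartOf> <{parent}> .",
        f"<{parent}> <{FEDORA_REL_NS}hasPart> <{subdoc}> .",
    ]


# Hypothetical PIDs for a parent document and one subdocument.
for triple in part_whole_triples("demo:doc1", "demo:sub1"):
    print(triple)
```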
Selection using digital pens
Summer project for a high school intern (Kelson Luc)
Worked with Anoto digital pens, which have a digital camera near the tip and are used with special paper printed with a dot pattern.
Print a PDF file on the dotted paper.
Circle the selected text with the pen.
Write a comment – in the margin – with the pen.
We are lacking software from the pen manufacturer to put the whole system together.
Subdocument Selection
Insert movie here
Subdocument Ingestion
Insert movie here
Subdocument: .doc file
Subdocument: text file
I wish I had never concealed it. For I, and I only, know what manner of fear lurked on that spectral and desolate mountain. In a small motor-car we covered the miles of primeval forest and hill until the wooded ascent checked it. The country bore an aspect more than usually sinister as we viewed it by night and without the accustomed crowds of investigators, so that we were often tempted to use the acetylene headlight despite the attention it might attract. It was not a wholesome landscape after dark, and I believe I would have noticed its morbidity even had I been ignorant of the terror that stalked there.
Subdocument: image
Searches With Relationships
Insert movie here
Discussion
• Will Ensemble users want to download/use our plug-ins, e.g., for MS Word?
• Which “mashup” creation tool should we provide?
• Where should subdocuments be stored?
• (Metadata records for) subdocuments, their parent documents, and their mashups need to be in the same repository.
• We could extend the browse/search interface for Ensemble to be subdocument-aware.
An Idea for the November launch
Develop an example – with documents from Ensemble that are used to create one or more mashup documents. (This will induce some number of subdocuments … with their relationships.)
Use it to demonstrate how the rewards could track use of materials across mashups.
Let people browse and search the example.