Date posted: 30-Jul-2015
Category: Education
Uploaded by: giovanni-colavizza
A brief introduction to the Venice Time Machine
Giovanni Colavizza EPFL
Who am I
Giovanni Colavizza, PhD student in Management of Technology, chair of Digital Humanities, EPFL
Previously: Computer Science, History, Archival and Library Sciences, two start-ups and some positions in IT and research.
Today
Venice Time Machine
1- Vision (where to go)
2- Pipeline (how) and Projects (what)
3- Methods and DH in context (or why, and how again)
VTM Vision
Preservation (from analog to digital)
Access (from browsing to searching)
Valorisation by use
Preservation
Digitisation and replication as a preservation strategy. Quite complicated:
1- metadata (digital provenance)
2- replication protocols: IT infrastructure (centralised vs distributed)
3- rights and partners’ needs (far-away goal of open access for public heritage)
Access
An Information System down to contents:
Valorisation
1- research
2- teaching
3- digital reconstruction and outreach
4- technology transfer
5- methodology transfer
Pipeline illustrated by projects
1. Digitisation (Tomography)
2. Image processing (Pre-processing Suite)
3. Content extraction (Automatic transcription, READ Project)
4. Information modelling (Garzoni Project)
5. Building an information system (Document Viewer)
6. Content enrichment and network effects (Linked Books Project)
7. Valorisation and use (GIS, digital experiences, …)
Tomography
Fauzia Albertin EPFL
Image pre-processing suite
Andrea Mazzei ODOMA
Video pt. 1
Semi-automatic transcription
or the Big Data quest for script family resemblances
READ Horizon 2020 project: 8.2 million €, 7 partners, maximum peer reviewers’ score.
Opt. 1: Alignment
ello stara en carcere domentre chel fara queste chose opagera. Et etiam deo stagando collui encarcere se sauera la che sia dellauer de collui lodoxe comandera chello sia entromesso edara sse allo so credetor. Et etiam deo selo creditor uora enuestir lapprietade del debitor enquella fia da alcreditor sera data en uestixon. Mosella femena che none maritata sera 9depnata segon do che desoura edito tuto se fara segondo che nui auemo soura dito delomo remetuda questa cho sa chello stara enlo teratorio de san ҫacharia e
Fouad Slimane EPFL
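The slides do not describe how the alignment is computed. As a minimal sketch of the general idea only, the Python below aligns an existing transcription to word images that have already been segmented, using dynamic programming. Everything in it is an assumption made for illustration: the function name `align_words_to_segments`, the pixel widths, the `px_per_char` scale and the rule that one image segment may cover up to `max_run` transcript words (scribes often join words). It is not the method used in the project.

```python
def align_words_to_segments(widths, words, px_per_char=10.0, max_run=3):
    """Assign each image segment a consecutive run of 1..max_run transcript words.

    widths: pixel widths of the segmented word images, in reading order (hypothetical input).
    words:  the known transcription, split into tokens.
    The cost of a match is the gap between the character count suggested by the
    segment width and the character count of the run of words assigned to it.
    """
    n, m = len(widths), len(words)
    inf = float("inf")
    # cost[i][j]: best cost of explaining the first i segments with the first j words
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        expected_chars = widths[i - 1] / px_per_char
        for j in range(1, m + 1):
            for k in range(1, min(max_run, j) + 1):
                previous = cost[i - 1][j - k]
                if previous == inf:
                    continue
                run_chars = sum(len(w) for w in words[j - k:j])
                candidate = previous + abs(expected_chars - run_chars)
                if candidate < cost[i][j]:
                    cost[i][j] = candidate
                    back[i][j] = k
    if cost[n][m] == inf:
        raise ValueError("no monotonic alignment exists for these inputs")
    # Walk the back-pointers from the end to recover the run given to each segment.
    runs = []
    j = m
    for i in range(n, 0, -1):
        k = back[i][j]
        runs.append(words[j - k:j])
        j -= k
    runs.reverse()
    return runs


# Hypothetical example: six transcript words, five segmented word images.
transcript = "Et etiam deo stagando collui encarcere".split()
segment_widths = [20, 50, 110, 60, 90]  # invented pixel widths
alignment = align_words_to_segments(segment_widths, transcript)
for width, run in zip(segment_widths, alignment):
    print(width, " ".join(run))
```

A width-only cost is deliberately crude. A real system would compare visual features of the script, but the monotonic, left-to-right structure of the search would stay the same.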
Opt. 2: Word spotting and Neural Networks
Andrea Mazzei ODOMA
Video pt. 2
Information modelling
Garzoni Project Lille University and EPFL
ANR+FNS funded
Valentina Sapienza Lille
Maud Ehrmann EPFL
Information system
Fabio Bortoluzzi EPFL
Not all documents connect to each other in the same way.
Fiscal declarations (for taxation)
Personal acts (contracts, testaments, etc.)
State machinery (office holding)
How did Venetians index this information?
Real estate surveys
Fiscal declarations
Testaments
Entities
Indexes
Documents
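One way to picture the three layers named above (entities, indexes, documents) is as linked records. The sketch below is an assumption for illustration only: the class names, fields and sample identifiers are invented and do not describe the schema of the actual information system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    kind: str  # e.g. "testament" or "fiscal declaration"


@dataclass(frozen=True)
class IndexEntry:
    index_name: str  # the historical index the entry was found in
    label: str       # the name as written in that index
    doc_id: str      # the document the entry points to


@dataclass
class Entity:
    entity_id: str
    name: str
    entries: list = field(default_factory=list)  # index entries attributed to this entity


def documents_for(entity, documents):
    """Follow an entity's index entries down to the documents they point to."""
    by_id = {d.doc_id: d for d in documents}
    return [by_id[e.doc_id] for e in entity.entries if e.doc_id in by_id]


# Hypothetical sample data.
docs = [Document("T-1", "testament"), Document("F-7", "fiscal declaration")]
person = Entity("P-42", "Zuane (hypothetical person)")
person.entries.append(IndexEntry("index of testaments", "Zuane", "T-1"))
person.entries.append(IndexEntry("index of fiscal declarations", "Zuanne", "F-7"))

for doc in documents_for(person, docs):
    print(doc.doc_id, doc.kind)
```

Keeping the index entry as its own record preserves the spelling found in each historical index, so variant forms of a name remain traceable after they are grouped under a single entity.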
Information system
Orlin Topalov EPFL
Content enrichment
Linked Books Project EPFL, Ca’ Foscari, Marciana
FNS funded
Approx. half of the citations in humanities are to primary sources [Wiberley (2009)].
Their use has hardly ever been studied with citation analytic methods.
Network effects: directly link scholarship with primary sources.
• Primary and secondary sources
• Citation history (e.g. Google Scholar)
• Citation semantics
• Algorithmic History of the History of Venice
Network-based models. Remember primary and secondary sources: how many graphs can we build?
Bibliographic coupling and co-citation
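Both measures can be computed from a plain list of who cites what. In the standard definitions, two citing works are bibliographically coupled when they share references, and two cited items are co-cited when the same work cites both. The sketch below assumes a hypothetical dictionary mapping each citing work to the set of items it cites, where cited items may be primary sources (archival documents) or other books. The identifiers are invented.

```python
from collections import Counter
from itertools import combinations


def bibliographic_coupling(citations):
    """Count shared references for every pair of citing works."""
    coupling = Counter()
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            coupling[(a, b)] = shared
    return coupling


def co_citation(citations):
    """Count, for every pair of cited items, how many works cite both."""
    cocited = Counter()
    for references in citations.values():
        for x, y in combinations(sorted(references), 2):
            cocited[(x, y)] += 1
    return cocited


# Hypothetical citation data: citing work -> set of cited items.
citations = {
    "book_A": {"archive_doc_1", "archive_doc_2", "book_C"},
    "book_B": {"archive_doc_1", "archive_doc_2"},
    "book_C": {"archive_doc_2"},
}

print(bibliographic_coupling(citations))
print(co_citation(citations))
```

Because the cited items can be primary or secondary sources, restricting the reference sets to one kind or the other gives a different graph each time, which is one way to read the question of how many graphs can be built.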
Content enrichment: multiple perspectives
Valorisation: some examples
Immersive reality
GIS and 3D virtual reconstructions
Teaching and interdisciplinary collaborations
Replication and transfer
VTM in the context of DH
1- The Big vs Small Data debate, or a proposal for reframing
2- The quest for evidence of value, or overcoming the DH drudgery conundrum
3- Humanities in the digital era, or why we need historians more than ever ;)
The Big vs Small Data debate, or a proposal for reframing
Big Data (for Humanities):
1- a matter of dimensions (in TB or PB)
2- networked, relational vs well-bounded (Kaplan 2015)
3- Telescope vs Microscope
“Data” are not big or small per se, but are so according to the observer. Do I want to aggregate or disaggregate? Do I have “larger” or “smaller” questions?
Macro, Meso, Micro
The quest for evidence of value, or overcoming the DH drudgery conundrum
Tool-building is not an end in itself. Developing tools to answer old questions should lead to new questions and perspectives. The great quest in DH now is for new arguments.
Humanities in the digital era, or why we need historians more than ever ;)
“historians are fundamentally in the business of taking complex, incomplete sources that are full of biases and errors, and interpreting them critically to develop an argument that answers a research question. Digital sources do not change this.”
Ian Gregory
“Data of different kinds must be understood in their historical relationship.”
Historians as critical arbiters of information trained to work with time (“comparative modelling of multiple variables over time” in jargon).
Thank you
“Computers are incredibly fast, accurate and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination.” Albert Einstein (or was it someone else??)