
Seminario Maurizio Agelli, 20-09-2012

Page 1: Seminario Maurizio Agelli, 20-09-2012

Archiving and Cataloging Digital Photographs

Maurizio Agelli, CRS4

{ [email protected] }

September 20th 2012, 5.30pm

Aula Magna Facoltà di Architettura - Via Corte d'Appello - Cagliari

Page 2: Seminario Maurizio Agelli, 20-09-2012

Point de vue du Gras, Nicéphore Niépce, 1826 (from Wikimedia Commons)

Page 3: Seminario Maurizio Agelli, 20-09-2012

Boulevard du Temple, Louis Daguerre, 1838 (from Wikimedia Commons)

Page 4: Seminario Maurizio Agelli, 20-09-2012

The first photograph was taken less than 200 years ago ...

How many photos have ever been taken?

Page 5: Seminario Maurizio Agelli, 20-09-2012

500 to 800 billion photos taken in 2011 [ source: Observatoire des Professions de l'Image ]

Number of photos ever shot (up to 2011): ~3.5 x 10^12 [ source: Jonathan Good, 2011 - 1000memories.com ]

Page 6: Seminario Maurizio Agelli, 20-09-2012

Presentation Outline

1) Archiving as part of the photographic workflow

2) Describing photographs: metadata

3) Organizing images in catalogs

4) Ensuring long-term storage: backup and migration

5) An overview of image archiving tools

6) A Digital Asset Management platform developed at CRS4

Page 7: Seminario Maurizio Agelli, 20-09-2012

- 1 -

Archiving as part of the photographic workflow

Page 8: Seminario Maurizio Agelli, 20-09-2012

Photo Archive

A collection of images kept in secure, long-term storage.

[ dpBestflow.org ]

Photo by Seeweb - CC BY-SA 2.0

Photo by M.Agelli - CC BY-SA 2.0

Page 9: Seminario Maurizio Agelli, 20-09-2012

Building a digital photo archive involves many decisions ...

○ File formats
○ Metadata
○ File naming
○ Folder structure
○ Catalog organization
○ Backup policies
○ Migration policies
○ Archiving platform
○ What to archive?

... which strongly depend on the photographic workflow

Page 10: Seminario Maurizio Agelli, 20-09-2012

A general workflow

Capture Ingestion Working Publishing

Archive

No single workflow suits all photographers and all clients [UPDIG]

Workflow decisions are determined by production volume, turnaround, image quality requirements, regulations, costs, etc.

Page 11: Seminario Maurizio Agelli, 20-09-2012

A general workflow, more in detail

Capture Ingestion Working Publishing

Archive

Capture (camera)
- All camera-related stuff

Ingestion (computer): focus on volume and speed
- Image transfer
- File renaming
- Add bulk metadata
- Batch editing
- Format conversion

Working: focus on quality
- Image editing
- Metadata editing
- Create derivative work

Publishing
- Export images
- Print images
- Publish to web

Archive (Digital Asset Management Platform)
- Store, search, organize, ...

Page 12: Seminario Maurizio Agelli, 20-09-2012

File formats / 1

Camera sensor → In-camera processing → RAW, JPEG, (DNG), (TIFF)

Film → Scanner → TIFF, JPEG, (DNG)

RAW
Many RAW formats (>200).
Proprietary, undocumented.
Encodes values from camera sensor, before demosaicing (12-16 bit/pixel, 1 color/pixel).
Lossless. May be compressed.

TIFF
Open standard.
8, 16, 32 bit RGB.
Lossless, big file size!
Possible PSD replacement (supports layers).

DNG (DIGITAL NEGATIVE)
Open standard, created by Adobe.
Targeted to replace RAW, but still limited adoption by the industry.

Page 13: Seminario Maurizio Agelli, 20-09-2012

File formats / 2

JPEG
Open standard
Compressed, lossy
8 bit RGB: suitable for displaying, not good for editing

JPEG 2000
Better compression than JPEG (wavelet transform vs. cosine transform)
8, 16 bit RGB
Lossless / lossy
Many extra features: regions of interest, progressive decoding, multi-resolution decoding.

Example: 6 Mpixel image (Nikon D40)

Format | Encoding | File size
TIFF | 48 bit / pixel, uncompressed | ~35 MB
NEF | 12 bit / pixel, compressed | ~5.3 MB
DNG | 12 bit / pixel, compressed | ~5 MB
JPEG | 90% quality | ~0.6 MB
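The size of the uncompressed 48-bit TIFF in this example can be checked with simple arithmetic. The sketch below is only an illustration: the 3008 x 2000 resolution is an assumption for a 6 Mpixel sensor (it is not given on the slide), and the function name is hypothetical. Headers and metadata are ignored.

```python
def uncompressed_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the bare pixel payload, ignoring file headers and metadata."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed resolution for a 6 Mpixel sensor; not stated in the slides.
WIDTH, HEIGHT = 3008, 2000

rgb_48bit = uncompressed_size_bytes(WIDTH, HEIGHT, 48)  # 3 channels x 16 bit
raw_12bit = uncompressed_size_bytes(WIDTH, HEIGHT, 12)  # 1 sample per pixel

print(f"48-bit RGB, uncompressed: {rgb_48bit / 1e6:.1f} MB")          # 36.1 MB
print(f"12-bit sensor data, uncompressed: {raw_12bit / 1e6:.1f} MB")  # 9.0 MB
```

Under these assumptions the 48-bit figure lands close to the ~35 MB quoted for the TIFF, while the 12-bit sensor data would take about 9 MB before compression, which suggests how much of the ~5 MB RAW and DNG sizes is due to compression.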

Page 14: Seminario Maurizio Agelli, 20-09-2012

File formats and image editing

[Diagram: CAMERA → (RAW) → PARAMETRIC EDITING (RAW or DNG) → EXPORT → JPG; PARAMETRIC EDITING → (TIFF or DNG) → RASTER EDITING (TIFF or DNG) → EXPORT → JPG]

Parametric Image Editing
Image data are not modified. Source file is preserved.
Editing is saved as a list of rules which are applied at rendering time.
(e.g. Lightroom, Aperture)

Raster Image Editing
Image pixels are modified.
A new file containing the edited image shall be saved in order to preserve the original.
(e.g. Photoshop, Picture Window Pro)
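The idea behind parametric editing, an untouched source plus a list of rules applied at rendering time, can be sketched in a few lines. This is a toy model and not how Lightroom or Aperture are implemented: the class `ParametricEdit` and the rules `brightness` and `grayscale` are hypothetical names, and an image is just a nested list of RGB tuples.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Rule = Callable[[Pixel], Pixel]


def _clamp(value: float) -> int:
    """Keep a channel value inside the 8-bit range."""
    return max(0, min(255, round(value)))


def brightness(delta: int) -> Rule:
    """Rule that adds a constant to every channel."""
    return lambda p: tuple(_clamp(c + delta) for c in p)


def grayscale() -> Rule:
    """Rule that replaces every channel with the channel average."""
    def rule(p: Pixel) -> Pixel:
        g = _clamp(sum(p) / 3)
        return (g, g, g)
    return rule


class ParametricEdit:
    """Source pixels plus an ordered list of rules; the source is never modified."""

    def __init__(self, source: Image):
        self.source = source
        self.rules: List[Rule] = []

    def add(self, rule: Rule) -> "ParametricEdit":
        self.rules.append(rule)
        return self

    def render(self) -> Image:
        """Apply all rules, in order, to a fresh copy of the pixels."""
        rendered = []
        for row in self.source:
            new_row = []
            for pixel in row:
                for rule in self.rules:
                    pixel = rule(pixel)
                new_row.append(pixel)
            rendered.append(new_row)
        return rendered
```

Calling `ParametricEdit(image).add(brightness(10)).add(grayscale()).render()` returns a new image and leaves `image` as it was. A raster editor, by contrast, would overwrite the pixel values, which is why a separate file has to be saved to keep the original.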

Page 15: Seminario Maurizio Agelli, 20-09-2012

File formats decision tree

[Figure: tree of file format choices, one level per workflow stage, starting from CAMERA or SCANNER. Formats shown at each level, in slide order:]

CAPTURE: JPG, RAW, DNG, TIFF

INGESTION: JPG, JPG, TIFF, RAW, DNG, JPG, DNG, JPG, DNG, TIFF

WORKING: JPG, JPG, TIFF, RAW, TIFF, DNG, TIFF, JPG, DNG, TIFF, JPG, DNG, TIFF

PUBLISHING: JPG on every path

Note: unusual decision paths have been omitted

Page 16: Seminario Maurizio Agelli, 20-09-2012

Capture Ingestion Working Publishing

Archive

Which files to archive?

ORIGINAL FILES

MASTER FILES

DERIVATIVE FILES

Page 17: Seminario Maurizio Agelli, 20-09-2012

- 2 -

Metadata

Page 18: Seminario Maurizio Agelli, 20-09-2012

The importance of metadata

"An image is worth 1000 words", but ...

... there are questions which only words can answer:

When was it shot?

... and where?

Who are those people?

Who took this photograph ?

Can I use it freely ?

Photo by Maurizio Agelli - CC BY-SA 2.0

Page 19: Seminario Maurizio Agelli, 20-09-2012

Metadata

Information about content.

Photo by M. Agelli - CC BY-SA 2.0

Page 20: Seminario Maurizio Agelli, 20-09-2012

A more precise definition

METADATA

"Structured encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities"

[ source: American Library Association ]

Page 21: Seminario Maurizio Agelli, 20-09-2012

Image metadata is nothing new ...

Photo by anyjazz65 [ CC BY-NC 2.0 ] http://www.flickr.com/photos/49024304@N00/

Page 22: Seminario Maurizio Agelli, 20-09-2012

Where can digital image metadata be written?

○ inside the image file (image data and metadata in the same file)
○ in a sidecar file (image data + a separate metadata file)
○ in a database
○ in an online registry
○ in the file name, e.g. d40-20120920-DSC_0153-edited.jpg (camera, date, id, derived)
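A file name built like the example above can be split back into its parts. The sketch below assumes one particular convention (camera, date as YYYYMMDD, id, optional derivation suffix, separated by hyphens); the slide only shows a single example, so the pattern and the names `NameInfo` and `parse_name` are illustrative assumptions.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class NameInfo(NamedTuple):
    camera: str
    shot_date: date
    image_id: str
    derived: Optional[str]  # None when the file is not a derived version


# Assumed convention: <camera>-<YYYYMMDD>-<id>[-<derived>].<ext>
_PATTERN = re.compile(
    r"^(?P<camera>[^-]+)-(?P<date>\d{8})-(?P<id>[^-.]+)"
    r"(?:-(?P<derived>[^.]+))?\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_name(file_name: str) -> NameInfo:
    """Extract the metadata encoded in a file name, or raise ValueError."""
    match = _PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {file_name}")
    return NameInfo(
        camera=match.group("camera"),
        shot_date=datetime.strptime(match.group("date"), "%Y%m%d").date(),
        image_id=match.group("id"),
        derived=match.group("derived"),
    )
```

For the name on the slide, `parse_name("d40-20120920-DSC_0153-edited.jpg")` yields camera `d40`, date 2012-09-20, id `DSC_0153` and derivation `edited`.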

Page 23: Seminario Maurizio Agelli, 20-09-2012

Image metadata standards

○ EXIF
○ IPTC
○ XMP
○ MPEG-7
○ DICOM
○ PLUS
○ Creative Commons
○ Dublin Core

Page 24: Seminario Maurizio Agelli, 20-09-2012

IPTC IIM
Information Interchange Model
Created in 1991 by the International Press Telecommunications Council
Adobe defined the mechanism for embedding IPTC IIM metadata in image files (1994)
Driven by NEWS INDUSTRY
Focused on high-level properties (description, geo location, ...)
Cannot be extended

EXIF
Exchangeable Image File Format
Created in 1995 by the Japan Electronic Industries Development Association
Driven by CAMERA MANUFACTURERS
Focused on low-level properties (camera settings, geo coordinates, date/time, ...)
Cannot be extended

[Diagram: an image file containing Image Data, EXIF and IPTC IIM blocks]

Page 25: Seminario Maurizio Agelli, 20-09-2012

XMP
Extensible Metadata Platform
Open standard, created by Adobe
○ defines a data model and a serialization model (RDF/XML)
○ also covers video, audio, text
○ structured as a set of schemas
○ can be extended with new metadata schemas
○ multi-lingual qualifiers
○ can be serialized and stored in most file formats (not in RAW!)
○ it is widely supported by the industry

[Diagram: an image file containing Image Data, Legacy Metadata (EXIF, IPTC IIM) and XMP]

XMP schemas: Dublin Core, XMP Basic, Rights, Media Mng, Photoshop, Camera RAW, EXIF, IPTC Core, IPTC Extens., ...

Page 26: Seminario Maurizio Agelli, 20-09-2012

A timeline of image standards

[Timeline, 1986 to 2012. Events shown: First DSLR (Kodak DCS-100), professional DSLRs, EXIF (first release), JPEG (first release), Kodak Photo CD, TIFF (first release), IPTC IIM, IPTC Headers (Adobe), XMP (first release), consumer DSLRs]

Page 27: Seminario Maurizio Agelli, 20-09-2012

A quick look inside XMP

>200 properties + all EXIF and IPTC properties

TITLE (dc:title)
DESCRIPTION (dc:description)
DESCRIPTION WRITER (photoshop:CaptionWriter)
RATING (xmp:Rating)
KEYWORDS (dc:subject)
GEO COORDINATES (exif:GPSLatitude, exif:GPSLongitude)
LOCATION (photoshop:Country, photoshop:State, photoshop:City, ...)
AUTHOR (dc:creator, tiff:Artist)
RIGHTS (dc:rights)
.....
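To give a feel for how a few of these properties look once serialized as RDF/XML, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It writes only `dc:title`, `dc:subject` and `xmp:Rating`, and omits the `<?xpacket ...?>` wrapper that real XMP packets carry. The function names `build_xmp` and `read_keywords` are hypothetical, and the output is meant as an illustration rather than a validated XMP packet.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Make the serializer emit the usual prefixes instead of ns0, ns1, ...
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)


def _q(prefix: str, local: str) -> str:
    """Build a namespace-qualified tag or attribute name."""
    return f"{{{NS[prefix]}}}{local}"


def build_xmp(title: str, keywords: list, rating: int) -> str:
    """Serialize a title, a keyword list and a rating as RDF/XML."""
    xmpmeta = ET.Element(_q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, _q("rdf", "RDF"))
    desc = ET.SubElement(rdf, _q("rdf", "Description"), {_q("rdf", "about"): ""})

    # dc:title is a language alternative: one rdf:li per language.
    alt = ET.SubElement(ET.SubElement(desc, _q("dc", "title")), _q("rdf", "Alt"))
    ET.SubElement(alt, _q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:subject is an unordered bag of keywords.
    bag = ET.SubElement(ET.SubElement(desc, _q("dc", "subject")), _q("rdf", "Bag"))
    for keyword in keywords:
        ET.SubElement(bag, _q("rdf", "li")).text = keyword

    ET.SubElement(desc, _q("xmp", "Rating")).text = str(rating)
    return ET.tostring(xmpmeta, encoding="unicode")


def read_keywords(xmp_xml: str) -> list:
    """Read the dc:subject keywords back from a serialized packet."""
    root = ET.fromstring(xmp_xml)
    return [li.text for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)]
```

The string returned by `build_xmp` could be saved as a sidecar file next to the image; embedding it inside the image file itself is format-specific and not covered here.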

Page 28: Seminario Maurizio Agelli, 20-09-2012

A quick look inside XMP: Date/Time Metadata

The original painting (~1507): Iptc4xmpExt:AODateCreated

An ancient postcard (1925): photoshop:DateCreated

The digital representation of the postcard (2008): xmp:CreateDate

The archived image (metadata last edited in 2012): xmp:MetadataDate

Page 29: Seminario Maurizio Agelli, 20-09-2012

Extending XMP: Creative Commons

CC provides a legal and technical infrastructure to help people share knowledge and creativity.

Photo by Creative Commons - CC BY 3.0

CC defines a set of properties that allow authors to specify under which conditions their content can be distributed and used.

CC recommends XMP for embedding CC properties inside resources.

Page 30: Seminario Maurizio Agelli, 20-09-2012

Extending XMP: PLUS

Picture Licensing Universal System
Non-profit organization whose mission is to simplify and facilitate the communication and management of image rights.

PLUS Registry
○ unique ids for creators, right holders, images, ...
○ access to rights information and other metadata

PLUS License Data Format (LDF)
○ metadata schema for embedding image license
○ 88 properties
○ dedicated XMP PLUS namespace

Page 31: Seminario Maurizio Agelli, 20-09-2012

Extending XMP: PRISM

Publishing Requirements for Industry Standard Metadata
Defined by IDEAlliance, a global community of content and media creators.

PRISM Metadata for Images provides information about:
○ objects pictured (manufacturer, model, description, ...)
○ slideshows (sequences of images)
○ shooting info (viewpoint, season, visual technique, ...)

PRISM Advertising Metadata provides information about the usage of the image in an advertising campaign

PRISM defines dedicated XMP namespaces: pmi and pam

Page 32: Seminario Maurizio Agelli, 20-09-2012

Extending XMP: Area Tagging

Metadata Working Group

○ XMP-MP Schema for face tags○ adopted by Picasa

Microsoft has created a new XMP schema for tagging people

Page 33: Seminario Maurizio Agelli, 20-09-2012

Handling Social Tagging: A research issue

[ source: Jonathan Good, 2011 - 1000memories.com ]

140 billion photos in Facebook (up to 2011)

Page 34: Seminario Maurizio Agelli, 20-09-2012

- 3 -

Organizing images in catalogs

Page 35: Seminario Maurizio Agelli, 20-09-2012

Picture by Henry Trotter, 2005 - Source: Wikimedia Commons

catalog (noun)
a list of the contents of a library or a group of libraries, arranged according to any of various systems
[ Dictionary.com ]

catalog (v.tr.)
1. to make an itemized list of
2. to classify (a book or publication, for example) according to a categorical system
[ Dictionary.com ]

Page 36: Seminario Maurizio Agelli, 20-09-2012

Photo Cataloging Software

Prime goals of Photo Cataloging Software:
○ provide secure, long-term storage
○ find the images when you need them
○ interoperate with other tools of the same ecosystem (now and in the future)

Photo Cataloging Software falls into the broad domain of Digital Asset Management. Let's try grabbing some definitions ...

"An ecosystem is made up of many parts that must not only coexist but also work with each other to survive. When all the elements work in concert, the system can thrive." (Peter Krogh, The DAM Book)

Page 37: Seminario Maurizio Agelli, 20-09-2012

Digital Asset Management

a way of keeping an overview of your digital files and make sure they don't get lost or altered unintentionally [J.Jacobsen, T.Schlenker, L.Edwards, Implementing a DAM System, Elsevier]

the protocol for downloading, renaming, backing up, rating, grouping, archiving, optimizing, maintaining, thinning, and exporting files [P.Krogh, The DAM Book, O'Reilly]

a complete toolbox to the author, publisher, and the end users of the media to efficiently utilize the assets [D.Austerberry, Digital Asset Management 2nd edition, Focal Press]

a term open to many definitions ...

... and whose scope goes beyond the domain of photography

Digital Libraries

Creative Industries Publishing

Enterprise Content Management

Page 38: Seminario Maurizio Agelli, 20-09-2012

Core functionalities of a photo catalog / DAM software (the two terms are used interchangeably here)

○ Import images
○ Harvest metadata
○ Manage metadata in a database ( + index for search)
○ Synchronize metadata
○ Export images
○ Organize photos with hierarchical keywords
○ Manage original, master and derivative files as different renditions of the same item

Extra functionalities such as file renaming, a raw converter, an editor and publishing tools may be provided too.

Page 39: Seminario Maurizio Agelli, 20-09-2012

Harvesting and synchronizing metadata

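[Diagram: image files (Image Data with embedded EXIF, IPTC IIM and XMP blocks) are imported into and exported from the Image Storage. Metadata (EXIF, IPTC IIM, ...) is harvested from the files into the Database and synchronized between the two. The User Interface works on top of the Database and the Image Storage.]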
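The harvesting half of this picture, metadata copied from files into an indexed database that can then be searched, can be sketched with Python's built-in `sqlite3`. The schema and the function names `open_catalog`, `harvest` and `search` are assumptions made for illustration; extracting the metadata from the image files, and writing changes back to them (synchronization), are left out.

```python
import sqlite3


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Create (or open) a tiny catalog with an index on keywords."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL, title TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS keywords ("
        "image_id INTEGER NOT NULL REFERENCES images(id), "
        "keyword TEXT NOT NULL, UNIQUE (image_id, keyword))"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_keyword ON keywords(keyword)")
    return db


def harvest(db: sqlite3.Connection, path: str, metadata: dict) -> None:
    """Store metadata that was already read from a file; re-harvesting replaces it."""
    db.execute("INSERT OR IGNORE INTO images(path) VALUES (?)", (path,))
    db.execute("UPDATE images SET title = ? WHERE path = ?", (metadata.get("title"), path))
    image_id = db.execute("SELECT id FROM images WHERE path = ?", (path,)).fetchone()[0]
    db.execute("DELETE FROM keywords WHERE image_id = ?", (image_id,))
    db.executemany(
        "INSERT OR IGNORE INTO keywords(image_id, keyword) VALUES (?, ?)",
        [(image_id, keyword) for keyword in metadata.get("keywords", [])],
    )
    db.commit()


def search(db: sqlite3.Connection, keyword: str) -> list:
    """Return the paths of all images tagged with the given keyword."""
    rows = db.execute(
        "SELECT i.path FROM images i JOIN keywords k ON k.image_id = i.id "
        "WHERE k.keyword = ? ORDER BY i.path",
        (keyword,),
    )
    return [row[0] for row in rows]
```

Keeping the searchable copy in a database is what makes queries fast; the price is that the database and the files can drift apart, which is why the synchronization step matters.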

Page 40: Seminario Maurizio Agelli, 20-09-2012

Hierarchical keywords

Photo by Isabelle Palatin - CC BY-SA 2.0

○ typically mapped to dc:subject
○ no semantic rules for describing the hierarchy; special characters are used, e.g.: Organizations|Industry|ACME
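Since the hierarchy lives only in the separator character, an application has to rebuild it when reading the keywords. A minimal sketch, assuming `|` as the separator as in the example above (other tools may use different characters); `expand_keyword` and `build_tree` are hypothetical helper names.

```python
def expand_keyword(path: str, sep: str = "|") -> list:
    """Return the keyword together with all of its ancestors, top level first."""
    parts = [part.strip() for part in path.split(sep) if part.strip()]
    return [sep.join(parts[:depth]) for depth in range(1, len(parts) + 1)]


def build_tree(paths: list, sep: str = "|") -> dict:
    """Merge several hierarchical keywords into one nested dictionary."""
    tree: dict = {}
    for path in paths:
        node = tree
        for part in path.split(sep):
            node = node.setdefault(part, {})
    return tree
```

Here `expand_keyword("Organizations|Industry|ACME")` gives `["Organizations", "Organizations|Industry", "Organizations|Industry|ACME"]`, so a search for the parent keyword can also match images tagged only with the leaf.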

Page 41: Seminario Maurizio Agelli, 20-09-2012

Renditions / Version sets

[Diagram: files are imported into and exported from the Image Storage, which holds the ORIGINAL, the MASTER (edited) and the DERIVATIVES of each image]

Different files related to the same image under certain circumstances shall be managed as a single item.

Covered by XMP-MM (Media Management)

Cataloging applications provide different solutions (e.g. stacking, version sets)

1 item, N renditions
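The "1 item, N renditions" idea maps naturally onto a small data structure. The sketch below is one possible model, not how any specific cataloging application or XMP-MM represents it; the rule that an item has at most one original and one master but any number of derivatives is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Role(Enum):
    ORIGINAL = "original"
    MASTER = "master"
    DERIVATIVE = "derivative"


@dataclass
class Rendition:
    role: Role
    path: str


@dataclass
class Item:
    """One catalog entry grouping all files that show the same image."""
    item_id: str
    renditions: List[Rendition] = field(default_factory=list)

    def add(self, role: Role, path: str) -> "Item":
        # Assumption: a single original and a single master per item.
        if role is not Role.DERIVATIVE and any(r.role is role for r in self.renditions):
            raise ValueError(f"item {self.item_id} already has a {role.value}")
        self.renditions.append(Rendition(role, path))
        return self

    def paths(self, role: Role) -> List[str]:
        """All file paths stored under the given role."""
        return [r.path for r in self.renditions if r.role is role]
```

With this model, searches and keywords attach to the `Item`, while export or backup code can still reach each individual file through `paths()`.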

Page 42: Seminario Maurizio Agelli, 20-09-2012

- 4 -

Ensuring long-term storage:backup and migration

Page 43: Seminario Maurizio Agelli, 20-09-2012

There are many causes of data loss

disk / hardware failure

viruses

lightning

transfer errors

theft

loss

fire

human errors

floods

Photo by Lucina M - CC BY-NC 2.0

Page 44: Seminario Maurizio Agelli, 20-09-2012

Which files to back up

Original Files

Working Files

Derivative Files

Master Files

Catalog (DB)

Page 45: Seminario Maurizio Agelli, 20-09-2012

A possible backup strategy for a single-user workflow

PRIMARY STORAGE

○ ON-LINE BACKUP (e.g. NAS): rsync (*)
○ OFF-LINE BACKUP: storage media are swapped at every backup
○ OFF-SITE BACKUP: additional copy on a remote NAS
○ CLOUD BACKUP: additional copy on a cloud service (Amazon S3, Elephant Drive, Symform, ...)
○ Copy to optical storage (ORIGINALS, MASTERS, DERIVATIVES)

(*) deleting files on the receiving side shall be disabled for ORIGINALS, MASTERS and DERIVATIVES
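The note about never deleting files on the receiving side is what `rsync` does when it is run without its `--delete` option. The same additive behaviour can be sketched in plain Python: new or changed files are copied, and nothing is ever removed from the backup. The function name `additive_sync` is hypothetical, and the "changed" test (size differs or source is newer) is a simplification of rsync's own checks.

```python
import shutil
from pathlib import Path


def additive_sync(source: Path, destination: Path) -> list:
    """Copy new or changed files from source to destination; never delete anything."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if dst.exists():
            same_size = dst.stat().st_size == src.stat().st_size
            not_older = dst.stat().st_mtime >= src.stat().st_mtime
            if same_size and not_older:
                continue  # backup copy looks up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied
```

If a file is deleted by mistake from the primary storage, the next run leaves its backup copy untouched. Note that a file overwritten on the primary storage would still be overwritten in the backup, so this alone does not protect against accidental edits.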

Page 46: Seminario Maurizio Agelli, 20-09-2012

Migration

○ file formats can become obsolete (just think what is happening to Kodak Photo CD ...)

○ storage evolves (higher capacity, higher speed, ...)

○ solution:
  ○ monitoring the storage process
  ○ conversion to newer and safer formats (e.g. DNG)
  ○ periodic replacement of storage devices

Currently there are no permanent solutions for storing digital content. No media lasts forever, and file formats become obsolete. Migration must be considered as a necessary part of every storage strategy.

[ dpBestflow.org ]
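One common way to monitor stored files over time, offered here as an assumption about what such monitoring could look like rather than something the slides prescribe, is to record a checksum for every file and re-verify it periodically and after each migration. A minimal sketch with the standard library; `build_manifest` and `verify` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large images do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's relative path to its checksum."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict) -> dict:
    """Return the files that are missing or whose content has changed."""
    problems = {}
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file():
            problems[relative] = "missing"
        elif file_digest(path) != expected:
            problems[relative] = "changed"
    return problems
```

A manifest built before copying an archive to new storage and verified afterwards shows whether every file arrived intact; an empty result from `verify` means nothing is missing or altered.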

Page 47: Seminario Maurizio Agelli, 20-09-2012

- 5 -

An overview of image archiving tools and services

Page 48: Seminario Maurizio Agelli, 20-09-2012

Image management applications: Application types

○ INGESTION TOOL
○ CULLING APPLICATION
○ RASTER IMAGE EDITOR
○ PARAMETRIC IMAGE EDITOR
○ RAW PROCESSOR
○ SPECIAL PURPOSE EDITOR
○ PUBLISHING TOOLS
○ DEDICATED PRINTING SOFTWARE
○ Image Browser
○ DAM (Photo Catalog)
○ SCANNER SOFTWARE

Page 49: Seminario Maurizio Agelli, 20-09-2012

Image management applications: Examples

Application types: INGESTION TOOL, CULLING APPLICATION, RASTER IMAGE EDITOR, PARAMETRIC IMAGE EDITOR, RAW PROCESSOR, SPECIAL PURPOSE EDITOR, PUBLISHING TOOLS, DEDICATED PRINTING SOFTWARE, Image Browser, DAM (Photo Catalog), SCANNER SOFTWARE

Examples shown on the slide:
○ Fast Picture Viewer
○ Photomatix
○ Picture Window Pro
○ Photoshop
○ Lightroom
○ Vuescan
○ Aperture
○ IDImager
○ Bridge
○ Adobe Camera Raw
○ ImageIngester Pro
○ Silverfast
○ Qimage
○ Quad Tone RIP
○ Bibble Pro

Page 50: Seminario Maurizio Agelli, 20-09-2012

A few photo cataloging applications

Product | Notes | Platforms | Cost (EUR)
Adobe Lightroom 4 | includes Adobe Camera RAW, many export features | WIN / MAC | 130
Photo Supreme (formerly known as IDIMAGER) | very powerful catalog explorer, multiuser DB | WIN / MAC | 80
Phase One Media Pro (formerly known as Expression Media, formerly as iView) | | WIN / MAC | ~85
Apple Aperture 3 | | MAC | 63
Corel AfterShot Pro (formerly known as Bibble Pro) | | WIN / MAC | ~50
Digikam Software Collection 3 | RAW processing based on dcraw, rendition support from version 2 | Linux | free
Picasa 3.9 | | WIN / MAC | free
PicaJet | basic editing, multiuser DB | WIN | ~50

Common features:
○ parametric editor, with possibility to use an external editor
○ XMP support (with some issues when exporting/importing keyword hierarchies)
○ some kind of rendition support
○ trial period (typically 30 days)

Page 51: Seminario Maurizio Agelli, 20-09-2012

Multi-user photo management

○ commercial
  ○ Daminion http://daminion.net/
  ○ Canto Cumulus http://www.canto.com/
  ○ Celum http://www.celum.com/

○ open-source
  ○ ZenPhoto (GPL)
  ○ Montala Resource Space (BSD)
  ○ Gallery (GPL)
  ○ Razuna (AGPL)
  ○ NotreDAM (GPL3)

Page 52: Seminario Maurizio Agelli, 20-09-2012

- 6 -

NotreDAM: an open-source DAM platform developed at CRS4

Page 53: Seminario Maurizio Agelli, 20-09-2012
Page 54: Seminario Maurizio Agelli, 20-09-2012

Bibliography

Page 55: Seminario Maurizio Agelli, 20-09-2012

References

1. Jonathan Good - How many photos have ever been taken? - September 15, 2011 - http://blog.1000memories.com/94-number-of-photos-ever-taken-digital-and-analog-in-shoebox

2. Observatoire des Professions de l'Image - Les chiffres officiels 2010 du marché de la photo et de l'image en France et dans le Monde - http://www.sipec.org/pdf/OPI2011.pdf

3. UPDIG Photographers Guidelines v4.0 - Universal Photographic Imaging Guidelines - http://www.updig.org/pdfs/updig_photographers_guidelines_v40.pdf

4. dpBestflow.org Best Practices - http://dpbestflow.org/links/32

5. Maurizio Agelli, Maria Laura Clemente, Mauro Del Rio, Daniela Ghironi, Orlando Murru and Fabrizio Solinas, CRS4 - NotreDAM, a multi-user, web based Digital Asset Management platform - TPDL 2011 Conference on Theory and Practice of Digital Libraries, Berlin - http://notredam.org/wp-content/uploads/2012/02/TPDL2011-notredam-demo.pdf

6. MS Windows Dev center - People tagging Overview - http://msdn.microsoft.com/en-us/library/windows/desktop/ee719905(v=vs.85).aspx#_people_tagging

Page 56: Seminario Maurizio Agelli, 20-09-2012

Metadata Standards

○ Exchangeable image file format for digital still cameras: Exif Version 2.3 http://www.cipa.jp/english/hyoujunka/kikaku/pdf/DC-008-2010_E.pdf

○ IPTC Information Interchange Model (IIM), IIM Schema for XMP, Specification Version 1.0, Document Revision 1, 2008 http://www.iptc.org/std/IIM/4.1/specification/IPTC-IIM-Schema4XMP-1.0-spec_1.pdf

○ XMP Specification http://www.adobe.com/devnet/xmp.html
  ○ Part 1: Data Model, Serialization and Core Properties
  ○ Part 2: Additional Properties
  ○ Part 3: Storage in Files

○ PLUS Technical Specification http://ns.useplus.org/go.ashx

○ PRISM 2.0 Specifications http://www.prismstandard.org/specifications/

Page 57: Seminario Maurizio Agelli, 20-09-2012

Further reading

○ Peter Krogh - The DAM Book, Digital Asset Management for Photographers, 2nd edition - O'Reilly

○ Patti Russotti, Richard Anderson - Digital Photography Best Practices and Workflow - Focal Press

○ Metadata Working Group - Guidelines for Handling Image Metadata - http://www.metadataworkinggroup.org/specs/

