Date post: | 18-Jan-2018 |
Upload: | charles-morris |
Managing Digital Assets
File Naming and Resizing
File Naming
• Is crucial to building a digital library
• Especially important during the creation phase and long-term management
• In the simplest sense, naming conventions (file names) serve as labels for digital files
• Develop a file naming convention before starting a digitization project
Are quality file names needed?
• If you use repository software, it creates unique names for each file upload
• You could theoretically upload every file as “photo.jpg”
• Upload thousands and the system would perform perfectly
• Repository software uses a database as its method for making each file unique
• So why should we care about filenames?
File naming considerations
• Once you plan to digitize, how do you ensure that your objects will persist over time?
• Easily managed and updated?
• Easily harvested by other repositories?
• May be separated from the repository and still persist? (think of taking a book out of a library)
• Are preserved over the long term?
• Don’t conflict with the systems and software they reside on?
Why File Naming Conventions?
• Avoid this!
Windows or Apple File System
File Systems
• When sorting on filenames, the file naming convention will affect the order files are displayed.
• Name
• Date / Date modified
• File type
• Size
• And more!
• File name is the most important
Scanned Newspaper/Book
Collection Date Page
Directory/path
Art and Science of File Naming
• It’s tempting to name your files, image1.jpg, image2.jpg, image3.jpg and so on
• Computer file systems require that files have unique names within a folder, but not system-wide
• A file name is an easy way to provide a useful description and a unique identifier
What’s in the Name?
nam_apap_039_2001_11_11.wav
• OCLC code: nearly every library has a code
• Collection or unit
• ISO date: date must follow YYYY_MM_DD
• Extension: don’t touch! 3 characters exactly
• Order: GENERAL to the PARTICULAR
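The parts of a name like this can be checked mechanically. A minimal sketch, assuming the pattern is library code, collection, a three-digit number, a YYYY_MM_DD date, and a 3-character extension. The group names are hypothetical, and reading "039" as an item or series number is an assumption, since the slide does not label it.

```python
import re
from datetime import date

# Hypothetical pattern for names like nam_apap_039_2001_11_11.wav.
# Group names, and treating "039" as a sequence number, are assumptions.
PATTERN = re.compile(
    r"^(?P<library>[a-z]+)_"                             # OCLC code
    r"(?P<collection>[a-z0-9]+)_"                        # collection or unit
    r"(?P<number>\d{3})_"                                # assumed: item or series number
    r"(?P<year>\d{4})_(?P<month>\d{2})_(?P<day>\d{2})"   # date, YYYY_MM_DD
    r"\.(?P<ext>[a-z0-9]{3})$"                           # 3-character extension
)


def parse_name(filename):
    """Return the parts of a conforming file name, or None if it does not match."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    parts = m.groupdict()
    try:
        # Reject impossible dates such as 2001_13_45.
        parts["date"] = date(int(parts["year"]), int(parts["month"]), int(parts["day"]))
    except ValueError:
        return None
    return parts


print(parse_name("nam_apap_039_2001_11_11.wav"))
print(parse_name("photo.jpg"))  # None: does not follow the convention
```

Because the name runs from general to particular, each field can be read off left to right without consulting a database.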
Do’s and don’ts
• Use a 3-character file extension (e.g. “.tif”, not “.tiff”).
• Do not use special characters, such as, . \ / : * ? " < > |, except for dashes or underscores.
• All letters should be lower case.
– Some operating systems are case sensitive. Using lower case consistently prevents problems if the files are migrated to a case-sensitive operating system.
• Think of URLs and how they handle spaces
Do’s and don’ts
• Do not use spaces in the file name. Browsers and some older operating systems do not handle spaces well.
• http://www.albany.edu/~mwolfe/ist653/week4/Exercise%20week%204.docx
• Use leading zeros. If the file name includes numbers use zeros as placeholders. For example, a collection with 999 items should be numbered: mac001.tif, mac002.tif ... mac011.tif, mac012.tif, etc. (NOT mac1.tif, mac2.tif ...). This practice facilitates clear sorting and file management.
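A short sketch of why leading zeros matter when a file system sorts by name. The helper name `numbered_name` is hypothetical; the "mac" prefix and the 999-item collection come from the example above.

```python
def numbered_name(prefix, number, total, ext="tif"):
    """Build a name such as mac001.tif, padded to the width of the largest number."""
    width = len(str(total))
    return f"{prefix}{number:0{width}d}.{ext}"


# Without padding, a plain alphabetical sort puts mac10 before mac2.
unpadded = sorted(f"mac{n}.tif" for n in (1, 2, 10, 11))
padded = sorted(numbered_name("mac", n, total=999) for n in (1, 2, 10, 11))

print(unpadded)  # ['mac1.tif', 'mac10.tif', 'mac11.tif', 'mac2.tif']
print(padded)    # ['mac001.tif', 'mac002.tif', 'mac010.tif', 'mac011.tif']
```

Deriving the width from the expected collection size keeps the padding consistent for every file in the project.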
Do’s and don’ts
• File naming conventions should be clearly documented
• Follow standard date formula: year, month, day
• No longer than 21 characters (ideal)
• Do not use spaces. Some software will not recognize file names with spaces, and file names with spaces must be enclosed in quotes when using the command line. Other options include:
– Underscores, e.g. file_name.xxx
– Dashes, e.g. file-name.xxx
– No separation, e.g. filename.xxx
– Camel case, where the first letter of each section of text is capitalized, e.g. FileName.xxx
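To see what a space costs, the sketch below percent-encodes a file name the way a URL must, then shows the underscore and dash alternatives. `replace_spaces` is a hypothetical helper written for this illustration; `quote` is from Python's standard `urllib.parse` module.

```python
from urllib.parse import quote


def replace_spaces(filename, separator="_"):
    """Replace each run of whitespace with one separator (outer whitespace is dropped)."""
    return separator.join(filename.split())


original = "Exercise week 4.docx"
print(quote(original))                # Exercise%20week%204.docx
print(replace_spaces(original))       # Exercise_week_4.docx
print(replace_spaces(original, "-"))  # Exercise-week-4.docx
```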
Do’s and don’ts
• Filenames and directory names should neither begin nor end with a punctuation character (period, hyphen, underscore). Examples to avoid:
– .myfile.tif
– -myfile.tif
– _myfile.tif
– myfile-.tif
• Filenames and directory names should not contain multiple consecutive punctuation characters. Examples to avoid:
– my--file.tif
– myfile..tif
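The do's and don'ts can be turned into an automatic check. This is a rough sketch, not a complete validator: the function name and messages are hypothetical, the 21-character limit is assumed to include the extension, and periods are only accepted as the extension separator.

```python
import re

MAX_LENGTH = 21                   # the "ideal" limit; assumed to include the extension
PUNCTUATION = (".", "-", "_")


def check_name(filename):
    """Return a list of rule violations; an empty list means the name passes."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:                   # no period at all, so no extension
        stem, ext = filename, ""
    problems = []
    if len(ext) != 3:
        problems.append("extension is not 3 characters")
    if filename != filename.lower():
        problems.append("upper-case letters")
    if " " in filename:
        problems.append("spaces")
    # Spaces are reported separately, so they are not counted again here.
    if re.search(r"[^A-Za-z0-9_ -]", stem):
        problems.append("special characters")
    if stem.startswith(PUNCTUATION) or stem.endswith(PUNCTUATION):
        problems.append("begins or ends with punctuation")
    if re.search(r"[._-]{2,}", filename):
        problems.append("consecutive punctuation")
    if len(filename) > MAX_LENGTH:
        problems.append("longer than %d characters" % MAX_LENGTH)
    return problems


for name in ("mac001.tif", "my--file.tif", "My File.TIF", "image.tiff"):
    print(name, check_name(name))
```

Returning every violation, rather than stopping at the first, makes the output usable as a report for a whole folder.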
Standardized Naming Conventions
• Diverse or non-standard methods of file naming are a bad thing
• Non-standardized metadata/filenames require significant time and effort to normalize into forms appropriate for a preservation repository.
• Without a standard, this labor-intensive process adds to the overall cost of management and preservation
• We must normalize our file names and metadata
Normalization
• A term used by database developers to mean a specific process (first normal form etc.)
• Digital libraries use the term more loosely
• Normalization involves the imposition of accepted professional standards (for metadata or files) and rules to create sustainable digital objects
• We will normalize file name, metadata, and formats
• We can normalize by hand, or digital asset management systems can automate the process
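As an illustration of automated normalization in the loose, digital-library sense, the sketch below rewrites one file name to follow the rules from the earlier slides. The function and its exact choices (underscores for spaces, dropping other special characters) are assumptions, not a description of what any particular asset management system does.

```python
import re


def normalize_name(filename):
    """Lower-case a name, replace spaces, and tidy punctuation in the stem."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:                                   # no extension present
        stem, ext = filename, ""
    stem = stem.lower()
    stem = re.sub(r"\s+", "_", stem.strip())      # spaces -> underscores
    stem = re.sub(r"[^a-z0-9_-]", "", stem)       # drop other special characters
    stem = re.sub(r"[_-]{2,}", "_", stem)         # collapse repeats into one underscore
    stem = stem.strip("_-")                       # no leading/trailing punctuation
    return f"{stem}.{ext.lower()}" if ext else stem


print(normalize_name("Annual Report (Final).PDF"))  # annual_report_final.pdf
```

Two different originals can normalize to the same name, so a real workflow would need to check for collisions before renaming anything.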
Challenges to File Naming
• Uniqueness: you must ensure that your names do not conflict with each other. A file name might be unique in one folder, but not in another.
• Meaning/convention persists over time: the digital library can make sense of the convention used, good documentation, etc.
• Ease of use: not too complicated or obscure
• Scalability: will your convention limit the number of files you can name?
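The uniqueness challenge can be checked before files from several folders are merged. A minimal sketch, using made-up paths; `find_duplicate_names` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_duplicate_names(paths):
    """Group paths by file name and return only the names used more than once."""
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePosixPath(p).name].append(p)
    return {name: found for name, found in by_name.items() if len(found) > 1}


paths = [
    "scans/box1/image1.jpg",
    "scans/box2/image1.jpg",   # same name, different folder
    "scans/box2/image2.jpg",
]
print(find_duplicate_names(paths))
```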
File Management Tools
• Work is impossible to do by hand at scale
• Humans make mistakes
• We need digital tools to assist us in repetitive tasks
• ReNamer
• Adobe Bridge
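The kind of repetitive task these tools automate can also be scripted. The sketch below is not how ReNamer or Adobe Bridge work internally; it is a small illustration with hypothetical function names that first builds a rename plan for review and only then applies it.

```python
from pathlib import Path


def plan_renames(folder, prefix, ext):
    """Map each matching file in a folder to a sequential, zero-padded name."""
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == f".{ext}"
    )
    width = max(3, len(str(len(files))))          # at least three digits
    return {
        p: p.with_name(f"{prefix}{i:0{width}d}.{ext}")
        for i, p in enumerate(files, start=1)
    }


def apply_renames(plan):
    """Rename files, refusing to overwrite anything that already exists."""
    for old, new in plan.items():
        if old == new:                            # already has the right name
            continue
        if new.exists():
            raise FileExistsError(f"{new} already exists")
        old.rename(new)
```

A typical use would be to call `plan_renames("scans", prefix="mac", ext="tif")` (the folder name here is only an example), print the old and new names for a human to check, and call `apply_renames` once the plan looks right.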
EXIF Header Information
• You can use information tagged inside the file to create metadata and file names
• It’s a standard that applies to images and sound
• It can tell you date, aperture (still camera), camera make, dimensions, and more.
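As a sketch of turning header information into a file name: EXIF records capture time as a string in the form "YYYY:MM:DD HH:MM:SS". Reading that tag out of the file needs an EXIF-aware tool or library and is not shown here; the timestamp value, the prefix, and the function name below are made up for illustration.

```python
from datetime import datetime


def name_from_exif_date(exif_datetime, prefix, ext):
    """Turn an EXIF timestamp string into a prefix_YYYY_MM_DD.ext file name."""
    # EXIF stores timestamps as "YYYY:MM:DD HH:MM:SS".
    taken = datetime.strptime(exif_datetime, "%Y:%m:%d %H:%M:%S")
    return f"{prefix}_{taken:%Y_%m_%d}.{ext}"


print(name_from_exif_date("2001:11:11 09:30:00", "nam_apap_039", "jpg"))
# nam_apap_039_2001_11_11.jpg
```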
Header Information/Embedded Metadata
EXIF IPTC
IPTC and EXIF
• IPTC (International Press Telecommunications Council) is an older standard from the print industry. It contains:
– Title, date, creator, address
• EXIF (Exchangeable Image File Format) is a format embedded into images produced by virtually all digital cameras
– Has the settings used on the camera/scanner:
– dimensions, resolution, ISO, shutter
Embedded Metadata
• Most modern formats (audio, video, text and images) have embedded metadata
• Often you need special tools to access and update it
• Some systems ignore it or strip it out
• iPods/iPhones are the best example of embedded metadata