Shaoping Moss Monday, Oct. 3, 2005

Page 1: Shaoping Moss  Monday, Oct. 3, 2005

Creating an Electronic Edition of an Original 18th Century Manuscript --

Mémoires de la comtesse de L…

Shaoping Moss Monday, Oct. 3, 2005

Research and Instructional Support, LITS
Mount Holyoke College
French 331, Fall 2005

Page 2: Shaoping Moss  Monday, Oct. 3, 2005

Today’s Topics

- An electronic edition:
  - What and why an electronic edition
  - Significance of manuscripts
- Technologies used behind the scenes
  - Markup languages: SGML, XML and HTML
  - Stylesheets: XSLT
  - TEI -- Guidelines and DTDs
- Group Project
  - Encoding project objectives
  - Text interpretation and markup of the manuscript

Page 3: Shaoping Moss  Monday, Oct. 3, 2005

What and Why an Electronic Edition?

An electronic edition -- a transcription of a text, which can:

- be encoded as an object of study for literary, linguistic, historical, or related purposes.
- be searched and manipulated by computer programs in many different ways.
- facilitate and expand access.
- facilitate the long-term preservation of the original form of the materials.

Page 4: Shaoping Moss  Monday, Oct. 3, 2005

Significance of Manuscripts

The term 'manuscript' simply means “written by hand.” These works, written by authors, artists, scientists, and others, contain invaluable information not only for the study of the genesis, meaning and reception of their work, but also for reconstructing and better understanding the society and mentality of the time in which they lived. In addition, manuscripts throw light on economics, psychology, politics, and the social sciences, as well as the history and philosophy of science.

Page 5: Shaoping Moss  Monday, Oct. 3, 2005

What does Encoding a Text Mean?

The purpose of encoding a document is to embed intelligence in the text in such a way that a computer program can derive information from it.

The information embedded in the text is variously called encoding, markup, or tagging.
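
As an illustration of embedded markup, here is a minimal sketch. It assumes TEI Lite's <name> element with a type attribute; the sentence itself is invented for the example and does not come from the manuscript.

```xml
<!-- Invented sentence, for illustration only -->
<p>
  <name type="person">Marie</name> wrote to her sister from
  <name type="place">Lyon</name>.
</p>
```

With the names tagged this way, a program can list the people and places mentioned, or search for "Lyon" as a place name only, which is harder to do reliably with plain text.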

Page 6: Shaoping Moss  Monday, Oct. 3, 2005

What’s a Document?

A document is:

- A set of information presented to the reader in different forms and media: books, web pages, magazines, articles, advertisements.
- A collection of small elements, which can be headings, paragraphs, quotations, etc.

Structure versus Format

- Structure concerns the content of a document.
- Format concerns the way a document looks.

Page 7: Shaoping Moss  Monday, Oct. 3, 2005

Sample Digital Collections

- The Newton Papers Project
  http://www.newtonproject.ic.ac.uk/
  The Newton Project aims to create a printed edition of Newton's theological, alchemical and administrative writings and an electronic edition of all his writings, including his correspondence.
  Sample Transcriptions: http://www.newtonproject.ic.ac.uk/texts/cul3996_d.html
- The Adams Family Papers: An Electronic Archive
  http://www.masshist.org/digitaladams/aea/
- Five College Archives & Manuscript Collections -- use XML (EAD) to improve searching capabilities of archival finding aids
  http://asteria.fivecolleges.edu/index.html

Page 8: Shaoping Moss  Monday, Oct. 3, 2005

Markup Languages

- Address the structure of a document.
- Identify different components of the document.
- A set of symbols that can be placed in the text of a document to define and label the parts of the document.
- Convey information to software that will allow it to:
  - determine the functions and boundaries of document parts.
  - index the data for searching.
  - render the data (e.g. for screen display or print).
  - transform the data (e.g. for a voice synthesizer) for some output device(s).

Page 9: Shaoping Moss  Monday, Oct. 3, 2005

Development of Markup Languages

- SGML -- Standard Generalized Markup Language ('86)
  - Initiated by Charles Goldfarb at IBM in the 1960s
  - Adopted as a standard of the International Organization for Standardization (ISO 8879) in 1986
- HTML -- Hypertext Markup Language ('91)
  - Developed by Tim Berners-Lee at CERN, a physics lab near Geneva, Switzerland, in 1991
- XML -- eXtensible Markup Language ('98)
  - XML is a Web standard developed by the World Wide Web Consortium (W3C); XML 1.0 became a W3C Recommendation in 1998.

Page 10: Shaoping Moss  Monday, Oct. 3, 2005

SGML and Its Subdivisions

- SGML is a toolkit for developing specialized markup languages.
- SGML is composed of tag-set building rules.
- SGML has given birth to a number of subdivisions:
  - HTML and XML
  - CALS for the U.S. Department of Defense
  - BOEING for commercial airlines
  - C-H for publishing
  - OED for the Oxford English Dictionary
  - TEI guidelines for the Text Encoding Initiative
  - EAD for Encoded Archival Description

Page 11: Shaoping Moss  Monday, Oct. 3, 2005

HTML: Good v. Bad

Good:

- Its simplicity has contributed to the rapid growth of the World Wide Web in the 1990s.
- XHTML 1.0 is the latest HTML standard.

Bad:

- Easy HTML coding has made it harder for browsers to handle.
- Tags are predefined in HTML.
- Format and content are mixed and content is hard to reuse, e.g.

    <H1>My First XML</H1>
    <H2>Introduction to XML</H2>
    <b><FONT SIZE=2><P>What is HTML?….</P></FONT SIZE=2>

Page 12: Shaoping Moss  Monday, Oct. 3, 2005

What is XML?

http://www.w3.org/XML/

- XML stands for eXtensible Markup Language.
- XML was designed to describe data.
- Tags are not predefined in XML; you must define your own tags when using XML.
- XML separates format from content and semantic structure, e.g.

    <title>What is XML?</title>
    <chapter>Introduction to XML</chapter>

- Data encoded in XML can function much like a traditional database.
- XML content can be output in many formats, such as XHTML, text, Word documents, PDF, etc.

Page 13: Shaoping Moss  Monday, Oct. 3, 2005

A Sample XML Document

<?xml version="1.0" encoding="ISO-8859-1" ?>
<booklist>
  <book>
    <booktitle>Project Cool Guide to XML for Web Designers</booktitle>
    <author>Teresa A. Martin</author>
    <country>USA</country>
    <publisher>John Wiley and Sons</publisher>
    <price>25.00</price>
    <year>1999</year>
  </book>
  …
</booklist>

Page 15: Shaoping Moss  Monday, Oct. 3, 2005

XSLT

- XSLT - eXtensible Stylesheet Language Transformations
- A markup language and programming syntax for processing XML data
- Contains a set of template rules that define what information can be taken out of the XML document and how it is structured
- Is most often used to:
  - Transform XML to HTML for delivery to standard Web clients or wireless devices
  - Transform XML from one structure to another
  - Convert XML data into any wanted output - text, Word document, PDF, etc.
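
A minimal sketch of such template rules, assuming the <booklist> sample document from the earlier slide as input. The HTML layout chosen here (a bulleted list showing title, author and year) is an illustrative assumption, not the stylesheet used in the course.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Template rule for the root element: builds the HTML page -->
  <xsl:template match="/booklist">
    <html>
      <head><title>Book List</title></head>
      <body>
        <ul>
          <!-- Process each book with the rule below -->
          <xsl:apply-templates select="book"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- Template rule for each book: one list item per book -->
  <xsl:template match="book">
    <li>
      <i><xsl:value-of select="booktitle"/></i>
      <xsl:text>, </xsl:text>
      <xsl:value-of select="author"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="year"/>
      <xsl:text>)</xsl:text>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

Each rule pairs a match pattern with the output to produce. Elements that no rule selects, such as <price> here, are simply left out of the result, so the same XML file can feed several different presentations.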

Page 17: Shaoping Moss  Monday, Oct. 3, 2005

Markup Languages in Academics

TEI -- guidelines and DTDs http://www.tei-c.org/Guidelines2/index.htm

Resource Bioinformatic Sequence Markup Language (BSML)

Mathematical Markup Language (MathML)

Page 18: Shaoping Moss  Monday, Oct. 3, 2005

What is TEI?

Launched in 1987, the Text Encoding Initiative (TEI) is an international and interdisciplinary standard for encoding, storing and analyzing the textual content and structure of digital texts. The standard is designed for use with a broad range of text types. It is now widely used by libraries, archives, publishers and researchers for online research and teaching, and for the storage and exchange of large and small text collections.

http://www.tei-c.org/

Page 19: Shaoping Moss  Monday, Oct. 3, 2005

TEI Guidelines

The TEI encoding system was built upon Standard Generalized Markup Language (SGML) and shifted to XML in 2002. The system is described in the TEI Guidelines. It is modular and flexible, with basic modules for prose, poetry, drama, speech, lexicography and terminology. These modules can be combined in various ways to suit a wide range of text-encoding purposes.

http://www.tei-c.org/Guidelines2/index.html

Page 20: Shaoping Moss  Monday, Oct. 3, 2005

TEI Lite

http://www.tei-c.org/Lite/ (documentation)
http://www.tei-c.org/Lite/DTD/ (download the files)

TEI Lite is a simplified ‘starter set’ of TEI elements, defined in a simple DTD. It includes most of the core tags, the basic structural components, and an adequate set of header elements. It is a good starting point for simple encoding projects, has proved very popular, and serves about 85% of its users’ needs.

Page 21: Shaoping Moss  Monday, Oct. 3, 2005

DTD -- Document Type Definition

A DTD is a computer-readable text file that defines a markup language for a particular type of document, such as a poem, a novel, or an archival finding aid.

Its purpose is to define the document structure with a list of legal elements -- a root element, parent and child elements, and where data can be placed.

- It lays out the logical structure of the data.
- It establishes rules about which elements a document may have, which are required, which can repeat, etc.
- A DTD can be declared inline in your XML document, or as an external reference.
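
A minimal sketch of an inline DTD, assuming a cut-down version of the <booklist> sample from the earlier slide. The content rules below (one or more books, one or more authors, an optional year) are illustrative choices, not rules taken from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE booklist [
  <!-- booklist is the root element and holds one or more book elements -->
  <!ELEMENT booklist (book+)>
  <!-- each book needs a title and at least one author; year is optional -->
  <!ELEMENT book (booktitle, author+, year?)>
  <!-- leaf elements contain character data -->
  <!ELEMENT booktitle (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
]>
<booklist>
  <book>
    <booktitle>Project Cool Guide to XML for Web Designers</booktitle>
    <author>Teresa A. Martin</author>
    <year>1999</year>
  </book>
</booklist>
```

To use an external reference instead, the same declarations would move into a separate file and the document would point to it, for example <!DOCTYPE booklist SYSTEM "booklist.dtd">, where booklist.dtd is a hypothetical file name.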

Page 22: Shaoping Moss  Monday, Oct. 3, 2005

TEI Document Format

All TEI documents follow the same essential format:

TEI header -- documents the bibliographic information about the electronic edition being created.

TEI body -- contains the content being created.
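
The two parts can be sketched as a minimal skeleton. This assumes the TEI Lite (P4) element names used elsewhere in these slides (<TEI.2>, <teiHeader>, <text>, <body>); the header wording and the div type are placeholders, not the project's actual header.

```xml
<TEI.2>
  <!-- TEI header: bibliographic information about the electronic edition -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Mémoires de la comtesse de L…: an electronic edition</title>
      </titleStmt>
      <publicationStmt>
        <p>French 331, Mount Holyoke College, Fall 2005</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from an original 18th century manuscript.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- TEI body: the transcribed and encoded text itself -->
  <text>
    <body>
      <div type="chapter">
        <p><!-- transcription goes here --></p>
      </div>
    </body>
  </text>
</TEI.2>
```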

Page 23: Shaoping Moss  Monday, Oct. 3, 2005

Relationships in a TEI Document

<TEI.2>
  <teiHeader></teiHeader>
  <text>
    <body></body>
  </text>
</TEI.2>

- <TEI.2> is the parent element of <teiHeader> and <text>.
- <teiHeader> and <text> are sibling elements.
- <TEI.2> is an ancestor element of <body>.

Page 24: Shaoping Moss  Monday, Oct. 3, 2005

The Encoding Example

A sample TEI markup for Mémoires de la comtesse de L…
http://www.mtholyoke.edu/courses/smoss/TEI_projects/french331/example.html

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI.2 SYSTEM "DTD/teixlite.dtd">
<?xml-stylesheet href="example.xsl" type="text/xsl"?>
<TEI.2>
…
</TEI.2>

Page 25: Shaoping Moss  Monday, Oct. 3, 2005

Encoding Project Objectives

Encoding Mémoires de la comtesse de L… is an act of analysis and interpretation. It presents intellectual challenges that bring us closer to the text and thus help us better understand the author's work and life and the surrounding social environment.

Page 26: Shaoping Moss  Monday, Oct. 3, 2005

Group Project: Let’s have Fun!!

- October 3: Overview of XML/TEI technology; hands-on encoding exercises: themes, personal and place names
- November 7: Demo of encoding images by Shaoping in class
- November 28: Demo of encoding translations of selected words by Shaoping in class
- December 5 and 14: Class demo of each group project

(Note: Students will have to make an appointment with Alexandra for encoding problems.)

Page 27: Shaoping Moss  Monday, Oct. 3, 2005

Contact Info

Shaoping Moss
Information Technology Consultant
Research and Instructional Support
Mount Holyoke College
Email: [email protected]
Phone: (413) 538-3034

Alexandra Balan
Tech Mentor
[email protected]

