A TEI extension for the description of medieval manuscripts

Richard Gartner, Lou Burnard, and Peter Kidd

1 Introduction

From the viewpoint of the medievalist, TEI P3 as it currently stands offers only limited facilities for the detailed encoding of manuscripts. This is particularly true of manuscript description or "metadata": that is, the detailed prose descriptions which appear in traditional manuscript catalogues and handlists. Only a few manuscript-specific elements are defined in the existing header (for example <hand> and <handShift>, which allow the recording of information on scribes and handwriting styles); consequently, the manuscript scholar is often forced to fudge the issue by using tags for unintended purposes, or to modify the DTD for specific applications (as has been done, for instance, by the Canterbury Tales Project [Chaucer, 1996]).

In January 1996, the Bodleian Library at Oxford began looking at the possibility of extending the TEI to incorporate more detailed metadata for manuscript cataloguing, as part of a nationally funded four-year project to provide access to descriptions of previously uncatalogued western medieval manuscripts in its collection. In this paper we review the set of TEI extensions we have so far defined for this purpose. Most of these extend the TEI header, but they also include a new global attribute and several new phrase-level elements. Our intention is that the set of metadata elements defined should be rich enough both for those wishing to use the TEI as the basis for a conventional catalogue record, and for those intending to produce electronic editions of the manuscripts themselves. It should also be emphasized that both the scope and the detail of our scheme have been very much dictated by local needs within the Bodleian; other libraries with other habits or different kinds of material may well have different needs.

2 Current practice in the automation of manuscript cataloguing

There is no overall standard for the cataloguing of manuscript material analogous to the MARC record and AACR2 for the printed book. The nearest thing to a de facto standard to have emerged in the UK and USA is the set of recommendations drawn up by Neil Ker in his Medieval Manuscripts in British Libraries [Ker & Piper 1969-92]. In this work he lists sixteen items which should be included in any catalogue description.

These items have become widely accepted as the basic elements necessary for a comprehensive catalogue record, although many institutions (such as the Bodleian) may go into much greater detail in many cases. Ker's points were borne in mind as the TEI extensions described below were compiled, and in most cases a direct mapping from one of his points to an element is possible.

3 The mssStmt

We began with the observation that manuscript description properly begins in the header; more specifically, since it is the manuscript itself which we wish to describe rather than its electronic surrogate, it belongs within the source description. We therefore propose the definition of a new container element <mssStmt>, as a child of the <sourceDesc> within the <teiHeader>. The place of this new element in the hierarchy is modelled upon that of <recordingStmt>, since like that element it bundles together a range of elements specific to a particular class of material, in this case manuscripts. A manuscript (for our purposes) is a single physical object or codex, possibly composed of a number of originally distinct manuscripts. For that reason, we define a substructure rather like that of the <teiCorpus>, in which descriptive information common to the whole of the object (such as its binding or provenance) is held at a different level from that relating to individual components (such as the handwriting or decorative features).
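
The placement described above might look roughly as follows. This is only a sketch: the surrounding <fileDesc>, <titleStmt> and <publicationStmt> are standard TEI header elements, but the exact nesting of <description> inside <mssStmt>, and all text content, are assumptions made for illustration rather than part of the published scheme.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Placeholder title for the electronic record</title>
    </titleStmt>
    <publicationStmt>
      <p>Placeholder publication details.</p>
    </publicationStmt>
    <sourceDesc>
      <!-- Proposed extension: describes the manuscript itself,
           not its electronic surrogate. -->
      <mssStmt>
        <!-- Assumed layout: information common to the whole object
             (binding, provenance) sits at this level. -->
        <description>
          <!-- Assumed layout: descriptive elements for the whole
               manuscript, or for one component of it, go here. -->
        </description>
      </mssStmt>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```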

Whether it relates to the whole of a manuscript or to a part of one, a <description> element can potentially include a very wide range of detailed information. We list below the various elements which may be provided, to give some flavour of the level of detail permitted by the scheme:

<decoration> "a container element for up to seven sub-elements, covering the decoration of a manuscript: these include an overview, a description of miniatures, of historiated and decorated initials, borders, minor decoration, and an element for attributions and commentary."

<area> "a repeatable element used to record the physical dimensions of ruled, written, pricked, or leaf areas. A type attribute states which kind of dimension is being described, and the element contains sub-elements for width, height, and a free text description."

<leaves> "records the number of leaves in the front fly leaves, main text block, and back fly leaves of an item, and a description of each type of material present in these sections, including a recording of damage, and (for paper) of watermarks."

<foliation> "a repeatable element containing foliation information for an item, including the period, medium (pencil, ink, etc.), and type of numerals."

<collation> "collation is recorded by a <quireformula> element, which models the structure of the collation itself, and an <evidence> element, which is used to record markings (such as catchwords or quire signatures), and other evidence relating to the collation."

<scriptDesc> "information about the scripts used in a manuscript is located here using one or more detailed <scriptNote> elements, the function of which overlaps to some extent with that of the existing <handList> element."

<rubrication> "describes the rubrication (i.e. red-lettering) or any analogous method used to highlight headings etc. within a manuscript."

<secFol> (secundo folio) "used to record the catchwords commonly used in the Middle Ages and beyond to distinguish specific manuscript copies; these may also be used to determine whether a given manuscript corresponds with a copy listed in a medieval inventory."
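
To give a feel for how a few of these elements might combine, here is a hedged sketch of a partial <description>. The element names <description>, <area>, <foliation>, <collation>, <quireformula>, <evidence> and <secFol>, and the type attribute on <area>, come from the list above. The child elements <width> and <height>, the attribute value "written", and all text content are placeholders assumed for illustration.

```xml
<description>
  <!-- type states which kind of dimension is recorded;
       the value "written" is an assumed example. -->
  <area type="written">
    <!-- Assumed child names: the scheme specifies sub-elements for
         width, height and a free-text description, but their names
         are not given here. -->
    <width>placeholder width</width>
    <height>placeholder height</height>
  </area>
  <foliation>Placeholder: period, medium and type of numerals.</foliation>
  <collation>
    <quireformula>placeholder quire formula</quireformula>
    <evidence>Placeholder: catchwords or quire signatures observed.</evidence>
  </collation>
  <secFol>placeholder opening words of the second leaf</secFol>
</description>
```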

In addition to these purely descriptive elements, the full manuscript description can include either a single <provenance> element, or a group of them enclosed within a <listProvenance>, which record the history (in particular the ownership) of the item. It can also include a <binding> element, again relating to the whole item.
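
A minimal sketch of the grouped form follows. Placing these elements directly inside <mssStmt> is an assumption based on the statement that binding and provenance relate to the whole object; the text content is placeholder only.

```xml
<mssStmt>
  <!-- Several provenance statements grouped in one list;
       a single <provenance> could stand alone instead. -->
  <listProvenance>
    <provenance>Placeholder: earliest recorded owner.</provenance>
    <provenance>Placeholder: later owner and circumstances of acquisition.</provenance>
  </listProvenance>
  <!-- Binding relates to the whole item, not to a component. -->
  <binding>Placeholder: description of the binding.</binding>
</mssStmt>
```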

4 Other extensions to the DTD

Our proposed extensions involve more than the simple addition of a new element to the header, however. We also propose an additional global attribute, an additional chunk-level element, and some additional phrase-level elements for specialized headings and bibliographic references.

A key problem which all metadata schemes have to overcome relates to the granularity of the description and its scope. The TEI scheme offers a variety of mechanisms to associate pieces of metadata with individual parts of a text (notably the use of "declaring" and "declarable" elements) but no obvious way of specifying the range or scope of metadata when the corresponding textual elements are not included in the electronic transcription. In simplistic terms, if we wish to say that some folio of a manuscript has some property, we must encode both the folio itself and the metadata describing its property. Since we do not wish to require all users of our scheme to have to transcribe all the manuscripts they wish to catalogue (!), we found it necessary to define a new mechanism by which the scope or range of some piece of metadata can be defined. This is accomplished by means of a range attribute, which can be attached to any element in the manuscript description, and whose value gives in a normalized form an "address" in the manuscript for the feature being described, most typically a folio number or range.

A key extension to the range of elements available within a <div> is the new <summary> element: this is used to contain an abstract of the contents of that <div> as compiled by the cataloguer or other commentator. This allows the detailed description of the contents of each section of a manuscript, and clearly separates supplied text from the source material.
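
As an illustration only, a <div> carrying a cataloguer's abstract might be encoded as below. The position of <summary> as the first child, and the placeholder content, are assumptions; <p> is the standard TEI paragraph element.

```xml
<div>
  <!-- Supplied by the cataloguer: not part of the source text. -->
  <summary>Placeholder abstract of the contents of this section.</summary>
  <!-- Transcribed source material, where any is provided. -->
  <p>Placeholder transcription of the source text.</p>
</div>
```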

We have also found it necessary to add more elements to the m.bibl class to incorporate features exclusive to the bibliographic description of manuscripts, such as place of origin, repository and collection. New phrase-level elements have been added to allow the marking of incipits, explicits, colophons and headings within the header and the text itself, and to identify iconographic subjects at any point within a record; the element for iconographic subjects also carries an attribute to allow the inclusion of an Iconclass [ICONCLASS, 1997] code.

An important new global attribute, range, is provided. This may be used at any point to state explicitly the span of pages or folios covered by a specific element: in the description of collation, for instance, it may be used to indicate the folio span of a sequence of quires. It may also be used to indicate the position, by folio reference, of a unit smaller than a folio, such as a miniature or decorated initial.
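
The two uses just described can be sketched as follows. The value notation (such as "1r-16v" for a span and "12v" for a single side of a leaf) is assumed for illustration, since the normalized form is not spelled out here; <miniature> is a hypothetical name for one of the <decoration> sub-elements, and all text content is placeholder.

```xml
<description>
  <collation>
    <!-- range gives the folio span covered by this sequence of quires. -->
    <quireformula range="1r-16v">placeholder quire formula</quireformula>
    <!-- range locates a single piece of evidence by folio. -->
    <evidence range="8v">Placeholder: catchword at the end of a quire.</evidence>
  </collation>
  <decoration>
    <!-- A unit smaller than a folio is addressed by the folio
         on which it occurs. -->
    <miniature range="12v">Placeholder description of a miniature.</miniature>
  </decoration>
</description>
```

Because the address lives in an attribute, none of the folios concerned needs to be transcribed for the description to be anchored to them.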

5 Conclusions

We do not imagine that the level of detail proposed in our scheme will be appropriate for all cataloguers of manuscripts, nor that our list of headings embraces all of those needed for all forms of handwritten materials. It lacks, for example, several features needed for cataloguing monumental inscriptions, and probably also several of significance for manuscripts deriving from non-Western traditions. Nevertheless, we believe that the set of extensions presented here forms at least a useful starting point for others wishing to embark on the detailed encoding of comparable collections, and we look forward to the discussion which we hope their presentation in this and other forums will generate. More detailed information about the scheme (including a complete example) is available from the URL http://www.bodley.ox.ac.uk/mss

6 References

Chaucer, Geoffrey. The Wife of Bath's Prologue on CD-ROM. Edited by Peter Robinson. Cambridge, Cambridge University Press, 1996.

Ker, N. R., and Piper, A. J. Medieval Manuscripts in British Libraries. Oxford 1969-92.

ICONCLASS Research and Development Group. The ICONCLASS Home Page (WWW Site). Utrecht, ICONCLASS Research & Development Group, 1997.
