Text Encoding Initiative
Tenth Anniversary User Conference

The TEI as metadata package?

Dan Greenstein
Arts and Humanities Data Service
King's College London

This paper grows out of work recently carried out by the Arts and Humanities Data Service (http://ahds.ac.uk/) and the UK Office for Library and Information Networking (http://ukoln.ac.uk/) on metadata for resource discovery, that is, the descriptive data which is supplied for information resources to facilitate their location or discovery by interested users. That work was built on the following assumptions: that scholars require access to information about relevant materials irrespective of where, how (e.g. as books, audio tapes, digital objects), or by whom (e.g. librarians, data archivists, museum curators) they are stored, and regardless of the manner in which they are described or catalogued. They want to query any number of information systems in parallel, and in this respect require a framework which will allow resource discovery across particular subject, curatorial, regional and other domains; a framework that will facilitate meaningful integrated access to the intellectual record generally.

The work resulted in recommendations about "Dublin-Core" style metadata which we feel will support this kind of "cross-domain" resource discovery (http://ahds.ac.uk/public/metadata/discovery.html). But it is on other aspects of the research that this paper will dwell, notably on the prospects for using the TEI to integrate resource description practices which emerge as de facto or de jure standards amongst particular groups of information specialists. Several such standards were identified in the course of the AHDS/UKOLN work, which assembled expert workshops to identify the resource discovery requirements for the following information resources:

Each workshop followed a similar investigative programme which included:

Together they revealed three things of note. Firstly, how far specialists operating within each of the "domains" had gravitated toward consensus with regard to their own information description requirements. Secondly, the extent to which they had adopted an SGML formalism to express those requirements. Thirdly, that a common superset of metadata could be identified and agreed which satisfied resource discovery requirements across these domains.

Having identified that superset, the AHDS and UKOLN have turned naturally to investigating integrating mechanisms. These investigations currently extend to the use of Z39.50-enabled information systems which can exploit the commonality provided by the unifying metadata so far identified and enable scholars to search seamlessly across a range of differently structured information collections. A further investigation remains to be conducted, notably into providing a unifying syntax for representing both domain-specific metadata packages and their common elements. Whether and how the TEI can provide such an integrating mechanism is a useful starting point for this particular research.
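
The idea of searching differently structured collections through a common metadata superset can be illustrated with a minimal sketch. It is not the AHDS/UKOLN system and does not implement Z39.50; the domain names, the field names (main_title, object_name, shelfmark and so on) and the sample records are all hypothetical, chosen only to show how a per-domain mapping onto Dublin Core-style elements (title, creator, subject) might let one query run across several domains.

```python
# Hypothetical crosswalks: domain-specific field names mapped onto
# Dublin Core-style common elements. None of these names come from the paper.
CROSSWALKS = {
    "library": {"main_title": "title", "author": "creator", "topic": "subject"},
    "museum": {"object_name": "title", "maker": "creator", "classification": "subject"},
}


def to_common(domain, record):
    """Project a domain record onto the shared elements, dropping unmapped fields."""
    mapping = CROSSWALKS[domain]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def search(collections, element, term):
    """Return (domain, common_record) pairs whose element contains term (case-insensitive)."""
    hits = []
    for domain, records in collections.items():
        for record in records:
            common = to_common(domain, record)
            if term.lower() in common.get(element, "").lower():
                hits.append((domain, common))
    return hits


# Invented sample records, each described in its own domain's vocabulary.
collections = {
    "library": [
        {"main_title": "A History of Weaving", "author": "J. Smith",
         "topic": "textiles", "shelfmark": "X1"},
    ],
    "museum": [
        {"object_name": "Hand loom", "maker": "Unknown",
         "classification": "Textiles; tools"},
    ],
}

results = search(collections, "subject", "textiles")
for domain, common in results:
    print(domain, common["title"])
```

In this sketch the domain-specific fields that fall outside the common superset (the hypothetical shelfmark, for instance) are simply dropped from the search view. A unifying syntax of the kind the paper goes on to consider would instead need to carry both the common elements and the domain-specific package together.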

