Metadata Schema Development

Problem to be addressed: To function as part of the digital library, digital content needs to be wrapped in what the OAIS Reference Model calls an “information package.” Current wrapper formats do not provide suitable ways of documenting interactive fiction and games at the bit level: specifically, they fail to provide the “representation information” needed to map the raw bits into higher-level data constructs. For highly complex, interactive objects such as hypertext fiction and games, inadequate representation information will severely hamper preservation.

There have been a variety of recent efforts to develop XML-based ‘wrapper’ formats that bundle all of the content and metadata for a digital object in a single package. Notable examples include FOXML, METS, MPEG-21 DIDL, MXF and XFDU. Within the digital library community, a great deal of emphasis has been placed on the need for these wrapper formats to play the role of an “information package” as described in the Open Archival Information System (OAIS) Reference Model (see Figure 1, in Appendix B)1. However, none of the packaging formats developed to date can fully support the OAIS notion of an information package for interactive fiction or games.

A critical component of the OAIS information package is “representation information,” the set of information necessary to interpret binary data. Without representation information, a digital object is essentially an undecipherable string of ones and zeros. All of the packaging standards mentioned above provide facilities for referencing representation information, but this makes them highly dependent on external schemas. Unfortunately, the schemas developed for recording representation information during the past several years do not yet provide the level of documentation necessary to understand digital file formats fully.

To date, because much of the digital library community’s work has focused on documents with fairly simple digital representations, and because the community has a strong bias towards open, standard formats, this limitation has not emerged as a critical flaw. In the long term, however, the simplifying assumption that digital content will be simple rather than complex and will be produced in open formats is dangerous. This is especially true for highly interactive virtual worlds such as interactive fiction and games, which are very likely to come in proprietary rather than open formats, and whose producers have competitive disincentives to document them.
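To make the dependency on external representation information concrete, the following is a minimal sketch in Python of a METS-style wrapper whose only technical metadata is a pointer to an external description. The element and attribute names (amdSec, techMD, mdRef, mdWrap, fileSec, FLocat, structMap) come from the METS schema; the file name story.z5, the example.org URL, the OTHERMDTYPE label "REPINFO" and both helper functions are assumptions introduced purely for illustration and are not part of any project deliverable.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Return a METS tag name in ElementTree's {namespace}tag form."""
    return f"{{{METS_NS}}}{tag}"


def build_package(file_href, repinfo_href):
    """Build a tiny METS-style wrapper (hypothetical helper).

    The single content file is described only by an mdRef pointer to
    representation information held outside the package.
    """
    mets = ET.Element(q("mets"))

    # Administrative metadata: a reference, not embedded metadata.
    amd = ET.SubElement(mets, q("amdSec"))
    tech = ET.SubElement(amd, q("techMD"), {"ID": "TECH1"})
    ET.SubElement(tech, q("mdRef"), {
        "LOCTYPE": "URL",
        "MDTYPE": "OTHER",
        "OTHERMDTYPE": "REPINFO",  # assumed label, not a registered type
        f"{{{XLINK_NS}}}href": repinfo_href,
    })

    # The content file, linked to its technical metadata via ADMID.
    file_sec = ET.SubElement(mets, q("fileSec"))
    grp = ET.SubElement(file_sec, q("fileGrp"))
    f = ET.SubElement(grp, q("file"), {"ID": "FILE1", "ADMID": "TECH1"})
    ET.SubElement(f, q("FLocat"), {
        "LOCTYPE": "URL",
        f"{{{XLINK_NS}}}href": file_href,
    })

    # METS requires a structural map, even a trivial one.
    struct = ET.SubElement(mets, q("structMap"))
    div = ET.SubElement(struct, q("div"))
    ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})
    return mets


def externally_described_files(mets):
    """List IDs of files whose technical metadata is referenced (mdRef)
    but never embedded (mdWrap) in the package itself."""
    external = {
        t.get("ID")
        for t in mets.iter(q("techMD"))
        if t.find(q("mdRef")) is not None and t.find(q("mdWrap")) is None
    }
    return [
        f.get("ID")
        for f in mets.iter(q("file"))
        if set(f.get("ADMID", "").split()) & external
    ]


package = build_package("story.z5", "https://example.org/repinfo/z-machine")
print(ET.tostring(package, encoding="unicode"))
print(externally_described_files(package))
```

In this sketch the wrapper itself is well formed, yet the file it carries can be interpreted only for as long as the external description remains available and sufficiently detailed, which is the limitation described above.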

Proposed activities and deliverables

To help ensure the longevity of these complex works, the project will build upon existing work in metadata wrappers and develop structures that allow us to record the complete set of Representation Information and Preservation Description Information needed to support long-term preservation.

Since it is unclear whether migration, emulation or a combination of both will best enable these objects to survive, we will seek to create or elaborate metadata standards that support both approaches. We anticipate that successful completion of this task will require:

  • new schemas to capture technical metadata and other representation information for the data formats included in our case studies;
  • new schemas for describing Context Information for digital objects;
  • new schemas for preserving complex interactive user behavior;
  • new schemas for structural metadata to encode interactive fiction;
  • a set of suggested elaborations of existing wrapper formats (along with recommendations on use practices) to allow for complete support of representation information.

The project will demonstrate the use of the new schemas and the inclusion of Representation Information through a revised version of the METS format and/or development of the Electronic Literature Organization’s (ELO) proposed X-Lit format.

