Wednesday, November 4, 2009

MARAC Fall 2009: S6, EAD Perspectives at the Institutional, Research, and National Level

Moon over Manhattan, as seen from the Newport, Jersey City esplanade, 4:50 PM, 29 October 2009.

Post corrected 7 November 2009. I was sitting in the very back of the room in which S6 was held, and sometimes had trouble hearing the presenters. I completely misheard a couple of things that Michael Rush said during the start of his presentation, and this post contained some inaccurate information as a result. Thanks to Mike for setting me straight, and apologies all round.

I’m not doing a ton of description these days, but I cut my professional teeth on Machine Readable Cataloging (MARC) records and have lots of colleagues who are still doing a lot of MARC and Encoded Archival Description (EAD) work, so I make it a point to attend conference sessions relating to description whenever possible. I’m glad I caught this one.

Michele Combs (Syracuse University Special Collections Research Center) opened the session by outlining the internal and external benefits of EAD, technical options for creating and providing access to EAD finding aids, and how her repository has integrated EAD into its workflow. I particularly liked her discussion of SU’s More Product, Less Process (MPLP)-influenced approach to description: Combs and her colleagues create EAD finding aids for new collections during the accessioning process, and they’re tackling the backlog by converting paper finding aids to EAD and using existing MARC records to generate basic EAD finding aids. As a result, every collection gets at least a basic EAD finding aid.

Jeanne Kramer-Smyth (Discovery Communications and, BTW, the force behind Spellbound Blog) discussed ArchivesZ, an information visualization project that uses EAD finding aids from a variety of institutions as a source of structured data. Focusing on subjects, time periods, and linear footage, Kramer-Smyth and her associates normalized the data and decomposed compound subjects into tags; the latter dramatically increases the chances of finding overlapping collections. They also cross-tabulated subjects and time periods to identify the volume of records covering a given subject at a given time.
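For readers who think better in code, here is a rough sketch of what decomposing compound subjects into tags and cross-tabulating them against time periods might look like. It is my own illustration, not ArchivesZ code: the sample records, the `to_tags` and `cross_tabulate` names, the assumption that compound headings are joined with "--", and the use of decades as time periods are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data, NOT taken from ArchivesZ or any real finding aid.
records = [
    {"subjects": ["Women--Suffrage--New York (State)"], "decade": 1910, "linear_feet": 12.0},
    {"subjects": ["Suffrage--United States", "Labor unions"], "decade": 1910, "linear_feet": 3.5},
    {"subjects": ["Labor unions--New York (State)"], "decade": 1930, "linear_feet": 20.0},
]


def to_tags(heading):
    """Split a compound heading on '--' (assumed delimiter) and lowercase each part."""
    return {part.strip().lower() for part in heading.split("--") if part.strip()}


def cross_tabulate(records):
    """Sum linear feet for every (tag, decade) pair found in the records."""
    totals = defaultdict(float)
    for record in records:
        # Collect each tag once per record so a collection is not double-counted.
        tags = set()
        for heading in record["subjects"]:
            tags |= to_tags(heading)
        for tag in tags:
            totals[(tag, record["decade"])] += record["linear_feet"]
    return totals


totals = cross_tabulate(records)
print(totals[("suffrage", 1910)])  # 15.5
```

Note that the first two sample records share no complete heading, yet both surface under the "suffrage" tag once the headings are split, which is the overlap-finding effect described above.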

This is very cool stuff that promises to open up all kinds of new avenues of access, but Kramer-Smyth and her colleagues have run into a few problems, almost all of which stem from the flexibility inherent in the EAD specification. Each repository that provided finding aids to the ArchivesZ project had its own encoding quirks and particularities, and standardization across certain tags was lacking; for example, some repositories measure quantities of records in linear feet, while others use cubic feet, etc. Some of the finding aids had incomplete subject assignments (e.g., subjects reflected in the collection title aren’t listed as subjects).

Kramer-Smyth emphasized that these problems are fixable: she and others who use EAD as a data source can figure out how to write better code and ask repositories to submit “configuration files” that resolve data inconsistencies (e.g., by explaining local practices regarding quantity/extent information). However, it’s pretty plain that EAD still has a long way to go before it truly transcends institutional boundaries.
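To make the “configuration file” idea a little more concrete, here is a minimal sketch of how a repository-supplied mapping might be used to normalize extent statements. Everything in it is an assumption on my part: the repository names, the `CONFIGS` structure, the `extent_in_linear_feet` helper, and especially the conversion factors, which are placeholders rather than real archival equivalences.

```python
import re

# Hypothetical per-repository configuration: each repository declares how its
# local extent units convert to linear feet. Factors are illustrative only.
CONFIGS = {
    "repo_a": {"linear feet": 1.0, "linear ft.": 1.0},
    "repo_b": {"cubic feet": 1.0, "cu. ft.": 1.0, "boxes": 0.5},
}

# A number, whitespace, then the unit label (e.g. "12.5 Linear Feet").
EXTENT_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")


def extent_in_linear_feet(repository, extent):
    """Return the extent in linear feet, or None if it cannot be resolved."""
    match = EXTENT_PATTERN.match(extent)
    if match is None:
        return None
    quantity = float(match.group(1))
    unit = match.group(2).lower()
    factor = CONFIGS.get(repository, {}).get(unit)
    if factor is None:
        # Unknown repository or unit: flag it rather than guess.
        return None
    return quantity * factor


print(extent_in_linear_feet("repo_a", "12.5 Linear Feet"))  # 12.5
print(extent_in_linear_feet("repo_b", "10 boxes"))          # 5.0
```

Returning `None` for anything the configuration doesn’t cover keeps unresolved extents visible instead of silently folding bad numbers into the totals.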

Michael Rush (Beinecke Rare Book & Manuscript Library, Yale University), who is drafting the charge for the soon-to-be reconstituted EAD Working Group, provided a useful overview of some of the changes that may be incorporated into EAD 3.0:
  • Reduction of mixed content, i.e., mixing of text and tags.
  • Allowing namespace interoperability, i.e., allowing implementers to embed MODS, PREMIS, and other XML schemas directly into an EAD finding aid.
  • Improvement of data handling, e.g., getting rid of forward slashes, which are ignored by many programs.
  • Eliminating anything that doesn’t describe the records, e.g., the head and attribute labels used to mark scope and content notes; formatting info should be in stylesheets, not EAD schema!
  • Possibly removing table and list coding and recursive tags.
  • Reining in the diversity of practice, which is a political challenge: people do things a certain way because a given way meets a given need, but this diversity makes it harder to exchange data across institutions or pull EAD data into a database. In an effort to accommodate everyone, the Working Group might come up with a strict EAD and a loose EAD that allows greater diversity of practice.
The Working Group will need volunteers to steer the revision process; if you’re interested, contact him at

Session chair Mark Matienzo (New York Public Library) then asked the panelists a really provocative question: should archivists think of finding aids as documents or as data sources? All three panelists concurred that we need to start seeing finding aids as data sources from which documents, which still have many uses, can be produced as needed; conceptualizing finding aids as documents has led to many of the quirks and inconsistencies that become apparent anytime one looks at multiple institutions’ finding aids. As Michael Rush pointed out, we’ve moved beyond the point at which documents meet our needs. With MPLP and other developments, description is never done, and although we need the capacity to take a snapshot of a given description as it exists at a given point in time, we need to focus more on standardized creation of data over time.

All in all, a phenomenal session that brought to mind my own long-ago (and subsequently back-burnered) realization that the MARC format could be thought of as a highly flexible and repurposable information source, not just a cluster of templates organizing the presentation of various chunks of information. It also called to mind various past efforts to increase the consistency of MARC cataloging across institutions, most of which didn’t pan out. Here’s hoping that past experience, the profession’s increasing comfort and familiarity with databases, etc., and the emergence of new tools that make use of structured descriptive data make it possible to standardize descriptive practice in the EAD era.

1 comment:

Mike Rush said...

To clarify, I have been asked to draft a charge for the reconstituted EAD Working Group. No members and no chair have been appointed yet. My comments were all preliminary, and though drawn from many conversations, they were my own and not to be ascribed to the EAD Working Group or its successor. I'd appreciate it if this otherwise wonderful session recap could be revised so as not to give the false impression that I am "heading up" the new EAD group.

-Mike Rush (michael.rush[at]