Saturday, October 31, 2009

MARAC Fall 2009, plenary session

The Hudson & Manhattan Railroad Powerhouse, Jersey City, 30 October 2009. Built in 1906-1908, this beautiful Romanesque Revival structure -- which by night looks as if it could serve as a set for a Frankenstein film -- once powered what is now the PATH train system. It ceased operations in 1929 and has stood vacant ever since, and its owner, the Port Authority of New York and New Jersey, recently sought to tear it down and build a parking deck on the site. Owing to the determined efforts of the Jersey City Landmarks Conservancy and other citizens, this remarkably sturdy building has been declared a National Historic Site and is on the verge of redevelopment.

MARAC got off to a roaring start yesterday morning: during the plenary session, Ellen Fleurbaay and Marc Holtman of the Amsterdam City Archives discussed the Archiefbank, the repository's on-demand scanning program, and the institutional changes required to make it work.

The archives, which holds a wide array of municipal government records and other materials documenting the history of the city, experienced substantial declines in in-person visitors during the early 21st century; at the same time, the number of visitors to its Web site increased steadily. Visitor statistics are the measure of Dutch cultural institutions' success, and the archives realized that it needed to reinvent itself in order to survive. To that end, it articulated two main goals:
  • In-person visitors will experience the look and feel of authentic archival documents and the pleasure of doing historical research.
  • Everyone should be able to access all archival collections at home at all times.
In support of the first goal, the archives changed its name and logo, developed a new facility in the city center, created a permanent exhibit, and began offering evening events and weekend hours. It also transformed its research room into a wired "information center" in which people were encouraged to discuss their work with others; this idea intrigues me, but the security-minded archivist and the tranquility-loving researcher within me have a few doubts.

In support of the second, it radically expanded its digitization program. The archives holds more than 20 miles of records -- which would take an estimated 406 years to scan -- but quickly realized that it should first focus on its most heavily used documents.

It also developed a stunning new program that allows users to request scanning of specific records. Online researchers browse the EAD-encoded finding aids in the Archiefbank and, with a simple click of a button, request scanning of specific records. The Archiefbank then generates an order number that is used to track the order throughout the scanning process and to generate file names for the scans. Staff retrieve the records, quickly examine them for copyright and preservation issues, and do some minimal prep work (e.g., removing staples), then convey the materials to a scanning vendor. The resulting images are added to the archives' electronic repository and are then transferred to its Web site. The requester then purchases the scans he or she wants; a researcher who wants materials that have already been scanned and added to the archives' Web site can purchase them instantly. The more scans one purchases, the lower the cost per scan.
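
To make the role of the order number concrete, here is a minimal sketch of how a single identifier could both track an order through the workflow and seed the scan file names. It is my own illustration, not the Archiefbank's actual design: the stage names, the `ORD` prefix, the file-name pattern, and the `ScanOrder` class are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical stage names, paraphrasing the workflow described in the talk.
STAGES = (
    "requested",
    "retrieved",
    "checked",        # copyright and preservation review
    "prepped",        # e.g., staples removed
    "at_vendor",
    "in_repository",
    "online",
)


@dataclass
class ScanOrder:
    sequence: int          # hypothetical running number assigned at request time
    finding_aid_ref: str   # hypothetical reference to the requested file
    stage_index: int = 0

    @property
    def order_number(self) -> str:
        # Assumed format: a fixed prefix plus a zero-padded sequence number.
        return f"ORD{self.sequence:06d}"

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def advance(self) -> str:
        """Move the order to the next stage; stay put at the final stage."""
        if self.stage_index < len(STAGES) - 1:
            self.stage_index += 1
        return self.stage

    def file_name(self, page: int) -> str:
        """Derive a scan file name from the order number and page position."""
        return f"{self.order_number}_{page:04d}.jpg"


order = ScanOrder(sequence=42, finding_aid_ref="example-file")
order.advance()
print(order.order_number, order.stage, order.file_name(1))
```

Because every file name begins with the order number, a scan that comes back from the vendor can be matched to its request without any extra lookup table, which is presumably part of the appeal of the approach.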

The archives aims for a turnaround time of 2-3 weeks and a total of 10,000 scans per week. A distinctive mix of circumstance and policy makes this prodigious activity possible:
  • Dutch law. There is no fee for consulting original records or viewing digital images at the archives, but charges for reproduction are allowed, so the archives can assess fees for scanning materials for online researchers -- and the archives has carefully calibrated its fees so that it breaks even.
  • Focus on legibility, not preservation-quality scanning. Instead of the high-resolution TIFFs produced for preservation/conservation purposes, on-demand scans are created as low-resolution JPEGs. This policy dramatically reduces the archives' storage costs: the cost of storing 1 TB of 300 dpi TIFFs in a digital repository with remote backup is $7,000 per year, but that of storing the equivalent 200 dpi JPEG 4 images is $77.
  • Emphasis on high volume. The archives' in-house scanning facilities support preservation/conservation scanning, and on-demand scanning is outsourced. In order to reduce the amount of manual processing needed, the archives scans entire files, not individual documents; researchers pay only for those scans that they want.
  • An efficient back-office operation. The archives has developed a barcode-driven management system that enables staff to identify precisely where each group of records slated for scanning is located and which current and succeeding tasks are to be performed on each group.
  • A well-developed IT infrastructure. Although Fleurbaay and Holtman didn't emphasize this point, it's pretty evident that without robust and seamlessly integrated systems, high-volume on-demand scanning wouldn't be possible. Image ordering and purchasing functionality meshes neatly with the archives' EAD finding aids, and the archives' document viewer has a built-in filter that enables users to increase contrast -- a real help when inks have faded over time.
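
The storage figures quoted in the legibility bullet are worth a quick back-of-the-envelope check. The sketch below uses only the two dollar amounts from the talk; the function name and the returned fields are my own, and reading the cost ratio as a proxy for file-size ratio assumes that the price per unit of storage is the same for both formats, which the presenters did not state.

```python
TIFF_COST_PER_YEAR = 7000.0  # USD, 1 TB of 300 dpi TIFFs (figure from the talk)
JPEG_COST_PER_YEAR = 77.0    # USD, equivalent images as 200 dpi JPEGs (figure from the talk)


def compare_storage_costs(tiff_cost: float, jpeg_cost: float) -> dict:
    """Return the cost ratio, the annual saving, and the JPEG share of the TIFF cost."""
    return {
        "ratio": tiff_cost / jpeg_cost,
        "annual_saving": tiff_cost - jpeg_cost,
        "jpeg_share": jpeg_cost / tiff_cost,
    }


result = compare_storage_costs(TIFF_COST_PER_YEAR, JPEG_COST_PER_YEAR)
print(f"TIFF storage costs about {result['ratio']:.0f} times as much as JPEG storage")
print(f"Annual saving per TIFF terabyte replaced: ${result['annual_saving']:,.0f}")
print(f"JPEG cost as a share of TIFF cost: {result['jpeg_share']:.1%}")
```

In other words, the JPEG route costs roughly one ninetieth of the TIFF route for the same set of images, which goes a long way toward explaining how a break-even fee structure is feasible.
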
Everyone present was wowed by the Amsterdam City Archives' efforts, which by every measure are a rousing success: visits to the repository have increased five-fold, 15,000 registered online users have requested scans, and after two years of high-volume scanning more than 7 million images are available online.

I have the feeling that just about everyone who attended this presentation is going to devote a lot of time to thinking about how their repositories can emulate the example set by the Amsterdam City Archives. Most of us probably won't be able to establish programs as sophisticated or as large as that of the Amsterdam City Archives -- because we lack the needed IT infrastructure, hold tons of copyrighted or restricted materials, or work in government archives that are legally barred from charging for online access -- but many of us will likely reassess some of our digitization practices and priorities. And that's a good thing.
