Showing posts with label MARAC Fall 2009. Show all posts

Thursday, November 5, 2009

MARAC Fall 2009: S11, Scattered Treasures: The Stewardship of Private Collections in the 21st Century

Rotunda, City Hall, Jersey City, New Jersey, during the MARAC evening reception, 30 October 2009.

Although it's been years since I've worked with personal papers, I always like hearing about them, and I'm particularly glad I attended this session: one of the papers touched upon an electronic records issue that crossed my mind earlier last week, and the other focused on copyright law, about which I don't know enough.

Unfortunately, one of the presenters, Donna Wells (Moorland-Spingarn Research Center, Howard University) was ill, but session chair Wilda Logan (U.S. National Archives and Records Administration) read her paper, which focused on what can happen when materials unexpectedly leave the custody of creators and their families.

In some instances, the severance of custody is accidental. After Washington, DC photographer Nestor Hernández died in 2006, his father placed a large number of his prints and other materials in a storage cubicle, but didn't tell other family members where the materials were housed. After Hernández's father died unexpectedly a short time afterward, the rent went into arrears and the contents of the cubicle were auctioned off. The devastated family then learned that the buyer was selling the prints at an open-air market for $3.00 apiece. Their only recourse was to assert their intellectual property rights whenever someone attempted to publish one of the images; the sale of the contents of the cubicle was perfectly legit.

In others, surviving relatives or other third parties see personal papers as sources of income. For example, civil rights icon Rosa Parks left her papers to a non-profit that taught civil rights history, but her surviving relatives successfully contested the will and the papers are being sold at auction. Wells has also gotten calls from foreclosure companies wanting to know whether photographs and other materials left behind by evicted residents are valuable; of course, these companies want to recoup lenders' losses, not donate materials to a repository.

These experiences have led Wells to give the following counsel to donors and their families:
  • Know the procedures governing rental of storage facilities and what will happen to your property if the rent falls into arrears, and make sure that someone else knows where your materials are being stored.
  • Give a trusted relative or friend your e-mail, online photo storage, etc., passwords.
  • If you use an online storage service, make sure you know what will happen to your resources if you die.
  • Let a trusted relative know what you want to have happen to your online and physical materials, even if you don't express your wishes in your will.
In an age when the digital equivalents of personal papers are being kept in the cloud, manuscript curators are going to spend a lot of time dealing with the complications of death in the digital era -- and all of us are likely going to feel ethically bound to provide guidance to families who don't want to donate materials but do want to access and preserve resources created by deceased loved ones.

The next presenter, Janet Fries (DrinkerBiddle) is an attorney who specializes in copyright and intellectual property law and who has represented numerous artists, authors, and musicians. Using the fate of Nestor Hernández's prints as a starting point, she furnished an overview of copyright and other laws. I'm going to emphasize just a few of the high points:
  • A person who buys a print does not automatically get the rights associated with the print. When people bought Hernández's prints on the street for $3.00, they didn’t get the right to do anything with these images. When and if the prints are duplicated, the family will be able to discover who has various prints and to assert their rights.
  • Copyright can be transferred via a will or a trust, and the laws of intestacy apply as well.
  • Creators and their heirs have the right to terminate the transfer of copyright: existing law allows the person or entity who undertook the transfer to terminate or renegotiate the right of transfer after 35 years -- even if the transfer document purports to transfer rights in perpetuity. Heirs can also exercise this right. This right is little known and seldom exercised, but it’s really important.
  • Copyrights don’t transfer by accident: handshake agreements aren’t sufficient.
  • Fair use is very helpful but very unpredictable, and there are no hard-and-fast rules; don’t rely on any myths that come your way. The nature of the use has bearing: educational use and commentary are favored, but there are other factors.
  • Repositories need to be aware that granting rights to others has pitfalls. Being in the chain of rights means being in the chain of liability. Repositories may also be vulnerable to charges of contributory and vicarious liability; refusing to make copies for for-profit uses might be a good idea, and making copies contingent upon the user’s securing of a licensing agreement is also a good idea.
  • Rights of publicity governing people depicted in images vary from state to state. Be careful about using images if you lack signed release forms. Other materials may also be covered by this right; in New York State, for example, image, voice, name, and biographical details may not be used for trade or advertising without the express written consent of the person.
Fries also discussed how creators can spare others the Hernández family's experience. She encourages the artists she represents to, among other things, create inventories documenting where their works are stored and the intangible rights (copyright, moral right, trademark, patent, contract) associated with each work and to develop estate plans; in order to help them do so, she's developed a variety of forms that they can use. If an artist doesn’t want members of his or her family to know where works are stored or fears that listing locations will make theft easier, she stresses that telling no one is an illogical extension of a logical premise; information can be shared with a trusted friend, kept in a sealed envelope in a locked drawer, or kept in an attorney's office (but not a safe deposit box!).

I'm really not doing justice to Fries's presentation, which segued nicely into an extended Q&A about various copyright, permissions, and other legal issues. Copyright law is incredibly complex, and Fries excelled at giving us a sense of some of its nuances while dispensing lots of practical advice.

Wednesday, November 4, 2009

MARAC Fall 2009: S6, EAD Perspectives at the Institutional, Research, and National Level

Moon over Manhattan, as seen from the Newport, Jersey City esplanade, 4:50 PM, 29 October 2009.

Post corrected 7 November 2009. I was sitting in the very back of the room in which S6 was held, and sometimes had trouble hearing the presenters. I completely misheard a couple of things that Michael Rush said during the start of his presentation, and this post contained some inaccurate information as a result. Thanks to Mike for setting me straight, and apologies all round.

I’m not doing a ton of description these days, but I cut my professional teeth on Machine Readable Cataloging (MARC) records and have lots of colleagues who are still doing a lot of MARC and Encoded Archival Description (EAD) work, so I always make it a point to attend conference sessions relating to description whenever possible. I’m glad I caught this one.

Michele Combs (Syracuse University Special Collections Research Center) opened the session by outlining the internal and external benefits of EAD, technical options for creating and providing access to EAD finding aids, and how her repository has integrated EAD into its workflow. I particularly liked her discussion of SU’s More Product, Less Process (MPLP)-influenced approach to description: Combs and her colleagues create EAD finding aids for new collections during the accessioning process, and they’re tackling the backlog by converting paper finding aids to EAD and using existing MARC records to generate basic EAD finding aids. As a result, every collection gets at least a basic EAD finding aid.

Jeanne Kramer-Smyth (Discovery Communications and, BTW, the force behind Spellbound Blog) discussed ArchivesZ, an information visualization project that uses EAD finding aids from a variety of institutions as a source of structured data. Focusing on subjects, time periods, and linear footage, Kramer-Smyth and her associates normalized the data and decomposed compound subjects into tags; the latter dramatically increases the chances of finding overlapping collections. They also cross-tabulated subjects and time periods to identify the volume of records covering a given subject at a given time.

This is very cool stuff that promises to open up all kinds of new avenues of access, but Kramer-Smyth and her colleagues have run into a few problems, almost all of which stem from the flexibility inherent in the EAD specification. Each repository that provided finding aids to the ArchivesZ project had its own encoding quirks and particularities, and standardization across certain tags was lacking; for example, some repositories measure quantities of records in linear feet, while others use cubic feet, etc. Some of the finding aids had incomplete subject assignments (e.g., subjects reflected in the collection title aren’t listed as subjects).

Kramer-Smyth emphasized that these problems are fixable: she and others who use EAD as a data source can figure out how to write better code and ask repositories to submit “configuration files” that resolve data inconsistencies (e.g., by explaining local practices regarding quantity/extent information). However, it’s pretty plain that EAD still has a long way to go before it truly transcends institutional boundaries.
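For readers who think in code, here's a rough Python sketch of the kind of normalization and cross-tabulation described above. It is not the ArchivesZ code: the function names, the "--" subject delimiter, the sample collections, and the per-repository conversion factors are all assumptions I've made purely for illustration, and the sketch credits a collection's full extent to every decade it spans, which a real tool would probably want to refine.

```python
from collections import Counter

# Hypothetical per-repository "configuration file" contents: the unit each
# repository uses for extent, plus an assumed factor for converting that
# unit to linear feet. The factors are placeholders, not real values.
REPOSITORY_CONFIG = {
    "repo_a": {"unit": "linear feet", "to_linear_feet": 1.0},
    "repo_b": {"unit": "cubic feet", "to_linear_feet": 1.0},
}


def decompose_subject(heading):
    """Split a compound subject heading into lower-cased tags."""
    return [part.strip().lower() for part in heading.split("--") if part.strip()]


def normalize_extent(repository, quantity):
    """Convert a reported quantity to linear feet using the repository's config."""
    return quantity * REPOSITORY_CONFIG[repository]["to_linear_feet"]


def decade_of(year):
    """Return the first year of the decade containing the given year."""
    return year - year % 10


def cross_tabulate(collections):
    """Sum normalized extent for every (tag, decade) pair a collection covers."""
    totals = Counter()
    for coll in collections:
        extent = normalize_extent(coll["repository"], coll["quantity"])
        tags = {tag for heading in coll["subjects"] for tag in decompose_subject(heading)}
        decades = range(decade_of(coll["start_year"]), decade_of(coll["end_year"]) + 1, 10)
        for tag in tags:
            for decade in decades:
                totals[(tag, decade)] += extent
    return totals


# Two made-up collections from two made-up repositories.
sample = [
    {"repository": "repo_a", "quantity": 12.0,
     "subjects": ["Women--Suffrage--New York (State)"],
     "start_year": 1905, "end_year": 1921},
    {"repository": "repo_b", "quantity": 3.0,
     "subjects": ["Suffrage--United States"],
     "start_year": 1915, "end_year": 1919},
]
totals = cross_tabulate(sample)
print(totals[("suffrage", 1910)])
```

Because the compound headings are broken into tags, the two sample collections overlap on "suffrage" in the 1910s even though their full subject strings never match, which is the payoff Kramer-Smyth described.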

Michael Rush (Beinecke Rare Book & Manuscript Library, Yale University), who is drafting the charge for the soon-to-be reconstituted EAD Working Group, which is charged with revising EAD, provided a useful overview of the Working Group’s goals and some of the changes that may be incorporated into EAD 3.0:
  • Reduction of mixed content, i.e., mixing of text and tags.
  • Allowing namespace interoperability, i.e., allowing implementers to embed MODS, PREMIS, and other XML schemas directly into an EAD finding aid.
  • Improvement of data handling, e.g., getting rid of forward slashes, which are ignored by many programs.
  • Eliminating anything that doesn’t describe the records, e.g., the head and attribute labels used to mark scope and content notes; formatting info should be in stylesheets, not EAD schema!
  • Possibly removing table and list coding and recursive tags.
  • Reining in the diversity of practice, which is a political challenge: people do things a certain way because a given way meets a given need, but this diversity makes it harder to exchange data across institutions or pull EAD data into a database. In an effort to accommodate everyone, the Working Group might come up with a strict EAD and a loose EAD that allows greater diversity of practice.
The Working Group will need volunteers to steer the revision process; if you’re interested, contact Rush at michael.rush-at-yale.edu.

Session chair Mark Matienzo (New York Public Library) then asked the panelists a really provocative question: should archivists think of finding aids as documents or as data sources? All three panelists concurred that we need to start seeing finding aids as data sources from which documents, which still have many uses, can be produced as needed; conceptualizing finding aids as documents has led to many of the quirks and inconsistencies that become apparent anytime one looks at multiple institutions’ finding aids. As Michael Rush pointed out, we’ve moved beyond the point at which documents meet our needs. With MPLP and other developments, description is never done, and although we need the capacity to take a snapshot of a given description as it exists at a given point in time, we need to focus more on standardized creation of data over time.

All in all, a phenomenal session that brought to mind my own long-ago (and subsequently back-burnered) realization that the MARC format could be thought of as a highly flexible and repurposable information source, not just a cluster of templates organizing the presentation of various chunks of information. It also called to mind various past efforts to increase the consistency of MARC cataloging across institutions, most of which didn’t pan out. Here’s hoping that past experience, the profession’s increasing comfort and familiarity with databases, etc., and the emergence of new tools that make use of structured descriptive data make it possible to standardize descriptive practice in the EAD era.

Monday, November 2, 2009

MARAC Fall 2009, S1: Solutions to Acquiring and Accessing Electronic Records

Pavonia Arcs, by Robert Pfitzenmeier (2004), Newport, Jersey City, 29 October 2009.

Along with Ricc Ferrante (Smithsonian Institution Archives) and Mark Wolfe (M.E. Grenander Department of Special Collections and Archives, University at Albany), I had the good fortune to participate in this session, which was graciously chaired by Sharmila Bhatia (U.S. National Archives and Records Administration).

Ricc Ferrante discussed the challenges of accessioning and preserving archival e-mail created by employees of the Smithsonian Institution's semi-autonomous museums and research institutes. His experience should resonate with many government and college and university archivists. Until late 2005, the Smithsonian's component facilities used a variety of e-mail applications, and retention guidelines were implemented in 2008. As a result, the archives is both actively soliciting transfers of cohesive groups (i.e., accounts) of documented and backed-up messages at predetermined intervals and passively accepting transfers of older groupings of records in a variety of formats.

Ricc then discussed the processing of these e-mails, which is performed on PC or Mac desktop computers. Incoming transfers are backed up, analyzed and documented, converted to a preservation format, and securely stored. The Smithsonian Institution Archives uses a tool to convert accounts or groupings of messages in formats other than MBOX to the MBOX format, and the Collaborative Electronic Records Project (CERP) parser then converts the MBOX files to an XML-based preservation format. Experimenting with the MBOX conversion tool and the CERP parser has been on my to-do list for some time, so I was really glad I got the chance to hear Ricc discuss these tools.
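To give a sense of what the second conversion step involves, here's a minimal Python sketch that reads an MBOX file with the standard library's mailbox module and writes each message out as XML. To be clear, this is not the CERP parser and it does not produce the CERP preservation schema: the element names ("account", "message", and so on) and the function name are my own assumptions, and the sketch only copes with plain, single-part messages.

```python
import mailbox
import xml.etree.ElementTree as ET


def mbox_to_xml(mbox_path):
    """Return an XML string with one <message> element per message in the MBOX file."""
    root = ET.Element("account")  # hypothetical element name, not the CERP schema
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            node = ET.SubElement(root, "message")
            for header in ("From", "To", "Date", "Subject"):
                # Missing headers become empty elements rather than being skipped.
                ET.SubElement(node, header.lower()).text = str(msg.get(header, ""))
            body = ET.SubElement(node, "body")
            # Only plain, single-part messages are handled in this sketch;
            # multipart messages and attachments would need real processing.
            body.text = "" if msg.is_multipart() else msg.get_payload()
    finally:
        box.close()
    return ET.tostring(root, encoding="unicode")
```

Calling mbox_to_xml on the path of an MBOX file returns the XML as a string. A real preservation workflow would also have to deal with attachments, character encodings, and fixity information, none of which this sketch attempts.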

Mark Wolfe discussed how the M.E. Grenander Department of Special Collections and Archives is using Google Mini, a modestly priced "plug and play" search appliance that will index up to 300,000 documents, to improve access to its student newspapers. Prior to the installation of Google Mini, a paper card file was the only access mechanism for these publications, and Google Mini has made it possible for staff to find information about people who became prominent well after they left the university (e.g., gay rights activist Harvey Milk, '51), respond quickly to reference inquiries, and enhance access to the newspapers.

Mark also highlighted the shortcomings of Google Mini's indexing of digitized materials. When assigning titles, it looks for the most prominent text on a given page, which in a newspaper may be part of an ad, not a story. Dates are another problem. When sorting search results by date, it homes in on the date the digital file was created, not the date of the scanned original. The former problem can be corrected, albeit with considerable effort, by manually changing the author, title, etc. properties of the files, which are in text-based PDF format. However, the date properties, which help to safeguard the authenticity of born-digital files, cannot easily be changed and thus inhibit date-based access to scanned archival materials. There's been a lot of talk lately about how the management of born-digital and born-again digital materials will eventually converge, but Mark's talk is a good reminder that we're not quite there yet.

My presentation concerned our capture of New York State government sites and the redaction (i.e., removal of legally restricted information from records prior to making them accessible) of electronic records converted to PDF format. In lieu of giving an exhaustive recap, I'll just offer a few words of advice to people contemplating electronic redaction. At present, there are several good tools for redacting PDF files, including the built-in tool bundled with Adobe Acrobat 8 and 9, Redax, and Redact-It. If you are using an older version of Adobe Acrobat and can't or don't want to upgrade or purchase an add-on tool, the National Security Agency has produced a document that outlines a laborious but effective redaction procedure. If you commit to electronic redaction, you need to keep abreast of the relevant legal and digital forensics literature: people are trying to figure out how to crack these tools and techniques and recover redacted information, and one of them may eventually succeed.

There are also several really bad PDF redaction techniques. Never, ever use Adobe Acrobat's Draw or Annotate tool to place black, white, etc. boxes over information you wish to redact. Another spectacularly bad idea: "redacting" a word processing document by changing the font color to white or using a shading or highlighting feature to obscure the text and then converting the document to PDF format.

Want to know why these options are so bad? Read this. And this. And this. And this. And this. And this, too (thanks to John J. @ W&L for drawing my attention to this recent blunder.)

Saturday, October 31, 2009

MARAC Fall 2009, plenary session

The Hudson & Manhattan Railroad Powerhouse, Jersey City, 30 October 2009. Built in 1906-1908, this beautiful Romanesque Revival structure -- which by night looks as if it could serve as a set for a Frankenstein film -- once powered what is now the PATH train system. It ceased operations in 1929 and has stood vacant ever since, and its owner, the Port Authority of New York and New Jersey, recently sought to tear it down and build a parking deck on the site. Owing to the determined efforts of the Jersey City Landmarks Conservancy and other citizens, this remarkably sturdy building has been declared a National Historic Site and is on the verge of redevelopment.

MARAC got off to a roaring start yesterday morning: during the plenary session, Ellen Fleurbaay and Marc Holtman of the Amsterdam City Archives discussed the Archiefbank, the repository's on-demand scanning program, and the institutional changes required to make it work.

The archives, which holds a wide array of municipal government records and other materials documenting the history of the city, experienced substantial declines in in-person visitors during the early 21st century; at the same time, the number of visitors to its Web site increased steadily. Visitor statistics are the measure of Dutch cultural institutions' success, and the archives realized that it needed to reinvent itself in order to survive. To that end, it articulated two main goals:
  • In-person visitors will experience the look and feel of authentic archival documents and the pleasure of doing historical research.
  • Everyone should be able to access all archival collections at home at all times.
In support of the first goal, the archives changed its name and logo, developed a new facility in the city center, developed a permanent exhibit, and offered evening events and weekend hours. It also transformed its research room into a wired "information center" in which people were encouraged to discuss their work with others; this idea intrigues me, but the security-minded archivist and the tranquility-loving researcher within me have a few doubts.

In support of the second, it radically expanded its digitization program. The archives holds more than 20 miles of records -- which would take an estimated 406 years to scan -- but quickly realized that it should first focus on its most heavily used documents.

It also developed a stunning new program that allows users to request scanning of specific records. Online researchers browse the EAD-encoded finding aids in the Archiefbank and, with a simple click of a button, request scanning of specific records. The Archiefbank then generates an order number that is used to track the order throughout the scanning process and to generate file names for the scans. Staff retrieve the records, quickly examine them for copyright and preservation issues, and do some minimal prep work (e.g., removing staples), then convey the materials to a scanning vendor. The resulting images are added to the archives' electronic repository, and are then transferred to its Web site. The requester then purchases the scans s/he wants; if a researcher wants materials that have already been scanned and added to the archives' Web site, he or she can purchase them instantly. The more scans one purchases, the lower the cost per scan.

The archives aims for a turnaround time of 2-3 weeks and a total of 10,000 scans per week. A distinctive mix of circumstance and policy makes this prodigious activity possible:
  • Dutch law. There is no fee for consulting original records or viewing digital images at the archives, but charges for reproduction are allowed, so the archives can assess fees for scanning materials for online researchers -- and the archives has carefully calibrated its fees so that it breaks even.
  • Focus on legibility, not preservation-quality scanning. Instead of the high-resolution TIFFs produced for preservation/conservation purposes, on-demand scans are created as low-resolution JPEGs. This policy dramatically reduces the archives' storage costs: the cost of storing 1 TB of 300 dpi TIFFs in a digital repository with remote backup is $7,000 per year, but that of storing equivalent 200 dpi JPEG 4 images is $77.
  • Emphasis on high volume. The archives' in-house scanning facilities support preservation/conservation scanning, and on-demand scanning is outsourced. In order to reduce the amount of manual processing needed, the archives scans entire files, not individual documents; researchers pay only for those scans that they want.
  • An efficient back-office operation. The archives has developed a barcode-driven management system that enables staff to identify precisely where each group of records slated for scanning is located and which current and succeeding tasks are to be performed on each group.
  • A well-developed IT infrastructure. Although Fleurbaay and Holtman didn't emphasize this point, it's pretty evident that without robust and seamlessly integrated systems, high-volume on-demand scanning wouldn't be possible. Image ordering and purchasing functionality meshes neatly with the archives' EAD finding aids, and the archives' document viewer has a built-in filter that enables users to increase contrast -- a real help when inks have faded over time.
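The storage figures in the list above are worth a moment of arithmetic. This tiny Python sketch uses only the two annual costs quoted in the presentation; the variable and function names are mine, and it assumes the two figures really do describe the same set of images.

```python
# Annual storage costs quoted in the plenary: 1 TB of 300 dpi TIFFs versus
# the equivalent images stored as 200 dpi JPEGs.
TIFF_COST_PER_YEAR = 7000
JPEG_COST_PER_YEAR = 77


def cost_ratio(tiff_cost, jpeg_cost):
    """Return how many times more TIFF storage costs than JPEG storage."""
    return tiff_cost / jpeg_cost


ratio = cost_ratio(TIFF_COST_PER_YEAR, JPEG_COST_PER_YEAR)
annual_savings = TIFF_COST_PER_YEAR - JPEG_COST_PER_YEAR
print(f"TIFF storage costs about {ratio:.0f} times as much; JPEGs save ${annual_savings:,} per year")
# prints: TIFF storage costs about 91 times as much; JPEGs save $6,923 per year
```

In other words, for the same images the archives pays roughly one ninety-first of what preservation-quality storage would cost.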
Everyone present was wowed by the Amsterdam City Archives' efforts, which by every measure are a rousing success: visits to the repository have increased five-fold, 15,000 registered online users have requested scans, and after two years of high-volume scanning more than 7 million images are available online.

I have the feeling that just about everyone who attended this presentation is going to devote a lot of time to thinking about how their repositories can emulate the example set by the Amsterdam City Archives. Most of us probably won't be able to establish programs as sophisticated or as large as that of the Amsterdam City Archives -- because we lack the needed IT infrastructure, hold tons of copyrighted or restricted materials, or work in government archives that are legally barred from charging for online access -- but many of us will likely reassess some of our digitization practices and priorities. And that's a good thing.

Friday, October 30, 2009

MARAC Fall 2009, day one

Midtown Manhattan, as seen from the Newport, Jersey City esplanade, 30 October 2009, 9:35 PM.

Today was a really full day: Ellen Fleurbaay and Marc Holtman of the Amsterdam City Archives delivered a knockout plenary presentation about their repository's on-demand scanning program, and I attended a great session focusing on new developments relating to Encoded Archival Description and a poignant and informative session about new challenges to the survival of personal papers.

I took part in a morning session relating to acquiring and providing access to electronic records. My co-presenters, Ricc Ferrante from the Smithsonian Institution Archives and Mark Wolfe from the University at Albany's M.E. Grenander Department of Special Collections and Archives, did a great job. I was also pretty pleased about how my session turned out, even though I started coming down with a cold yesterday and darn near lost my voice midway through my presentation.

Owing to said cold, I'm turning in early. Look for after-the-fact recaps over the next few days . . . .

Thursday, October 29, 2009

Greetings from Jersey City

View from the 10th floor, Westin Jersey City Newport, 29 October 2009

The Mid-Atlantic Regional Archives Conference (MARAC) is holding its Fall 2009 Meeting here in Jersey City, so a couple of colleagues and I took the train from Albany yesterday. My colleagues took an excellent tour of the Statue of Liberty and Ellis Island, and I was supposed to tour the Federal Reserve Bank of New York and Trinity Church. Unfortunately, I messed up my back last week, and my doctor and physical therapist told me to resume activity gradually and to avoid overexerting myself.

However, they also told me not to baby myself too much, so I did explore the immediate area around the conference hotel, the Westin Jersey City Newport.

The Newport neighborhood, a large, modern "mixed use community," sits on the western bank of the Hudson River. As the sign above notes, the area has a lengthy and storied history. For much of the 19th and 20th centuries, the area was home to a mammoth Erie-Lackawanna Railroad yard, warehouses, and port facilities that facilitated the transfer of goods to and from the trains. From the 1950s onward, the completion of the Interstate Highway System and the resulting competition from trucks rendered the rail yard redundant, and the area fell into decline. By the 1970s, it was generally abandoned. It was redeveloped in the 1980s, and it's now home to a carefully planned mix of apartment towers, office buildings, retail outlets, eateries, and green space.

It's a little too new for my taste: apart from the buildings of the Newport Yacht Club and Marina and ventilation towers for the Holland Tunnel, none of the buildings are more than 30 years old. However, I can see why people want to live here. It's a very walkable neighborhood, and it's surprisingly tranquil.

It also has spectacular views of western Manhattan, and Newport's developers have capitalized upon the setting by building a six-mile long esplanade along the river. I ate lunch while sitting on one of the many benches that line the esplanade, and was treated to a stellar view of a Holland Tunnel ventilation tower, the Empire State Building, and the Chrysler Building.

I could also see (most of) the Manhattan Municipal Building and, of course, the boats and ships that were traveling up and down the river. During my time on the esplanade, I saw large, ocean-going vessels, commuter and tourist ferries, and even a few kayakers enjoying a sunny fall day on the river.

Some parts of the esplanade are particularly picturesque . . . even if the lighthouse is a recent decorative addition.


Many people choose to live in Newport because it is a transportation hub. Ferry service at the Hoboken Terminal is readily accessible via the esplanade, and PATH, New Jersey Transit, and Hudson-Bergen Light Rail trains also serve the area. Given the plethora of stores and essential services within walking distance and the wealth of available public transportation options, one really doesn't need a car.


And, of course, Newport's inhabitants look out their windows and see sights like this . . . .