Friday, November 27, 2009

A new archival use for Twitter

I've been resisting Twitter mightily -- my tolerance for information clutter is pretty low -- but this splendid Twitter-driven initiative just might induce me to take the plunge . . . .

Archivists at Cambridge University's Scott Polar Research Institute are using Twitter to draw attention to the diary that Captain Robert Falcon Scott kept during his ill-fated 1910-1912 Antarctic expedition: every day, they tweet the first 140 characters of the diary entry that Scott penned exactly 99 years before.
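
The mechanics of a daily tweet like this are simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions, not the Institute's actual code: the function names are hypothetical, and the only details taken from the project are the 140-character limit and the 99-year offset.

```python
from datetime import date

TWEET_LIMIT = 140  # character limit described in the post
YEARS_BACK = 99    # entries are tweeted exactly 99 years after they were written


def diary_date_for(today: date) -> date:
    """Return the date of the diary entry to tweet on `today` (hypothetical helper)."""
    try:
        return today.replace(year=today.year - YEARS_BACK)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; this sketch
        # assumes falling back to 28 February is acceptable.
        return today.replace(year=today.year - YEARS_BACK, day=28)


def tweet_text(entry: str) -> str:
    """Return the first 140 characters of a diary entry."""
    return entry[:TWEET_LIMIT]
```

A real implementation would also need to leave room within the limit for the link to the blog, which this sketch ignores.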

Each tweet also includes a link to a blog, Scott's Last Expedition, that provides the full text of the diary entry and archival images. The blog itself is pretty neat, too: the only dates that appear on it are those of the diary entries, and as a result the blog looks as if it were written by Scott himself. The entries thus have an immediacy and intimacy that they would lack if accompanied by scholarly commentary.

The Scott Polar Research Institute plans to launch similar Web 2.0 initiatives focusing on other British polar explorers. More information about the Scott project and the Institute's future plans is available here.

Scholars and archivists have long known that archival records tell compelling stories, but most people tend to think of records as being of interest to history buffs, civil servants, and a few other quiet oddballs. The Scott Polar Research Institute's project promises to present Scott's diary in a way that will grab and hold the attention of 21st-century people, and other repositories certainly hold materials that could be brought to life in the same way. I really want to see lots and lots of archives follow in the Institute's footsteps. I promise I'll follow all your projects on Twitter.

Wednesday, November 25, 2009

Happy Thanksgiving

"World War II-Kids Reenacting Thanksgiving." Image courtesy of the New York State Archives. (New York (State). Dept. of Health. Bureau of Communications Production. Photography Unit. Photographic prints and negatives of department officials, facilities, and activities, ca. 1920-1983. Series 14655-88, Box 58.)

Wherever you are and whatever you're doing, I wish you a happy and safe Thanksgiving. Blogging is probably going to be light during the next few days, so here are a few holiday-related links to tide you over:
  • The U.S. National Archives and Records Administration offers a handy explanation and digitized records that detail how Thanksgiving came to be a national holiday. The process wasn't as straightforward as you might think: for the first two thirds of the 19th century, the day of observance shifted around quite a bit, and in 1939 and 1940 some states celebrated it on the second to last Thursday of November while others waited until the last Thursday of the month. The current day of observance -- the fourth Thursday of November -- was enshrined in federal law in 1941. (Somewhere, someday, someone's going to win a trivia contest because s/he knows this.)
"Thanksgiving Day Greetings." Image courtesy of the New York Public Library. Vintage Holiday Postcards Collection. Mid-Manhattan Library -- Picture Collection. Record ID 1065827. Digital ID 1588308.

Tuesday, November 24, 2009

New York State Archives: data storage advisory and workshop

Earlier this year, a couple of my colleagues put together a brief records advisory highlighting the issues that arise when the day-to-day custody and control of electronic data is outsourced to a third party. Although this advisory is intended for state and local government in New York State, it should be of interest to anyone who is contemplating storing digital resources in a cloud computing environment, a data vault, fee-for-service digital repository, or other environment not under his or her direct control.

If you're interested in learning more about data storage and its relationship to electronic records management and can get to New York's Capital Region on the afternoon of December 15, the New York State Archives is offering a new, non-technical Electronic Data Storage workshop. The workshop is free and open to anyone interested in the subject; however, space is limited and registration is required.

Friday, November 20, 2009

New York State Archives and social networking

The New York State Archives has recently established a presence on a variety of Web 2.0/social networking sites. This effort is still in the testing stages, and there are no doubt a few bugs waiting to be discovered and fixed, but by all means check out the following resources. Several of my colleagues have been working on this initiative for months, and I'm really impressed with the end result. I think you'll like it, too.

New York State Archives on YouTube

One of my colleagues recently oversaw the digitization of some of our analog audiovisual holdings, a representative sampling of which are now available on the New York State Archives' YouTube channel. At present, you'll find iconic "I Love New York" tourism ads from the 1980s, World War II-era civil defense and public health films, and 1950s New York State Thruway Authority films about the Tappan Zee Bridge and how the Thruway and other new freeways would benefit Binghamton residents. You'll also find some 21st-century films that explain what archives do and how researchers can access their holdings and discuss classroom uses of historical records.

New York State Archives on Flickr

At present, there are three photosets highlighting some of the late 19th and early 20th century images in our collections: Winter in New York, Summer in New York, and Honoring the Workers of New York. You'll also find pointers to lots and lots of other images and videos available on our own Web site.

New York State Archives on Twitter
Follow us!

New York State Archives on Facebook
Information about who we are and what we do, upcoming events, new publications and other online resources, and lots of other cool stuff.

If you find any problems, have any questions, or simply want to give a well-deserved thumbs-up to the State Archives staff who created and will continue to develop these resources, please use each site's comments option or e-mail my colleague Michelle Arpey at marpey[at]

Thursday, November 12, 2009

Wanted: Grateful Dead archivist

It's really rare that archival job postings get much attention outside of the archival community, but the University of California-Santa Cruz's recent announcement that it was seeking an archivist to manage its Grateful Dead Archive has gotten more media coverage than the recent confirmation of David Ferriero as the 10th Archivist of the United States. The New York Times, the San Francisco Chronicle, and lots of other media outlets are all over the story.

The Daily Show With Jon Stewart: "Want Ads - Grateful Dead Archivist."

Last night, Jon Stewart (who needs some help with the pronunciation of "archivist") got into the act. As evidenced by the reactions posted to the Daily Show site itself and the Archives and Archivists listserv, reaction is mixed: some archivists think it's hilarious, while others are insulted by the offhand manner in which Stewart dismisses our profession. FWIW, I'm in the former camp. Yeah, the "alphanumerically?" bit is kind of snotty, but this is a man who, upon receiving an honorary doctorate from his alma mater, said: "As a person, I am honored to get it; as an alumnus, I have to say I believe we can do better."

The job itself sounds like a great opportunity for a really high-energy archivist, who will work with approximately 600 linear feet of archival records, news clippings, artifacts, photographs, posters, audio and video recordings, and publications by and about the band and correspondence and art contributed over the years by their fans.

Why do I say "high-energy"? Well, the person who takes this job will be responsible, among other things, for:
  • Developing overarching arrangement and description policies that conform to accepted national standards
  • Developing digitization plans and digital access mechanisms
  • Dealing with rights clearances and permission issues
  • Creating and maintaining ties to the band's fan community and potential donors
  • Providing reference services to academic researchers and members of the general public
  • Curating exhibits and overseeing the loan of materials for exhibit purposes
  • Planning conferences and other events
  • Developing a volunteer/intern program that will tap into the fan community's knowledge and expertise
  • Maintaining the Grateful Dead Archive's Web 2.0 presence
  • Helping to set policies governing the operations of the Department of Special Collections and Archives, of which the Grateful Dead Archive is part
  • Serving on appropriate University Library committees
As the Toronto Star points out, the successful candidate will likely have to perform another job function: "acting as the world's most chilled out bouncer." Ever since the surviving members of the Grateful Dead transferred the materials that comprise the archive to UC Santa Cruz last year, people have been traveling to Santa Cruz in hopes of getting access to the material. However, the Department of Special Collections and Archives -- quite rightly, I think -- is turning people away until they can establish some basic intellectual control over the collection.

What a great challenge -- and what a great opportunity. How many of us have large numbers of people clamoring for access to our holdings? Moreover, despite the stereotypes associated with Grateful Dead fandom, the band's following is drawn from all walks of life. I realize that the new Grateful Dead archivist and his/her colleagues in the Department of Special Collections and Archives are going to be struggling mightily to meet the immediate demands of researchers -- and, in all likelihood, to deal with some ongoing media attention -- but I hope they devote at least a little attention to educating the Grateful Dead Archive's users about the nature and value of archives in general.

The Department of Special Collections and Archives' other holdings, which include 16th-century Italian books, works of art by Lawrence Ferlinghetti and others, mammoth photographic collections, materials relating to the history of feminism, and local history materials, would make a great teaching tool. Even if the users of the Grateful Dead Archive don't actively use any of the other holdings, I'm sure a lot of them would, with a little gentle nudging, grasp the value of preserving and providing access to these materials. Some of them could, with a little more nudging, become effective stakeholders and advocates. I, for one, would love to recruit a few advisory committee members, PAHR pushers, State Historical Records Advisory Board members, and other champions whose taste in casual wear runs toward tie-dyed t-shirts and Birkenstocks.

If you've got a master's degree in library science or archives management, familiarity with established professional standards and the know-how to put them into practice, supervisory experience, expert knowledge of modern American vernacular culture and music, first-rate organizational and communication skills, and lots and lots of energy, this job may be for you. You've got until December 4 to submit your application.

Thursday, November 5, 2009

MARAC Fall 2009: S11, Scattered Treasures: The Stewardship of Private Collections in the 21st Century

Rotunda, City Hall, Jersey City, New Jersey, during the MARAC evening reception, 30 October 2009.

Although it's been years since I've worked with personal papers, I always like hearing about them, and I'm particularly glad I attended this session: one of the papers touched upon an electronic records issue that crossed my mind earlier last week, and the other focused on copyright law, about which I don't know enough.

Unfortunately, one of the presenters, Donna Wells (Moorland-Spingarn Research Center, Howard University), was ill, but session chair Wilda Logan (U.S. National Archives and Records Administration) read her paper, which focused on what can happen when materials unexpectedly leave the custody of creators and their families.

In some instances, the severance of custody is accidental. After Washington, DC photographer Nestor Hernández died in 2006, his father placed a large number of his prints and other materials in a storage cubicle, but didn't tell other family members where the materials were housed. After Hernández's father died unexpectedly a short time afterward, the rent went into arrears and the contents of the cubicle were auctioned off. The devastated family then learned that the buyer was selling the prints at an open-air market for $3.00 apiece. Their only recourse was to assert their intellectual property rights whenever someone attempted to publish one of the images; the sale of the contents of the cubicle was perfectly legit.

In others, surviving relatives or other third parties see personal papers as sources of income. For example, civil rights icon Rosa Parks left her papers to a non-profit that taught civil rights history, but her surviving relatives successfully contested the will and the papers are being sold at auction. Wells has also gotten calls from foreclosure companies wanting to know whether photographs and other materials left behind by evicted residents are valuable; of course, these companies want to recoup lenders' losses, not donate materials to a repository.

These experiences have led Wells to give the following counsel to donors and their families:
  • Know the procedures governing rental of storage facilities and what will happen to your property if the rent falls into arrears, and make sure that someone else knows where your materials are being stored.
  • Give a trusted relative or friend your e-mail, online photo storage, etc., passwords.
  • If you use an online storage service, make sure you know what will happen to your resources if you die.
  • Let a trusted relative know what you want to have happen to your online and physical materials, even if you don't express your wishes in your will.
In an age when the digital equivalents of personal papers are being kept in the cloud, manuscript curators are going to spend a lot of time dealing with the complications of death in the digital era -- and all of us are likely going to feel ethically bound to provide guidance to families who don't want to donate materials but do want to access and preserve resources created by deceased loved ones.

The next presenter, Janet Fries (DrinkerBiddle) is an attorney who specializes in copyright and intellectual property law and who has represented numerous artists, authors, and musicians. Using the fate of Nestor Hernández's prints as a starting point, she furnished an overview of copyright and other laws. I'm going to emphasize just a few of the high points:
  • A person who buys a print does not automatically get the rights associated with the print. When people bought Hernández's prints on the street for $3.00, they didn’t get the right to do anything with these images. When and if the prints are duplicated, the family will be able to discover who has various prints and to assert their rights.
  • Copyright can be transferred via a will or a trust, and the laws of intestacy apply as well.
  • Creators and their heirs have the right to terminate the transfer of copyright: existing law allows the person or entity who undertook the transfer to terminate or renegotiate the right of transfer after 35 years -- even if the transfer document purports to transfer rights in perpetuity. Heirs can also exercise this right. This right is little known and seldom exercised, but it’s really important.
  • Copyrights don’t transfer by accident: handshake agreements aren’t sufficient.
  • Fair use is very helpful but very unpredictable, and there are no hard-and-fast rules; don’t rely on any myths that come your way. The nature of the use has bearing: educational use and commentary are favored, but there are other factors.
  • Repositories need to be aware that granting rights to others has pitfalls. Being in the chain of rights means being in the chain of liability. Repositories may also be vulnerable to charges of contributory and vicarious liability; refusing to make copies for for-profit uses might be a good idea, and making copies contingent upon the user’s securing of a licensing agreement is also a good idea.
  • Rights of publicity governing people depicted in images vary from state to state. Be careful about using images if you lack signed release forms. Other materials may also be covered by this right; in New York State, for example, image, voice, name, and biographical details may not be used for trade or advertising without the express written consent of the person.
Fries also discussed how creators can spare others the Hernández family's experience. She encourages the artists she represents to, among other things, create inventories documenting where their works are stored and the intangible rights (copyright, moral right, trademark, patent, contract) associated with each work and to develop estate plans; in order to help them do so, she's developed a variety of forms that they can use. If an artist doesn’t want members of his or her family to know where works are stored or fears that listing locations will make theft easier, she stresses that telling no one is an illogical extension of a logical premise; information can be shared with a trusted friend, kept in a sealed envelope in a locked drawer, or kept in an attorney's office (but not a safe deposit box!)

I'm really not doing justice to Fries's presentation, which segued nicely into an extended Q&A about various copyright, permissions, and other legal issues. Copyright law is incredibly complex, and Fries excelled at giving us a sense of some of its nuances while dispensing lots of practical advice.

Help save the State Library of Massachusetts!

At a press conference on Thursday, 29 October, the Massachusetts Governor's Office announced that Governor Deval Patrick is considering closing the State Library of Massachusetts as a cost-saving measure. This closure will have a monumental impact on the Commonwealth's cultural heritage.

Open to the public since 1826, the State Library has developed comprehensive collections in the areas of government documents, law, Massachusetts history, and public and current affairs. From the Bradford manuscript "Of Plimoth Plantation" to an ever-expanding digital repository, the holdings of the State Library shed light on the Commonwealth's past, present, and future.

There are several ways you can help:
In addition to doing the above, you Bay Staters out there might also want to consider contacting your legislators.

For the most recent information about the fight to save the State Library, check out its blog and follow Save Your State Library! on Facebook.

Wednesday, November 4, 2009

MARAC Fall 2009: S6, EAD Perspectives at the Institutional, Research, and National Level

Moon over Manhattan, as seen from the Newport, Jersey City esplanade, 4:50 PM, 29 October 2009.

Post corrected 7 November 2009. I was sitting in the very back of the room in which S6 was held, and sometimes had trouble hearing the presenters. I completely misheard a couple of things that Michael Rush said during the start of his presentation, and this post contained some inaccurate information as a result. Thanks to Mike for setting me straight, and apologies all round.

I’m not doing a ton of description these days, but I cut my professional teeth on Machine Readable Cataloging (MARC) records and have lots of colleagues who are still doing a lot of MARC and Encoded Archival Description (EAD) work, so I always make it a point to attend conference sessions relating to description whenever possible. I’m glad I caught this one.

Michele Combs (Syracuse University Special Collections Research Center) opened the session by outlining the internal and external benefits of EAD, technical options for creating and providing access to EAD finding aids, and how her repository has integrated EAD into its workflow. I particularly liked her discussion of SU’s More Product, Less Process (MPLP)-influenced approach to description: Combs and her colleagues create EAD finding aids for new collections during the accessioning process, and they’re tackling the backlog by converting paper finding aids to EAD and using existing MARC records to generate basic EAD finding aids. As a result, every collection gets at least a basic EAD finding aid.

Jeanne Kramer-Smyth (Discovery Communications and, BTW, the force behind Spellbound Blog) discussed ArchivesZ, an information visualization project that uses EAD finding aids from a variety of institutions as a source of structured data. Focusing on subjects, time periods, and linear footage, Kramer-Smyth and her associates normalized the data and decomposed compound subjects into tags; the latter dramatically increases the chances of finding overlapping collections. They also cross-tabulated subjects and time periods to identify the volume of records covering a given subject at a given time.
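
To make the idea concrete, here is a minimal Python sketch of decomposing compound subjects into tags and cross-tabulating them against time periods. It is not the project's code: the sample records are invented, the function names are hypothetical, and splitting headings on "--" and grouping by decade are assumptions made purely for illustration.

```python
from collections import defaultdict

# Invented records standing in for data pulled from EAD finding aids.
collections = [
    {"subjects": ["Women--Suffrage--New York (State)"], "decade": 1910, "linear_feet": 12.0},
    {"subjects": ["Suffrage--United States", "Labor unions"], "decade": 1910, "linear_feet": 3.5},
    {"subjects": ["Labor unions--New York (State)"], "decade": 1930, "linear_feet": 20.0},
]


def decompose(subject):
    """Split a compound subject heading on '--' into lowercase tags (assumed convention)."""
    return [part.strip().lower() for part in subject.split("--") if part.strip()]


def cross_tabulate(records):
    """Sum linear footage for every (tag, decade) pair."""
    totals = defaultdict(float)
    for record in records:
        # Use a set so a tag repeated across a record's headings is counted once.
        tags = {tag for subject in record["subjects"] for tag in decompose(subject)}
        for tag in tags:
            totals[(tag, record["decade"])] += record["linear_feet"]
    return dict(totals)


table = cross_tabulate(collections)
```

In this toy data, the first two collections share no complete subject heading, but once the headings are broken into tags both turn up under "suffrage" for the 1910s, which is the kind of overlap that decomposition is meant to surface.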

This is very cool stuff that promises to open up all kinds of new avenues of access, but Kramer-Smyth and her colleagues have run into a few problems, almost all of which stem from the flexibility inherent in the EAD specification. Each repository that provided finding aids to the ArchivesZ project had its own encoding quirks and particularities, and standardization across certain tags was lacking; for example, some repositories measure quantities of records in linear feet, while others use cubic feet, etc. Some of the finding aids had incomplete subject assignments (e.g., subjects reflected in the collection title aren’t listed as subjects).

Kramer-Smyth emphasized that these problems are fixable: she and others who use EAD as a data source can figure out how to write better code and ask repositories to submit “configuration files” that resolve data inconsistencies (e.g., by explaining local practices regarding quantity/extent information). However, it’s pretty plain that EAD still has a long way to go before it truly transcends institutional boundaries.
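
A rough sketch of how such a "configuration file" might be put to work follows. Everything in it is an assumption: the repository names, the dictionary layout, and especially the conversion factors are placeholders, since each repository would have to supply figures that reflect its own practice.

```python
import re

# Hypothetical per-repository configuration, written here as plain dicts.
# The conversion factors are placeholders, not real equivalences.
REPOSITORY_CONFIG = {
    "repo_a": {"linear feet": 1.0},
    "repo_b": {"cubic feet": 1.0, "boxes": 1.25},
}

# Matches statements such as "12 linear feet" or "3.5 Cubic Feet".
EXTENT_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")


def to_linear_feet(extent, repository):
    """Convert an extent statement to linear feet using the repository's config.

    Returns None when the statement cannot be parsed or the unit is not
    covered by that repository's configuration.
    """
    match = EXTENT_PATTERN.match(extent)
    if not match:
        return None
    quantity = float(match.group(1))
    unit = match.group(2).lower()
    factor = REPOSITORY_CONFIG.get(repository, {}).get(unit)
    return None if factor is None else quantity * factor
```

Returning None rather than guessing keeps unparseable extents visible, so they can be reported back to the contributing repository instead of silently skewing the totals.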

Michael Rush (Beinecke Rare Book & Manuscript Library, Yale University), who is drafting the charge for the soon-to-be reconstituted EAD Working Group, provided a useful overview of the Working Group’s goals and some of the changes that may be incorporated into EAD 3.0:
  • Reduction of mixed content, i.e., mixing of text and tags.
  • Allowing namespace interoperability, i.e., giving implementers the ability to embed MODS, PREMIS, and other XML schemas directly into an EAD finding aid.
  • Improvement of data handling, e.g., getting rid of forward slashes, which are ignored by many programs.
  • Eliminating anything that doesn’t describe the records, e.g., the head and attribute labels used to mark scope and content notes; formatting info should be in stylesheets, not EAD schema!
  • Possibly removing table and list coding and recursive tags.
  • Reining in the diversity of practice, which is a political challenge: people do things a certain way because a given way meets a given need, but this diversity makes it harder to exchange data across institutions or pull EAD data into a database. In an effort to accommodate everyone, the Working Group might come up with a strict EAD and a loose EAD that allows greater diversity of practice.
The Working Group will need volunteers to steer the revision process; if you’re interested, contact him at

Session chair Mark Matienzo (New York Public Library) then asked the panelists a really provocative question: should archivists think of finding aids as documents or as data sources? All three panelists concurred that we need to start seeing finding aids as data sources from which documents, which still have many uses, can be produced as needed; conceptualizing finding aids as documents has led to many of the quirks and inconsistencies that become apparent anytime one looks at multiple institutions’ finding aids. As Michael Rush pointed out, we’ve moved beyond the point at which documents meet our needs. With MPLP and other developments, description is never done, and although we need the capacity to take a snapshot of a given description as it exists at a given point of time, we need to focus more on standardized creation of data over time.

All in all, a phenomenal session that brought to mind my own long-ago (and subsequently back-burnered) realization that the MARC format could be thought of as a highly flexible and repurposable information source, not just a cluster of templates organizing the presentation of various chunks of information. It also called to mind various past efforts to increase the consistency of MARC cataloging across institutions, most of which didn’t pan out. Here’s hoping that past experience, the profession’s increasing comfort and familiarity with databases, etc., and the emergence of new tools that make use of structured descriptive data make it possible to standardize descriptive practice in the EAD era.

Monday, November 2, 2009

MARAC Fall 2009, S1: Solutions to Acquiring and Accessing Electronic Records

Pavonia Arcs, by Robert Pfitzenmeier (2004), Newport, Jersey City, 29 October 2009.

Along with Ricc Ferrante (Smithsonian Institution Archives) and Mark Wolfe (M.E. Grenander Department of Special Collections and Archives, University at Albany), I had the good fortune to participate in this session, which was graciously chaired by Sharmila Bhatia (U.S. National Archives and Records Administration).

Ricc Ferrante discussed the challenges of accessioning and preserving archival e-mail created by employees of the Smithsonian Institution's semi-autonomous museums and research institutes. His experience should resonate with many government and college and university archivists. Until late 2005, the Smithsonian's component facilities used a variety of e-mail applications, and retention guidelines were implemented in 2008. As a result, the archives is both actively soliciting transfers of cohesive groups (i.e., accounts) of documented and backed-up messages at predetermined intervals and passively accepting transfers of older groupings of records in a variety of formats.

Ricc then discussed the processing of these e-mails, which is performed on PC or Mac desktop computers. Incoming transfers are backed up, analyzed and documented, converted to a preservation format, and securely stored. The Smithsonian Institution Archives uses a tool to convert accounts or groupings of messages in formats other than MBOX to the MBOX format, and the Collaborative Electronic Records Project (CERP) parser then converts the MBOX files to an XML-based preservation format. Experimenting with the MBOX conversion tool and the CERP parser has been on my to-do list for some time, so I was really glad I got the chance to hear Ricc discuss these tools.
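
For readers who have never looked inside an MBOX file, the following Python sketch shows the general shape of an MBOX-to-XML conversion using only the standard library. It is emphatically not the CERP parser and does not follow its schema: the element names ("account", "message", and so on) and the function name are made up for illustration, and attachments and multipart messages are ignored.

```python
import mailbox
import xml.etree.ElementTree as ET


def mbox_to_xml(mbox_path):
    """Build a simple XML tree from the messages in an MBOX file.

    The element names are hypothetical and do not match any real
    preservation schema.
    """
    root = ET.Element("account")
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            node = ET.SubElement(root, "message")
            for header in ("From", "To", "Date", "Subject"):
                ET.SubElement(node, header.lower()).text = str(message.get(header, ""))
            # This sketch keeps only single-part bodies; multipart messages
            # get an empty body element.
            payload = None if message.is_multipart() else message.get_payload(decode=True)
            body = payload.decode("utf-8", errors="replace") if payload else ""
            ET.SubElement(node, "body").text = body
    finally:
        box.close()
    return root
```

Calling `ET.ElementTree(mbox_to_xml("account.mbox")).write("account.xml", encoding="utf-8")` would write the result to disk; the file names here are placeholders. A production workflow would also have to handle attachments, character sets, and characters that are not legal in XML.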

Mark Wolfe discussed how the M.E. Grenander Department of Special Collections and Archives is using Google Mini, a modestly priced "plug and play" search appliance that will index up to 300,000 documents, to improve access to its student newspapers. Prior to the installation of Google Mini, a paper card file was the only access mechanism for these publications, and Google Mini has made it possible for staff to find information about people who became prominent well after they left the university (e.g., gay rights activist Harvey Milk, '51), respond quickly to reference inquiries, and enhance access to the newspapers.

Mark also highlighted the shortcomings of Google Mini's indexing of digitized materials. When assigning titles, it looks for the most prominent text on a given page, which in a newspaper may be part of an ad, not a story. Dates are another problem. When sorting search results by date, it homes in on the date the digital file was created, not the date of the scanned original. The former problem can be corrected, albeit with considerable effort, by manually changing the author, title, etc. properties of the files, which are in text-based PDF format. However, the date properties, which help to safeguard the authenticity of born-digital files, cannot easily be changed and thus inhibit date-based access to scanned archival materials. There's been a lot of talk lately about how the management of born-digital and born-again digital materials will eventually converge, but Mark's talk is a good reminder that we're not quite there yet.

My presentation concerned our capture of New York State government sites and the redaction (i.e., removal of legally restricted information from records prior to making them accessible) of electronic records converted to PDF format. In lieu of giving an exhaustive recap, I'll just offer a few words of advice to people contemplating electronic redaction. At present, there are several good tools for redacting PDF files, including the built-in tool bundled with Adobe Acrobat 8 and 9, Redax, and Redact-It. If you are using an older version of Adobe Acrobat and can't or don't want to upgrade or purchase an add-on tool, the National Security Agency has produced a document that outlines a laborious but effective redaction procedure. If you commit to electronic redaction, you need to keep abreast of the relevant legal and digital forensics literature: people are trying to figure out how to crack these tools and techniques and recover redacted information, and one of them may eventually succeed.

There are also several really bad PDF redaction techniques. Never, ever use Adobe Acrobat's Draw or Annotate tool to place black, white, etc. boxes over information you wish to redact. Another spectacularly bad idea: "redacting" a word processing document by changing the font color to white or using a shading or highlighting feature to obscure the text and then converting the document to PDF format.

Want to know why these options are so bad? Read this. And this. And this. And this. And this. And this, too (thanks to John J. @ W&L for drawing my attention to this recent blunder.)