Monday, October 31, 2011

Happy Halloween

New York State Governor Hugh Carey greeting a young guest at a Halloween party at the Executive Mansion, 30 October 1981. New York (State). Governor. Public information photographs, 1910-1992. Series 13703-83, Box 11, No. 020. Image courtesy of the New York State Archives.

Sorry for the light posting lately. After the 2011 Best Practices Exchange -- about which I'm not done posting -- ended, I headed to Ohio to spend a few days with my parents and took a bit of a break from the Internet. After I got back to Albany, I spent a few days digging out from under the mass of work that accumulated in my absence. Now that I've had a little time to recover, you'll see things perk up around here.

In the coming days, you'll see a couple of additional posts about the Best Practices Exchange, at least one post concerning the Barry Landau-Jason Savedoff case, and some other tidbits. However, today I want simply to wish you a happy Halloween and to share with you the above photograph, which was taken thirty years and one day ago and is found within my employer's holdings, and the recording below, which is of less certain provenance but was produced on 28 October 1940 by a San Antonio, Texas radio station. It brings together Orson Welles and H.G. Wells, who discuss Welles's infamous radio adaptation of Wells's War of the Worlds, which aired exactly seventy-three years and one day ago; you'll also hear a few words about Citizen Kane, which was released approximately six months after this conversation took place. Enjoy.

Saturday, October 22, 2011

BPE 2011: tidbits and lessons learned

The 2011 Best Practices Exchange (BPE) wrapped up about an hour ago, and I am both invigorated and completely exhausted. I’ll be posting more about the BPE either over the next few days or late next week (I may be without Internet access for a few days), but I wanted to take this opportunity to share a few insights and lessons learned.

In yesterday’s morning plenary session, Doug Robinson of the National Association of State Chief Information Officers provided an overview of the challenges that state CIOs are facing. Those of us who work in state government need to be aware of the issues that these important partners face:
  • At present, 25 state and territorial CIOs are new appointments. Another 7 CIOs are serving in an acting capacity, and several new governors (Cuomo among them) have yet to appoint a CIO. About half of the CIOs report directly to the governor, but a growing number of states are moving the position of CIO out of the cabinet and placing it under the supervision of a budget or procurement director. The CIOs of roughly half the states are currently situated within a budget or procurement office.
  • State CIOs come in from and return to the private sector, serve at the pleasure of the governor, and are in office for an average of 21 months. In contrast, private sector CIOs are in office for more than four years.
  • State CIOs are more interested in rationalizing and centralizing IT services than in introducing new technology. Their budgets are being cut, the budgets of the agencies that pay user fees for IT services are being cut, and they are very focused on saving money by consolidating and streamlining services. They’re also starting to explore sharing services across states. However, they’re pushing against agency resistance to consolidation, lack of a shared enterprise technology vision, the persistence of large legacy systems, outmoded and cumbersome IT procurement processes, and a host of other concerns. Many of them also have more accountability than authority.
  • State CIOs have identified electronic records management/digital preservation, authentication of data, social media, and the presence of state data on mobile devices as pressing concerns, but they’re not doing much about them -- in large part because they’re not sure what they should do. Archivists and records managers should seize the opportunity to help CIOs address these issues.
  • Many state CIOs are also moving to private clouds, especially for e-mail, and this has records management implications: in many instances, individual users are getting very large inboxes, and many cloud contracts specify that service providers must destroy messages after a set period of time and certify that the destruction was carried out properly. It also has workforce implications: at present, states are accustomed to having one IT staffer administer roughly 20 servers. Google and other cloud service providers are accustomed to having one staffer oversee approximately 1,000 servers, and they’re always seeking greater efficiencies.
  • Other possible points of archivist/records manager-CIO intersection include enterprise architecture and policy, IT consolidation, shared services, and demands for government openness and transparency.
Some other interesting tidbits and lessons learned include:
  • In the past six years, state archives have accessioned over a million cubic feet of paper records. Agency consolidations, staff reductions, and budget cuts have propelled many agencies to push paper records out of storage space they pay for, and state archives are struggling to find space for masses of incoming records.
  • The Uniform Electronic Legal Material Act, a piece of model legislation governing how states authenticate, preserve, and provide access to electronic copies of the state constitution, session laws, codified laws or statutes, state agency rules with the effect of law, and, optionally, court rules and decisions, state administrative agency decisions, and other legal material, will soon be the subject of a lot of state-level deliberation. Archivists and records managers must be actively involved in these discussions.
  • Redaction remains a bottleneck and a burden. We’ve made considerable strides in automating the processing of electronic records and making them accessible shortly after we accession them. Even though new search tools that facilitate identification and redaction of legally restricted or classified information are starting to appear, we’re still doing a lot of record-by-record review, particularly when less cut-and-dried forms of information (e.g., attorney work product) are present. In an era in which citizens and oversight bodies increasingly expect that records will be made accessible quickly, this is a challenge.
  • Adobe Acrobat X is currently creating regular PDFs that don’t fully conform to the published PDF technical documentation; at least one digital collection’s validation tool consistently rejects PDFs created with Acrobat X. PDF/A files created with Acrobat X do conform to the published specification.

Friday, October 21, 2011

BPE 2011: emerging trends

The 2011 Best Practices Exchange (BPE) proceeds apace, and today I’m going to focus upon yesterday’s plenary session, which featured Leslie Johnston, the Director of Repository Development at the Library of Congress (LC). Johnston devoted a lot of time to discussing ViewShare, LC’s new visualization and metadata augmentation tool, but I’ll discuss ViewShare in a forthcoming post about some of the new tools discussed at this year’s BPE. Right now, I want simply to furnish an overview of her exhilarating and somewhat unsettling assessment of the changing environment in which librarians and archivists work:
  • Users do not use digital collections in the same way as they use paper collections, and we cannot guess how digital collections will be used. For example, LC assumed that researchers would want textual records, but a growing number of researchers want image files of textual records.
  • Until recently, stewardship organizations have talked about collections, series, etc., but not data. Data is not just generated by satellites, experiments, or surveys; publications and archival records also contain data.
  • We also need to start thinking in terms of “Big Data.” The definition of Big Data -- what cannot be easily manipulated with common tools or managed and stewarded by any one institution -- is rather fluid, but we need to start thinking in these terms. We also need to be aware that Big Data may have commercial value, as evidenced by the increasing interest of commercial firms in the data found in our holdings.
  • More and more, researchers want to use collections as a whole and to mine and organize the collections in novel ways. They use algorithms to do so and new tools that create visual images that transform data into knowledge. For example, the Digging into Data project examined ways in which many types of information, including images, film, sound, newspapers, maps, art, archaeology, architecture, and government records, could be made accessible to researchers. One researcher wanted to digitally mine information from millions of digitized newspaper pages and see whether doing so can enhance our understanding of the past. LC’s experience with archiving Web sites also underscores this point. LC initially assumed that researchers would browse through the archived sites. However, researchers want access to all of the archived site files and to use scripts to search for the information they want. They don’t want to read Web pages. Owing to the large size of our collections, the lack of good tools, and the permissions we secured when LC crawled some sites, this is a challenge.
  • The sheer volume of the electronic data cultural stewardship organizations need to keep is a challenge. LC has acquired the Twitter archive, which currently consists of 37 billion individual tweets and will expand to approximately 50 billion tweets by year’s end. The archive grows by 6 million tweets an hour. LC is struggling to figure out how best to manage, preserve, and provide comprehensive access to this mass of data, which researchers have already used to study the geographic spread of the dissemination of news, the spread of epidemics, and the transmission of new uses of language.
  • We have to switch to a self-serve model of reference services. Growing numbers of researchers do not want to come to us, ask questions of us, and then use our materials in our environment. They want to find the materials they need and then pull them out of our environment and into their own workspaces. We need to create systems and mechanisms that make it easy for them to do so. As a result, we need to figure out how to support real-time querying of billions of full-text items and the frequent downloading by researchers of collections that may be over 200 TB each. We also need to think about providing tools that support various forms of collection analysis (e.g., visualization).
  • We can’t be afraid of cloud computing. Given the volumes of data coming our way and mounting researcher demands for access to vast quantities of data, the cloud is the only feasible mechanism for storing and providing access to the materials that will come our way. We need to focus on developing authentication, preservation, and other tools that enable us to keep records in the cloud.
There’s lots and lots of food for thought here -- including a few morsels that will doubtless induce indigestion in more than a few people -- and it’s just a taste of what’s coming our way. If we don’t come to terms with at least some of these changes, we as a profession will really suffer in the coming years. Let's hope that we have the will and the courage to do so.
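The Twitter archive figures cited above can be sanity-checked with a little arithmetic. The sketch below is only a rough projection: it assumes that the 37 billion figure was current on the day of the plenary session (20 October 2011) and that the archive keeps growing at a constant 6 million tweets an hour, neither of which Johnston stated outright. The function name projected_tweets is made up for illustration.

```python
from datetime import datetime


def projected_tweets(current: float, per_hour: float,
                     start: datetime, end: datetime) -> float:
    """Project archive size at `end`, assuming constant hourly growth from `start`."""
    hours = (end - start).total_seconds() / 3600
    return current + per_hour * hours


if __name__ == "__main__":
    talk = datetime(2011, 10, 20)     # assumed date of the 37 billion count
    year_end = datetime(2012, 1, 1)   # midnight at the close of 2011
    total = projected_tweets(37e9, 6e6, talk, year_end)
    print(f"Projected archive size: {total / 1e9:.1f} billion tweets")
```

Under those assumptions the projection comes out at a little over 47 billion tweets, which is in the same neighborhood as the "approximately 50 billion" estimate; a growth rate that keeps climbing would close the gap.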

A bottle of locally brewed Kentucky Bourbon Barrel Ale at Alfalfa Restaurant, Lexington, Kentucky, 20 October 2011. I highly recommend both the ale and the restaurant, but please note that Kentucky Bourbon Barrel Ale is approximately 8 percent alcohol. Just like the BPE, it's a little more intoxicating than one might expect.

Thursday, October 20, 2011

BPE 2011: ERA and the move to the cloud

This week, I’m spending a little time with my parents in Ohio and at the 2011 Best Practices Exchange (BPE) in Lexington, Kentucky. The BPE, which brings together state government, academic, and other archivists and librarians and other people seeking to preserve state government information of enduring value, is my favorite archival conference. The Society of American Archivists annual meeting is always first-rate, but it’s gotten a little overwhelming, and I love the Mid-Atlantic Regional Archives Conference (MARAC), but nothing else has the small size, tight focus on state government records, informality, and openness that characterize the BPE.

Before I start detailing today’s highlights, I should say a few things about the content of these posts. For the past few years, those of us who have attended the BPE have tried to adhere to the principle that “what happens at BPE, stays at BPE.” This doesn’t mean that we don’t share what we’ve learned at the BPE (hey, I’m blogging about it!), but it does mean that we’re sensitive to the fact that candor is both essential and risky. The BPE encourages people to speak honestly about how and why projects or programs went wrong and what they learned from the experience. Openness of this sort is encouraging; all too often, we think that we’re alone in making mistakes. It's also helpful: pointing out hidden shallows and lurking icebergs helps other people avoid them. However, sometimes lack of senior manager commitment, conflicts with IT personnel, and other internal problems contribute to failure, and colleagues and supervisors occasionally regard discussion of internal problems as a betrayal. As a result, BPE attendees should exercise some discretion, and those of us who blog about the BPE should be particularly careful; our posts are a single Web search away. For that reason, in a few instances I may write about the insights and observations that attendees have shared but obscure identifying details.

Moving on to this year's BPE itself, I'm going to devote the rest of this post to the insights and predictions offered up by U.S. National Archives and Records Administration (NARA) Chief Information Officer Mike Wash, who spoke this morning about the Electronic Records Archives (ERA), NARA’s complex, ambitious, and at times troubled electronic records system, and some changes that are on the horizon.

At present, ERA sort of works: staff use it to take in, process, and store electronic records. It currently holds approximately 130 TB of data. The Office of Management and Budget wants NARA to take in 10 TB of data per quarter, and NARA is working with agencies to meet this benchmark. However, ERA lacks an integrated access mechanism, and it contains multiple modules. The Base module handles executive agency data, the EOP module handles presidential records (and includes some internal access mechanisms), the Classified module holds classified records, and several other modules were built to deal with specific problems.

Building ERA taught NARA several lessons:
  • Solution architecture is critical. ERA’s multiple modules are a sign of a failed system architecture. Anyone building such a system must carefully consider the business and technical architecture during the planning stage and must manage the architecture closely over time.
  • The governance process must be clear and should start with business stakeholders. What do they really need the system to do, and how do you ensure that everyone stays on the same page throughout the process? Information technology invariably challenges control and authority, but if you set up your governance process properly, you should be able to retain control over system development.
  • Overcommunicate. Funders and other powerful groups need frequent updates; failure to keep feeding information to them can be profoundly damaging.
  • You must manage the project. The federal government tends to hire contractors to develop IT systems, and contractor relationships tend to deteriorate about six months after the contract is awarded. Most federal agencies cede authority to contractors because they are loath to be seen as responsible in the event that a project fails, but staying in control of the project increases your chances that you'll get the system you want.
  • Watch costs closely. Cost-escalating provisions have a way of sneaking into contracts.
  • Be mindful of intellectual property issues. The federal government typically reserves the right to all intellectual property created as a result of contracts, but this doesn’t always happen, and the vendor that built the first iteration of ERA has asserted that it controls some of the technology that now makes the system work; NARA will be much more assertive in working with future ERA vendors.
Wash also made some intriguing observations about some of the challenges that NARA and other archives are confronting:
  • At present, our ability to acquire data is constrained by bandwidth. It takes more than three days to convey 20 TB of data over a 1 Gbps data line and at least a month to convey it via the Internet. NARA recently took custody of 330 TB of 2010 Census data, and it did so by accepting a truckload of hardware; at present, there are no alternatives to this approach.
  • The rate of data creation continues to accelerate. The administration of George W. Bush created 80 TB of records over the course of 8 years, but the Obama administration likely created more than 80 TB of data during its first year.
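The bandwidth point above is easy to check with a back-of-the-envelope calculation. The sketch below is mine, not something Wash presented: the function name transfer_days is invented, terabytes are treated as decimal (10^12 bytes), and the 60 percent efficiency figure is simply an assumption standing in for protocol overhead and link sharing.

```python
def transfer_days(terabytes: float, gigabits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move `terabytes` over a link of the given nominal speed.

    `efficiency` is the assumed fraction of the nominal rate actually
    achieved (1.0 means the full line rate with no overhead).
    """
    bits = terabytes * 1e12 * 8                          # decimal TB -> bits
    usable_bits_per_second = gigabits_per_second * 1e9 * efficiency
    return bits / usable_bits_per_second / 86_400        # seconds -> days


if __name__ == "__main__":
    print(f"20 TB at full line rate:   {transfer_days(20, 1):.2f} days")
    print(f"20 TB at 60% efficiency:   {transfer_days(20, 1, 0.6):.2f} days")
    print(f"330 TB at 60% efficiency:  {transfer_days(330, 1, 0.6):.1f} days")
```

At the full nominal rate, 20 TB would take a bit under two days, so the "more than three days" figure is consistent with a link that delivers roughly 60 percent of its rated speed in practice. On the same assumption, the 330 TB Census transfer would tie up a 1 Gbps line for well over a month, which helps explain the truckload of hardware.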
Wash indicated that NARA is starting to think that federal records should be created and maintained in a cloud computing environment and that transfer of custody from the creating agency to NARA should be effected by changing some of the metadata associated with the records being transferred.

Wash noted that the move to cloud computing will bring to the fore new preservation and authentication concerns. It also struck me that the transition that Wash envisions assumes the existence of a single federal government cloud that has adequate storage, security, and access controls and that, at least at this time, many states aren’t yet thinking of constructing such environments. Individual state agencies may be thinking of moving to the cloud, but most states don't seem to be preparing to move to a single, statewide cloud environment. Moreover, owing to its sheer size, the federal government is better able to negotiate favorable contract terms than state or local governments; the terms of service agreements that the feds hammered out with various social media providers are an excellent example. I have the uneasy feeling that some governments will accept, out of lack of knowledge, desperate financial straits, or inability to negotiate optimal terms, public cloud service contracts that prove problematic or outright disastrous.

It's nonetheless apparent that government computing will move into the cloud, that this transition offers both new challenges and new opportunities for managing and preserving records, and that archivists and records managers are going to have to come to grips with these changes. The next decade promises to be most interesting.

The Lexington Laundry Company building on West Main Street, Lexington, Kentucky, 20 October 2011. This little gem was built ca. 1929, is an outstanding example of Art Deco architecture in the city, and is part of Lexington's protected Downtown Commercial District. It now houses an art gallery.

Wednesday, October 12, 2011

Drew University employee sentenced

Last week, a federal judge sentenced 20-year-old William Scott, the former Drew University student employee who admitted to stealing 31 historical documents from the university's United Methodist Archives Center, to three years of probation and three hundred hours of community service.

When I first learned of Scott's sentence, I was a little steamed. In my opinion, just about anyone convicted of stealing cultural heritage materials deserves to spend at least a little time in a correctional facility; I might be willing to make exceptions for people who steal to feed their families or to pay for lifesaving medical treatment for a loved one, but that's about it.

I nonetheless recognize that imprisonment is expensive and that incarcerating a young, non-violent offender who does not have a prior criminal record might not be the best use of our limited resources. Moreover, the sentencing judge and prosecuting U.S. attorney clearly wanted to make sure that Mr. Scott will have ample cause and opportunity to reflect upon his misdeeds. While on probation, Mr. Scott:
  • Must adhere to a 9:00PM curfew. (Most people would find such a curfew restrictive, but such restrictions are particularly painful for younger adults such as Mr. Scott, who once described himself as a night person who enjoys partying.)
  • Is barred from working any job that would give him access to cultural heritage materials.
  • Must write a monthly letter to the court describing the progress of his life.
  • Must write to each of the 72 people who submitted character letters to the court on his behalf and explain what his experience of theft, prosecution, and conviction has taught him.
  • May list his 300 hours of community service on his resume only if he specifies that said community service was court-ordered.
If I were 20 years old and forced to adhere to these conditions for 3 years, I suspect that the thought of spending 6 or 12 months in a minimum-security facility might seem like a reasonable alternative . . . .

Of the 31 documents that Mr. Scott stole, 30 have been recovered. The missing item is the two-sided second page of a letter that Charles Wesley wrote in 1755. United Methodist Archives Center staff scanned the Wesley letters in its holdings some time before Scott arrived on campus. If you ever come across an incomplete, double-sided document bearing Charles Wesley's autograph (stranger things have happened), you can compare it to the digital images of Wesley family letters that the repository contributed to the American Theological Library Association's Cooperative Digital Resources Initiative.

Thursday, October 6, 2011

Former NARA employee pleads guilty to theft

On Tuesday, Leslie Charles Waffen, a career U.S. National Archives and Records Administration (NARA) archivist who ultimately became head of its Motion Picture, Sounds and Video Recording Branch, pleaded guilty to stealing at least 955 NARA-held sound recordings worth approximately $30,000. Dozens of boxes of recordings were found when officials raided his home, but he sold others on eBay using the account name "hi-fi_gal."

The Baltimore Sun reports that the investigation into Waffen's criminal activities cost approximately $48,000. The sale that led investigators to swoop in -- a 1937 audio recording of baseball legend Babe Ruth -- netted him $34.74.

As an archivist who has lived through a major internal theft, I have immense sympathy for all of the NARA employees whose lives have been turned upside down as a result of Waffen's illicit activities, which came to light last October and will continue to affect NARA's operations for years to come. Internal theft leaves in its wake powerful feelings of outrage, betrayal, and humiliation, and it takes a long time for those emotions to become manageable. Some of my colleagues have said that it took about a year after my former co-worker's theft came to light for them to come to grips with our experience, and some of us (myself included) needed even more time. All of us will carry the experience with us throughout the remainder of our lives and our careers; if you look at the membership roster of the Society of American Archivists' Security Roundtable, you'll note the presence of a healthy contingent of New York State Archives employees.

Moreover, internal theft always prompts -- as it should -- changes in security procedures and protocols. It's not unusual to understand intellectually the need for these changes while at the same time resenting the ways in which they make it harder to do one's job. I'm a big proponent of improving security in archival repositories -- as evidenced by numerous past posts on this blog -- but every now and then I can't help but blame my thieving former co-worker for some minor security-related inconvenience.

I realize that the above statements may seem a bit gloomy, but I do want to say to any current or former NARA employee who reads this blog that things will get better. You and your employer will both come to terms with this experience, and you will eventually adjust to the "new normal," whatever it may be. It won't happen quickly or easily, but it will happen.

Leslie Charles Waffen will be sentenced on 5 March 2012. As noted in his plea agreement, he faces a maximum sentence of ten years in prison and a maximum fine of $250,000.