
Thursday, June 24, 2010

NYAC/ARTNY: open source

Westbound on the Franklin Delano Roosevelt Mid-Hudson Bridge, late evening, 4 June 2010.

[I wrote this post a while ago, but I spent some time away from home and, for the most part, the Internet. Now that I'm back in cyberspace, the pace of posting is going to pick up a bit.]

While at the recent joint meeting of the New York Archives Conference and Archivists Roundtable of Metropolitan New York, I took part in Session 12, Using Open Source Software. I had a good time putting together and delivering my presentation, and both of my co-presenters were stellar. Although the three of us met only a few minutes before our session began, our presentations meshed well, in large part because all of us approached open source in the same pragmatic fashion; i.e., we largely avoided the open-source-versus-proprietary-software debate and encouraged people to use open source software when doing so met their business needs.

I discussed the Open Source Initiative’s definition of open source software, open source as a model of software development in which programmers work independently and collaboratively to write code and review each other’s work, and open source as a philosophical belief that sharing information and knowledge is good in and of itself and a spur to the development of more information and knowledge. I also detailed the practical advantages (e.g., no cost of acquisition) and disadvantages (e.g., support and technical documentation that may range from excellent to abysmal) and highlighted some open source applications of particular value to archivists working in smaller repositories.
I was really pleased that I got the chance to hear Seth Kauffmann, who spearheaded the development of CollectiveAccess, discuss its development, functionality, and support options. CollectiveAccess is a cataloging and Web presentation system that supports every descriptive standard (and local variant thereof) and that can be used to gather and manage information about people and places as well as digital surrogates of cultural heritage materials. After highlighting PhilaPlace and some other really cool projects powered by CollectiveAccess, he offered some sound words of advice to anyone involved in an archival software development project:
  • Planning is good. This point may seem painfully obvious, but it’s all too often overlooked. (I would hasten to add that “we’ll leave all the technical stuff up to the programmer/vendor because it’s his/her/its job” is not a plan.)
  • Don’t assume that all of your problems can be solved with software or technology.
  • Involve real users in the development process. Seth stressed that one of CollectiveAccess’s strengths is its “community of self-interest development model” -- archivists who have grants to “do something real” drove its creation and guide its evolution.
  • Be realistic about the quality and extent of your existing metadata and digitized resources.
  • Involve archivists at the start of the development process and keep them involved throughout the project. (No argument here.)
Rick Cobello, the Schenectady County Information Technology Director, then detailed how the county is deploying open source software at the enterprise level. Fiscal pressures are forcing governments to choose between cutting operating costs and laying off staff, and open source software will enable the county to reduce costs and provide better services.

At present, roughly half of Schenectady County’s IT budget is devoted to fixing and decontaminating desktop computers. The county is now centralizing almost all of its storage and applications, and most county employees will have only a monitor, keyboard, mouse, and Pano Logic client that will enable them to access the county’s central servers. The county will no longer have to install antivirus protection and update software on desktops, and desktop support staff will be able to focus on other projects. Estimated cost savings: at least 30 percent.

The county attorney’s office is now using OpenOffice.org, and Rick plans to move other county offices to OpenOffice.org after the county’s licensing agreements with Microsoft expire. Although the county is currently using a mixture of open source and proprietary software and will continue to use specialized proprietary software (e.g., geographic information system applications) well into the future, Rick’s ultimate goal is to stop paying licensing fees for any of the software needed to support routine office operations.

Rick emphasized that the county has support contracts for almost all of the open source software that it uses: he believes in supporting organizations that create open source software, and paying for support is less expensive than paying licensing fees. I find this approach both altruistic and smart: in addition to sustaining worthwhile projects, he's helping to ensure that the software he's using will be updated and enhanced.

Public-sector budgets always lag behind the economy. The coming years are going to be extremely tight, and I think that a lot of government IT directors are going to make many of the same decisions that Rick has made -- and look for other ways in which open source, among other things, can save money. It’s going to be really interesting to see just how things pan out.

Sunday, June 6, 2010

NYAC/ARTNY: Archivists' Toolkit

The Hudson River, as seen from the grounds of the Vanderbilt Mansion National Historic Site, Hyde Park, New York, 4 June 2010.

Last week, I attended the joint meeting of the New York Archives Conference (NYAC) and the Archivists Roundtable of Metropolitan New York (ARTNY), which was held at Marist College in Poughkeepsie. Unfortunately, Mac-using attendees discovered upon arrival that, despite Marist’s promises to the contrary, they could not connect to Marist’s wireless network. Now that I’ve reconnected, I’ll put up a couple of posts about the highlights of this year’s conference.

In my view, the best session of the conference was Session 1, “Implementing, Modifying, and Teaching the Archivists' Toolkit.” The Archivists’ Toolkit (AT) is an increasingly popular open source tool that supports accessioning, location management, and description of archival materials, and the session itself attracted a capacity crowd.

Janet Bunde of New York University (NYU) discussed a recent effort to integrate the AT into NYU’s Advanced Archival Description course so that students, who typically lacked the funds needed to attend AT workshops sponsored by the Society of American Archivists, would become familiar with the tool and hone their descriptive skills. The students reviewed the AT user’s manual in advance, then devoted an entire class session to entering sample data into the AT. At the end of the class, students discussed where they entered specific data elements and the descriptive output that resulted. Although the discussion wasn’t as extensive as Bunde would have liked, it shed light on students’ descriptive choices and revealed that, despite the use of some odd terminology, the AT’s interface is relatively intuitive.

Bunde stressed that this exercise didn’t, in and of itself, teach archival description, but it made me think about how to do so. I created a handful of MARC records while working as a student assistant, but I really didn’t feel comfortable with description until I found myself responsible for reviewing MARC records created by archivists at other repositories. I soon acquired an intimate knowledge of MARC and the ability to differentiate between acceptable variations in local practice and out-of-bounds tag usage. I really like the idea of having students openly compare and defend their descriptive choices, and using the AT as a teaching tool has real promise, particularly if, as NYU plans to do this fall, it’s incorporated more fully into the course curriculum.
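Distinguishing acceptable local variation from out-of-bounds tag usage is, at bottom, a matter of checking each record against an agreed-upon set of fields. Here is a minimal sketch of that kind of check; the tag set and the sample record below are made-up illustrations, not the actual review criteria I used:

```python
# Illustrative sketch: flag MARC fields that fall outside an agreed-upon set,
# the sort of "out-of-bounds tag usage" a reviewer might watch for.
# ALLOWED_TAGS is a hypothetical policy, not a real repository's standard.
ALLOWED_TAGS = {"100", "245", "300", "351", "520", "545", "600", "610", "650", "852"}

def out_of_bounds(record_tags):
    """Return, in sorted order, the tags in a record outside the allowed set."""
    return sorted(set(record_tags) - ALLOWED_TAGS)

sample_record = ["100", "245", "300", "520", "999"]  # 999 is a local field
print(out_of_bounds(sample_record))  # → ['999']
```

A real reviewer would, of course, also weigh indicators, subfields, and the content of each field, which a simple tag whitelist can't capture.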

Deena Schwimmer of Yeshiva University discussed how her repository, which has only two professional staffers and few IT resources, used the AT to centralize, as quickly as possible, holdings and descriptive information about its manuscript collections. Working with a clerical assistant, Schwimmer first culled holdings information from donor files and the relatively small number of MARC records describing the collections and entered it into the AT. Then, working in tandem with an intern who created collection-level descriptions, she used the AT to create Encoded Archival Description (EAD) finding aids that contained only the most basic descriptive elements: Biographical/Historical Note, Scope and Content, Abstract, Conditions Governing Access, Conditions Governing Use, Language of Materials, and Title and Date information. She also used the AT to manage the project: she added fields that identified whether an EAD finding aid had been produced and enabled her and her intern to exchange notes about specific collections.
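To make the "most basic descriptive elements" concrete, here is a rough sketch of a collection-level EAD skeleton built in Python. Element names follow EAD 2002 conventions; the sample collection and the exact structure Schwimmer's project produced are assumptions for illustration, not output of the Archivists' Toolkit:

```python
import xml.etree.ElementTree as ET

# A minimal, illustrative EAD 2002 skeleton containing only the basic
# descriptive elements listed above. The collection data is invented.
ead = ET.Element("ead")
archdesc = ET.SubElement(ead, "archdesc", level="collection")
did = ET.SubElement(archdesc, "did")

ET.SubElement(did, "unittitle").text = "Sample Family Papers"        # Title
ET.SubElement(did, "unitdate").text = "1900-1950"                    # Date
ET.SubElement(did, "abstract").text = "Correspondence and diaries."  # Abstract
ET.SubElement(did, "langmaterial").text = "English"                  # Language of Materials

ET.SubElement(archdesc, "bioghist").text = "Biographical/Historical Note."
ET.SubElement(archdesc, "scopecontent").text = "Scope and Content note."
ET.SubElement(archdesc, "accessrestrict").text = "Conditions Governing Access."
ET.SubElement(archdesc, "userestrict").text = "Conditions Governing Use."

print(ET.tostring(ead, encoding="unicode"))
```

Even a skeleton this thin is enough to give every collection a discoverable finding aid, which is exactly the trade-off Schwimmer's project made.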

Schwimmer’s project exemplifies what a single results-minded archivist can do with a well-chosen tool and a little student and clerical help. Before Schwimmer’s project began, approximately a third of Yeshiva’s 2500 linear feet of manuscript holdings had been described, and when the project wrapped up roughly 18 months later, every collection had at least a basic finding aid. I think we’re going to see lots of similar AT success stories during the next few years, and, needless to say, I think that this is a very good thing.

Marisa Hudspeth of the Rockefeller Archive Center (RAC) then discussed how her repository is building a new AT reference module that will both meet its needs and enable it to, via release of the module’s source code and documentation, give back to the archival community. The RAC had been using a proprietary tool that supported patron registration and tracking of duplication services, but moved to the AT because of its robust collections management and descriptive modules. When it became apparent that the AT development team's energies were focused elsewhere, the RAC decided to hire several former team members and build a reference module itself.

When it’s completed, the reference module will perform the following functions:
  • Patron registration: will track research visits, publications, completion of necessary research paperwork, and research awards; and facilitate generation of statistics and reports.
  • Duplication services: will manage all types of requests; create standardized invoices in PDF; store fee schedules and shipping rates and automatically calculate totals; track service requests; generate statistics and reports; and securely manage payment information.
  • Retrievals, bar-coding, and use tracking: will track use of materials by patrons; generate statistics and reports; automate the charge-out procedure using barcoding; add barcoding functionality to the AT’s Accession module; support printing of barcodes and box labels; and enable both archivists and researchers to submit pull requests electronically via clicking on boxes in the RAC’s EAD finding aids.
  • Reference requests and reading room scheduling: will electronically distribute reference requests to staff; allow staff to respond to requests within the AT; store request histories, staff assignments, and responses; generate statistics and reports; and enable archives that have limited research facilities to manage scheduling of research appointments and factor in holiday closings, weather incidents, and other events.
  • Personalized user accounts: will enable patrons to update their contact information; submit reference requests; schedule and cancel research appointments and sign up for waiting lists; receive notifications of closings and research room vacancies; sign up for newsletters and the like; view an orientation video and agree to the RAC’s terms of use; track the status of their duplication requests; review their own request histories; bookmark and comment on finding aids; submit funding paperwork; electronically sign forms; and, if they wish to do so, connect with other researchers.
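The session didn't describe the module's internals, but the patron-registration component might be modeled along the lines sketched below. Every name and field here is hypothetical; this is only meant to show how visit tracking and statistics generation fit together:

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical sketch of the kind of data a patron-registration component
# might track; the RAC's actual schema was not described in the session.
@dataclass
class Patron:
    name: str
    registered: date
    visits: list = field(default_factory=list)        # dates of research visits
    publications: list = field(default_factory=list)  # resulting publications
    paperwork_complete: bool = False                  # required research paperwork

def visit_count(patrons, year):
    """A simple report statistic: total research visits in a given year."""
    return sum(1 for p in patrons for v in p.visits if v.year == year)

patrons = [
    Patron("A. Researcher", date(2010, 1, 5),
           visits=[date(2010, 3, 1), date(2010, 6, 2)]),
    Patron("B. Scholar", date(2009, 11, 20), visits=[date(2009, 12, 1)]),
]
print(visit_count(patrons, 2010))  # → 2
```

The appeal of building this inside the AT, rather than in a standalone database like the sketch above, is that registration and use data sit alongside the accession and description records they refer to.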
At present, the RAC doesn’t know how this reference module will work with ArchivesSpace, which will, when completed, merge the AT and Archon, another open source archival data management system. However, the RAC will release the code and continue using it, even if the module can’t be incorporated into ArchivesSpace.

After this session ended, I was talking to a colleague about the RAC’s work, and we were both struck by the degree to which reference-related information systems remain paper-driven -- not only at our repository but also at many, many others. Our own repository is currently developing some of the functionality that will be included in the reference module (e.g., barcoding and use tracking), but we’re still terribly paper-centric. The RAC’s work ought to help propel the move away from paper, and it’s going to be really interesting to see how this exciting project pans out.

If you are an AT user and want to track reference requests, duplication services, etc., electronically, the RAC is looking for reference module beta testers. The module’s first component -- patron registration -- should be finished within a few weeks, and the entire module has a scheduled completion date of 31 December 2011, so things are moving right along. If you're interested in serving as a beta tester, contact Marisa Hudspeth at mhudspeth-at-rockarch.org.