Well, here it is, a mere ten days late: my final MARAC Spring 2009 post. I think I’m going back to the daily post style that I used at SAA 2008 -- unless, of course, anyone out there has a better idea . . . .
The last session I attended highlighted the many different ways in which archives are using wikis. I learned a few things about the varied uses to which wikis can be put . . . and a few things about why my own experiences with them have been less than satisfactory.
Kate Colligan outlined her use of a wiki to support the University of Pittsburgh's processing of the records (1887-1973) of the Allegheny County (Pa.) Coroner. Approximately 30 people, most of them undergraduate interns, ultimately participated in this project, which involved the flattening, rehousing, and indexing of approximately 220,000 trifolded documents.
In order to sustain the interns’ interest in the project and satisfy the writing component of their internships, Colligan created the Coroner Case File Documentation Wiki. This wiki allowed the interns to share in real time interesting things they found within the records, add descriptive tags, supply file arrangement information, and document their responses to files concerning murders, suicides, and accidents. Colligan also gave students research assignments that broke up the monotony of (and sometimes disrupted) processing, and this research is reflected in the wiki’s detailed timeline of life in Pittsburgh.
Colligan concluded that when working with wikis, immediacy is a more important goal than perfect writing and presentation. One should also have a clear sense of one’s target readership. In the final analysis, the core readership of this wiki seems to have been the project staffers themselves; however, the wiki has been discussed in genealogical chat rooms and has gotten a fair amount of international traffic.
Finally, Colligan noted that the creation of the wiki means that the preservation issues associated with this project have grown to encompass digital materials. She isn’t sure what the future holds for this wiki, but it has survived a recent migration from an older version of the wiki software (PBWiki) to a newer one (PBWorks).
Jean Root Green succinctly discussed the Binghamton University Libraries’ internal staff wiki. The wiki (created with MediaWiki) has been in place since 2005, and its unveiling was accompanied by a lot of staff training and the development of style guides, templates, and resources that made it easier for staff to use the wiki appropriately. She stressed that the careful planning that went into the development of the wiki and its supporting materials has been crucial to its success: even people who generally aren’t comfortable with technology feel at ease using it.
The wiki enables staff to discuss internal matters candidly and collaborate on policy and other documents, and it automatically records and tracks changes. It has pages for all projects, committees, task forces, etc., and includes documentation for and links to additional information about all of the libraries’ information technology systems. In addition, it enables staff to publicize collections internally and to post reports about conference sessions and other professional development events that they have attended.
David Anderson detailed how George Washington University’s Special Collections Research Center used MediaWiki to create the George Washington University and Foggy Bottom Historical Encyclopedia. Unlike paper encyclopedias, which fade from consciousness soon after publication, this encyclopedia is online, constantly updated, and frequently consulted.
Work on the encyclopedia began in 2006, when Anderson created templates and instructions for adding content, and to this day it adheres more closely to the traditional scholarly model of encyclopedia production than to the interactive Wikipedia model: two editors initially oversaw the development of the encyclopedia, and Anderson now serves as the gatekeeper for all additions and revisions. I suspect that Anderson and his colleagues were drawn to MediaWiki not because it can incorporate user-generated content but because it’s free and easy to use.
Scanned documents, articles written by faculty, staff, and students, timelines, and other materials are regularly added to the encyclopedia. At this time, there are 2,910 items in the database and 648 legitimate content pages; each photo is counted as a separate page, hence the discrepancy. There have been over 2 million page views to date. The most popular pages are the main page, the A-Z listing of campus buildings, and pages dedicated to, among other things, football (the university hasn’t fielded a team since 1966), distinguished alumni, Muhammad Ali (who once spoke on campus), various aspects of student life, and cheerleading.
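The gap between total items and content pages mirrors a distinction MediaWiki itself draws between "pages" (everything, including the page created for each uploaded photo) and "articles" (what it counts as real content). For the curious, here is a minimal sketch of pulling those figures from a MediaWiki installation's API; the endpoint URL is a placeholder, not the encyclopedia's real address.

```python
import requests

# Query a MediaWiki installation's site statistics. MediaWiki distinguishes
# "pages" (every page, including the page created for each uploaded photo)
# from "articles" (pages it counts as content), which is the gap noted above.
API_URL = "https://encyclopedia.example.edu/w/api.php"  # placeholder URL

params = {
    "action": "query",
    "meta": "siteinfo",
    "siprop": "statistics",
    "format": "json",
}

stats = requests.get(API_URL, params=params, timeout=30).json()["query"]["statistics"]
print("Total pages:   ", stats["pages"])
print("Content pages: ", stats["articles"])
print("Uploaded files:", stats["images"])
```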
Anderson noted that Google and other search engines have indexed these pages, and as a result he and his colleagues have received some non-historical reference inquiries; he has therefore modified some pages to include pointers to, e.g., campus events calendars.
I’m glad I attended this session. Wikis really are suited to the sort of internal information-sharing that Jean Green discussed, and can readily serve as the backbone of scholarly Web projects of the sort that David Anderson developed. Kate Colligan’s processing wiki is also a great use of the technology; such wikis can capture information that might otherwise remain unrecorded.
However, wikis also have their limits, and this session led me to realize that my colleagues and I have sometimes used wikis not because they were the best tool for the job but because they were the least awful of the available IT options. In some instances, what we actually need is something that combines the best features of, e.g., Microsoft Word (i.e., the ability to create long, complex, highly formatted documents) with the ease of use and change-tracking features of the best wiki software -- without the clutter and chaos of, e.g., Track Changes. If you have any suggestions, I would be most appreciative.
Tuesday, April 28, 2009
MARAC: Flickr: An Image is Worth a Thousand Views
Flickr is an online photo sharing site that enables users to “tag” (i.e., provide descriptive and other information about) their images. In this great session, archivists working in a variety of settings highlighted its practical value to archives.
Barbara Natonson discussed a pilot project undertaken by the Library of Congress (LC), which wanted to learn how social tagging could help cultural institutions and participate in an online community. LC chose Flickr because of its popularity and because its application programming interface (API) facilitated batch loading of photos. LC’s experience should be of interest to many larger repositories.
LC determined at the outset that every image it placed on Flickr would be available via its own site and that it would post only those images that lacked known copyright restrictions. It then did some custom programming that made batch loading practical and made its copyright statement (developed in consultation with the U.S. Copyright Office) appear whenever one of its photos was displayed. It also purchased a Flickr Pro account ($24/year) that allowed it to add large numbers of images and view access statistics.
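Natonson didn't go into the details of LC's custom code, but the general shape of a batch load against the Flickr API is easy to sketch. The following is a minimal illustration using the third-party Python flickrapi package; the keys, directory, and rights statement are placeholders, not LC's actual values or workflow.

```python
import os

import flickrapi  # third-party package: pip install flickrapi

# Placeholder credentials, directory, and rights statement -- LC's actual
# tooling was custom and was not described in detail in the session.
API_KEY = "your-api-key"
API_SECRET = "your-api-secret"
IMAGE_DIR = "to_upload"
RIGHTS_STATEMENT = "No known copyright restrictions."

flickr = flickrapi.FlickrAPI(API_KEY, API_SECRET)
flickr.authenticate_via_browser(perms="write")  # one-time OAuth handshake

for filename in sorted(os.listdir(IMAGE_DIR)):
    if not filename.lower().endswith((".jpg", ".jpeg")):
        continue
    # Attaching the rights statement as the description means it appears
    # wherever the photo is displayed.
    flickr.upload(
        filename=os.path.join(IMAGE_DIR, filename),
        title=os.path.splitext(filename)[0],
        description=RIGHTS_STATEMENT,
        is_public=1,
    )
    print("Uploaded", filename)
```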
LC’s first photos went online in early 2008, and LC adds new photos on a weekly basis. As of mid-March 2009, LC’s Flickr images have gotten roughly 15 million views. Most of the traffic comes from Flickr itself, but some of it arrives via search engines, which index user comments.
To date, approximately 4,500 users have commented on at least one LC image. However, 40 percent of the tags are supplied by a small group of people, and most of the comments concerning images accompanied by good descriptive information simply repeat that information or document emotional/aesthetic responses. Images that lack such information produce the informative tags and comments that LC seeks.
A core group of approximately 20 “power commenters” corrects place names, supplies additional descriptive information, does historical detective work, and incorporates LC images into Wikipedia entries and similar resources. These commenters have also highlighted how places have changed over time; photos documenting changes and links to Google Earth accompany some of these discussions.
LC actively monitors its Flickr photosets for uncivil discourse, and staff incorporate user-supplied information into LC’s descriptive resources and periodically update Flickr users on LC’s work; this work takes about 15-20 hours per week, and staff rotate responsibility for it. LC has also started incorporating links to Flickr versions of its images into its online catalog.
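Gathering those user-supplied comments for staff review is also scriptable. Here is a hedged sketch along those lines, again using the flickrapi package; the keys and photoset ID are invented, and this is meant only to suggest how a weekly review might be fed, not to describe LC's actual workflow.

```python
import flickrapi  # pip install flickrapi

# Placeholder keys and photoset ID; a sketch of comment harvesting only.
API_KEY = "your-api-key"
API_SECRET = "your-api-secret"
PHOTOSET_ID = "72157600000000000"

flickr = flickrapi.FlickrAPI(API_KEY, API_SECRET, format="parsed-json")

photos = flickr.photosets.getPhotos(photoset_id=PHOTOSET_ID)["photoset"]["photo"]
for photo in photos:
    response = flickr.photos.comments.getList(photo_id=photo["id"])
    for comment in response.get("comments", {}).get("comment", []):
        # Staff could skim this list and fold corrections (place names,
        # dates, identifications) back into the catalog record.
        print(photo["title"], "|", comment["authorname"], "|", comment["_content"])
```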
Natonson noted that Flickr (and, by extension, other Web 2.0 technologies) carries both risks and rewards. Among the risks:
- Disrespect for collections -- Flickr privileges the individual image
- Loss of meaning/contextual information -- LC links Flickr images to its descriptive information in an effort to remedy this
- Reduced revenue from photo sales
- Undigitized collections are by definition excluded
Among the rewards:
- Collections are made more widely available
- LC gets additional information about its collections
- The visibility of specific photos is increased
- LC’s Flickr presence helps win support for cultural heritage institutions
- Users can mix past and present -- thus leading to a more informed world
The other presenters highlighted how smaller repositories could make use of Flickr. Judy Silva discussed how the Slippery Rock University Archives, which uses CONTENTdm to manage its digital collections, has used Flickr to reach out to new audiences and experiment with Web 2.0 technology. Slippery Rock’s Flickr project, which made use of the university library’s existing Flickr account, centered on 41 digitized photographs taken by an alumnus during his service in the Second World War.
It took Silva one afternoon to load the images into Flickr and do some very basic (i.e., non-LC) tagging, and the rewards have been substantial: to date, Slippery Rock has gotten over 700 comments on these photographs, and one commenter forwarded the obituary of a person depicted in one of the images.
Owing to the success of this project, Silva is thinking of adding more recent images in an effort to get information from people who might Google themselves or their friends.
Malinda Triller was not able to come to Charleston, so her colleague Jim Gerenscer discussed how Dickinson College's Archives and Special Collections department, which also uses CONTENTdm, is using Flickr to publicize and obtain more information about its photographic holdings.
By design, the archives’ Flickr project was simple enough to be completed largely by undergraduates. The archivists identified images that lacked copyright restrictions, had appeal outside of the Dickinson community, and had basic contextual metadata, and students scanned the images and added them to Flickr.
Unlike LC and many other repositories, which create high-resolution master images in TIFF format and mount lower-resolution JPEG derivatives on Flickr and their own Web sites, Dickinson didn’t want to manage TIFF files. Students thus scanned the images in JPG format, at 100 dpi, and in grayscale or color as appropriate; in the future, the archives will rescan the images as needed. Project work is documented in a basic spreadsheet that contains the unique identifier, description (collection-derived or student-supplied), and title of each image.
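The tracking spreadsheet described above is simple enough to produce with a few lines of Python. A minimal sketch follows; the field names and sample rows are invented for illustration and stand in for Dickinson's real data.

```python
import csv

# Invented rows -- each scanned image gets the three fields described above:
# unique identifier, title, and a collection-derived or student-supplied description.
rows = [
    {"identifier": "img-0001", "title": "Hypothetical image title",
     "description": "Collection-derived caption"},
    {"identifier": "img-0002", "title": "Another hypothetical title",
     "description": "Student-supplied caption"},
]

with open("flickr_project_log.csv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=["identifier", "title", "description"])
    writer.writeheader()
    writer.writerows(rows)
```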
To date, Dickinson’s Flickr photosets, which consist of images of an 1895 family trip to Europe, the 1893 Columbian Exposition, a school for Native American children, and construction of a major road in Alaska, have received 66,000 hits, which is a remarkable amount of exposure for a college archives. However, the archives recently learned that its Flickr account settings greatly limited the number of people who could comment upon the images, and it corrected this error a short time ago. The archives is really pleased with the project and is planning to add another set of images to Flickr.
I think that a lot of archivists are hesitant to embrace Flickr and other interactive Web technologies because they either don’t grasp their potential or fear that they’ll find themselves in the midst of a digital Wild West. This session highlights how repositories of varying sizes can use Web 2.0 technology without being consumed by it or losing physical or intellectual control of their holdings, and many of the attendees seemed really intrigued by these presentations. I suspect that The Commons will grow as a result of this session . . .
Sunday, April 26, 2009
MARAC: Will the Fruit Be Worth the Harvest? Pros and Cons of End of Presidential Term Web Harvesting Projects
We as a profession are still trying to figure out how to deal with Web sites, which exist in the netherworld separating archives and libraries and pose a host of preservation challenges, and this session furnished interesting insight into the contrasting approaches of the U.S. National Archives and Records Administration (NARA) and the Library of Congress (LC).
Session chair Marie Allen (NARA) noted that NARA’s handling of Web records has consistently engendered controversy. Its 2000-01 decision to compel federal agencies to copy their Web site files, at their own expense, at the end of the Clinton administration and transfer them to NARA within eight days of doing so angered agencies, and its 2008 decision not to take a Web snapshot (i.e., a one-time copy) of federal agency sites at the end of President George W. Bush’s second term aroused public concern.
Susan Sullivan (NARA) pointed out that in 2004 NARA had contracted with the Internet Archive to copy publicly accessible federal government Web sites that it had identified and to provide access to the copies, and then explained the rationale for NARA’s 2008 decision: it has determined that Web records are subject to the Federal Records Act and must be scheduled and managed appropriately. It issued Guidance on Managing Web Records in January 2005 and has since offered a lot of training and assistance to agencies; some of this information is available on NARA’s Toolkit for Managing Electronic Records, an Internet portal to resources created by NARA and many other entities.
Sullivan emphasized that snapshots are expensive, have technical and practical shortcomings, and encourage the agency misperception that NARA is managing Web records. In fact, there is no authoritative list of federal government sites, which means that snapshots fail to capture at least some sites. Moreover, snapshots capture sites as they existed at a given point in time, cannot capture Intranet or “deep Web” content, and are plagued by broken links and other technical limitations. In sum, snapshots do not document agency actions or functions in a systematic and complete manner.
NARA is still copying Congressional and Presidential Web sites, which are not covered by the Federal Records Act. Although these snapshots have all of the problems outlined above, NARA regards them as permanent.
Abbie Grotke (LC) then outlined LC’s response to NARA’s 2008-09 decision: in partnership with the Internet Archive, the California Digital Library, the University of North Texas, and the Government Printing Office, it opted to take snapshots of publicly accessible federal government sites. All of the partners seek to collect and preserve at-risk born-digital government information, and all of them believed that the sites had significant research potential.
The partners developed a list of URLs of publicly accessible federal government sites in all three branches of government; they placed particular emphasis on identifying sites that were likely to disappear or change dramatically in early 2009. They then asked a group of volunteer government information specialists to identify sites that were out of scope (e.g., commercial sites) or particularly worthy of crawling (e.g., sites focusing on homeland security). This process ultimately yielded a list of approximately 500 sites.
The partners took a series of comprehensive snapshots and a number of supplemental snapshots focusing on high-priority sites. Much of this work centered on two key dates -- Election Day and Inauguration Day -- but some copying is still taking place.
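For readers curious about the mechanics, here is a deliberately simplified sketch of a seed-list capture. The actual project relied on the Internet Archive's crawl infrastructure, which follows links and writes standard WARC files; this toy version only fetches each seed page and files it under a dated directory, and the seed-file name is an assumption of mine.

```python
import datetime
import pathlib
import urllib.parse

import requests

# Placeholder seed file, one URL per line. The real project used the Internet
# Archive's crawlers (which follow links and write WARC files); this toy
# version captures only each seed page itself.
seeds = pathlib.Path("seed_list.txt").read_text().splitlines()
snapshot_dir = pathlib.Path("snapshot_" + datetime.date.today().isoformat())
snapshot_dir.mkdir(exist_ok=True)

for url in seeds:
    url = url.strip()
    if not url:
        continue
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
    except requests.RequestException as exc:
        print("FAILED", url, exc)
        continue
    # Use the hostname as a crude filename for the captured page.
    host = urllib.parse.urlparse(url).netloc.replace(":", "_")
    (snapshot_dir / (host + ".html")).write_bytes(response.content)
    print("Captured", url)
```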
Grotke outlined the project’s challenges, which will be familiar to any veteran of a multi-institutional collaborative project. The partners had no official funding for this project and thus have had to divert staff and resources from day-to-day operations. They have also had a difficult time managing researcher expectations: users want immediate access to copied sites, but the indexing process is time-consuming. The partners have also had to accept that, owing to the technical limitations of their software and the possibility that some sites escaped their notice, they could not fully capture every federal government site.
The snapshots have nonetheless captured a vast quantity of information that might otherwise be lost, and the project is also paving the way for future collaborations.
Thomas Jenkins (NARA) then explained how Web sites fit into NARA’s three-step appraisal process, which is guided by Directive 1441 (some of which is publicly accessible):
- Data gathering. When appraising Web sites, an archivist visits each site and analyzes the information found on it, interviews agency Web administrators, assesses the recordkeeping culture of the creating agency, and determines how the site’s content relates to permanent records in NARA’s holdings.
- Drafting of appraisal memorandum. The archivist prepares a detailed report that assesses the extent to which the site documents significant actions of federal officials, the rights of citizens, or the “national experience.” The report also examines the site’s relationship to other records identified as permanent (i.e., is the Web site the best and most comprehensive source of information?)
- Stakeholder review. Each appraisal memorandum is circulated within NARA and then published in the Federal Register in order to solicit agency and public input.
In response to a comment concerning whether Web snapshots capture how an agency presents itself to the public, Jenkins stated that NARA assesses whether the information presented on a given site is unique. Moreover, NARA is aware that other entities are crawling federal government sites. Although there is a risk that this crawling activity will cease, a risk analysis indicated that archival records and other sources of information amply document the agency’s activities.
Although this session illuminated how and why NARA and LC reached such sharply contrasting decisions and highlighted some resources that somehow escaped my attention, it underscored precisely why the profession hasn't reached any sort of consensus and is unlikely to do so in the near future. Many if not most state and local government archives lack the degree of regulatory authority afforded by the Federal Records Act, and as a result many of them will not want to rely upon the kindness of site creators. Archivists working in repositories with broad collecting missions may have great difficulty ensuring that creators properly maintain, copy, and transfer site files. Moreover, some archivists will doubtless differ with NARA's conclusion that documenting how site creators presented themselves to the public is not sufficient reason to take periodic Web site snapshots or otherwise preserve sites comprehensively. As a result, many of us will likely find LC's approach to federal government sites or NARA's handling of Congressional and Presidential Web sites more relevant to our own circumstances than NARA's treatment of executive-branch agency sites.
Labels: e-records, federal records, MARAC Spring 2009, Web crawling
Thursday, April 23, 2009
2009 Best Practices Exchange: call for proposals
The University at Albany, SUNY has issued the call for proposals for the 2009 Best Practices Exchange. It's making its way onto listservs targeting state government electronic records archivists, digital librarians, electronic records managers, and IT professionals. Just in case you haven't seen it yet, here it is . . . .
CALL FOR SESSION PROPOSALS: 4TH ANNUAL BEST PRACTICES EXCHANGE (BPE)
We are seeking proposals for sessions to be presented at the 4th annual Best Practices Exchange (BPE), which will be held in Albany, New York, at the University at Albany, SUNY, on September 2-4, 2009. The BPE is a conference that focuses on the management of digital information in state government, and it brings together practitioners to discuss their real-world experiences, including best practices and lessons learned. The theme of this year's BPE is "Tackling Technology Together." Its focus will be on collaboration between and within branches of state government, and between librarians, archivists, records managers, information technology professionals, and others concerned with managing state digital assets.
This year's conference has four tracks. Each track is enumerated below, along with a list of themes embraced by each track. We ask that potential speakers be guided, but not limited, by the themes indicated. Each session will be 90 minutes long.
1) Finding Funding: securing support, developing a marketing strategy, unexpected funding sources, and advocacy
2) Creative Collaboration: finding common ground, a seat at the table, and unexpected partners; crossing professional boundaries; fostering leadership; building communities; and sustaining collaboration
3) Educating Each Other: learning new technical skills and new "soft" skills, learning each others' language, and ensuring professional development
4) Living Without Closure: morphing from project to program, defining "finished," planning for an unknown future, finding new uses for old ideas and tools, and managing change
Please send all session proposals to Brian Keough, Head of the M.E. Grenander Department of Special Collections and Archives, University at Albany, SUNY, bkeough[at]uamail.albany.edu. The deadline for submission is July 15, 2009.
Catching up
I was planning to spend this evening blogging about the remainder of the Spring 2009 MARAC sessions that I attended, but I'm still trying to catch up on all of the work and personal stuff that accumulated while I was at MARAC and with family in Mercer County, West Virginia. As a result, I'll likely need to wait until the weekend to finish all of my planned MARAC posts; all of the sessions were really good, and all of them warrant more thought than I'm capable of mustering at this time.
Wednesday, April 22, 2009
MARAC: There and Back Again: Nazi Anthropological Data at the Smithsonian
I wrote this post during a long layover at the Detroit Metro Airport on 21 April 2009, and finished around 8:35 PM, but simply wasn't prepared to pay $8.00 for the privilege of accessing DTW's wireless connection.
I attended this session simply because the topic seemed interesting, and I’m glad I did: the records at the center of this session are inherently interesting (albeit in a disturbing sort of way), have a complicated, transnational provenance, and posed real challenges when it came to processing them, reformatting them, and determining where they should be housed. Although most of us will never encounter a situation quite as complex, many of us eventually encounter records of uncertain or disputed provenance, materials that lack discernible order, or multi-stage reformatting projects. The decisions that the Smithsonian made and the lessons that it learned thus ought to be of interest to many archivists.
The records in question were created by the Institut für Deutsche Ostarbeit (IDO; Institute for German Work in the East), which the Nazis created in 1940 to settle all questions relating to the occupation of Eastern Europe. Edie Hedlin (Smithsonian Institution Archives), Beth Schuster (Thomas Balch Library), and Ruth Selig (Smithsonian) took turns discussing the records’ complicated custodial history and the Smithsonian’s involvement with them.
The IDO had many sections, including one that focused on “racial and national traditions” and researched Polish ethnic groups; however, apart from one study completed in the Tarnow ghetto, the IDO’s racial and national section did not study Jews. The section gathered or created data forms (e.g., personal and family histories), photographs of people and objects, and bibliographic and reference cards, and it published articles based on some of this research.
U.S. and British troops captured the IDO’s records in 1945, and the U.S. Army brought the records to the United States in 1947. The War Department’s intelligence division and the Surgeon General’s medical intelligence unit went through the records (in the process destroying whatever original order may have existed) and then offered them to the Smithsonian. The Smithsonian accepted the records, but then transferred some of them to the Library of Congress, the National Gallery of Art, and the Pentagon (which then sent some of the records to the National Archives). As a result, there are small pieces of the collection all over Washington, DC.
The IDO records held by the Smithsonian were not used for research until 1997, when a cultural anthropologist reorganized some of them, created the collection’s first detailed finding aid, and eventually published a book based on her research.
In 2003, the Polish Embassy requested that the IDO records be returned to Poland. It took the Smithsonian about five years to figure out how to respond to this request, and its response was the product of repeated consultation between various units of the Smithsonian, the State Department’s Holocaust studies unit, and the Library of Congress, which had received competing requests from the German and Polish governments for materials that had been created by German authorities but which concerned Poland; the State Department, which noted that the Smithsonian’s decision might set a precedent, wanted the governments to reach some sort of agreement concerning the materials in LC’s possession.
In order to determine how it would respond to the Polish government’s request, the Smithsonian set up a task force that examined:
- Accepted archival principles and guidelines;
- Whether the U.S. Army had acted legally when it took the records and gave them to the Smithsonian;
- Whether the other Allied nations had any legal claim to the records;
- The Smithsonian’s authority to acquire, hold, and de-accession archival collections;
- The records’ unique characteristics and potential research uses;
- Whether various other parties—the U.S. Army, the Bundesarchiv and other German government agencies, the U.S. National Archives and Records Administration, the U.S. Holocaust Memorial Museum, the Polish government, and the U.S. State Department—had any interest in the records;
- The impact of any precedents that the Smithsonian’s actions would establish upon the Smithsonian itself, the Library of Congress, the Hoover Institution (which holds most of the records of the Polish government in exile), and U.S. government agencies.
In September 2005, the Smithsonian decided that it would make copies of the records and then transfer the originals to the Jagiellonian University Archives, which agreed to make them publicly accessible. It opted to digitize the records and then produce microfilm from the scans, and needed to raise a lot of money to do so. It initially requested funding from a private foundation, which deferred giving an answer for approximately a year. When the Polish Embassy inquired about the status of the project, the Smithsonian seized the opportunity to cc: approximately 20 other people and institutions in its response. As a result of this e-mail exchange, the U.S. Holocaust Memorial Museum offered funding for digitization and for conservation and allowed the Smithsonian to use its standing digitization contract; the Polish university to which the records were headed also offered some support.
The Smithsonian engaged Schuster, an archival intern fluent in German, to process the records and oversee their digitization. Schuster humidified, flattened, and cleaned the records, which were trifolded and covered in coal dust and other contaminants, and rehoused them in boxes suitable for A4-sized paper. She imposed order upon them, which was no small challenge. The anthropologist who prepared the initial finding aid had attempted to arrange the records geographically; however, she was chiefly interested in the IDO’s Tarnow ghetto and Krakow studies, and as a result most of the collection was unarranged. Schuster ultimately organized the records by type. In order to preserve the initial arrangement of the records (which was reflected in the anthropologist’s published citations), she created an Access database that tracked the original and new order of each document in the collection and generated container lists that contained crosswalks between the two arrangements.
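The crosswalk idea is worth dwelling on, since it is what keeps the anthropologist's published citations resolvable after reprocessing. Here is a small sketch of such a crosswalk using SQLite in place of Access; the table layout and sample rows are my own invention, not Schuster's actual database design.

```python
import sqlite3

# A toy version of the arrangement crosswalk, using SQLite in place of Access.
# Table layout and sample rows are invented for illustration.
conn = sqlite3.connect("ido_crosswalk.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS crosswalk (
        document_id     TEXT PRIMARY KEY,
        original_box    TEXT,   -- location cited in the 1997 finding aid
        original_folder TEXT,
        new_box         TEXT,   -- location after reprocessing by record type
        new_folder      TEXT,
        record_type     TEXT
    )
""")
conn.executemany(
    "INSERT OR REPLACE INTO crosswalk VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("IDO-0001", "Box 3", "Folder 12", "Box 1", "Folder 2", "data form"),
        ("IDO-0002", "Box 7", "Folder 4", "Box 1", "Folder 3", "photograph"),
    ],
)
conn.commit()

# Generate a container list for the new arrangement, with the old location
# alongside so citations to the earlier finding aid can still be resolved.
query = (
    "SELECT new_box, new_folder, document_id, original_box, original_folder "
    "FROM crosswalk ORDER BY new_box, new_folder"
)
for new_box, new_folder, doc_id, old_box, old_folder in conn.execute(query):
    print(f"{new_box} / {new_folder}: {doc_id} (formerly {old_box} / {old_folder})")
conn.close()
```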
Schuster also shared a couple of lessons she learned during the digitization phase of the project:
- Digitization should begin only after a collection is completely conserved and reprocessed. Project deadlines led the Smithsonian to start digitizing as soon as possible, and as a result, the image files had to be renamed after processing.
- Do not underestimate the amount of time and effort needed for good quality control. The Smithsonian needed accurate, complete surrogates and had to ensure that every original had been scanned, so Schuster examined each image and counted the number of pages in each folder (see the sketch below for how such a check might be automated). She had to send back to the vendor many originals that had been scanned crookedly or missed entirely, and she has a jaundiced view of outsourcing as a result.
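The page-count check in that second lesson is the sort of thing a short script can handle. A minimal sketch, assuming one directory of images per folder of originals and a manifest listing the expected page counts -- both assumptions of mine, not the Smithsonian's actual setup:

```python
import csv
import pathlib

# Assumed layout: one directory of images per folder of originals, plus a
# manifest CSV with columns "folder" and "expected_pages" recorded during
# processing. Neither reflects the Smithsonian's actual file organization.
SCAN_ROOT = pathlib.Path("scans")
MANIFEST = pathlib.Path("folder_manifest.csv")

with MANIFEST.open(newline="", encoding="utf-8") as fh:
    for row in csv.DictReader(fh):
        folder = row["folder"]
        expected = int(row["expected_pages"])
        found = len(list((SCAN_ROOT / folder).glob("*.jpg"))) + \
                len(list((SCAN_ROOT / folder).glob("*.tif")))
        if found != expected:
            # Flag folders whose scan count does not match the page count,
            # so the originals can be pulled and rechecked.
            print(f"{folder}: expected {expected} scans, found {found}")
```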
Ruth Selig concluded by making a very important point: the transfer was successful because the Smithsonian committed to working through a complicated process in a very deliberate, step-by-step manner. Many different institutions were brought together in interesting and unanticipated ways, and everyone was pleased with the outcome. Even the State Department was pleased; the initial request was technically issued by Jagiellonian University and directed to the Smithsonian, which is not a government agency, so the Smithsonian’s transfer decision really isn't precedent-setting.
All in all, a good session full of practical tips for dealing with a wide array of complex issues.
Saturday, April 18, 2009
Radio silence . . . for a while
I'm heading off to the Internet-free zone of my relatives' home for a few days, so I won't have the chance to post anything else about MARAC until sometime next week. Look for a flurry of posts when I get back. . . .