Friday, January 15, 2010

How the White House Archives Its E-mail

Last month, the National Security Archive (NSA) and Citizens for Responsibility and Ethics in Washington (CREW) settled their 2007 lawsuit against the Executive Office of the President (EOP) and the U.S. National Archives and Records Administration (NARA). In keeping with the terms of the settlement, earlier today EOP conveyed to CREW and NSA a letter outlining its e-mail archiving and backup practices. A few hours ago, CREW posted a copy of the letter and supporting information on its Web site.

So how's EOP doing? From the looks of it, a pretty good job:
  • Since 20 January 2009, it has been using EMC's EmailXtender (now EMC SourceOne E-mail Management for Microsoft Exchange) to capture copies of all messages sent or received via its unclassified e-mail network. The EmailXtender system, which is centrally managed and housed in a secure offsite location, captures messages from EOP's central Microsoft Exchange Journal Servers (EOP's using Exchange 2000 and will soon upgrade to Exchange 2010) immediately after they are sent or received by an EOP desktop computer or BlackBerry.
  • EOP network operations staff continuously monitor the status and storage capacity of the system via "health-check dashboard reports."
  • The system produces full backups on the second Tuesday of each month and incremental backups on every Monday, Wednesday, Friday, and Sunday.
  • Although some users of the e-mail network can search the EmailXtender system and view archived messages, they can search and view only those messages created by their own offices and do not have the ability to alter or delete messages.
  • Only a select handful of people have the ability to delete messages from the system, and only those messages that were subsequently found to contain classified information are deleted. EOP seems quite serious about preventing inappropriate deletions: messages are deleted only after the Office of Security and Emergency Preparedness and the National Security Council have been consulted and EOP's Office of the CIO, Office of the General Counsel, and the Director of Office Administration have granted permission. Moreover, record copies of deleted messages and records documenting adherence to the deletion protocols are maintained separately.
  • The system produces weekly audit reports that identify individuals who conducted searches, the search terms they used, and whether they opened any messages in connection with their searches. The audit reports also document the deletion of messages, thus ensuring that unauthorized deletions will not go undetected.
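The backup schedule described above is simple enough to express as a rule. What follows is only a minimal sketch of that rule in Python, not a description of how EOP's system is actually configured; the function names and the "full"/"incremental"/"none" labels are my own illustrative assumptions.

```python
from datetime import date, timedelta

# Monday, Wednesday, Friday, Sunday (Python numbers Monday as 0).
INCREMENTAL_WEEKDAYS = {0, 2, 4, 6}


def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first Tuesday (weekday 1).
    offset = (1 - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def backup_type(day: date) -> str:
    """Classify a day under the schedule in the letter (hypothetical labels)."""
    if day == second_tuesday(day.year, day.month):
        return "full"
    if day.weekday() in INCREMENTAL_WEEKDAYS:
        return "incremental"
    return "none"
```

Because Tuesday is never an incremental day, the two rules cannot collide, so the order of the checks does not matter.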
From an archival/records management perspective, there's a lot to like about EOP's approach:
  • Mindful of some of the problems that confronted the previous administration, EOP has configured its e-mail network so that access to "all known Web based external e-mail systems" is blocked and neither the e-mail network nor EOP-issued BlackBerries can access "known instant messaging systems." Of course, secrecy-minded White House personnel could conduct official business via personal cell phones or PDAs -- and I would like to know how EOP is combating this practice -- but EOP seems to be doing whatever it can to ensure that its own hardware is locked down.
  • The EmailXtender system somehow determines whether a given message is subject to the Presidential Records Act or the Federal Records Act, which ought to make it a lot easier for NARA staff to manage these records after transfer. I wouldn't mind knowing more about this neat trick, which is probably based on analysis of the message's content, the account holder's role within EOP, or some combination of the two.
  • Although the EmailXtender system stores the messages and their attachments in their native formats, they can be extracted in .eml (Microsoft Outlook Express Electronic Mail) format for transfer to NARA. I would be happier if the messages could be exported in some sort of optimal preservation format, but the archival profession is just starting to figure out precisely what an optimal e-mail preservation format would look like. If, as EOP's letter implies, NARA can take in .eml files and convert them to a preservation format, .eml is okay.
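One reason .eml is a tolerable transfer format is that an .eml file is essentially a plain-text Internet mail message, which standard tools can read without the originating system. As a rough illustration, here is a sketch that uses Python's standard `email` package to pull basic metadata out of such a message. The sample message, its addresses, and the helper names `build_sample_eml` and `summarize_eml` are invented for this example; nothing here reflects how NARA or EOP actually processes these files.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample_eml() -> bytes:
    """Build a made-up message and serialize it as .eml-style bytes."""
    msg = EmailMessage()
    msg["From"] = "sender@example.gov"      # hypothetical address
    msg["To"] = "recipient@example.gov"     # hypothetical address
    msg["Subject"] = "Staff meeting"
    msg["Date"] = "Fri, 15 Jan 2010 09:00:00 -0500"
    msg.set_content("Agenda attached.")
    msg.add_attachment(
        b"agenda",
        maintype="application",
        subtype="octet-stream",
        filename="agenda.txt",
    )
    return msg.as_bytes()


def summarize_eml(raw: bytes) -> dict:
    """Parse raw .eml bytes and return basic descriptive metadata."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "date": str(msg["Date"]),
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
        "body": body.get_content() if body is not None else "",
    }


summary = summarize_eml(build_sample_eml())
```

A real conversion to a preservation format would need to do a good deal more (attachments in obsolete formats, character encodings, threading), but the headers and attachment names are at least recoverable with ordinary tools.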
This sounds all well and good, but a caveat is in order: we almost certainly don't have the whole picture. The very first sentence of EOP's letter indicates that the system and protocols described above relate only to "unclassified White House e-mails," which suggests that there is a separate system and set of protocols for messages containing classified information. Getting information about the workings of the classified system -- which I really hope isn't based on Exchange or any other off-the-shelf application -- is doubtless going to be a lot more difficult, and there is legitimate reason for not disclosing at least some of it. The care with which EOP is handling deletions from the unclassified system suggests that there is a substantial degree of commitment to doing the right thing, but we may not know with certainty just how well the Obama administration is managing classified e-mails until well after President Obama leaves office.


Ingmar said...

I'm wondering: Is this more than just 'storage and retrieval'?
Records can only be understood correctly within their context:
Why was this e-mail sent/received? What is its relationship with other (non-e-mail) records?
How is EOP making sure that this kind of information is accurately added to all captured messages?

MK said...

Not only does the White House letter focus solely on unclassified email messages, it only deals with Presidential Records Act (PRA) controlled materials. It does explain how segregation of statutory and political material occurs. Keep in mind that information about a president’s “private-political associations” does not fall under the PRA and should not be captured (automatically or manually) within a governmental electronic record keeping system. That’s why some members of the Bush White House staff were issued RNC BlackBerries. Every White House has some staff (not many) whose duties are a mixture of governmental and political. Of course, the President is one such person.

I would be interested to hear what advice White House Counsel has offered to Obama and his EOP staff on how to separate purely political personal electronic records from governmental ones. Do some members of the White House staff have DNC BlackBerries? Do they use separate in-office computers that are not linked to governmental servers for their purely political correspondence? Do members of the records management staff of the White House Office of Records make determinations on what is governmental? That would require their having the authority to move off the official system that which is not. Is there an effort to keep everything, with the segregation decisions instead left to NARA to make after an administration leaves office? Or are those decisions made by the creators of records at the time they decide which equipment they will use to generate electronic records?

It’s hard to picture having to toggle between equipment depending on what you are writing or saying but the law requires segregation. It’s much the same principle as came up when questions arose over where Al Gore was and what telephones he was using for fund raising calls during the Clinton administration. And why White House staff in every modern day administration carefully split the costs of presidential travel between taxpayer funded, appropriated accounts and private political funds, depending on the purpose of a trip.

Nixon’s records were seized in place. He and his staff had created them on the premise that all were personal property. We federal archivists ended up having to determine what was personal property and what was governmental. If you’re interested in how the National Archives interpreted the scope of “private political association” non-governmental information, take a look at
All the documents were determined by NARA to be personal, not governmental. They were returned to the former president and donated back to the government around 2007.

It was much easier in the old days, when secretaries typed up memos and notes on typewriters. People could maintain two filing cabinets and decide based on looking at hard copies if a document should be sent to what used to be White House Central Files or was personal and not eligible for federal retention. The challenge now is not just preserving email, but also finding ways to separate personal-political from other electronic information.

MK said...

I left out the not in "It does explain how segregation of statutory and political material occurs." I intended to write, "It does not explain. . . ."

l'Archivista said...

Ingmar and MK, both of you make excellent points. Ingmar, I'm a big believer in the value of context, and I would love to know more about how EOP is managing its other paper and electronic records. However, I'm afraid that neither EOP nor any other organization responsible for managing tens or hundreds of millions of e-mails can provide detailed contextual information about each message; at present, we simply don't have content management systems that are up to the task. I think that the best we can hope for is some high-level information about the relationship between the e-mails and other EOP records series and good search-and-retrieval capability. I recently sifted through a much smaller e-mail archive, and was pleasantly surprised to discover that using knowledge gleaned from other records -- names and job titles of relevant individuals, formal and informal names of specific projects -- I could construct keyword searches that retrieved all of the relevant messages. (I also retrieved a lot of irrelevant messages, but the application I was using had limited search capabilities and I didn't have the time or money needed to procure better software.)

MK, I would also like to know more about the administration's policies concerning the creation of information about "private-political associations" and the types of hardware and software accessible to staff who create such information. I would imagine that some EOP staff have DNC-issued BlackBerries and computers in addition to their EOP-issued equipment, and I hope that they've gotten some pretty intensive training in how to keep Presidential Records Act and private-political association materials separate; the fact that EOP staff can't access e-mail or instant messaging services other than those used by EOP suggests that they're giving the matter some thought.

Of course, even the most conscientious human beings sometimes slip up. I want EOP to do everything it can, within reason, to keep such slip-ups to an absolute minimum, but I wouldn't be surprised if NARA ultimately gets a few e-mails documenting DNC business and a few messages subject to the Presidential Records Act end up on DNC servers -- or on the servers of companies that provide cell phone and Internet connectivity to individual EOP employees. I'm sure that similar mix-ups happened during the paper era: a few years ago, I discovered a small quantity of state Democratic Party and personal business correspondence in the voluminous correspondence and subject files maintained by the head of a state government agency, and I suspect that just about every government archivist has had similar experiences. However, MK, you are absolutely right that controlling the creation and management of records in a networked, multi-device environment is a lot more difficult.

In a lot of ways, EOP's letter raises almost as many questions as it answers. However, I do have to commend the current administration, which seems to have devoted substantial attention and resources to improving the management of EOP e-mail, which was a persistent problem for the previous administration.

Finally, as a state government archivist, I'm deeply pleased that EOP recognizes that e-mail must be managed at the enterprise level. With very few exceptions, state government entities in New York State and other states still hold individual staff members responsible for identifying messages of enduring value and arranging to preserve them -- either by saving them offline or printing them out. This approach is simply not sustainable, and I'm hoping that the example set by EOP and other federal entities will get the states to start moving away from it.

MK said...

Thank you for the excellent response, L'Archivista. I like your take on this. Striking the right balance certainly is a challenge. As you point out, requiring the creators of records to make the categorization decisions has a downside. While they know context best, they also often are extremely busy and record keeping often is very low on their list of priorities. Depending on their positions and their exposure to risk, they also may have vested interests in how their activities come across. One need only consider how Oliver North reportedly tried to delete messages from the old PROFS system used by the NSC at the White House during the Reagan administration. As I've learned from following records managers' forums, there are many business environments where the lawyers say keep only what you must for the bare minimum of time, as what you wrote can come back to haunt you. To their credit, it seems as if the WH lawyers are tilting towards preservation, as the law and public trust require. Thanks for keeping us updated on this.