This year’s Cyber Security Conference was held on 16-17 June. I’ve held onto several posts because I was waiting for the conference organizers to post the presentations online. Now that some of the presentations are available, I’ll start working through the backlog.
The most compelling Cyber Security Conference session I attended had no direct archival connection but made me think about the types of records that law enforcement now creates. “Fighting Child Porn at the Provider Level” also left me deeply impressed with the bold and inventive manner in which the New York State Office of the Attorney General (OAG) applied two tools familiar to electronic records archivists -- hashing algorithms that verify that files haven’t been altered or corrupted and Web crawling software that captures Internet content meriting preservation -- to combat appalling crimes.
Senior Investigator Michael McCartney discussed OAG’s twelve-month investigation into the flow of child pornography on Usenet, the global bulletin-board service that predates the World Wide Web. Unlike most law enforcement agencies, which focus on snaring end-level consumers, OAG focused on the large Internet service providers (ISPs) that provide Usenet feeds to both individual customers and smaller ISPs.
Usenet groups have hierarchical, content-specific names, and groups devoted to distribution of child pornography and advertisements for child porn Web sites almost invariably have sickeningly obvious names. All of the large ISPs have the technical ability to filter out groups dedicated to the sharing of illicit content, and most of them were filtering the Usenet feeds that their residential customers accessed. However, they refrained from examining the feeds they supplied to their commercial customers, some of whom removed foreign-language and other groups but not child porn groups.
To force the large ISPs to take action, OAG created a dummy ISP and purchased Usenet feeds from the largest ISPs that had any sort of business presence in New York State. It identified 88 Usenet groups that were unmistakably devoted to child porn and then developed an automated crawler that captured all of the content posted to these groups, calculated a unique alphanumeric hash value for each posted image, and inserted the header data that supplied information about the origin of each posting into a database. OAG investigators reviewed each image and earmarked those that unambiguously met the state’s legal definition of child pornography.
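For readers who haven't worked with hashing tools, here is a minimal Python sketch of the "hash the file, log the headers" step. It is my own illustration, not OAG's code: McCartney didn't say which hash algorithm or database OAG used, so SHA-256, SQLite, the table layout, and the choice of headers (`Message-ID`, `NNTP-Posting-Host`, `Path`) are all assumptions, and `open_db` and `record_posting` are hypothetical names.

```python
import hashlib
import sqlite3
from email.parser import BytesHeaderParser

# Hypothetical schema (an assumption): one row per (message, newsgroup) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS postings (
    message_id   TEXT NOT NULL,
    newsgroup    TEXT NOT NULL,
    posting_host TEXT,
    path         TEXT,
    sha256       TEXT NOT NULL,
    PRIMARY KEY (message_id, newsgroup)
)
"""


def open_db(location: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the postings database."""
    conn = sqlite3.connect(location)
    conn.execute(SCHEMA)
    return conn


def record_posting(conn: sqlite3.Connection, raw_headers: bytes,
                   payload: bytes, newsgroup: str) -> str:
    """Hash one posted file and store its origin headers; return the hash."""
    # The digest depends only on the file's bytes, not its name or location.
    digest = hashlib.sha256(payload).hexdigest()
    # Usenet article headers use the same "Name: value" layout as e-mail.
    headers = BytesHeaderParser().parsebytes(raw_headers)
    # OR IGNORE: a posting that was already recorded is skipped on re-crawl.
    conn.execute(
        "INSERT OR IGNORE INTO postings VALUES (?, ?, ?, ?, ?)",
        (headers["Message-ID"], newsgroup,
         headers["NNTP-Posting-Host"], headers["Path"], digest),
    )
    conn.commit()
    return digest
```

Archivists will recognize the first half as an ordinary fixity check; the only new element is that the checksum is stored alongside provenance data taken from the posting itself.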
Any change made to an image (e.g., cropping, adding captions) changes its hash value, and as a result OAG’s investigators encountered many variants of the same images. However, the crawler’s ability to compare hash values of previously captured images with those of newly captured files also enabled them to trace the posting of identical images across multiple Usenet groups and via multiple ISPs -- and each posting constituted a separate felony offense. Whenever the crawler determined that more than 10 images had been posted from the same originating address, it automatically generated a subpoena demanding that the ISP supply information about the individual account from which the posts originated.
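The two behaviors described above (any change to a file produces a different hash, while identical files can be matched wherever they turn up) are easy to show in a few lines of Python. This is a rough sketch under my own assumptions: SHA-256 stands in for whatever algorithm OAG used, postings are simplified to `(origin, group, data)` tuples, and the function names are hypothetical. Only the "more than 10" threshold comes from the presentation.

```python
import hashlib
from collections import Counter, defaultdict

# From the presentation: "more than 10 images" from one originating address.
THRESHOLD = 10


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def groups_by_hash(postings):
    """Map each hash to the set of groups in which that exact file appeared."""
    seen = defaultdict(set)
    for _origin, group, data in postings:
        seen[sha256_hex(data)].add(group)
    return seen


def origins_over_threshold(postings, threshold: int = THRESHOLD):
    """Return the originating addresses with more than `threshold` postings."""
    counts = Counter(origin for origin, _group, _data in postings)
    return sorted(o for o, n in counts.items() if n > threshold)


if __name__ == "__main__":
    original = b"stand-in for the bytes of a file"
    altered = original + b"!"  # even a one-byte change yields a new hash
    print(sha256_hex(original) == sha256_hex(altered))  # False
```

Because a lookup like this matches exact copies only, a cropped or captioned variant shows up as a new file, which is why the investigators kept running into variants of the same images.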
After 144 days of crawling, OAG had identified 59 active child pornography Usenet groups and thousands of pornographic images. After sending six separate citizen complaints to each ISP’s abuse department and getting little or no response, OAG then hit the ISPs with the damning results of its investigation.
In lieu of prosecution, OAG, which maintains a Web site detailing the results of its investigation and ongoing efforts to fight child porn, entered into settlements that required the ISPs to:
- Remove the 88 child porn groups from all of their feeds
- Have a human being review each new Usenet group and flag those overtly used to share child pornography
- Adhere to strict, lengthy data retention requirements that would enable OAG and other law enforcement agencies to identify and prosecute individuals who transmitted child pornography via the Internet
- Enhance their abuse reporting systems and, in accordance with federal law, report each confirmed instance of child pornography to the National Center for Missing and Exploited Children (NCMEC)
- Revise their Terms of Service and Authorized Use policies
- Subscribe to the NCMEC’s child pornography Web site watch list
OAG is also working with the ISPs to develop a larger hash value database that could be used to identify, in real time, known child pornography images transmitted across an ISP’s servers. This database is still in the conceptual stages, and variations in state child pornography laws will make it difficult to create a truly national hash database, but OAG hopes that this database will greatly reduce the supply of online child pornography.
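Since the database is still conceptual, nobody described how a real-time check would actually work. The simplest version I can imagine is a set-membership test: hash each file as it streams past and see whether the result is on the list. The Python sketch below is purely my assumption about what that might look like; SHA-256, the chunk size, and the names `stream_sha256` and `is_known` are all hypothetical.

```python
import hashlib
import io
from typing import BinaryIO, Set


def stream_sha256(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a file-like object in chunks so large files never sit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_known(stream: BinaryIO, known_hashes: Set[str]) -> bool:
    """Return True if the stream's hash is in the set of known hashes."""
    return stream_sha256(stream) in known_hashes


if __name__ == "__main__":
    # Stand-in data only; a real list would hold hashes of confirmed files.
    known = {hashlib.sha256(b"previously identified file").hexdigest()}
    print(is_known(io.BytesIO(b"previously identified file"), known))
    print(is_known(io.BytesIO(b"some other file"), known))
```

A set lookup takes roughly the same time however long the list grows, which is what makes a real-time check plausible. It still catches exact copies only.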
About halfway through McCartney’s presentation, I stopped thinking, “these guys are really cool” and started thinking, “these guys are awesome.” I’m still kind of blown away by the technical sophistication and impressive results of OAG’s investigation, and I’m still pondering its records-related implications. Technology is clearly changing how law enforcement agencies conduct investigations and document their work, and the legal and records management ramifications of these changes are going to be pretty interesting.