Within hours of being hired as the Collaborative Programs Graduate Assistant at UNT, I was asked to work at a Data Rescue event. I had virtually no idea what I would be doing or how I could contribute to something like rescuing data. I was familiar with general computer use, but the name “Data Rescue” implied a level of technical ability that felt beyond my expertise. However, I would soon learn that the Data Rescue initiative provides opportunities for people at all technical levels to contribute their skills. Specifically, I found my opportunity in telling the Data Rescue story.
The Local Effort
Many people from various institutions, organizations, and agencies came together to hold #DataRescueDenton, the latest in a series of individual Data Rescue events cascading from the original Data Rescue held at the University of Pennsylvania over the weekend of January 14, 2017. I hope to place #DataRescueDenton in the larger context of the movement by explaining Data Rescue Philly and its relation to another initiative called the End of Term Archive.
The Original Effort
The Data Rescue movement operates on the premise that nothing is permanent. Information is lost to obscurity through multiple avenues, including but not limited to finite storage space in archives, loss of funding for storage sites and organizations, data corruption, and obsolete formatting. Furthermore, when open access to data is lost, individuals must resort to filing Freedom of Information Act (FOIA) requests to view it. FOIA requests can take years to process, and the information returned may be redacted or incomplete. The Penn Program for Environmental Humanities began planning strategies for preventing environmental data loss and promoting access to environmental data for public use and application. Data Refuge and the Data Rescue effort emerged as the vehicle to reach both goals and to create opportunities for interested parties to contribute to the movement through volunteerism.
Farm to Table
The original workflow, created by Data Rescue Philly and used at subsequent events (including #DataRescueDenton), included tracks with differentiated tasks. The tracks were cleverly named Seeding, Harvesting, Checking, Bagging, Describing, and Storytelling. In these tracks, volunteers could find ways to apply their talents to the various needs of the Data Rescue movement. The flow starts with Seeders: volunteers who nominate website URLs containing useful data to be examined. Then, volunteers with knowledge of script-writing and programming could help build tools to capture or “harvest” large and sophisticated file types from the seeded URLs and reupload them to the Data Refuge. Volunteers confident in their ability to assess the contents of a dataset could serve as Describers and provide metadata such as a title, where the data were collected, who collected the data, and potential uses for the data. Another camp of the movement houses the Storytellers: individuals who document what happens at Data Rescue events using social media, field notes, and news articles; write profiles of participants who wish to share their experiences; and think of creative applications for successfully captured data. Each volunteer in each track contributes to the Data Rescue momentum.
Why We Rescue Data
The Data Rescue movement partners with other organizations and movements that share similar workflows but examine and preserve different kinds of data. The End of Term (EoT) Archive collects and preserves governmental data that may be lost during U.S. presidential transitions. The University of North Texas partners with the End of Term Archive, so #DataRescueDenton included an EoT Archive group even though EoT exists separately from Data Rescue and Data Refuge. People across the world contribute to these complementary movements to help decentralize government and research data.
Data Rescuers operate on the premise that nothing is permanent. Volunteers create trustworthy copies of data and house the data in multiple archives to guard against single points of failure, archive abandonment, accidental loss, and the threat of restricted access from departmental restructuring and loss of funding. Some Data Rescuers believe that data decentralization may be used as a form of protest, especially in a time when climate change data faces intense scrutiny and threat of erasure under uncertain governmental policies. For example, the public can no longer access USDA reports on animal welfare. According to Karin Brulliard in a February 2017 article for the Washington Post, the USDA animal welfare reports were “frequently used by animal welfare advocates to monitor government regulation of animal treatment at circuses, scientific labs and zoos.” No trustworthy copies exist for these reports, so the only way to view them is to file FOIA requests, with the long waits and the risk of redacted or incomplete results that those requests involve.
Decentralization also encourages a climate of accountability. Access to data empowers the general population to make informed decisions about their personal and professional lives and to hold their representatives accountable in the process of making and implementing policies and laws.
Deborah Caldwell collaborating with the harvester team
End of Term Archive group