Development of a Long-Term Interdisciplinary Data Archive with the Columbia University Library System

Robert S. Chen, Robert R. Downs, and W. Christopher Lenhardt

In collaboration with the Columbia University Libraries, the NASA Socioeconomic Data and Applications Center (SEDAC) has established a Long-Term Archive (LTA) to preserve and provide access to digital objects containing scientific data and related information products of enduring value to a range of scientific and scholarly disciplines. The LTA Board, an interdisciplinary group with members from the Columbia University Libraries, SEDAC, and units of the Earth Institute of Columbia University, oversees the organizational infrastructure and operations of the LTA. The Board evaluates each nominated data set to determine its potential for use and the appropriate levels of preservation and dissemination services that the LTA should provide. Digital objects, consisting of the data and the additional information needed to create archival information packages for each acquisition, have been ingested into the LTA's digital repository system, which is built on the open source Fedora digital object repository software. Some data will be preserved in their original formats and publicly disseminated as part of dissemination information packages created for the digital objects. Data selected for preservation in supported formats will be converted to new formats as needed, based on continual monitoring and assessment of evolving technologies and user needs. The SEDAC LTA represents a unique collaboration between an active data archive and a university library to begin addressing the need for long-term archiving of “born digital” data generated by the scientific research community.