Conference Program

Monday, October 5, 2009
Robertson Auditorium, Mission Bay Conference Center.
8:00-9:00am Breakfast
9:00-9:15am Welcome by Patricia Cruse
[video] [presentation]
9:15-10:15am David Kirsch [bio]. The Public Interest in Private Digital Records and Why We Should Care if Corporations have the Right to be Forgotten.
[abstract] [video]
Abstract: Over the span of 120+ years of American legal history, corporations have enjoyed the benefits of personhood. This talk explores whether corporations should enjoy the "right to be forgotten," articulated recently by Blanchette (2006) and Werro (2009). We will consider the risks and benefits of the corporate right to be forgotten in the context of our ongoing efforts to preserve the digital records of the failed law firm Brobeck, Phleger & Harrison and discuss possible mechanisms that recognize the requirements of both deorganized firms and the interested public.
10:15-10:30am Break
10:30-11:30am Brian Lavoie (introduction) [bio], Abby Smith (moderator) [bio], Martha Anderson [bio], Paul Courant [bio] and Patricia Cruse [bio]. Perspectives on the Economics of Sustainable Digital Preservation (Panel discussion).
[abstract] [video] [presentation]
Abstract: Keynote panel from members of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access: Martha Anderson, Paul Courant and Patricia Cruse, with an introduction by Brian Lavoie and moderation by Abby Smith. The Task Force consists of a group of international experts from a variety of domains (libraries, archives, government agencies, the private sector) and areas of expertise (computer science, economics, information management), and is supported by the National Science Foundation, the Mellon Foundation, the UK Joint Information Systems Committee (JISC), the Council on Library and Information Resources, the Library of Congress, and the National Archives. The goal of the Task Force is to raise awareness and increase understanding of the economics of sustainable digital preservation and to develop practical recommendations and guidelines for achieving it.
11:30am-12:15pm Henry Lowood [bio]. Memento Mundi: Are Virtual Worlds History?
[abstract] [video] [full paper] [presentation]
Abstract: In this paper, I consider whether virtual worlds are history in two senses of the word. The first explores the implications of the life-cycle of virtual worlds, especially of their extinction, for thinking about the history of computer-based technologies, as well as their use. The second sense of the virtual world as history brings us directly to issues of historical documentation, digital preservation and curation of virtual worlds. I consider what will remain of virtual worlds after they close down, either individually or perhaps even collectively. I argue that focusing virtual world preservation on software preservation alone is a barren exercise with respect to the documentation of events and activities that occur in these worlds; the value of software preservation lies elsewhere. In the How They Got Game Project at Stanford and the Preserving Virtual Worlds Project funded by the US Library of Congress, we have identified some possible approaches to documenting activities and events in virtual worlds. I discuss the progress we have made and why it is important for us to conduct more background research on practices for preserving complex user behavior.
12:15-1:30pm Lunch: Fisher Banquet Room, 1st Floor, Mission Bay Conference Center.
Parallel sessions: Robertson Auditorium Front | Robertson Auditorium Rear
1:40-2:00pm Reinhard Altenhöner [bio]. E-infrastructure and Digital Preservation: Challenges and Outlook. Wolfgang Wilkes [bio]. Towards Support for Long-Term Digital Preservation in Product Life Cycle Management.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Altenhöner): Undoubtedly, long-term preservation has attracted a great deal of attention worldwide, but that attention is out of proportion to the small number of real operational solutions. Up to now, the existing digital preservation infrastructure has been confined to a small number of trusted long-term archiving repositories, which hold control of and responsibility for our digital heritage. The discussion on e-infrastructures makes clear that we do not yet have an infrastructure in place that integrates digital preservation across cultural heritage organisations, data centres and data producers. The slow progress toward a globally integrated digital preservation infrastructure has prompted an extended political and strategic discussion in Europe and the US in recent years: studies, position papers and evaluations set out roadmap steps and definitions for a more advanced landscape. Focusing on the German situation and existing practical experience, the paper discusses reasons and strategic aspects that may illuminate the factors inhibiting progress.

Abstract (Wilkes): Important legal and economic motivations exist for the design and engineering domain to address and integrate digital long-term preservation into product life cycle management (PLM). Investigations have revealed that it is not sufficient to archive only the product design data created in the early phases of a product lifecycle; preservation is also needed for data produced across the entire product lifecycle, including early and late phases. Data relevant for archiving includes requirements analysis documents, design rationale, data that reflects experiences during product operation, and social collaboration context. In addition, the engineering environment itself, containing specific versions of all tools and services, is a candidate for preservation. This paper takes a closer look at real-life engineering archive use case scenarios as well as PLM characteristics and workflows that are relevant for long-term preservation. The resulting requirements for a long-term engineering archive lead to an OAIS (Open Archival Information System) based system architecture and a proposed preservation service interface that respects the needs of the engineering domain.
2:00-2:20pm Stephen Abrams [bio]. An Emergent Micro-Services Approach to Digital Curation Infrastructure. Rainer Schmidt [bio]. A Programming Model and Framework for Distributed Preservation Workflows.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Abrams): In order to better meet the needs of its diverse University of California constituencies, the California Digital Library (CDL) is re-envisioning its approach to digital curation infrastructure by devolving function into a set of granular, independent, but interoperable micro-services. Since each of these services is small and self-contained, they are more easily developed, deployed, maintained, and enhanced; at the same time, complex curation function can emerge from the strategic combination of atomistic services. The emergent approach emphasizes the persistence of content rather than the systems in which it is managed, so the paradigmatic archival culture is not unduly coupled to any particular technological context. This results in a curation environment that is comprehensive in scope, yet flexible with regard to local policies and practices, and sustainable despite the inevitability of disruptive change in technology and user expectations.

Abstract (Schmidt): The Planets project develops a service-oriented environment for the definition and evaluation of preservation strategies for human-centric data. It focuses on the question of logically preserving digital materials, as opposed to the physical preservation of content bit-streams. This includes the development of preservation tools for the automated characterization, migration, and comparison of different types of digital objects, as well as the emulation of their original runtime environments in order to ensure long-term access and interpretability. The Planets integrated environment provides a number of end-user applications that allow data curators to execute and scientifically evaluate preservation experiments based on composable preservation services. In this paper, we focus on the programming environment and show how it can be utilized in order to create complex preservation workflows. We argue that preservation systems in particular have strong dependencies on legacy applications and third-party services. Therefore, research on unified preservation interfaces, standardized service profiles, and programming models is crucial to the interoperability and reusability of current and future preservation tools and components.
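The micro-services idea lends itself to a short illustration. The following Python fragment is a toy sketch, not CDL's implementation: each service is a small, self-contained function, and a curation behavior emerges from composing them. The file name and registry structure are hypothetical.

    import hashlib
    from pathlib import Path

    def fixity_service(path):
        """Compute a checksum so integrity can be re-verified later."""
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()

    def identify_service(path):
        """Trivial format identification by extension; real services do far more."""
        return Path(path).suffix.lstrip(".") or "unknown"

    def store_service(path, registry):
        """Record the object together with metadata from the other services."""
        registry[str(path)] = {
            "sha256": fixity_service(path),
            "format": identify_service(path),
        }
        return registry

    # Composing the atomistic services yields a simple ingest behavior.
    Path("example.pdf").write_bytes(b"%PDF-1.4 demo")  # stub object to ingest
    registry = {}
    store_service("example.pdf", registry)
    print(registry)

Because each service is independent, any one of them can be replaced or upgraded without disturbing the others, which is the sustainability argument the abstract makes.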
2:20-2:40pm Pam Armstrong [bio] and Johanna Smith [bio]. Preserving the Digital Memory of the Government of Canada: Influence and Collaboration with Records Creators. Priscilla Caplan [bio] and Joseph Pawletko [bio]. TIPR's Progress (Towards Interoperable Preservation Repositories).
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Armstrong and Smith): Library and Archives Canada has a wide mandate to preserve and provide access to Canadian published heritage and records of national significance, as well as to acquire the records created by the Government of Canada that are deemed to be of historical importance. To address this mandate, Library and Archives Canada has undertaken the development of a digital preservation infrastructure covering policy, standards and enterprise applications, which will serve requirements for ingest, metadata management, preservation and access. The purpose of this paper is to focus on the efforts underway to engage digital recordkeeping activities in the Government of Canada and to influence and align those processes with LAC digital preservation requirements. The LAC strategy for implementing preservation considerations early in the life cycle of the digital record is to establish a mandatory legislative and policy framework for recordkeeping in government. This includes a Directive on Recordkeeping, a Core Digital Records Metadata Standard for archival records, Digital File Format Guidance, and Web 2.0 and Email Recordkeeping Guidelines. The expected success of these initiatives and their collaborative approach should provide a model for other digital heritage creators in Canada.

Abstract (Caplan and Pawletko): TIPR is a project working to enable the transfer of complex digital objects between dissimilar preservation repositories. For reasons of redundancy, succession planning and software migration, such repositories must be able to exchange copies of archival information packages with each other, despite differences in their native archival information packages. TIPR has designed the Repository eXchange Package (RXP), a structure of digital object components and metadata files, including constrained profiles of METS and PREMIS, the goal being an intermediary information package that all repositories can read and write, overcoming the mismatch between repository types.

In this presentation we describe three aspects of the TIPR project: the RXP format itself; the tests we are conducting with RXP in transfers between the Florida Center for Library Automation, New York University, and Cornell University; and finally, the issues we have wrestled with during the development of the format, issues that frequently make successful repository-to-repository transfer difficult.
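To make the package structure concrete, here is a minimal Python sketch of a METS wrapper with an embedded PREMIS object entry, roughly in the spirit of an exchange package. It is illustrative only: the actual RXP profiles constrain METS and PREMIS in ways not reproduced here, and the identifier values are placeholders.

    import xml.etree.ElementTree as ET

    METS = "http://www.loc.gov/METS/"
    PREMIS = "info:lc/xmlns/premis-v2"  # PREMIS 2.x namespace, current circa 2009
    ET.register_namespace("mets", METS)
    ET.register_namespace("premis", PREMIS)

    # A METS document carrying one PREMIS object description in its amdSec.
    mets = ET.Element("{%s}mets" % METS)
    amd = ET.SubElement(mets, "{%s}amdSec" % METS)
    tech = ET.SubElement(amd, "{%s}techMD" % METS, ID="premis-object-1")
    wrap = ET.SubElement(tech, "{%s}mdWrap" % METS, MDTYPE="PREMIS:OBJECT")
    xml_data = ET.SubElement(wrap, "{%s}xmlData" % METS)

    obj = ET.SubElement(xml_data, "{%s}object" % PREMIS)
    oid = ET.SubElement(obj, "{%s}objectIdentifier" % PREMIS)
    ET.SubElement(oid, "{%s}objectIdentifierType" % PREMIS).text = "URI"
    ET.SubElement(oid, "{%s}objectIdentifierValue" % PREMIS).text = "info:example/package/1"

    ET.ElementTree(mets).write("rxp-mets.xml", xml_declaration=True, encoding="UTF-8")

The point of such a constrained wrapper is that a receiving repository only needs to understand the shared profile, not the sender's internal package layout.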
2:40-3:00pm Robert Sharpe [bio]. Are You Ready? Assessing Whether Organizations are Prepared for Digital Preservation. Esther Conway [bio]. Curating Scientific Research Data for the Long Term: A Preservation Analysis Method in Context.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Sharpe): In the last few years digital preservation has started to transition from a theoretical discipline to one where real solutions are beginning to be used. The Planets project has analyzed the readiness of libraries, archives and related organizations to begin to use the outputs of various digital preservation initiatives (and, in particular, the outputs of the Planets project). This talk will discuss the outcomes of this exercise, which reveal an increasing understanding of the problem. The exercise has also surfaced concerns about the problem's scale (in terms of data volumes and types of data) and about the maturity of existing solutions (most people are aware only of piecemeal solutions). It also shows that a new challenge is emerging: moving from running digital preservation projects to embedding the processes within organisations.

Abstract (Conway): The challenge of digital preservation of scientific data lies in the need to preserve not only the dataset itself but also its ability to deliver knowledge to a future user community. A true scientific research asset allows future users to reanalyze the data within new contexts. Thus, in order to carry out meaningful preservation, we need to ensure that future users are equipped with the necessary information to re-use the data. This paper presents an overview of a preservation analysis methodology which was developed in response to that need on the CASPAR and DCC SCARP projects. We place it in relation to other digital preservation practices, discussing how they can interact to provide archives caring for scientific data sets with the full arsenal of tools and techniques necessary to rise to this challenge.
3:00-3:30pm Break
Parallel sessions: Robertson Auditorium Front | Robertson Auditorium Rear
3:40-4:00pm Paul Wheatley [bio]. LIFE3: Predicting Long Term Preservation Costs. Adam Farquhar [bio]. Implementing Metadata that Guides Digital Preservation Services.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Wheatley): This paper will provide an overview of developments from the two phases of the LIFE (Lifecycle Information for E-Literature) project, LIFE1 and LIFE2, before describing the aims and latest progress of the third phase. Emphasis will be placed on the various approaches to estimating preservation costs, including the use of templates to facilitate user interaction with the costing tool. The paper will also explore how the results of the project will help to inform preservation planning and collection management decisions, with a discussion of scenarios in which the LIFE costing tool could be applied. This will be supported by a description of how adopting institutions are already utilising LIFE tools and techniques to analyse and refine their existing preservation activity as well as to enhance their collection management decision making.

Abstract (Farquhar): Effective digital preservation depends on a set of preservation services that work together to ensure that digital objects can be preserved for the long term. These services need digital preservation metadata: in particular, descriptions of the properties that digital objects may have and descriptions of the requirements that guide digital preservation services. This paper analyzes how these services interact and use this metadata, and develops a data dictionary to support them.
4:00-4:20pm Ulla Bøgvad Kejser [bio] and Alex Thirifays. Cost Model for Digital Preservation: Cost of Digital Migration. Rebecca Guenther [bio] and Robert Wolfe [bio]. Integrating Metadata Standards to Support Long-Term Preservation of Digital Assets: Developing Best Practices for Expressing Preservation Metadata in a Container Format.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Kejser and Thirifays): The Danish Ministry of Culture is currently funding a project to set up a model for costing the preservation of digital materials held by national cultural heritage institutions. The overall objective of the project is to provide a basis for comparing and estimating future financial requirements for digital preservation and to increase the cost effectiveness of digital preservation activities. In this study we describe an activity-based costing methodology for digital preservation based on the OAIS Reference Model. In order to estimate the cost of digital migrations, we have identified cost-critical activities by analysing the OAIS Model, and supplemented this analysis with findings from other models, the literature and our own experience. To verify the model, it has been tested on two sets of data from a normalisation project and a migration project at the Danish National Archives. The study found that the OAIS model provides a sound overall framework for cost breakdown, but that some functions, especially those concerned with performing and evaluating the actual migration, need additional detail in order to cost activities accurately.

Abstract (Guenther and Wolfe): This paper explores the purpose and development of best practice guidelines for the use of preservation metadata, as detailed in the PREMIS Data Dictionary for Preservation Metadata, within documents conforming to the Metadata Encoding and Transmission Standard (METS), a container format that integrates various forms of metadata with links to digital assets. Integration guidelines are needed to find common practice within the flexibility of METS, which serves many different functions within digital systems and supports many different metadata structures. How much should be mandated, given the different contexts and implementation models for PREMIS, has been an issue for discussion, in view of the constant tension between tighter control over the package to support object exchange and each implementation's unique preservation metadata requirements. The PREMIS in METS Guidelines serve primarily as a submission and dissemination information package standard. This paper details the issues encountered in using the standards together, and how the METS document changes as events pertaining to the lifecycle of digital assets are recorded for future preservation purposes. The guidelines have enabled the implementation of an exchange format based on PREMIS in METS, with tools being developed to support the standard.
4:20-4:40pm Sabine Schrimpf [bio]. Lessons Learned: Moving a Digital Preservation Network from Project Organization to Sustainability. David Giaretta [bio]. Significant Properties, Authenticity, Provenance, Representation Information and OAIS.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Schrimpf): nestor, the German network of expertise in digital preservation, started as a time-limited project in 2003. Besides the establishment of a network of expertise with an information platform, working groups, and training opportunities, a central goal of the project phase was to prepare a sustainable organizational model for the network's services. In July 2009, nestor transformed into a sustainable organization, with 6 of the 7 project partners and 2 additional organizations entering into a consortium agreement.

The preparation of the sustainable organization was a valuable experience for the project partners because the vision and mission of the network were critically discussed and refined for the future organization. Several further aspects were identified that also need refinement in order to make nestor fit for the future; these aspects are discussed in the paper.
Description (Giaretta): The concept of Significant Properties is one which has been much discussed within the preservation community as a way of characterising the essential features of a digital object which must be maintained over time. However, the term Significant Properties has been given a variety of definitions and used in various ways over the past several years. There is also a lack of consensus on how Significant Properties are categorised and tested. Furthermore, it is unclear how the concept applies to scientific data. In this paper we discuss the new definitions proposed in the revision of OAIS, and draw together and provide a coherent view of Significant Properties, Authenticity, Provenance and Representation Information.
Parallel sessions: Robertson Auditorium Front | Robertson Auditorium Rear
4:45-6:00pm Lightning talks. Rick Prelinger [bio] presents "Lost Landscapes of San Francisco" with rare film and video.
[presentations] [abstract]
  1. Stephen Abrams: JHOVE2
    [video]
    [presentation]
  2. Randy Stern: DRS-2: HUL Digital Repository Services
    [video]
    [presentation]
  3. William Kilbride: Digital Preservation Coalition 2009-2011
    [video]
    [presentation]
  4. Neil Grindley: Preservation Exemplar Projects (no slides)
    [video]
  5. Yao Fei: Digital Preservation at Tsinghua University
    [video]
    [presentation]
  6. Priscilla Caplan: Question: compression OK in stored AIP? (no slides)
    [video]
  7. Meg Phillips: Implications of accepting any format in National Archives (no slides)
    [video]
  8. Maurizio Lunghi: Persistent Identifiers: first example in Italy (no slides)
    [video]
  9. Mark Evans, Tessella plc. Format transformations: authenticity requirements (no slides)
    [video]
Description: Rick Prelinger presents "Lost Landscapes of San Francisco," an eclectic montage of lost and rarely-seen film clips showing life, landscapes and labor in a vanished San Francisco as captured by amateurs, newsreel cameramen and industrial filmmakers.

What does it mean to collect archival material in an age of geotagging and augmented reality, and how can ephemeral images be contextualized in new ways? And how does the future manifest itself in evidence from the past? In a brief introduction, Rick will discuss issues of access, reuse and description.

This is an interactive screening, and we encourage audience members to interact with the film: to ask questions about what's on the screen and, especially, to identify mystery scenes!
6:15pm Bus Pick-up
7:00-10:00pm Reception: California Academy of Sciences, African Hall.



Tuesday, October 6, 2009
Robertson Auditorium, Mission Bay Conference Center.
8:00-9:00am Breakfast
9:00-9:15am Welcome
9:15-10:15am Micah Altman [bio]. Open Data: Legal and Institutional Models for Access and Preservation.
[abstract] [video]
Abstract: Scientific data were among the first sorts of information to be generated in digital form, and some data archives have been preserving such data for nearly a century. Access to scientific data is fundamental for scientific progress and for transparency in public policy based on scientific evidence. Nonetheless, the treatment of data is wildly inconsistent across institutions and fields, and wide gaps remain between systems and practices for data creation, dissemination, preservation, and scholarly publication. This presentation will focus on existing and emerging legal and institutional models for the preservation and dissemination of scientific data.
10:15-10:45am Break and Poster Session

Posters:
  • Andrea Goethals. Unified Digital Format Registry (UDFR)
  • Kevin de Vorsey, Peter McKinney and Cynthia Wu. Risk and Reward: A Method for Measuring Format Obsolescence in a Preservation Repository
  • Robert Olendorf. Preserving Second Life
  • Natalie Walters. Moving Digital Preservation into the Mainstream: The Hybrid Archive of Action on Smoking and Health (ASH) at the Wellcome Library
  • Hannah Frost. Assessment of Digital Objects in JHOVE2
  • Jeremy York, Joan Starr and Heather Christenson. HathiTrust: Preservation as a Platform For Collaboration and Expanded User Services
  • Paul Wu Horng-Jyh. Capturing, Describing, and Arranging eLearning Records in Web 2.0 and Beyond: a Case Study of Professional Seminar
10:45-11:30am Martha Anderson [bio]. The National Digital Stewardship Alliance Charter: Enabling Collaboration to Achieve National Digital Preservation.
[abstract] [video] [full paper] [presentation]
Abstract: The National Digital Information and Infrastructure Preservation Program (NDIIPP) has engaged over 150 partners to preserve at-risk digital content. The content preserved to date ranges from statistical and geospatial data to digital video and web sites. A broad activity portfolio supported projects and research in collection development, technical architecture, policy, and network development. From the beginning of NDIIPP in 2000, a "network of networks" was envisioned where local stewards, service organizations, and education and research facilities would come together to commit to preserving digital materials so historians and policy makers of the future can consult the full record of our age. The promise of a distributed collaborative effort that can address tough challenges for the benefit of all is what informed NDIIPP's network model. As NDIIPP moves into its next phase this model persists but evolves. The National Digital Stewardship Alliance will formalize the NDIIPP network and create an organization that enables collaboration across institutional, industry, and state boundaries while encouraging diversity and heterogeneous solutions.
11:30am-12:15pm Tyler Walters [bio], Liz Bishoff [bio], Emily Gore [bio], Mark Jordan [bio] and Thomas Wilson [bio]. Distributed Digital Preservation: Technical, Sustainability, and Organizational Developments.
[abstract] [video] [full paper] [presentation]
Abstract: Representatives from a variety of distributed digital preservation initiatives will serve on a panel and discuss the technical achievements, organizational models, and advances toward sustainability made by several cooperative distributed digital preservation networks. The distributed digital preservation approach increasingly is being implemented to preserve the vast array of at-risk digital content produced by communities and institutions.
12:15-1:30pm Lunch
Parallel sessions: Robertson Auditorium Front | Robertson Auditorium Rear
1:40-2:00pm Ardys Kozbial [bio]. Chronopolis: Preserving our Digital Heritage. Maureen Pennock [bio]. ArchivePress: A Really Simple Solution to Archiving Blog Content.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Kozbial): Chronopolis is a digital preservation data grid framework developed by the San Diego Supercomputer Center (SDSC) at UC San Diego, the UC San Diego Libraries (UCSDL), and their partners at the National Center for Atmospheric Research (NCAR) in Colorado and the University of Maryland's Institute for Advanced Computer Studies (UMIACS). Chronopolis provides cross-domain collection sharing for long-term preservation. Using existing high-speed educational and research networks and mass-scale storage infrastructure investments, the partnership is leveraging the data storage capabilities at SDSC, NCAR, and UMIACS to provide a preservation data grid that emphasizes heterogeneous and highly redundant data storage systems.

Abstract (Pennock): ArchivePress is a new technical solution for collecting and archiving content from blogs. Current solutions are commonly based on typical web archiving activities, whereby a crawler is configured to harvest a copy of the blog and return the copy to a web archive. This approach is perfectly acceptable if the requirement is that the site be presented as an integral whole. However, ArchivePress is based upon the premise that blogs are a distinct class of web-based resource in which the post, not the page, is atomic, and certain properties, such as layouts and colours, are demonstrably superfluous for many (if not most) users. As a result, an approach that builds on the functionality provided by web feeds to capture only selected aspects of the blog offers more potential. This is particularly the case when institutions wish to develop collections of aggregated blog content from a range of different sources. The presentation will describe our research to develop such an approach, including work to define the significant properties of blogs, details of the technical development, and pilot collections against which the tool has been tested.
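The feed-based premise can be illustrated in a few lines. This is not ArchivePress itself (which builds on WordPress); it is a minimal sketch of the idea using the Python feedparser library, with a placeholder feed URL.

    import feedparser

    feed = feedparser.parse("https://example.org/blog/feed/")  # placeholder URL
    for entry in feed.entries:
        post = {
            "title": entry.get("title"),
            "author": entry.get("author"),
            "published": entry.get("published"),
            # Prefer full content when the feed supplies it; fall back to the summary.
            "content": entry.content[0].value if "content" in entry else entry.get("summary"),
            "link": entry.get("link"),
        }
        print(post["title"], "->", post["link"])

Only the atomic post data is kept; layout and styling, which the abstract argues are superfluous for most users, never enter the archive.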
2:00-2:20pm Christopher A. Lee [bio]. Mainstreaming Preservation through Slicing and Dicing of Digital Repositories: Investigating Alternative Service and Resource Options for ContextMiner Using Data Grid Technology. Mark Guttenbrunner [bio]. Digital Archeology: Recovering Digital Objects from Audio Waveforms.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Lee): A digital repository can be seen as a combination of services, resources, and policies. One of the fundamental design questions for digital repositories is how to break down the services and resources: who will have responsibility, where they will reside, and how they will interact. There is no single, optimal answer to this question. The most appropriate arrangement depends on many factors that vary across repository contexts and are very likely to change over time. This paper reports on our investigation and testing of various repository "slicing and dicing" scenarios, their potential benefits, and implications for implementation, administration, and service offerings. Vital considerations for each option are (1) efficiencies of resource use, (2) management of dependencies across entities, and (3) the repository business model most appropriate to the participating organizations.

Abstract (Guttenbrunner): Specimens of early computer systems stop working every day. It is necessary to prepare for a future in which we have storage media but no working systems to read the data from these carriers.

With storage media for already-obsolete systems residing in archives, it is necessary to extract the data from these media before it can be migrated for long-term preservation.

One storage medium that was popular for home computers in the 1980s was the audio tape. The first home computer systems allowed the use of standard cassette players to record and replay data. Audio tapes are more durable than old home computers when properly stored, and devices for playing this medium (i.e., tape recorders) can be found in working condition or can be repaired, as they are made of standard components. By re-engineering the format of the waveform, the data can be extracted from a digitized audio stream.

This work presents a case study of extracting data created on an early home computer system, the Philips G7400. The original data formats were re-engineered, and an application was written to support the migration of data stored on tapes without using the original system, thus eliminating the need to keep an obsolete system alive to preserve the data on its storage media. Two different methods for interpreting the data and eliminating possible errors on the tape were implemented and evaluated on original tapes recorded 20 years ago. Results show that with some error correction methods parts of the tapes are still readable, even without the original system. It also becomes clear that it is easier to build such solutions now, while the original systems are still available.
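The general technique can be sketched briefly. The following Python fragment is not the authors' tool and does not implement the actual G7400 tape format; it assumes a simple two-frequency (FSK-style) encoding on a 16-bit mono WAV file and classifies each signal cycle as a 0 or 1 by its period.

    import struct
    import wave

    def zero_crossings(samples):
        """Yield indices where the waveform crosses zero going upward."""
        for i in range(1, len(samples)):
            if samples[i - 1] < 0 <= samples[i]:
                yield i

    def decode_bits(path, threshold_period):
        with wave.open(path, "rb") as w:
            raw = w.readframes(w.getnframes())
            samples = struct.unpack("<%dh" % w.getnframes(), raw)  # 16-bit mono assumed
        crossings = list(zero_crossings(samples))
        bits = []
        for a, b in zip(crossings, crossings[1:]):
            # A short cycle means a higher tone frequency, read here as '1'.
            bits.append(1 if (b - a) < threshold_period else 0)
        return bits

    # threshold_period separates the two tone frequencies; its value depends on
    # the sample rate and the encoding, and is a placeholder here.
    bits = decode_bits("tape.wav", threshold_period=30)
    print(len(bits), "cycles classified")

Framing, checksums and the error-correction methods evaluated in the paper would sit on top of a raw bit stream like this one.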
2:20-2:40pm Esther Conway [bio]. Towards a Methodology for Software Preservation. Angela Di Iorio [bio]. A Translation Layer to Convey Preservation Metadata.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Conway): Only a small part of the research carried out to date on the preservation of digital objects has looked specifically at the preservation of software. This is because the preservation of software has been seen as a less urgent problem than the preservation of other digital objects, and because the complexity of software artefacts makes the problem of preserving them a daunting one. Nevertheless, there are good reasons to want to preserve software. In this paper we consider some of the motivations behind software preservation, based on an analysis of software preservation practice. We then go on to consider what it means to preserve software, discussing preservation approaches and developing a performance model for determining the adequacy of a software preservation method. Finally, we discuss some implications for preservation analysis in the case of software artefacts.

Abstract (Di Iorio): Long-term preservation is a responsibility to be shared with other organizations, even ones adopting different preservation methods and tools. Overcoming interoperability issues, by achieving a flawless exchange of the digital assets to be preserved, makes it feasible to apply distributed digital preservation policies.

The Archives Ready To AIP Transmission, a PREMIS Based Project (ARTAT-PBP), aims to experiment with the adoption of a common preservation metadata standard as an interchange language in a network of cooperating organizations that need to exchange digital resources with the mutual objective of preserving them in the long term.
2:40-3:00pm Jens Ludwig [bio]. Into the Archive: Potential and Limits of Standardizing the Ingest. Matthew Kirschenbaum [bio] and Erika Farr [bio]. Digital Materiality: Preserving Access to Computers as Complete Environments.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Ludwig): Ingest and its preparation are crucial steps of strategic importance for digital preservation. If we want to move digital preservation into the mainstream, we have to make them as easy as possible while maintaining their quality. The nestor guide "Into the Archive" tries to streamline the planning and execution of ingest projects. The main task for such a guide is to provide help for a broad audience without detailed background knowledge and with heterogeneous use cases. This presentation will introduce the guide and discuss the challenges.

Abstract (Kirschenbaum and Farr): This paper addresses a particular domain within the sphere of activity that is coming to be known as personal digital papers or personal digital archives. We are concerned with contemporary writers of belles-lettres (fiction, poetry, and drama), and the implications of the shift toward word processing and other forms of electronic text production for the future of the cultural record, in particular literary scholarship. The urgency of this topic is evidenced by the deaths of several high-profile authors, including David Foster Wallace and John Updike, both of whom have left behind electronic records containing unpublished and incomplete work alongside more traditional manuscript materials.

We report outcomes from a planning grant funded by the NEH's Office of Digital Humanities, which brought together scholars, archivists, information professionals, and other specialists to evaluate evolving practices in several institutions and to plan additional activity. We argue that literary and other creatively-minded end-users raise unique challenges for the preservation enterprise, since the complete digital context of individual records is often of paramount importance: what Richard Ovenden, in a useful phrase (in conversation), has termed "the digital materiality of digital culture." We will therefore discuss preservation and access scenarios that treat the computer as a complete artifact and environment, drawing on examples from the born-digital literary collections at Emory University, the Harry Ransom Center at the University of Texas, and the University of Maryland.
3:00-3:30pm Break and Poster Session

Posters: same as the 10:15am poster session (see above).
Parallel sessions: Robertson Auditorium Front | Robertson Auditorium Rear
3:40-4:00pm David Giaretta [bio]. Tools for Preservation and Use of Complex and Diverse Digital Resources. David Tarrant [bio]. Where the Semantic Web and Web 2.0 Meet Format Risk Management: P2 Registry.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Giaretta): This paper will describe the tools and infrastructure components which have been implemented by the CASPAR project to support repositories in their task of long-term preservation of digital resources. We also address the capture and preservation of digital rights management and of evidence of authenticity associated with digital objects. Moreover, examples of ways to evaluate a variety of preservation strategies will be discussed, as will examples of integrating the use of these infrastructure components and tools into existing repository systems. Examples will be given of a rich selection of digital objects which encode information from a variety of disciplines, including science, cultural heritage and contemporary performing arts.

Abstract (Tarrant): The Web is increasingly becoming a platform for linked data. This means making connections and adding value to data on the Web. As more data becomes available and more people are able to use it, the data becomes more powerful. An example is file format registries and the evaluation of format risks, where the information now required is greater than any single institution can gather and collate on its own. Recognising that more is better, the creators of PRONOM, JHOVE, GDFR and others are joining to lead a new initiative, the Unified Digital Format Registry. Ahead of this effort, a new RDF-based framework for structuring and sharing file format data from multiple sources, including PRONOM, has demonstrated that it can produce more links, and thus provide more answers to digital preservation questions (about format risks, applications, viewers and transformations) than the native data alone. This paper will describe this registry, P2, and its services, show how it can be used, and provide examples where it delivers more answers than the contributing resources.
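The linked-data argument can be illustrated with a small example. The sketch below uses Python's rdflib; the namespace and predicates are hypothetical, not P2's actual vocabulary. The point is that triples merged from different sources can jointly answer a question that neither source answers alone.

    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/formats/")  # hypothetical vocabulary
    g = Graph()

    # Statements as they might arrive from two different contributors.
    g.add((EX.gif87a, EX.supersededBy, EX.gif89a))    # e.g. registry-style data
    g.add((EX.gif89a, EX.renderedBy, EX.someViewer))  # e.g. community-contributed data

    # One query over the merged graph links the two statements.
    q = """
        SELECT ?viewer WHERE {
            ?old ex:supersededBy ?new .
            ?new ex:renderedBy ?viewer .
        }
    """
    for row in g.query(q, initNs={"ex": EX}):
        print("A viewer for the successor format:", row.viewer)

Neither contributor stated that a viewer exists for the successor of gif87a; the answer emerges only from the merged graph, which is the registry's central claim.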
4:00-4:20pm Klaus Rechert [bio]. Novel Workflows for Abstract Handling of Complex Interaction Processes in Digital Preservation. Geoffrey Brown [bio]. Born Broken: Fonts and Information Loss in Legacy Digital Documents.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Rechert): Most digital objects were created solely in the interactive graphical user interfaces available in their particular time period. Archiving and preservation organizations hold large numbers of such objects of various types. At some point they will need to process these objects automatically, if possible, to make them available to their users or to convert them to a valid format. A substantial problem in creating an automated process is the availability of suitable tools. We suggest a new method, which uses an operating-system- and application-independent interactive workflow for the migration of digital objects using an emulated environment. Criteria for the design and functionality of emulation environments are therefore devised, which should be applied to future long-term archiving methods.

Abstract (Brown): For millions of legacy documents, correct rendering depends on resources, such as fonts, that are not generally embedded within the document structure and may not be adequately controlled for during archival ingest procedures. Large document collections depend on thousands of unique fonts not available on a common desktop workstation, which typically has between 100 and 200 fonts. Silent substitution of fonts performed by applications such as Microsoft Office can yield poorly rendered documents and may result in significant information loss.

We use a large collection of 225,000 Word documents to assess the difficulty of matching font requirements with a database of fonts. We describe the identifying information contained in common font formats, the font requirements stored in Word documents, the API provided by Windows to support font requests by applications, the documented substitution algorithms used by Windows when requested fonts are not available, and the ways in which support software might be used to control font substitution in a preservation environment.
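As a rough illustration of the matching problem (not the authors' tooling), the following Python sketch uses the fontTools library to collect family names from font files and reports which names required by a document are missing and would therefore trigger substitution. The directory, the document's requirements, and the name-table handling are simplified assumptions.

    from pathlib import Path
    from fontTools.ttLib import TTFont

    def available_families(font_dir):
        """Collect family names (name table ID 1) from the .ttf files in a directory."""
        names = set()
        for path in Path(font_dir).glob("*.ttf"):
            font = TTFont(str(path))
            family = font["name"].getDebugName(1)
            if family:
                names.add(family)
        return names

    # Hypothetical font requirements extracted from a Word document.
    required = {"Times New Roman", "Garamond", "SomeRareCorporateFont"}
    missing = required - available_families("C:/Windows/Fonts")
    print("Fonts that would be silently substituted:", missing or "none")

A preservation workflow could run a check like this at ingest and either archive the missing fonts alongside the document or record the substitution risk as metadata.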
4:20-4:40pm Emmanuelle Bermès [bio] and Louise Fauduet [bio]. The Human Face of Digital Preservation: Organizational and Staff Challenges and Initiatives at the Bibliothèque nationale de France. René van Horik [bio]. MIXED: Repository of Durable File Format Conversions.
Front: [abstract] [video] [full paper] [presentation]
Rear: [abstract] [video] [full paper] [presentation]
Abstract (Bermès and Fauduet): The process of setting up a digital preservation repository in compliance with the OAIS model is not only a technical challenge: libraries also need to develop and maintain appropriate skills and organization. Digital activities, including digital preservation, are now moving into the mainstream activity of the library and are integrated into its workflows. The Bibliothèque nationale de France (BnF) has been working on the definition of digital preservation activities since 2003. This paper presents the organizational and human resources challenges that the library has faced in this context, and those that still await us. The library has been addressing these challenges through a variety of actions at different levels: organizational changes, training sessions, dedicated working groups and task forces, analyses of skills and processes, etc. The results of these actions provide insights into how a national library is going digital, and what is needed to further accomplish this essential change.

Abstract (van Horik): The MIXED project delivers open source software that can convert various data formats into XML. Among these formats are binary formats such as mdb (MS Access), dbf (dBase), dpf (DataPerfect), and xls (MS Excel). The MIXED project is a contribution to the community effort of gathering quality tools for digital preservation. It is starting to deliver its results and will finish at the end of 2009. This paper reports on the current state of the project and the results obtained so far, and discusses how this tool fits into the digital preservation strategies that repositories employ. Moreover, it points out a way for repositories to agree on preferred formats and direct their format conversion efforts towards a shared infrastructure.
4:45-6:00pm Après iPRES at the Hi Dive!




iPRES 2009 California Digital Library