Establishing Minimum Requirements for Archival Repositories
by Deanna B. Marcum
The Challenge of Preservation in Creating Digital Library Services
by Daniel Greenstein
Managing our Collections by Managing our Risks
by Abby Smith
The Organizational Impact of Digital Library Initiatives
by Rebecca Graham
College Libraries Committee Sets Agenda
by Deanna B. Marcum
LAST FALL, THE Coalition for Networked Information (CNI) and CLIR convened a small group of library directors and publishers and asked what would be required to ensure the persistence of an electronic journal for one hundred years. CNI has since enlarged the group and continued the discussion. Other organizations, such as the Society for Scholarly Publishing and the National Science Foundation, have also devoted meetings to the subject of long-term maintenance of digital information.
These and other discussions have brought to light many questions focusing on responsibility for archiving, technical strategies, the connection between archiving and access, and economic models. These questions cannot be answered in the abstract. To make progress, projects must be started, the processes documented, and the costs calculated.
Following strategy sessions with digital library experts, CLIR agreed to convene a series of meetings to identify practical next steps to establish some archival repositories, to seek publishing partners to populate the archives, and to develop the necessary licensing apparatus to ensure that the archiving strategy accommodates the interests of users. For publishers and librarians to feel confident about archival repositories, both parties must agree upon the criteria for archivability.
Thirteen librarians met in Washington on April 14 to develop a framework for establishing archival repositories and to agree upon the minimum requirements. Daniel Greenstein, director of the Digital Library Federation, extracted relevant requirements from the existing Open Archival Information System (OAIS) model and framed them in the context of electronic journals as a place to begin. Also providing a framework for discussion was a summary of archival repositories for electronic journals that Clifford Lynch presented at the CNI meeting in March.
Although the exact language of the eight criteria for archivability is still being developed, the librarians agreed in principle with the following:
- A digital archival repository that preserves digital scholarly publications will be a trusted third party that conforms to minimum criteria agreed upon by scholarly publishers (as producers of those publications) and libraries (as consumers of those publications).
- The digital archival repository will define its mission with regard to the needs of scholarly publishers and research libraries. It will also be explicit about what scholarly publications it is archiving.
- The repository will negotiate and accept appropriate deposits from scholarly publishers.
- The repository will obtain sufficient control of deposited information to ensure its long-term preservation.
- The repository will follow documented policies and procedures to ensure the information is preserved against all reasonable contingencies and enable the information to be disseminated as authenticated copies of the original or as traceable to the original.
- The repository will ensure that data can be disseminated to libraries in a renderable form.
- The repository will have as its mission the long-term maintenance of digital information in conformance with the criteria set out above. It will work within legal, business, and organizational contexts appropriate to its mission and maintain arrangements to secure the persistence of any data in its care in the event that the repository ceases to operate.
- Repositories will work as part of a network of repositories.
Fuller discussions of these criteria are available on the Digital Library Federation Web site.
The librarians agreed to observe these criteria in developing local and consortial archival repositories for electronic journals. They further agreed to document the processes they follow in archiving, and to make available the information about costs. CLIR will work with the libraries to gather information on processes and costs, and will disseminate the findings widely to the library, archival, scholarly, and publishing communities.
The next step is to convene a group of publishers in early May to discuss these criteria. In June, the licensing group will meet. Then, CLIR intends to bring the three groups together to negotiate criteria of archivability and terms of deposit of electronic journals.
THE PERSISTENCE OF digital information remains an essential challenge in building the online service environment of the digital library. A few libraries and library organizations are poised to develop limited digital archival repositories. Their progress may rely upon the emergence of two elements that are currently absent.
First, there is no widespread agreement about the minimum functional requirements of a digital archival repository. Such agreement is essential. Without defining what maintenance entails (and thus the requirements of the repository), libraries cannot tell suppliers of digital content what is needed to preserve the information. The suppliers need to agree on the requirements of a repository to satisfy any demand that libraries may make with regard to that content’s persistence. (See “Establishing Minimum Requirements for Archival Repositories,” above). Finally, for emerging repositories to be trusted, whether as suppliers or consumers of digital content, they require a blueprint for the services they need to offer and a benchmark against which their services can be measured and validated.
A second element that is absent from the digital preservation arena is a more realistic understanding of the value of digital information. The costs of maintaining digital information over time are unknown but undoubtedly high. The costs of information loss are likewise unknown, but the potential costs must be considered. For example, a drug company maintains data generated in the development of a new product for as long as those data have value to the company. Such data might be kept as evidence in the case of legal action; the costs of not preserving the data could be ruinous. In this context, preservation may be expensive but less so than the alternative.
It would be difficult for libraries to make similar assessments, given their overwhelming focus on commercially produced scholarly materials (e.g., journals and reference services). Moreover, because of the number of subscriptions they hold, it would be unlikely that any single library or library consortium could take responsibility for preserving such content over the longer term. Nor does long-term preservation motivate the commercial supplier. And the commercial supplier’s understanding of “longer term” will understandably be at variance with that of the library.
Might we begin, then, with digital information for which no other body is likely to take an archival interest—with the digital surrogates, for example, that are created by some libraries? This is not to suggest that all digital surrogates must be preserved. The UK’s National Gallery periodically re-digitizes its collection of some 2,500 art objects to take advantage of new imaging technologies. The same strategy is not necessarily advisable for all, especially those conducting projects to digitize tens or even hundreds of thousands of individual objects. The question to be addressed is not only about the costs of preservation but also about the higher costs that are likely to be involved in periodic re-digitization.
And what about the digital content emanating from surrounding academic departments, which makes up an increasing proportion of the university’s intellectual assets? Computer-based research, learning, and teaching materials have significant value. Yet, that value is fully realized only if the materials are assembled into professionally managed collections and maintained over time.
Admittedly, decisions to maintain the university’s intellectual assets will not be made by the university library in isolation. The information content that is available from the university’s digital library makes up only one part (a very important part to be sure) of the university’s portfolio of information assets. To determine its value and the bearable expense involved in its preservation, the entire portfolio needs to be reviewed. In the university context, progress in digital preservation is likely to require institutional ownership of a far broader preservation problem.
IT IS NOT unusual for a library to field questions from its oversight bodies such as “How much are your collections worth?”, “How do you decide how much to invest in security, cataloging, preservation, or collection development?”, and “What if you digitize your collections?” These questions raise more fundamental questions about the role of collections in contemporary libraries and the funding required to make them productive. For as long as libraries have existed, ownership of collections has been central to the mission of providing information to patrons. But it is no longer true that libraries must own, or even have physical custody of, an item to serve it to a patron. First with the growth of interlibrary lending, then with the spread of networked resources, libraries have slowly begun to uncouple collections and services. This development coincides with the increasing demand by funding organizations to manage library services and collections in a businesslike way.
CLIR has published a report, Managing Cultural Assets from a Business Perspective, that describes how the Library of Congress developed and implemented a plan for greater accountability over its collections. The report is a case study that, while focusing on one institution, sets out a model that can easily be adapted to every type of library, no matter how small, how specialized, or how atypical. This report is published with the cooperation of the Library of Congress and written by Laura Price of KPMG LLP and Abby Smith, now of CLIR but formerly with the Library of Congress. The Public Services–Assurance practice of KPMG LLP, an international audit and business advisory firm, developed the business risk model for the Library of Congress. Both Ms. Price and Ms. Smith were deeply involved in adapting the business risk model to the daily operations of a major cultural institution.
Given how little data we have about the relationship between access to collections and the productivity of scholars, the achievements of students, or the enlightenment of the public who use library resources, many managers are rightly concerned about treating the library business as a business. Nonetheless, it is not difficult to translate the work that goes on in libraries into the language of business. Nor is it incorrect to assess the effectiveness of libraries by investigating how responsibly staff members meet their obligations as custodians of collections. Responsible stewardship is, after all, at the very heart of professional librarianship.
When appropriately adapted to the library environment, the business risk assessment model effectively addresses the major challenges facing library managers, funders, and staff in the course of running a library. It defines library collections as core institutional assets and defines good stewardship as a dynamic process of identifying risks to the collections and instituting policies and procedures that mitigate the risks. This model is valuable to managers because it is designed not only to identify risks to assets, but also to determine which risks are unacceptable and what measures must be taken to reduce them. It guides managers’ decisions about investments in their collections, and is grounded in the individual mission of each library.
The fundamental step in determining what constitutes the chief risks to a collection is to quantify what threatens its fitness for use. What good would a book be if no one could use it, and what could happen to a book that would render it unusable? It could become embrittled and crumble (a preservation risk). It could become misplaced, inadvertently through misshelving (an inventory control risk) or deliberately through theft (a security risk). It could be incorrectly cataloged and hence be unretrievable (a bibliographic risk). These hazards are well known to librarians, and staff members spend much of their effort in reducing the chances that any of those things will happen. Libraries can effectively serve patrons if they have bibliographic controls that tell them what they have, inventory controls that tell them where the items are, preservation controls that mitigate loss of information caused by physical deterioration, and security controls that prevent unauthorized removal of items from the library.
The risk assessment model is flexible and dynamic: it allows for the great variety of physical formats and intellectual value of library materials and permits one to define risk at any given moment in the life cycle of an item. For example, an eighteenth-century manuscript leaf is more vulnerable to theft than a book, while a book may be more vulnerable to embrittlement than a manuscript. Digital materials carry entirely different risks. Policies and procedures to control these materials must derive from those perceived threats. All staff who handle the materials, from those who unload incoming materials at the loading dock to the registrar who documents collection items leaving for exhibitions, are responsible for following those procedures to reduce risk to acceptable levels.
By looking at the life cycle of library materials, the risk assessment model widens the definition of who is involved in the stewardship of library assets to include information technology staff. The bibliographical and inventory controls in libraries are increasingly dependent upon systems librarians and the technology staff that support the systems. Because risk assessment is a dynamic process, it can help managers to identify the risk to collections in advance of crises and to plan for the strategic investments in collections management that will ensure the productivity of institutional assets.
CLIR is interested in working with libraries and other collecting institutions to move this risk assessment model from case study into practice and seeks ways to help libraries incorporate it into their work. The published report is available from CLIR for $15, prepaid; an online version is available.
SEVENTY-FIVE PARTICIPANTS from the Digital Library Federation (DLF) gathered at Emory University on March 31 for the second DLF Forum on Digital Library Practices, which focused on the libraries’ organization for digital library initiatives. The two-and-a-half-day event allowed for an in-depth exchange of ideas about how libraries can manage their digital library efforts effectively.
What emerged from the presentations and discussions is that organizational practices vary significantly, reflecting the variations in institutional mission and strategy. Yet, some common themes ran through the sessions.
Digital library initiatives, which in the early days tended to be driven by grant funding, are now driven by libraries’ interest in providing services to their users or by the changes in scholarly communication, or both. The participants noted increased demand for digital materials in distance education and for classroom teaching. Users expect the library to be able to bring multimedia materials together for them. Simultaneously, there are growing expectations of self-service access to library resources.
As digital library programs evolve, more and more library staff want to be involved. The early initiatives were limited to a few members of the staff, who used the digital library projects to gain expertise. Increasingly, however, there is a strong interest among the entire staff in integrating the digital library into the larger library program. As this integration occurs, traditional library positions are evolving to include new tasks. Technical services staffs are no longer concerned with acquisitions and cataloging exclusively. They are also taking on the tasks of capturing and producing metadata, assuming rights management responsibilities, and assigning and managing persistent names for digital objects. Some libraries have added Web harvesting responsibilities to the mix of technical services duties.
But these changes are not limited to technical services. Public services staff have also been integrated into the digital library initiatives because of their service orientation and their subject expertise. Kitty Bridges, Head of the Shapiro Science Library at the University of Michigan, commented that “information technology is now a part of all of our jobs.”
Throughout the Forum sessions, participants emphasized the need for additional technical training and better preparation for project management. In the technical areas, they called for training in encoded archival description and metadata standards. They need a better understanding of digital library architecture. In project management, they cited the need for basic training as well as help in the use of project management software.
Recruitment and retention of digital library staff are common problems. Several participants recommended that libraries seek partners from departments of computer science and the schools of information and library studies, both for the expertise these partners would bring and for the potential supply of new staff they could provide.
Not surprisingly, the review of organizational practices uncovered both challenges and opportunities. Many of the participants picked up new ideas from their counterparts in other institutions. The discussions, while focused on how to organize digital library initiatives, soon branched into a larger and more important question—how to organize the library of the twenty-first century.
Rebecca Graham to Oversee Digital Library Program at Johns Hopkins
Digital Library Federation Research Associate Rebecca Graham has left the staff of CLIR to become head of library computing services and the digital library program at Johns Hopkins University in Baltimore. She will begin her new position May 15.
Ms. Graham joined CLIR in September 1998 after graduating with an MLS from the University of Illinois at Urbana-Champaign. During her tenure with DLF, Ms. Graham initiated the Forum on Digital Library Practices series, served as interim director of DLF, and was extensively involved in several projects, including the digital certificate prototype and the Academic Image Cooperative.
AT ITS MEETING on March 20, 2000, CLIR’s College Libraries Committee concluded that the agenda for the next phase of work could be best accomplished if the group were expanded to include different types of academic libraries. In the near future, the committee will add representatives from mid-sized universities and independent research libraries to its number, and the name of the group will be changed to reflect the new composition.
Formed originally to advise the Commission on Preservation and Access on preservation problems confronting liberal arts colleges, the group has broadened its focus in recent years to address a wider range of issues. For example, early in 1999, CLIR and the CLC convened a conference on the innovative use of technology on the campuses of small and mid-sized academic institutions. The conference was based on a series of case studies conducted the previous year. (CLIR published the case studies and a summary of the conference proceedings in August 1999.)
At its March meeting, the committee identified the following topics that most need attention:
- Special Collections. The committee urged CLIR to initiate projects that help small and mid-sized institutions manage their special collections more effectively. Which collections should be digitized? What are the economic models for digitization projects? What are the organizational models for special collections?
- Collections. The committee is especially interested in understanding the meaning of “core collections” in a digital environment. How will collecting patterns of smaller institutions change when electronic books become more prevalent? What role does off-site storage play in collection management policies? Does cooperative collection development become more or less important in the digital environment?
- Technology. Many of the issues relating to library collections and services have a technological component. The committee hopes to address the following questions: Authorization and authentication are institution-wide concerns, but what is the library’s role in addressing them? What should libraries be doing about course management software? What kinds of rights management systems best serve the needs of college and mid-sized university libraries?
- Leadership. The committee encouraged CLIR to find ways to communicate library issues to administrators and relevant decision makers. It also urged CLIR to study the requirements for leadership in the future and to conduct quantitative analyses of staffing and recruitment patterns and problems.
- Distance Education. What is the library’s role in providing resources for distance education offerings?
- Outsourcing. With the many changes that are taking place in the digital environment, smaller institutions have unprecedented opportunities to outsource their processes and services. Which opportunities are likely to be most important for increasing library effectiveness?
After a thorough discussion of these needs, the committee chose four areas for in-depth study this year:
- Best practices for libraries working with Web-based or Web-assisted courses. Specifically, the library’s role in such courses will be examined.
- Overview of outsourcing. What activities are smaller and mid-sized libraries now outsourcing to vendors? What activities are being planned?
- Staffing. What types of skills are needed for smaller and mid-sized institutions? What are the recruitment sources for the people that academic libraries need?
- Communicating with administrators. How do we inform decision makers about the changes taking place in academic libraries? The committee proposed an annual survey of important academic library issues. From the survey, CLIR will publish a report aimed at academic administrators.
Enhancing Digital Libraries through the Use of Knowledge Organization Systems
A NEW REPORT from the Digital Library Federation (DLF) examines the use of knowledge organization systems in a digital environment. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files, by Gail Hodge, is the DLF’s fourth published report.
Knowledge organization systems are used to organize materials for the purpose of retrieval and to manage a collection. They serve as bridges between a user’s information need and the material in a collection. Examples of such systems include term lists, such as dictionaries; classification schemes, such as Library of Congress Subject Headings; and relationship lists, such as thesauri.
The report provides examples of how knowledge organization systems can be used to enhance digital libraries in a variety of disciplines. Systems of Knowledge Organization for Digital Libraries is available electronically at www.clir.org/pubs/abstract/pub91abst.html. Print copies may be ordered for $15, prepaid, from CLIR.
Council on Library and Information Resources
1755 Massachusetts Avenue NW, Suite 500
Washington, DC 20036
Fax: (202) 939-4765 · E-mail: firstname.lastname@example.org
The Council on Library and Information Resources (CLIR) grew out of the 1997 merger of the Commission on Preservation and Access and the Council on Library Resources. CLIR identifies the critical issues that affect the welfare and prospects of libraries and archives and the constituencies they serve, convenes individuals and organizations in the best position to engage these issues and respond to them, and encourages institutions to work collaboratively to achieve and manage change.
Digital Library Federation
Deanna B. Marcum
Director of Programs
Editor and Director of Communications