Setting the Stage
This is neither a commentary on the preceding papers nor a summary of the discussions held in May 2003. Instead, I will try to give a sense of the various views expressed by the presenters and respondents, identify the issues raised in the subsequent discussions, and highlight the implications of the day’s deliberations.
The central question that the papers address is whether or not the infrastructure in place for preservation is appropriate to the new information environment. In the case of libraries and archives, the preservation infrastructure that supports print-on-paper collections is well developed and relatively well resourced. Despite the difficulties seen by Anne Kenney, Deirdre Stam, and others in funding and staffing this infrastructure, it still serves as a benchmark for other preservation services.
Kenney, Dan Greenstein, Bill Ivey, and Brian Lavoie all believe that this model is inappropriate for digital and audiovisual materials. Because it is based on decentralized, locally oriented, ownership-based preservation strategies (what Greenstein calls the “buy it and put it here” model), the infrastructure in place for books and serials runs up against a host of technical, legal, and policy difficulties when used for moving image, recorded sound, and electronic resources. Moreover, these strategies will not, Greenstein argues, even serve the needs of imprints that are not rare. As Lavoie points out in his economic analysis of archiving of (nonunique) digital materials, the very incentive for any single institution to preserve commonly held materials disappears in this scenario. Thus it is not only digital and audiovisual collections that are undercut by a strategy of decentralized, ownership-based preservation. In the long run, the print collections that form the core of library and archival collections today are equally affected.
If this decentralized, ownership-based model for preservation is not scalable for the present and future, then what model or models will be? And what is an appropriate infrastructure to support preservation in the twenty-first century? What are the economic models, policy and legal sanctions, human and financial resources, tools, and technical expertise that will make possible the myriad actions necessary to ensure long-term access? After examining the changing nature of users, collections, and collectors, I will then discuss specific features of the infrastructure that need to be changed.
Patterns of Collecting and Use
Answers to the core questions of preservation-what are we to collect and preserve, for whom, for how long, and who should assume the burden of stewardship-have changed dramatically in the past few decades. While there is agreement on the proximate causes of these changes-the growth of digital information technologies, the explosion in the production and dissemination of information, the constraints imposed by expanding copyright monopolies, the static or shrinking resources devoted to preservation, and the destabilization of cultural and intellectual canons-the ultimate cause is hard to identify. That cause is no doubt rooted in the fundamentally different roles that information resources and the intangible cultural heritage that Ivey describes play in our lives. The demand for access to these resources has escalated, as have the means to deliver them. This escalating demand for access has been accompanied by clear user preference for direct access-access that is unmediated both physically and intellectually. Study after study shows that most users prefer desktop delivery of information. (As the Digital Library Federation [DLF]/Outsell study confirms, the user perceives such delivery, even when mediated by the library, to be Web-based and thus unmediated [Friedlander 2002].) Peer-to-peer file swapping happens not only among students trading favorite music tracks but also among high-energy physicists using the preprint service arXiv.org. Indeed, there is a growing movement among scholars to “disintermediate” their communications from publishers and libraries, some in the hope of making their work accessible in a more timely way, others in the hope of lowering publishing costs.
The emerging paradigm of peer-to-peer, Web-enabled communication ends up inadvertently eliding the traditional guarantors of long-term access, not to mention authenticity and reliability. Scholars and the public alike recognize the socially beneficial function that libraries and archives perform in protecting these attributes of information. Nonetheless, given the enormous costs of ensuring long-term access-costs that are largely hidden from the view of the primary beneficiaries-it becomes crucial that preservation serve the purpose of access directly and that this be generally recognized. As Ivey and Lavoie remind us, without the promise of near-term access, preservation will not find the widespread public support-financial, regulatory, and otherwise-that it needs. By defining preservation in the context of access, present and future, we are forced to recognize that preservation demands active management of information resources. Moreover, this management should begin well before the resources end up in the institutions traditionally charged with their long-term care. Preservation cannot be managed by libraries, archives, and other so-called memory institutions with the kind of autonomy they have had in the past. Preservation organizations must enlist the help of the users, the creators, and the numerous nonprofit and for-profit stakeholders that Brian Lavoie describes. What do we know about who these users are and what would motivate them to contribute to or support long-term access?
The user for whom collections in libraries and archives have been assembled over the past two centuries is not the same user whom those institutions anticipate serving in the near future. Wendy Lougee noted in her response to Dan Greenstein that the use of research collections has expanded beyond the walls of the library and even the grounds of the campus. The off-site users who come to publicly available academic and research collections through the Web and Web-accessible catalogs are not the usual self-selected experts, such as faculty and graduate students, who are well versed in searching the hierarchical orders created by libraries and archives. Among these nonexperts are many undergraduates, who tend to use Google and other search engines as their default mode of searching. But an increasing number of today’s users are in the expansive category of “lifelong learners,” which includes schoolchildren, hobbyists, and independent researchers of all stripes from around the globe. These users often seek direct access to both primary and secondary sources, though the former are more likely to be in the public domain and hence accessible to those not affiliated with a university or college. There is a growing number of commercial users as well-users who only a decade ago would come to the reading rooms of audiovisual and rare book collections but who are now able to gain access to what they want through the Web.
In her response to Greenstein, Lougee argues that the trinity of content, access, and users is the only meaningful context in which to understand changing patterns of use. Greater access can drive increased use; the example of JSTOR and the changing nature of use of the JSTOR journal articles provided early instruction on this point. Another lesson derived from the JSTOR experience is that content must be structured in a way that facilitates access and use. People often prefer the easily accessible over the more reliable, when the latter is more difficult to get to. Lougee cautions that what users really want is an impossible mixture: the ease of finding that is common to the Web, the sophisticated functionality of complex and intelligent systems, and the depth and reliability of collections found in libraries and archives that provide access to value-added electronic resources.
This law of convenience applies to both sophisticated and casual users, as shown by the DLF/Outsell data. This preference probably has more to do with individual users’ sense of the value of their time than it does with the value of the information they are seeking at a given time for a specific purpose. Not all searchers settle for what they find through the path of least resistance. What is important to note is that they wish to be the ones who decide about the investment of their time in search and retrieval; they do not want librarians and other information brokers to make these decisions. Participants agreed that there is an ongoing need to follow more closely the emerging patterns of use. Recent studies, such as the DLF/Outsell report, OCLC’s study of college students (OCLC 2002), and the overview of user studies compiled by Tenopir (2003), are frequently cited to lend statistical weight to anecdotal reports.
Many of the most significant disagreements at the meeting centered on the issue of selection and assessment of value, which has been a cornerstone of preservation strategies to date. The issue of selection for preservation, or what archivists call “appraisal for long-term retention,” has preoccupied preservationists for decades. Assessing the relative value of an object or collection routinely precedes the decision to take an action to preserve. Although individuals have long debated the criteria for deciding value, there has been little argument about the importance of assessing value as such. But this, too, has changed.
There is a sharp disagreement on the question of selection. Many believe that it is worthwhile for collecting institutions with a preservation mission-libraries, museums, archives, historical societies, and so forth-to dedicate time and resources to selecting items first for acquisition and subsequently for preservation. Those who disagree feel that such activity is not feasible or desirable in the current information landscape.
The fault line is clear. Many participants at the meeting, for example, asserted that one of the fundamental purposes of memory institutions is to select information and cultural resources that meet a certain benchmark of value. That benchmark could vary from one institution to another, depending on its core clientele. Moreover, within a single institution, those benchmarks could evolve over time.
Nonetheless, setting limits to collecting is important for several reasons. It is only by careful editing that we can build collections that provide depth and breadth on chosen subjects and, at the same time, exclude those resources with dubious provenance, uncertain authenticity, and lack of relevance. Selective collecting is also an economic mandate, many argue. Since institutions cannot afford to acquire and serve everything of potential interest to some patron at some time in the present or future, selection is a matter of engaging in good husbandry and maximizing service to the communities they serve. Some have expressed concern that users expect that anything found in the collection of a library or archives carries a certain warrant of value and authenticity. If something is found in a library, this means that the library ranks its intrinsic value above that of other information resources that have been excluded. To give one lighthearted but not exceptional example, many lay visitors to special collections libraries that have a research collection of, say, pulp fiction or first-edition paperbacks need to have an expert-the librarian-explain why these materials are worthy of collecting in such depth at a serious institution. The explanation often entails a short discourse on the principles of selection that makes it clear that such items are valuable only if collected in depth: Context is highly prized among collectors. That said, the object’s intrinsic merit has nonetheless been validated by inclusion within the library’s walls.
Other meeting participants argued just as strongly that this kind of selection is, in the words of one historian, a “waste of time.” It is a waste of time because, given the flood of information, collecting en masse and migrating digital data is less time-consuming and requires fewer human resources than does selecting among electronic resources or, for that matter, among the voluminous paper trails that many historical figures and organizations leave in their wake. The promise by some technologists that collecting and migrating digital data will soon be automated, and that metadata will be automatically extracted in due course, encourages those who see intellectual flaws in selection per se. Looking at what is and is not available for present researchers in institutions leads them to conclude that human judgments are unreliable for shaping the historical record, at least in the context of the present deluge of new information. We cannot predict what people will find valuable in the near or distant future, they contend.
At heart, it is the scale of distribution on the Web, the richness of its content, the diversity of its genres, formats, and authors, and its unpredictability that lead to skepticism about the feasibility and desirability of selecting Web-based materials for long-term access. One technologist proposed the model of “harvest and purge” for Web-based materials; that is, to crawl and store Web data, then migrate or otherwise manage it for as long as it can be or needs to be kept. One can always throw it away later, but once it disappears from the Web, it is next to impossible to get it back. One can even keep these digital objects alive for several migrations before a user or collector comes along to add value to the data through description, curation, and so forth, and the data find a new home where they will be actively managed over time.
Some argued that selection of digital data for long-term access (that is, preservation) no longer occurs at point of acquisition, as it has with books and other analog formats. When libraries find themselves in the business of providing access to information they license from a third party, they can forgo the notion that “acquiring” to provide access obliges them to preserve the resource for long periods of time. In the world of archives, this is a familiar pattern. Archives can manage a vast amount of records for fixed periods of time without committing to their long-term access. The decision to maintain records over time is made during the appraisal process, which often takes place years after records are acquired. There is no reason why managing digital data for access cannot happen in a similar way, with disposal and retention decisions being made much later in the life cycle of the information.
The issue of completeness of information is one that librarians and scholars pressed, both those who value and those who argue against careful selection for collections. A user must have confidence that the information provided by a library or archives is complete to the extent possible, and that when it is not, that fact is marked. That relates to the reliability of the information.
The collection’s breadth is crucially important for the nonquantitative disciplines in general and the humanities in particular. Many researchers believe that access to a largely unedited collection of resources (for example, a text with all drafts or a set of complete, unexpurgated business records) will enable them to better understand the nature of the environment under study and how it shaped the final outcome-the context of an event or condition or creation, in other words. A corollary of that belief is that there is risk to ultimate knowledge in not saving something, even if its present value is unknown or appears to be slim. Given the role of contingency in historical phenomena, that risk is greater in the humanities and studies of culture than in the physical sciences (though this is changing, particularly in the observational sciences). Scholars in all fields have long understood the value of context in hermeneutics and have thus hailed the signal purpose of archives and special collections to preserve the context in which information arose or was fixed, used, or collected. This fact alone would argue for the massive collecting of Web-based materials.
A final factor that makes it difficult to weed out nonrelevant digital data is that they are seldom fixed or bounded into final forms that remain stable for long. A digital publication can have many versions as it changes over time; unlike print-based publications, it is often designed to be updated by the creator or publisher. Because few digital objects are fixed, basing decisions about what to select and preserve on the old model of fixing information to an archival medium can be perplexing.
Broadcasting archivists have to grapple with information that has temporal instability. Taking a cue from them, those who wish to preserve and maintain access to a constantly changing Web publication should decide on a sampling strategy that best reflects what is essential to that site. Broadcasting archivists offer an equally important lesson for digital archiving because broadcasts, moving-image materials, and audio are all constituted from myriad elements that are edited time and again to produce different versions. Most “final products” in audio or visual collections comprise many “production elements” that are recycled and recombined for different times, audiences, and purposes, and are often reborn into new consumer products or are integrated into new productions for broadcast. Indeed, it is their constant selection for reuse that appears to favor their long-term persistence. Use drives access, which in turn drives preservation.
One element in this changing information landscape that has stayed remarkably the same is the individual collector. The role of the collector in identifying value and context, and bringing in uncollected materials for curation and long-term access, remains crucial in the transmission of knowledge from one generation to another. One of the best-known collectors among the “digerati” is Brewster Kahle, and his forays into Web collecting bear many lessons, not the least of which is that people routinely refer to what the Internet Archive does as “archiving,” when it is in fact collecting and providing access, pure and simple. (Indeed, the fate of Kahle’s collection after his demise is unclear.) Kahle collects in ways similar to those of broadcast archivists. The Web crawlers at the Internet Archive sample certain parts of the World Wide Web every month; they sample other sites at differing rates. What they crawl is determined by Kahle’s assessment of which features of the Web he finds most essential to document. He has identified the essence of the Web as its ephemerality, its democratic nature and dynamism, and its ubiquity. While his crawlers do exclude some parts of the Web from his collections, such as many commercial sites and all those devoted to pornography, the crawlers are in turn excluded from countless others because they are gated. The exclusive ownership of so much information on the Web, which parallels that of information in other formats, presents formidable challenges to the harvest-and-purge model of building persistent digital collections.
A final point about the difficulty of assessing value casts doubt on one’s faith in the “past is prologue” theory of preservation. Lougee, in her response to Daniel Greenstein, pointed to a growing body of evidence that the new uses made of digital texts are altering the ways in which their users perceive value in the print collections. She mentioned The Making of America site at the University of Michigan as one example of books whose value and use were transformed through digital distribution. Out-of-print public domain American imprints that were virtually unused in print form see very heavy use in digital form. Is there really any way to extrapolate what we know about the use of print into the digital realm?
In sum, all we seem to have learned so far is that digital text users value highly the ability to reuse and repurpose resources. That would lead one to see digital texts as one sees moving-image and audio objects, that is, in terms of their “elements.” We can imagine the digital user as one who is now able to build private collections and libraries of his or her own from existing digital objects taken from all over. In a way simply not possible in the analog world, users in the digital realm can now become collectors as well. The full implications of this phenomenon are just beginning to be thought through by digital scholars, teachers, and serious researchers.
Use in the digital context upsets certain concepts of value for the collector from the analog world, and this has significant implications for libraries as collecting institutions. In some sense, scarcity as a value is replaced by ubiquity as a value: The more things are used, the more they will be used and the more likely they will be preserved. This has serious consequences for the fate of libraries’ long-held print collections, which continue to grow rapidly but which, according to this scenario, will be superseded by digital delivery of text. While digital content is currently only 20 percent of most Association of Research Libraries (ARL) acquisitions, both Greenstein and Lougee reported that it is more heavily used than that number might suggest. While the DLF/Outsell study shows us that people value hard copy, it also shows that they prefer it for a limited set of research and pedagogical tasks. For delivery of content, academic users, like others, prize convenience. This has been borne out in practice by the University of California’s Collection Management Initiative.
As more quotidian titles are available in digital form, library collecting will change. Given Greenstein’s observation that libraries are unwilling to preserve nonunique imprints systematically, we can predict that, in the future, there will be little willingness to collect and retain nonunique items that are commonly available through various subscriptions if there is some guarantee-from the publisher or a central archiving service (or both)-that they will persist. This will be true not only of such genres as newspapers and journals but also of the audio and visual resources that will find greater and greater use in the coming decades. The question then becomes who is responsible for ensuring their long-term access if that is no longer a routine activity of libraries. Neither libraries nor publishers see that mission as core to publishing and distribution companies, especially those in the commercial sector. Yet few if any libraries would say that they are now able to expand their preservation activities to include stewardship both of the full range of analog formats and of the increasing load of digital information.
The question therefore turns to the nature of institutional commitment to preserve obsolete or nonrare analog formats. All archives and libraries that identify themselves as research libraries see preservation as core to their mission. But as the perceived value of print collections, not to mention that of LPs, audiocassettes, and videotapes, lessens over time because of dramatically reduced demand, will research libraries be willing to devote resources to their preservation? If not, which institutions will see that as their responsibility, and what is their capacity to undertake this work?
Institutions and Their Readiness
Early in the discussion, participants agreed that preservation failures of the past-be they the loss of census data from the 1960s, the occasional failure by libraries or publishers to follow reformatting standards, the decisions not to retain original materials after reformatting, or any other that can be cited-are more often organizational than technological in nature. Some who believe in the promise of the Internet to democratize collecting and preserving also tout its ability to obviate such organizational problems by going around organizations altogether. Others caution that those who do this risk finding out for themselves what private collectors have known for centuries: Collections need a stable organizational environment to survive for more than one generation. Organizations alone endure long enough to provide the “perpetual care” that collections require to remain fit for use. Institutional longevity allows them to build and sustain the infrastructure necessary to carry on the work of preservation. That infrastructure comprises not only physical support-technology and buildings most notably-but also human resources and relationships, from skilled experts to relations with both scholars and the local fire department to an institutional commitment in the executive offices. Organizations can commit to the long term, and doing so is not a one-time investment.
Need for Systemic Changes
But, as Kenney correctly notes, the very institutions that have preservation as their core missions are themselves in a period of extraordinary instability. She advises libraries and archives to streamline preservation handling and treatments, mainstream specialized activities, and embrace “good-enough” practice in order to keep pace with the demand for preservation. That involves a significant adjustment in the professional practice of many preservationists, trained as they are to seek the best-possible solution to a problem. But such a change in approach is achievable, given some appropriate resources and strong leadership at all levels of the organization.
There is a need to move away from fixed solutions tied to carefully developed standards and to move toward good-enough solutions that can be adapted as both the environment and the technology change. Kenney and others call not only for development of automated techniques for preservation treatments but also for the assurance of completeness and authenticity that users expect from libraries and archives. Such automated solutions may ease the burden on preservationists in the long run, but only if those technologies embed the core values of preservation. Kenney, for one, does not believe that the kind of crawling done by the Internet Archive meets those high standards.
While arguing that technology may aid preservation in discrete and identifiable ways, Kenney asserts that preservation is not essentially a technical problem. The fundamental challenge she sees facing preservation organizations is the same one Greenstein identified: how to cope with the scale of information production. Kenney and Greenstein agree that any strategy to effect the economies of scale necessary for satisfactory solutions would demand a change in professional and organizational cultures that extends beyond the preservation community. Those solutions will demand a collaboration among libraries and archives and many stakeholders that is so fundamental and so radical that it becomes, in effect, an interdependence.
The notion that libraries must collaborate to preserve access to print collections is widely shared. But the call for interdependence is likely to be quite controversial, once its implications for organizations and their governance practices are fully grasped. For decades, libraries have cooperated with varying degrees of success on the collection and preservation of specialized literature. Such arrangements have enabled them to have access to niche imprints that they would not otherwise have, while sharing the responsibility of preservation. Required now, however, is a wholly different kind of shared collection management: the collection and preservation of common imprints, widely held among institutions, even if not regularly used by faculty and students. Furthermore, as Kenney notes, the decoupling of ownership and governance necessary for that to happen demands an interinstitutional trust that goes beyond contractual agreements, though these will be necessary as well.
What incentives will motivate libraries to cooperate and become interdependent where they have competed before?
The primary motivations will be economic-what a capitalist might hail as “enlightened self-interest.” As digital delivery supersedes analog as the preferred access mode for most information, the level of collection redundancy that was necessary for local access actually becomes a potential liability. If collecting institutions take seriously the expressed preferences of their users, they will conclude that they must collect, manage, and preserve print differently. The availability of digital information, even if not currently widespread for whole classes of information resources, is already fundamentally changing the ways people use collections, which collections they use, and the values that they place on various collections. If institutions do not seize on the economies of scale now available for the management of print through digital technologies-technologies that will, when fully implemented, lead to better service and lower costs-they will be swamped by the rising tide of information resources demanded by their users.
Both Kenney and Greenstein see in functional streamlining and shared collection development the possibility of redeploying resources for emerging needs. Those resources will be needed for the initial costs of work redesign and staff training. Maintaining a shared collection management environment will incur costs. These changes will also be costly in time. The need to build and nurture relationships of trust will be an ongoing cost that will require reliable systems of information sharing, both technological and personal. Consultation with colleagues takes time, a resource that is growing scarce. But automated information sharing and technologies for remote conferencing may ease the way, once relations of trust have been engendered.
Paula Kaufman, in her response to Kenney’s paper, cites the infamous case of Xerox “fumbling the future” when it let the mouse and the graphical user interface “leak out the front door” to be developed by competitors. If libraries and archives are not quick to respond to the need to reach beyond traditional approaches, she warns, then others will. If that happens, the current generation will simply grow up without using the library, because other entities, such as search engines, meet their information needs more conveniently. Looking to the example of the Internet Archive as a nonlibrary preservation entity (or competitor), some librarians lamented that it does not rigorously follow good archiving practices. Others rebutted that the Internet Archive is a clear case of good-enough practice, especially in the near absence of library- or archives-led efforts to collect the Web on such a scale.
The role for memory institutions has become more complex in today’s heterogeneous information environment. It is, therefore, more critical than ever that these institutions focus on the core missions that are unique to them. Among the most crucial and socially valued is to warrant the authenticity and completeness of their information resources-to remain a highly reliable source of highly reliable information. This becomes their competitive advantage in the information landscape of the twenty-first century.
Building Stakeholder Support
Kaufman discussed the need for libraries and archives to develop trust with several communities of stakeholders. Stakeholders include not only users, such as faculty and students, but also governing boards, administrators, government officials, and the general public. Faculty members, for example, increasingly fear that their campus library will, in a shared-collections scenario, subordinate the needs of local users to those of a larger and, they assume, more homogeneous and impersonal collection. Under scenarios of shared collection management described by Greenstein and others, in fact the opposite would be true: These shared metacollections could afford to be more diverse and specialized. But that assumes the careful shaping of collections among partners and possibly greater commitment on the part of some faculty in advising the collection development staff.
Faculty who are paying attention to the rising cost of journals and monographs also express the fear that reducing redundant purchasing, a likely result of shared collection development, will exacerbate the economic problems of academic publishers. This fear points to the need for libraries to put the series of related problems-the crisis in scholarly publishing, the crisis in preservation funding and management-into the larger context so that faculty and other stakeholders can see that treating the symptoms (for example, a decreased demand for monographs) rather than the underlying causes of the scholarly publishing crisis will be harmful in both the near and long terms.
Again, we see that patterns of use are very important to consider when developing effective and cost-effective preservation strategies. Monographs, in contrast to scientific journal literature, are used intensively but not frequently. The former tend to have a longer productive shelf life. This argues for spreading the cost of long-term access across a network of institutions, just as Greenstein and others propose. A more cost-effective means of ensuring long-term access to back files or retrospective literature ultimately has a beneficial effect on the whole chain of scholarly communication; it is good for the entire system, even if it may be of no immediate benefit to the particular problems of specialized monographic publication.
Libraries and archives have a vital role in forging the alliances that will ensure a healthy and accessible research resource base. Libraries, for example, are uniquely positioned as an all-campus resource to present the broadest possible view of the information landscape that we now inhabit. An educated and committed consumer is a vital part of organizational readiness.
Leveraging Past Investments for Future Gains
Another key element of institutional readiness is the ability and willingness to leverage past investments by cooperating with other collecting institutions to achieve economies of scale. One of the more controversial topics at the meeting was the need to develop and sustain centralized service centers for a variety of preservation activities, beyond shared collection storage. Several managers advocated strongly for the development of centralized provision of such services as preservation reformatting, deacidification, conservation treatments, and other actions requiring highly skilled labor and expensive equipment. Arguing that all libraries and archives would require serious, sophisticated preservation provisioning but that only a dozen or so of these institutions would be able to afford in-house facilities to meet that demand, participants called for several willing libraries to move quickly to develop these “industrial” facilities. These facilities would then be able to serve the larger library community, most likely by spinning off nonprofit entities. Developing the idea further, one could see different service centers specializing in different formats or different media, for different sorts of artifacts. All emerging models of digital preservation are seen as being embedded in a larger network of preservation partners, from the Library of Congress’s National Digital Information Infrastructure and Preservation Program (NDIIPP) to the network of libraries collaborating with the Massachusetts Institute of Technology in its deployment of the DSpace digital archiving program. Why should the same not be true of analog format preservation?
In response, several people expressed serious concerns about the trade-off between quality and quantity, fearing that industrial-scale “preservation factories” would not provide the level of treatment many artifacts warrant. There will still be a need for highly specialized or custom treatments. Others expressed a different concern: that institutions that were not leaders in this endeavor would be marginalized. Those concerns were rebutted strongly by others who contended that it is precisely the small- and medium-size institutions that would benefit from the affordable availability of such services. At the same time, those with specialized expertise in one format or treatment or another would not be disadvantaged because, as one computer scientist said, “in a network, size does not matter.”
Old habits of competition among institutions of higher education and their libraries die hard. However, there are many examples of colleges and universities choosing to cooperate in certain areas (for example, preserving information resources) while continuing to compete in others (for example, vying for faculty and students).
The idea of shared preservation facilities, like the proposal for shared collections, was contested on largely political rather than economic or technical grounds. These proposals appear to be at risk of foundering over the lack of sufficient trust among libraries that historically attach a good deal of prestige to claiming a preservation mission. The successful examples of collaboration cited, those of the Five Colleges in western Massachusetts and of the University of California, are based on relationships of trust built up over years through cooperation in other areas of endeavor.
It is not only among other libraries and archives that preservation institutions must cooperate to ensure long-term access in the present century. They must cooperate with the commercial sector as well. Such partnerships will depend on trust that must be built up and sustained over time, and that trust will be crucially dependent on a policy environment that supports cooperation.
The Policy Environment
Knowing that there will be many new preservation partners that are far beyond the walls of libraries and archives (from computer and materials scientists to legislators to for-profit publishers and distributors), it becomes vital to ensure that the laws, regulations, and enabling agreements needed to support these partnerships are in place. The area of policy that has received the greatest attention is copyright and the host of rights that encumber information resources. But the monocular focus on rights management can blind us to equally important concerns, such as the continuing failure of business models too dependent on copyright for revenue and the erosion of information as a public good that fuels innovation and creativity. The proliferation of information produced within the academy, particularly specialized literature, is widely remarked, usually with some dismay. But of far greater significance in the information landscape is the increasing amount of material that falls outside the purview of the academy: material neither created nor consumed by it, except as an artifact of culture to be studied (such as pop music, animated cartoons, and television programs). For audiovisual and digital materials, commercial and noncommercial actors must work in concert to ensure the preservation of cultural heritage. Strong partnerships between the commercial and nonprofit sectors are the linchpin of the NDIIPP strategic plan. This includes direct relations with content producers, such as music and book publishers, as well as with academic society publishers, the National Science Foundation, and the supercomputing centers.
Ivey makes a strong case for special efforts to include the arts community in the network of preservation partners. He calls for libraries, as centers of public trust, to play a leading role in bringing the creators and distributors of these arts into the evolving network. His argument starts from the observation that our culture does not value intangible heritage as a public good that demands public protection and that it will therefore always be at risk from larger social threats. A case must be made for the value and long-term stewardship of the creative work that is often seen primarily as commercial product if it is not to suffer the fate of the RCA Records vault. He argues that libraries are uniquely positioned to make this case.
This clarion call for libraries to take up the cause of intangible heritage (materials, as Ivey points out, that in America are both “cultural heritage and corporate asset”) comes at a curious juncture. It comes at precisely the time that libraries, not used to seeing their print collections as “corporate assets,” face the fact that their new digital collections are viewed as such by the companies that license them. Print-based research libraries find themselves facing the same legal and market environment as, for example, music libraries and film archives do. Suddenly, the institutions that we have relied upon to take the long view, as he maintains, are struggling to find in the new digital rights regime their sanctioned ability to do so.
This new environment is hostile to the long view by which preserving institutions abide. The legal environment will no longer allow libraries and archives the luxury of making fine, but heretofore useful, distinctions between access and preservation. Access will be driving preservation, and to succeed in their preservation mission, libraries must therefore “stake out a public right of access.” Fair use is an exemption from the copyright law whose power, if not asserted regularly, will erode as markets grow up to meet access demands. Ivey is, in short, calling for an active public campaign by libraries, one that should be on a scale comparable to that for brittle books and waged in Washington.
The public campaign should be informed by economic reality and be based on the assumption that commercial partners in preservation have more to gain through cooperation than through competition. The primary role of the libraries would be to inform the public about what is at stake if this heritage is lost and to increase the level of outrage. The volume could be great if libraries and archives were to make common cause with museums, scientific societies, indigenous peoples, and other communities also struggling with the threats to heritage that the property and rights regime poses.
In her response to Ivey’s discussion of a renewed strategy of activism for libraries, Annette Melville pointed to the successful example of the film archivists, creators, producers and distributors, and academics who came together as a consequence of the National Film Preservation Act. Among the act’s important outcomes was to raise the visibility and prestige of film as an endangered and irreplaceable part of our culture. The film community is making great strides in cooperative preservation for a number of reasons. Foremost among them has been its ability to create a sense of community and a common commitment to preservation. That occurred under the national leadership at the Library of Congress, starting at the top with the director of the library. That leadership was well matched by the impassioned moral suasion of celebrities and influential studio and industry individuals and the well-timed appearance of technologies that added economic incentives for preservation through the ability to repurpose old films for new markets.
Melville highlighted the successful strategy taken for “orphan films,” that is, films lacking champions in the corporate world because they have no well-endowed institution committed to their long-term well-being. The National Film Preservation Foundation receives federal funds to match those of the cultural heritage institutions that will preserve them. Melville reported that a critical part of the orphan film rescue efforts under way is building and sustaining public support of preservation. It is imperative to make what is preserved readily accessible in consumer formats to keep making the case for their preservation, restoration, and access. The best advocate for preservation of film is the film legacy itself, and access to the legacy must always be put first. In this case, as in all others, the demand for access will push the demand for preservation.
All hopes and aspirations for long-term access ultimately rest on our ability to provide resources in a timely way to those who are doing the preservation work. How are we going to pay for preservation, especially in light of the fact that it is invisible to, or little valued by, those who are its chief beneficiaries?
Looking at the demands of digital preservation, in which active management must take the place of intermittent interventions, Lavoie argues that decentralized, locally oriented, ownership-based preservation strategies will not hold. The preservation landscape he depicts is one in which initial investments are steep and in which ongoing costs, while as yet unknown, are predictably intensive. Add to that the ineluctable drive for the disintermediation of delivery, and it is hard to see how creators, let alone archivers, can recoup their costs. As some participants noted, it is not in the interest of creators, publishers, and distributors to raise the barriers to access, because assets that do not circulate freely in the marketplace cannot earn revenue there. Publishers and distributors are not trying deliberately to create scarcity; they would like to be able to create demand, not limit access. Nonetheless, locking down information assets for fear of piracy has been one reaction to the uncertainty about maintaining revenue, and it has cast a pall over the conversation in which all stakeholders need to engage. How do we re-create a world in which information flows through well-regulated systems and those who add value to the information or the system are rewarded commensurately?
Lavoie defines who the critical stakeholders are, how their roles are changing, and what incentives and disincentives they have for good preservation behavior. Given the fundamentally different roles that information resources and intangible cultural heritage now play, together with their ability to be repurposed and released anew for some markets, there may be ways to imagine a rights regime that itself provides incentives for good stewardship. Among the ideas proposed were some that actually reinforced libraries and archives in their historically valuable roles as guarantors of authenticity and reliability. A simple example is the notion that preserving institutions can serve as trustworthy repositories of complex media objects, themselves comprising a number of “production elements” that are repurposed for access but that need to be preserved at the highest possible resolution or sampling rate. Just as in the print world it has been financially unfeasible for publishers to carry inventory for long periods of time, so it will be in the digital world (though the meaning of “long” in the context of digital asset management systems is not yet clear). Maybe preserving institutions, if they choose to act as neutral third parties, can provide a service (carrying that inventory in its authentic state) and receive compensation from the digital asset owners for that service. That compensation could come in any number of forms; for example, the asset owner could provide a dowry to accompany the information as it moves to its new home in a preservation repository. Commercial firms could be rewarded by tax credits or the other incentives that donors of collections have traditionally been offered.
As Lavoie makes clear, current economic models do not support good preservation behavior. But new models can be put into place. The business models we need to develop must have robust policies that not only regulate the behavior of stakeholders but also encourage and reward the right behavior. As has often been noted in current debates about copyright, our country’s founders created through copyright what were, at that juncture, appropriate incentives for creators to create. The advancement of science and the useful arts was deemed good for the republic, and so the government offered a limited monopoly of rights to authors to reward their investment of time and resources. While the wisdom of limited monopolies may be obscured to many in today’s heated market for entertainment and intellectual resources, the expectation remains that the copyright owner is responsible for ensuring that its assets survive for the benefit of future generations. Most people recognize that it is unwise to rely on commercial firms to preserve materials for a future that consists of at least several human generations and business cycles. What is important, however, is to make preservation planning a good business practice for these firms.
Lavoie cautions that the cost of preservation will rise as the information landscape becomes increasingly digital. Preservation, he argues, will go “from intervention to process” and in doing so will demand a greater share of resources. Furthermore, those resources need to be leveraged among many institutions: A number of preservation partners must agree to become interdependent, as Greenstein would have it, in order to optimize preservation across the network. The incentive for such partnerships would be that as the level of redundancy goes down across the system, cost savings would accrue to several institutions and user communities.
Lavoie also predicts that core preservation activities will be centralized and large-scale, a prediction that maps to current plans for preservation at the National Archives, the Library of Congress, and other national libraries. The optimal levels of redundancy for these centralized services are not known, nor are they clearly sanctioned by current digital copyright law. As the example of LOCKSS (Lots of Copies Keep Stuff Safe) archiving shows, some people oppose the strategy of reducing redundancy. They argue that high levels of redundancy are both necessary and not that expensive. These two opposing views of digital preservation will most likely continue to coexist for some time, and this is entirely appropriate. It is perilous to assume that there is only one model for preservation.
Winston Tabb noted in his response that national institutions, specifically the Library of Congress, are uniquely positioned to take leadership roles in developing models of shared responsibility for preservation. Unfortunately, such institutions are often slow to act without vociferous encouragement from the field. He mentioned the Library’s authority to set aside one of two copyright deposit imprints as a “heritage copy” for permanent retention, but added that this authority is not exercised, for a number of reasons. Adding his voice to that of Ivey in calling for public advocacy, Tabb urged that a number of issues, from that of heritage copy to authority for the Library of Congress to harvest Web sites as part of its copyright mandate, be put on the agenda for memory institutions to take to Washington.
Tabb also took issue with the notion that the primary model of preservation in the twenty-first century will be centralized. On the contrary, Tabb asserted, the scale of production of preservation-worthy information and the consequent inability of any central collecting agency to develop a “collection of record” means that there will have to be a network of preservation institutions working closely together in a way heretofore unprecedented. He suggested a model of “centralized coordination and tracking with distributed preservation,” that is, a collaborative solution with shared responsibilities.
Although this model has perhaps a better chance of meeting the challenge of preservation and access in this century than did models we know from the last, it may differ little theoretically from the distributed preservation model attempted for brittle books, with individual libraries taking on preservation responsibilities for certain materials and working in theory with other independent libraries through commonly shared tracking systems. But in fact the systems could not be more different. The system Tabb elaborated requires that the copyright regime demand deposit from creators and that the law “deputize” certain institutions to share the collecting and preserving responsibility with the Library of Congress, which is now the only authorized agent of copyright deposit in the United States. The collaborative solution Tabb described would entail the kind of policy change, buttressed by law, for which Ivey also advocates. It is not yet clear what kind of actions need to be taken and who would effect these changes. But Tabb concurred that libraries, archives, and other collecting institutions are uniquely positioned to be leaders in bringing about needed change.
The only things we can be sure of are that resources for preservation will continue to be scarce in relation to demand and there will continue to be a need to leverage common infrastructure, exploit economies of scale, and avoid unnecessary redundancies, however defined. To ensure access in the future to the information that is created, used, or otherwise valued, we should be expansive in our thinking about who can and should preserve. We will need to be comfortable with many good-enough practices alongside the best practices. We can think of selection, for example, on a sliding scale of evaluation and curation, so that libraries may continue to have highly selected and curated collections, archives will have collections characterized by greater inclusion and volume with lesser degrees of description and curation, and individuals will continue to play a vital, often prophetic role in creating collections of value.
At its best, preservation can be defined as a part of the infrastructure of the knowledge economy that is so fundamental it is virtually invisible. And like most critical infrastructures—the electrical grid, the water and sewage system, or the Internet—preservation is too often remarked only in failure. Now, a combination of new information technologies and faltering business models in scholarly communication and the entertainment industry is stressing preservation to the breaking point. At this juncture, when national governments are willing to make major investments in overhauling the preservation infrastructure and billion-dollar industries are recycling old “product” for new markets, there is a unique opportunity for preservation institutions to make a compelling case to their stakeholders, from information creators and educational administrators to the general public, for investing now in access for the future.
As Ivey reminds us, the environmental movement has been successful in large part because it staked a claim for the environment as a public good where none existed. Without such a claim for our common intellectual and cultural heritage, continuing to be good stewards will get harder and harder. In this new century, when information and cultural heritage have taken on radically new roles in private and public life, libraries and archives may be able to fulfill their preservation missions if, and only if, they are willing to stake a claim for public access.
Friedlander, Amy. 2002. Dimensions and Use of the Scholarly Information Environment. Introduction to a Data Set Assembled by the Digital Library Federation and Outsell, Inc. Washington, D.C.: Digital Library Federation and Council on Library and Information Resources. Available at http://www.clir.org/pubs/abstract/pub110abst.html.
OCLC. 2002. OCLC White Paper on the Information Habits of College Students. Available at http://www5.oclc.org/downloads/communityinformationhabits.pdf.
Tenopir, Carol. 2003. Use and Users of Electronic Library Resources: An Overview and Analysis of Recent Research Studies. Washington, D.C.: Council on Library and Information Resources. Available at http://www.clir.org/pubs/reports/pub120/contents.html#exec.