Federal Depository Library Program
ADMINISTRATIVE NOTES
Newsletter of the Federal Depository Library Program
---------------------------------------------------------------------
March 15, 2001    GP 3.16/3-2:22/05    (Vol. 22, no. 05)
---------------------------------------------------------------------

The Federal Depository Library Program Electronic Collection:
Preserving a Tradition of Access to United States Government Information

George D. Barnum
Electronic Collection Manager
Library Programs Service
United States Government Printing Office
Washington, DC

Steven P. Kerchoff
FEDLINK Network Program Specialist
FLICC/FEDLINK, Library of Congress
Washington, DC

[Presented by George D. Barnum at the Preservation 2000 Conference in York, UK, December, 2000. The conference was an international gathering of librarians and archivists focused on the preservation and long-term accessibility of digital materials, sponsored by the Cedars Project of England, and the Research Libraries Group and OCLC, Inc. of the U.S. The conference proceedings are available at .]

Beginning with a Congressional mandate in Public Law 103-40 (1993) [1] for the U.S. Government Printing Office (GPO) to create and maintain online access to "appropriate publications distributed by the Superintendent of Documents," GPO has endeavored to translate its historic mandate for free access to Government information to the digital age. Throughout much of U.S. history, GPO has maintained a system of dissemination and access based on the deposit of printed publications by the government in designated libraries. This system has been widely emulated by state governments as well as by other nations and international organizations.

In the face of rapidly expanding adoption of digital technology and a climate of Government reform, the challenge for GPO has been to determine which attributes and principles of the historic depository system are successful and valid, and with that as a basis, to seek applications within the context of the digital revolution. In this context GPO has been at work over the past four years in a transition to a more (or primarily) electronic dissemination program. This transition has had several phases, beginning with extensive study [2] and a strategic plan in 1996, followed by various experimental and pilot projects, modifications of workforce and working routines, and, at the beginning of the 2001 fiscal year, a more general and wide-ranging application of the assumptions and new goals of the transition across the entire Federal Depository Library Program (FDLP), driven by a substantial reduction by Congress in funding for tangible format distribution.

The Goal and the FDLP

Stated simply, the goal of the FDLP is to assure current and permanent public access to the universe of information published by the U.S. Government. This universe includes information products, regardless of form or format, which are of public interest or educational value, not strictly administrative or operational in nature, and not classified for reasons of national security nor otherwise constrained by personal privacy issues. The primary user community consists of end users gaining access through the facilities and resources of designated libraries.

The program that has grown around this goal was first enacted in the 1860s, and took the statutory form it retains today with the enactment of the Printing Act of 1895 [3], which linked the distribution of publications to libraries with the newly centralized system for the procurement of printing by the Congress and executive branch agencies.
Briefly, any printing order sent by a Government agency to GPO, for products that meet certain minimum criteria, has a quantity added to the total order earmarked for distribution to libraries. The libraries, designated by act of Congress, receive the publications for free and in return must agree to be open to the public, and to meet certain minimum standards for service. Although a multitude of variations has developed over the years, the system has remained remarkably robust at capturing and guaranteeing access to Government publications.

In practice, the FDLP has evolved to perform four broad functions:

* Deposit. The functions that relate to selection, acquisition, distribution, and physical control of publications (classification, etc.) by GPO, including the retention of ownership of deposited publications by the Government, and inspection to assure compliance;
* Assurance of current and permanent public access, including the requirements made of depository libraries for free access to the general public, retention schedules, and service to users of Government information;
* Provision of locator tools, including the statutorily mandated catalogs and indexes GPO produces as well as bibliographic description and other types of finding aids;
* Promotion and facilitation of use, including training opportunities, conferences, and marketing.

It is in the first two categories, deposit and assurance of access, that the transition to a more electronically-based program has had the most fundamental effect. In the print world the system of deposit provides a stable and secure environment in which information is, as a by-product of the legal requirement that Government printing be either performed or contracted for by GPO, funneled into a geographically distributed and fairly closely regulated system of outlets. In return for receiving the information free-of-charge, these libraries agree to be bound by various requirements for access.

In the Internet environment, Federal agencies no longer have an imperative to involve GPO in the dissemination of their information, and the need for redundant housing of copies of publications to achieve geographical equity is obviated by the ability to use a single source from multiple remote locations. At the same time, needs and expectations on the part of librarians and library users for access to this information have grown.

The attempt to reinvent distributed, permanent access has centered on the creation of the FDLP Electronic Collection, a digital library conceived on fairly traditional library collection development principles, and consisting of an interdependent set of locator tools, user interfaces, links to content on agency servers, a digital archive, and various kinds of metadata. The collection is being built using a standard collection development document [4] that emphasizes a blending of new and adapted roles for the depository program.

On one level, the FDLP must continue to provide access, through its network of designated libraries, to the information that its enabling statute describes as being in scope [5]. The everyday realities of providing both actual electronic access and bibliographic/intellectual access tools have been in a state of almost constant change since the first introduction of electronic products in the early 1990s. Previously the processing of materials from the printing press through GPO's verification and distribution mechanisms and into libraries was a highly detailed process not far removed either in concept or practice from other mass-production processes employed in a large printing and publishing concern. The shift to a digital FDLP has altered this model, changing the skills and workflow required to provide access.
Over time, the size and composition of the workforce performing these tasks are changing, with an increase in the need for so-called knowledge workers superseding the need for production-line materials handlers and lower-level clerical employees.

The Challenge and the Electronic Collection

In the third edition of her book Tapping the Government Grapevine: The User-Friendly Guide to U.S. Government Information Sources, Judith Schiek Robinson has said, "Although the focus has shifted from on-site physical collections to electronic linkage, the underlying ideology of permanent and equitable access remains a FDLP hallmark. Depository libraries are government information sanctuaries...by championing audience-appropriate and use-appropriate formats, permanent preservation, and public access, the FDLP preserves the nation's oeuvre..." Robinson interprets a summary of GPO's transition goals first put forth by GPO official Gil Baldwin in 1996 [6]:

Transition to Electronic FDLP

From                                      To
----------------------------------------  ------------------------------------
Focus on products                         Focus on services
Dissemination                             Access
Shipping physical products                Electronic connections
Physical, tangible information formats    Online Internet access
Short-term GPO responsibilities           GPO responsible for long-term access

The FDLP has among its fundamental assumptions that information included is official Government information, and that it will, by virtue of its being included in the program, be freely available to the public, permanently. Two of the most significant challenges in charting a digital future for the program have been to create an operational structure around the basic policy framework of assuring the official integrity of the information and to keep that information available and accessible permanently.

Any discussion of the integrity of digital publications ultimately leads to some discussion of authentication, which, at a purely technical level, has to do with ensuring that the digital bitstream received by the user is identical to that which left the server.
This is accomplished most often by a sampling of the bitstream. In the commercial realm, the term refers to assuring that a document a user has paid to access is in fact "authentic," that is, what was paid for. At a more abstract level, authentication relates to the genuineness of the object, the authenticity of the content. This can mean an assurance that the object is what it purports to be and actually emanates from the source it claims. It may also be an assurance that the object is verifiable or certified as "official," that is, having not only genuineness of origin, but possessing some official sanction thereof. Both of these examples point to a construction of trust, and a source of verification separate from but related to the object itself.

Determining or establishing authenticity is a comparative process, and is the result of various tests or judgements. In its report "Preserving Digital Information," the Task Force on Archiving of Digital Information of the Research Libraries Group defines five criteria or attributes on which the integrity of digital objects rests:

* Content - What comprises or represents the content of the object?
* Fixity - Is there an authorized or canonical version of the object, and how was that authorization determined or derived? Is the object a whole and singular work (can it be?) or are there multiple acceptable versions or states which make parts of the whole?
* Reference - One must be able to locate the object definitively and reliably over time (as with citation)
* Provenance - What is the chain of custody, development, responsibility, or ownership that may confer integrity on the object?
* Context - How does the object interact with other objects in the digital environment (e.g., software with data or text, links within a document and links elsewhere)? [7]

Three of these attributes are of particular concern in the context of the FDLP Electronic Collection: content and what comprises it, fixity, and provenance. In the print world a publication from a government agency, printed by GPO, passed through a series of official channels that assured the integrity of the content, including a variety of internal controls and approvals within the originating agency, and proofreading at GPO ("verification by proof"). The requisition for printing services and the actual publication of a document in a sense legitimized or authorized the content, fixed it in time, and established a chain of responsibility. So for example, one could be confident that a copy of the Statistical Abstract of the United States received in a shipment box in a depository library was in every respect consistent with the copy that was approved and ordered to be printed by U.S. Census Bureau officials.

In the environment of the World Wide Web, government processes and structures for such verification are being altered and reduced, and various aspects of this scenario change. The publication is not securely fixed in time by the printing process. Web publishing in Government agencies often has departed from the bureaucratic structures for review and approval that grew up around print. In this dynamic and less certain environment, users still expect some mechanism to establish the same level of trust. It is generally recognized that this role may be filled by some variant of the notion of the "trusted third party"; ultimately, authenticity and integrity of digital objects are matters of trust. Our very definitions and descriptions of authenticity and "officialness" revolve around independent verification or comparison that provide a basis for trust.

It is the role, then, of GPO's electronic collection to act in the capacity of "trusted third party," providing the assurance, based on some verifiable criteria, that the information is indeed official.
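
At the purely technical level of authentication described earlier, one verifiable criterion that a trusted third party could record for each file is a fixity value. The following is a minimal sketch under stated assumptions, not a description of GPO's practice: it swaps in a full-file SHA-256 digest for the bitstream sampling mentioned above, and the function names and sample bytes are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a bitstream as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(received: bytes, recorded_digest: str) -> bool:
    """Compare the digest of the received bytes with the digest that was
    recorded when the publication left the server (hypothetical workflow)."""
    return sha256_digest(received) == recorded_digest


# Hypothetical publication content, used only for illustration.
original = b"Statistical Abstract of the United States"
recorded = sha256_digest(original)

print(is_unaltered(original, recorded))                 # same bytes
print(is_unaltered(original + b" (edited)", recorded))  # altered bytes
```

A matching digest shows only that the bits are unchanged since the digest was recorded. It says nothing about provenance or official sanction, which still depend on trust in whoever recorded the value.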
With printed documents, this authenticity was straightforward to establish, and formed the basis of the authority of the FDLP. In the digital world the FDLP must establish this authority and then build the remaining attributes of free and permanent access on that foundation.

The initial approach to this question, which may eventually prove to be only an interim solution, has been to structure access through the FDLP such that users gain access to publications, using GPO's bibliographic tools, from the originating agency or Web site, not from a central repository or mirror site. Cataloging practices and other description have employed Persistent Uniform Resource Locators (PURLs), which assist in managing volatile resource locations and thus simplify access for end users. By consistently directing users to the originating site, within a limited universe of domains (primarily .gov, .mil, and .fed.us), the official character of the publication is assured.

The obvious difficulty with this strategy is that, while some publications will remain in their originating locations permanently, many, if not most, will not. For this reason, a significant commitment of FDLP resources is being devoted to building, through various avenues, a working archive of the publications in the program. Various experiences have led to the conclusion that a single, central archive of electronic publications would be not only extremely difficult to create and maintain, but would be unwieldy to manage and keep viable.
Thus a suite of solutions is being tested, including:

* an in-house archive, operated on GPO servers by GPO personnel;
* agreements with agencies for keeping information permanently available on native servers in agencies;
* agreements with partners within the FDLP such as university libraries and consortia to manage portions of the archive remotely;
* agreements with vendors or service providers for fee-for-service arrangements to store and provide access to publications.

The term "archive" or "archiving" is used here in a sense different from the work of the National Archives and Records Administration, which is charged in 44 USC ch. 29 [8] with guiding and assisting Federal agencies in preserving the essential evidence of the operation of the Federal Government. The FDLP is not trying to preserve a record that demonstrates or documents an agency's operation or mission. Instead, the attempt is made to preserve access to electronic publications.

The strategy to achieve seamless ongoing access distributes responsibility among FDLP stakeholders. Where possible, GPO obtains a documented commitment from publishing agencies that electronic publications will be available on the originating site permanently, and that GPO will be given the files to manage in the event that the agency cannot honor that commitment. Where a documented agreement is not possible, GPO downloads a copy of the publication to its own archive or seeks a partner to manage the archived publications. These publications are retained, updated as needed, and provided to the user upon verification that the information is no longer attainable from the originating site. The PURL is redirected to the archived copy and the user is alerted that the publication is an archived version.
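
The redirection described above can be sketched in a few lines. This is an illustrative sketch only, not GPO's PURL software: the `PurlRecord` structure, the `resolve` and `is_official_host` functions, the `origin_available` flag (standing in for the verification that the origin no longer serves the publication), and all URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple
from urllib.parse import urlparse

# Domains the text names as the limited universe of official sources.
OFFICIAL_SUFFIXES = (".gov", ".mil", ".fed.us")


@dataclass
class PurlRecord:
    origin_url: str                     # location on the originating agency site
    archive_url: Optional[str] = None   # archived copy, if one has been captured
    origin_available: bool = True       # result of a (hypothetical) availability check


def is_official_host(url: str) -> bool:
    """True if the URL's host falls under one of the official suffixes."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(OFFICIAL_SUFFIXES)


def resolve(purl: str, table: Dict[str, PurlRecord]) -> Tuple[str, bool]:
    """Return (target URL, is_archived) for a PURL.

    Users are sent to the originating site while it is available; otherwise
    they are redirected to the archived copy and flagged so that the
    interface can alert them that the copy is archived.
    """
    record = table[purl]
    if record.origin_available:
        return record.origin_url, False
    if record.archive_url is None:
        raise LookupError("no archived copy recorded for " + purl)
    return record.archive_url, True


# Hypothetical example data.
table = {
    "purl-1": PurlRecord(
        origin_url="https://stats.example.gov/abstract.pdf",
        archive_url="https://archive.example.gov/abstract.pdf",
    )
}
print(resolve("purl-1", table))
table["purl-1"].origin_available = False
print(resolve("purl-1", table))
```

Because the PURL itself stays constant, catalog records that cite it need no change when the target moves from the agency site to the archive.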

In terms of assuring permanent access, technology has radically altered the deposit model. Previously, permanence was assured by multiple depositories being bound to retain publications in perpetuity, and GPO's responsibility as custodian was largely complete at the point that publications were shipped to depository libraries. GPO must now assume responsibility for keeping the single electronic source not only currently available, but technologically viable.

Based on the model of print-based publishing, GPO has no control over the content of publications or over the format in which information is presented. Although there have been calls for standardization of electronic Government information formats, no standards exist, no consensus has evolved, and no entity exists with the authority to promulgate or enforce a fully Government-wide standard. Presentation of electronic publications that rely on an open standard, such as HTML (for text) or TIFF (for images), will presumably remain straightforward as the Web and its successor technologies develop. Publications, however, that rely on a proprietary format or commercial software for their use pose serious challenges, since backward compatibility in newer technology will depend on market forces and demand.

GPO cannot consider content separate from access and access mechanisms; thus the greatest challenge over the coming years will be to keep publications captured in 2000 viable despite the advance of technology. Transfer of all publications in the archive to a single, migration-friendly, open standard format has not, in the interest of preserving the official nature of the publications, been pursued thus far. Such transfer may, however, present itself as the best alternative for keeping archived publications alive. Likewise advances in electronic archiving may conceivably separate format from storage and re-presentation and thus ease the dilemma.

The Future

Three factors have worked in concert to move GPO's transition forward since its inception in the mid-1990s: the overall trend in Government to adopt electronic media for communicating with the public; the rapid adoption of electronic media in libraries generally; and the clear direction of the Congress to implement greater electronic access and to seek reductions in the cost of disseminating information. The third factor has been the most direct and imperative.

The transition process has been guided by the underlying assumption that as the emphasis on tangible product distribution diminishes, GPO's resources will be redirected toward managing electronic files, coordinating the cooperative efforts that will facilitate preservation of electronic publications, and maintaining a standard of permanence, authenticity, and reliability comparable to the print-based program. While the specifics of implementing these assumptions are developing by degrees, the driving forces have remained clear: that free public information is a right of the people, that Government has an obligation to provide broad, ongoing public access to that information, and that the FDLP continues to be uniquely placed to assure that access.

Footnotes

[1] "Access to Federal Electronic Information" Title 44 U.S. Code chapter 41 (1994 ed.) (http://www.access.gpo.gov/congress/cong013.html)

[2] U.S. Government Printing Office. Report to the Congress: Study to Identify Measures Necessary for a Successful Transition to a More Electronic Federal Depository Library Program (Washington: Government Printing Office, 1996) (http://www.access.gpo.gov/su_docs/fdlp/pubs/study/studyhtm.html)

[3] "Printing Act of 1895" 28 United States Statutes at Large p.612 et seq.

[4] U.S. Government Printing Office, Library Programs Service. Managing the FDLP Electronic Collection: A Policy and Planning Document. (Washington: Government Printing Office, 1998)

[5] "Depository Library Program" Title 44 U.S. Code sec. 1901 (1994 ed.) (http://www.access.gpo.gov/su_docs/fdlp/pubs/ecplan.html)

[6] Judith Schiek Robinson. Tapping the Government Grapevine: The User-Friendly Guide to U.S. Government Information Sources. Third edition. (Phoenix: Oryx Press, 1998)

[7] "Preserving Digital Information: Report of the Task Force on Archiving of Digital Information" commissioned by The Commission on Preservation and Access and The Research Libraries Group, 1996 (http://www.rlg.org/ArchTF/)

[8] "Records Management by the Archivist of the United States and by the Administrator of General Services" Title 44 United States Code chapter 29 (1994 ed.) (http://www.access.gpo.gov/congress/cong013.html)