A. Data Capture

Data and Information Sources

To achieve its objectives, the USGCRP requires a wide variety of data and information from many disciplines for long periods of time. The data and information exist in both digital and nondigital forms and include raw data from observation systems and surveys; value-added data from data assembly activities; derived data and information from models and other investigations; long-term as well as short-term data and information sets; historical, current, and future data and information; in situ and remote observations; information from other studies; and references to data and information that are produced outside the USGCRP, such as major international and national programs.

Extensive collections of such data and information critical for global change research are now supported in the Federal agencies and many other sources such as libraries, State and local governments, university and other research efforts, and the international community. Within the USGCRP, the focused program includes data and information from those programs identified by the agencies as having the needs of the USGCRP as their primary objective. A wealth of other data and information, gathered for purposes other than global change research, also is critical for the USGCRP, including those identified in the USGCRP contributing programs. Some of these data and information may not reside in organized data and information systems, and are not easily accessible or well documented. Table A1 shows agency sources for data and information critical for the USGCRP arranged by its central priorities. F indicates programs that have the USGCRP as their primary objective, and X indicates those that do not.

The Special Issue

The USGCRP recognized at its inception that a substantial part of the data and information needed for global change research would not be produced by the focused programs created specifically for USGCRP purposes. The Federal Government supports thousands of individual research projects in Earth, environmental, and human sciences that are not part of the USGCRP focused programs. These projects have produced such national treasures as the archives held in the NDCs, the daily data collection and analyses provided by the National Weather Service, university-based holdings of tree ring samples and ice cores, forest inventories from the USDA Forest Service, cartographic data and stream flow records from the USGS, biological and ecological observations of the DOI, synoptic meteorological and bathythermograph data from the DOD, fossil fuel statistics from the DOE, demographic data from the Bureau of the Census, and soil maps from the USDA Soil Conservation Service. Table A2 provides examples of data sets that are needed to help answer specific global change research questions.

From the perspective of such other programs and projects, global change research constitutes a secondary use whose needs are generally not funded. Only some of the crucial programs and projects have already been identified as USGCRP contributing programs; their budgets in FY 1993 were about $1.4 billion, about equal in size to the total for the focused USGCRP.

From Table A2 it can be seen that such data and information not only come from many agencies, but also cover all the central priorities of the USGCRP. Such data and information are critical not only in their own right for global change research, modeling, and assessments, but also to the focused data and information program to fill gaps in coverage, tie together diverse data sets, and improve the quality and usefulness of the data and information by providing ground truth at selected points for calibration. The latter is particularly important for the remotely sensed data required by the USGCRP. What is not apparent from the table is that some of these critically important data and information are unavailable to the USGCRP because they are intermingled with material subject to security, proprietary, or regulatory constraints.

This special issue identifies the need for agencies participating in the USGCRP to assemble, document, archive, and disseminate the data and information critical to the USGCRP produced by programs that do not have global change research as a primary objective. The goal is to provide GCDIS access in useful form to data from such programs and projects through existing and planned data systems and centers or by creating new data centers for them. Within the framework established, agencies will propose individual or interagency activities to meet specific USGCRP requirements. These requirements then will be coordinated from an interagency perspective through the CENR. By using this coordinated approach, the program will reap the benefits of using data and information from outside the USGCRP and allow for deeper insights that arise from cross-disciplinary analysis.

Although the existing USGCRP priorities and milestones help focus the data capture activities, many thousands of related data and information products of potential relevance still exist, with many needing an investment of resources to become well documented and accessible. As a result, it is essential to establish more specific priorities for data capture activities. This will be done in cooperation with the other elements of the USGCRP, the user community, and advisory groups such as the NAS.

Agency planning for the special issue recognizes that global change data management efforts should be well coordinated with other established interagency data management activities, such as the FGDC, the Office of the Federal Coordinator for Meteorology, and other appropriate coordinating organizations and committees.

Data Capture Activities

To make relevant data and information resulting from data capture activities available to the USGCRP, a number of different activities are required. These are grouped here into the four activity areas described in The U.S. Global Change Data and Information Management Program Plan: that is, assembling, documenting, archiving, and disseminating. Specific activities that overlap activity areas are included in the area that is expected to require the most resources.


1. Inventory of data. Agencies need to set specific objectives and conduct an inventory of previously generated data and information that focuses on those research areas most relevant to the highest priority science objectives and research milestones of the USGCRP. Library holdings will be included in the inventory. (It is estimated that more than half the publications in Federal scientific and technical libraries and information centers relate to global change.) Attention will be given to obtaining input from the NRC and other representatives of the user community in order to guide development of the inventory.

2. Inclusion in the GCDIS. Data and information included in the GCDIS will be based on the USGCRP priorities and the advice of the user community. Where agencies cannot provide direct access to their holdings, funds will be required for data transfer to data centers participating in the GCDIS. In some cases, extracts will be created, eliminating material that is sensitive due to security, proprietary, regulatory, or other considerations. In other cases, key data sets will be put into digital form in order to be generally useful.

3. Integrate preexisting programs. Many of the focused USGCRP programs were started before the widespread appreciation of the need to plan for interagency data and information management in the programs. For example, resources are needed to integrate with the GCDIS the data and information management components of programs such as the TOGA, the WOCE, and the GEWEX.

4. Data rescue. Many data and information products of vital importance for global change research need immediate rescue and long-term maintenance to avoid deterioration and loss.


5. Priorities. It will be almost impossible to document adequately all data and information potentially useful for global change research so that they can be properly used decades after their creation. With participation of the NRC and the user community, priorities will be established for the data and information requiring such documentation, taking advantage of available expertise before such opportunities disappear forever.

6. Current data center holdings. The adequate documentation of holdings already in data centers is a major undertaking. The most critical holdings need sufficient documentation so that they not only will be accessible, but can also be used with confidence decades from now. This level of documentation is typically much more stringent than now exists and requires significant effort to remedy.

7. Data from other sources. Global change studies often require correlating data from disparate sources, such as remotely sensed data combined with in situ measurements. Even within a single set of data there are often discontinuities attributable to instrumentation changes or other events. It is essential that complete information describing these data is available to researchers in order that the data can be applied appropriately.

8. Training. Training of agency research staffs, information specialists, and managers will be required to institute the policies and procedures for managing data and information to support global change research, developing the skills needed for effectively using the wide range of GCDIS capabilities, and improving communication of technical specialists with the public. Attention will be given to creating global change research network information, information products, and electronic data base information appropriate to educators at the K-12 levels.


9. Long-term storage facilities. Agencies participating in the GCDIS have agreed to manage all data and information in a manner that complies with archiving standards. Agency data centers will need significant additional resources to handle not only the large numbers of additional data and information products that need to be archived, but also the very large data volumes produced by such programs as the Next Generation Weather Radar (NEXRAD).

10. Small data centers. Libraries and small data and information centers will collect more and more data sets and information created through individual research efforts. These data and information will sometimes be available at Federal libraries, but more often they will be found at academic institutions. Such local reference and research systems need to be compatible with the GCDIS, and many need to be upgraded in order to store, retrieve, and send large amounts of data and information to users.

11. Interagency coordination. The GCDIS will have a wide range of interagency coordination requirements, including inventory search and order, pricing, order tracking and billing, user services, submission guidelines, system interfaces, retention and purging, media, formats, and performance assessments. As the amount and scope of the data and information in the GCDIS increases, support for such interagency and external coordination will increase.

12. Researcher feedback. Agencies will actively involve users and others in the affected research communities in their global change data and information management activities. This will include periodic formal reviews by expert panels such as the NAS CGED, as well as surveys of user satisfaction and the soliciting of suggestions for improvement of the GCDIS.


13. User services. Data and information centers and libraries will need resources to provide the data and information to users and to respond to user questions both on the proper use of specific data and on the holdings of other data centers. This need will increase with the number of products in the GCDIS and with the number of its users. Among the Federal and academic libraries, as well as the Federal Depository Libraries, access to the Internet and CD-ROM equipment is essential.

14. Consolidated services. In some cases, data center services for the data and information covered by data capture activities exist but are so fragmented and incompatible with the GCDIS that they are almost unusable. For example, there are 18 separate Long-Term Ecological Research Centers that can only be accessed individually. The GCDIS will not only tie all the participating data centers together, but also create dissemination mechanisms to allow researchers to deal with groups of centers as a single entity.

15. International. Access to data and information holdings outside the United States that are critical to the USGCRP will require significant resources. For instance, data and information sets from international sources may need to be obtained by connecting them to the GCDIS through international mechanisms such as the UNEP Global Resources Information Database (GRID). Additionally, selected technical information received from international sources needs translation into English.

16. Standards. Compliance with national and international standards in use by libraries and data centers is crucial even though it may be costly. Access to major library collections, information centers, and bibliographic information systems is dependent upon implementation of standards developed for bibliographic control, communications, product formats, and product delivery.

Fiscal Requirements

Although the annual USGCRP program plans have recognized a dependence on data and information from programs not focused on global change, resource requirements have not previously been identified to assemble, document, archive, and disseminate these crucial data and information. Since these data and information reside with groups whose primary mission is not global change research, in many cases an additional specific allocation of resources is needed for these data and information to be made available to support global change research.

From the perspective of the agencies, the funding required by the special issue on data capture is over and above that which the agencies had previously planned as their part of the USGCRP. The individual IWGDMGC agencies, however, have been encouraged to submit budget requests to meet this critical need. Agency special issue plans will be a part of each agency's GCDIS implementation plan.

