Used with permission from Cartography and Geographic Information
Systems, Volume 23, Number 3. Copyright 1996 American Congress on
Surveying and Mapping.

RESEARCH PRIORITIES FOR GEOGRAPHIC INFORMATION SCIENCE

University Consortium for Geographic Information Science

 

http://www.ucgis.org

PREFACE

In the United States and many other countries, the scientific community faces an unprecedented period of continued downward pressure on the funds available from the sources that have traditionally provided its financial support. However hard the scientific community is able to lobby, and however convincing its case, it is clear that some very difficult choices must be made in allocating what is likely to be an ever diminishing resource in the coming years. Governments are under pressure from disgruntled taxpayers and face spiralling costs, and research will always be one of the easiest budgets to attack, since it is rarely protected by the kinds of legislation that preserve pensions and other entitlements.

 

In such circumstances, the mechanisms that establish funding priorities become crucial. Traditionally, priorities for funding research have been established by a complex process that attempts to balance intellectual curiosity with the need to solve immediate and practical problems, and to ensure the future health of industry through invention. One of the most important components of this mechanism is the role played by scientists themselves. While society as a whole must determine the importance of many problems, scientists are very well equipped to estimate the likelihood that a given problem can be solved through research, and the resources necessary to do so. Scientists have an important role to play, therefore, in helping to set priorities for research, and the scope of the scientific agenda. Thus although the final decisions on allocation of public funds will always be made by governments acting on behalf of citizens, it is essential that scientists be involved in the dialog that precedes the allocation of the limited resources that will be available for future research.

 

In 1995, following several years of informal discussions, a group of U.S. research universities, national laboratories, and learned societies formed the University Consortium for Geographic Information Science (UCGIS). The term "geographic information science" has emerged recently as an acceptable umbrella term for the fundamental problems surrounding the effective capture, interpretation, storage, analysis, and communication of geographic information--topics that have become increasingly important with the popularity of geographic information systems (GIS), the Global Positioning System (GPS), satellite remote sensing, and related geographic information technologies. Geographic information scientists study how people use geographic information in direction-finding; develop techniques to measure the accuracy of geographic information; find better ways of representing geographic information in digital computers; develop the standards that allow computers to exchange geographic information despite differences in system formats; and research many other important issues.

 

Members of UCGIS combine strengths in a range of disciplines. In order to qualify for membership, an institution must demonstrate that it has made a significant commitment to research in geographic information science; that the commitment extends across a number of disciplines; and that mechanisms for coordination and cooperation exist. Further information on UCGIS can be found by accessing its web site, http://www.ucgis.org.

 

An important goal of UCGIS is the development of a set of research priorities for geographic information science. Accordingly, in June 1996 delegates from the 29 research institutions that were then members of UCGIS met in Columbus, Ohio to carry out a consensual process of development of a prioritized research agenda. Prior to the meeting, each institution was given the opportunity to identify five topics, based on discussion between geographic information scientists on each campus. In Columbus these initial topics were clarified, merged, extended, and refined. Delegates then voted to identify the final list. After the meeting, working groups further refined the topics, formalized them in accordance with a standard format, and submitted them to an editorial committee.

 

This paper presents a summary of the research priorities that emerged from this process. UCGIS regards the organization's research agenda as a dynamic, continually evolving document. In its initial form, it represents only the views of the research community, and thus is no more than the first stage in a dialog that will involve as many as possible of the other stakeholders in the national process of prioritization. Moreover, the research community itself is likely to wish to modify the agenda, as science evolves and more is known about the fundamental problems associated with geographic information. Nevertheless, we believe it is important that the UCGIS research priorities be published in the form of this paper, in order to make them accessible to the widest possible audience, and to move the process of dialog forward as rapidly as possible. UCGIS plans a number of other activities in the coming months to stimulate this dialog, and hopes that as much input as possible will be forthcoming from the other stakeholders. How, for example, do these priorities match with those of government agencies with heavy commitments to geographic information, or with those of the GIS software industry? How do they match those of scientists from disciplines that use geographic information technology, rather than study its basic issues?

 

Certain individuals played important roles in the development of this paper. Key roles in organizing the Columbus meeting on which it is based were played by several members of the UCGIS Research Committee: David Mark, State University of New York at Buffalo (chair); John Bossler, Ohio State University; Jerome Dobson, Oak Ridge National Laboratory; Max Egenhofer, University of Maine; George Hepner, University of Utah; Donna Peuquet, Pennsylvania State University; and Dawn Wright, Oregon State University. UCGIS also acknowledges the contributions of the delegates to the Columbus meeting; the working groups who contributed to the elaboration of each topic and the draft white papers on which this paper is based; the Directors of UCGIS; and the members of the editorial committee for the research agenda: Earl Epstein, Ohio State University; Michael Goodchild, University of California, Santa Barbara (co-chair); Carolyn Hunsaker, Oak Ridge National Laboratory (co-chair); John Radke, University of California, Berkeley; Bill Reiners, University of Wyoming; and Alan Saalfeld, Ohio State University. Comments on this paper are invited. They should be directed to the President of UCGIS (currently William Craig, University of Minnesota), by email to president@ucgis.org.

INTRODUCTION

Geographic information can be defined as consisting of facts about specific places on the Earth's surface (spatial information is defined more generally as information related to any multidimensional frame, and thus includes medical imaging, for example, although geographic and spatial are frequently used almost interchangeably). Traditionally, such information has been expressed in the form of maps, and maps are often embedded in larger information sources such as atlases, books, or encyclopedias. The advent of aerial photography in the early years of this century, and later satellite remote sensing from space, added greatly to the availability, precision, and richness of geographic information. Geographic information can also be expressed in the form of written text, or in the tables produced by statistical agencies. Telephone directories are yet another form of geographic information, providing links between individuals, telephone numbers, and street addresses.

 

The handling of geographic information has always raised issues of a scientific nature. The science of mathematical geography flourished in classical and medieval times because of the need to understand the basic shape of the Earth, and its dimensions, so that it could be mapped accurately and its surface transformed to fit the flat paper sheets of maps. Geodetic science continues to address such questions, as ever more accurate geometric models of the Earth's shape are devised in response to improved measurements. New technologies, such as the street map systems now being installed in many vehicles to aid navigation, raise new interest in old questions about the ability of people to comprehend and work with information expressed in map form.

 

The rapid development of geographic information technologies over the past two decades has led to fundamental changes in the ways many human activities are organized. The forester who once managed forest resources by walking the ground now relies on cost-effective aerial photography and satellite images to support the same functions at greatly reduced cost. The utility company uses geographic information systems and geographic databases instead of hand-drawn paper records to keep track of the locations of cables and pipes, and to manage their maintenance. The delivery company uses GIS to optimize its routes, and to allow the customer to monitor the progress of a shipment. Geographic information technologies allow vital linkages to be made between apparently unrelated activities, based on common geographic location, and have led to a much higher level of integration and sharing between what were previously rigidly separated parts of an organization.

 

Many of these changes have been driven by broader developments in information technology in general, and have little to do with research in geographic information science. Faster and cheaper computing, the shift from mainframe to desktop, the development of the Internet, and many other breakthroughs have all made it easier to process and store geographic information in digital form. On the other hand, geographic information continues to lag behind other information types that are inherently more suited to digital representation, such as numbers and text. Geographic information is uniquely different from other information types in several key respects, suggesting that a science of geographic information is particularly important if some of the barriers to effective use of this vitally important form of information are to be overcome.

 

First, geographic information is rich and voluminous. While the contents of a book of 100,000 words can be captured on a megabyte diskette, it can easily take two orders of magnitude more storage capacity to capture a reasonably precise representation of a single paper map. A single Earth image from a satellite can fill the entire storage capacity of today's personal computer.

 

Second, the surface of the Earth is infinitely complex, and consequently geographic information must always be an approximation. A vast range of choices therefore exists, depending on what is captured and what is lost in the process of creating a map, or a representation of the Earth's surface in digital form. These choices will later affect the usefulness of the information, and may even lead to litigation when mistakes are made.

 

Third, geographic information is increasingly essential to many activities of modern society. Growth of international trade, and the globalization of economies, require an unprecedented level of knowledge of the diverse conditions existing in different parts of the planet. The Earth's resources are being exploited at ever faster rates, and accurate information is needed for their effective management and conservation. Geographic information is essential to our understanding of the physical Earth system, and the interrelationships between its components. Moreover, the level of interest in detailed geographic information inevitably varies geographically, leading to complex problems in matching availability to need.

 

Fourth, geographic information science is inherently multidisciplinary. No existing or traditional discipline can claim a unique role in solving the problems of handling geographic information--and indeed research in these issues has traditionally been divided among a number of disciplines that have often competed among themselves for the available resources. In this environment, UCGIS hopes to provide an interdisciplinary meeting ground, where scientists from different disciplines who share a common interest in solving these problems can work together, each bringing a different set of approaches and paradigms, and together combining them to optimum effect.

 

Finally, the growth of geographic information technologies has already had profound and in many cases unanticipated impacts on society. The ability to use GIS to link together digital street maps and telephone directories, for example, means that it is now possible to identify the telephone number of a house by pointing to its image on a computer screen. Marketing campaigns can now be targeted to the imputed socioeconomic status of each household. These possibilities are the simple result of improvements in technology, but their implications for individual privacy are much more profound.

 

The Columbus meeting of the UCGIS identified ten priority research topics within geographic information science. While the delegates believe that they are of higher priority than other topics, no attempt was made to rank the ten. Instead, we believe that with sufficient resources significant progress can be made on all ten topics in the next few years; and that in each case there will be substantial benefits to society at large, and specifically to the various groups who depend in one way or another on geographic information technologies. While the following sections identify the specific benefits of research in each case, in general we believe that investment in research in these priority topics for geographic information science will:

The ten priority topics are presented in an order that has no significance. We anticipate that subsequent dialog between the research community, funding agencies, stakeholders with interests in the results of research, and other groups will both refine the topics and add specific prioritization as they attempt to adapt them to particular needs and objectives. In the interests of brevity, this summary of the research priorities does not include references. Instead, interested readers are referred to the appropriate UCGIS source documents, available via the UCGIS web site http://www.ucgis.org.

SPATIAL DATA ACQUISITION AND INTEGRATION

Technological advances are making it possible to capture geographic information with ever increasing accuracy. Commercial remote sensing images from space will soon offer a resolution of one meter or better. Satellite telemetry using the Global Positioning System (GPS) can now achieve accuracies well within one centimeter. But each new data set, and each new data item that is collected, can only be utilized fully if it can be placed correctly within the context of other available data. Integration with other data is increasingly important in new geographic information products. For example, the production of a digital orthophoto quadrangle (DOQ), a new form of digital imagery that has been processed to correct for distortions due to topography and camera angle, requires four distinct types of information, all of which must be successfully integrated to produce an accurate result: the image acquired from the airborne sensor; a digital model of the elevation of the land surface; a minimum of four geodetic control points whose locations are known accurately; and information about the sensor device itself.

 

Adding to the complexity of the task of integrating diverse forms of geographic information is the existence of two very different types of accuracy. A map or image can capture the relative positions of features with great accuracy; but their absolute positions depend on how successfully the map or image is registered to an Earth frame, most notably the system of latitude and longitude. For example, we can know very precisely the distance from one mountain peak to another, but have very poor information on their latitude and longitude positions. This difference becomes crucial when two data sets have to be combined--unless both have high levels of absolute positional accuracy, there will be significant errors of misregistration. Such errors often occur when databases are updated with apparently more accurate information.
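
The distinction between the two types of accuracy can be illustrated with a small numerical sketch. The coordinates and the registration error below are invented for illustration and are not drawn from any real data set; the sketch assumes a simple planar grid measured in meters and a constant shift applied to the whole map.

```python
import math

# Hypothetical "true" positions of two mountain peaks on a local grid (meters).
true_peaks = {"A": (1000.0, 2000.0), "B": (4000.0, 6000.0)}

# A hypothetical map that is internally consistent but registered to the
# Earth frame with a constant error of 30 m east and 40 m north.
shift = (30.0, 40.0)
map_peaks = {name: (x + shift[0], y + shift[1])
             for name, (x, y) in true_peaks.items()}

def distance(p, q):
    """Planar distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Relative accuracy: is the peak-to-peak distance preserved on the map?
relative_error = abs(distance(map_peaks["A"], map_peaks["B"])
                     - distance(true_peaks["A"], true_peaks["B"]))

# Absolute accuracy: how far is a mapped peak from its true position?
absolute_error = distance(map_peaks["A"], true_peaks["A"])

print(relative_error, absolute_error)
```

Under these assumptions the distance between the peaks is reproduced without error, yet each peak is misplaced by 50 meters, which is the misregistration that would appear if this map were overlaid on a correctly registered data set.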

 

Similar problems of data integration occur at the boundaries between data sets, particularly if they have been registered independently to the Earth frame, or if they have been produced using different standards and protocols. This problem of edgematching is found frequently in geographic data, and can have serious consequences in many applications. A road, for example, can disappear, shift position, or change classification at a county boundary if the two counties' mappings are of different dates, have been registered to the Earth frame using different control points, or use different systems of classification, respectively.

 

Recent trends affecting the agencies that have traditionally supplied the nation's basic mapping have exacerbated the need for better approaches to integration. The National Spatial Data Infrastructure is conceived as a system of collaboration between agencies at all levels--federal, state, and local--and the private sector, to work to common standards and protocols in building the nation's base of geographic information. Instead of one agency, able to set its own procedures and ensure high internal levels of quality control, the base mapping of the future will be provided through a series of consortium agreements between independent producers. Problems have also been exacerbated by communication technologies like the Internet, which offer the opportunity to integrate data from widely different sources.

 

To support such efforts, we need to develop much better tools for data integration than currently exist, based on high quality research. The term conflation has been suggested as a way of referring to techniques that are capable of automatic registration of geographic data sets, based on recognition of common features, and adjustments to both geometric positions and feature types. Conflation techniques are needed for many different types of geographic data, ranging from digitized maps to digital images; and with varying degrees of human intervention. To be reliable they must be based on sound principles, including an understanding of the causes of misregistration and their likely effects.
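
As a deliberately simplified sketch of one step in conflation, the following estimates the shift that best aligns one data set with another from a handful of features already recognized as common to both. It assumes that the misregistration is a pure translation and that the matching of features has already been done; practical conflation would also have to deal with rotation, scale, local distortion, and the reconciliation of feature types. The function names and coordinates are hypothetical.

```python
def estimate_translation(source_pts, target_pts):
    """Least-squares translation mapping source points onto target points.

    For a pure translation, the least-squares solution is the mean of the
    coordinate differences between matched points.
    """
    if len(source_pts) != len(target_pts) or not source_pts:
        raise ValueError("need the same non-zero number of matched points")
    n = len(source_pts)
    dx = sum(t[0] - s[0] for s, t in zip(source_pts, target_pts)) / n
    dy = sum(t[1] - s[1] for s, t in zip(source_pts, target_pts)) / n
    return dx, dy

def apply_translation(points, offset):
    """Shift every (x, y) point by the given (dx, dy) offset."""
    return [(x + offset[0], y + offset[1]) for x, y in points]

# Hypothetical road intersections recognized in both data sets (meters).
source = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
target = [(5.0, -2.0), (15.0, -2.0), (15.0, 8.0)]

offset = estimate_translation(source, target)
aligned = apply_translation(source, offset)
print(offset, aligned)
```

The residual differences between the aligned and target points after such an adjustment give one indication of whether a simple model of misregistration is adequate, or whether its causes need to be modeled in more detail.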

 

Some of these techniques are likely to be common to other areas where spatial data shares similar characteristics, such as medical imaging; but in other cases the unique characteristics of geographic data argue for specialization. Effective research on integration will require the collaboration of many sciences with common interests and motivations, including image processing, pattern recognition, robotics, computer science, geodetic science, and photogrammetry.

 

In the coming years, we can expect continued research into better tools for spatial data acquisition, as new satellite sensors are launched and new generations of global positioning systems become available. Major advances are also likely in ground-based data acquisition systems. Because of the enormous volumes of data generated by automatic sensors, it will be increasingly important to employ sophisticated algorithms for directing ground-based sampling, for recognizing patterns and analyzing data directly in the field. The term field GIS has been used to describe systems that can be taken directly to the observation site, and use GIS-like tools to help scientists collect a more efficient and economical representation. Field GIS is becoming widely used in forestry, and in improving the efficiency and minimizing the impacts of intensive agriculture.

DISTRIBUTED COMPUTING

Digital technology is moving rapidly to distributed computing. It is now possible for parts of a database to be stored and maintained at different locations; for users to take advantage of economical or specialized processing at remote sites; for decision makers to collaborate across computer networks in making decisions; or for large archives to offer access to their data to anyone connected to the Internet. These and a host of other opportunities are offered by recent developments in hardware, software, and large bandwidth communications technologies.

In the future, it is likely that large scale, integrated packages such as GIS will be transformed into collections of smaller, interoperable modules. The free flow of data between them will be enabled by open specifications such as the industry standard open object specifications, and by the GIS industry's OGIS, or open geodata interoperability specification. Early versions of these "plug and play" GIS software architectures are already appearing. Modules may coexist in one system, or may be distributed across a network and assembled only when needed and with minimal user intervention. Already, we are seeing the rapid implementation of such ideas in the form of "add-ons" to World Wide Web browsers, and in languages like Java.

 

These technical advances in hardware, software, and communications create the need for two distinct types of research, both directed at making best use of broad technical advances within the comparatively narrow field of geographic information technologies. We need broadly based research into the economics, institutional impacts, and applications of distributed computing; and more narrowly defined research into the technical implications. The latter agenda is presented below under the topic Interoperability.

 

The problems and applications that GIS addresses seem particularly suited to take advantage of distributed computing. Geographic decisions supported by GIS must often be made by many stakeholder groups who are distributed both geographically and socially. Stakeholders are often located in different tiers of the administrative hierarchy. Data custodians may also be distributed, as may be the power to process geographic data in sophisticated software and hardware. On the other hand, a host of issues arise with the implementation of distributed architectures, some technical and some institutional. For example, we currently lack the kind of comprehensive, rigorous approaches to data description that will be needed if users are to be able to search for suitable data sources across distributed networks.
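
To suggest what a data description might need to contain, the sketch below defines a minimal record and a search over a collection of such records by theme and geographic extent. The record fields, the catalog entries, and the function names are all hypothetical and do not correspond to any existing standard; the sketch also assumes bounding boxes in decimal degrees that do not cross the 180-degree meridian.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetDescription:
    """Hypothetical minimal description of one geographic data set."""
    title: str
    custodian: str
    theme: str
    west: float   # bounding box, decimal degrees
    south: float
    east: float
    north: float

def overlaps(d, west, south, east, north):
    """True if the data set's bounding box intersects the query box."""
    return (d.west <= east and d.east >= west
            and d.south <= north and d.north >= south)

def search(catalog, theme, west, south, east, north):
    """Titles of data sets with the given theme that cover part of the query box."""
    return [d.title for d in catalog
            if d.theme == theme and overlaps(d, west, south, east, north)]

# Invented entries held by three different custodians.
catalog = [
    DatasetDescription("County roads", "County A", "transportation",
                       -83.5, 39.5, -82.5, 40.5),
    DatasetDescription("State hydrography", "State agency", "hydrography",
                       -85.0, 38.0, -80.0, 42.0),
    DatasetDescription("Regional roads", "Regional council", "transportation",
                       -120.0, 35.0, -118.0, 37.0),
]

matches = search(catalog, "transportation", -83.1, 39.9, -82.9, 40.1)
print(matches)
```

Even this small example shows why rigor matters: the search succeeds only if every custodian records theme and extent in the same way.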

 

GIS has already adapted to several changes in computing architectures. Early mainframe systems were quickly extended to remote sites using phone lines and terminals. The minicomputers of the late 1970s were replaced by workstations and personal computers that were increasingly networked for exchange of data. Client/server architectures were adopted in the late 1980s, in a first step towards distributed software. Today, such architectures are being generalized to full distribution, while the user may be presented with an integrated view of the system that may bear little relationship to its actual structure. Indeed, we may reach a time when the entire global network is best conceived as a single, integrated computing system, as we once conceived of the mainframe.

 

Each of these changes has stimulated new growth in GIS applications, in the managerial and institutional arrangements that support it, and in the basic economics of GIS and geographic data in general. These changes are likely to continue in the transition to fully distributed computing architectures. Moreover, such architectures are likely to provide the opportunity for the GIS community to interact with entire new communities, particularly the library community, and for geographic information to become even more important to a range of human activities.

 

We need to anticipate the new applications and services that will become possible with distributed computing, and the costs and benefits associated with each of them. Monolithic solutions, which fail to take advantage of distributed computing architectures, are likely to become increasingly more expensive in comparison to solutions that exploit the opportunities offered by technology to share responsibilities and roles among various stakeholders. Studies are needed of the effects of the implementation of distributed computing architectures, and the opportunities they offer to GIS and geographic information in general. In addition to specialists in the technical aspects of the architectures, such as computer scientists, communications experts, and computer engineers, effective research will require the skills of geographers, economists, information scientists, digital librarians, and experts in public policy. UCGIS can play a key role in providing the institutional framework to link experts from these disciplines in a coordinated approach, and to develop partnerships with software vendors and other institutions.

EXTENSIONS TO GEOGRAPHIC REPRESENTATIONS

The manner in which geographic information is represented both conceptually and physically as stored data observations is a central issue for any field that studies phenomena on, over, or under the surface of the Earth. A data representation scheme is required, and is in fact inextricably linked with the processes of analysis and modeling of geographic phenomena. For example, in systems that find routes between places the geographic information is typically represented in the form of links between places denoted as points. In dealing with environmental problems, pollutants in air, water, or soil tend to be represented simply as grids. For other purposes, these same places may be represented as polygonal objects that are locationally defined by explicit boundaries.
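
The three kinds of representation mentioned above can be sketched side by side. The place names, grid values, and coordinates are invented, and each structure is the simplest one that supports the analysis paired with it; the sketch is meant only to show how the choice of representation determines which questions are easy to ask.

```python
from collections import deque

# 1. Links between places denoted as points (suited to route finding).
links = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}

def fewest_links_route(links, start, goal):
    """Breadth-first search for a route using the fewest links."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbor in links[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# 2. A grid of cells (suited to a continuously varying pollutant).
concentration = [[0.0, 0.1, 0.0],
                 [0.2, 0.9, 0.3],
                 [0.0, 0.4, 0.1]]

def peak_cell(grid):
    """(row, column) of the cell with the highest value."""
    return max(((r, c) for r in range(len(grid)) for c in range(len(grid[r]))),
               key=lambda rc: grid[rc[0]][rc[1]])

# 3. A polygon defined by an explicit boundary (suited to area measurement).
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

route = fewest_links_route(links, "A", "E")
peak = peak_cell(concentration)
area = polygon_area(parcel)
print(route, peak, area)
```

None of the three structures can readily answer the question posed to the others: the network has no notion of area, the polygon has no notion of connectivity, and the grid has neither explicit boundaries nor links.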

 

The selection of information to be represented, and the representational scheme employed, is thus often driven by the application, and particularly by anticipating later stages of analysis, modeling, or interpretation. In turn, the results of any analysis can be greatly influenced by how the phenomena under study are represented. This is why, on an everyday level, a strip map or route map is more easily used for traveling from one place to another than an overall areal map, whereas a route map is virtually useless for showing the overall distribution of various geographic features within a given area.

 

While it is true that current geographic data representation techniques are capable of representing complex associations among multiple variables, they are nevertheless geared toward representation of static situations on a plane surface at a specific scale--in this respect, they echo and are largely limited to the nature of the paper maps from which many data sets are drawn. Many of these 2-dimensional representations can be extended conceptually to accommodate applications in which the third spatial dimension is important, but operational capabilities for representing and analyzing 3-dimensional data have been integrated only recently into general purpose, commercially available geographic information systems. Current spatial data storage and access techniques are also not designed to handle the increased complexity and representational robustness needed to integrate diverse data across a wide range of applications and disciplines.

 

Earth related data are being collected in digital form at a phenomenal rate, and the data volumes that are being generated are far beyond anything we have experienced so far. The Earth has nearly 1.5 x 10^15 square meters of surface area; a single complete coverage of satellite data at 10 meter pixel resolution would total approximately 1.5 x 10^13 pixels, and the number of bytes needed to store it would be of the same order of magnitude. Also, satellite imagery data is normally represented as a gridded array, or matrix, of cells. It is geometrically impossible, however, to represent the spheroidal Earth with a single mesh of uniform, rectangular cells, and research is needed to find better, less distorted representations.
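
The arithmetic behind the pixel estimate can be checked in a few lines. The surface-area figure is taken at face value from the text above; the assumption of one byte per pixel for a single band is ours, introduced only to turn the pixel count into a storage volume.

```python
surface_area_m2 = 1.5e15   # surface-area figure as quoted in the text
pixel_size_m = 10.0        # ground resolution of one pixel, in meters
bytes_per_pixel = 1        # assumption: one byte per pixel, single band

# Each pixel covers pixel_size_m squared, i.e. 100 square meters.
pixels = surface_area_m2 / pixel_size_m ** 2
total_bytes = pixels * bytes_per_pixel
terabytes = total_bytes / 1e12   # decimal terabytes

print(pixels, terabytes)
```

Halving the pixel size quadruples the pixel count, so finer resolutions or additional spectral bands multiply the volume quickly.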

 

Although many efforts have been made to integrate GIS with dynamic modeling, most have been limited to the development of an interface between two separate types of software systems. Modeling software tends to operate within very narrowly defined domains using mathematical simulation, while GIS is used primarily for preprocessing of observational data and post-processing for comparative display.

 

The ability to represent and examine the dynamics of observed geographic phenomena is currently not available within a GIS context, except in the most rudimentary fashion. We urgently need this capability as an essential tool for examining an increasing variety of problems at local, regional, and global scales. Problems requiring the analysis of change through time and of patterns of change range from urban growth and agricultural impacts to global warming. The need for research in this area is of particularly high priority because these representational schemes must be present before databases can be built, or analytical techniques based upon them can be developed.

 

Given the rapidly increasing use of geographic information systems for policy analysis and decision making, another urgent issue is how to represent data of varying exactness and degrees of reliability, and to convey this additional information to the user. Much work remains to be done on how to handle the fuzziness and imprecision that is inherent in geographic observational data within a digital database. This becomes particularly important when multiple layers of data from varying sources are combined.

COGNITION OF GEOGRAPHIC INFORMATION

In the past decade it has become clear that an understanding of certain aspects of human cognition is essential if future geographic information technologies are to realize their full potential as tools in the service of human decision making. If geographic information systems are to be made easier to use, by people who must make geographic decisions but are not willing to undertake the extensive and lengthy training required by today's systems, then GIS interfaces must be made more intuitive, and users must be able to interact with them in ways that reflect their natural thought processes. We need to know more about how humans learn geographic information, and how this understanding varies as a function of the medium through which the information is learned (direct experience, maps, descriptions, virtual reality). How do concepts of geographic space vary as a function of training and experience? How can complex geographic information be presented to the user in ways that promote comprehension and effective decision making? How and why do individuals differ in their cognition of geographic information, perhaps because of age, culture, sex, or specific background? How does exposure to new geographic information technologies alter human ways of perceiving and thinking about the world?

 

Inadequate attention to such cognitive issues is a major current impediment to the effectiveness of geographic information technologies. Cognitive research will lead to improved systems that take advantage of an understanding of human geographic perception and expertise. It may lead to improvements in representations, if the latter can be made to exploit the primitive elements of human spatial understanding. Cognitive research promises to make geographic information technologies more accessible to inexperienced and disadvantaged users, and also to increase their power and effectiveness in the hands of experienced users. Finally, it holds great promise for improving geographic education at all levels, by addressing general concerns about the poor levels of geographic knowledge in society, and low levels of awareness of such critical issues as global environmental change.

 

For example, research has shown that the effectiveness of In-Vehicle Navigation Systems (IVNS) depends on the format in which information is presented to the user. For most users, certain forms of verbal instructions have been shown to lead to faster processing and fewer errors than map displays, and are also safer because they require less of the driver's attention. Further research will help to determine the types of features that are most usefully included in verbal instructions; the optimum timing of instructions; and other aspects of the interaction between driver and IVNS.

 

The development of the Internet has opened the possibility of systems that emulate the functions of map libraries by allowing a user to search for digital geographic data over the network as if he or she were browsing among the shelves of a traditional library. But the future of such technologies depends on our ability to provide a user interface that successfully reproduces all of the map library's functions, including the assistance provided by library personnel to users with a wide range of levels of experience. Many of the concepts used to classify and catalog maps, such as scale, or the latitudes and longitudes that define the map's extent, are likely to be unfamiliar to at least some users of the digital map library.

 

Research into the cognitive aspects of geographic information technologies is part of a research tradition begun primarily in the 1960s by urban planners, behavioral geographers, cartographers, and environmental psychologists. Planners study how humans perceive and learn about places and environments. Behavioral geographers develop theories and models of the human decision making processes that lead to behavior in geographic space, such as shopping, migration, and the journey to work. Cartographers study how maps are perceived and understood by users with varying levels of expertise. Environmental psychologists have refocused traditional questions about psychological processes and structures, to examine how they operate in the contexts of built and natural environments. All of these disciplines will need to work together to address the cognitive aspects of geographic information technologies.

INTEROPERABILITY OF GEOGRAPHIC INFORMATION

The term interoperability refers to a bottom-up integration of existing systems and applications that were not designed to be integrated when they were built. Because there are so many options available for representing geographic information, and so many different choices have been made by system designers, it can be difficult if not impossible to transfer data from one system to another; to access one system's data from another; to control one system with the commands defined for another; or to take experience accumulated with one system and apply it to another without retraining. The costs of this situation, in wasted time, lack of communication and coordination, and duplication of effort, are enormous.

 

Interoperability implies the sharing or exchange of information between different systems. In some instances data may be transmitted from one system to another; in others, instructions may be sent from one system and executed on another, without actual exchange of data. Such technical issues are generally easier to resolve than the more fundamental ones related to incompatibilities of languages, representations, and syntax. For systems to be interoperable there must be a consistent set of interpretations for information--one system must be capable of understanding the meaning of another system's data. Such agreement on the meaning of exchanged or shared information is termed semantic interoperability.

 

Efforts over the past ten to fifteen years have produced a number of exchange standards for geographic information, and many have been adopted. Such exchange standards establish a standard format, with associated semantics. Each system is then able to develop translators to and from the exchange standard, and to map its own terms and language into those of the exchange standard. To date, most of this effort has been focused on the data, rather than on the operations which systems perform. Thus we are currently a long way from achieving the full goals of interoperability. The exchange of data must be initiated explicitly by the user, and command languages and user interfaces are still largely unique to each system.
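
The translator arrangement described above can be sketched in a few lines of Python. This is a minimal illustration only, and it does not reproduce any actual exchange standard: the two system vocabularies, the neutral exchange terms, and the function names are all hypothetical. It also shows why a shared standard is economical, since each system needs only two translators (to and from the standard) rather than one for every other system.

```python
# Hypothetical vocabularies: each system maps its own land-cover terms
# to a shared, neutral exchange vocabulary (all terms are invented).
TO_EXCHANGE = {
    "system_a": {"FOR": "forest", "WAT": "water", "URB": "built_up"},
    "system_b": {"woodland": "forest", "lake": "water", "city": "built_up"},
}


def invert(mapping):
    """Build the reverse translator (exchange term -> system term)."""
    return {exchange: local for local, exchange in mapping.items()}


FROM_EXCHANGE = {name: invert(m) for name, m in TO_EXCHANGE.items()}


def translate(value, source, target):
    """Translate a term between two systems by way of the exchange vocabulary."""
    neutral = TO_EXCHANGE[source][value]
    return FROM_EXCHANGE[target][neutral]


def translators_needed(n_systems, via_exchange):
    """Count the one-way translators required to connect n systems."""
    if via_exchange:
        return 2 * n_systems  # one to and one from the standard per system
    return n_systems * (n_systems - 1)  # one per ordered pair of systems


term_in_b = translate("FOR", "system_a", "system_b")
```

Note that the sketch translates only data values. As the paragraph above observes, the operations that systems perform remain outside the scope of this kind of exchange.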

 

A key component of any interoperable environment is a shared system for describing data. Such descriptions must travel ahead of the data, informing the recipient system of the data's formats and semantics, so that the recipient system can process it effectively. Metadata has emerged as the accepted term to describe this form of digital documentation, and much attention has been devoted recently to the development of appropriate standards and protocols. Much further work is needed in storing and representing metadata, specifying metadata requirements for geographic domains, and building tools that are able to find commonalities between data from different systems and agencies.

 

A long term goal of research in interoperability is to develop methods that are capable of extracting and updating essential metadata automatically. The willingness of agencies to invest in the creation of useful metadata has proved to be a key issue in achieving interoperability, since metadata definition is labor-intensive and tends to require a high level of expertise. Yet much metadata could be obtained automatically from the characteristics of the host system, or by examination of the contents of the data set.
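
As a rough illustration of obtaining metadata "by examination of the contents of the data set," the following Python sketch derives a feature count, a tally of geometry types, and a bounding box directly from a set of features. The feature layout (a dictionary with "type" and "coords" keys) and the metadata field names are assumptions made for this example and do not correspond to any particular metadata standard.

```python
from collections import Counter


def extract_metadata(features):
    """Derive basic metadata from a non-empty list of features.

    Each feature is assumed to be a dict with a "type" string and a
    "coords" list of (x, y) tuples; this layout is hypothetical.
    """
    xs = [x for f in features for x, _ in f["coords"]]
    ys = [y for f in features for _, y in f["coords"]]
    return {
        "feature_count": len(features),
        "geometry_types": dict(Counter(f["type"] for f in features)),
        # (min x, min y, max x, max y) of all coordinates in the data set
        "bounding_box": (min(xs), min(ys), max(xs), max(ys)),
    }


sample = [
    {"type": "point", "coords": [(-83.0, 40.0)]},
    {"type": "line", "coords": [(-83.2, 39.9), (-82.9, 40.1)]},
]
metadata = extract_metadata(sample)
```

Items such as lineage, classification scheme, or intended use cannot be recovered this way, which is one reason the creation of metadata remains labor-intensive.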

 

Much of the capability of GIS as a tool for the analysis of geographic problems is derived from formal models of geographic features. In the past these models were largely cartographic in origin. But geographic information technologies are now being used to address problems that are not inherently cartographic, such as the modeling of dynamic physical processes. Research is needed to formalize methods for representing all kinds of geographic phenomena, and to develop standardized languages for describing operations. The results of such research will make it easier to integrate GIS data into dynamic models, and to provide the environmental modeling community with tools that use standard languages and thus offer a much higher degree of interoperability.

SCALE

The term scale refers generally to the level of detail with which information can be observed, represented, analyzed, and communicated. Since we can never observe the geographic world in complete detail, scale is necessarily an important property of all geographic information. Changing the scale of data without first understanding the effects of such action can result in the representation of processes or patterns that are different from those intended. The spatial scaling problem presents one of the major impediments, both conceptually and methodologically, to advancing all of the sciences that use geographic information; and the scaling of other dimensions, such as time, raises similar problems.

 

Recent work on scaling behavior of various phenomena and processes (including research on global change) has shown that many processes do not scale linearly or uniformly. Thus, in order to characterize a pattern or process at a scale other than the scale of observation, some knowledge is needed of how that pattern or process changes with scale. Attempts to describe scaling behavior by fractals or self-affine models have proven largely ineffective because the properties of many geographic phenomena do not repeat over a range of scales as precisely as the model requires. Multifractals have shown some promise, but alternatives are needed if we are to understand the impacts of scale changes on information content. Scale-based benchmarking of process and analytical models will help scientists to validate hypotheses, which in turn will improve geographic theory building.

 

Despite longstanding recognition of the implications of scale for geographic inference and decision making, many questions remain unanswered. The transition from paper maps to digital representations of geographic information forces us to deal formally with the conceptual, technical, and analytical issues of scale in new ways. The cartographer's familiar representative fraction, perhaps the most widely used measure of geographic scale, defined as the ratio of distance on the map to distance on the ground, becomes comparatively meaningless in the world of digital information, where a data set may never exist in paper map form at any stage of its existence. It is easy to demonstrate by isolated example that scale poses constraints and limitations on geographic information, spatial analysis, and models of the real world. The challenge is to articulate the conditions under which scale-imposed constraints are systematic, and to develop geographic models that compensate for scale-based variation.
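
The difficulty with the representative fraction in a digital setting can be shown with a small calculation. In the Python sketch below, the function name and the numbers are illustrative assumptions: the same data set yields a different fraction each time the user changes the ground extent shown on a display of fixed width, so the fraction describes the display rather than the data.

```python
def scale_denominator(ground_extent_mm, display_width_mm):
    """Denominator N of the representative fraction 1:N.

    Both distances must be in the same units (millimetres here), since
    the fraction is the ratio of display distance to ground distance.
    """
    return ground_extent_mm / display_width_mm


# Hypothetical case: a 300 mm wide display showing the same data set.
full_view = scale_denominator(12_000_000, 300)   # 12 km of ground shown
zoomed_view = scale_denominator(3_000_000, 300)  # 3 km of ground shown
```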

 

The widespread adoption of GIS contributes to the scale problem, but it may also offer solutions. GIS facilitates integration across scales; advanced database designs can handle data at multiple scales in one consistent format; hierarchical structures such as the quadtree allow a single data set to supply representations at many scales; and the set of computer-based tools for automated manipulation of scale is growing rapidly. Fundamental scale questions will benefit from coordinated, multidisciplinary research. With the development of alternative models of scale behavior, novel methods for describing the scale of data that are appropriate for the digital world, and intelligent automation of scale change, information systems of the future can both sensitize users to the implications of scale dependence, and provide effective tools for management of scale.
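
The idea that a hierarchical structure can supply representations at many scales from a single data set can be sketched as follows. This Python example is a simplified, quadtree-like pyramid, offered under stated assumptions: it averages each block of four cells to form the next coarser level, whereas a true quadtree would also avoid subdividing homogeneous quadrants. The grid values and function names are hypothetical, and the grid is assumed to be square with a side length that is a power of two.

```python
def coarsen(grid):
    """Halve the resolution by averaging each 2x2 block of a square grid."""
    n = len(grid) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4
            for c in range(n)
        ]
        for r in range(n)
    ]


def build_pyramid(grid):
    """Return every level from the finest grid down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels


finest = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 8, 4],
]
pyramid = build_pyramid(finest)
```

The single anomalous cell in the finest level is progressively averaged away at coarser levels, a small instance of the loss of information content that accompanies a change of scale.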

 

It has become clear that global and regional processes have implications for local places, and that individual and local decisions have collective effects at regional and global scales. Thus scientific information about global and regional patterns and processes must be understood on a local level, and vice versa. As the policy making and scientific communities come to grips with these relationships, systematic understanding about spatial and temporal variations in scale gains in importance. Geographic information plays an ever larger role as we move to an increasingly automated information economy. Our understanding of scale, and the management of data at various scales, must keep pace. Research is needed:

SPATIAL ANALYSIS IN A GIS ENVIRONMENT

The collection, storage, and analysis of data have often been limited by our ability to amass, collate, recognize, and detail observations; the spatial component of data is no exception. Humans tend to simplify and generalize in the process of drawing conclusions about relationships, trends, and patterns in space. Although these goals may seem focused, the process of reaching them is often simplistic, relying too much on possibly misleading intuition, and limited by the poor quality of available tools.

 

Modern data collection methods, such as remote sensing, are capable of supplying data in amounts, detail, and combinations that literally boggle the mind. The increased availability of large, spatially referenced data sets, and improved capabilities for visualization, rapid retrieval, and manipulation within a GIS all point to the inadequacies of the human being's capacities for data analysis, filtering, assimilation, and understanding. If we are to make effective use of this vast supply of data, we need new methods of spatial data analysis that are better designed for this new data-rich environment, whether the objective is to explore for new patterns, or to test and confirm the validity of previous ones.

 

To remain at the cutting edge of GIS technology, analytic and computational methods must be devised that allow for solutions to problems conditioned by GIS data models and the nature of spatial and space-time research. New forms of statistical analysis are needed to assess relationships between variables in a variety of new spatial contexts. New theories must be devised that provide understanding of relationships at the new levels of resolution and dimension that are available with new sensing technologies.

 

Standing in the way of confirmatory spatial data analysis, including modeling, are questions having to do with spatial scale, spatial association, spatial heterogeneity, boundaries, and incomplete data. Without reasonable responses to these problems, the usefulness of GIS as an analytical tool in a sophisticated research environment will surely come into question. With GIS, computationally intensive and highly visual methods of spatial analysis that were previously prohibitive will become accessible at reasonable cost.

 

UCGIS calls on spatial analysts from both the physical and human sciences to assist in the development of spatial statistics, geostatistics, spatial econometrics, structural and space-time modeling, mathematics, and computational algorithms that can take advantage of the flexibility, capacity, and speed of GIS. Those well-schooled in theory, empiricism, data collection, data manipulation, programming, and computer technology will be in the best positions to make advances in the field, but practitioners such as epidemiologists, ecologists, climatologists, regional scientists, landscape architects, and environmental scientists can provide much useful guidance and input.

 

New methods, techniques, and approaches are needed for the analysis of very large and complex spatial data sets. Further development is needed in the area of exploratory spatial data analysis, particularly to extend existing methods to data that includes a temporal component. We are just beginning to see the integration into GIS tools of the existing and powerful methods of geostatistics. Procedures must be found that can identify key observations, clusters, and anomalies in spatial data. There is a need to incorporate tools for complex spatial and temporal simulation; and to improve access to such advanced analytic and modeling methods as neural nets, wavelets, and cellular automata. We need to explore the implications for spatial analysis of new computing architectures, such as massively parallel and distributed systems, and the implications for analytic and modeling software of open object-based programming methods. Spatial econometrics is a new and burgeoning field, and it is important to link its sophisticated procedures with the functionality and flexibility of GIS, and to find appropriate techniques for heterogeneous geographic data. Better data models are needed in GIS to handle the suite of models used to analyze and forecast spatial interaction, and widely applied in transportation, demography, and retailing; and to advance the sophistication of techniques of operations research that are applied to vehicle routing, site selection, and location analysis.
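
As one concrete instance of the kind of spatial statistic such tools would need to provide, the following Python sketch computes global Moran's I, a standard measure of spatial association. The choice of this statistic is ours for illustration and is not named in the text above. The example assumes binary adjacency weights and a hypothetical chain of four cells.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary (0/1) adjacency weights.

    neighbors maps each index to the list of indices adjacent to it;
    the relation is assumed to be symmetric.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products of deviations over all ordered neighbor pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    total_weight = sum(len(js) for js in neighbors.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical chain of four cells, each adjacent to the next.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
trend = morans_i([1, 2, 3, 4], chain)         # similar values adjoin
alternating = morans_i([1, 4, 1, 4], chain)   # dissimilar values adjoin
```

Positive values indicate that similar values tend to be neighbors, and negative values indicate the reverse. A global statistic of this kind summarizes a whole data set and does not by itself locate individual clusters or anomalies.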

THE FUTURE OF THE SPATIAL INFORMATION INFRASTRUCTURE

In the early 1990s the U.S. National Research Council's Mapping Science Committee articulated how spatial information handling might best be approached from an organizational perspective. This led to a plan for the creation of a National Spatial Data Infrastructure (NSDI), recognized as critical to serving national priorities. In addition, many designated executive science and technology priorities, such as science education, technology transfer, high performance computing and networking, digital libraries, global environmental change, and international competitiveness, all have significant geographic information components, as do traditional land management activities. These priorities are mirrored at state and local levels of government. However, there is a growing need to increase coordination between programs, and to make the outcomes of these activities appropriate and available for addressing social needs.

 

Despite the large investments in geographic data development by government and the private sector, there is often a lack of knowledge and experience with the complex policy-related issues that arise from the community-wide creation, compilation, exchange, and archiving of large geographic data sets. Technical, legal, and public policy uncertainties interact, making it difficult to utilize information resources fully to pursue social goals. The ownership of digital geographic data, protection of privacy, access rights to the geographic data compiled and held by governments, and information liability are all concepts that require greater clarity in the new, automated context. Observations of the ramifications of following different policy choices are needed to help guide future choices.

 

The government sector plays an important role in developing the fundamental spatial information infrastructure due to its activities in the systematic collection, maintenance, and dissemination of geographic data. These resources have significant uses beyond their governmental purposes. For example, subsequent use of geographic data by organizations can stimulate the growth and diversity of the information services market. At the same time, public access to government information remains essential to ensuring government accountability and democratic decision making. Reconciliation of the tensions inherent in these and other policies becomes more important as we move toward global economies and international networked environments. Rigorous and impartial analysis is urgently needed to inform decision makers on the economic, legal, and political ramifications of choosing one policy over another.

 

We propose four broad areas where research will help to strengthen the future of the nation's spatial information infrastructure:

In summary, the goal of this research should be to help policy makers, scientists, business leaders, and citizen groups to understand the relationships between government policy and geographic information resources, services, and infrastructure--and by so doing, to facilitate the accelerated growth and utilization of geographic information resources in meeting society's future needs.

UNCERTAINTY IN GEOGRAPHIC DATA AND GIS-BASED ANALYSES

Geographic data are unique, in that information about a geographic feature contains three different kinds of attributes: the typological attributes (describing the type of a geographic feature), the locational attributes, and the spatial dependence (the spatial relationship with other features). For example, a datum about a forest can include the type and species combination of the forest (as typological attributes), the location and spatial extent of the forest (the locational attributes), and its relationships with its surrounding landscape features (spatial dependence). All of these attributes are subject to uncertainty, since stored information is at best only an approximation to reality; and they may also change over time, making geographic data very complex and difficult to manage. The basic schemes used to create digital representations of geographic features do not deal with complex objects which may consist of interacting parts, or display variation at many different levels of detail over space and time. Many forms of discrepancy therefore exist between geographic data and the reality these data are intended to represent. These discrepancies propagate through, and may be further amplified by, spatial data management and analyses in a GIS environment. We refer to them here by the general term uncertainty. Uncertainty information associated with a geographic data set can be conceived as a map depicting varying degrees of uncertainty associated with each of the features or phenomena represented in the data set, and potentially separable into three components: uncertainty in the typological attributes, uncertainty in the locational attributes, and uncertainty in spatial dependence.
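
The way in which locational uncertainty propagates through even a simple GIS operation can be sketched by simulation. The Python example below is a minimal illustration under stated assumptions: each vertex coordinate of a polygon is perturbed by independent, zero-mean Gaussian error with a standard deviation chosen by us, and the area is recomputed many times. Real positional errors are often correlated, so this error model, the polygon, and all names are hypothetical.

```python
import random
import statistics


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def simulate_area(vertices, sigma, trials=2000, seed=42):
    """Monte Carlo estimate of the mean and spread of the computed area.

    Assumes independent Gaussian error of standard deviation sigma on
    every coordinate, which is a simplification.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    areas = []
    for _ in range(trials):
        perturbed = [
            (x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
            for x, y in vertices
        ]
        areas.append(polygon_area(perturbed))
    return statistics.mean(areas), statistics.stdev(areas)


# Hypothetical 100 by 100 unit parcel with 1 unit of positional error.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
mean_area, sd_area = simulate_area(square, sigma=1.0)
```

The spread of the simulated areas gives a direct, if rough, indication of how much confidence the computed area deserves. Uncertainty in typological attributes and in spatial dependence would require different error models.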

 

Unfortunately, geographic data are often used, analyzed, and presented under the assumption that they are free of uncertainty. The beguiling attractiveness and high aesthetic quality of cartographic products from GIS, together with the analytical capability of GIS, further contribute at times to an undue credibility of these products. However, for the reasons discussed above, uncritical acceptance of the accuracy of these data is often unwarranted. Error-laden data, used without consideration of their intrinsic uncertainty, have a high probability of leading to inappropriate decisions.

 

Uncertainty exists in every phase of the geographic data life cycle, from data collection to data representation, data analyses, and final results, transcending the boundaries of disciplines and organizations. As they pass along the stages from observation to eventual archiving, geographic data may pass between many different custodians, each of whom may bring a distinct interpretation to the data. Thus uncertainty is not a constant property of the data's content so much as a function of the relationship between the data and the user: uncertainty is a measure of the difference between the data and the meaning attached to the data by their current user. For example, if knowledge of the classification scheme used to create a data set fails to pass from one custodian to another, and a user mistakenly attributes the wrong classification scheme, then uncertainty has been increased, because the data contents may now be further from the new user's understanding of the truth, as defined by the new, mistaken classification scheme.

 

At this time, our understanding of uncertainty in geographic data and its consequences for decisions made using geographic information technologies is very incomplete. Progress will require the combined efforts of experts in particular domains of geographic data; experts in GIS, spatial analysis, and modeling; spatial statisticians and geostatisticians; and developers and vendors of GIS software. Intensive research is needed in the following areas:

GIS AND SOCIETY

Access to information technology is often presented as offering enormous benefits to society, in the form of increased choices, a more informed citizenry, economic growth, and empowerment of the individual. At the same time, and while we may believe in many of the assumptions on which these assertions of improvement are based, we have very little understanding of the long term political, economic, legal, and institutional impacts of technologies like GIS. Moreover, the geographic information technologies seem to have certain unique characteristics that will affect their eventual impact, including the potential for invasion of privacy, and relevance to community-based decision making and political processes.

 

In listing this as one of their ten research priorities, the delegates to the UCGIS meeting in Columbus identified several specific issues that should form the basis of a research agenda on GIS and society:

The products of geographic information technologies are changing and will continue to change the economic, legal, political, and cultural status of adopting agencies, decision makers using the products, and the people and organizations affected by the decisions. While early impacts are becoming evident, little is known about the long term effects that the products of these technologies will have on the communities and organizations that implement them. We should observe, and ultimately be able to predict, how geographic information technology and products alter decision making processes within organizations, interactions between agencies, the citizen's relationships with government agencies, and people's beliefs and actions in regard to the use and management of land and resources.

 

At a deeper level, we need to ask to what extent the particular logics, visualization techniques, value systems, forms of reasoning, and ways of understanding the world that have been incorporated into existing GIS techniques limit or exclude the possibilities of alternative forms of representation that may be as yet unexplored. We need to ask how the proliferation and dissemination of GIS has influenced the ability of different social groups to use information for their own empowerment--who it has favored, and who it has excluded. Finally, we need to ask whether ethical or legal restrictions need to be placed on access to geographic information technologies because of their potential for misuse, surveillance, and invasion of privacy.

CONCLUDING COMMENTS

As noted earlier, the ten topics identified by UCGIS as its research agenda reflect the views of the research community at this point in time. They are driven partly by the research community's perception of what is possible, and where commitment of resources would lead to substantial results, and the solution of well-defined problems, in reasonable time. They also reflect the research community's consensus on problems that currently impede the use of geographic information technologies in addressing current needs. On the other hand, the research community's views on society's needs are clearly limited, and must be refined and enlarged by those better equipped to address such issues, including government agencies, elected officials, and the general public. Thus this agenda is presented here more in the spirit of a shopping list--here is where we think research could be done to good effect, as a first step in a dialog.

 

UCGIS intends to refine the agenda, as perceptions change, results accumulate, views are expressed, and problems are solved. We expect to do this roughly every two years, at meetings similar to the one held in Columbus in June, 1996.

 

Certain readers may be disappointed by apparent absences from the list of topics. We have tried to construct a scientific research agenda, and to organize it in terms of a set of fundamental issues rather than applications, and in consequence none of the topics refers to a specific domain. As dialog proceeds, we expect to identify areas within the ten topics that are particularly relevant to domains--for example, several of the topics are of great relevance to transportation, and several to global environmental change. A matrix showing the importance of each of the ten topics to each domain of GIS application would be useful and should be developed.

 

Similarly, none of the topics is itself a geographic information technology. We do not have a research priority on GPS, or remote sensing, or GIS, because these technologies form the underlying framework for the entire agenda. Remote sensing, for example, is of particular relevance to the first topic, Spatial Data Acquisition and Integration; to Spatial Analysis in a GIS Environment; and to Uncertainty in Geographic Data and GIS-Based Analyses. Rather than focus a topic on a specific technology, we feel that a focus on several fundamental issues raised by the technology and currently impeding its use will be more productive.

 

As noted earlier, UCGIS welcomes comments and discussion of this agenda, involvement in the dialog that will follow its publication, and participation in the process of its continued evolution and refinement.