Used with permission from Cartography and Geographic Information Systems,
Volume 23, Number 3. Copyright 1996 American Congress on Surveying and Mapping.
RESEARCH PRIORITIES FOR GEOGRAPHIC INFORMATION SCIENCE
University Consortium for Geographic Information Science
http://www.ucgis.org
PREFACE
In the United States and many other countries, the scientific community faces
an unprecedented period of continued downward pressure on the funds available
from the sources that have traditionally provided its financial support. However
hard the scientific community is able to lobby, and however convincing its
case, it is clear that some very difficult choices must be made in allocating
what is likely to be an ever diminishing resource in the coming years.
Governments are under pressure from disgruntled taxpayers and face spiralling
costs, and research will always be one of the easiest budgets to attack, since
it is rarely protected by the kinds of legislation that preserve pensions and
other entitlements.
In such circumstances, the mechanisms that establish funding priorities
become crucial. Traditionally, priorities for funding research have been
established by a complex process that attempts to balance intellectual curiosity
with the need to solve immediate and practical problems, and to ensure the
future health of industry through invention. One of the most important
components of this mechanism is the role played by scientists themselves. While
society as a whole must determine the importance of many problems, scientists
are very well equipped to estimate the likelihood that a given problem can be
solved through research, and the resources necessary to do so. Scientists have
an important role to play, therefore, in helping to set priorities for research,
and the scope of the scientific agenda. Thus although the final decisions on
allocation of public funds will always be made by governments acting on behalf
of citizens, it is essential that scientists be involved in the dialog that
precedes the allocation of the limited resources that will be available for
future research.
In 1995, following several years of informal discussions, a group of U.S.
research universities, national laboratories, and learned societies formed the
University Consortium for Geographic Information Science (UCGIS). The term
"geographic information science" has emerged recently as an acceptable umbrella
term for the fundamental problems surrounding the effective capture,
interpretation, storage, analysis, and communication of geographic
information--topics that have become increasingly important with the popularity
of geographic information systems (GIS), the Global Positioning System (GPS),
satellite remote sensing, and related geographic information technologies.
Geographic information scientists study how people use geographic information in
direction-finding; develop techniques to measure the accuracy of geographic
information; find better ways of representing geographic information in digital
computers; develop the standards that allow computers to exchange geographic
information despite differences in system formats; and research many other
important issues.
Members of UCGIS combine strengths in a range of disciplines. In order to
qualify for membership, an institution must demonstrate that it has made a
significant commitment to research in geographic information science; that the
commitment extends across a number of disciplines; and that mechanisms for
coordination and cooperation exist. Further information on UCGIS can be found by
accessing its web site, http://www.ucgis.org.
An important goal of UCGIS is the development of a set of research
priorities for geographic information science. Accordingly, in June 1996
delegates from the 29 research institutions that were then members of UCGIS met
in Columbus, Ohio to carry out a consensual process of development of a
prioritized research agenda. Prior to the meeting, each institution was given
the opportunity to identify five topics, based on discussion between geographic
information scientists on each campus. In Columbus these initial topics were
clarified, merged, extended, and refined. Delegates then voted to identify the
final list. After the meeting, working groups further refined the topics,
formalized them in accordance with a standard format, and submitted them to an
editorial committee.
This paper presents a summary of the research priorities that emerged from
this process. UCGIS regards the organization's research agenda as a dynamic,
continually evolving document. In its initial form, it represents only the views
of the research community, and thus is no more than the first stage in a dialog
that will involve as many as possible of the other stakeholders in the national
process of prioritization. Moreover, the research community itself is likely to
wish to modify the agenda, as science evolves and more is known about the
fundamental problems associated with geographic information. Nevertheless, we
believe it is important that the UCGIS research priorities be published in the
form of this paper, in order to make them accessible to the widest possible
audience, and to move the process of dialog forward as rapidly as possible.
UCGIS plans a number of other activities in the coming months to stimulate this
dialog, and hopes that as much input as possible will be forthcoming from the
other stakeholders. How, for example, do these priorities match with those of
government agencies with heavy commitments to geographic information, or with
those of the GIS software industry? How do they match those of scientists from
disciplines that use geographic information technology, rather than study its
basic issues?
Certain individuals played important roles in the development of this paper.
Key roles in organizing the Columbus meeting on which it is based were played by
several members of the UCGIS Research Committee: David Mark, State University of
New York at Buffalo (chair); John Bossler, Ohio State University; Jerome Dobson,
Oak Ridge National Laboratory; Max Egenhofer, University of Maine; George Hepner,
University of Utah; Donna Peuquet, Pennsylvania State University; and Dawn
Wright, Oregon State University. UCGIS also acknowledges the contributions of
the delegates to the Columbus meeting; the working groups who contributed to the
elaboration of each topic and the draft white papers on which this paper is
based; the Directors of UCGIS; and the members of the editorial committee for
the research agenda: Earl Epstein, Ohio State University; Michael Goodchild,
University of California, Santa Barbara (co-chair); Carolyn Hunsaker, Oak Ridge
National Laboratory (co-chair); John Radke, University of California, Berkeley;
Bill Reiners, University of Wyoming; and Alan Saalfeld, Ohio State University.
Comments on this paper are invited. They should be directed to the President of
UCGIS (currently William Craig, University of Minnesota), by email to
president@ucgis.org.
INTRODUCTION
Geographic information can be defined as consisting of facts about specific
places on the Earth's surface (spatial information is defined more generally as
information related to any multidimensional frame, and thus includes medical
imaging, for example, although geographic and spatial are frequently used almost
interchangeably). Traditionally, such information has been expressed in the form
of maps, and maps are often embedded in larger information sources such as
atlases, books, or encyclopedias. The advent of aerial photography in the early
years of this century, and later satellite remote sensing from space, added
greatly to the availability, precision, and richness of geographic information.
Geographic information can also be expressed in the form of written text, or in
the tables produced by statistical agencies. Telephone directories are yet
another form of geographic information, providing links between individuals,
telephone numbers, and street addresses.
The handling of geographic information has always raised issues of a
scientific nature. The science of mathematical geography flourished in classical
and medieval times because of the need to understand the basic shape of the
Earth, and its dimensions, so that it could be mapped accurately and its surface
transformed to fit the flat paper sheets of maps. Geodetic science continues to
address such questions, as ever more accurate geometric models of the Earth's
shape are devised in response to improved measurements. New technologies, such
as the street map systems now being installed in many vehicles to aid
navigation, raise new interest in old questions about the ability of people to
comprehend and work with information expressed in map form.
The rapid development of geographic information technologies over the past
two decades has led to fundamental changes in the ways many human activities are
organized. The forester who once managed forest resources by walking the ground
now relies on cost-effective aerial photography and satellite images to support
the same functions at greatly reduced cost. The utility company uses geographic
information systems and geographic databases instead of hand-drawn paper records
to keep track of the locations of cables and pipes, and to manage their
maintenance. The delivery company uses GIS to optimize its routes, and to allow
the customer to monitor the progress of a shipment. Geographic information
technologies allow vital linkages to be made between apparently unrelated
activities, based on common geographic location, and have led to a much higher
level of integration and sharing between what were previously rigidly separated
parts of an organization.
Many of these changes have been driven by broader developments in
information technology in general, and have little to do with research in
geographic information science. Faster and cheaper computing, the shift from
mainframe to desktop, the development of the Internet, and many other
breakthroughs have all made it easier to process and store geographic
information in digital form. On the other hand, geographic information continues
to lag behind other information types that are inherently more suited to digital
representation, such as numbers and text. Geographic information is uniquely
different from other information types in several key respects, suggesting that
a science of geographic information is particularly important if some of the
barriers to effective use of this vitally important form of information are to
be overcome.
First, geographic information is rich and voluminous. While the contents of
a book of 100,000 words can be captured on a megabyte diskette, it can easily
take two orders of magnitude more storage capacity to capture a reasonably
precise representation of a single paper map. A single Earth image from a
satellite can fill the entire storage capacity of today's personal computer.
Second, the surface of the Earth is infinitely complex, and consequently
geographic information must always be an approximation. A vast range of choices
therefore exists, depending on what is captured and what is lost in the process
of creating a map, or a representation of the Earth's surface in digital form.
These choices will later affect the usefulness of the information, and may even
lead to litigation when mistakes are made.
Third, geographic information is increasingly essential to many activities
of modern society. Growth of international trade, and the globalization of
economies, require an unprecedented level of knowledge of the diverse
conditions existing in different parts of the planet. The Earth's resources are
being exploited at ever faster rates, and accurate information is needed for
their effective management and conservation. Geographic information is essential
to our understanding of the physical Earth system, and the interrelationships
between its components. Moreover, the level of interest in detailed geographic
information inevitably varies geographically, leading to complex problems in
matching availability to need.
Fourth, geographic information science is inherently multidisciplinary. No
existing or traditional discipline can claim a unique role in solving the
problems of handling geographic information--and indeed research in these issues
has traditionally been divided among a number of disciplines that have often
competed among themselves for the available resources. In this environment,
UCGIS hopes to provide an interdisciplinary meeting ground, where scientists
from different disciplines who share a common interest in solving these problems
can work together, each bringing a different set of approaches and paradigms,
and together combining them to optimum effect.
Finally, the growth of geographic information technologies has already had
profound and in many cases unanticipated impacts on society. The ability to use
GIS to link together digital street maps and telephone directories, for example,
means that it is now possible to identify the telephone number of a house by
pointing to its image on a computer screen. Marketing campaigns can now be
targeted to the imputed socioeconomic status of each household. These
possibilities are the simple result of improvements in technology, but their
implications for individual privacy are much more profound.
The Columbus meeting of the UCGIS identified ten priority research topics
within geographic information science. While the delegates believe that they are
of higher priority than other topics, no attempt was made to rank the ten.
Instead, we believe that with sufficient resources significant progress can be
made on all ten topics in the next few years; and that in each case there will
be substantial benefits to society at large, and specifically to the various
groups who depend in one way or another on geographic information technologies.
While the following sections identify the specific benefits of research in each
case, in general we believe that investment in research in these priority topics
for geographic information science will:
- build the base of new discoveries and methods that will sustain the
continued vitality and competitiveness of the U.S. geographic information
technology industry over the coming decades;
- combine the strengths of scientists from different disciplines through the
medium of the UCGIS, and similar multidisciplinary organizations;
- invigorate advanced training of a new generation of research scientists;
- reduce the impediments that currently limit the effectiveness of
applications of geographic information technologies in many areas;
- lessen the likelihood that geographic information technologies will be
misused, or their products misinterpreted, or inappropriate decisions be made
based on their products.
The ten priority topics are presented in an order that has no significance.
We anticipate that subsequent dialog between the research community, funding
agencies, stakeholders with interests in the results of research, and other
groups will both refine the topics and add specific prioritization as they
attempt to adapt them to particular needs and objectives. In the interests of
brevity, this summary of the research priorities does not include references.
Instead, interested readers are referred to the appropriate UCGIS source
documents, available via the UCGIS web site at http://www.ucgis.org.
SPATIAL DATA ACQUISITION AND INTEGRATION
Technological advances are making it possible to capture geographic
information with ever increasing accuracy. Commercial remote sensing images from
space will soon offer a resolution of one meter or better. Satellite telemetry
using the Global Positioning System (GPS) can now achieve accuracies well within
one centimeter. But each new data set, and each new data item that is collected,
can only be utilized fully if it can be placed correctly within the context of
other available data. Integration with other data is increasingly important in
new geographic information products. For example, the production of a digital
orthophoto quadrangle (DOQ), a new form of digital imagery that has been
processed to correct for distortions due to topography and camera angle,
requires four distinct types of information, all of which must be successfully
integrated to produce an accurate result: the image acquired from the airborne
sensor; a digital model of the elevation of the land surface; a minimum of four
geodetic control points whose locations are known accurately; and information
about the sensor device itself.
Adding to the complexity of the task of integrating diverse forms of
geographic information is the existence of two very different types of accuracy.
A map or image can capture the relative positions of features with great
accuracy; but their absolute positions depend on how successfully the map or
image is registered to an Earth frame, most notably the system of latitude and
longitude. For example, we can know very precisely the distance from one
mountain peak to another, but have very poor information on their latitude and
longitude positions. This difference becomes crucial when two data sets have to
be combined--unless both have high levels of absolute positional accuracy, there
will be significant errors of misregistration. Such errors often occur when
databases are updated with apparently more accurate information.
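The distinction between the two types of accuracy can be illustrated with a
small numerical sketch. The peak coordinates and the 50 m by 30 m registration
offset below are invented for illustration and do not come from any real data
set; the point is only that a constant shift leaves relative distances intact
while displacing every absolute position.

```python
import math

def distance(p, q):
    """Planar Euclidean distance between two (x, y) points, in meters."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Hypothetical true positions of two mountain peaks (meters, local planar frame).
true_peaks = [(1000.0, 2000.0), (4000.0, 6000.0)]

# A map that is internally precise but registered to the Earth frame with an
# assumed constant offset of 50 m east and 30 m north.
offset = (50.0, 30.0)
mapped_peaks = [(x + offset[0], y + offset[1]) for x, y in true_peaks]

# Relative accuracy: the peak-to-peak distance is unaffected by the shift.
relative_error = abs(distance(*mapped_peaks) - distance(*true_peaks))

# Absolute accuracy: every peak is displaced from its true position.
absolute_errors = [distance(t, m) for t, m in zip(true_peaks, mapped_peaks)]

print(f"relative error: {relative_error:.3f} m")
print("absolute errors:", [f"{e:.1f} m" for e in absolute_errors])
```

Overlaying this map on a correctly registered data set would reproduce the
absolute error as a misregistration between the two, even though neither data
set is internally wrong.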
Similar problems of data integration occur at the boundaries between data
sets, particularly if they have been registered independently to the Earth
frame, or if they have been produced using different standards and protocols.
This problem of edgematching is found frequently in geographic data, and can
have serious consequences in many applications. A road, for example, can
disappear, shift position, or change classification at a county boundary if the
two counties' mappings are of different dates, have been registered to the Earth
frame using different control points, or use different systems of
classification, respectively.
Recent trends affecting the agencies that have traditionally supplied the
nation's basic mapping have exacerbated the need for better approaches to
integration. The National Spatial Data Infrastructure is conceived as a system
of collaboration between agencies at all levels--federal, state, and local--and
the private sector, to work to common standards and protocols in building the
nation's base of geographic information. Instead of one agency, able to set its
own procedures and ensure high internal levels of quality control, the base
mapping of the future will be provided through a series of consortium agreements
between independent producers. Problems have also been exacerbated by
communication technologies like the Internet, which offer the opportunity to
integrate data from widely different sources.
To support such efforts, we need to develop much better tools for data
integration than currently exist, based on high quality research. The term
conflation has been suggested as a way of referring to techniques that are
capable of automatic registration of geographic data sets, based on recognition
of common features, and adjustments to both geometric positions and feature
types. Conflation techniques are needed for many different types of geographic
data, ranging from digitized maps to digital images; and with varying degrees of
human intervention. To be reliable they must be based on sound principles,
including an understanding of the causes of misregistration and their likely
effects.
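One small ingredient of conflation, the geometric adjustment step, can be
sketched as follows. This is a minimal illustration, not a method proposed in
the source documents: it assumes that common features have already been
recognized and paired (the harder part of the problem), and it fits only a
least-squares similarity transform (shift, rotation, and uniform scale). The
function names fit_similarity and apply_similarity are hypothetical.

```python
def fit_similarity(source, target):
    """Least-squares 2D similarity transform between matched points.

    source[i] and target[i] are (x, y) tuples assumed to describe the same
    real-world feature in two data sets. Returns complex numbers (a, b) such
    that target ~= a * source + b when points are treated as x + iy.
    """
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in source]
    w = [complex(x, y) for x, y in target]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form solution of the complex linear regression w = a*z + b.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b

def apply_similarity(a, b, point):
    """Move one (x, y) point from the source frame into the target frame."""
    p = a * complex(*point) + b
    return (p.real, p.imag)

# Hypothetical matched features: the same four road intersections as they
# appear in two data sets, the second shifted 12 m east and 7 m south.
source_pts = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]
target_pts = [(x + 12.0, y - 7.0) for x, y in source_pts]

a, b = fit_similarity(source_pts, target_pts)
residuals = [
    abs(complex(*apply_similarity(a, b, s)) - complex(*t))
    for s, t in zip(source_pts, target_pts)
]
```

The residuals that remain after the fit give a rough indication of how much
misregistration a simple global transform cannot explain; large residuals would
suggest local distortions or mismatched features.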
Some of these techniques are likely to be common to other areas where
spatial data shares similar characteristics, such as medical imaging; but in
other cases the unique characteristics of geographic data argue for
specialization. Effective research on integration will require the collaboration
of many sciences with common interests and motivations, including image
processing, pattern recognition, robotics, computer science, geodetic science,
and photogrammetry.
In the coming years, we can expect continued research into better tools for
spatial data acquisition, as new satellite sensors are launched and new
generations of global positioning systems become available. Major advances are
also likely in ground-based data acquisition systems. Because of the enormous
volumes of data generated by automatic sensors, it will be increasingly
important to employ sophisticated algorithms for directing ground-based
sampling, for recognizing patterns and analyzing data directly in the field. The
term field GIS has been used to describe systems that can be taken
directly to the observation site, and use GIS-like tools to help scientists
collect a more efficient and economical representation. Field GIS is becoming
widely used in forestry, and in improving the efficiency and minimizing the
impacts of intensive agriculture.
DISTRIBUTED COMPUTING
Digital technology is moving rapidly to distributed computing. It is now
possible for parts of a database to be stored and maintained at different
locations; for users to take advantage of economical or specialized processing
at remote sites; for decision makers to collaborate across computer networks in
making decisions; or for large archives to offer access to their data to anyone
connected to the Internet. These and a host of other opportunities are offered
by recent developments in hardware, software, and large bandwidth communications
technologies.
In the future, it is likely that large scale, integrated packages such as GIS
will be transformed into collections of smaller, interoperable modules. The free
flow of data between them will be enabled by open specifications such as the
industry standard open object specifications, and by the GIS industry's OGIS, or
open geodata interoperability specification. Early versions of these "plug and
play" GIS software architectures are already appearing. Modules may coexist in
one system, or may be distributed across a network and assembled only when
needed and with minimal user intervention. Already, we are seeing the rapid
implementation of such ideas in the form of "add-ons" to World Wide Web
browsers, and in languages like Java.
These technical advances in hardware, software, and communications create
the need for two distinct types of research, both directed at making best use of
broad technical advances within the comparatively narrow field of geographic
information technologies. We need broadly based research into the economics,
institutional impacts, and applications of distributed computing; and more
narrowly defined research into the technical implications. The latter agenda is
presented below under the topic Interoperability.
The problems and applications that GIS addresses seem particularly suited to
take advantage of distributed computing. Geographic decisions supported by GIS
must often be made by many stakeholder groups who are distributed both
geographically and socially. Stakeholders are often located in different tiers
of the administrative hierarchy. Data custodians may also be distributed, as may
be the power to process geographic data in sophisticated software and hardware.
On the other hand, a host of issues arise with the implementation of distributed
architectures, some technical and some institutional. For example, we currently
lack the kind of comprehensive, rigorous approaches to data description that
will be needed if users are to be able to search for suitable data sources
across distributed networks.
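To make the data description problem concrete, the following sketch shows the
simplest kind of record that a search across custodians might rely on. The
field names, the example catalog entries, and the search function are all
hypothetical, and the bounding-box test ignores complications such as data sets
that cross the 180-degree meridian. A rigorous approach of the kind called for
here would need far richer descriptions (lineage, accuracy, classification
scheme, and so on).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetDescription:
    """Minimal, hypothetical description of one geographic data set."""
    title: str
    theme: str
    custodian: str
    west: float   # bounding box in decimal degrees
    south: float
    east: float
    north: float

def overlaps(d, west, south, east, north):
    """True if the data set's bounding box intersects the query box."""
    return (d.west <= east and d.east >= west
            and d.south <= north and d.north >= south)

def search(catalog, theme, west, south, east, north):
    """Titles of data sets with the given theme that touch the query box."""
    return [d.title for d in catalog
            if d.theme == theme and overlaps(d, west, south, east, north)]

# Invented catalog entries held by different custodians.
catalog = [
    DatasetDescription("County roads", "transportation", "County A",
                       -83.3, 39.8, -82.7, 40.2),
    DatasetDescription("State highways", "transportation", "State DOT",
                       -84.9, 38.4, -80.5, 42.0),
    DatasetDescription("Soil survey", "soils", "Federal agency",
                       -83.5, 39.5, -82.5, 40.5),
]

hits = search(catalog, "transportation", -83.1, 39.9, -82.9, 40.1)
```

Even this toy example shows why shared conventions matter: the search works
only if every custodian uses the same theme vocabulary and the same coordinate
conventions for the bounding box.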
GIS has already adapted to several changes in computing architectures. Early
mainframe systems were quickly extended to remote sites using phone lines and
terminals. The minicomputers of the late 1970s were replaced by workstations and
personal computers that were increasingly networked for exchange of data.
Client/server architectures were adopted in the late 1980s, in a first step
towards distributed software. Today, such architectures are being generalized to
full distribution, while the user may be presented with an integrated view of
the system that may bear little relationship to its actual structure. Indeed, we
may reach a time when the entire global network is best conceived as a single,
integrated computing system, as we once conceived of the mainframe.
Each of these changes has stimulated new growth in GIS applications, in the
managerial and institutional arrangements that support it, and in the basic
economics of GIS and geographic data in general. These changes are likely to
continue in the transition to fully distributed computing architectures.
Moreover, such architectures are likely to provide the opportunity for the GIS
community to interact with entire new communities, particularly the library
community, and for geographic information to become even more important to a
range of human activities.
We need to anticipate the new applications and services that will become
possible with distributed computing, and the costs and benefits associated with
each of them. Monolithic solutions, which fail to take advantage of distributed
computing architectures, are likely to become increasingly expensive in
comparison to solutions that exploit the opportunities offered by technology to
share responsibilities and roles among various stakeholders. Studies are needed
of the effects of the implementation of distributed computing architectures, and
the opportunities they offer to GIS and geographic information in general. In
addition to specialists in the technical aspects of the architectures, such as
computer scientists, communications experts, and computer engineers, effective
research will require the skills of geographers, economists, information
scientists, digital librarians, and experts in public policy. UCGIS can play a
key role in providing the institutional framework to link experts from these
disciplines in a coordinated approach, and to develop partnerships with software
vendors and other institutions.
EXTENSIONS TO GEOGRAPHIC REPRESENTATIONS
The manner in which geographic information is represented both conceptually
and physically as stored data observations is a central issue for any field that
studies phenomena on, over, or under the surface of the Earth. A data
representation scheme is required, and is in fact inextricably linked with the
processes of analysis and modeling of geographic phenomena. For example, in
systems that find routes between places the geographic information is typically
represented in the form of links between places denoted as points. In dealing
with environmental problems, pollutants in air, water, or soil tend to be
represented simply as grids. For other purposes, these same places may be
represented as polygonal objects that are locationally defined by explicit
boundaries.
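The three representations mentioned above can be sketched as simple data
structures, each paired with the kind of operation it makes easy. All place
names, link lengths, grid values, and coordinates below are invented, and the
structures are deliberately minimal; operational systems use far more elaborate
schemes.

```python
import heapq

# 1. Network: places as nodes, links with lengths (km), suited to route finding.
links = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "C": 1.0, "D": 5.0},
    "C": {"A": 2.0, "B": 1.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}

def shortest_route_length(graph, start, goal):
    """Dijkstra's algorithm; returns total length, or None if unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None

# 2. Grid: a regular array of cells, suited to continuous fields such as
#    pollutant concentration. Rows run from south to north.
grid = {"origin": (0.0, 0.0), "cell_size": 100.0,
        "values": [[0.1, 0.3], [0.2, 0.9]]}

def grid_value(g, x, y):
    """Value of the cell containing point (x, y)."""
    col = int((x - g["origin"][0]) // g["cell_size"])
    row = int((y - g["origin"][1]) // g["cell_size"])
    if not (0 <= row < len(g["values"]) and 0 <= col < len(g["values"][0])):
        raise IndexError("point lies outside the grid")
    return g["values"][row][col]

# 3. Polygon: an explicit boundary, suited to discrete areal objects.
parcel = [(0.0, 0.0), (200.0, 0.0), (200.0, 200.0), (0.0, 200.0)]

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

route_km = shortest_route_length(links, "A", "D")
concentration = grid_value(grid, 150.0, 50.0)
parcel_area = polygon_area(parcel)
```

Each question is natural in one structure and awkward in the others: the grid
has no notion of a route, and the network has no notion of area.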
The selection of information to be represented, and the representational
scheme employed, is thus often driven by the application, and particularly by
anticipating later stages of analysis, modeling, or interpretation. In turn, the
results of any analysis can be greatly influenced by how the phenomena under
study are represented. This is why, on an everyday level, a strip map or route
map is more easily used for traveling from one place to another than an overall
areal map, whereas a route map is virtually useless for showing the overall
distribution of various geographic features within a given area.
While it is true that current geographic data representation techniques are
capable of representing complex associations among multiple variables, they are
nevertheless geared toward representation of static situations on a plane
surface at a specific scale--in this respect, they echo and are largely limited
to the nature of the paper maps from which many data sets are drawn. Many of
these 2-dimensional representations can be extended conceptually to accommodate
applications in which the third spatial dimension is important, but operational
capabilities for representing and analyzing 3-dimensional data have been
integrated only recently into general purpose, commercially available geographic
information systems. Current spatial data storage and access techniques are also
not designed to handle the increased complexity and representational robustness
needed to integrate diverse data across a wide range of applications and
disciplines.
Earth related data are being collected in digital form at a phenomenal rate,
and the data volumes that are being generated are far beyond anything we have
experienced so far. The Earth has nearly 1.5 x 10^15 square meters of
surface area; a single complete coverage of satellite data at 10 meter pixel
resolution would total approximately 1.5 x 10^13 pixels, and the
number of bytes needed to store it would be of the same order of magnitude.
Also, satellite imagery data is normally represented as a gridded array, or
matrix, of cells. It is geometrically impossible, however, to represent the
spheroidal Earth with a single mesh of uniform, rectangular cells, and research
is needed to find better, less distorted representations.
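The arithmetic behind these figures, and the distortion problem that follows,
can be checked with a short sketch. The surface area and pixel size are the
values quoted in the text; the one byte per pixel, the spherical Earth model,
and the mean radius of 6,371 km are assumptions added for illustration.

```python
import math

# Figures as quoted in the text; one byte per pixel is an added assumption.
SURFACE_AREA_M2 = 1.5e15
PIXEL_SIZE_M = 10.0
BYTES_PER_PIXEL = 1

pixels = SURFACE_AREA_M2 / PIXEL_SIZE_M ** 2
terabytes = pixels * BYTES_PER_PIXEL / 1e12

EARTH_RADIUS_M = 6_371_000.0  # mean radius, spherical approximation

def cell_area_m2(south_deg, north_deg, width_deg):
    """Area of a latitude/longitude cell on a sphere."""
    band = math.sin(math.radians(north_deg)) - math.sin(math.radians(south_deg))
    return EARTH_RADIUS_M ** 2 * math.radians(width_deg) * band

# Two cells of identical angular size, one at the equator and one at 60 N.
equator_cell = cell_area_m2(0.0, 1.0, 1.0)
northern_cell = cell_area_m2(60.0, 61.0, 1.0)
shrink_ratio = northern_cell / equator_cell

print(f"{pixels:.2e} pixels, about {terabytes:.0f} terabytes at 1 byte each")
print(f"a 1-degree cell at 60 N covers {shrink_ratio:.0%} of one at the equator")
```

A mesh that is uniform in latitude and longitude therefore has cells whose
ground area shrinks steadily toward the poles, which is one concrete form of
the distortion described above.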
Although many efforts have been made to integrate GIS with dynamic modeling,
most have been limited to the development of an interface between two separate
types of software systems. Modeling software tends to operate within very
narrowly defined domains using mathematical simulation, while GIS is used
primarily for preprocessing of observational data and post-processing for
comparative display.
The ability to represent and examine the dynamics of observed geographic
phenomena is currently not available within a GIS context, except in the most
rudimentary fashion. We urgently need this capability as an essential tool for
examining an increasing variety of problems at local, regional, and global
scales. Problems requiring the analysis of change through time and of patterns
of change range from urban growth and agricultural impacts to global warming.
The need for research in this area is of particularly high priority because
these representational schemes must be present before databases can be built, or
analytical techniques based upon them can be developed.
Given the rapidly increasing use of geographic information systems for
policy analysis and decision making, another urgent issue is how to represent
data of varying exactness and degrees of reliability, and to convey this
additional information to the user. Much work remains to be done on how to
handle the fuzziness and imprecision that are inherent in geographic
observational data within a digital database. This becomes particularly
important when multiple layers of data from varying sources are combined.
COGNITION OF GEOGRAPHIC INFORMATION
In the past decade it has become clear that an understanding of certain
aspects of human cognition is essential if future geographic information
technologies are to realize their full potential as tools in the service of
human decision making. If geographic information systems are to be made easier
to use, by people who must make geographic decisions but are not willing to
undertake the extensive and lengthy training required by today's systems, then
GIS interfaces must be made more intuitive, and users must be able to interact
with them in ways that reflect their natural thought processes. We need to know
more about how humans learn geographic information, and how this understanding
varies as a function of the medium through which the information is learned
(direct experience, maps, descriptions, virtual reality). How do concepts of
geographic space vary as a function of training and experience? How can complex
geographic information be presented to the user in ways that promote
comprehension and effective decision making? How and why do individuals differ
in their cognition of geographic information, perhaps because of age, culture,
sex, or specific background? How does exposure to new geographic information
technologies alter human ways of perceiving and thinking about the world?
Inadequate attention to such cognitive issues is a major current impediment
to the effectiveness of geographic information technologies. Cognitive research
will lead to improved systems that take advantage of an understanding of human
geographic perception and expertise. It may lead to improvements in
representations, if the latter can be made to exploit the primitive elements of
human spatial understanding. Cognitive research promises to make geographic
information technologies more accessible to inexperienced and disadvantaged
users, and also to increase their power and effectiveness in the hands of
experienced users. Finally, it holds great promise for improving geographic
education at all levels, by addressing general concerns about the poor levels of
geographic knowledge in society, and low levels of awareness of such critical
issues as global environmental change.
For example, research has shown that the effectiveness of In-Vehicle
Navigation Systems (IVNS) depends on the format in which information is
presented to the user. For most users, certain forms of verbal instructions have
been shown to lead to faster processing and fewer errors than map displays, and
are also safer because they require less of the driver's attention. Further
research will help to determine the types of features that are most usefully
included in verbal instructions; the optimum timing of instructions; and other
aspects of the interaction between driver and IVNS.
The development of the Internet has opened the possibility of systems that
emulate the functions of map libraries by allowing a user to search for digital
geographic data over the network as if he or she were browsing among the shelves
of a traditional library. But the future of such technologies depends on our
ability to provide a user interface that successfully reproduces all of the map
library's functions, including the assistance provided by library personnel to
users with a wide range of levels of experience. Many of the concepts used to
classify and catalog maps, such as scale, or the latitudes and longitudes that
define the map's extent, are likely to be unfamiliar to at least some users of
the digital map library.
Research into the cognitive aspects of geographic information technologies
is part of a research tradition begun primarily in the 1960s by urban planners,
behavioral geographers, cartographers, and environmental psychologists. Planners
study how humans perceive and learn about places and environments. Behavioral
geographers develop theories and models of the human decision making processes
that lead to behavior in geographic space, such as shopping, migration, and the
journey to work. Cartographers study how maps are perceived and understood by
users with varying levels of expertise. Environmental psychologists have
refocused traditional questions about psychological processes and structures, to
examine how they operate in the contexts of built and natural environments. All
of these disciplines will need to work together to address the cognitive aspects
of geographic information technologies.
INTEROPERABILITY OF GEOGRAPHIC INFORMATION
The term interoperability refers to a bottom-up integration of
existing systems and applications that were not designed to be integrated when
they were built. Because there are so many options available for representing
geographic information, and so many different choices have been made by system
designers, it can be difficult if not impossible to transfer data from one
system to another; to access one system's data from another; to control one
system with the commands defined for another; or to take experience accumulated
with one system and apply it to another without retraining. The costs of this
situation, in wasted time, lack of communication and coordination, and
duplication of effort, are enormous.
Interoperability implies the sharing or exchange of information between
different systems. In some instances data may be transmitted from one system to
another; in others, instructions may be sent from one system and executed on
another, without actual exchange of data. Such technical issues are generally
easier to resolve than the more fundamental ones related to incompatibilities of
languages, representations, and syntax. For systems to be interoperable there
must be a consistent set of interpretations for information--one system must be
capable of understanding the meaning of another system's data. Such agreement on
the meaning of exchanged or shared information is termed semantic
interoperability.
Efforts over the past ten to fifteen years have produced a number of
exchange standards for geographic information, and many have been adopted. Such
exchange standards establish a standard format, with associated semantics. Each
system is then able to develop translators to and from the exchange standard,
and to map its own terms and language into those of the exchange standard. To
date, most of this effort has been focused on the data, rather than on the
operations which systems perform. Thus we are currently a long way from
achieving the full goals of interoperability. The exchange of data must be
initiated explicitly by the user, and command languages and user interfaces are
still largely unique to each system.
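The translator arrangement described above can be sketched as follows. This is
an illustration only: the two systems, their field names, and the land cover
codes are hypothetical, and the neutral record layout is an assumption rather
than any published exchange standard.

```python
# Hypothetical land cover vocabularies for two systems (assumed for illustration).
SYSTEM_A_TO_NEUTRAL = {"FOR": "forest", "WAT": "water", "URB": "urban"}
SYSTEM_B_TO_NEUTRAL = {11: "water", 41: "forest", 21: "urban"}


def invert(mapping):
    """Build the reverse translator (neutral term -> system term)."""
    return {neutral: local for local, neutral in mapping.items()}


def a_to_neutral(record):
    """Translate a System A record into the neutral exchange form."""
    return {
        "land_cover": SYSTEM_A_TO_NEUTRAL[record["LC"]],
        "x": record["easting"],
        "y": record["northing"],
    }


def neutral_to_b(record):
    """Translate a neutral record into System B's own terms."""
    return {
        "class_code": invert(SYSTEM_B_TO_NEUTRAL)[record["land_cover"]],
        "coords": (record["x"], record["y"]),
    }


def translators_needed(n_systems, use_exchange_standard):
    """Count translators: two per system via a standard, else one per ordered pair."""
    if use_exchange_standard:
        return 2 * n_systems
    return n_systems * (n_systems - 1)


a_record = {"LC": "FOR", "easting": 500000.0, "northing": 4100000.0}
b_record = neutral_to_b(a_to_neutral(a_record))
print(b_record)
print(translators_needed(10, True), translators_needed(10, False))
```

The sketch covers only the data, which is where most standards effort has been
focused; it says nothing about translating the operations or command languages
of either system.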
A key component of any interoperable environment is a shared system for
describing data. Such descriptions must travel ahead of the data, informing the
recipient system of the data's formats and semantics, so that the recipient
system can process it effectively. Metadata has emerged as the accepted
term to describe this form of digital documentation, and much attention has been
devoted recently to the development of appropriate standards and protocols. Much
further work is needed in storing and representing metadata, specifying metadata
requirements for geographic domains, and building tools that are able to find
commonalities between data from different systems and agencies.
A long term goal of research in interoperability is to develop methods that
are capable of extracting and updating essential metadata automatically. The
willingness of agencies to invest in the creation of useful metadata has proved
to be a key issue in achieving interoperability, since metadata definition is
labor-intensive and tends to require a high level of expertise. Yet much
metadata could be obtained automatically from the characteristics of the host
system, or by examination of the contents of the data set.
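As a minimal sketch of obtaining metadata by examining the contents of a data
set, the function below derives a feature count, a bounding box, and a list of
attribute names. The in-memory feature layout (a list of dictionaries with
"coords" and "attributes" keys) and the function name are assumptions made for
illustration, not part of any metadata standard.

```python
def extract_metadata(features):
    """Derive basic metadata from the contents of a data set.

    `features` is assumed to be a list of dicts, each with a "coords" list of
    (x, y) tuples and an "attributes" dict. This layout is hypothetical.
    """
    xs = [x for f in features for x, _ in f["coords"]]
    ys = [y for f in features for _, y in f["coords"]]
    attribute_names = sorted({name for f in features for name in f["attributes"]})
    return {
        "feature_count": len(features),
        "bounding_box": (min(xs), min(ys), max(xs), max(ys)),
        "attribute_names": attribute_names,
    }


sample = [
    {"coords": [(-120.5, 34.0), (-120.1, 34.4)],
     "attributes": {"name": "A", "type": "road"}},
    {"coords": [(-119.9, 34.2)], "attributes": {"name": "B"}},
]
metadata = extract_metadata(sample)
print(metadata)
```

Items such as lineage, classification scheme, or accuracy cannot be recovered
this way, which is one reason metadata definition still requires expertise.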
Much of the capability of GIS as a tool for the analysis of geographic
problems is derived from formal models of geographic features. In the past these
models were largely cartographic in origin. But geographic information
technologies are now being used to address problems that are not inherently
cartographic, such as the modeling of dynamic physical processes. Research is
needed to formalize methods for representing all kinds of geographic phenomena,
and to develop standardized languages for describing operations. The results of
such research will make it easier to integrate GIS data into dynamic models, and
to provide the environmental modeling community with tools that use standard
languages and thus offer a much higher degree of interoperability.
SCALE
The term scale refers generally to the level of detail with which
information can be observed, represented, analyzed, and communicated. Since we
can never observe the geographic world in complete detail, scale is necessarily
an important property of all geographic information. Changing the scale of data
without first understanding the effects of such action can result in the
representation of processes or patterns that are different from those intended.
The spatial scaling problem presents one of the major impediments, both
conceptually and methodologically, to advancing all of the sciences that use
geographic information; and the scaling of other dimensions, such as time,
raises similar problems.
Recent work on scaling behavior of various phenomena and processes
(including research on global change) has shown that many processes do not scale
linearly or uniformly. Thus, in order to characterize a pattern or process at a
scale other than the scale of observation, some knowledge is needed of how that
pattern or process changes with scale. Attempts to describe scaling behavior by
fractals or self-affine models have proven largely ineffective because the
properties of many geographic phenomena do not repeat over a range of scales as
precisely as the model requires. Multifractals have shown some promise, but
alternatives are needed if we are to understand the impacts of scale changes on
information content. Scale-based benchmarking of process and analytical models
will help scientists to validate hypotheses, which in turn will improve
geographic theory building.
Despite longstanding recognition of the implications of scale for geographic
inference and decision making, many questions remain unanswered. The transition
from paper maps to digital representations of geographic information forces us
to deal formally with the conceptual, technical, and analytical issues of scale
in new ways. The cartographer's familiar representative fraction, perhaps the
most widely used measure of geographic scale, defined as the ratio of distance
on the map to distance on the ground, becomes comparatively meaningless in the
world of digital information, where a data set may never exist in paper map form
at any stage of its existence. It is easy to demonstrate by isolated example
that scale poses constraints and limitations on geographic information, spatial
analysis, and models of the real world. The challenge is to articulate the
conditions under which scale-imposed constraints are systematic, and to develop
geographic models that compensate for scale-based variation.
The widespread adoption of GIS contributes to the scale problem, but it may
also offer solutions. GIS facilitates integration across scales; advanced
database designs can handle data at multiple scales in one consistent format;
hierarchical structures such as the quadtree allow a single data set to supply
representations at many scales; and the set of computer-based tools for
automated manipulation of scale is growing rapidly. Fundamental scale questions
will benefit from coordinated, multidisciplinary research. With the development
of alternative models of scale behavior, novel methods for describing the scale
of data that are appropriate for the digital world, and intelligent automation
of scale change, information systems of the future can both sensitize users to
the implications of scale dependence, and provide effective tools for management
of scale.
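The way a hierarchical structure can supply representations at many scales from
a single data set can be sketched with a simple region quadtree. The sketch
assumes a square grid whose side is a power of two, stores the mean of each
block at every node, and does not merge homogeneous blocks as a production
quadtree normally would; the function names are illustrative.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Recursively build a region quadtree; each node stores the mean of its block.

    `grid` is assumed to be a square list of lists whose side is a power of two.
    """
    if size is None:
        size = len(grid)
    if size == 1:
        return {"mean": float(grid[row][col]), "children": None}
    half = size // 2
    children = [
        build_quadtree(grid, row + dr, col + dc, half)
        for dr in (0, half)
        for dc in (0, half)
    ]  # child order: NW, NE, SW, SE
    mean = sum(child["mean"] for child in children) / 4.0
    return {"mean": mean, "children": children}


def render(node, depth):
    """Return the grid seen when the tree is cut `depth` levels below the root."""
    if depth == 0 or node["children"] is None:
        return [[node["mean"]]]
    nw, ne, sw, se = (render(child, depth - 1) for child in node["children"])
    top = [left + right for left, right in zip(nw, ne)]
    bottom = [left + right for left, right in zip(sw, se)]
    return top + bottom


grid = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [2, 4, 8, 8],
    [6, 8, 8, 8],
]
tree = build_quadtree(grid)
for depth in range(3):
    print(depth, render(tree, depth))
```

Cutting the tree at depth 0 gives a single cell for the whole area, depth 1 a
2 x 2 grid, and depth 2 the original 4 x 4 data, all from one stored structure.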
It has become clear that global and regional processes have implications for
local places, and that individual and local decisions have collective effects at
regional and global scales. Thus scientific information about global and
regional patterns and processes must be understood on a local level, and vice
versa. As the policy making and scientific communities come to grips with these
relationships, systematic understanding about spatial and temporal variations in
scale gains in importance. Geographic information plays an ever larger role as
we move to an increasingly automated information economy. Our understanding of
scale, and the management of data at various scales, must keep pace. Research is
needed:
- to assess the sensitivity of data, spatial properties of data, and
analyses to changes in spatial and temporal scale;
- to identify critical scales at which data content and structure change
significantly, and to identify the ranges of scales over which processes and
patterns are invariant;
- to quantify information content as a function of sampling interval and
observation scheme, and information loss as a function of data generalization
methods;
- to develop theory and methods for intelligent database generalization,
data enhancement, and data reconstruction;
- to develop alternative data models that permit variable-resolution
representations, integrated multiple scale representations, and scale-related
modeling of metadata; and
- to explore theoretical linkages between internal, external, and cognitive
concepts of scale that permit consistent representation across these domains.
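The first two items in this list can be illustrated with a small experiment: a
data set is aggregated to successively coarser cells and a summary statistic is
recomputed at each step. The transect values below are hypothetical and were
constructed so that variation exists at two distinct scales; block averaging is
only one of many possible generalization methods.

```python
from statistics import pvariance


def aggregate(values, factor):
    """Average consecutive groups of `factor` cells into one coarser cell."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


# Hypothetical transect: a fine-scale alternation on top of a coarse step.
transect = [1, 3, 1, 3, 1, 3, 1, 3, 11, 13, 11, 13, 11, 13, 11, 13]

for factor in (1, 2, 4, 8, 16):
    coarse = aggregate(transect, factor)
    print(factor, len(coarse), pvariance(coarse))
```

In this constructed example the variance drops from 26 to 25 at the first
aggregation step, stays at 25 over a range of cell sizes, and falls to zero
only when the whole transect becomes one cell. The two drops mark the critical
scales of the data; the plateau between them is a range over which the pattern
is invariant.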
SPATIAL ANALYSIS IN A GIS ENVIRONMENT
The collection, storage, and analysis of data have often been limited by our
ability to amass, collate, recognize, and detail observations; the spatial
component of data is no exception. Humans tend to simplify and generalize in the
process of drawing conclusions about relationships, trends, and patterns in
space. Although these goals may seem focused, the process of reaching them is
often simplistic, relying too much on possibly misleading intuition, and limited
by the poor quality of available tools.
Modern data collection methods, such as remote sensing, are capable of
supplying data in amounts, detail, and combinations that boggle the
mind. The increased availability of large, spatially referenced data sets, and
improved capabilities for visualization, rapid retrieval, and manipulation
within a GIS all point to the inadequacies of the human being's capacities for
data analysis, filtering, assimilation, and understanding. If we are to make
effective use of this vast supply of data, we need new methods of spatial data
analysis that are better designed for this new data-rich environment, whether
the objective is to explore for new patterns, or to test and confirm the
validity of previous ones.
To remain at the cutting edge of GIS technology, analytic and computational
methods must be devised that allow for solutions to problems conditioned by GIS
data models and the nature of spatial and space-time research. New forms of
statistical analysis are needed to assess relationships between variables in a
variety of new spatial contexts. New theories must be devised that provide
understanding of relationships at the new levels of resolution and dimension
that are available with new sensing technologies.
Standing in the way of confirmatory spatial data analysis, including
modeling, are questions having to do with spatial scale, spatial association,
spatial heterogeneity, boundaries, and incomplete data. Without reasonable
responses to these problems, the usefulness of GIS as an analytical tool in a
sophisticated research environment will surely come into question. By the use of
GIS, previously prohibitive, computationally intensive, and highly visual ways
of spatial analysis will become accessible at reasonable costs.
UCGIS calls on spatial analysts from both the physical and human sciences to
assist in the development of spatial statistics, geostatistics, spatial
econometrics, structural and space-time modeling, mathematics, and computational
algorithms that can take advantage of the flexibility, capacity, and speed of
GIS. Those well-schooled in theory, empiricism, data collection, data
manipulation, programming, and computer technology will be in the best positions
to make advances in the field, but practitioners such as epidemiologists,
ecologists, climatologists, regional scientists, landscape architects, and
environmental scientists can provide much useful guidance and input.
New methods, techniques, and approaches are needed for the analysis of very
large and complex spatial data sets. Further development is needed in the area
of exploratory spatial data analysis, particularly to extend existing methods to
data that includes a temporal component. We are just beginning to see the
integration into GIS tools of the existing and powerful methods of geostatistics.
Procedures must be found that can identify key observations, clusters, and
anomalies in spatial data. There is a need to incorporate tools for complex
spatial and temporal simulation; and to improve access to such advanced analytic
and modeling methods as neural nets, wavelets, and cellular automata. We need to
explore the implications for spatial analysis of new computing architectures,
such as massively parallel and distributed systems, and the implications for
analytic and modeling software of open object-based programming methods. Spatial
econometrics is a new and burgeoning field, and it is important to link its
sophisticated procedures with the functionality and flexibility of GIS, and to
find appropriate techniques for heterogeneous geographic data. Better data
models are needed in GIS to handle the suite of models used to analyze and
forecast spatial interaction, and widely applied in transportation, demography,
and retailing; and to advance the sophistication of techniques of operations
research that are applied to vehicle routing, site selection, and location
analysis.
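As one concrete instance of the kind of measure of spatial association that
such tools would compute, the sketch below evaluates Moran's I on a small grid.
Moran's I is not named in this agenda; it is used here only as a familiar
example, with rook (edge-sharing) neighbors and equal weights assumed. The
grids are hypothetical.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook (edge-sharing) neighbours.

    Assumes equal weights of 1 for each neighbour link. A grid of identical
    values has zero variance and will raise ZeroDivisionError.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]
    cross = 0.0      # sum of z_i * z_j over all directed neighbour links
    weight_sum = 0   # total number of directed neighbour links
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    denom = sum(z * z for row in dev for z in row)
    return (n / weight_sum) * (cross / denom)


checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
clustered = [[0, 0, 1, 1] for _ in range(4)]
print(morans_i(checkerboard), morans_i(clustered))
```

The checkerboard, in which every neighbor differs, yields a value of -1, while
the clustered grid yields a positive value (2/3 here), which is the contrast a
cluster-detection procedure would exploit.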
THE FUTURE OF THE SPATIAL INFORMATION INFRASTRUCTURE
In the early 1990s the U.S. National Research Council's Mapping Science
Committee articulated how spatial information handling might best be approached
from an organizational perspective. This led to a plan for the creation of a
National Spatial Data Infrastructure (NSDI), recognized as critical to serving
national priorities. In addition, many designated executive science and
technology priorities, such as science education, technology transfer, high
performance computing and networking, digital libraries, global environmental
change, and international competitiveness, all have significant geographic
information components, as do traditional land management activities. These
priorities are mirrored at state and local levels of government. However, there
is a growing need to increase coordination between programs, and to make the
outcomes of these activities appropriate and available to address social needs.
Despite the large investments in geographic data development by government
and the private sector, there is often a lack of knowledge and experience with
the complex policy-related issues that arise from the community-wide creation,
compilation, exchange, and archiving of large geographic data sets. Technical,
legal, and public policy uncertainties interact, making it difficult to utilize
information resources fully to pursue social goals. The ownership of digital
geographic data, protection of privacy, access rights to the geographic data
compiled and held by governments, and information liability are all concepts
that require greater clarity in the new, automated context. Observations of the
ramifications of following different policy choices are needed to help guide
future choices.
The government sector plays an important role in developing the fundamental
spatial information infrastructure due to its activities in the systematic
collection, maintenance, and dissemination of geographic data. These resources
have significant uses beyond their governmental purposes. For example,
subsequent use of geographic data by organizations can stimulate the growth and
diversity of the information services market. At the same time, public access to
government information remains essential to ensuring government accountability
and democratic decision making. Reconciliation of the tensions inherent in these
and other policies becomes more important as we move toward global economies and
international networked environments. Rigorous and impartial analysis is
urgently needed to inform decision makers on the economic, legal, and political
ramifications of choosing one policy over another.
We propose four broad areas where research will help to strengthen the
future of the nation's spatial information infrastructure:
- Information Policy. The factors that shape the development of
spatial information policy and law reflect traditional and contemporary
culture and technology. Research is needed to identify optimal government
information policies and practices for promoting a robust spatial information
infrastructure. Basic policy issues include intellectual property rights,
information privacy, and liability as they pertain to geographic data. A range
of perspectives, from local to global, will need to be considered.
- Access to Government Geographic Information. Research is needed to
examine how government information policies affect the access to and use of
data by a broad spectrum of public and private sector stakeholders for a
variety of public and private (commercial) purposes. Public and private roles
in information creation through partnerships and cooperative agreements should
be a subject of particular attention.
- Economics of Information. Geographic information is an unusual
commodity of great value. Issues of cost recovery, pricing, and markets for
geographic data, and their relationship to intellectual property rights, are
of central importance. We need to achieve a better understanding of the
economic characteristics of information, especially government information,
through such concepts as public goods theory, network externalities, and
value-adding processes.
- Local Generation and Integration of Geographic Information. Locally
generated information and knowledge are increasingly important because new
developments in technology make it possible for local people to be more
involved in the production process, as well as in the use of the data for
decision making. Contributions of data can be systematic or ad hoc, coming
from civic groups, schools, local institutions, and informed individuals.
Local users can make significant contributions of their local knowledge,
identify gaps in existing data resources, and identify errors. Developing the
technical and institutional means to support creation and contribution of
local knowledge presents a novel challenge to technologists and decision
makers alike.
In summary, the goal of this research should be to help policy makers,
scientists, business leaders, and citizen groups to understand the relationships
between government policy and geographic information resources, services, and
infrastructure--and by so doing, to facilitate the accelerated growth and
utilization of geographic information resources in meeting society's future
needs.
UNCERTAINTY IN GEOGRAPHIC DATA AND GIS-BASED ANALYSES
Geographic data are unique, in that information about a geographic feature
contains three different kinds of attributes: the typological attributes
(describing the type of a geographic feature), the locational attributes, and
the spatial dependence (the spatial relationship with other features). For
example, a datum about a forest can include the type and species combination of
the forest (as typological attributes), the location and spatial extent of the
forest (the locational attributes), and its relationships with its surrounding
landscape features (spatial dependence). All of these attributes are subject to
uncertainty, since stored information is at best only an approximation to
reality; and they may also change over time, making geographic data very complex
and difficult to manage. The basic schemes used to create digital
representations of geographic features do not deal with complex objects which
may consist of interacting parts, or display variation at many different levels
of detail over space and time. Many forms of discrepancy therefore exist between
geographic data and the reality these data are intended to represent. These
discrepancies propagate through, and may be further amplified by, spatial data
management and analyses in a GIS environment. We refer to them here by the
general term uncertainty. Uncertainty information associated with a
geographic data set can be conceived as a map depicting varying degrees of
uncertainty associated with each of the features or phenomena represented in the
data set, and potentially separable into three components: uncertainty in the
typological attributes, uncertainty in the locational attributes, and
uncertainty in spatial dependence.
Unfortunately, geographic data are often used, analyzed, and presented under
the assumption that they are free of uncertainty. The beguiling attractiveness,
the high aesthetic quality of cartographic products from GIS, and the analytical
capability of GIS further contribute to an undue credibility, at times, of these
products. However, for the reasons discussed above, such confidence in the
accuracy of these data is often not warranted. Error-laden data, used without
consideration of their intrinsic uncertainty, have a high probability of leading
to inappropriate decisions.
Uncertainty exists in every phase of the geographic data life cycle, from
data collection to data representation, data analyses, and final results,
transcending the boundaries of disciplines and organizations. As it passes along
the stages from observation to eventual archiving, geographic data may pass
between many different custodians, each of whom may bring their own distinct
interpretation to the data. Thus uncertainty is not a constant property of the
data's content so much as a function of the relationship between the data and
the user: uncertainty is a measure of the difference between the data, and the
meaning attached to the data by its current user. For example, if knowledge of
the classification scheme used to create a data set fails to pass from one
custodian to another, and a user mistakenly attributes the wrong classification
scheme, then uncertainty has been increased, because the data contents may now
be further from the new user's understanding of the truth, as defined by the
new, mistaken classification scheme.
At this time, our understanding of uncertainty in geographic data and its
consequences for decisions made using geographic information technologies is
very incomplete. Progress will require the combined efforts of experts in
particular domains of geographic data; experts in GIS, spatial analysis, and
modeling; spatial statisticians and geostatisticians; and developers and vendors
of GIS software. Intensive research is needed in the following areas:
- studying in detail the sources of uncertainty in geographic data and the
specific propagation processes of this uncertainty through GIS-based data
analyses;
- developing techniques for reducing, quantifying, and visualizing
uncertainty in geographic data, and for analyzing and predicting the
propagation of this uncertainty through GIS-based data analyses;
- testing new methods of managing uncertainty in geographic data and GIS
analyses; and
- implementing strategies and methodologies for reducing, quantifying,
tracking, and reporting uncertainty in GIS implementation, in geographic data
collection and generation, and in spatial data standards and decision making
processes.
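One common way to analyze the propagation of uncertainty through a GIS
operation is Monte Carlo simulation, sketched below for the area of a polygon
whose vertices carry positional error. The error model (independent Gaussian
noise of equal standard deviation on every coordinate), the parameter values,
and the function names are all assumptions made for illustration; real
positional errors are often correlated.

```python
import random
from statistics import mean, pstdev


def polygon_area(vertices):
    """Planar polygon area by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def simulate_area(vertices, sigma, trials, seed=0):
    """Perturb each vertex with independent Gaussian noise and recompute the area."""
    rng = random.Random(seed)
    areas = []
    for _ in range(trials):
        noisy = [
            (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in vertices
        ]
        areas.append(polygon_area(noisy))
    return areas


# Hypothetical 100 x 100 parcel with a 2-unit positional standard deviation.
square = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
areas = simulate_area(square, sigma=2.0, trials=2000)
print(round(mean(areas), 1), round(pstdev(areas), 1))
```

The spread of the simulated areas gives a direct, reportable estimate of how
much uncertainty in location becomes uncertainty in the derived quantity.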
GIS AND SOCIETY
Access to information technology is often presented as offering enormous
benefits to society, in the form of increased choices, a more informed
citizenry, economic growth, and empowerment of the individual. At the same time,
and while we may believe in many of the assumptions on which these assertions
of improvement are based, we have very little understanding of the long term
political, economic, legal, and institutional impacts of technologies like GIS.
Moreover, the geographic information technologies seem to have certain unique
characteristics that will affect their eventual impact, including the potential
for invasion of privacy, and relevance to community-based decision making and
political processes.
In listing this as one of their ten research priorities, the delegates to
the UCGIS meeting in Columbus identified several specific issues that should
form the basis of a research agenda on GIS and society:
- In what ways will GIS actually affect and alter the society it is intended
to represent and serve?
- How can various conceptions and representations of space, not based on
traditional map formats or geometric views, be embedded within a GIS? Is GIS
more appropriate for some cultures than others? Can GIS be developed to
reflect complex and ambiguous perceptions of social and physical space?
- How will GIS affect the relationships among and within government
agencies, and between them and the various citizen groups concerned with the
environment, property rights, and advocating the needs of local communities?
- What are the interpersonal implications of GIS? Interaction at the
individual level underpins all other relationships.
- Can GIS provide citizens with an increased ability to monitor and hold
government accountable for proposals and actions?
- Will GIS provide citizens with an understanding of their rights and
interests in land?
- How accessible will spatial data and related GIS analysis tools be to all
parts of society?
- Can GIS be used to increase participation in public decision making?
The products of geographic information technologies are changing and will
continue to change the economic, legal, political, and cultural status of
adopting agencies, decision makers using the products, and the people and
organizations affected by the decisions. While early impacts are becoming
evident, little is known about the long term effects that the products of these
technologies will have on the communities and organizations that implement them.
We should observe, and ultimately be able to predict, how geographic information
technology and products alter decision making processes within organizations,
interactions between agencies, the citizen's relationships with government
agencies, and people's beliefs and actions in regard to the use and management
of land and resources.
At a deeper level, we need to ask to what extent the particular logics,
visualization techniques, value systems, forms of reasoning, and ways of
understanding the world that have been incorporated into existing GIS techniques
limit or exclude the possibilities of alternative forms of representation that
may be as yet unexplored. We need to ask how the proliferation and dissemination
of GIS has influenced the ability of different social groups to use information
for their own empowerment--who it has favored, and who it has excluded. Finally,
we need to ask whether ethical or legal restrictions need to be placed on access
to geographic information technologies because of their potential for misuse,
surveillance, and invasion of privacy.
CONCLUDING COMMENTS
As noted earlier, the ten topics identified by UCGIS as its research agenda
reflect the views of the research community at this point in time. They are
driven partly by the research community's perception of what is possible, and
where commitment of resources would lead to substantial results, and the
solution of well-defined problems, in reasonable time. They also reflect the
research community's consensus on problems that currently impede the use of
geographic information technologies in addressing current needs. On the other
hand, the research community's views on society's needs are clearly limited, and
must be refined and enlarged by those better equipped to address such issues,
including government agencies, elected officials, and the general public. Thus
this agenda is presented here more in the spirit of a shopping list--here is
where we think research could be done to good effect, as a first step in a
dialog.
UCGIS intends to refine the agenda, as perceptions change, results
accumulate, views are expressed, and problems are solved. We expect to do this
roughly every two years, at meetings similar to the one held in Columbus in
June, 1996.
Certain readers may be disappointed by apparent absences from the list of
topics. We have tried to construct a scientific research agenda, and to organize
it in terms of a set of fundamental issues rather than applications, and in
consequence none of the topics refers to a specific domain. As dialog proceeds,
we expect to identify areas within the ten topics that are particularly relevant
to domains--for example, several of the topics are of great relevance to
transportation, and several to global environmental change. A matrix showing the
importance of each of the ten topics to each domain of GIS application would be
useful and should be developed.
Similarly, none of the topics is itself a geographic information technology.
We do not have a research priority on GPS, or remote sensing, or GIS, because
these technologies form the underlying framework for the entire agenda. Remote
sensing, for example, is of particular relevance to the first topic, Spatial
Data Acquisition and Integration; to Spatial Analysis in a GIS Environment; and
to Uncertainty in Geographic Data and GIS-Based Analyses. Rather than focus a topic
on a specific technology, we feel that a focus on several fundamental issues
raised by the technology and currently impeding its use will be more productive.
As noted earlier, UCGIS welcomes comments and discussion of this agenda,
involvement in the dialog that will follow its publication, and participation in
the process of its continued evolution and refinement.