A framework to enhance semantic flexibility for analysis of distributed phenomena

 

 

John McIntosh (jmcintosh@ou.edu)

May Yuan (myuan@ou.edu)

 

Department of Geography, University of Oklahoma

 

Abstract

 

While many geographic phenomena are best represented as continuous surfaces, regions based on attribute values within these fields can often be identified as entities.  For example, areas of relatively high elevation may be viewed conceptually as hills while flat low-lying areas are perceived as plains.  Dual approaches that model distributed phenomena as both objects and continuous fields have been proposed to represent such phenomena in geographical information systems. However, these conceptual entities often have vague boundaries that vary depending on the use and the user.  The nature, and even existence, of these regions depends upon the range of values, or thresholds, used to define them. 

 

This paper proposes a representational framework to model distributed phenomena as both objects and continuous fields.  The proposed framework extends the dual representational approach by maintaining multiple representations of the conceptual entities delineated by boundaries relevant for the expected uses of the data.  Hourly rainfall accumulation data for the Southern Great Plains, USA, are used as a proof of concept.


1. Introduction

 

Over the past 20 years, the production of geospatial data has increased exponentially.  Accompanying the increase in data has been a shift in the dissemination of data and in the characteristics of the data user.  Data sets are often distributed to a wide range of users with diverse applications.  This poses representational challenges because the data may be analyzed and manipulated differently for the various uses.  This issue is especially challenging for modeling phenomena that possess both object and field-like characteristics because the underlying data models typically used to represent these aspects differ substantially.  Approaches that combine field and object characteristics by explicitly identifying and modeling areas within the field as data objects have been proposed (Winter 1999, Yuan 2001).

 

Combining strengths of the object and field representations enhances the ability to summarize and reason about overall patterns within distributed data.  This paper presents a framework to model both field and object-like characteristics of conceptual objects imbedded in continuous fields.  Recognizing that boundaries of conceptual objects in fields are inexact and context specific (Burrough 1996, Egenhofer and Mark 1995), the proposed framework extends the dual object/field representation framework by explicitly storing multiple boundaries.  The proposed framework also maintains related object-like characteristics and relationships for spatiotemporal analysis. 

 

A goal of the framework is to better support multiple users of data by representing both object and field like characteristics of geographic phenomena.  The framework is designed to provide enhanced capabilities for analysis by providing a means to investigate how object-like spatiotemporal characteristics of areas within the fields vary based on how the boundaries are defined and to better represent hierarchies of areas within the phenomena.  The proposed framework also maintains object identity, necessary for temporal analysis.  A sample data set of hourly radar derived precipitation estimates over Oklahoma is represented in the framework as a proof of concept.  The dataset covers March 15, 2001 to July 15, 2001, a period with numerous rainstorms in the study area.

 

The second section reviews data representation of geographic phenomena and provides a conceptual basis for the proposed framework.  The third section presents the proposed framework.  This is followed by sections describing the implementation of the framework using rainfall as a case study and the results.  The final section identifies strengths and weaknesses of the framework and discusses areas for future work.

 

2. Background

 

This section reviews the exact object and continuous field conceptual views of the world.  These views provide the conceptual bases for the vector and raster models, which dominate geographic information systems (GIS).  The second part of this section discusses some limitations of the field and object views for geographic phenomena that exhibit characteristics of both.  This is followed by a discussion of dual and hybrid object-field based models and the conceptual basis for the proposed framework.

 

Two Diametrically Opposed Views of the World: The Exact Object and Continuous Field Models

 

Two conceptual models of geographic phenomena dominate GIS views of the world: the exact object model and the continuous field model (Burrough and McDonnell 1998, Erwig and Schneider 1997). The exact object model views the world as populated with entities. The emphasis is on the location of boundaries, which act as containers for attributes that apply uniformly to the space within.  Because of the assumption that entities are discrete uniform objects, the exact object model does not explicitly address variation within entities.  In contrast, the continuous field model views the world as filled with attributes that vary continuously over space.  Because the fields are continuous, the concept of boundaries is not a basis of this model.

 

These two conceptual models work well for some types of geographic applications.  For example, parcels of land, which have exact boundaries with uniform attributes such as value or ownership, fit neatly into the exact object model.  On the other hand, air pressure varies continuously over the Earth, and the field model is able to capture such spatial variation.  The two conceptual views result in different approaches to representing geographic data, although they share the underlying basis of absolute Cartesian space (Couclelis 1992, Peuquet 1988).

 

Modeling phenomena that possess object and field characteristics

 

In practice, many geographic phenomena do not fit neatly into one or the other of these conceptual models.  Data models that adhere to just one of the conceptual views are unable to provide a complete representation of such phenomena.  While air pressure varies continuously over the surface of the earth, conceptual entities within the pressure fields such as "ridges" are well recognized.  If the pressure ridges are modeled as objects, variation of pressure within the ridge is lost.  In contrast, if a field model is used, the boundaries and dimensions of the "ridge" are not explicitly modeled.  Each approach leads to an incomplete representation.

 

Object-like characteristics of distributed phenomena can play an important role in analysis.  For example, in weather forecasting the position of a ridge may be used by a forecaster to identify areas that are unlikely to experience rainfall.  Modeling the ridge as an object also allows the topological relationships of the conceptual object with other objects to be established.  For example, the ridge may be over a city or disjoint and approaching the city.  The general shape and orientation of conceptual areas can provide insight into physical processes or be related to conceptual models used by domain experts.  Finally, the focus on objects inherent in the exact object model provides a means of maintaining object identity over time, which can be used as a basis for detecting, characterizing, and tracking changes.

 

It is inadequate to model areas that possess both object and field like characteristics simply as exact objects because information on the distributed nature of the phenomena is lost.  The entities occupy a space and have coherent spatiotemporal patterns that may be of interest. In an effort to capture both field and object like characteristics, dual,  hybrid and object-oriented approaches have been proposed (Blachke et al 2000, Winter 1998, Yuan 2001).  With dual representations, the object-like characteristics can be modeled, usually as a vector object or a zone within a raster layer and the field-like characteristics are captured in a raster or TIN model. Hybrid approaches capture both object and field-like characteristics in an integrated representation. Object oriented approaches can store the geometry of conceptual objects using the raster model, TIN model, vector model or some combination.  Explicitly representing the conceptual area as a polygon object or as a unique zone within a raster model can provide a basis for maintaining object identity over time in a temporal GIS. 

 

Selecting an Appropriate Boundary

 

Dual, hybrid or object oriented representational approaches that model both field and object-like characteristics require the boundaries of the conceptual objects, or regions imbedded in the fields, to be defined a priori.  For many conceptual entities, there is no single boundary that is valid for all uses.  Conceptual objects imbedded in fields often have vague boundaries that vary depending on the use and the user.  The nature, or even existence, of these conceptual objects depends upon the range of values, or thresholds, used to define the object. Even though there is no universally appropriate boundary, the issue of what boundary to use is important to consider because many useful object-like characteristics such as topological relationships or shape descriptions can vary depending on how the boundary is defined.   Figure 1 illustrates boundaries for an area of rainfall based on three thresholds: >0 mm/hour, >20 mm/hour, and >40 mm/hour. This figure illustrates the potential variation in commonly used object characteristics such as movement and spatial relationships associated with different boundaries.

 

Most GIS provide functions or tools to delineate zones in raster models based on threshold values or spatial characteristics derived from neighborhood analysis.  These zones can be both delineated and converted to polygons on an ad hoc basis. This gives the user the flexibility to specify a definition, or range of definitions, for zones specific to the application.  However, it may not be practical to derive complex characterizations or spatial and temporal relationships of zones on an ad hoc basis.  For example, determining spatial and temporal relationships for boundaries may be computationally expensive, especially for distributed users, making it impractical to answer queries in real time.

 

By storing a range of likely boundaries, associated attributes and relationships, a system can support multiple users with a variety of needs, reducing duplication of intensive data manipulation by multiple end users.  Maintaining this information also allows users to explore the relationship between semantics and spatiotemporal characteristics of the modeled conceptual entities.  Furthermore, with distributed data sets, summaries based on the object like characteristics of the zones can be used as a final data product or as a screening tool for more efficient access to the distributed data.

 

The proposed framework is a dual approach with both object and field like characteristics represented.  However, unlike other dual approaches that use a single representation of the object like characteristics, the framework employs multiple representations of conceptual objects with boundaries based on a range of threshold values that are likely to be of utility to the users.  Multiple representations have been proposed to support intelligent zooms and to support multiple temporal and spatial scales of analysis (Mountrakis et al 2000, Timpf 1997). 

 

There is a trade off between storage and computation under the proposed framework.  This research examines benefits and costs of the proposed framework using radar derived gridded hourly rainfall accumulations as a case study.  Rainfall accumulation provides a good test case for the proposed framework.  It possesses both object and field like characteristics.  Rainfall accumulation can be conceptualized as a surface.  It is often represented as raster layers either derived by interpolating point measurements or from remote sensing such as radar or satellite imagery.  The field representation of rainfall is valuable for many uses such as water balance calculations or flood forecasting where it is necessary to have specific estimates of the rainfall on a grid-by-grid basis.  Rainfall is not uniform and areas within the modeled domain are often conceptualized as entities based on factors such as the presence of rainfall or relative extremes.  These conceptual entities provide a summary, or interpretation, of the overall patterns in the rain field.  The ways in which these zones move and evolve can suggest the underlying organization and processes governing the storm event.

 

However, there is no universal definition of how rainfall regions should be defined.  For example, zones of high rainfall may be defined based on a very high hourly rainfall threshold for flood analysis.  For characterizing the structure of a rainstorm, zones might be based on relative rainfall amounts to capture characteristic patterns such as areas of low intensity stratiform precipitation and high intensity precipitation associated with convective cells.  It is clear that the boundaries used to represent object-like characteristics of accumulated rainfall vary by use. 

 

Representing conceptual entities using multiple data objects

 

The proposed framework is based on modeling each conceptual entity (region) within the distributed phenomena using a series of zones delineating areas that meet pre-specified threshold values.  Geometry and spatiotemporal relationships are stored for each of the zones.  Currently most GIS use a single database and provide functionality for different uses by allowing the data to be viewed in a variety of formats generated on demand from the original database (Parent et al, 2000).  Although conceptually elegant, this approach is not always practical or even possible.  For example, with temporal GIS, the geometry and attributes of modeled objects may change over time, requiring representations that are valid for specific intervals or points in time.  In other cases, phenomena are not adequately modeled by a field or an object based representation alone, requiring a dual or hybrid approach.

 

The proposed framework takes a dual representational approach modeling both distributed and object like characteristics.  Although the boundaries used to delineate the object-like representation in the proposed framework can be derived automatically on an ad hoc basis from the distributed data, a multiple representational framework still provides some advantages.  The proposed framework maintains a complex set of spatiotemporal relationships between the modeled objects which would be impractical to implement on demand.  By storing this information explicitly, complex spatiotemporal information can be accessed in real time. 

 

While maintaining multiple representations increases storage requirements, this approach has also been taken by other researchers to support representations that are difficult to derive on demand.  For example, there have been a number of proposals directed at incorporating scaling effects such as change in geometry and semantics due to simplification, consolidation and categorization as the cartographic scale is changed (Buttenfield 1995, Davis and Laender 1999, Jones et al 1996, Parent and Spaccapietra 2000,  Rigaux and Scholl 1994, Timpf 1997).  Multiple representations have also been proposed as a means of enhancing interoperability.  By storing multiple representations  at different resolutions or using different data models, systems can better support complex analysis on the fly. 

 

3. The Framework

 

This section describes the organization of the proposed framework, data structures, and implementation.  The proposed framework is based on the assumption that areas within distributed phenomena such as areas of relatively high or low values or gradients are used to characterize and reason about distributed geographic phenomena.  The framework explicitly models areas of relatively high values in distributed phenomena as data objects.  The framework is a dual approach that stores both an object and field based representation of distributed phenomena.  The object based representation delineates areas of relatively high values within the field based representation.   Recognizing that there are often no universally appropriate boundaries of conceptual entities within distributed phenomena, the framework models each of the conceptual entities using multiple threshold values likely to be applicable for the expected uses and users of the data.  The framework links areas defined by these boundaries over time to support representation of and reasoning about change.  The framework also explicitly stores topological information regarding some spatial relationships between the modeled objects.  This provides a means to explore the relationship between semantics and spatiotemporal characteristics of the modeled conceptual entities.

 

The framework organizes data into states, zones, processes, and events similar to Yuan (2001) (Figure 2).  The framework is designed to work with distributed data that is collected as snapshots at regular time intervals.  The basic building blocks are states. States are defined as areas of contiguous grids within a snapshot that meet or exceed a specified threshold value.  States represent the area contained within threshold specific boundaries.  Because the framework represents conceptual entities using a variety of thresholds, a state may contain, or be part of, one or more other states.
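The definition of states can be illustrated with a minimal Python sketch. It is not the authors' Avenue implementation: the function names `label_states` and `part_of_relations`, the use of 4-connectivity for "contiguous", and the strict `>` comparison (following the case-study thresholds such as >0 mm/hour) are all assumptions made for illustration.

```python
from collections import deque


def label_states(grid, threshold):
    """Return the states of one snapshot as a list of sets of (row, col) cells.

    Assumptions: a cell qualifies when its value is strictly greater than
    `threshold`, and contiguity means 4-connectivity.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    states = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or grid[r][c] <= threshold:
                continue
            # Breadth-first flood fill collects one contiguous area.
            cells = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                cells.add((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and (nr, nc) not in seen and grid[nr][nc] > threshold:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            states.append(cells)
    return states


def part_of_relations(grid, thresholds):
    """Label states per threshold and record which lower-threshold states contain each one.

    Returns (labelled, relations) where relations[(threshold, index)] is a list
    of (lower_threshold, index) pairs; "contains" is the inverse mapping.
    """
    ordered = sorted(thresholds)
    labelled = {t: label_states(grid, t) for t in ordered}
    relations = {}
    for i, high in enumerate(ordered):
        for hi_idx, hi_cells in enumerate(labelled[high]):
            parents = []
            for low in ordered[:i]:
                for lo_idx, lo_cells in enumerate(labelled[low]):
                    if hi_cells <= lo_cells:  # subset test
                        parents.append((low, lo_idx))
            relations[(high, hi_idx)] = parents
    return labelled, relations
```

Because every cell that exceeds a higher threshold also exceeds each lower one, a higher-threshold state in this sketch always falls inside exactly one state at each lower threshold, which is what makes the part-of relation well defined.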

 

States represent conceptual entities at a specific point in time or interval.  With dynamic distributed phenomena, areas of relatively high values may appear, move, evolve and disappear over time.  These objects can be tracked through a sequence of snapshots.  The framework models states that overlap (or partially overlap) spatially with states defined by the same threshold in the previous or next time interval as zones.  A zone is a temporal object that may exist over multiple snapshots, or in a single snapshot when a state does not overlap with another state in the previous or subsequent snapshot.  With many phenomena, the conceptual objects may split or merge over time.  If a zone splits into disjoint areas, the zone ends and each of the resulting states becomes the first state in a new unique zone.  Likewise, if two zones merge, the zones end at the time period before the merger takes place and the resulting state becomes the first state in a new zone.  This simplifies analysis by ensuring that disjoint areas always have a distinct zone identification number.  Like states, the framework explicitly stores contains and part-of relations for zones.  Disjoint zones at a higher threshold may be contained within a single lower threshold zone.  The contains and part-of relations provide a means to link apparently unrelated disjoint zones to a single conceptual entity captured at the lower threshold.  Future and previous relations are also stored to allow easy tracking of zones that are involved in splits or merges.
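The zone rules above (continue on a one-to-one overlap, start new zones on a split or merge) can be sketched in Python as follows. This is an illustrative reading of the text, not the published implementation; the function name `link_zones` and the dictionary-based return values are hypothetical, and states are assumed to be sets of grid cells for a single threshold.

```python
from itertools import count


def link_zones(snapshots):
    """Assign zone ids to states of one threshold across consecutive snapshots.

    snapshots: list (one entry per time step) of lists of states, where each
    state is a set of (row, col) cells.
    Returns (zone_of, previous, future):
      zone_of[(t, i)]  zone id of state i in snapshot t
      previous[z]      zone ids that z descends from through a split or merge
      future[z]        zone ids that z turns into through a split or merge
    """
    next_id = count(1)
    zone_of, previous, future = {}, {}, {}

    for t, states in enumerate(snapshots):
        prev_states = snapshots[t - 1] if t > 0 else []
        # For each current state, the indices of overlapping previous states.
        parents = [[j for j, p in enumerate(prev_states) if p & s] for s in states]
        # For each previous state, how many current states it overlaps.
        child_count = [sum(1 for s in states if p & s) for p in prev_states]

        for i, par in enumerate(parents):
            if len(par) == 1 and child_count[par[0]] == 1:
                # One-to-one overlap: the zone simply continues.
                zone_of[(t, i)] = zone_of[(t - 1, par[0])]
                continue
            # New area, split, or merge: start a new zone and record lineage.
            z = next(next_id)
            previous[z], future[z] = [], []
            zone_of[(t, i)] = z
            for j in par:
                parent_zone = zone_of[(t - 1, j)]
                previous[z].append(parent_zone)
                future[parent_zone].append(z)
    return zone_of, previous, future
```

Running the sketch once per threshold keeps zones of different thresholds separate, so that contains and part-of relations between them can be derived afterwards from the cell sets.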

 

Because new zone identities are assigned when splits and merges occur, a single zone may not represent the entire lifespan of an area of values meeting a given threshold.  The framework models zones that are related through splits or merges as process objects. Process objects may consist of disjoint areas that are related by splits or mergers in previous or future time intervals.   Processes begin when new areas within the modeled domain meet a selected threshold value and end when the subsequent time interval does not possess any area that overlaps with the process area in the current time interval.   Like zones, the framework models distinct processes for each of the selected thresholds and stores contains and part-of relations.  The framework is organized under the assumption that the very presence of areas of high values (or conceptual entities) within the modeled domain constitutes an event.  Events are defined as intervals of time when consecutive snapshots possess one or more areas that meet at least one of the selected boundary thresholds.  The modeled events may consist of one or more related or unrelated processes.
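One way to derive processes and events from the zone lineage is sketched below in Python. Grouping zones with a union-find structure is a technique chosen here for illustration and is not described in the paper; the names `group_processes` and `find_events` are hypothetical.

```python
def group_processes(zone_ids, future):
    """Group zones connected through split/merge links into processes.

    zone_ids: iterable of zone ids for one threshold.
    future:   dict mapping a zone id to the zone ids it splits or merges into.
    Returns a sorted list of sorted zone-id lists, one list per process.
    """
    parent = {z: z for z in zone_ids}

    def find(z):
        # Follow parent pointers to the representative, compressing the path.
        while parent[z] != z:
            parent[z] = parent[parent[z]]
            z = parent[z]
        return z

    for z, successors in future.items():
        for s in successors:
            root_a, root_b = find(z), find(s)
            if root_a != root_b:
                parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for z in parent:
        groups.setdefault(find(z), []).append(z)
    return sorted(sorted(g) for g in groups.values())


def find_events(active):
    """Return (start_index, duration) for each run of consecutive active snapshots.

    active: list of booleans, True when a snapshot has at least one area
    meeting any of the selected thresholds.
    """
    events, start = [], None
    for t, flag in enumerate(active):
        if flag and start is None:
            start = t
        elif not flag and start is not None:
            events.append((start, t - start))
            start = None
    if start is not None:
        events.append((start, len(active) - start))
    return events
```

An event found this way says only that something met a threshold somewhere in the domain during those snapshots; it does not imply that the processes inside it are related.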

 

 

 

 

 

Data Structures

 

As mentioned above, the framework takes a dual representational approach that includes both a field and an object based representation.  The field based representation consists of a time series of raster layers.  The object based representation is implemented through a series of tables that store the state, zone, process, and event information (Figure 3).  The geometry of the object based representation is stored in the state table as run length codes corresponding to specific areas of the field based representation, with the object and field based representations linked by a time stamp.  In addition to the geometry, the state table includes an identification number, the time stamp, the threshold used to determine the geometry, and contains and part-of relations with states based on a different threshold that model the same conceptual entity.
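A possible shape for a state record and its run length geometry is sketched below in Python. The paper does not specify the exact run length format, so the (row, start column, run length) triples, the class name `StateRecord`, its field names, and the function `encode_runs` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StateRecord:
    """Hypothetical layout of one row of the state table."""
    state_id: int
    time_stamp: str          # links the state to its raster snapshot
    threshold: float         # threshold used to delineate the geometry
    run_codes: list          # geometry as (row, start_col, length) runs
    contains: list = field(default_factory=list)  # ids of higher-threshold states
    part_of: list = field(default_factory=list)   # ids of lower-threshold states


def encode_runs(cells):
    """Encode a set of (row, col) cells as row-wise runs (row, start_col, length)."""
    runs = []
    for r, c in sorted(cells):
        last = runs[-1] if runs else None
        if last and last[0] == r and last[1] + last[2] == c:
            last[2] += 1  # cell extends the current run
        else:
            runs.append([r, c, 1])  # start a new run
    return [tuple(run) for run in runs]
```

Run length codes keep the object geometry compact while still referring to exact cells of the raster layer that carries the same time stamp.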

 

The geometry of the higher level objects (zones, processes, and events) is not stored explicitly, but is easily determined by referencing the states that those objects link.   The zones table stores the zone identification number, the start time and duration, the identification numbers of the states that the zone is formed from, and previous zone and future zone identification numbers to record whether the zone begins or ends as a result of two zones merging or a zone splitting.  The process table stores a process identification number, start time, duration, threshold, the identification numbers of zones that the process is formed from, as well as contains and part-of relations with other spatially overlapping processes with different thresholds.  The event table stores an event identification number, the start time, duration, and the processes associated with the event.  It also includes contains and part-of relations for zones modeling the same conceptual entity based on a different threshold.  In the actual implementation, additional fields representing other attributes of interest can be associated with any of these tables.

 

 

A simple example of a series of three temporal snapshots of hourly accumulated rainfall is provided in Figure 4 to illustrate the representation using the framework.  The snapshots are shown graphically at the top and the associated tables are below.  Three thresholds are modeled.  The lightest shade represents the area meeting the lowest threshold (>0 mm/hour), with the middle and darkest shades corresponding to the middle (>20 mm/hour) and highest (>40 mm/hour) thresholds. This example shows a single event consisting of three processes, seven zones, and eleven states.  The first process (process 1) models the boundaries of the area meeting the lowest threshold, the second (process 2) models the area meeting the middle threshold, and the third (process 3) models the area meeting the upper threshold.  Process 1 includes zones 1, 4, and 6.  Process 2 includes zones 2, 5 and 7.  Process 3 consists of zone 3.   The area of rainfall splits after T2 but all three processes continue into the next period because both new rainfall areas overlap their respective parent process in T2.   Zones 1 and 2 end because they split into two new zones.  The resulting areas in T3 are assigned new IDs.  Zone 3 does not split and therefore its identity continues in T3.  The zone table maintains the relationship between the new zones in T3 and the parent zones in T2 in the PreviousZoneID field.  Likewise, the FutureZoneID field contains the ID numbers of the results of the split for the original zones 1 and 2.  This allows a user to track changes when a zone has split or merged.

 

The geometry of the areas meeting the thresholds is stored as run length codes in the state table. The run length codes and fields representing contains, part-of, previous IDs and future IDs relations are stored as comma delimited lists for ease of use and interpretation.  Several functions have been written to allow the data within these lists to be queried, and in the case of the run length codes, loaded as spatial objects in the implementation stage.
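The kind of helper functions described above might look like the following Python sketch. The original functions were written in Avenue and are not reproduced in the paper; the names `parse_id_list` and `decode_runs`, and the `row:start:length` text format for each run, are assumptions introduced for illustration.

```python
def parse_id_list(text):
    """Parse a comma delimited relation field such as "4, 6" into [4, 6]."""
    return [int(token) for token in text.split(",") if token.strip()]


def decode_runs(text):
    """Expand comma delimited run length codes into a set of (row, col) cells.

    Assumed format: each run is written as "row:start_col:length",
    for example "0:1:2,1:4:1".
    """
    cells = set()
    for token in text.split(","):
        if not token.strip():
            continue  # tolerate empty fields and trailing commas
        row, start, length = (int(part) for part in token.split(":"))
        cells.update((row, start + k) for k in range(length))
    return cells
```

Keeping the relation fields as plain text lists makes the tables easy to inspect by eye, at the cost of needing small parsers like these before the values can be queried or mapped.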

 

Implementation

 

The framework was implemented using ArcView GIS 3.2 software (Environmental Systems Research Institute Inc.,  Redlands, California).  Although ArcView does not provide direct support for this type of modeling, the scripting language in combination with the relational database support and display capabilities of ArcView provide the necessary tools to implement the framework.  The implementation included several extensions built using the Avenue scripting language.

 

The first extension loads the geometry from the state table into objects that can be viewed, manipulated, and analyzed in ArcView.  To avoid distortion resulting from large domains, the implementation uses a polygon theme representing the corrected position and shape of the raster cells.  The polygons corresponding to the cells represented in the run length codes are selected from this master theme and new themes of states can be loaded individually or by zone, process number or event number on the fly for review. 

 

The tables store spatial relations as comma delimited lists for ease of review and space efficiency.  Several extensions to the standard SQL query language have been created to work with this data.  These commands include summary statistics not currently supported by the ArcView query builder, including minimum, maximum, mean, variance, standard deviation, range, count, and sum (Figure 5).  The prototype allows records to be selected based on the values of objects in fields such as contains, part-of, future ID, previous ID, or States.  For example, the query in Figure 5 on the process table would return all processes beginning after time index 1 that contain zones that have an average duration of more than two time steps.  These summary characteristics can be calculated for any numeric attribute values that might be added to the basic representation, such as movement or shape indices.  This allows exploration of relationships as well as attribute characteristics.  The summary functions work with standard SQL and can be used in concert to identify events, processes or zones of interest.
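The logic of the example query (processes beginning after time index 1 whose zones have an average duration of more than two time steps) can be approximated in Python as below. This is not the SQL extension syntax of Figure 5, which is not reproduced here; the table layout, the field names `ProcessID`, `StartTime`, `Zones` and `Duration`, and the function names are hypothetical.

```python
from statistics import mean


def list_mean(id_list, table, field_name):
    """Mean of `field_name` over the records named in a comma delimited id list.

    Returns None when the list is empty so callers can skip the record.
    """
    ids = [int(token) for token in id_list.split(",") if token.strip()]
    if not ids:
        return None
    return mean(table[i][field_name] for i in ids)


def select_processes(processes, zones, min_start, min_mean_duration):
    """Ids of processes starting after `min_start` whose zones have a mean
    duration greater than `min_mean_duration`."""
    selected = []
    for proc in processes:
        avg = list_mean(proc["Zones"], zones, "Duration")
        if proc["StartTime"] > min_start and avg is not None and avg > min_mean_duration:
            selected.append(proc["ProcessID"])
    return selected
```

The same pattern extends to the other summary statistics named in the text (minimum, maximum, variance, and so on) by swapping `mean` for the corresponding aggregate.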

Because insights are often gained by graphically reviewing spatial properties and relationships, a viewer has been created to allow the user to vary the time and thresholds of the object representation of the modeled conceptual entities.  Our implementation of the framework uses Avenue scripts to provide a viewing dialog (Figure 6).  This dialog allows the user to explore the relationship between the various states, zones, processes and the thresholds used to derive these objects.

 

 

4. Case Study

 

This section describes the case study that was implemented to test the framework.  This study uses rainstorms as a proof of concept. Intense springtime storms are common in the Southern Plains, USA.  These storms can result in flooding and associated features such as wind or hail, which can damage crops and structures. The data sets are massive making it difficult to search for specific spatiotemporal patterns within the grids. The goals of the case study are to:

 

 

 

 

 

The Data

 

The National Weather Service’s Arkansas-Red Basin River Forecast Center (ABRFC) produces digital precipitation arrays (DPAs) for the drainage area of the Red and Arkansas Rivers.  This product covers the entire state of Oklahoma and portions of surrounding states (Figure 7).  The DPAs are in a raster format and consist of approximately 4 km x 4 km grids in the Hydrologic Rainfall Analysis Project (HRAP) coordinate system and are archived in the NetCDF format (Arkansas-Red Basin River Forecast Center, 2002).  Each grid contains an hourly estimate of accumulated rainfall.  The estimates for accumulated rainfall are based on a composite from next generation radars (NEXRAD) and observations at ground weather stations (Schmidt et al, 2000).

The DPAs are generated in real time and are used by the ABRFC for flood forecasting.  They can also be valuable for other purposes such as climate analysis, risk assessment, facilities planning or agronomy.  The DPAs represent the rainfall as a continuous surface.  The distributed representation of the data is ideal for hydrologic modeling and flood forecasting where the estimates of hourly rainfall accumulation are needed for discrete locations within the modeled domain.  For other types of analysis, object-like characteristics of areas within the grids are also useful.

 

For example, the structure of the storms that pass through the area, as revealed by the spatiotemporal rainfall patterns, can be valuable for other uses. In order to store and reason about the structure of storms, or the association of specific parts of the storm system with other severe weather, it is necessary to incorporate object like characteristics into the analysis by explicitly identifying and/or characterizing regions in the field.  This is commonly done for many types of analysis of rainfall and other distributed phenomena. These objects allow summarization and characterization of the overall patterns.

 

The patterns of the most intense rainfall can indicate the structure of the storm system.   For example, a linear array of regions of relatively high rainfall moving perpendicular to the alignment of the line might suggest a squall line of convective cells.  Meteorologists have associated morphological and structural characteristics with storm dynamics.    Houze et al (1990) evaluated severe springtime rainstorms in Oklahoma.  Storm organization was graded according to the degree to which it matched an idealized model of a leading line/trailing stratiform structure.  Factors considered were shape, orientation, movement of the storm area, characteristics of the leading edge and the presence of stratiform rainfall.  Hagen et al (1999) studied thunderstorms in southern Germany and identified three classes of storms based on these object like characteristics: isolated cells, events which follow along a line, and linearly aligned thunderstorms that move roughly perpendicular to the major axis.  Scheisser et al (1995) studied the structure of heavy rainfall events in Switzerland and classified storms based on the relative intensity of the rain field derived from radar.   The storms were categorized based on object like features including the shape and position of stratiform rainfall and the characteristics of the leading edge.

 

The ABRFC’s DPA data is currently stored in raster layers.  For uses other than inputs to distributed parameter models, the object-like characteristics of regions within the fields are of importance.  Multiple representational approaches could be used to model both object and field like characteristics.  However, the boundaries of these conceptual objects are dependent on the intended uses, so there is no single boundary that would be expected to suit all needs.   For research such as Houze et al (1990), Scheisser et al (1995), or Hagen et al (1999), the areas of most intense precipitation associated with the cells and the stratiform rainfall areas are of interest, so several rainfall thresholds would be needed.  For other uses, the boundaries might be different.  For example, if the rainfall data is being studied to improve the efficiency of fertilizers or pesticides, the appropriate boundary might be the presence of rainfall.

 

Data Preprocessing

 

Java scripts were developed to download DPAs of hourly rainfall accumulation from the ABRFC and process the data for input into the prototype.  Areas of rainfall are natural conceptual entities in these arrays.  Three thresholds were used to delineate rainfall “states” for the object part of the representation (>0 mm/hour, >20 mm/hour, and >40 mm/hour).  Houze et al (1990) identify several important structural aspects of springtime rainstorms, including areas of light stratiform precipitation, areas of intense rainfall corresponding to convective cells, and areas of heavy precipitation indicating features such as a squall line.  The thresholds selected for the case study are intended to capture these features in the precipitation arrays.
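
The delineation of states can be illustrated with a minimal sketch.  It is not the prototype's code: it assumes that a state is a set of 4-connected grid cells whose hourly value exceeds a threshold (the connectivity rule is an assumption, since the paper does not state one), and the names label_states and states_by_threshold are hypothetical.

```python
from collections import deque

# Thresholds used in the case study (mm/hour).
THRESHOLDS_MM = (0.0, 20.0, 40.0)


def label_states(grid, threshold):
    """Return the rainfall states in one hourly grid for one threshold.

    Each state is a set of (row, col) cells whose value exceeds
    `threshold`. Cells are grouped by 4-connectivity, which is an
    assumption made for this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    states = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] <= threshold or (r, c) in seen:
                continue
            # Breadth-first flood fill from an unvisited wet cell.
            region = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                region.add((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and grid[nr][nc] > threshold):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            states.append(region)
    return states


def states_by_threshold(grid, thresholds=THRESHOLDS_MM):
    """Map each threshold to the list of states it delineates."""
    return {t: label_states(grid, t) for t in thresholds}
```

Because the thresholds are nested, every state at a higher threshold lies inside exactly one state at each lower threshold, which is what makes the contains and part-of relations between representations well defined.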

 

Rainfall states are linked to form zones based on overlap in consecutive time periods.  In cases where the rainfall objects split or merge, new identities are assigned.  Rainfall processes link zones related by merges or divisions with their predecessors and descendants.  Rainfall events are defined as consecutive periods with rainfall somewhere in the modeled domain.  It should be noted that membership in the same rainfall event does not imply that the rainfall processes are related.  The rainfall events, processes, zones and states are object representations of the distributed precipitation array.  They are linked to the digital precipitation array based on a time-date index.  In addition to the basic elements of the framework described in section 3, we include attributes relevant to the rainfall dataset.  These include area, centroid movement (speed and direction), elongation, and orientation of the major axis.
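
The linking rule described above can be sketched as follows.  This is an illustration under stated assumptions rather than the prototype's implementation: states are represented as sets of grid cells, "overlap" is taken to mean sharing at least one cell, and the function name link_zones and the returned structures are hypothetical.

```python
from itertools import count


def link_zones(snapshots):
    """Assign zone IDs to states in consecutive hourly snapshots.

    snapshots: list (one entry per hour) of lists of states, where each
    state is a set of (row, col) cells.

    Returns (zone_ids, previous_ids):
      zone_ids[t][i]  zone ID of state i at time step t
      previous_ids    maps each zone ID to the set of zone IDs it
                      descends from (empty for newly appearing zones)
    """
    next_id = count(1)
    zone_ids = []
    previous_ids = {}
    for t, states in enumerate(snapshots):
        prev_states = snapshots[t - 1] if t > 0 else []
        prev_ids = zone_ids[t - 1] if t > 0 else []
        # Indices of previous states that each current state overlaps.
        preds = [[j for j, p in enumerate(prev_states) if p & s]
                 for s in states]
        # Number of current states that each previous state overlaps.
        succ_count = [sum(1 for s in states if p & s) for p in prev_states]
        ids_now = []
        for i in range(len(states)):
            one_to_one = (len(preds[i]) == 1
                          and succ_count[preds[i][0]] == 1)
            if one_to_one:
                # Simple continuation: the zone keeps its identity.
                ids_now.append(prev_ids[preds[i][0]])
            else:
                # Appearance, merge, or split: assign a new identity.
                zid = next(next_id)
                previous_ids[zid] = {prev_ids[j] for j in preds[i]}
                ids_now.append(zid)
        zone_ids.append(ids_now)
    return zone_ids, previous_ids
```

Following the previous_ids links transitively groups zones into processes, and inverting the mapping yields the corresponding links to descendants.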

 

Results

 

The framework supports enhanced querying in the meteorological case study.  By defining states, zones, and processes based on multiple thresholds and relating these objects through part-of and contains relations, the framework provides a means to investigate more complex spatio-temporal patterns than a dual representation with a single object-like representation.  In the rainfall case study the framework supports queries that search for low threshold rainfall processes that contain higher intensity processes.  For example, applying the following query to the zone table would identify rainfall zones with a threshold of greater than 20 mm of rainfall per hour that contain at least one zone of higher intensity rainfall.

 

         (min([contains].[Threshold]) > 20) and ([Starttime]<46007)

 

The start time of 46007 corresponds to events occurring earlier than 4/1/00.  Figure 8 shows one of the zones returned by the query (this zone is part of a rainfall event over the border of Texas and Oklahoma on 3/27/00).  By including attribute information such as elongation and orientation, the framework supports queries based on characteristic patterns similar to those used in the springtime rainstorm typologies proposed by Houze et al (1990) and Schiesser et al (1995).  Including additional attributes can further refine the searches.  For example, querying for low intensity processes that contain higher intensity processes with a significant range in speeds may suggest rainstorms with rotation.  Objects selected by the query can be related back to the original gridded data based on the time index number, and the object representations can be automatically loaded from the run-length representation for display and analysis in the GIS.
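
The logic of the query above can be sketched outside the GIS as a filter over a zone table.  The record layout used here (keys id, threshold, starttime and contains) is hypothetical and stands in for the relational tables of the prototype, and the treatment of zones that contain nothing (they are excluded) is an assumption about how the minimum over an empty set is handled.

```python
def query_zones(zones, min_contained_threshold=20, start_before=46007):
    """Return IDs of zones matching
    (min([contains].[Threshold]) > 20) and ([Starttime] < 46007).

    zones: list of dicts with the hypothetical keys
      "id", "threshold", "starttime", "contains" (list of zone IDs).
    """
    by_id = {z["id"]: z for z in zones}
    hits = []
    for z in zones:
        # Thresholds of the zones this zone contains.
        contained = [by_id[c]["threshold"] for c in z["contains"]]
        if (contained                          # assumption: skip empty sets
                and min(contained) > min_contained_threshold
                and z["starttime"] < start_before):
            hits.append(z["id"])
    return hits
```

For instance, a zone delineated at the 20 mm/hour threshold that starts before time index 46007 and contains a 40 mm/hour zone is returned, while the same zone starting later, or a zone containing nothing, is not.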

 

Another reason to include multiple thresholds is to accommodate different uses of the data.  For uses where the presence of rainfall is important, the state, zone, or process objects based on exceeding a threshold of 0 mm could be used.  Likewise, for other uses where only the more intense rainfall is important, modeled objects exceeding one of the higher thresholds could be used for analysis or as a screening tool to identify rainfall events of interest.

 

One of the tradeoffs between a framework that stores multiple representations and a system where the object-like characteristics are calculated on demand is the need for additional storage space.  The incremental storage requirements depend on the number of boundaries maintained, the complexity of the zones meeting the threshold, and the number of distinct areas modeled in each snapshot.  We find that, when modeling rainfall data with three levels of boundaries based on the thresholds described above, the incremental storage cost is about 0.4 percent.  This includes storing the relationships, attributes, and geometry of the objects.

 

Since this implementation of the framework stores the geometry of the boundaries as run-length codes representing each area meeting the threshold value, it requires processing to convert the stored geometry into GIS data objects that can be displayed and analyzed using the GIS.  As discussed above, the areas are mapped to a shapefile representing the grids.  This process is completed within several seconds for most states of an event, making investigation of the object-like characteristics of the phenomena feasible.
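
A run-length representation of an area can be sketched as follows.  The exact encoding used by the prototype is not specified here, so this sketch assumes row-wise runs stored as (row, start column, length) triples; the function names encode_runs and decode_runs are hypothetical.

```python
def encode_runs(cells):
    """Encode a set of (row, col) cells as row-wise runs.

    Each run is a (row, start_col, length) triple covering a maximal
    horizontal stretch of adjacent cells.
    """
    runs = []
    for r, c in sorted(cells):
        if runs and runs[-1][0] == r and runs[-1][1] + runs[-1][2] == c:
            # The cell extends the current run by one column.
            row, start, length = runs[-1]
            runs[-1] = (row, start, length + 1)
        else:
            runs.append((r, c, 1))
    return runs


def decode_runs(runs):
    """Expand runs back into the set of (row, col) cells they cover."""
    return {(r, c)
            for r, start, length in runs
            for c in range(start, start + length)}
```

Decoding returns the grid cells belonging to an area, which can then be matched against the cells of the grid shapefile for display.  Compact areas compress to a few runs per row, which is consistent with the small incremental storage cost reported above.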

 

 

5. Conclusions

 

A framework has been presented that builds on the basic dual representation approach to represent field-like and object-like characteristics.  The framework extends this concept by explicitly storing multiple boundaries to represent the object-like characteristics of the phenomena.  By representing object-like characteristics, the framework provides a means of summarizing patterns in the distributed phenomena as well as providing object identity for temporal analysis.

 

The contains, part-of, previousID and futureID relationships in the basic implementation allow the multiple object representations based on different criteria to be related in queries and analysis.  The implementation of the framework includes a viewer and a prototype query builder to explore the data within the framework.  The query builder allows queries to incorporate contains, part-of, previousID and futureID relationships.  The viewer provides an opportunity to explore relationships between semantics and the object-like characteristics of the conceptual entities by allowing the user to track conceptual objects within the distributed phenomena while varying time and boundary definitions.

 

Although the multiple boundaries proposed here can be derived automatically from the original data sources, the framework provides some advantages.  It stores the complex set of relationships between the objects over time, which would be impractical to derive on demand.  By implementing this framework, centralized data holders could provide a means for distributed users to query the data and perform some types of analysis with relatively little overhead, avoiding duplication of effort.  The case study was implemented in the relational data structure contained in ArcView.  While the framework could also be implemented in an object-oriented data structure, the widespread familiarity with relational databases and the ease of construction and update make the relational database a reasonable choice to implement the framework.

 

Hourly rainfall patterns exhibit a high degree of spatial autocorrelation, making rainfall well suited to this type of framework.  The object-based component of the framework emphasizes relative space and may not be suitable for object-based analysis where absolute location is important.  For other types of distributed phenomena where there is little spatial autocorrelation, the object-based component of the framework may be of little use.

 

The prototype framework has some other important limitations.  It is organized on data at fixed spatial and temporal scales based on the spatiotemporal granularity of the observed snapshots.  A basic assumption of the framework is that the behavior or change in the conceptual entities is continuous at the spatial and temporal granularity of the snapshots (Galton 1997, Wilcox et al 2000).  This assumption allows us to reasonably use the behavior of the object-like parts of the phenomena as a basis for queries and analysis.  If this assumption is not met, then there is no logical basis for the combination of states into zones and processes.

 

This paper did not investigate how scale and the semantics associated with boundary definitions interact.  These relationships should be explored in future work.  Future investigations might also include extending the proposed framework to work with data with a variable temporal resolution.

 

6. Acknowledgements

 

This research was funded by the National Imagery and Mapping Agency (NIMA) through the University Research Initiative Grant NMA202-97-1-1024.  Its contents are solely the responsibility of the authors and do not necessarily represent the official view of the NIMA.


 

7. References

 

Arkansas-Red Basin River Forecast Center, “ABRFC Precipitation Products”, http://www.srh.noaa.gov/abrfc/pcpnpage.html, accessed on 5/24/02

 

Blaschke T, S Lang, E Lorup, J Strobl and P Zeil (2000), "Object oriented image processing in an integrated GIS/Remote Sensing environment and perspectives for environmental applications", in Environmental Information for Planning, Politics and the Public (edited by A Cremers and K Greve), Vol 2, Metropolis-Verlag, Marburg

 

Burrough P. (1996), “Natural Objects with Indeterminate Boundaries”, in Geographic Objects with Indeterminate Boundaries, GISDATA 2 (edited by P Burrough and A Frank), Taylor & Francis, London, 3-28

 

Burrough P. and R. McDonnell (1998), Principles of Geographical Information Systems, Oxford University Press, Oxford

 

Couclelis H. (1992). People manipulate objects (but cultivate fields): beyond the raster-vector  debate in GIS. In Theories and methods of spatio-temporal reasoning in geographic space.  eds. A. U. Frank, I. Campari, and U. Formentini, 65-77. Berlin: Springer Verlag.

 

Davis C and A Laender (1999), "Multiple Representations in GIS: Materialization Through Map Generalization, Geometric, and Spatial Analysis Operations", ACM-GIS 1999, 60-65

 

Egenhofer, M., and Mark, D., (1995), “Naive Geography”,  In Spatial Information Theory: A Theoretical Basis for GIS (edited by A.U. Frank and  W. Kuhn), Berlin: Springer-Verlag, Lecture Notes in Computer Sciences No. 988, pp. 1-15.

 

Erwig M. and M. Schneider (1997), “Vague Regions”,  5th Int. Symp. On  Advances in Spatial Databases (SSD'97), LNCS 1262, 298-320

 

Galton (1997), In S. C. Hirtle and A. U. Frank (eds.), Spatial Information Theory: A Theoretical Basis for GIS (Proceedings of International Conference COSIT'97, Laurel Highlands, Pennsylvania, USA, October 1997), Springer-Verlag, 1997, pages 1-13. ISBN 3-540-63623-4.

 

Hagen M, B. Bartenschlager and U. Finke (1999) “Motion characteristics of thunderstorms in southern Germany”, Meteorological Applications, 6, 227-239

 

Houze R, B Smull and P Dodge (1990) “Mesoscale Organization of Springtime Rainstorms in Oklahoma”, Monthly Weather Review, 118, 613-654

 

Jones C., D. Kidner, L. Luo, L. Bundy and J. Ware (1996) "Database design for a multi-scale spatial information system", International Journal of Geographical Information Systems, 10(8), 901-920

 

Langran G. (1992), Time in Geographic Information Systems, Taylor and Francis, London

 

Mennis J, D Peuquet, and L Qian (2000), "A Conceptual Framework for incorporating cognitive principles into geographical database representation", International Journal of Geographical Information Science, 14(6), 501-520

 

Mountrakis G, P Agouris, and A Stefanidis (2000), "Navigating through hierarchical change propagation in Spatiotemporal Queries", Time 2000 Workshop, IEEE Press, pp. 123-131

 

Parent C., S. Spaccapietra(2000), "Database Integration: the Key to Data Interoperability",  in Advances in Object-Oriented Data Modeling (edited by  M. P. Papazoglou, S. Spaccapietra, Z. Tari), The MIT Press, 2000

 

Parent C., Spaccapietra S., Zimanyi E. (2000) "MurMur: Database Management of Multiple Representations",  AAAI-2000 Workshop on Spatial and Temporal Granularity, Austin, Texas, July 30, 2000

 

Peuquet D. (1988), "Representations of Geographic Space: Toward a Conceptual Synthesis", Annals of the Association of American Geographers, 78(3), 375-394

 

Peuquet D. (1994), “It’s about time: A conceptual framework for the representation of temporal dynamics in geographic information systems”, Annals of the Association of American Geographers, 84(3), 441-461

 

Peuquet D, and L Qian, (1996), “An Integrated Database Design for Temporal GIS”, In M.J. Kraak and M. Molenaar, Advances in GIS Research II, Proceedings of the 7th International Symposium on Spatial Data Handling , Taylor and Francis, London, p 21-31.

 

Peuquet D and E Wentz (1994) "An Approach for Time-Based Analysis of Spatiotemporal Data", Advances in GIS research; proceedings of the Sixth International Symposium on Spatial Data Handling, Edinburgh Scotland, Taylor and Francis, London. 489-504

 

Rigaux P. and  M. Scholl (1994) Multiple Representation Modelling and Querying. In: International Workshop on GIS (J. Nievergelt et al., eds.), LNCS No. 884, Springer-Verlag, Berlin, pp. 59-69.

 

Schiesser H, R Houze and H Huntrieser (1995) “The Mesoscale Structure of Severe Precipitation Systems in Switzerland”, Monthly Weather Review, 123, 2070-2097

 

 

Schmidt J, B Lawrence, B Olsen, “A Comparison of Operational Precipitation Processing Methodologies”, http://www.srh.noaa.gov/abrfc/p1vol.html, accessed on 5/24/02

 

 

Timpf S (1997), “Cartographic objects in a multi-scale data structure”, Geographic Information Research: Bridging the Atlantic. (edited by M Craglia and H Couclelis), Vol 1, Taylor & Francis, 224-234

 

Vangenot, Parent, and Spaccapietra (1999) "Multiple representations and multiple resolutions in geographic databases",  Proceedings of the Advanced Database Symposium (ADBS'99), Tokyo, December 6-7, 1999.

 

 

Wilcox D, M Harwell, and R Orth, (2000), “Modeling Dynamic Polygon Objects in Space and Time:  A New Graph-based Technique”, Cartography and Geographic Information Science, 27(2), 153-164

 

Winter, S (1998) "Bridging Vector and Raster Representation in GIS". In Advances in Geographic Information Systems (edited by R. Laurini, K Makki,and N Pissinou), The Association for Computing Machinery Press, Washington, D.C., 57-62.

 

Yuan M (1996), "Modeling Semantic, temporal, and spatial information in geographic information systems", in Geographic Information Research: Bridging the Atlantic (edited by M Craglia and H Couclelis), Taylor and Francis, London, 334-347

 

Yuan, M. (2001), "Representing Complex Geographic Phenomena with both Object- and Field-like Properties", Cartography and Geographic Information Science, 28(2), 83-96.