1 Lesson 1
Welcome to Lesson 1: Spatial Data Models and the Public Domain
In this lesson, we:
- Review typical spatial data models in the context of geographic information systems (GIS) today
- Dig into raster and vector data models, format, and quality
- Introduce public domain spatial data
- Review scale, coordinate systems, and projections.
Lesson Topics
This introductory lesson covers three topics and takes approximately 50 minutes to complete. We recommend working through each topic in the order in which they are listed below.
1. Spatial Data Models
GIS Applications
Let us briefly review some common applications of GIS in conservation and natural resources management. With spatial data and GIS, we can ask questions and solve problems such as the following:
- Finding locations (where is X?)
- Finding entities at certain locations (what is at location Y?)
- Examining spatial relationships (why is object X located at location Y?)
- Finding best routes
- Finding most suitable sites
- Predicting future change
- Analyzing land cover/use conversions
- Looking at questions of environmental justice
- Analyzing policy outcomes
GIS technology integrates spatial information from many sources to answer a multitude of questions.
Four Key Parts of GIS
Geographic or spatial data are just one piece of a GIS. Together the system includes the following:
- Places with locational information, or spatial data (a depiction or model of the real world)
- A way to organize attributes of the spatial data, or database (such as ArcCatalog)
- Tools to manipulate and analyze spatial data, or geoprocessing functions (such as ArcToolbox)
- An interface, so we can make these pieces work together (e.g., QGIS, ArcMap, ArcGIS Pro, or ArcGIS Online)
Geographic data are composed of two components, both of which are stored in GIS layers:
- Spatial Data refer to the points, lines, and polygons, or raster cells, which represent features on earth (georeferenced to real-world coordinates). Spatial data can be represented using various spatial data models, which are described further in this lesson.
- Attribute Data consist of characteristics or properties that describe spatial entities.
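As a minimal sketch of these two components, a layer can be pictured as a list of features, each pairing a geometry with a set of attributes. The site names, coordinates, and attribute fields below are invented for illustration and do not come from a real dataset or from any particular GIS package.

```python
# A hypothetical layer: each feature pairs spatial data (geometry)
# with attribute data (properties). All values are invented.
layer = [
    {"geometry": {"type": "Point", "coordinates": (-116.20, 43.62)},
     "attributes": {"name": "Site A", "land_cover": "forest", "elev_m": 830}},
    {"geometry": {"type": "Point", "coordinates": (-114.74, 42.56)},
     "attributes": {"name": "Site B", "land_cover": "cropland", "elev_m": 1140}},
    {"geometry": {"type": "Point", "coordinates": (-116.99, 46.73)},
     "attributes": {"name": "Site C", "land_cover": "forest", "elev_m": 790}},
]


def select_by_attribute(features, field, value):
    """Return the features whose attribute `field` equals `value`."""
    return [f for f in features if f["attributes"].get(field) == value]


# An attribute query: which sites are forested?
forest_sites = select_by_attribute(layer, "land_cover", "forest")
forest_names = [f["attributes"]["name"] for f in forest_sites]
print(forest_names)
```

The query touches only the attribute data, yet each selected feature still carries its geometry, so the result can be mapped.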
Geographic data models are the various geometric structures for storing geographic spatial data. We use different kinds of data models to:
- Define the classes of objects that can be represented and how they behave or act within GIS
- Define the basic design pattern for depicting the real world and how it is abstracted in GIS
Common data models include:
- Raster
- Vector
- TIN (triangulated irregular network)
Geographic Data Models
Raster Data Models represent spatial data as a matrix or a grid of georeferenced cells (pixels) generated by a regularly spaced sampling of phenomena. They are either discrete (thematic—like land cover classes) or continuous (like soil moisture) and provide wall-to-wall coverage in a data layer.
Pixel or cell size, or resolution, is important for display, analysis, and operations.
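The sketch below illustrates the idea of a georeferenced grid, assuming a raster is stored as rows of cell values plus the map coordinates of its upper-left corner and a cell size. The class codes, corner coordinates, and 30-unit cell size are assumptions chosen for the example, not values from a real dataset.

```python
# A tiny thematic raster: 1 = forest, 2 = water, 3 = grassland (invented codes).
grid = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [3, 3, 3, 2],
]

# Georeferencing (assumed values): map coordinates of the upper-left corner
# of the grid, and the cell size in map units (e.g., meters).
x_min, y_max = 500000.0, 4800090.0
cell_size = 30.0


def cell_center(row, col):
    """Map coordinates of the center of the cell at (row, col)."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size  # rows count downward from the top
    return x, y


def value_at(x, y):
    """Cell value at map coordinate (x, y), or None outside the grid."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


print(cell_center(0, 0))              # center of the upper-left cell
print(value_at(500100.0, 4800050.0))  # class code at a map location
```

Because every location inside a cell returns the same value, the cell size sets the finest spatial detail the layer can show.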
Raster data can be stored in a variety of formats, with grid and image formats being the most common.
Grid format is the native or proprietary raster format used by Esri software to represent discrete and continuous raster data.
Image format is a collective term for raster data whose cells store brightness values; such data can be derived from aerial photos, satellite imagery, and scanned maps.
The raster images below illustrate an aerial photo and a scanned topographic quadrangle map.
Images © Shutterstock, Inc.
Raster Data Examples and Applications
Raster data are commonly used for themes that cover the whole map area, such as elevation, temperature, and precipitation. Examples include the following:
Images © Shutterstock, Inc.
These data are widely applied in:
- Environmental science
- Change detection
- Modeling
Vector
The vector data model displays spatial data as points, lines, and polygons, and each entity is georeferenced.
Points represent objects by a single X,Y coordinate pair.
Lines are composed of line segments connected by vertices (nodes) and are defined by the coordinate pairs of the nodes.
Polygons have a perimeter defined by line segments, where the start and end nodes share the same geographic coordinates.
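As a rough sketch, the three vector geometry types can be written as coordinate pairs and lists of coordinate pairs. The coordinates below are invented planar map units, and the helper functions are illustrative rather than part of any GIS library.

```python
import math

# Coordinates are in arbitrary planar map units (invented values).
point = (2.0, 3.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
# A polygon ring is closed: the last vertex repeats the first.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(vertices):
    """Sum of the lengths of the segments between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a closed ring using the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


print(line_length(line))
print(polygon_area(polygon))
print(polygon[0] == polygon[-1])  # start and end nodes coincide
```

Length and area follow directly from the stored vertices, which is one reason vector data can represent object shapes precisely.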
Each object represented in vector format has topology.
Topology is the arrangement that constrains how point, line, and polygon features share geometry.
In geometry, topology deals with the properties of a figure that remain unchanged even when the figure is bent, stretched, or otherwise distorted.
Vector data can vary in scale, accuracy, and quality. We’ll come back to this in Topic 2.
Vector Data Examples and Applications
Vector data are commonly used for themes that are made of explicit objects, such as parcel ownership data, political boundaries, and streets. Other examples include the following:
Images © Shutterstock, Inc.
Image © Shutterstock, Inc.
TINs
Triangulated Irregular Networks (TINs) connect irregularly spaced points into triangles, each of which is a facet of the surface. This lets TINs represent complex surfaces efficiently.
Each data model represents the real world in a different way and is suited to representing specific types of spatial phenomena.
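Although TINs are used less often than raster data models, they can more effectively and efficiently describe a surface whose relief changes unevenly from place to place. Imagine a relatively flat plateau with a cliff-like drop-off: a TIN can cover the plateau with a few large triangles and the cliff with many small ones.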
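To make the idea of a facet concrete, the sketch below estimates elevation inside a single triangle by linear (barycentric) interpolation between its three vertices. This is one common way to read a value off a TIN facet; the function name and the vertex values are assumptions for illustration, not the method of any specific GIS package.

```python
def interpolate_z(triangle, x, y):
    """Linearly interpolate elevation at (x, y) inside one TIN facet.

    `triangle` is three (x, y, z) vertices. Returns None if the point
    lies outside the triangle or the triangle is degenerate.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # the three vertices fall on one line
    # Barycentric weights: each vertex's share of the interpolated value.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # point is outside this facet
    return w1 * z1 + w2 * z2 + w3 * z3


# One invented facet: two vertices on a plateau edge (100 m), one below (50 m).
facet = [(0.0, 0.0, 100.0), (10.0, 0.0, 100.0), (0.0, 10.0, 50.0)]
print(interpolate_z(facet, 0.0, 5.0))    # halfway down one edge
print(interpolate_z(facet, 20.0, 20.0))  # outside the facet
```

Each facet is a flat, tilted plane, so a surface needs many small facets only where slope changes quickly.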
Topic 1 Knowledge Check
2. Digging into Raster and Vector Data
Raster and vector data models are more commonly used in natural resource applications than TINs. Yet each data model has its own advantages and disadvantages. The choice of raster or vector structure depends on the following:
- The nature of the earth surface or features being represented
- The spatial questions to be asked
- The kind of data available
- The software tools to be used
Either data model can be converted to, or overlaid with, the other.
The following pages dig into data quality considerations of rasters and vectors, then compare the advantages and disadvantages of each data type, their applications, and considerations for moving between data types.
Data Structure
How the spatial data are structured to represent features on earth influences the following:
- The level of precision of that spatial data
- What technology will be used for viewing and analysis
- How spatial overlays and modeling will be conducted
- If coordinate transformation or reprojection will be needed
Raster data have a simple data structure that results in large data volume and requires high storage capacity. Topology (or the spatial relationships between objects) cannot be represented in raster data.
Vector data have a compact but complex data structure that enables explicit definition of topology and requires less storage capacity.
Precision
In raster data, object shapes cannot be precisely represented when pixel size is large (resolution is low) because of constraints of the cell shape. Spatial variation can be accommodated through increased spatial resolution.
In vector models, objects can be precisely represented using points, lines, and polygons (areas). However, spatial variation of attributes within a polygon cannot be expressed.
Viewing and Analysis Technology
The simple data structure of cells has allowed the development of inexpensive technology for working with raster data (Bolstad 2016). But the graphical output of raster data products may not be pleasing (e.g., blocky depending on resolution).
On the other hand, because vector data have a more complex data structure, technology for viewing and analyzing them is more expensive, but the graphical output of vector data may be more visually pleasing (smoother edges).
Spatial Overlay and Mathematical Modeling
These analyses are more difficult using vector data, because objects on different layers have different shapes.
On the other hand, modeling and overlay using raster data are easier, because pixels of different layers can be overlaid and added, subtracted, multiplied, and so on.
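A minimal sketch of such a cell-by-cell overlay follows, assuming two rasters that share the same extent and cell size. The layer names and the 0/1 values are invented, and the `overlay` helper is written for this example only.

```python
# Two aligned rasters with the same extent and cell size (invented values).
# 1 = the cell meets the criterion, 0 = it does not.
slope_ok = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
landcover_ok = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]


def overlay(a, b, op):
    """Combine two aligned rasters cell by cell with the function `op`."""
    return [[op(va, vb) for va, vb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


# Multiplying keeps only cells that meet both criteria.
suitable = overlay(slope_ok, landcover_ok, lambda s, c: s * c)
# Adding counts how many criteria each cell meets.
score = overlay(slope_ok, landcover_ok, lambda s, c: s + c)

for row in suitable:
    print(row)
```

The operation needs no geometry at all: cells at the same row and column are simply combined, which is why aligned rasters are convenient for modeling.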
Coordinate Transformation
In raster data, coordinate transformation (or map projection) often results in loss of information due to resampling of pixels, and projection transformations are more difficult.
In vector data, coordinate and projection transformations are easier, since every object is defined by coordinate pairs that can be transformed one by one.
We often need to convert vector data to raster to run overlay or modeling calculations. Likewise, many geoprocessing operations work best with vector data. Converting from vector to raster can result in data loss due to the following:
- Cell size: The choice of pixel size can result in the loss of spatial data, as large cells can mask spatial variation.
- Loss of attributes: When converting from vector to raster data, only one attribute of the polygons can be used to assign values to the new grid cells. Therefore, other attributes will be lost, or will have to be rejoined.
- Mixed cell problem: This occurs when edge pixels have a mix of class properties. In vector to raster conversion, a grid cell may span two or more polygons, but the resulting cell can only be assigned to one class. This reduces the spatial accuracy of the resulting data.
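These three effects can be sketched with a simple rasterizer that gives each cell the value of the polygon containing the cell center. The cell-center rule is only one possible assignment rule and is an assumption here, as are the polygon shapes, attribute names, and helper functions.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the closed ring?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, field, x_min, y_max, n_rows, n_cols, cell_size):
    """Give each cell the `field` value of the polygon containing its center."""
    grid = []
    for row in range(n_rows):
        out_row = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            cy = y_max - (row + 0.5) * cell_size
            value = None
            for poly in polygons:
                if point_in_ring(cx, cy, poly["ring"]):
                    value = poly["attributes"][field]  # one attribute only
                    break
            out_row.append(value)
        grid.append(out_row)
    return grid


def rect(x0, x1, y0=0.0, y1=4.0):
    """Closed rectangular ring (helper for this example only)."""
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)]


# Invented polygons: a narrow strip of water between forest and grass.
polygons = [
    {"ring": rect(0.0, 2.2), "attributes": {"cover": "forest", "owner": "state"}},
    {"ring": rect(2.2, 3.4), "attributes": {"cover": "water", "owner": "public"}},
    {"ring": rect(3.4, 8.0), "attributes": {"cover": "grass", "owner": "private"}},
]

fine = rasterize(polygons, "cover", 0.0, 4.0, 4, 8, 1.0)    # 1-unit cells
coarse = rasterize(polygons, "cover", 0.0, 4.0, 1, 2, 4.0)  # 4-unit cells
print(fine[0])
print(coarse[0])
```

With 1-unit cells the water strip survives as a single column of cells, but with 4-unit cells it disappears because no cell center falls inside it. In both grids the `owner` attribute is gone, since only `cover` was carried into the cells.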
Converting from raster to vector can be even more problematic, and does not ensure that the vector output is more accurate. Neighboring pixels with certain attribute values may get translated into new lines or polygons. After the conversion, many spatial problems are likely to arise that require intensive editing.
Vector and Raster in ArcGIS and QGIS
We will work with both vector and raster data in ArcGIS. Many basic geoprocessing tools are designed to work with vector data, whereas raster data processing happens mostly through the Spatial Analyst extension, which is used to:
- View and overlay raster images
- Convert between raster and vector
- Perform basic spatial analysis with raster data (arithmetic and neighborhood functions)
Vector data in ArcGIS are stored as feature classes. We will cover this more in the next lesson.
As you know, raster data in ArcGIS typically use the Esri GRID format.
Since QGIS is open source and built by the community of users, it supports the use of many different raster file formats, including Esri GRID, GeoTIFF, and ERDAS IMAGINE files. The most used geoprocessing tools for raster data are included in the base installation of QGIS Desktop. QGIS also supports a variety of vector data, including Esri shapefiles (https://docs.qgis.org/2.8/en/docs/user_manual/working_with_raster/supported_data.html).
An important difference between ArcGIS and QGIS is that in QGIS you must specify whether you are importing vector or raster data, whereas ArcGIS does not need prior specification.
Topic 2 Knowledge Check
3. Spatial Data in the Public Domain
This topic introduces public domain data, discusses the emergence of the spatial data infrastructure (SDI), introduces national and international portals for spatial data, and covers considerations for utilizing data from these sources.
What is Public Domain Spatial Data?
More and more we can and do rely on publicly available spatial data for GIS analyses. We call this public domain spatial data. In their book, Kerski and Clark (2012, page 10) say you can
“…think of public domain spatial data as publicly accessible information about a spatial theme or phenomenon, the use of which does not infringe the legal rights of an individual or organization.”
As GIS technology grew into a powerful tool throughout the 1980s, local, state, and federal agencies provided spatial data to a variety of users. However, duplication of effort and a lack of consistency in data generated by different government entities made data sharing and integration difficult.
Federal Geographic Data Committee
To address this problem, the Federal Geographic Data Committee (FGDC, www.fgdc.gov), composed of representatives from federal agencies and the Executive Office, was established in 1990 to coordinate the development, use, sharing, and dissemination of geospatial data on a national basis in the United States.
The Federal Geographic Data Committee (FGDC) is an organized structure of federal geospatial professionals and constituents that provide executive, managerial, and advisory direction and oversight for geospatial decisions and initiatives across the federal government.
(www.fgdc.gov, accessed January 23, 2018)
US National Spatial Data Infrastructure
In concert with FGDC, the US National Spatial Data Infrastructure (NSDI) was organized to coordinate data standards so that existing spatial data can be reused and shared. The NSDI is managed by FGDC. Today, …
The NSDI leverages investments in people, technology, data, and procedures to create and provide the geospatial knowledge required to understand, protect, and promote our national and global interests.
The US National Spatial Data Infrastructure (NSDI) is defined as “the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data.”
NSDI Goals
- Goal 1 – Develop Capabilities for National Shared Services
- Goal 2 – Ensure Accountability and Effective Development and Management of Federal Geospatial Resources
- Goal 3 – Convene Leadership of the National Geospatial Community
(www.fgdc.gov/nsdi/nsdi.html, accessed January 23, 2018)
Most importantly, these two institutions, FGDC and NSDI, provide consistency in the United States for geospatial data standards and distribution. Not all areas of the world have this level of federal consistency, though global spatial data institutions are growing.
Metadata
Recall—Metadata is information about data. Geospatial metadata describes . . .
- How, when, where, and by whom the data were collected
- Availability and distribution of information, projection, scale, resolution, and accuracy
- Data reliability with regard to some standard
The Content Standard for Digital Geospatial Metadata (CSDGM) is the metadata standard developed by the FGDC (see https://www.fgdc.gov/metadata and the FGDC factsheet).
Metadata functions to . . .
- Preserve a history of the data
- Enable users to assess the suitability of data for a project
- Ensure accountability for data content
- Document data and project development and processing
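As a rough illustration of how metadata supports a suitability check, the sketch below compares a dataset's documented projection and resolution with a project's requirements. The record, its field names, and the function are hypothetical; the field names are not the element names of the FGDC metadata standard.

```python
# A hypothetical metadata record. Field names are illustrative only and are
# not the element names of the FGDC metadata standard.
metadata = {
    "title": "Example land cover layer",
    "collected": "2015",
    "source": "Example agency",
    "projection": "UTM zone 11N",
    "scale_denominator": 24000,
    "resolution_m": 30,
}


def unmet_requirements(record, required_projection, max_resolution_m):
    """List the reasons a dataset does not meet a project's needs."""
    problems = []
    if record.get("projection") != required_projection:
        problems.append("projection differs; reprojection needed")
    resolution = record.get("resolution_m")
    if resolution is None:
        problems.append("resolution not documented")
    elif resolution > max_resolution_m:
        problems.append("resolution too coarse")
    return problems


# A project that needs 10 m cells or finer in UTM zone 11N.
print(unmet_requirements(metadata, "UTM zone 11N", 10))
```

A check like this is only possible when the metadata are complete, which is why undocumented fields are reported as problems rather than ignored.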
Geospatial Platform
Neither the FGDC nor the NSDI was set up to distribute geospatial data resources, so a national geospatial data clearinghouse was organized as a virtual repository of geospatial metadata from various government suppliers such as the United States Geological Survey (USGS). Today, FGDC provides a portal to these resources:
Geospatial Platform is an FGDC initiative that provides shared and trusted geospatial data, services, and applications. Search our massive catalog of geospatial data and tools provided by a multitude of federal agencies. Whether you are a geographic professional, student, teacher, or citizen, you can find data that will help you with your project, assignment, presentation, or concern.
(www.fgdc.gov/dataandservices, accessed 23 Jan 2018)
Regional versus Thematic Data Portals
Data may be organized geographically, such as portals that contain data pertinent to one state or nation. US national data portals include the following:
Alternatively, data may be organized around certain themes, such as the following:
- The GeoNetwork of the Food and Agriculture Organization
- Data Basin by the Conservation Biology Institute (we’ll look at this source more soon).
And there are more grassroots open source geospatial data portals, such as:
International Spatial Data Infrastructure
Similarly, a global spatial data infrastructure (GSDI) initiative is also underway to facilitate the creation, maintenance, and sharing of spatial data around the world. This is especially important for projects seeking to address environmental issues that span national and regional boundaries, such as the tsunami that affected many countries in Southeast Asia.
However, sharing data across national boundaries raises additional challenges, including the following:
- Gaps in spatial data and documentation
- Incompatible spatial datasets
- Incompatible GIS
- Limitations to sharing data
The GSDI Association, a consortium of organizations, agencies, companies, and individuals, promotes international cooperation in support of local, national, and international spatial data infrastructure to better address global, social, economic, and environmental issues.
Our purpose is to encourage international cooperation that stimulates the implementation and development of national, regional, and local spatial data infrastructures.
—GSDI (gsdiassociation.org, accessed January 23, 2018).
Several standards have been developed for global geospatial data, including those by the International Organization for Standardization, the Open Geospatial Consortium, the World Wide Web Consortium, and others.
A few key international data portals include . . .
Note that international portals may vary in the ability to view, download, or use spatial data.
We will visit some of these as we go through this book series.
Data Ethics and Politics
Accessing and utilizing spatial data across international boundaries carries special ethical and political considerations.
Spatial data depicting the location of and access to specific natural resources and land use may be contested by neighboring governments.
Differential access to spatial data also means that some individuals or groups may be at a disadvantage in disputes concerning natural resources, land use or ownership rights, and in access to goods and services.
The choice of spatial data should also be accompanied by privacy considerations. Maps are a rich source of information that illustrates patterns and relationships like no other format. This information is becoming increasingly accessible via the Internet and social media, as traditional sources of public data go into the cloud and as many devices are geo-enabled. Increased spatial and temporal resolution means data can be examined at the individual level. We will work with publicly accessible mobile and online data in Lesson 2.
Meanwhile, privacy laws do not clearly dictate which personal and government data belong in the public domain, and individual cases are setting precedent on both sides of privacy arguments.
Check out this Spatial Reserves blog on location data privacy by J. Kerski.
Topic 3 Knowledge Check
Review of Coordinate Systems
What Is a Coordinate System?
A coordinate system is a common framework used to represent the locations of geographic features, observations, and imagery. In other words, a coordinate system provides a method for understanding real-world locations.
There are two main types of coordinate systems: geographic and projected.
Geographic Coordinate System
A geographic coordinate system uses spherical coordinates (e.g., latitude and longitude in decimal degrees), which are angles measured from the center of a 3D sphere, i.e., the center of the earth. Because the surface is spherical, lengths, angles, and areas are not constant across it. For example, the ground distance spanned by one degree of longitude shrinks as we move from the equator to the poles.
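The effect of latitude on degree-based distances can be sketched with a simple calculation. It assumes a perfectly spherical earth with a mean radius of 6371 km, which is an approximation; the function name is chosen for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the earth as a sphere


def km_per_degree_longitude(latitude_deg):
    """Approximate ground distance spanned by 1 degree of longitude."""
    circumference = 2 * math.pi * EARTH_RADIUS_KM
    return circumference / 360.0 * math.cos(math.radians(latitude_deg))


for lat in (0, 45, 60, 90):
    print(lat, round(km_per_degree_longitude(lat), 1))
```

Under this spherical assumption, a degree of longitude spans about 111 km at the equator, half of that at 60 degrees latitude, and essentially nothing at the poles.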
Projected Coordinate System
A projected coordinate system uses planar coordinates (e.g., feet or meters) to represent the earth's spherical surface on a 2D Cartesian coordinate surface. Projected coordinate systems are always based on geographic coordinate systems, and their units of length, angle, and area are constant across the two dimensions. To achieve this, a projected coordinate system includes a map projection, which translates spherical coordinates to planar coordinates using parameters that tailor the map to a particular location. Every map projection distorts shape, area, distance, or direction to some degree, so the parameters are chosen to keep distortion small in the area of interest.
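As one concrete example of a map projection, the sketch below applies the forward spherical Mercator formulas, which turn longitude and latitude in degrees into planar x and y in meters. Mercator is used only because its formulas are short; it is not a recommendation for any particular study area, and the sphere radius is the value used by the common "Web Mercator" convention.

```python
import math

SPHERE_RADIUS_M = 6378137.0  # sphere radius used by the "Web Mercator" convention


def spherical_mercator(lon_deg, lat_deg):
    """Project longitude/latitude (degrees) to planar x/y (meters)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = SPHERE_RADIUS_M * lam
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


# An invented point in decimal degrees.
x, y = spherical_mercator(-116.0, 44.0)
print(round(x), round(y))
```

The output is in meters on a flat plane, so ordinary Cartesian geometry applies, although Mercator increasingly exaggerates distances and areas away from the equator.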
How Does This Translate to ArcMap?
So, how does this translate to what you see in ArcMap? Let's imagine we have a dataset with 100 data points that represent the locations of wolf-livestock conflicts in Idaho (i.e., where wolves have killed livestock). Each data point has a latitude and a longitude value recorded in decimal degrees. Those decimal degrees relate to the geographic coordinate system of those points (e.g., WGS 84). So, if we bring those points into ArcMap, the program would know where those points are located on the sphere of the earth. But for you to visualize those points as you would on a flat map (vs. a globe), and alongside other spatial data, you need to define a projection. Since all the points are in Idaho, you can look for a North American projection, and you may even want something more accurate, such as an Idaho-specific projection. The program will then translate the decimal degrees into planar units so that you can accurately examine values such as the distances between points.
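To see why decimal degrees alone are a poor basis for measuring distance, the sketch below compares a naive difference in degrees with the great-circle distance from the haversine formula. It assumes a spherical earth with a mean radius of 6371 km, and the coordinates are invented points rather than real conflict locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the earth as a sphere


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lam = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_degree_distance(lon1, lat1, lon2, lat2):
    """Straight-line 'distance' in degrees, as if degrees were planar units."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


# Two pairs of invented points, each exactly one degree apart.
east_west = haversine_km(-116.0, 44.0, -115.0, 44.0)
north_south = haversine_km(-116.0, 44.0, -116.0, 45.0)
print(naive_degree_distance(-116.0, 44.0, -115.0, 44.0), round(east_west, 1))
print(naive_degree_distance(-116.0, 44.0, -116.0, 45.0), round(north_south, 1))
```

Both pairs are one degree apart, yet at this latitude the east-west pair is roughly 80 km apart on the ground and the north-south pair roughly 111 km, which is the kind of discrepancy a suitable projected coordinate system is meant to remove.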
To learn more about this, look at Esri's online course Understanding Map Projections and Coordinate Systems.
References:
Bolstad, Paul. 2016. GIS Fundamentals, 5th Edition. XanEdu Press. http://www.paulbolstad.net/gisbook.html
Kerski, Joseph and Jill Clark. 2012. The GIS Guide to Public Domain Data. Esri Press. 388 pp. http://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=219&moduleID=0
http://resources.esri.com/help/9.3/arcgisengine/dotnet/89b720a5-7339-44b0-8b58-0f5bf2843393.htm