
Main Body

1 Lesson 1


Welcome to Lesson 1: Spatial Data Models and the Public Domain

In this lesson, we:

  • Review typical spatial data models in the context of geographic information systems (GIS) today
  • Dig into raster and vector data models, format, and quality
  • Introduce public domain spatial data
  • Review scale, coordinate systems, and projections.

Lesson Topics

This introductory lesson covers three topics and takes approximately 50 minutes to complete. We recommend working through each topic in the order in which they are listed below.

1. Spatial Data Models

GIS Applications

Let us briefly review some common applications of GIS in conservation and natural resources management. With spatial data and GIS, we can ask questions and solve problems such as the following:

GIS circle with layers of data around the GIS
Image © Shutterstock, Inc.

  • Finding locations (where is X?)
  • Finding entities at certain locations (what is at location Y?)
  • Examining spatial relationships (why is object X located at location Y?)
  • Finding best routes
  • Finding most suitable sites
  • Predicting future change
  • Analyzing land cover/use conversions
  • Looking at questions of environmental justice
  • Analyzing policy outcomes

GIS technology integrates spatial information from many sources to answer a multitude of questions.

Geographic Information System written on blackboard
Image © Shutterstock, Inc.

Four Key Parts of GIS

Geographic or spatial data are just one piece of a GIS. Together the system includes the following:

  • Places with locational information, or spatial data (a depiction or model of the real world)
  • A way to organize attributes of the spatial data, or database (such as ArcCatalog)
  • Tools to manipulate and analyze spatial data, or geoprocessing functions (such as ArcToolBox)
  • An interface, so we can make these pieces work together (e.g., QGIS, ArcMap, ArcGIS Pro, or ArcGIS online)

Geographic data are composed of two components, both of which are stored in GIS layers:

  1. Spatial Data refer to the points, lines, and polygons, or raster cells, which represent features on earth (georeferenced to real-world coordinates). Spatial data can be represented using various spatial data models, which are described further in this lesson.
    This badger den could be mapped as a point feature. Image © Shutterstock, Inc.
  2. Attribute Data consist of characteristics or properties that describe spatial entities.
A square parcel of land represented as Raster, Vector, and Real World
Image source: http://mwiki.gichd.org/IM/Types_of_Data

Geographic data models are the various geometric structures for storing geographic spatial data. We use different kinds of data models to:

  • Define the classes of objects that can be represented and how they behave or act within GIS
  • Define the basic design pattern for depicting the real world and how it is abstracted in GIS

Common data models include:

  • Raster
  • Vector
  • TIN (triangulated irregular network)

Geographic Data Models

Land parcel represented by Raster DEM, Vector contours, and TIN
Image source: Bolstad Fig 2-42, www.paulbolstad.net/gisbook.html

Raster Data Models represent spatial data as a matrix or a grid of georeferenced cells (pixels) generated by a regularly spaced sampling of phenomena. They are either discrete (thematic—like land cover classes) or continuous (like soil moisture) and provide wall-to-wall coverage in a data layer.

Graphic showing X and Y coordinates of raster data. Illustration drawn by Ashley Kissick, 2018

Pixel or cell size, or resolution, is important for display, analysis, and operations.

Esri image showing raster cell size and values
Graphic of raster cells and values. Image source: http://help.arcgis.com/en/geodatabase/10.0/sdk/arcsde/concepts/raster/entities/rasters_attr.htm
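
To make the idea of a georeferenced grid more concrete, here is a minimal sketch in Python. It stores a small thematic raster as rows of cell values plus an origin and a cell size, and converts between row/column positions and map coordinates. The class codes, origin, and 30 m cell size are invented for illustration; real raster formats store the same kind of information in their own ways.

```python
import math

# A tiny thematic raster: each value is a hypothetical land cover class code.
GRID = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [4, 3, 3, 2],
]
ORIGIN_X = 500000.0   # x of the upper-left corner (assumed, in meters)
ORIGIN_Y = 4800000.0  # y of the upper-left corner (assumed, in meters)
CELL_SIZE = 30.0      # cell width and height in meters (the resolution)


def cell_center(row, col):
    """Return the map coordinates of the center of a cell."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE  # rows count downward from the top
    return x, y


def value_at(x, y):
    """Return the cell value at a map coordinate, or None outside the grid."""
    col = math.floor((x - ORIGIN_X) / CELL_SIZE)
    row = math.floor((ORIGIN_Y - y) / CELL_SIZE)
    if 0 <= row < len(GRID) and 0 <= col < len(GRID[0]):
        return GRID[row][col]
    return None
```

Because every cell has the same size, the location of any cell follows from the origin and the cell size alone; only the values need to be stored.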

Raster data can be stored in a variety of formats, with grid and image formats being the most common.

Grid format is the native or proprietary raster format used by ESRI software to represent discrete and continuous raster data.

example of thematic raster data
Thematic raster data. Image source: http://desktop.arcgis.com/en/arcmap/10.3/manage-data/raster-and-images/what-is-raster-data.htm
Example of continuous raster data. Image source: http://desktop.arcgis.com/en/arcmap/10.3/manage-data/raster-and-images/what-is-raster-data.htm

Image format collectively means raster data with cells that store brightness values and can be derived from aerial photos, satellite imagery, and scanned maps.

The raster images below illustrate an aerial photo and a scanned topographic quadrangle map.

Images © Shutterstock, Inc.

Raster Data Examples and Applications

Raster data are commonly used for themes that cover the whole map area, such as elevation, temperature, and precipitation. Examples include the following:

Images © Shutterstock, Inc.

These data are widely applied in:

  • Environmental science
  • Change detection
  • Modeling

Vector

The vector data model displays spatial data as points, lines, and polygons, and each entity is georeferenced.

Points represent objects by a single X,Y coordinate pair.

Lines are composed of line segments connected by vertices (nodes) and are defined by the coordinate pairs of the nodes.

Polygons have a perimeter defined by line segments where the start and end node share the same geographic coordinates.

Illustration of vector data. Image source: http://www.catalonia.org/cartografia/Clase_03/Raster_Vector.html
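
The three vector geometry types can be sketched with plain coordinate lists. The example below is only an illustration with made-up planar coordinates in meters; the function names are our own and do not come from any GIS package.

```python
import math

# Hypothetical features; coordinates are planar (x, y) pairs in meters.
den = (2.0, 3.0)                               # point: a single coordinate pair
trail = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]  # line: an ordered list of vertices
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]  # polygon ring


def line_length(vertices):
    """Sum the lengths of the segments between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def is_closed(ring):
    """A polygon ring is closed when its first and last vertices coincide."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def polygon_area(ring):
    """Area of a simple closed ring, computed with the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0
```

Here `line_length(trail)` is 11.0 m and `polygon_area(parcel)` is 12.0 square meters. Note that the polygon repeats its first vertex at the end, which is what "the start and end node share the same geographic coordinates" means in practice.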

Each object represented in vector format has topology.

Topology is the arrangement that constrains how point, line, and polygon features share geometry.

In geometry, topology deals with the properties of a figure that remain unchanged even when the figure is bent, stretched, or otherwise distorted. Vector data can vary in scale, accuracy, and quality. We’ll come back to this in Topic 2.

Illustration of types of topology, from: http://www.esri.com/news/arcnews/summer02articles/arcgis-brings-topology.html
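
One simple topological relationship is adjacency: two polygons are neighbors when they share an edge. The sketch below checks this for two hypothetical parcels. It assumes that shared boundaries use exactly the same vertices in both polygons, which is what a topologically clean dataset provides; the function names are invented for illustration.

```python
def edges(ring):
    """Return the set of undirected edges of a closed ring of (x, y) vertices."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


def shared_edges(ring_a, ring_b):
    """Edges that appear in both rings, regardless of drawing direction."""
    return edges(ring_a) & edges(ring_b)


# Two hypothetical square parcels that meet along the line x = 2.
parcel_a = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
parcel_b = [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)]

common = shared_edges(parcel_a, parcel_b)
are_neighbors = len(common) > 0
```

If one parcel's boundary were digitized slightly off (say, a vertex at (2.01, 2) instead of (2, 2)), the edges would no longer match, and the result would be a gap or overlap of the kind topology rules are meant to catch.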

Vector Data Examples and Applications

Vector data are commonly used for themes that are made of explicit objects, such as parcel ownership data, political boundaries, and streets. Other examples include the following:

Images © Shutterstock, Inc.

Census 2010 map – population density USA and Puerto Rico

Image © Shutterstock, Inc.

TINs

Triangulated Irregular Networks (TINs) connect points to form triangles. Each triangle is a facet, and together the facets efficiently represent complex surfaces.
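
Within a single TIN facet, elevation is commonly estimated by treating the triangle as a tilted plane. The following sketch shows one way to do that with barycentric weights. The vertex coordinates and elevations are hypothetical, and this is an illustration of the idea rather than the method used by any particular GIS package.

```python
def interpolate_in_triangle(p, a, b, c):
    """Estimate z at p = (x, y) inside a triangle with vertices (x, y, z).

    Returns None when p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: the vertices are collinear")
    # Barycentric weights: how strongly each vertex influences point p.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical facet: three surveyed points with elevations in meters.
facet = ((0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0))
z_mid = interpolate_in_triangle((5.0, 5.0), *facet)
```

At a vertex the estimate equals that vertex's elevation, and halfway along the edge between the second and third vertices (`z_mid`) it is their average, 250 m.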

Illustration of TIN data representing elevation. Image source: https://www.pinterest.com/pin/493073859183245047/
Etretat, la Manneporte natural rock arch wonder, cliff and beach. Long exposure photography. Normandy, France. Image © Shutterstock, Inc.

Each data model represents the real world in a different way and is suited to representing specific types of spatial phenomena.

Although TINs are used less often than raster data models, they can describe a surface with uneven relief more effectively and efficiently, because triangles can be large where the terrain is flat and small where it changes quickly. Imagine a plateau (relatively flat) with a cliff-like drop-off.

Topic 1 Knowledge Check

2. Digging into Raster and Vector Data

Raster and vector data models are more commonly used in natural resource applications than TINs. Yet each data model has its own advantages and disadvantages. The choice of raster or vector structure depends on the following:

  • The nature of the earth surface or features being represented
  • The spatial questions to be asked
  • The kind of data available
  • The software tools to be used

Either data model can be converted to, or overlaid with, the other.

The following pages dig into data quality considerations of rasters and vectors, then compare the advantages and disadvantages of each data type, their applications, and considerations for moving between data types.

Data Structure

How the spatial data are structured to represent features on earth influences:

  • The level of precision of that spatial data
  • What technology will be used for viewing and analysis
  • How spatial overlays and modeling will be conducted
  • Whether coordinate transformation or reprojection will be needed

Raster data have a simple data structure that results in large data volume and requires high storage capacity. Topology (or the spatial relationships between objects) cannot be represented in raster data.

Vector data have a compact but complex data structure that enables explicit definition of topology and requires less storage capacity.
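
A quick calculation shows why raster data volume grows so fast. The sketch below counts the cells needed to cover a rectangular extent at a given cell size; the extent and cell sizes are arbitrary examples, and the function name is our own.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover a rectangular extent."""
    n_cols = math.ceil(width_m / cell_size_m)
    n_rows = math.ceil(height_m / cell_size_m)
    return n_rows * n_cols


coarse = cell_count(3000, 3000, 30)  # 100 x 100 cells
fine = cell_count(3000, 3000, 15)    # 200 x 200 cells
```

Halving the cell size from 30 m to 15 m raises the count from 10,000 to 40,000 cells. A vector polygon covering the same square would need only five coordinate pairs, whatever the level of detail elsewhere in the layer.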

Precision

In raster data, object shapes cannot be precisely represented when pixel size is large (resolution is low) because of constraints of the cell shape. Spatial variation can be accommodated through increased spatial resolution.

In vector models, objects can be precisely represented using points, lines, and polygons (areas). However, spatial variation of attributes within a polygon cannot be expressed.

Viewing and Analysis Technology

The simple data structure of cells has allowed the development of inexpensive technology for working with raster data (Bolstad 2016). But the graphical output of raster data products may not be pleasing (e.g., blocky depending on resolution).

On the other hand, due to its complex data structure, technology for viewing and analyzing vector data is more expensive, but graphical output of vector data may be more visually pleasing (smoother edges).

Spatial Overlay and Mathematical Modeling

These analyses are more difficult using vector data, because objects on different layers have different shapes.

On the other hand, modeling and overlay using raster data are easier, because pixels of different layers can be overlaid and added, subtracted, multiplied, and so on.
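
As a small illustration of cell-by-cell overlay, the sketch below combines two aligned raster layers with ordinary arithmetic. The layers and their 0/1 suitability codes are hypothetical, and the `overlay` helper is our own; it assumes both layers share the same extent and cell size.

```python
# Two hypothetical, perfectly aligned layers (1 = suitable, 0 = not suitable).
slope_ok = [
    [1, 1, 0],
    [0, 1, 1],
]
soil_ok = [
    [1, 0, 0],
    [1, 1, 0],
]


def overlay(layer_a, layer_b, op):
    """Apply a cell-by-cell operation to two aligned rasters."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have the same number of rows and columns")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Multiplying keeps only cells suitable in both layers; adding gives a score.
suitable = overlay(slope_ok, soil_ok, lambda a, b: a * b)
score = overlay(slope_ok, soil_ok, lambda a, b: a + b)
```

Because the cells line up one to one, no geometry has to be intersected. The equivalent vector overlay would need to split polygons wherever their boundaries cross.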

Coordinate Transformation

In raster data, coordinate transformation (or map projection) often results in loss of information due to resampling of pixels, and projection transformations are more difficult.

In vector, coordinate and projection transformations are easier, since every object is defined by points.

Vector to raster conversion. Illustration by Ashley Kissick, 2018.

We often need to convert vector data to raster to run overlay or modeling calculations. Likewise, many geoprocessing operations work best with vector data. Converting from vector to raster can result in data loss due to the following:

  • Cell size
    • The choice of pixel size can result in the loss of spatial data, as large cells can mask spatial variation.
  • Loss of attributes
    • When converting from vector to raster data, only one attribute of the polygons can be used to assign values to the new grid cells. Therefore, other attributes will be lost, or will have to be rejoined.
  • Mixed cell problem
    • Occurs when edge pixels may have a mix of class properties. In vector to raster conversion, the grid cells may span two or more polygons, and the resulting cells can only be assigned to one class. This reduces the spatial accuracy of the resulting data.
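
These three effects can be seen in a simplified vector-to-raster conversion. The sketch below assigns each cell the attribute of the polygon that contains the cell center, which is one common assignment rule; real GIS tools offer others. The parcels, attribute names, and extent are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the closed ring of (x, y) vertices?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # this edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, attribute, extent, cell_size, nodata=None):
    """Give each cell the chosen attribute of the polygon holding its center.

    The extent is (width, height) with the lower-left corner at (0, 0).
    """
    width, height = extent
    n_cols = round(width / cell_size)
    n_rows = round(height / cell_size)
    grid = []
    for row in range(n_rows):
        y = height - (row + 0.5) * cell_size  # rows run from the top down
        out_row = []
        for col in range(n_cols):
            x = (col + 0.5) * cell_size
            value = nodata
            for poly in polygons:
                if point_in_polygon(x, y, poly["ring"]):
                    value = poly[attribute]  # only this attribute survives
                    break
            out_row.append(value)
        grid.append(out_row)
    return grid


# Two hypothetical adjacent parcels, each with two attributes.
parcels = [
    {"cover": "forest", "owner": "state",
     "ring": [(0, 0), (5, 0), (5, 4), (0, 4), (0, 0)]},
    {"cover": "field", "owner": "private",
     "ring": [(5, 0), (8, 0), (8, 4), (5, 4), (5, 0)]},
]

coarse = rasterize(parcels, "cover", (8, 4), cell_size=4)
fine = rasterize(parcels, "cover", (8, 4), cell_size=1)
```

With 4-unit cells, the forest parcel (true area 20) is stored as a single 16-unit cell, because the cell spanning x = 4 to 8 mixes both parcels and takes the class found at its center. With 1-unit cells, the parcel boundary happens to fall on a cell edge and the forest area is preserved as 20 cells. In both outputs the owner attribute is gone.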

Converting from raster to vector can be even uglier, and does not ensure that vector output is more accurate. Neighboring pixels with certain attribute values may get translated into new lines or polygons. After the conversion, many spatial problems are likely to arise requiring intensive editing.

How an input raster is vectorized when converted to a polygon feature output. http://pro.arcgis.com/en/pro-app/tool-reference/conversion/raster-to-polygon.htm

Vector and Raster in ArcGIS and QGIS

We will work with both vector and raster data in ArcGIS. Many basic geoprocessing tools are designed to work with vector data, whereas raster data processing happens mostly through the Spatial Analyst extension, which is used to:

  • View and overlay raster images
  • Convert between raster and vector
  • Perform basic spatial analysis with raster data (arithmetic and neighborhood functions)

Vector data in ArcGIS are referred to as feature classes. We will cover this more in the next lesson.

All features in a feature class have the same geometry type, the same attributes, and are located within a common geographic extent.

As you know, raster data in ArcGIS typically use the Esri GRID format.

Graphic of raster cells and values.

Since QGIS is open source and built by the community of users, it supports the use of many different raster file formats, including Esri GRID, GeoTIFF, and ERDAS IMAGINE files. The most used geoprocessing tools for raster data are included in the base installation of QGIS Desktop. QGIS also supports a variety of vector data, including Esri shapefiles (https://docs.qgis.org/2.8/en/docs/user_manual/working_with_raster/supported_data.html).

Screenshot of primary raster toolbar in QGIS.

An important difference between ArcGIS and QGIS is that in QGIS you must specify whether you are importing vector or raster data, whereas ArcGIS does not require this.

Screenshot of how to import data in QGIS.
Screenshot of how to import data in ArcGIS

Topic 2 Knowledge Check


3. Spatial Data in the Public Domain

This topic introduces public domain data, discusses the emergence of the spatial data infrastructure (SDI), introduces national and international portals for spatial data, and covers considerations for utilizing data from these sources.

What is Public Domain Spatial Data?

More and more we can and do rely on publicly available spatial data for GIS analyses. We call this public domain spatial data. In their book, Kerski and Clark (2012, page 10) say you can

“…think of public domain spatial data as publicly accessible information about a spatial theme or phenomenon, the use of which does not infringe the legal rights of an individual or organization.”

As GIS technology grew into a powerful tool throughout the 1980s, local, state, and federal agencies provided spatial data to a variety of users. However, duplication of efforts and a lack of consistency in data generated by different government entities made data sharing and integration difficult.

Federal Geographic Data Committee

To address this problem, the Federal Geographic Data Committee (FGDC, www.fgdc.gov), composed of representatives from federal agencies and the Executive Office, was established in 1990 to coordinate the development, use, sharing, and dissemination of geospatial data on a national basis in the United States.

The Federal Geographic Data Committee (FGDC) is an organized structure of federal geospatial professionals and constituents that provide executive, managerial, and advisory direction and oversight for geospatial decisions and initiatives across the federal government.

(www.fgdc.gov, accessed January 23, 2018)

US National Spatial Data Infrastructure

In concert with FGDC, the US National Spatial Data Infrastructure (NSDI) was organized to coordinate data standards to reuse and share existing spatial data, and is managed by FGDC. Today, …

The NSDI leverages investments in people, technology, data, and procedures to create and provide the geospatial knowledge required to understand, protect, and promote our national and global interests.

(www.fgdc.gov/nsdi/nsdi.html)

The US National Spatial Data Infrastructure (NSDI) is defined as “the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data.”

NSDI Goals

  • Goal 1 – Develop Capabilities for National Shared Services
  • Goal 2 – Ensure Accountability and Effective Development and Management of Federal Geospatial Resources
  • Goal 3 – Convene Leadership of the National Geospatial Community

(www.fgdc.gov/nsdi/nsdi.html, accessed January 23, 2018)

Most importantly, these two institutions, FGDC and NSDI, provide consistency in the United States for geospatial data standards and distribution. Not all areas of the world have this level of federal consistency, though global spatial data institutions are growing.

Metadata

Recall—Metadata is information about data. Geospatial metadata describes . . .

  • How, when, where, and by whom the data were collected
  • Availability and distribution of information, projection, scale, resolution, and accuracy
  • Data reliability with regard to some standard

The Content Standard for Digital Geospatial Metadata is the metadata standard developed by the FGDC (https://www.fgdc.gov/metadata).

Metadata functions to . . .

  • Preserve a history of the data
  • Enable users to assess the suitability of data for a project
  • Ensure accountability for data content
  • Document data and project development and processing
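
To illustrate how metadata helps a user assess suitability, here is a toy example. The record and its keys are invented for illustration; they are not the element names of the Content Standard for Digital Geospatial Metadata, and the check itself is only one simple way to screen a dataset.

```python
# Hypothetical metadata record; the keys are illustrative only.
metadata = {
    "title": "Example land cover layer",
    "collected_by": "Example agency",
    "collected_on": "2016-07-01",
    "coordinate_system": "WGS 84",
    "scale_denominator": 24000,
    "resolution_m": 30.0,
}


def suitability_issues(record, required_fields, max_resolution_m):
    """List the reasons a dataset may not suit a project, based on metadata."""
    issues = [
        f"missing field: {name}"
        for name in required_fields
        if name not in record
    ]
    resolution = record.get("resolution_m")
    if resolution is not None and resolution > max_resolution_m:
        issues.append(
            f"resolution {resolution} m is coarser than {max_resolution_m} m"
        )
    return issues


# A project that needs a documented accuracy and cells of 10 m or finer.
problems = suitability_issues(
    metadata, ["coordinate_system", "accuracy_m"], max_resolution_m=10.0
)
```

For this record the check reports a missing accuracy field and a resolution that is too coarse, which is the kind of screening a user can do before downloading any data.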

Geospatial Platform

Neither the FGDC nor the NSDI was set up to distribute geospatial data resources, so a national geospatial data clearinghouse was organized as a virtual repository of geospatial metadata from various government suppliers such as the United States Geological Survey (USGS). Today, FGDC provides a portal to these resources:

Geospatial Platform is an FGDC initiative that provides shared and trusted geospatial data, services, and applications. Search our massive catalog of geospatial data and tools provided by a multitude of federal agencies. Whether you are a geographic professional, student, teacher, or citizen, you can find data that will help you with your project, assignment, presentation, or concern.

(www.fgdc.gov/dataandservices, accessed 23 Jan 2018)

Regional versus Thematic Data Portals

Data may be organized geographically, such as portals that contain data pertinent to one state or nation. US national data portals include the following:

Alternatively, data may be organized around certain themes, such as the following:

  • The GeoNetwork of the Food and Agriculture Organization
  • Data Basin by the Conservation Biology Institute (we’ll look at this source more soon).

And there are more grassroots open source geospatial data portals, such as:

International Spatial Data Infrastructure

Similarly, a global spatial data infrastructure (GSDI) initiative is also underway to facilitate the creation, maintenance, and sharing of spatial data around the world. This is especially important for projects seeking to address environmental issues that span national and regional boundaries, such as the tsunami that affected many countries in Southeast Asia.

http://gsdiassociation.org/

Image © Shutterstock, Inc.

However, sharing global data across boundaries brings additional challenges and inconsistencies, including the following:

  • Gaps in spatial data and documentation
  • Incompatible spatial datasets
  • Incompatible GIS
  • Limitations to sharing data

The GSDI Association, a consortium of organizations, agencies, companies, and individuals, promotes international cooperation in support of local, national, and international spatial data infrastructure to better address global, social, economic, and environmental issues.

Our purpose is to encourage international cooperation that stimulates the implementation and development of national, regional, and local spatial data infrastructures.

GSDI (gsdiassociation.org, accessed January 23, 2018).

Several standards have been developed for global geospatial data, including those by the International Organization for Standardization, the Open Geospatial Consortium, the World Wide Web Consortium, and others.

A few key international data portals include . . .

Note that international portals may vary in the ability to view, download, or use spatial data.

We will visit some of these as we go through this book series.

Image © Shutterstock, Inc.

Data Ethics and Politics

Accessing and utilizing spatial data across international boundaries carries special ethical and political considerations.

Spatial data depicting the location of and access to specific natural resources and land use may be contested by neighboring governments.

Roma, TX Winding Rio Grande River separating U.S. and Mexico. The right side is Texas, and the left side is Mexico. Photo by Alan Schmierer, via Wikimedia Commons

Differential access to spatial data also means that some individuals or groups may be at a disadvantage in disputes concerning natural resources, land use or ownership rights, and in access to goods and services.

The choice of spatial data should also be accompanied by privacy considerations. Maps are a rich source of information that illustrates patterns and relationships like no other format. This information is becoming increasingly accessible via the Internet and social media, as traditional sources of public data go into the cloud and as many devices are geo-enabled. Increased spatial and temporal resolution means data can be examined at the individual level. We will work with publicly accessible mobile and online data in Lesson 2.

AUCKLAND—DEC 02 2015: Google Street View camera operator at work. It’s a technology featured in Google Maps and Google Earth that provides panoramic views from positions along streets in the world. Shutterstock

Meanwhile, privacy laws do not clearly dictate which personal and government data belongs in the public domain, and individual cases are setting precedent on both sides of privacy arguments.

Check out this Spatial Reserves blog on location data privacy by J. Kerski.

Topic 3 Knowledge Check

Review of Coordinate Systems

What is a Coordinate System?

A coordinate system is a system that uses a common framework to represent locations of geographic features, observations, and imagery. In other words, a coordinate system provides a method for understanding real-world locations.

There are two main types of coordinate systems: geographic and projected.

Geographic Coordinate System

A geographic coordinate system uses spherical coordinates (e.g., decimal degrees) and measures from the center of a 3D sphere, i.e., the center of the earth. Because the surface is spherical, lengths, angles, and areas are not constant; for example, the ground distance between lines of longitude shrinks as we move from the equator to the poles.
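
To put rough numbers on this, the sketch below computes how many kilometers one degree of longitude covers at different latitudes. It assumes a perfectly spherical earth with a mean radius of 6,371 km, which is a simplification, and the function names are our own.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a perfectly spherical earth


def km_per_degree_latitude():
    """Ground distance covered by one degree of latitude (constant on a sphere)."""
    return math.radians(1.0) * EARTH_RADIUS_KM


def km_per_degree_longitude(latitude_deg):
    """Ground distance covered by one degree of longitude at a given latitude."""
    return km_per_degree_latitude() * math.cos(math.radians(latitude_deg))


for latitude in (0, 30, 60, 90):
    print(latitude, round(km_per_degree_longitude(latitude), 1))
```

On this spherical model, one degree of longitude spans about 111 km at the equator, half of that at 60 degrees latitude, and essentially nothing at the poles, while one degree of latitude stays at about 111 km everywhere.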


Projected Coordinate System

A projected coordinate system uses planar coordinates (e.g., feet or meters) to project the earth's spherical surface onto a 2D Cartesian coordinate surface. Projected coordinate systems are always based on geographic coordinate systems, and units of length, angle, and area are constant across the two dimensions. To make these measurements constant, a projected coordinate system includes a map projection, which translates spherical coordinates to planar coordinates using parameters that customize the map for a particular location. No projection is free of distortion, so measurements are most accurate near the area the projection was designed for.

Illustration of casting a shadow of a graticule onto a piece of paper
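
The idea of a projection with location-specific parameters can be sketched with one of the simplest map projections, the equirectangular projection. This is an illustrative choice only, not the projection a GIS would necessarily pick. The example assumes a spherical earth with a radius of 6,371,000 m, and the origin latitude and longitude play the role of the parameters that tailor the projection to a study area.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumes a perfectly spherical earth


def equirectangular(lat_deg, lon_deg, lat_origin_deg, lon_origin_deg):
    """Project latitude/longitude to planar x, y in meters.

    The origin is a projection parameter: x and y are measured from it, and
    east-west distances are scaled to be correct at the origin latitude.
    """
    x = (EARTH_RADIUS_M
         * math.radians(lon_deg - lon_origin_deg)
         * math.cos(math.radians(lat_origin_deg)))
    y = EARTH_RADIUS_M * math.radians(lat_deg - lat_origin_deg)
    return x, y


# A hypothetical study area centered at 44 N, 115 W.
center = equirectangular(44.0, -115.0, 44.0, -115.0)
north = equirectangular(45.0, -115.0, 44.0, -115.0)
```

The origin maps to (0, 0), and a point one degree to the north lands about 111 km up the y axis. The farther a point lies from the origin latitude, the more its east-west distances are distorted, which is why the parameters are chosen to fit the area being mapped.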

How Does this Translate to ArcMap?

So, how does this translate to what you see in ArcMap? Let's imagine we have a dataset with 100 data points that represent the locations of wolf-livestock conflicts in Idaho (i.e., where wolves have killed livestock). Each data point has a latitude and a longitude value recorded in decimal degrees. Those decimal degrees relate to the geographic coordinate system of those points (e.g., WGS-84). So, if we bring those points into ArcMap, the program would know where those points are located on the sphere of Earth. But for you to be able to visualize those points as you would on a flat map (vs. a globe), and with other spatial data, you need to define a projection. Since all the points are in Idaho, you can look for a North American projection, and you may even want something more accurate, such as an Idaho-specific projection. The program will then translate the decimal degrees into planar units so that you can accurately examine values such as the distances between points.
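
To see why decimal degrees cannot simply be treated as planar units, the sketch below measures distances between three hypothetical conflict locations directly on a sphere using the haversine formula. The coordinates are invented, the earth is assumed to be a sphere of radius 6,371 km, and this is not how ArcMap computes distances; it only illustrates the problem that projection solves.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumes a perfectly spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical conflict locations as (latitude, longitude) in decimal degrees.
conflicts = {"A": (44.0, -115.0), "B": (45.0, -115.0), "C": (44.0, -114.0)}

distances = {
    (m, n): haversine_km(*conflicts[m], *conflicts[n])
    for m, n in combinations(sorted(conflicts), 2)
}
```

Points A and B are one degree apart in latitude, and points A and C are one degree apart in longitude, yet A to B is about 111 km while A to C is only about 80 km. A degree is not a fixed distance, which is why the points need a projected coordinate system before planar measurements make sense.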

Wolves hunting a Bison

To learn more about this, look at Esri's online course Understanding Map Projections and Coordinate Systems.

References:

Bolstad, Paul. 2016. GIS Fundamentals, 5th Edition. XanEdu Press. http://www.paulbolstad.net/gisbook.html

Kerski, Joseph and Jill Clark. 2012. The GIS Guide to Public Domain Data. Esri Press. 388 pp.

http://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=219&moduleID=0

http://resources.esri.com/help/9.3/arcgisengine/dotnet/89b720a5-7339-44b0-8b58-0f5bf2843393.htm


License


Collecting and Mapping Data Copyright © 2018 by Janet Silbernagel is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.