G.+I.+S. on the Net


BAAMA Presentation on June-28-1996

U.S. Bureau of Reclamation Talk on Nov-6-96

G. + I. + S. on the Internet

By: Olof Hansen, U.S. Department of the Interior, Bureau of Reclamation

Abstract: The presentation dealt with the three basic components into
which G.I.S. can be split.  These components are the geographic
component (the "pretty map syndrome"), the information component
(making decisions based on sound data), and the system component (the
computer, hardware, and technology issues).  All these aspects of
G.I.S. were presented with examples of homepages; the focus was on
homepages dealing with environmental data collected in the San
Francisco Bay estuary.

The presentation split up GIS into its three major components:

G.eographic:
This part dealt with maps and coverages, and used some US Geological Survey home pages. The example used some Land use/land cover data from a homepage of USGS.
I.nformation:
The middle part introduced the process that leads from data through information and knowledge to decision making, using the homepage of the Interagency Ecological Program (IEP).
S.ystem:
The last part dealt with technical concepts such as computer systems, hardware, data base design, RDBMS, access, and querying databases. These concepts were explained via the IEP Real Time Monitoring homepage.
The whole presentation used the San Francisco Bay area as home base (for the home pages, sorry for the pun!) to illustrate the concepts raised.

G.eographic

The word "geography" has its origin in the combination of two Greek words:

1.) GEOS - Earth

2.) GRAPHIKOS - Able to draw or paint

We normally associate these two ideas, drawing Mother Earth, with maps. The #1 mapmaking agency in the U.S. is, of course, the United States Geological Survey (USGS for short) and its National Mapping Division. The presentation showed an example of getting land use/land cover (LULC) geographic data from a homepage operated by the Eastern Region of USGS.

I used the LULC geographic search option to locate LULC data for the SF Bay Area. This option displays maps of the US, then of 1:100K areas. After I moved the mouse to the part of the map I was interested in and clicked on it, the geographic retrieval process accessed this file:

LULC- SF at 1:100,000-scale Land Use Land Cover for the latitude/longitude coordinates: N37.73 W122.42

These data can be copied once you have turned on the "Load to Local Disk" option located under the Options menu. LU/LC data come in several layers, such as political boundaries, hydrography, etc. These LU/LC file types exist.

In this presentation, I used LULC hydrography data for San Francisco to create a map with red and green boxes indicating the locations of environmental monitoring sites (water quality samples). These sites were downloaded from another database, and combined with the LULC hydrography to show the Northern tip of the San Francisco peninsula.

I.nformation

For this part, I introduced the audience to the concept of information as a communication process: the idea/concept/thought/word of one person needs to be communicated to the other person, and both persons' concepts need to match reality. I used a fish as an example.

For further discussion, I introduced the Interagency Ecological Program of the San Francisco Bay-Delta estuary, or IEP for short. The IEP is the largest program generating environmental monitoring data for an estuary in the WORLD. Data include biological, chemical, physical, meteorological, and model observations, among others.

Since the IEP is interagency in nature, it is comprised of the following member organizations:

Six federal agencies: Bureau of Reclamation, Fish and Wildlife Service, Geological Survey, Corps of Engineers, National Marine Fisheries Service, and Environmental Protection Agency; three state agencies: Department of Water Resources, Department of Fish and Game, and State Water Resources Control Board; and one non-governmental organization (NGO): San Francisco Estuary Institute.

The current information base of IEP (status: Summer 1996) can be summarized as follows:

The IEP and its member organizations collect environmental information (biological, chemical, physical, operations data) at over 800 monitoring sites in the estuary.

All data collected are put on a file server. The data are documented with a lot of metadata, i.e. data about data. The metadata serve to explain the information contained in the data files. Thus, without metadata the end user cannot interpret or analyze the data. The following metadata and data file types exist on the IEP server:

  1. Brief Overview of the Project (READ ME File)
  2. Metadata with Quality Control Information (DOC File)
  3. Format Files/Header/Legend (FMT File)
  4. Reference Tables/Look-up References (REF File)
  5. Raw Data Files in ASCII Format

In order to be able to interpret raw data you need to have enough data about data (metadata). One type of metadata is the format file:

The format files describe the structure of each type of data file listed under a particular IEP program element. As a minimum, the IEP format files may provide the following information:

The metadata files of all IEP data give a comprehensive overview of each program element. They describe purpose, period of record, field protocols, lab analysis procedures, quality control/quality assurance, and geographic range of monitoring for each element.

The IEP reference files may contain cross-reference tables such as:

Explanations of fields in the data files, units of measurement, instrumentation maximum detection levels, taxonomic identification codes for biological organisms, latitude/longitude for stations with description of site location, sample collection gear types, and codes for chemical parameters.

The following paragraph is an example of an IEP documentation file (DOC). (All DOC files come with a contact person's name and phone number for follow-up questions):

San Francisco Bay Monitoring

Contact Person: Kathryn Hieb (209) 942-6078

Otter and midwater trawls have been used since 1980 to track the abundance and distribution of marine and estuarine organisms at 52 stations from South San Francisco Bay to the lower Sacramento and San Joaquin rivers. Fish, caridean shrimp, and brachyuran crab catch and size frequency data, salinity and water temperature profile data, and water transparency data are collected monthly at each station. Additional sampling is conducted with baited ringnets at 12 stations for Cancer crabs. Relationships between annual abundance and environmental conditions have been developed for several species, including Crangon franciscorum, longfin smelt, Pacific herring, and starry flounder. The Fall Midwater Trawl (FMWT) survey described in the previous section tracks abundance trends of estuarine species from mid San Pablo Bay through the Delta. The FMWT survey has a longer period of record (since 1967) and samples more of the Delta than the San Francisco Bay Monitoring survey.

The DOC file need not always be a text file. The DOC file could also be a geographic display/map of the element. E.g., a map with SF Bay monitoring sites: stations of tows and trawls from research vessels.

In order to load a raw data file into a software package for further analysis, you as an end user need to understand the structure/format of the data. An example of an IEP format file follows:

Data Format for Midwater Trawl Data

File                        Created   Size  Recno
Midwater trawl (MWTOW.TXT)  07/25/95  563   5895

File Description: Data specific to each midwater trawl effort

Variable List

Variable    Type       Width  Description
RKI         CHARACTER  10
STATION     NUMERIC    3
DATE        NUMERIC    8
TIME        NUMERIC    4
DEPTH       NUMERIC    4.1
TOW         NUMERIC    4      see below
BEARING     NUMERIC    3      compass direction of tow
DISTANCE    NUMERIC    4      distance traveled
DIRECTION   NUMERIC    1      direction relative tide 1=with 2=against 3=neither
CATCHCODE   NUMERIC    1      see below
TOTALMETER  NUMERIC    5      flowmeter revolutions during tow
STARTLAT    NUMERIC    6      latitude at start dd mm.dm
STARTLONG   NUMERIC    7      longitude at start ddd mm.dm
ENDLAT      NUMERIC    6      latitude at end dd mm.dm
ENDLONG     NUMERIC    7      longitude at end ddd mm.dm
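
The STARTLAT/ENDLAT and STARTLONG/ENDLONG fields above are given in degrees and decimal minutes (dd mm.dm), while most mapping software expects decimal degrees. The following Python sketch shows the conversion. It assumes the value arrives as text with a space between degrees and minutes; how the positions are actually packed into the 6- and 7-character numeric fields is not shown here, and the function name and sample values are hypothetical.

```python
def dm_to_decimal_degrees(text, west_or_south=False):
    """Convert 'dd mm.dm' (degrees, decimal minutes) to decimal degrees.

    Assumption: degrees and minutes are separated by whitespace.
    """
    degrees_part, minutes_part = text.split()
    degrees = int(degrees_part)
    minutes = float(minutes_part)
    if not 0 <= minutes < 60:
        raise ValueError(f"minutes out of range: {minutes}")
    value = degrees + minutes / 60.0
    # Western longitudes and southern latitudes are negative by convention.
    return -value if west_or_south else value


# Illustrative position near San Francisco (not taken from an IEP file).
lat = dm_to_decimal_degrees("37 43.80")
lon = dm_to_decimal_degrees("122 25.20", west_or_south=True)
print(round(lat, 4), round(lon, 4))
```

Passing the hemisphere as a flag keeps the parsing separate from the sign convention, since the format file does not say whether the sign is stored with the value.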

The actual data, collected out in the field and checked for errors, may look like the following example.

And yes, these are data! Raw, but real data!

"CHSSFB36",101,19930216,"",13.4,1,1205,3

"CHSSFB36",101,19930216,"",13.4,1,1514,44

"CHSSFB36",101,19930216,"",13.4,1,2509,1

"CHSSFB36",101,19930216,"",13.4,1,7138,6

"CHSSFB33",102,19930216,"",3.7,1,1514,1

"CHSSFB33",102,19930216,"",3.7,1,2692,2

"CHSSFB20",104,19930217,"",4.0,1,1514,9

etc.....
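
Records like these are comma-delimited text with quoted character fields, so a standard CSV parser can read them. The following Python sketch is one way to do it. Each record in the listing has eight fields, and the format file for this particular listing is not shown, so the names given to the first five positions are an assumption borrowed from the midwater trawl variable list; the remaining fields are kept as unnamed integers.

```python
import csv
import io
from datetime import datetime

# Three records copied from the listing above.
RAW = '''"CHSSFB36",101,19930216,"",13.4,1,1205,3
"CHSSFB33",102,19930216,"",3.7,1,1514,1
"CHSSFB20",104,19930217,"",4.0,1,1514,9
'''


def parse_records(text):
    """Yield one dict per record; the names for fields 1-5 are assumed."""
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        yield {
            "rki": row[0],
            "station": int(row[1]),
            "date": datetime.strptime(row[2], "%Y%m%d").date(),
            "time": row[3] or None,  # empty in these records
            "depth": float(row[4]),
            "rest": [int(v) for v in row[5:]],  # meaning not documented here
        }


records = list(parse_records(RAW))
for rec in records:
    print(rec["station"], rec["date"].isoformat(), rec["depth"], rec["rest"])
```

With the matching FMT and REF files at hand, the assumed names and the unnamed trailing fields would be replaced by the documented ones.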

In order to get from data to decision-making you need to have the metadata available, and analyze the raw data to generate information and then knowledge useful to a decision maker. The following plot was used for that purpose:

Salinity Analysis at Port Chicago

Roughly ten thousand daily electrical conductivity readings, covering over 20 years' worth of data, were plotted against time, and a statistical trend regression line was generated.
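
A trend line of this kind is commonly fitted by ordinary least squares. The following Python sketch shows the arithmetic on a handful of made-up conductivity values; the data, the names, and the choice of ordinary least squares are assumptions for illustration, not the actual Port Chicago analysis.

```python
def fit_trend(times, values):
    """Ordinary least-squares line: value = intercept + slope * time."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    if sxx == 0:
        raise ValueError("all times are identical")
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical readings: day number and electrical conductivity.
days = [0, 1, 2, 3, 4]
conductivity = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_trend(days, conductivity)
print(f"trend: {slope:.2f} units per day, starting near {intercept:.2f}")
```

The sign and size of the slope are what a decision maker would read off such a plot: whether the measured quantity is rising or falling over the period of record, and how fast.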

The last part of the paper presented the systematic aspects of a GIS as well as the fourth dimension in the system: time.

S.ystem

All computerized data base systems are designed with input and output features as well as the structure of the data base. This part of the paper describes query features for an end user to get environmental data about the San Francisco Bay estuary. This part of the paper also deals with time:

The Fourth Dimension: X,Y,Z & T

Man-made and natural changes to an ecosystem can be, and must be measured over time, in order to provide forecasting and planning for the future. In order to present this concept, I showed the IEP real time monitoring homepage.

Any data which are on the net and available to the world need to be presented with descriptors about the data quality. E.g., the IEP real time monitoring disclaimer describes the following precautions:

Direct quantitative comparisons of effort corrected catch (catch per 10,000 cubic meters of water sampled) should not be made between the Real Time Monitoring sites where different gear types were used, e.g. between a site where a Kodiak trawl was used and a site where a Midwater trawl was used or between a trawl site and one or both of the water transfer facilities (SWP or CVP). Each gear type has its own sampling efficiencies and caveats as do the state and federal water projects. Qualitative comparisons of trends between the RTM data and the water transfer facilities data can be made as can quantitative analysis of data within any given gear type.
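
The effort correction mentioned in this disclaimer, catch per 10,000 cubic meters of water sampled, is a simple ratio. The following Python sketch shows the arithmetic; the function name and the sample numbers are hypothetical, and how the sampled volume is determined for each gear type is not covered by the disclaimer.

```python
def catch_per_10000_m3(catch_count, volume_m3):
    """Effort-corrected catch: fish per 10,000 cubic meters sampled."""
    if volume_m3 <= 0:
        raise ValueError("sampled volume must be positive")
    return catch_count * 10_000 / volume_m3


# Hypothetical tow: 12 fish caught while sampling 8,000 cubic meters.
print(catch_per_10000_m3(12, 8_000))
```

As the disclaimer says, values computed this way should only be compared quantitatively between tows made with the same gear type.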

In order to retrieve real time data from the homepage the user has the following available query options:

In my presentation, I chose to create multiple maps of chinook salmon, displaying all chinook salmon caught after April-4-96. The fish abundance data are displayed as circles with varying radii on the map of the Delta. If you want to look at one map more closely, you can click on it, and it will be displayed enlarged.
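
One common cartographic convention for abundance maps like these is to scale each circle so that its area, rather than its radius, is proportional to the value shown. The following Python sketch illustrates that convention. Whether the IEP maps use it is not stated here, and the function name, the maximum radius, and the catch values are all hypothetical.

```python
import math


def circle_radius(value, max_value, max_radius=20.0):
    """Radius such that the circle's area is proportional to value."""
    if max_value <= 0 or value < 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical catches at three sites.
catches = {"site_a": 4, "site_b": 25, "site_c": 100}
largest = max(catches.values())
radii = {site: circle_radius(n, largest) for site, n in catches.items()}
print(radii)
```

Scaling by area keeps a site with four times the catch from looking sixteen times as large, which is what happens when the radius itself is made proportional to the value.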

If you have an Internet browser that can play videos in the MPEG format, you can display these fish abundance maps as a movie, an example of scientific visualization. The MPEG movie visualizes the distribution and abundance of fish species over time.

URLs Used in this Presentation

I used the following uniform resource locators (URLs) in my presentation:

In case you have further questions, please contact me:

Olof Hansen
IEP Biological Information Specialist
U.S. Department of the Interior, Bureau of Reclamation
Voice (415) 744-1965
Fax (415) 744-1078
E-mail: Hansen.Olof@epamail.epa.gov or OHansen@mp.usbr.gov