
Nitrate Levels in the Rock Creek Park Watershed, Washington DC, 1: Measures of Central Tendency

Mark C. Rains¹, Len Vacher¹ and Marian Norris²

¹ Department of Geology, University of South Florida, Tampa, FL 33620
² National Park Service, Center for Urban Ecology, Washington, DC 20007

© 2011 University of South Florida Libraries. All rights reserved.

SSACgnp.TD367.MCR1.5

Core Quantitative Issue: Average, mean, median, mode

Supporting Quantitative Issues: Making and reading graphs; Thresholds

Core Geoscience Issue: Water quality (nutrient limitation; eutrophication)

Tom Paradis & National Park Service

This material is based upon work supported by the National Science Foundation under Grant Number NSF DUE-0836566. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Slide 2

Getting started

After completing this first of two modules introducing basic statistics describing a data set, you should be able to:

• Calculate the three main measures of central tendency (mean, median, and mode) and describe the differences between them.
• Create a graph of monthly means and medians that also shows the individual data points.
• Describe why nitrate and phosphate are included in water quality monitoring programs.
• Use the words eutrophication, nutrients, microbes, and oxygen in a sentence about surface water pollution.

And you should also know where Rock Creek Park is.

[Map: Washington DC]

Slide 3

Rock Creek Park is one of the oldest federal parks, having been established in 1890, long before the National Park Service was created in 1916. When established, Rock Creek Park was already a favorite rural retreat for residents of Washington, DC. The establishing legislation states that the park will “provide for the preservation from injury or spoliation of all timber, animals, or curiosities…and their retention in their natural condition, as nearly as possible.” Today, Rock Creek Park is one of the largest urban parks in the world and is entirely embedded within the Washington, DC, metropolitan area. Nevertheless, Rock Creek Park is 81% forested, contains extensive wetland and stream habitats, and provides outdoor recreation opportunities to many millions of visitors every year.


The Setting – Rock Creek Park

Tom Paradis, National Park Service; National Park Service

Slide 4

The Challenge of Urban Parks

Urban parks, such as Rock Creek Park, pose unique challenges to Park managers. Urbanization fundamentally impacts the physical, chemical, and biological structure and function of the environment. Those impacts are not restricted to the urban environment; rather, those impacts can be transferred into nearby natural environments, such as urban parks, through a variety of pathways.

Urbanization changes the quantity and quality of water flowing off the land surface. Precipitation falls, perches on impervious surfaces, and flows rapidly to nearby low points. Along the way, it collects pollutants from both point sources (e.g., stormwater outfalls) and non-point sources (e.g., fertilized gardens and lawns).

Impacts to water quantity and quality are readily transferred to downstream natural environments. This is particularly true for rivers, because rivers serve as the major features by which natural and urban landscapes are ultimately drained.

US Environmental Protection Agency

Slide 5

Nutrient Limitation and Eutrophication

All organisms require nitrogen (N) and phosphorus (P). N is in the amino acids that compose all proteins, in nucleic acids that compose DNA, and in many other essential organic and inorganic compounds. Similarly, P is in the nucleic acids that compose DNA, in the ATP used to store energy harvested from sunlight or food, and in many other essential organic and inorganic compounds including bones and teeth.

N and P are both essential nutrients; organisms cannot grow if one or the other is missing. Imagine an ice cream shop that makes root beer floats, for which both root beer and vanilla ice cream are essential ingredients. If business is brisk, then they may run out of either ingredient, in which case they can no longer make root beer floats no matter how much of the other ingredient remains.

Cornell University Cooperative Extension

N and P are so important that they are two of the three essential nutrients included in all commercially available fertilizers. (The other is potassium.) By law, the percent by weight of N and P in the fertilizer is expressed in the first two of the three numbers listed on every fertilizer container. For example, the imaginary product to the left contains 21% N and 3% P.

Slide 6

Nutrient Limitation and Eutrophication

Nutrient limitation is the condition in which growth is held back because one essential nutrient is lacking. In aquatic ecosystems, the limiting nutrient is typically N or P. Therefore, primary productivity (e.g., growth by photosynthesis) can be spurred by the addition of N and/or P. This process is called eutrophication, and the primary beneficiaries are typically phytoplankton, which can cause pronounced algal blooms. Algal blooms are typically green, but can also be yellow-brown or red, depending on the species of algae.

When phytoplankton die, the dead biomass is consumed by microbes, which simultaneously consume oxygen. Decaying biomass during algal blooms can deplete oxygen to the extent that aquatic organisms such as fish suffocate.

Florida International University

Slide 7

The Problem

After being cut around the time of the Civil War, the Rock Creek watershed returned to forest. Remnants of this regrowth remain in the Park, with Virginia pine, loblolly pine, American beech, white oak, and many other species represented in the uplands and American sycamore, green ash, and many other species represented in the wetlands and alongside the streams (Endnote 1). However, few remnants of these forests remain elsewhere in the watershed. Instead, much of the watershed is covered with impervious surfaces, such as rooftops, driveways, sidewalks, parking lots, and roads. (See below left.) Runoff from these impervious surfaces is directed to a dense network of combined sewer/stormwater pipes, approximately 350 of which discharge to or adjacent to Rock Creek Park. (See below middle and right.)

Carruthers et al. 2009

Park managers want to know if the water quality in Rock Creek is impaired, particularly with regard to the limiting nutrients N and P. How could they find out?

Slide 8

Monitoring Vital Signs, 1

Recall your last doctor visit. Before you saw the doctor, a nurse likely measured your vital signs, such as your height, weight, pulse, blood pressure, and temperature. Now recall previous doctor visits; a nurse likely measured those same vital signs during those visits, too. They do so to ensure that your vital signs don’t exceed thresholds at those points in time and that they don’t change precipitously and/or erratically over time.

Park managers monitor the vital signs of National Park Service resources for the same reasons. In Rock Creek Park, Park managers monitor multiple metrics in four broad categories: water quality, air quality, biodiversity, and ecosystem pattern and process. A recent summary of conditions, titled Rock Creek Park Natural Resource Condition Assessment, was published in 2009 (Carruthers et al. 2009). The full reference for this report is included at the bottom of this slide.

Carruthers, T., S. Carter, L.N. Florkowski, J. Runde, and B. Dennison. 2009. Rock Creek Park Natural Resource Condition Assessment. Natural Resource Report NPS/NCRN/NRR—2009/109. Natural Resource Program Center, Fort Collins, CO.


Return to Slide 17.

Slide 9

Monitoring Vital Signs, 2

Park managers monitor water quality at 11 locations in the Rock Creek watershed, with one location on Rock Creek and 10 locations on tributaries to Rock Creek. (The Montgomery County Department of Environmental Protection monitors water quality at two other locations in the Rock Creek watershed, but we will not be using these data in this module.) Of particular interest are nitrate (NO3⁻), which is typically the most common form of N found dissolved in water, and total phosphorus (TP), which is the sum of all forms of P found dissolved or suspended in water.


Now let’s get the data and see if they exceed our vital sign thresholds.

University of Maryland Integration and Application Network

Slide 10

Let’s Get the Data


The data are in three columns and 201 rows. The columns are: the month the samples were collected (Date), the nitrate concentration (NO3), and the total phosphorus concentration (TP). The rows are in chronological order by the month the samples were collected. They begin with a group of measurements made in January 2006 and end with a group of measurements made in December 2007. In some months, samples were not collected at every location; in other months, samples were collected at one or more locations more than one time. Therefore, the groups of measurements range from a minimum of eight measurements to a maximum of 15 measurements. No data were collected in December 2006, February 2007, or August 2007. Nitrate is reported in mg/L (i.e., milligrams of nitrate per liter of water), and total phosphorus is reported in µg/L (i.e., micrograms of total phosphorus per liter of water). (Note that 1 mg is equal to 1000 µg.)
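As a rough illustration of the table layout and the unit note above, here is a minimal Python sketch. The three rows are invented for illustration (they are not the Rock Creek measurements), and the helper names `mg_to_ug` and `ug_to_mg` are hypothetical.

```python
# Hypothetical rows in the layout described above: (Date, NO3 in mg/L, TP in µg/L).
# The numbers are invented for illustration; they are not the Rock Creek data.
rows = [
    ("Jan 2006", 2.0, 40.0),
    ("Jan 2006", 3.0, 25.0),
    ("Feb 2006", 1.5, 55.0),
]

UG_PER_MG = 1000  # 1 mg = 1000 µg


def mg_to_ug(value_mg_per_l):
    """Convert a concentration from mg/L to µg/L."""
    return value_mg_per_l * UG_PER_MG


def ug_to_mg(value_ug_per_l):
    """Convert a concentration from µg/L to mg/L."""
    return value_ug_per_l / UG_PER_MG


# Express both nutrients in the same unit before comparing their magnitudes.
for date, no3_mg, tp_ug in rows:
    print(date, f"NO3 = {mg_to_ug(no3_mg):.0f} µg/L", f"TP = {tp_ug:.0f} µg/L")
```

Keeping the two columns in their reported units is fine for the spreadsheet work; the conversion only matters when the NO3 and TP numbers are compared side by side.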

There’s a lot of information in that table (201 × 3 pieces of information). How can we visualize the data? Let’s start with the NO3 data.

Click on the Excel worksheet to the right and save immediately to your computer. Complete the spreadsheets at each of the tabs starting with “Slides 11-13.” Yellow cells contain given values, and orange cells contain formulas. The spreadsheet at the “EOM Answers” tab is for your answers to the end-of-module questions.

Slide 11

To start we can simply plot the 201 NO3 values in the order they are listed in the data table.

Looking at the nitrate data

One can do this effectively with a line graph, but we choose a scatter plot so that we can do some manipulations with the graph in later slides.

You can save yourself considerable grief if you add a column and convert the months to numbers, namely a count of successive months starting with the first (Jan 2006) and skipping the numbers for months where no data were recorded (Dec 2006, Feb 2007, Aug 2007).

One way of doing this easily is to place a 1 in the top cell of the new column (B4), and the equation =B4 in the next cell (B5). Next, copy down to the bottom row to produce a column of 1s. Then manually enter the next month when there is a change of month (e.g., enter 2 in Cell B14 and 3 in Cell B24).
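The month-numbering idea can also be sketched outside the spreadsheet. The Python below is only an illustration under assumptions: the function name `month_numbers` is hypothetical, and the short calendar stands in for the 24 months of the study period.

```python
def month_numbers(dates, calendar):
    """Number each row by its month's position in the calendar.

    `calendar` lists every month in the study period, including months
    with no samples, so the numbers for those months are simply skipped.
    """
    position = {month: i + 1 for i, month in enumerate(calendar)}
    return [position[d] for d in dates]


# Hypothetical, shortened example: no samples were collected in "Feb".
calendar = ["Jan", "Feb", "Mar"]
dates = ["Jan", "Jan", "Jan", "Mar", "Mar"]
print(month_numbers(dates, calendar))  # [1, 1, 1, 3, 3]
```

Because the numbers come from the calendar rather than from a running count of groups, a month without data leaves a gap on the x-axis instead of shifting every later month to the left.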

Slide 12

The graph on the previous slide nicely shows the range of NO3 values for each month. But you may have noticed that there is a connecting line segment from one vertical range line to the next. These connectors don’t really signify anything important. They merely connect the last data point of one month to the first data point of the next month. The reason is that the entire plot is a single line from the first data point of Jan 2006 to the last data point of Dec 2007. We need to break the line!

To break the line, simply insert a blank row after each month.
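The same trick carries over to plotting code: many plotting libraries (matplotlib, for example) break a line wherever a value is NaN, which plays the role of the blank row. The sketch below only builds the gapped columns; the function name `insert_gaps` is hypothetical and the sample values are invented.

```python
import math


def insert_gaps(months, values):
    """Return copies of both columns with a NaN pair after each month group."""
    xs, ys = [], []
    for i, (month, value) in enumerate(zip(months, values)):
        xs.append(month)
        ys.append(value)
        is_last_row = i == len(months) - 1
        # A change of month (or the end of the data) closes the group.
        if is_last_row or months[i + 1] != month:
            xs.append(math.nan)
            ys.append(math.nan)
    return xs, ys


# Invented example: two values in month 1, one value in month 2.
xs, ys = insert_gaps([1, 1, 2], [2.0, 3.0, 1.5])
print(ys)  # [2.0, 3.0, nan, 1.5, nan]
```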

Obviously, the NO3 values go up and down from month to month. How might you visualize those changes? There are still 201 NO3 data points. Let’s look at a summary or representative value for each month.

Cleaning up the graph

Slide 13

We can calculate and plot the average for each month in Column D next to the last data point of the month.

An average is a summary or representative value of a list of numbers.

=AVERAGE(C4:C13)

Caution: if you copy and paste to produce the subsequent averages (e.g., in Cell D24), be sure to check that the range of the cell equation is correct. Not all of the months have the same number of data points.
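The caution about unequal group sizes disappears if the rows are grouped by month before averaging. The following Python sketch assumes the rows are already sorted by month, as in the data table; the function name `monthly_means` and the sample values are hypothetical.

```python
from itertools import groupby
from statistics import mean


def monthly_means(months, values):
    """Return {month: arithmetic mean} for rows already sorted by month."""
    pairs = zip(months, values)
    return {
        month: mean(value for _, value in group)
        for month, group in groupby(pairs, key=lambda pair: pair[0])
    }


# Invented example: three values in month 1, two values in month 2.
months = [1, 1, 1, 2, 2]
values = [2.0, 3.0, 4.0, 1.0, 2.0]
print(monthly_means(months, values))  # {1: 3.0, 2: 1.5}
```

Each group is averaged over however many rows it actually contains, which is the check the caution asks you to make by hand in the spreadsheet.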

Adding monthly averages

Right-click on the graph, select “Source Data,” and expand “Data Range” to include all of Column D.

Slide 14

The word “average” is a non-technical word for a central, representative value. What Excel actually calculates with the =AVERAGE function is the arithmetic mean: the sum of the values in the list divided by the number of values.

There are two other commonly used central, representative values. The median is the middle value. The mode is the most frequent value. In symmetric, single-peaked distributions, such as the bell curve of the normal frequency distribution (which you will see in part 2 of this two-part set of modules), the mean, median, and mode are all the same. Many distributions, however, are not normal distributions. Many are skewed, described by lopsided bells, as we shall see in Slide 16 for the total NO3 data set.

A note about averages

You may notice that the arithmetic mean (reddish circles) is not necessarily in the center of the range of values. Also, in many cases, it does not coincide with the middle value. For example, it is common for there to be more data points below the mean than above it, as in the Jan 2006 and April 2007 data. Such cases are analogous to the statement, “most of the students are below average.”
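A small invented list makes the “most are below average” effect concrete. The sketch uses Python's standard `statistics` module; the five values are made up, with one large value pulling the mean upward.

```python
from statistics import mean, median, mode

# Invented values with one large measurement.
values = [1.0, 1.0, 1.5, 2.0, 7.0]

m = mean(values)      # sum (12.5) divided by count (5)
med = median(values)  # middle value of the sorted list
mo = mode(values)     # most frequent value

below = sum(1 for v in values if v < m)
above = sum(1 for v in values if v > m)

print(m, med, mo)    # 2.5 1.5 1.0
print(below, above)  # 4 1
```

Four of the five values sit below the mean because the single large value shifts the mean but not the median.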


Slide 15

Adding the monthly median values

It’s worth adding a column (E) to calculate and plot the median values for each month. The equation for Cell E13 is =MEDIAN(C4:C13). After calculating all the medians, expand the range of the graph as you did before.
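Since the months do not all have the same number of measurements, it helps to remember how a median behaves for odd and even counts. This is a minimal sketch with invented values, using Python's `statistics.median`; it is an illustration of the definition, not a replacement for the spreadsheet step.

```python
from statistics import median

odd_month = [2.9, 1.8, 2.3]        # three invented values
even_month = [2.9, 1.8, 2.3, 2.5]  # four invented values

# Odd count: the single middle value of the sorted list (1.8, 2.3, 2.9).
median_odd = median(odd_month)

# Even count: halfway between the two middle values (2.3 and 2.5).
median_even = median(even_month)

print(median_odd, median_even)
```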

[Graph legend: Median, Mean]

Slide 16

Mode = 2.0 mg/L

Median = 2.3 mg/L

Mean = 2.7 mg/L

More about mean, median and mode

Here is a frequency histogram of all the 201 nitrate values. The three measures of central tendency (as they are called) are listed on the figure.

The arithmetic mean is the center of mass of the distribution of values. Imagine the x-axis to be a see-saw, and the units of the standing bars to be unit weights. The balance point would be at 2.7 mg/L.

The median is the middle value. If you arranged the 201 values in order from largest to smallest, the 101st would be 2.3 mg/L (note there are actually eight data points with 2.3 mg/L).

The mode is the value at the highest standing bar. There are 12 data points with 2.0 mg/L.

This is a case of an “asymmetric bell curve” whose long tail stretches to the right. The mean is larger than the median, which is larger than the mode. The distribution is positively skewed, meaning its tail extends toward the larger values. (This is a very common distribution in hydrologic data!)
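The three descriptions above (balance point, middle value, most frequent value) can be checked on a small data set. The eleven integers below are invented and chosen so the arithmetic is exact; they are not the nitrate data.

```python
from statistics import mean, median, mode

# Invented, positively skewed data (11 values).
values = [1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 9]

m = mean(values)      # 44 / 11
med = median(values)
mo = mode(values)

# See-saw property: deviations from the mean cancel out.
balance = sum(v - m for v in values)

# With an odd count n, the median is the value at position (n + 1) / 2.
n = len(values)
middle_position = (n + 1) // 2
middle_value = sorted(values)[middle_position - 1]

print(m, med, mo)                     # 4 3 2
print(balance)                        # 0
print(middle_position, middle_value)  # 6 3
print(m > med > mo)                   # True
```

For the 201 nitrate values the same position rule gives (201 + 1) / 2 = 101, which is why the text picks out the 101st value.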

We have a lot to say about this weird value in part 2.

Slide 17

What about the threshold?

The threshold can be shown on the graph by adding another column, two rows, and entries to define the start and end of a horizontal line across the figure. Do this by right-clicking on the graph and choosing “Select Data.” Add a series titled “threshold.” Select cells F2 and F3 for the Y values and set the X values to -1 and 30. Be sure to format the x-axis to create a graph matching the one shown above.

The Assessment Report (Slide 8) lists thresholds for each of 22 water-quality metrics of the natural resource study. The threshold indicates a numerical value at and above which the water quality is considered impaired. For nitrate, the threshold was selected as 2 mg/L. (The threshold for TP was selected as 36.56 µg/L, which you will need later in this module.) (Endnote 2)

The graph shows that 12 of the monthly averages and the vast majority of the single measurements exceed the threshold.
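The comparison against the threshold can be sketched as follows. The threshold value and the line endpoints (-1 and 30) come from the text; the function names and the list of monthly means are hypothetical. The sketch uses “at or above,” following the definition of the threshold given on this slide.

```python
NO3_THRESHOLD = 2.0  # mg/L, from the Assessment Report as quoted in the text

# Two points are enough to draw the horizontal threshold line.
threshold_line = [(-1, NO3_THRESHOLD), (30, NO3_THRESHOLD)]


def is_impaired(value, threshold):
    """Impaired means at or above the threshold."""
    return value >= threshold


def count_impaired(values, threshold):
    """Count how many values are at or above the threshold."""
    return sum(1 for v in values if is_impaired(v, threshold))


# Invented monthly means, not the Rock Creek results.
example_means = [1.6, 2.0, 2.7, 3.1]
print(count_impaired(example_means, NO3_THRESHOLD))  # 3
```

Whether a value exactly equal to the threshold counts is a choice worth stating; using a strict “greater than” comparison would change the count for the invented value of 2.0 here.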

Here’s that weird value again


Slide 18

End-of-Module Assignment

1. Turn in your spreadsheet and graph for the NO3 values, means, and threshold from Slide 17.

2. Turn in a spreadsheet and graph for the TP values, means, and threshold analogous to the presentation of the NO3 data.

3. For how many months did average NO3 concentrations exceed the threshold value? For how many months did average TP concentrations exceed the threshold value?

4. For how many months did all NO3 concentration measurements exceed the threshold value? For how many months did all TP concentration measurements exceed the threshold value?

5. Consider the following block of data: 1, 2, 2, 2, 3, 4, 4, 4. What is the median? What is the mode if you determine it by eye? What is the mode if you determine it using the =MODE() function in Excel with the items listed vertically in ascending order? What is the mode if you determine it using the =MODE() function in Excel with the items listed vertically in descending order?

6. Recall from Slide 7 that Park managers want to know if the water quality in Rock Creek is impaired, particularly with regard to the limiting nutrients N and P. What does your analysis of these data suggest? Suggest and briefly discuss one way that Park managers could address this problem.

Slide 19

Endnotes


1. The Nature Conservancy in conjunction with the Federal Geographic Data Committee and the Ecological Society of America Vegetation Subcommittee has developed a detailed description and map of the vegetation of Rock Creek Park (http://biology.usgs.gov/npsveg/rocr/index.html). The final product is intended to provide information that National Park Service managers need in their myriad day-to-day operations. Specifically, the final product is intended to inform management decisions related to ensuring the persistence of the native plant and animal species in light of human use and the related invasion of exotic species. Return to Slide 7.

2. These thresholds were set based upon known or suspected relationships between NO3 and TP and ecosystem health. The NO3 threshold was based upon a published relationship between NO3 concentrations and a benthic index of biotic integrity, which is an index calculated using data on the numbers and abundances of invertebrates found on a stream bed. (Invertebrates are animals without backbones, such as insects, worms, and molluscs. Vertebrates are animals with backbones, like amphibians, fish, and you.) The TP threshold was based upon a US Environmental Protection Agency ecoregional criterion, which in turn was based upon a desire to maintain TP concentrations everywhere at levels similar to those observed in undeveloped watersheds. Return to Slide 17.