Statistics is the science of data: collecting, classifying, organizing, analyzing, and interpreting it. A variable is a characteristic that differs or varies from one observation to the next. Quantitative data are data that consist of numbers. Categorical data are data that do not consist of numbers.
example: The number of M&M’s in a small bag is a piece of quantitative data. The color of an M&M is a piece of categorical data.
Describing Quantitative Data Numerically
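For a given set S of n observations $x_1, x_2, \dots, x_n$, the sum of the observations is $\sum x = x_1 + x_2 + \dots + x_n$, and the mean (average) is $\bar{x} = \frac{\sum x}{n}$.
example: Let S = {3, -5, 2, 1, 8, -6}. Then $\sum x = 3 + (-5) + 2 + 1 + 8 + (-6) = 3$ and n = 6.
example: The average of set S is 3/6 = 0.5.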
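The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the function name `mean` is chosen here for illustration and is not part of the notes.

```python
def mean(data):
    """Return the sum of the observations divided by their count."""
    return sum(data) / len(data)


S = [3, -5, 2, 1, 8, -6]

# Sum, number of observations, and mean of S.
print(sum(S), len(S), mean(S))  # 3 6 0.5
```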
The median in a set of n observations that are ordered from smallest to largest is the middle observation (if n is odd) or the mean of the two middle observations (if n is even). example: Let S1 = {1, 4, 6, 9, 10}. The median of S1 is 6.
example: Let S = {3, -5, 2, 1, 8, -6}. In order to find the median, we must first order the set S from smallest to largest. In doing so we see that S = {-6, -5, 1, 2, 3, 8}. The two middle observations are 1 and 2. The average of these two middle observations is 1.5. Thus, the median of S is 1.5.
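The two cases of the definition (n odd, n even) can be sketched in Python as follows. The function name `median` is an illustrative choice, and the sketch assumes the set is not empty.

```python
def median(data):
    """Return the middle observation, or the mean of the two middle ones."""
    ordered = sorted(data)  # order from smallest to largest first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n is odd: a single middle observation exists
        return ordered[mid]
    # n is even: average the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2


S1 = [1, 4, 6, 9, 10]
S = [3, -5, 2, 1, 8, -6]

print(median(S1))  # 6
print(median(S))   # 1.5
```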
The median splits the data in half: about 50% of the data falls below the median and about 50% falls above it.
For a given set of data, the pth percentile is a number x such that p% of the data falls below x. Consequently, (100-p)% falls above x. example: The median is P50, the 50th percentile. The lower quartile, QL (Q1 on the TI-83), is P25, the 25th percentile. The upper quartile, QU (Q3 on the TI-83), is P75, the 75th percentile.
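One way to get a feel for this definition is to count what percentage of a data set falls below a given number. The sketch below does that in Python; the helper `percent_below` and the list `scores` (the whole numbers 1 through 100) are hypothetical and used only for illustration. With small data sets the percentages will usually only be approximate.

```python
def percent_below(data, x):
    """Return the percentage of observations strictly less than x."""
    below = sum(1 for value in data if value < x)
    return 100 * below / len(data)


# Hypothetical test scores: the whole numbers 1 through 100.
scores = list(range(1, 101))

print(percent_below(scores, 11))  # 10.0
print(percent_below(scores, 81))  # 80.0
```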
Which score would you prefer to own on the next test, P10 or P80? Explain.
The mode of a data set is the value that occurs with the greatest frequency. Neither set S nor S1 has a mode, because no value in either set occurs more than once. Consider the set S2 = {-1, 3, 5, -1, 8, 9, -1}. The mode of S2 is -1. A set may have more than one mode. Consider S3 = {1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 6, ...}. The set S3 has two modes, 1 and 2. If a set has two modes then the set is said to be bimodal. Caution must be used when considering the mode for inclusion as a summary statistic.
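Finding the mode amounts to counting how often each value occurs. A minimal Python sketch is given below. It follows the convention used in these notes that a set in which every value occurs only once has no mode. The function name `modes` and the short list `two_modes` are illustrative choices, and the sketch assumes the set is not empty.

```python
from collections import Counter


def modes(data):
    """Return a sorted list of the most frequent values (empty if no mode)."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # every value occurs exactly once, so there is no mode
        return []
    return sorted(value for value, count in counts.items() if count == highest)


S = [3, -5, 2, 1, 8, -6]
S2 = [-1, 3, 5, -1, 8, 9, -1]
two_modes = [1, 2, 3, 1, 2, 4]  # hypothetical bimodal list

print(modes(S))          # []
print(modes(S2))         # [-1]
print(modes(two_modes))  # [1, 2]
```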
The range of a set of data is the minimum value subtracted from the maximum value. The interquartile range, IQR, is Q3 - Q1.
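Both measures can be sketched in Python. Quartile conventions differ between textbooks, calculators and software. The sketch below assumes one common convention: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half, leaving out the middle observation when n is odd. Other tools may report slightly different quartiles. The function names are illustrative, and the sketch assumes at least two observations.

```python
from statistics import median


def data_range(data):
    """Return the maximum value minus the minimum value."""
    return max(data) - min(data)


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]   # smallest half (middle value excluded if n is odd)
    upper = ordered[-half:]  # largest half
    return median(lower), median(upper)


def iqr(data):
    """Return the interquartile range Q3 - Q1."""
    q1, q3 = quartiles(data)
    return q3 - q1


S1 = [1, 4, 6, 9, 10]
print(data_range(S1))  # 9
print(quartiles(S1))   # (2.5, 9.5)
print(iqr(S1))         # 7.0
```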
Let A = {0, 1, 2, 3, 4, 5, 500, 500, 995, 996, 997, 998, 999, 1000}. Let B = {0, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 1000}. Compute the mean, median, mode and range for both sets A and B.
Suppose boxes A and B contain slips of paper with the entries listed above. A student selects a box, randomly selects a slip of paper from it, and receives the value written on the slip in dollars. Which box would you select to play: A or B?
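To tell apart data sets such as A and B, we need another, more complex statistic: the standard deviation. The standard deviation measures how far the data in a set typically lie from the mean. The sample standard deviation s is computed from the formula
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
where $\bar{x}$ is the sample mean and n is the number of observations.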
The TI-83/84 will quickly and easily compute many of these summary statistics.
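For readers working away from the calculator, the sample standard deviation can also be sketched in Python. The sketch assumes the usual definition with n - 1 in the denominator. The helper name `sample_std` is an illustrative choice, while `stdev` is the sample standard deviation from Python's standard `statistics` module, used here only as a cross-check. The sets A and B are the ones listed earlier.

```python
from math import sqrt
from statistics import stdev


def sample_std(data):
    """Return the sample standard deviation (n - 1 in the denominator)."""
    n = len(data)
    x_bar = sum(data) / n  # sample mean
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return sqrt(squared_deviations / (n - 1))


A = [0, 1, 2, 3, 4, 5, 500, 500, 995, 996, 997, 998, 999, 1000]
B = [0] + [500] * 12 + [1000]

# Print the hand-rolled value next to the library value for each set.
for name, data in (("A", A), ("B", B)):
    print(name, round(sample_std(data), 1), round(stdev(data), 1))
```

Comparing the two printed rows shows which set has its values spread farther from the mean.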