Chapter 9. Error is unavoidable
Data are models, and as such, are never perfect. But there are a few standard types of errors to watch out for.
Data represent a special type of model, one that is central to the scientific method. We use data to tell us about the phenomenon we are studying. Abstract models, such as theories and hypotheses, help us simplify nature. But data are our surrogates of reality.
Like all models, data are "false." No matter how hard we may try, the data will never exactly match what we think they represent. Instead of referring to data as being "false," however, we say that data are "measured with error." In this context, "error" does not imply blunder (as in baseball); rather, it means "variation."
Another way to look at data is this. There is one fundamental issue that underlies data collection in all applications of the scientific method: if the data were to be gathered more than once, would they turn out the same each time? We say that data are measured with error to describe the extent to which attempts to record the same phenomenon differ. That is, any variation that causes our measurement of something to be inexact is error.
A universal goal when using the scientific method is to reduce the error/variation so that you know what the data represent (as closely as you need). This claim may seem to contradict the statement above that error is unavoidable. But, in fact, there are ways to reduce the error. Understanding how this error can arise is the first step in reducing it.
Four types of error
There are different sources of error in data. Although error cannot be completely eliminated -- a single coin flip cannot be 50% heads even though the probability of heads may be 50% -- there are safeguards and precautions that can reduce many types of errors. However, different types of error require different safeguards. Understanding the different types of error is thus the first step in understanding those precautions.
Rounding, Precision, and Accuracy Error. Some kinds of measurements can never be made exactly, so we have to "round off" the value at some quantity that is less than exact. When a machine fails to provide a value beyond a fixed number of decimal places, we call it precision error. Consider the weight (mass) of a penny. To the nearest tenth, a penny is 2.5 g. To the nearest 0.01, it is 2.54 g. Using the finest balances, we could measure the mass to many more decimal places. But at some point, we would reach the limit of precision for our scale and thus be left with a rounded-off value. Or we would reach the point that the scale was no longer accurate enough to consistently give us the correct weight (accuracy error). We can never measure the mass exactly - not even to millions of decimal places, much less to an infinity of decimal places. In many branches of science, this type of error is specifically included in measurements by providing a measurement ± (plus or minus) some smaller value, such as 101 ± 0.23 meters.
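To make the idea of rounding error concrete, here is a minimal sketch in Python. The "true" mass used below (2.5437 g) is a made-up value chosen only to agree with the rounded figures above; it is not a real measurement. The sketch shows that rounding to a given number of decimal places can be off by at most half a unit in the last place kept.

```python
def rounding_error(value, decimals):
    """Return the rounded value and how far it lies from the unrounded one."""
    rounded = round(value, decimals)
    return rounded, abs(rounded - value)

def max_rounding_error(decimals):
    """Largest error that rounding to `decimals` places can introduce."""
    return 0.5 * 10 ** (-decimals)

# Hypothetical "true" mass in grams (an assumption for illustration only).
assumed_mass = 2.5437

for decimals in (1, 2, 3):
    rounded, error = rounding_error(assumed_mass, decimals)
    print(f"{decimals} decimal place(s): {rounded} g, "
          f"error {error:.4f} g, bound {max_rounding_error(decimals):.4f} g")
```

A finer scale shrinks the bound but never removes it, which is the sense in which this kind of error is unavoidable.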
Precision and rounding error apply to many kinds of measurements - those in which we are not simply counting numbers of things: time, speed, weight, energy, volume, distance, and many others. For most non-technical applications of the scientific method, however, this kind of error is unimportant because we don't care about the value beyond a few decimal places. In economics, for example, a company is not likely concerned about the cost to produce an item to the fraction of a cent. And our monetary system forces each of us to accept rounding error because we cannot pay in fractions of a cent. Rounding error even applies to the estimation of percents and probabilities: the proportion of heads observed in a series of coin tosses can change only in steps of 1 divided by the number of tosses, so the probability of heads cannot be estimated more finely than that (e.g., it is impossible to obtain 50% heads with an odd number of tosses).
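The point about coin tosses can be checked directly. The following is a small sketch (the function names are ours, introduced only for illustration) that lists every proportion of heads a given number of tosses can produce and asks whether exactly one half is among them.

```python
from fractions import Fraction

def achievable_proportions(n_tosses):
    """All proportions of heads that n_tosses can yield: 0/n, 1/n, ..., n/n."""
    return [Fraction(k, n_tosses) for k in range(n_tosses + 1)]

def can_observe_exactly_half(n_tosses):
    """True only if some whole number of heads gives exactly 1/2."""
    return Fraction(1, 2) in achievable_proportions(n_tosses)

for n in (1, 3, 4, 7, 10):
    print(f"{n} tosses: smallest step {Fraction(1, n)}, "
          f"exactly 50% heads possible: {can_observe_exactly_half(n)}")
```

With an odd number of tosses the observed proportion always misses 50%, even for a perfectly fair coin.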
Sampling Error -- random deviation from an average. Another source of error comes from sampling only some of the data in which we are interested. Consider again a coin toss. If the probability of heads was exactly 1/2, and we tossed the coin 4 times, there is only a 3/8 chance that we would get 2 heads and 2 tails (1/8 of the time we would obtain all heads or all tails). The reason is sampling error. As a second example, we might be interested in the percent student attendance in lecture. The average attendance might be 60%, but attendance on some days would certainly be higher than on other days. Again, we would attribute this variation to sampling error. In both cases, the data we gather in one trial would not generally match exactly the data we gathered in other trials. The issue here is not in our ability to count accurately -- we know how many heads and tails we got or how many people attended class. Rather, the error lies in the fact that what actually happens one time is not the same as what happens the next, even though the underlying rules or probabilities are the same.
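The 3/8 and 1/8 figures for four tosses follow from counting the ways each outcome can occur. As a sketch, assuming a fair coin and independent tosses, the calculation can be written out with exact fractions (the function name is ours):

```python
from fractions import Fraction
from math import comb

def prob_heads(k, n, p=Fraction(1, 2)):
    """Probability of exactly k heads in n independent tosses (binomial formula)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_tosses = 4
for k in range(n_tosses + 1):
    print(f"{k} heads in {n_tosses} tosses: {prob_heads(k, n_tosses)}")

# Chance of an even split, and of all heads or all tails.
even_split = prob_heads(2, n_tosses)
all_same = prob_heads(0, n_tosses) + prob_heads(4, n_tosses)
print("2 heads and 2 tails:", even_split)
print("all heads or all tails:", all_same)
```

Even with a perfectly fair coin, the "expected" result of two heads and two tails turns up less than half the time.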
Sampling error is a widespread phenomenon that is often ascribed to random "noise" and unmeasured variables. In the case of a coin toss, the outcome of the toss is usually attributed to random noise. In the case of student attendance, there would undoubtedly be reasons why each non-attending student missed class, but the reasons would be too diverse to measure and thus be attributed to unmeasured variation.
Sampling error is universal, although its importance may vary greatly from case to case. The way to reduce sampling error (discussed in the next chapter) is to make many observations and to obtain an average that swamps out most of the sampling error made in each observation. Sampling error is a big problem in studies of environmental hazards (e.g., cancer-causing agents), because only a low percentage of people develop any specific kind of health problem, so we need large samples to overcome the sampling error. For example, if we observe 1 excess case of cancer in one million people who eat bacon and 0 excess cases of cancer in people who avoid bacon, we can't infer that the cancer rates differ between the two groups because sampling error would give us this result 50% of the time if there was no difference between the groups. We would need a sample size about 10 times larger than this to overcome sampling error.
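The effect of sample size on sampling error can be illustrated with a simulated coin. This is only a sketch under simple assumptions (a fair coin, independent tosses, a fixed random seed so the run is repeatable); it repeats an experiment of a given size many times and measures how much the observed proportion of heads varies from one repeat to the next.

```python
import random
import statistics

def spread_of_proportion(n_tosses, n_repeats=2000, p=0.5, seed=1):
    """Standard deviation of the observed proportion of heads across
    n_repeats simulated experiments of n_tosses each."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    proportions = []
    for _ in range(n_repeats):
        heads = sum(rng.random() < p for _ in range(n_tosses))
        proportions.append(heads / n_tosses)
    return statistics.pstdev(proportions)

for n in (10, 100, 1000):
    print(f"{n} tosses per experiment: spread of observed proportion "
          f"= {spread_of_proportion(n):.4f}")
```

The spread shrinks as the experiments get larger, but only in proportion to the square root of the sample size, which is why rare outcomes demand very large samples.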
Technical and human error. Our machines and our abilities to record data are not foolproof. Technicians handling hundreds of tubes, loading samples, and labeling samples can and do make mistakes. A common example occurs in televised football games, in which an official misreads a play and inappropriately assigns penalties. And a machine that has been calibrated incorrectly or whose calibration has drifted will also give erroneous data - the Hubble Space Telescope gave fuzzy pictures during the first few years of its operation because its primary mirror had been ground to slightly the wrong shape.
Some machines and people are obviously less error-prone than others, and indeed, some technicians may never actually make any mistakes in their career. But there is always the possibility of error, and no number of observations on any machine or person can show that a mistake is impossible (recalling our points about sampling error above).
Unintentional Bias. Biases are consistent differences between the data gathered and what the data are thought to represent. In particular, bias is a tendency of the data to fall more on one side of the true average than the other, which distinguishes it from sampling error. Whereas sampling error tends to balance itself out in the average as more observations are gathered, bias persists -- when data are biased, gathering bigger samples does not help, because the average of the data will still differ from the expected average (or the true average). For example, opinion polls are often conducted over the telephone. Data gathered in these surveys do not represent people who lack telephones, and those data would be biased if people lacking phones had consistently different opinions from people with phones. Or consider the frequency of people carrying the AIDS virus. At this time, the frequency in the U.S. population is thought to be something like 1 in 200. But the frequency of people with this virus would be much higher in some groups than in others (prostitutes versus nuns, for example). The data for one subgroup would then be a biased model of the population at large; this bias would be important when calculating the chance of acquiring the virus from a sexual encounter with someone who has lots of other sexual partners, for example.
Unintentional bias is easy to confuse with sampling error. Remember that bias represents a deviation consistently to one side. As an analogy, think of sighting-in a rifle. If the rifle sights are misaligned, the average of the bullets will consistently lie to one side of the bull's-eye, no matter how many shots are fired. This is analogous to bias. Wherever the sights are set, however, bullets will lie in a cluster around the average point of impact; this scatter around the average is akin to sampling error.
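The rifle analogy can be sketched as a simulation. The numbers below (a sight offset of 2.0 units, a scatter of 5.0 units, normally distributed shots) are assumptions chosen purely for illustration. The bull's-eye sits at 0, the sight offset plays the role of bias, and the scatter plays the role of sampling error.

```python
import random
import statistics

def fire_shots(n_shots, sight_offset, scatter, seed=0):
    """Simulate horizontal impact points: each shot lands at the sight
    offset (bias) plus random scatter (sampling error)."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    return [rng.gauss(sight_offset, scatter) for _ in range(n_shots)]

for n in (10, 1000, 100000):
    shots = fire_shots(n, sight_offset=2.0, scatter=5.0)
    print(f"{n} shots: average impact = {statistics.mean(shots):.2f}, "
          f"scatter = {statistics.pstdev(shots):.2f}")
```

As more shots are fired, the average settles down near the offset rather than near the bull's-eye: the extra shots average away the scatter but leave the bias untouched.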
Biases may occur by mistake or deliberately. In this chapter, we restrict attention to accidental or unintentional bias. In a subsequent module, we deal with the problem of deliberate bias, as when people intentionally attempt to deceive.