Chapter 23. Deliberate bias: Conflict creates bad science
When proper scientific procedure is undermined by conflicting
goals so that it results in deception, we say it is biased. This form
of bias is prevalent in advertising - companies universally advocate
their products, emphasizing product assets while concealing product
faults and concealing the advantages of competitor products. You
don't expect an advertisement from a private company to offer a fair
appraisal of the commodity. But bias even exists in the way that
states promote their lotteries by advertising the number of winners
and money "given away" without telling you the number of
losers and money taken in.
Deliberate bias occurs in science as well as in business and
politics. The potential for bias arises when a scientist has some
goal other than (or in addition to) finding an accurate model of
nature, such as increasing profits, furthering a political cause, or
protecting funding. As an example, the environmental consulting firm
that renders an opinion about the environmental effects of the latest
subdivision may not give a completely accurate assessment. The future
work that this firm receives from developers is likely to depend on
what they say. If the developers don't like the assessment, they will
likely find another environmental firm next time. Thus there is a conflict
between obtaining an unbiased model, and making the largest profit
possible. In these cases, the scientists are motivated to present
biased arguments. There are many situations which present a conflict
between scientific objectivity and some other goal.
Drugs and medicine
We take for granted an unlimited supply of medicinal drugs. If we
get pneumonia, gonorrhea, HIV, or cancer, the drugs of choice are
invariably available. They may not cure us, but whatever drugs we
know about (that have been approved) are in abundant supply.
For the most part, this abundance of and reliance on drugs comes
from public trust of health care. We don't imagine that our doctors
try to prescribe us useless or unnecessary drugs (if anything, the
patient often requests drugs when they are unnecessary). But in
reality, many drugs ARE unnecessary, and some drugs are no better
than cheaper alternatives. The Food and Drug Administration (FDA) is
charged with approving new foods and drugs for the U.S. In 1992, it
was approving about 20 new drugs a year but regarded only about 20%
of those as true advances. So many drugs are no better than
alternatives (many are obviously at least slightly worse than
alternatives). And physicians often don't have the evidence to know
which drugs are best.
The goals of consumers are in conflict with those of drug
companies in some respects.
The consumer wants drugs that are
- cost-effective
- safe
- with few side-effects.
If two drugs are equally effective, we want the cheaper one. We may
even not want the most effective drug if a cheaper one will do the
trick.
But the goals of any drug company are different, and may involve
- company reputation
- a successful treatment but not necessarily a cure
- low risk of liability claims (which may include drug safety and few
side effects)
It costs up to hundreds of millions of dollars to get a drug approved
by the FDA now. Much of the cost is in research and trials, but even
FDA consideration itself costs millions. So it is not cheap. Most
important is time, because the sooner a drug hits the market, the
sooner the company reaps the benefits. So drug companies have strong
incentives to market any product that is approved by the FDA -- once
approved, the major costs of money (and time) have already been
borne.
Of course, it does not behoove a company to market a harmful
product -- liability costs can be quite high. But most products that
pass all the hurdles of FDA approval can be regarded as harmless at
worst. The drug company has a very strong incentive to market its
approved products regardless of whether the consumer benefits or not.
One of the most economically successful drugs ever was an ulcer
medicine that reduced suffering. It did not cure ulcers but instead
had to be taken as long as the patient had the ulcer. Research later
found that most ulcers were caused by a bacterium and that treatment
with antibiotics cured the ulcer. So the original ulcer treatment was
based on a misunderstanding of the cause of ulcers.
The drug industry is a relatively new one. Yet it is also a huge
one economically. Perhaps as a consequence of this high-dollar label,
its practices have come under scrutiny in the last decade or so. In
spite of the public trust of health care in this country, many
practices of the drug industry are not in the best interest of the
patient. They instead seem to be directed at promoting and selling
products, and for a prescription drug, the key player in drug sales
is the physician. Drug companies have thus made huge efforts to
influence physicians toward selling company products. Some of the
major (legal) practices that have been uncovered include:
1) Drug companies have paid for
university research on their products and have then blocked
publication of unfavorable results and/or cut continued funding of
the work when the results began to look bad.
2) Pharmacy sales people routinely visit
physicians at work, offering them free lunches, free samples of
medicines, gifts, information to promote their products, and notepads
and pens with company logos. (Next time you visit a physician, look
around the inner rooms for evidence of company logos on charts, pens,
pads, and so on.)
3) To maintain their licenses, physicians
are required to take courses in continuing medical education (CME).
These courses are often sponsored and paid for by drug companies in
exotic locations and with hand-picked speakers who provide favorable
coverage of company products.
4) Drug companies publish ads in medical
journals that look and read like research articles. These ads promote
products.
These practices are not in the interest of good medicine, unless
one assumes that what is good for the drug company is good for us.
DNA
A second, well-documented case in which conflict is manifested is
over DNA typing. These examples may not reflect current debate over
DNA technology, but one should use them to appreciate the strong
potential for conflict over any scientific issue in the legal system.
DNA typing was rushed into use in the U.S. criminal justice system
before guidelines were set for proper DNA typing procedures.
Consequently, there were varying levels of uncertainty in the use of
these methods by law enforcement agencies and commercial labs. They
were also reluctant to admit the uncertainty. The manifestation of
conflict over DNA evidence was thus heated and surfaced in the
popular press on several occasions. We introduce this material in a
prospective manner - by first imagining how the prosecution, defense,
and forensic lab can be expected to behave to achieve their goals in
ways that are contrary to fair scientific procedure. You have already
been exposed to the nature of the models and data in DNA typing, so
now the issue is how the different legal parties deal with the
problems in evaluation and ideal data.
If a case that uses DNA typing has come to trial, we can assume
that the DNA results support the prosecution's case. There are thus
three parties in conflict:
Prosecution <<< (in conflict with) >>> Defense <<< (in conflict with) >>> DNA lab
We can assume that the DNA lab's
results support the Prosecution's case, or there would not be a
trial, so the conflict will lie between the Defense and the other two
agencies. Now consider how this conflict might be manifested.
I) What might the prosecution do to
improve the chances of reaching its goals?
- 1) eliminate test procedures that
benefit the suspect (eliminate standards; build a case of
circumstantial evidence to shield criticism of the DNA fingerprint
evidence).
- 2) harass or impede prior
witnesses for the defense who might testify in the future.
- 3) keep a list of sympathetic
expert witnesses
- 4) Maintain positive relationships with labs that have
contributed to convictions in the past.
II) What might the defense do to improve the chances of reaching its
goals, with respect to the errors and uncertainties of DNA evidence
in specific cases?
- 1) keep a list of sympathetic expert witnesses
- 2) emphasize all inconsistencies in the DNA analyses as evidence of
innocence
- 3) question all assumptions, original data, calculations
III) What might the DNA lab do to improve the chances of reaching its
goals?
- 1) produce results that enhance the goals of their economic
benefactor
- 2) establish a reputation for a lack of indecisiveness; overstate
case
- 3) defend its initial conclusions
Evidence from DNA cases
The case histories available from the last 4-5 years of DNA
forensics verify many of these expectations. In particular, DNA
testing has omitted such basic elements as standards and blind
procedures (I.1 above); the prosecution in the Castro case ignored
inconsistencies in the evidence (I.5, I.6); the lab in the Castro
case overstated the significance of a match and defended such
practices as failing to include a male control for the male-specific
probe (III.2, III.3). Defense and prosecution agencies definitely
keep lists of sympathetic witnesses (I.3, II.1), and defense agencies
indeed choose witnesses to challenge the nature of DNA evidence based
on its (necessarily false) assumptions (II.3). And finally,
harassment by the prosecution of experts who testify for the defense
is well documented, both in the courtroom and outside (I.2). This
harassment includes character assassination on the witness stand,
implied threats to personal liberties for witnesses who were in the
U.S. on visas, and contacts made to journal editors to prevent
publication of papers submitted by the witness. Some of these cases
have been described in the popular press, and others are known to us
through contacts with our colleagues. These latter manifestations of
conflict don't make sense in the context of a single trial, but they
stem from the fact that networks exist in which prosecuting attorneys
share information and parallel networks exist among defense
attorneys. Expert witnesses are often used in multiple trials across
the country, so any expert witness who is discouraged at the end of
one trial will be less likely to participate in future trials, and
conversely, an expert who does well in a trial may be more likely to
participate in future trials.
The suggestions of harassment have even extended to scientists who
merely publish criticisms of forensic applications of DNA typing. In
the 1991-92 Christmas break, two papers were published in Science on
opposite sides of the DNA fingerprinting conflict (Science 254: 1745-50,
and 1735-39). At the same time, news items were also published in
Science, Nature, The New York Times, and The Washington Post, in
which the details of this conflict were aired in full. The
authors of the paper opposing use of current DNA typing methods
(Lewontin & Hartl) were phoned by a Justice Department official
and allegedly threatened with retaliation (having their federal
funding jeopardized); the official denied the threats but did not
deny the phone call. The Lewontin-Hartl article had apparently been
leaked in advance of publication, and an editor for Science contacted
the head editor to have a rebuttal published. However, it turned out
that the editor requesting the rebuttal owns a patent for DNA
fingerprinting and stands to benefit financially by forensic use of
DNA typing methods. The two authors chosen for the rebuttal had been
funded by the FBI to work on DNA fingerprinting methods. So, there
appear to have been some serious conflicts of interest at least on
one side of the issue.
This treatment of conflict in DNA trials has omitted a 4th
category of people whose goals may conflict with the agencies above:
the expert witnesses themselves. The goals of expert witnesses may
be varied, including monetary (the standard rate is $1000/day in
court), notoriety, and philosophical (e.g., some people volunteer to
assist the defense on a no-cost basis, merely to ensure a fair
trial).
EMFs
The last specific example to be discussed concerns some older
examples of conflict between power companies and the populations
exposed to high-intensity electromagnetic fields (EMFs). The primary
conflict has been between the
people exposed to EMFs (e.g. those living under high-voltage power
lines) and corporations or government agencies responsible for
various aspects of electric power. In a series of three articles in
The New Yorker (12, 19, and 26 June, 1989), Paul Brodeur
described in a somewhat sensationalist manner how various agencies
dealt with the potentially damaging evidence that EMF-emitting
utilities may be harmful, the various extents to which they attempted
to obscure and suppress evidence contrary to their views, along with
the suspicious ill fates that befell the careers of scientists who
testified in opposition to these organizations. For example, the Navy
"classified" a report of such evidence to prevent its
dissemination. The New York Power Company attempted to discredit
scientists testifying against it. Witnesses testifying against the
NYPC lost appointments and funding. The President of the National
Academy threatened to sue the Saturday Review for an article
criticizing its own investigation of this matter. Fortunately, the
suppression of possible harmful effects of EMFs appears to have
subsided, and power companies have even recently funded research into
the problem.
The footprints of bias: Generalities
The attempt to deceive in science may take many specific forms. At
a general level, arguments may avoid the scientific method entirely,
or they may instead appear to follow scientific procedure but violate
one or more elements of the scientific method (models, data,
evaluation, revision). The next few sections of this chapter describe
different kinds and levels of possible bias at a level that
transcends specific cases. These generalities are useful in that they
enable you to detect bias without knowing the specifics of the
situation.
The standard scientific approach to evaluating a model is to
gather data. If you suspect bias (e.g., you doubt the claim of
support for a model), the ideal approach is to simply gather the
relevant data yourself and evaluate the claim. But this approach
requires time that none of us have (we can't research everything). In
many cases, blatant examples of bias can be detected by noting some
simple features of a situation.
Look for Conflict of Interest
The first and easiest clue to assist you in anticipating
deliberate bias is conflict of interest. If another party's goal
differs from your goal, and your goal is to seek the
scientific "truth", then there is a good chance that that
party is biased -- just as you may be biased if YOUR goal differs
from seeking scientific truth. Service on many Federal panels
requires a declaration of all conflicts of interest in advance (and
you will be excused from consideration where those conflicts lie).
That is, the government avoids the mere appearance of bias based on
the existence of conflict, without looking for evidence of actual
bias. However, in our daily lives, we are confronted with conflict at
every turn, and we can't simply avoid bias by avoiding interactions
involving conflict (e.g., every time you make a purchase, there is a
conflict of interest between you and the seller). Thus, being aware
of conflict is a first step in avoiding bias, but you can also
benefit by watching for a few symptoms of bias.
Non-scientific arguments (blatant bias)
Sometimes, someone is so biased that they resort to lines of
reasoning and argumentation that are clearly in violation of science.
These cases are easy to expose, because they can be detected without
even looking at data or analysis. And many of them are already
familiar to you, as given in the following table:
Arguments in Violation of the Scientific Method
| Appeal to authority | Appeal to authority is the defense of a model by indicating that the model is endorsed by someone well known (an authority). A model should stand on its own merits. The fact that a particular person supports the model is irrelevant, though the specifics of what they have to say may assist you in evaluating the model. |
| Character assassination of opponent | Character assassination is the attempt to discredit someone's character (e.g., point out that they associate with undesirable people, etc.). The character of somebody is irrelevant to the evidence they present that supports or refutes the model. We should evaluate the evidence, not the person presenting it. |
| Refusal to admit error | Refusal to admit error is the refusal to specify the conditions under which a model should be rejected or the refusal to accept its refutation in the face of solid evidence against it. All models are false, and anyone who refuses to discuss how their model could be seriously in error is obscuring a fair appraisal of their model (or is using an unfalsifiable model). |
| Identify trivial flaws in an opponent's model | This violation refers to the practice of searching for unimportant details about a model that are false, and using those minor limitations as the basis for refuting the model. The fact that all models are false does not mean that all are useless. Yet it is a common trick of lawyers to harp endlessly on the fact that a particular model advocated by their opponent is not perfect and thus should be abandoned. |
| Defend an unfalsifiable model | A model must be falsifiable to be useful. "Falsifiable" merely means that it could be refuted if the data turn out to be a certain way. An unfalsifiable model is one that is framed so that it cannot be refuted no matter how the data turn out. Creationists, for example, adopt and then defend an unfalsifiable model. By contrast, science is predicated on the assumption that all models will eventually be overturned. |
| Require refutation of all alternatives | A special case of defending an unfalsifiable model, this one is subtle. It takes the form of insisting that a class of models is correct until all variations of them have been rejected. As an example, we might refuse to accept that the probability of Heads in a coin flip is 0.5 unless we reject all alternatives to 0.5. Whereas it is possible to refute that the probability of Heads in a coin flip is 1/2, it is impossible to refute that the probability of Heads is anything other than 1/2, because that would mean showing it is exactly 1/2. (It would take an infinite number of flips to reject everything other than 1/2.) This argument also takes the form of claiming that there is some truth to a model until it has been shown that there is nothing to it at all. |
| Use anecdotes and post hoc observations | This category represents a non-systematic presentation of special cases made in defense of (rather than as a test of) a particular model. An anecdote is an isolated, often informal observation made without a systematic, thorough evaluation of the available evidence. As a selected observation, it is not necessarily representative of a systematic survey of the relevant observations. Post hoc observations are observations made after the fact, often to bolster a particular model. It is easy to select unrepresentative data that support almost any model. |
Perhaps the most subtle but useful of these points is the refusal
to admit error. In science, models are tested precisely because the
scientist acknowledges that every model has imperfections which may
warrant its abandonment. Someone who is trying to advocate a model
may want to suppress all doubt about its imperfections and thus
suggest that it can't be wrong. That attitude is a sure sign that the
person is biased. Of course, in many cases you will already know that
the person is biased (as with a car salesperson), and the best that
you can hope for is to determine how much they deviate from
objectivity.
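The coin-flip example in the table (under "Require refutation of all alternatives") can be made concrete with a short calculation. The sketch below is ours and rests on an assumption not made in the text: it uses the usual normal-approximation confidence interval for a proportion, and the function name is purely illustrative. It shows that even when exactly half of the flips are Heads, a range of values other than 1/2 always survives, however many flips we make.

```python
import math

def wald_interval(heads, flips, z=1.96):
    """Approximate 95% confidence interval for the probability of Heads.

    Uses the normal (Wald) approximation, which is an assumption of this
    sketch rather than a method taken from the chapter.
    """
    p_hat = heads / flips
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / flips)
    return p_hat - half_width, p_hat + half_width

# Best case for the claim "p = 1/2": exactly half of the flips are Heads.
for flips in (100, 10_000, 1_000_000):
    low, high = wald_interval(flips // 2, flips)
    print(f"{flips:>9} flips: cannot reject any value in [{low:.4f}, {high:.4f}]")
```

The interval shrinks as the number of flips grows, but its width never reaches zero, so a demand that every alternative to 1/2 be rejected can never be met.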
Subtle violations of the scientific method: Experimental design
The template for ideal data presented earlier is a strategy for
producing data with a minimum of bias. But the template can be
applied in many ways, and someone with a goal of biasing data can
nonetheless adhere to this template and still generate biased data.
Let's consider a pharmaceutical company testing the efficacy of a new
drug. How many ways can we imagine that the data reported from such a
study might be deliberately biased, when the trials are undertaken by
the company that would profit from marketing the drug? The following
table lists a few of the possibilities.
Bogus designs
| Violation of accepted procedure | Impact |
| Change design in mid-course | An investigator may terminate an experiment prematurely if it is producing unwanted results; if the experiment is never completed, it will not be reported. |
| Assay for a narrow spectrum of unlikely results | The public well being is many-faceted, and a product is unlikely to have a negative impact on more than a few facets. With advance knowledge of the likely negative effects (e.g., a drug causes brain cancer), a study can be designed to purposefully omit measuring those negative effects and focus on others (e.g., colon cancer). Were the subjects a fair sample of the relevant population? The medicine might be more effective on some age groups than others, so the study might be confined to the most responsive age groups (determined in preliminary trials). While the data would be accurate as reported, details of the age group might be omitted to encourage a broader interpretation of the results than is warranted. |
| Protocol concealed | It is easy to write a protocol that conceals how the study was actually conducted in some important respects. For example, was a blind design really used? Although a blind design exists on paper, it is possible to let patients and staff know which patients belong to which groups. Indeed, patients can sometimes determine whether they are receiving the drug or placebo. Were the controls treated in exactly the same manner as the group receiving the medicine? It is possible to describe countless ways in which the control group and treatment group were treated similarly, yet to omit ways in which they were treated differently. The medicine might be given along with some other substance that can affect patient response, with this additional substance being omitted from the placebo. |
| Small samples | Science often assumes "innocent until proven guilty" in interpreting experiments designed to determine if a product is hazardous. Small samples increase the difficulty of demonstrating that a compound is hazardous, even when it really is. |
| Non-random assignments | Most studies, especially those of humans, begin with enough variation among subjects that random assignment to control or treatment groups is essential to eliminate a multitude of confounding factors. Clever non-random assignments could produce a strong bias in favor of either outcome. |
| Pseudo controls | There are many dimensions to the proper establishment of controls, including assignment of the control groups and subsequent treatment of the controls. It is possible to describe many ways in which a control is treated properly while omitting other ways in which a control is treated differently. |
We can obviously find additional ways to bias the outcome of
tests. Short of undertaking the study yourself, or having a neutral
organization conduct the study, there are always ways to present a
biased model comparison while nonetheless being absolutely truthful
about the experimental design.
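The "Small samples" entry in the table above lends itself to a short calculation. The sketch below is an illustration under assumptions of our own, not a description of any actual trial: we suppose an adverse event occurs in 5% of untreated people and in 10% of people taking the drug, and that the hazard is declared only if a one-sided exact binomial test rejects "no worse than 5%" at the 0.05 level. All rates and function names are hypothetical.

```python
from math import comb

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power_to_detect_hazard(n, baseline=0.05, true_rate=0.10, alpha=0.05):
    """Chance that a study of n subjects flags a drug that really is hazardous.

    The null model is "no worse than baseline"; the rates are hypothetical.
    """
    # Smallest adverse-event count that would reject the null model.
    critical = next(k for k in range(n + 2) if upper_tail(n, k, baseline) <= alpha)
    # Probability of reaching that count when the drug truly doubles the rate.
    return upper_tail(n, critical, true_rate)

for n in (20, 100, 500):
    print(f"n = {n:>3}: chance of detecting the hazard = {power_to_detect_hazard(n):.2f}")
```

Under these assumptions a study of 20 subjects would usually miss a drug that doubles the adverse-event rate, while a study of 500 would almost always catch it. An investigator who wants a "no hazard found" result has an obvious incentive to keep the sample small.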
Biased model evaluation
Even when the raw data themselves were gathered with the utmost
care, there is still great opportunity for bias. Bias can arise as
easily during data analysis, synthesis and interpretation, as during
data gathering. This idea is captured in the title of a book
published some years ago, "How to Lie With Statistics." Two
methods of biasing evaluation are (i) throwing out some of the
results, and (ii) searching for a statistical test to support a
desired outcome.
Throwing out results. We often assume that a study reports
all relevant results. But studies often have (valid) reasons for
throwing out certain results. Throwing out results can also bias a
study, however. If we flip a coin ten times, and we repeat this
experiment enough times, we will eventually obtain ten heads in some
trials and ten tails in others. We might then report that a random
coin flip test produced ten heads (or tails), even though the entire
set of results produced an equal number of heads and tails - by
failing to report some results, we have biased those that we do
report. For example, a product test may have been repeated many
times, with the ones finally published being limited to those
favoring the product.
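The coin-flip argument above can be checked with a small simulation. This is a minimal sketch, assuming a fair coin and an arbitrary choice of 20,000 repeated experiments; the names and the random seed are ours. It compares an honest summary of all the flips with a "report" that keeps only the experiments that came up all heads.

```python
import random

FLIPS = 10  # flips per experiment

def run_experiments(num_experiments=20_000, seed=1):
    """Return the number of heads in each simulated ten-flip experiment."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(FLIPS))
            for _ in range(num_experiments)]

heads_counts = run_experiments()

# Honest report: pool every flip from every experiment.
honest_fraction = sum(heads_counts) / (FLIPS * len(heads_counts))

# Biased report: keep only the experiments that came up all heads.
reported = [h for h in heads_counts if h == FLIPS]
reported_fraction = sum(reported) / (FLIPS * len(reported)) if reported else None

print(f"all experiments: fraction of heads = {honest_fraction:.3f}")
print(f"reported subset: {len(reported)} experiments, fraction of heads = {reported_fraction}")
```

Every individual flip in the reported subset is genuine, yet the subset suggests a coin that always lands heads. The bias lies entirely in which results were discarded.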
This principle applies widely. In a court case, the defense will
only present the data that they have that tends to exonerate their
client. In other cases, the models being tested during the evaluation
and revision step of the scientific method are not representative of
all those that could be compared. For example, the U.S. Forest
Service, when writing management plans for National Forests, is often
required to compare several alternative management options. The bias
arises because the alternative options (i.e., models) that the U.S.
Forest Service considers are not always representative of all
possible management options. Hence the models being evaluated are a
biased subset of all those conceivable.
Searching for the "right" test. There are
hundreds of ways to conduct statistical tests. Some study designs fit
cleanly into standardized statistical procedures, but in many cases,
unexpected results dictate that statistical tests be modified to suit
the circumstances. Thus, any one data set may have dozens to hundreds
of ways of being analyzed. In reporting the results, someone may bias
the evaluation step by reporting only those tests favorable to a
particular goal. We should point out that this practice offers a
limited opportunity to bias an evaluation. If the data strongly
support a particular result, it won't be easy to find an acceptable
test which obscures that result.
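A rough sense of how much room this practice leaves can be had from a simulation. The sketch below is a simplification of our own: it stands in for "many ways of analyzing one data set" with 20 independent comparisons per study (in practice, alternative tests of the same data are correlated), it assumes the product has no effect at all, and it uses a two-sample z-test with a known standard deviation of 1. All names and settings are hypothetical.

```python
import math
import random

def two_sided_p(treated, control):
    """p-value of a two-sample z-test for equal-sized groups, assuming sd = 1."""
    n = len(treated)
    diff = sum(treated) / n - sum(control) / n
    z = diff / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(num_studies=500, analyses_per_study=20, n=30, alpha=0.05, seed=1):
    """Return (false-positive rate per analysis, share of studies with any 'hit')."""
    rng = random.Random(seed)
    single_hits = 0
    studies_with_a_hit = 0
    for _ in range(num_studies):
        p_values = []
        for _ in range(analyses_per_study):
            # No real effect: both groups are drawn from the same distribution.
            treated = [rng.gauss(0, 1) for _ in range(n)]
            control = [rng.gauss(0, 1) for _ in range(n)]
            p_values.append(two_sided_p(treated, control))
        single_hits += sum(p < alpha for p in p_values)
        studies_with_a_hit += min(p_values) < alpha
    return single_hits / (num_studies * analyses_per_study), studies_with_a_hit / num_studies

per_analysis, any_hit = simulate()
print(f"false positives per analysis: {per_analysis:.3f}")
print(f"studies with at least one 'significant' analysis: {any_hit:.3f}")
```

Each analysis on its own is misleading only about one time in twenty, but a report that is free to pick the most favorable of twenty analyses will find something "significant" in well over half of such studies. The simulation covers only the case of no real effect; as noted above, when the data strongly support a result, selective testing has much less room to hide it.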
Very subtle bias: Controlling the null model
As noted in a previous chapter, many evaluations are based on a null
model approach: the null model is accepted until proven wrong. To
"control" the null model means to "choose" the null model.
Choice of the null model can have a big effect on the outcome of even
the most unbiased scientific evaluation, for the simple reason that a
null model is accepted until proven wrong. Any uncertainty or
inadequacy in the data will thus rule in favor of the null model. Many
of the studies testing a null model will therefore "accept" it, not
because the evidence for it is strong, but because the evidence
against it is weak. As a consequence, the null model enjoys a
protected status, and it is to any party's advantage to be the one
who chooses which model is adopted as the null model. Choice of the
null model in this sense does not even mean developing it or
proposing/inventing it. Given a set of alternatives decided upon in
advance, controlling the null model means simply selecting which
model from that set is adopted as the null.
Consider the two alternative models that might be used in
approving a new food additive for baby formula:
(a) the additive is harmful
(b) the additive is safe
As the null model, (a) requires a rigorous demonstration of the
safety of a food additive before it is approved. In contrast, (b)
requires that an additive can be used until a harmful effect is
demonstrated. As noted in the Data chapters, an enormous sample size
might be required to demonstrate a mild harmful effect, so a harmful
product could reach market much more easily under null model (b) than
under (a).
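The effect of the two null models can be sketched numerically. The figures here are hypothetical and chosen only for illustration: we suppose an adverse-event rate of 5% counts as safe and 10% counts as harmful, and that a trial of 100 infants records 7 adverse events. The function names and the use of an exact binomial test are our own assumptions.

```python
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def decision(n, events, null, safe_rate=0.05, harmful_rate=0.10, alpha=0.05):
    """Return 'approve' or 'withhold' for an additive under the chosen null model."""
    if null == "harmful":
        # Model (a): approve only if the data show the rate is below harmful_rate.
        p_value = sum(binom_pmf(n, k, harmful_rate) for k in range(0, events + 1))
        return "approve" if p_value < alpha else "withhold"
    if null == "safe":
        # Model (b): withhold only if the data show the rate exceeds safe_rate.
        p_value = sum(binom_pmf(n, k, safe_rate) for k in range(events, n + 1))
        return "withhold" if p_value < alpha else "approve"
    raise ValueError("null must be 'harmful' or 'safe'")

# The same ambiguous data, judged under each null model.
for null in ("harmful", "safe"):
    print(f"null model '{null}': {decision(100, 7, null)}")
```

With these numbers the data are too weak to reject either null model, so the additive is withheld under (a) and approved under (b). The evidence is identical in both cases; only the choice of null model differs.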
Choice of the null model represents a powerful yet potentially
subtle way in which an entire program of research can be biased.
Every other aspect of design, models used, and evaluation could meet
acceptable standards, yet choice of a null model favorable to one
side in a conflict will bias many outcomes in favor of that side.
Minimizing the abuses
Recognizing the possible abuses of science is the simplest and
most effective way to avoid being subjected to them. Beyond this, we
can think of no single simple rule to follow that will minimize the
opportunity for someone to violate the spirit of science and present
misleading results -- there are countless ways to bias data. One
strategy to avoid bias is to require detailed, explicit protocols.
Another is to have the data gathered by an individual or company
lacking a vested interest in the outcome. But even with these
policies, there is no guarantee that deliberate biases can be weeded
out. The following table gives a few pointers.
Ensuring legitimate science
| Property of Study | Impact |
| Publish protocols in advance of the study | Prevents mid-course changes in response to results; enables requests for design modifications with little cost. |
| Publish the actual raw data | Enables an independent researcher to look objectively at the data, possibly uncovering any attempts to obfuscate certain results. |
| Specify evaluation criteria before obtaining results | Minimizes after-the-fact interpretation of data. |
| Anticipate vested interests | Conclusions of individuals, corporations, and political bodies can be predicted with remarkable accuracy by knowing their financial interests and their political and ideological leanings. Understanding these data helps immensely in understanding how they may have used biased (but perhaps well-intentioned) methods in arriving at conclusions. |