Methodological and Statistical Issues in Adult Nutritional Research
May 1, 2009
In general, an overriding purpose of all science is to find the truth (Table 1). 1-3 To accomplish this goal, science, including nutritional research, generally splits its work into three categories: work that describes and classifies, work that explains, and work that predicts. In clinical nutrition, which works extensively in all three categories, acceptance of a statement as “true” requires satisfaction of both definitions given in Table 1. To obtain the truth, a series of assumptions must be made (Table 2). See notes 3-6 for a further discussion of these assumptions and their rationale. Based on their “track records,” the nutritional and medical sciences, although fallible, are capable of approaching and even finding the truth only if rigorous scientific methods are used.4-7
The complex and difficult problem of causality is central to our understanding of nutrition research.4-8 A cause is defined as “that factor which is possible or convenient for us to alter in order to produce or prevent an effect. This concept contains two components: production of an effect and an understanding of its mechanisms.”5, 6, 8 To understand current concepts of causality, it is helpful to briefly review historical thinking about it (Table 3). Aristotle believed that bodies in motion required constant force (efficient cause) to keep them moving and that the seed contained the adult (teleological cause). After more than 2,000 years, Newton overturned Aristotle in physics with the concept of inertia. Hume further advanced our understanding by postulating that our notion of causality depends on well-documented associations. Kant believed, partially correctly, that the mind (brain) imposes notions of time, extension, and causality on nature.
More recently, the concepts of necessary and sufficient cause and Koch’s postulates were clearly delineated. Finally, for many situations (e.g., high cholesterol as a “cause” of atherosclerosis, which leads to heart disease and stroke) the idea of contributory causality emerged.4-6 (In the case of human atherosclerosis, the “cause” is thought to be due to many factors, including cholesterol, macrophages, platelets, lipoproteins, cytokines, leukotrienes, etc. There is not just a single cause.) Contributory causes are sometimes termed precipitating, predisposing, or sustaining causes and are particularly relevant in nutrition research.4-8 Moreover, the concept of contributory causality often requires statistical thinking and its attendant science, since not everyone with the “cause” (e.g., high cholesterol) develops disease nor does everyone lacking the “cause” fail to develop disease.4-8 Therefore, because of the nature of contributory causality, large, randomized, long-term controlled clinical trials, especially when effects are modest or slow, are sometimes required to establish the value of certain interventions, e.g., lowering serum cholesterol with diet and/or drugs.4-6, 8
In general, there are three main types of studies in adult human nutritional research to establish causality as shown in Table 4. All three types can play an important role where appropriate. For example, anecdotes can occasionally be definitive. The remarkable responses of unconscious thiamine-deficient patients (with Wernicke’s encephalopathy) to intravenous thiamine (vitamin B1) or comatose hypoglycemic patients to intravenous glucose do not require fancy statistical analyses to see that the intervention (intravenous thiamine or glucose, respectively) is the variable causatively responsible for the response of the patient’s awakening. These are two straightforward examples of necessary and sufficient causality at work. However, many problems in nutrition involve contributory causality and require more complex methods.
General evidentiary requirements in clinical nutrition to prove hypotheses of the type “A causes B” are shown in Table 5.4, 6 To satisfy these criteria, epidemiological/observation studies (EOS; Table 4) are often employed specifically to determine whether an association exists between two variables (e.g., an outcome and another variable) and sometimes whether the association is causal.3-8 As noted above, in EOS, we are generally testing a hypothesis retrospectively in a case-control study or prospectively in some cohort studies. (For example, in a case-control, cross-sectional, or cohort study, those with a certain beneficial effect or disease [outcome] may be questioned about the use of a certain food or supplement [putative cause] and compared with controls. When an apparent association between the outcome and putative cause emerges, then the odds of developing the outcome can be calculated in those with the putative cause versus those without. If the association is strong, a causal relationship between the putative cause and outcome is often [frequently incorrectly] assumed.)2-4, 6 Of course, the fundamental problem with EOS is that they are not randomized. 4 Without randomization, one can never be sure the controls are the same as the “experimentals” (e.g., those with the outcome) in everything else except the one variable (the cause) of interest. Therefore, EOS are subject to bias and confounding. Moreover, when it is clear that the two groups are not comparable, attempts to correct the imbalances after the fact are fraught with problems. In the end, often just guesswork or other unproven hypotheses are applied to already questionable data (see below under statistics).
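The odds calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `odds_ratio` and all counts are hypothetical, invented for the example, and are not taken from any study cited here.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table of putative cause (exposure) by outcome.

    Assumes every cell is greater than zero; a zero cell would need a
    continuity correction, which this sketch does not attempt.
    """
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical counts, invented for illustration only:
# 30 of 100 cases and 20 of 100 controls report using the supplement.
example_or = odds_ratio(30, 70, 20, 80)
print(f"odds ratio = {example_or:.2f}")
```

With these invented counts the odds ratio is about 1.7. The arithmetic says nothing about whether the two groups were comparable to begin with, which is exactly the weakness of non-randomized comparisons.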
Many years ago, Hill published criteria for deciding causation when association occurs in EOS (Table 6). 3-5 Hill pointed out the need for employing these criteria, especially the strength of associations, including dose-response relationships, when trying to establish causation. 3 (In philosophy, this is now termed manipulability theory and is an updating of Mill’s Method of Concomitant Variation.) Hill was concerned about the potential to obtain misleading associations and, even worse, false causal associations, unless his criteria were employed. Hill also pointed out that you cannot use statistical analyses in EOS because they are not randomized and hence do not satisfy a key assumption on which all comparative statistical analyses are based (see below). (Obviously, in addition to the criteria listed in Table 6, the study must be properly and honestly performed, not just the result of “data dredging.”)3-5 Perhaps the best example of the success of the Hill approach is the relationship between smoking and cancer of the lungs. In the case of smoking, heavy smoking “causes” a ten- or twentyfold increase in lung cancer and there is a “dose-response” relationship. 3-5 Most of the rest of the criteria in Table 6 are also satisfied. Another example is the spread of papilloma virus (that causes cervical cancer) through sexual activity. However, in many alleged “nutritional causal relationships” based on EOS, the relative risk is less than two, and the attendant statistical analyses are not valid. (See below.)4, 5
In general, the most powerful method to establish the truth of many nutritional hypotheses, when contributory causality is postulated, is the prospective, randomized, controlled trial whose methodological components are shown in Table 7. 2-5, 6 This type of trial is the “gold standard” used by regulatory bodies worldwide, including the U.S. FDA for licensure of nutritional claims and drugs. 3-5, 6 Such trials can focus on one hypothetically causal variable. (A discussion of the important roles of randomized withdrawal and rechallenge trials in obtaining nutritional truths is beyond the scope of this article [see note 3]). In most prospective, randomized clinical trials (Table 7), the hypothesis tested is that no difference exists in a single-outcome variable (e.g., alive or dead) between control (often placebo) and experimental treatment (e.g., a nutritional intervention like megavitamin E), the so-called null hypothesis. This can be either an efficacy or safety hypothesis. The statistical probability of the outcome (p-value) is calculated assuming the null hypothesis is true. If the p-value of the comparison is 0.05 or less and the trial has sufficient power, the null hypothesis is unlikely to be true. 4 The smaller the p-value, the less likely the truth of the null hypothesis. Ideally, whenever possible, these trials are blinded; neither sponsor, patient, nor investigator knows who receives which treatment until the study is finished and all the data are cleaned and “locked” in preparation for unblinding. If enough persons are included in the clinical trial, randomization (and the blinding processes if employed) minimizes or eliminates the chance for bias, i.e., baseline differences between the groups (see below).
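To make the p-value calculation concrete, the sketch below applies Fisher's exact test to a two-arm trial with an alive-or-dead outcome. The choice of Fisher's exact test is an assumption made for illustration (the text does not name a specific test), the function name `fisher_exact_two_sided` is hypothetical, and the trial counts are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = deaths and survivors in the treatment arm;
    c, d = deaths and survivors in the control arm.
    Under the null hypothesis (treatment makes no difference), the number of
    deaths in the treatment arm follows a hypergeometric distribution.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of exactly x deaths in the treatment arm under the null.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at least as improbable as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical trial, invented for illustration: 12 of 100 die on treatment,
# 25 of 100 die on placebo.
p_value = fisher_exact_two_sided(12, 88, 25, 75)
print(f"p = {p_value:.4f}")
```

The p-value here is the probability, computed under the null hypothesis, of a split of deaths between the arms at least as extreme as the one observed. It is valid as a statement about chance only because assignment to the arms was random.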
Although the prospective, randomized, controlled clinical trial, especially when blinded, is presently the most powerful method for approaching or finding the truth about contributory causality in clinical nutrition and medicine, regulatory authorities usually require two replicate trials before licensure. 2-5, 6 This, in part, is because when results are of borderline statistical significance (i.e., p just less than 0.05), there is a reasonable chance that an identical repeat of the clinical trial will not show a statistically significant (p < 0.05) result. The probability of two consistent replicate trials (both p < 0.05) being incorrect, however, is very low. 4, 6 A single, large, blinded, controlled, randomized nutritional or clinical trial that is highly statistically significant (p < .001) is also likely to be correct. 4 Of course, the nutritional and clinical (as opposed to statistical) significance of the results depends on the importance of the hypothesis tested, the choice of comparator, and the quantitative size of the difference. When these criteria have been met, few mistakes in ascertaining the truth of the hypotheses tested have occurred. It is also worth noting that the FDA generally demands “substantial” scientific evidence for both safety and efficacy before approving and allowing claims. Finally, extrapolation of the results of successful trials to populations is only reasonable when the participants in the trial mirror the population.
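The arithmetic behind the replication requirement can be sketched as follows. Two simplifying assumptions are made here and are not claims from the text: the two trials are independent, and, for the borderline case, the true effect is exactly the size the first trial observed, so that the replicate's test statistic is normally distributed around the first trial's z-value. The function name `replication_probability` is hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05

# If the null hypothesis is true, each independent trial reaches p < 0.05
# with probability ALPHA, so two in a row do so with probability ALPHA squared.
both_false_positive = ALPHA ** 2
print(f"chance that two null trials both reach p < 0.05: {both_false_positive:.4f}")


def replication_probability(z_true, alpha=ALPHA):
    """Chance that an identical repeat trial reaches two-sided p < alpha in the
    same direction, ASSUMING the true standardized effect equals z_true."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return 1 - NormalDist(mu=z_true, sigma=1).cdf(z_critical)


# First trial borderline (p just at 0.05): the true z sits on the threshold,
# so the repeat is significant only half the time.
z_borderline = NormalDist().inv_cdf(1 - 0.05 / 2)
print(f"repeat succeeds after p = 0.05:  {replication_probability(z_borderline):.2f}")

# First trial highly significant (p = 0.001).
z_strong = NormalDist().inv_cdf(1 - 0.001 / 2)
print(f"repeat succeeds after p = 0.001: {replication_probability(z_strong):.2f}")
```

Under these assumptions a borderline result replicates only about half the time, while a result at p = 0.001 replicates roughly nine times in ten, which is consistent with regulators asking for either two concordant trials or one highly significant trial.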
Another critical point is the nature of the outcome variable chosen. What we really want to know are outcomes that are important. For example, in cardiovascular disease trials, we want to prevent angina, heart attacks, strokes, and death. So-called “surrogate” outcomes like serum cholesterol or blood homocysteine are scientifically interesting but can be very misleading. For example, it is not always true that lowering serum cholesterol leads to better outcomes. In fact, lowering cholesterol with the drug clofibrate led to more disease, not less. Thus, the cholesterol hypothesis remains unsettled: although there is no doubt that lowering cholesterol with statin drugs is highly beneficial in certain populations,6, 7 these drugs (statins) have many other effects, and it is not clear whether lowering cholesterol or the other effects of these drugs or both leads to less morbidity (heart attacks and strokes) and lower mortality. 6, 7, 9
Probability concepts and statistical thinking play an important role in nutritional research (see Table 8). 4 As noted above, if randomization (defined in Table 9) is not a part of comparative hypothesis testing, a crucial assumption of statistical usage is not met, and application of statistical tests is chancy and pseudoscientific because the potential consequences of non-randomization are bias and confounding (Table 9). 3, 4 There are two general types of bias: intentional bias, in which the investigator’s mind has predetermined what is happening, and unintentional bias. 3, 4 Over twenty types of unintentional bias have been documented in EOS. 3, 4 Because of these problems of non-randomization, bias, and confounding (Table 9), EOS (Table 4) can only occasionally establish causality fairly conclusively (e.g., smoking and lung cancer or sexual activity and papilloma virus infections), as noted above.
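A small simulation can illustrate how confounding produces a spurious association and how randomization removes it. Everything in this sketch is hypothetical: the confounder ("health-conscious" behavior), the probabilities, and the function name are invented, and the supplement is given no effect at all on the outcome.

```python
import random


def relative_risk_in_simulated_study(n, randomized, seed=0):
    """Relative risk of disease (supplement users vs. non-users) in a
    simulated population where the supplement has NO true effect."""
    rng = random.Random(seed)
    # counts[takes_supplement] = [number of people, number with disease]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        health_conscious = rng.random() < 0.5  # hypothetical confounder
        if randomized:
            # Trial: a coin flip decides who gets the supplement.
            takes_supplement = rng.random() < 0.5
        else:
            # Observational setting: health-conscious people choose it more often.
            takes_supplement = rng.random() < (0.8 if health_conscious else 0.2)
        # Disease risk depends only on the confounder, never on the supplement.
        disease = rng.random() < (0.1 if health_conscious else 0.3)
        counts[takes_supplement][0] += 1
        counts[takes_supplement][1] += disease
    risk_users = counts[True][1] / counts[True][0]
    risk_non_users = counts[False][1] / counts[False][0]
    return risk_users / risk_non_users


observational_rr = relative_risk_in_simulated_study(20000, randomized=False)
randomized_rr = relative_risk_in_simulated_study(20000, randomized=True)
print(f"observational relative risk: {observational_rr:.2f}")
print(f"randomized relative risk:    {randomized_rr:.2f}")
```

With these invented probabilities, the expected relative risk in the observational setting is 0.14/0.26, or about 0.54, so the inert supplement appears to nearly halve disease. In the randomized setting the expected relative risk is 1, because the coin flip balances the confounder across the two groups.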
Moreover, in many EOS nutrition trials, the measurement instruments are un-validated or poorly validated (e.g., questionnaires about diet, vitamin use, etc.; Table 9). 3, 5 The use of poorly validated or un-validated measurement instruments is another reason for the poor track record of many EOS. 1-5, 10
Finally, the notion of falsification in Table 9 requires comment. Popper and others noted how difficult it is to verify certain propositions (e.g., all swans are white), as Hume emphasized centuries ago, since not every swan can be assessed. 4-6 Only one black swan will disprove the proposition. Verification is especially difficult in testing contributory causality. Popper argued successfully that in all science falsification is easier to understand and embrace than verification. Therefore, nutritional and pharmacological research is often oriented toward falsifying the null hypothesis. The FDA and other regulatory bodies support this approach. Moreover, as in all nutritional and medical research, subtle undefined and unknown confounding variables may still exist.
Thus, in EOS, the lack of randomization, the frequent use of inadequate instruments, and the retrospective nature of hypothesis testing in case-control and cross-sectional studies can often be expected to produce bias, confounding, and erroneous associations and conclusions. This is not just a theoretical concern; in fact, this often happens. 3-5, 10 Many examples of claims strongly supported by EOS have subsequently been shown to be erroneous in large, prospective, blinded, randomized comparative trials. 3-5, 6, 7, 11-18 Thus, unless the Hill criteria (Table 6) are met, EOS associations and conclusions must be considered, at best, hypothesis-generating. 1-5, 10 In other words, one never knows which EOS might be correct and which ones are “false or misleading or non-causal” associations. Moreover, in many cases, there were strong a priori reasons to think the results of the EOS were unlikely to be correct. For example, it was extremely unlikely that mega-vitamin supplements (in non-deficient people) would affect cognitive decline, notwithstanding the positive EOS;19, 20 there was never biological plausibility (Table 6) for such hypotheses.
In summary, the use of appropriate methods and statistical analyses in adult nutrition research is critical for finding the truth. Without them, nutrition research and practice are often harmful guesswork or pseudoscience.
- Taubes, G. 2007. Good Calories, Bad Calories. New York, Alfred A. Knopf.
- Taubes, G. 2007. “Do We Really Know What Makes Us Healthy?” New York Times Magazine, p. 52, Sept. 16.
- Spector, R., and E.S. Vesell. 2000. “The Pursuit of Clinical Truth: Role of Epidemiology/Observation Studies.” Journal of Clinical Pharmacology 40: 1205–1210.
- Spector, R., and E.S. Vesell. 2006. “Pharmacology and Statistics: Recommendations to Strengthen a Productive Partnership.” Pharmacology 78: 113–122.
- Spector, R., and E.S. Vesell. 2002. “Which Studies of Therapy Merit Credence? Vitamin E and Estrogen Therapy as Cautionary Examples.” Journal of Clinical Pharmacology 42: 1–8.
- Spector, R., and E.S. Vesell. 2006. “The Power of Pharmacological Sciences: The Examples of Proton Pump Inhibitors.” Pharmacology 76: 148–156.
- Spector, R., and E.S. Vesell. 2006. “The Heart of Drug Discovery and Development: Rational Target Selection.” Pharmacology 77: 85–92.
- Woodward, J. 2003. Making Things Happen: A Theory of Causal Explanation. New York, Oxford University Press.
- Taubes, G. 2008. “What’s Cholesterol Got to Do With It?” The New York Times, p.18, Jan. 27.
- Tatsioni, A., N.G. Bonitsis, and J.P.A. Ioannidis. 2007. “Persistence of Contradicted Claims in the Literature.” Journal of the American Medical Association 298: 2517–2526.
- Moloo, J. 2008. “Dietary Supplements Don’t Prevent Cognitive Decline, CVD, or Infections.” Journal Watch 28: 7–8.
- Yaffe, K. 2007. “Antioxidants and Prevention of Cognitive Decline: Does Duration of Use Matter?” Archives of Internal Medicine 167: 2167–2168.
- Peters, U., M.F. Leitzmann, N. Chatterjee, et al. 2007. “Serum Lycopene, Other Carotenoids, and Prostate Cancer Risk: A Nested Case-Control Study in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.” Cancer Epidemiology, Biomarkers and Prevention 16: 962–968.
- Kang, J.H., N. Cook, J. Manson, et al. 2006. “A Randomized Trial of Vitamin E Supplementation and Cognitive Function in Women.” Archives of Internal Medicine 166: 2462–2468.
- Espeland, M.A., and V.W. Henderson. 2006. “Preventing Cognitive Decline in Usual Aging.” Archives of Internal Medicine 166: 2433–2434.
- Jamison, R.L., P. Hartigan, J.S. Kaufman, et al. 2007. “Effect of Homocysteine Lowering on Mortality and Vascular Disease in Advanced Chronic Kidney Disease and End-Stage Renal Disease.” Journal of the American Medical Association 298: 1163–1170.
- Cook, N.R., C.M. Albert, M. Gaziano, et al. 2007. “A Randomized Factorial Trial of Vitamins C and E and Beta Carotene in the Secondary Prevention of Cardiovascular Events in Women.” Archives of Internal Medicine 167: 1610–1618.
- Brunner, E. 2006. “Oily Fish and Omega 3 Fat Supplements.” British Medical Journal 332: 739–740.
- Spector, R., and C. Johanson. 2006. “Micronutrient and Urate Transport in Choroid Plexus and Kidney: Implications for Drug Therapy.”
- Spector, R., and C. Johanson. 2007. “Vitamin Transport and Homeostasis in Mammalian Brain: Focus on Vitamins B and E.” Journal of Neurochemistry 103: 425–438.