Principles for Evaluating Epidemiologic Data in Regulatory Risk Assessment

Developed by an Expert Panel at a Conference in London, England, October 1995


Preliminary and summary issues to be considered in assessing the utility of an epidemiologic study for risk assessment:

a. Were the objectives of the study defined and stated?

b. Are the data relevant for risk assessment?

c. Was the study designed to have sufficient power to detect the effect(s) of interest?

d. Were good epidemiological practices followed?

e. Can the study findings be generalized for statutory regulations?

f. Were the principles enumerated below followed?

The listing of "preliminary and summary issues" is designed to help the reviewer to begin focusing on fundamental issues of the study's utility for risk assessment, and whether the study is suitable for either or both the hazard identification and dose-response components of a risk assessment.

The presence of questions in the format of a checklist in this preliminary section and under each of the hazard identification principles which follow does not imply that the principles and checklists can be applied in a mechanical fashion, that they are intended to produce some kind of numerical "score" or "grade," or that there are certain minimum quality hurdles a study must surmount. Nevertheless, when considered in their totality, the principles and subquestions are intended to help the risk assessor, working with experts in epidemiology and other relevant disciplines, to form an opinion as to the overall quality of the data and the weight they should be given in a risk assessment. While nonconformance with any single principle, or a "No" or "Not Known" answer to any subquestion, should not eliminate a study from consideration, review of the study in light of all the principles might result in its being given essentially no weight in a risk assessment.
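Preliminary question (c), on statistical power, can be made concrete with a rough calculation. The following is a minimal sketch, assuming a simple comparison of disease proportions between equal-sized exposed and unexposed groups and a normal approximation; the function name, the baseline risk, and the sample sizes are hypothetical and do not come from the panel's text.

```python
import math
from statistics import NormalDist


def power_two_proportions(p0, p1, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    p0: assumed risk in the unexposed group (hypothetical input)
    p1: assumed risk in the exposed group (hypothetical input)
    n_per_group: number of subjects in each group
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p0 + p1) / 2
    # Standard error of the difference under the null and the alternative.
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = math.sqrt((p0 * (1 - p0) + p1 * (1 - p1)) / n_per_group)
    return NormalDist().cdf((abs(p1 - p0) - z_alpha * se_null) / se_alt)


# Illustrative scenario: a doubling of a 1% baseline risk.
power_small = power_two_proportions(0.01, 0.02, n_per_group=500)
power_large = power_two_proportions(0.01, 0.02, n_per_group=5000)
print(f"n=500 per group:  power = {power_small:.2f}")
print(f"n=5000 per group: power = {power_large:.2f}")
```

Under these assumed inputs the smaller study would be unlikely to detect a doubling of risk, which is the kind of limitation question (c) asks the reviewer to notice before giving weight to a null result.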



The numbered principles in this section apply only to the hazard identification portion of a risk assessment. The questions under each principle are designed to help elucidate the principle and to assist the expert reviewer in judging whether the study is consistent with that principle. The subquestions are framed so that a Yes answer is preferred.

The emphasis in these hazard identification principles is on evaluating individual studies, and the principles follow a logical progression from design and study population selection to reporting of results and evaluation of the results in a risk assessment context. Principle A-6, however, addresses interpretation of multiple studies through application of the "Bradford Hill criteria"; and Principle B-6 in the dose-response section, concerning meta-analysis, applies to consideration of multiple studies for hazard identification purposes as well as for dose-response purposes.

It must be emphasized that these principles should be applied, and the data interpreted for risk assessment, by the risk assessor with the assistance of expert epidemiologists, and preferably with the assistance of a multidisciplinary team that includes not only epidemiologists, but also experts from other relevant disciplines, such as toxicology, medicine, biology, and industrial hygiene.

Finally, it is recognized that these principles set high standards, and that it is unlikely that any individual study can be considered perfect. The principles were drafted not only for the purpose of evaluating existing studies, but also with the hope that they will encourage greater rigor in future studies that are likely to be used in regulatory risk assessment.

[NOTE: In the book, lettered sub-principles are followed by boxes to check "yes," "no," "unknown," or "not applicable."]

Principle A-1. The population studied should be pertinent to the risk assessment at hand, and it should be representative of a well-defined underlying cohort or population at risk.

a. Were study subjects representative of exposed and unexposed persons (cohort study), or of diseased and non-diseased persons (case-control study)?

b. To minimize bias, were exposed and unexposed persons comparable "at baseline" (cohort study), or were cases similar to controls, prior to exposure, with respect to major risk factors for the disease or condition under study?

Principle A-2. Study procedures should be described in sufficient detail, or available from the study's written protocol, to determine whether appropriate methods were used in the design and conduct of the investigation.

a. To minimize the potential for bias, were interviewers and data collectors blind to the case/control status of study subjects and to the hypothesis being tested?

b. Were there procedures for quality control in place for all major aspects of the study's design and implementation (e.g., ascertainment and selection of subjects for study, methods of data collection and analysis, follow-up, etc.)?

c. Were the effects of nonparticipation, a low response rate, or loss to follow-up taken into account in producing the study results?

Principle A-3. The measures of exposure(s) or exposure surrogates should be: (a) conceptually relevant to the risk assessment being conducted; (b) based on principles that are biologically sound in light of present knowledge; and (c) properly quantitated to assess dose-response relationships.

a. Were well-documented procedures for quality assurance and quality control followed in exposure measurement and assessment (e.g., calibrating instruments, repeat measurements, re-interviews, tape recordings of interviews, etc.)?

b. Were measures of exposure consistent with current biological understanding of dose (e.g., with respect to averaging time, dose rate, peak dose, absorption via different exposure routes)?

c. If there is uncertainty about appropriate exposure measures, was a variety of measures used (e.g., duration of exposure, intensity of exposure, latency)?

d. If surrogate respondents were the source of information about exposure, was the proportion of the data they provided given, and were their relationships to the index subjects described?

e. To improve study power and enhance the generalizability of findings, was there sufficient variation in the exposure among subjects?

f. Were correlated exposures measured and evaluated to assess the possibility of competing causes, confounding, and potentiating effects (synergy)?

g. Were exposures measured directly rather than estimated? If estimated, have the systematic and random errors been characterized, either in the study at hand or by reference to the literature?

h. Were measurements of exposure or human biochemical samples of exposure made? Was there a distinction made between exposures estimated by emission as opposed to body absorption?

i. If exposure was estimated by questionnaire, interview, or existing records, was reporting bias considered, and was it unlikely to have affected the study outcome?

j. Was there an explanation/understanding of why exposure occurred, the context of its occurrence, and the time period of exposure?

Principle A-4. Study outcomes (endpoints) should be clearly defined, properly measured, and ascertained in an unbiased manner.

a. Was the outcome variable a disease entity or pathological finding rather than a symptom or a physiological parameter?

b. Was variability in the possible outcomes understood and taken into account -- e.g., various manifestations of a disease considering its natural history?

c. Was the method of recording the outcome variable(s) reliable -- e.g., if the outcome was disease, did the design of the study provide for recording of the full spectrum of disease, such as early and advanced stage cancer; was a standardized classification system, such as the International Classification of Diseases, followed; were the data from a primary or a secondary source?

d. Has misclassification of the outcome(s) been minimized in the design and execution of the study? Has there been a review of all diagnoses by qualified medical personnel, and if so, were they blinded to study exposure?

Principle A-5. The analysis of the study's data should provide both point and interval estimates of the exposure's effect, including adjustment for confounding, assessment of interaction (e.g., effect of multiple exposures or differential susceptibility), and an evaluation of the possible influence of study bias.

a. Was there a well-formulated and well-documented plan of analysis? If so, was it followed?

b. Were the methods of analysis appropriate? If not, is it reasonable to believe that better methods would not have led to substantially different results?

c. Were proper analytic approaches, such as stratification and regression adjustment, used to account for well-known major risk factors (potential confounders such as age, race, smoking, socio-economic status) for the disease under study?

d. Has a sensitivity analysis been performed in which quantitative adjustment was made for the effect of unmeasured potential confounders, e.g., any unmeasured, well-established risk factor(s) for the disease under study?

e. Did the report avoid selective reporting of results or inappropriate use of methods to achieve a stated or implicit objective? For example, are both significant and non-significant results reported in a balanced fashion?

f. Were confidence intervals provided in the main and subsidiary analyses?
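As an illustration of what Principle A-5 asks for, the sketch below computes a point estimate and an approximate 95% interval for an odds ratio, then a stratified (Mantel-Haenszel) estimate that adjusts for one confounder. It is a minimal example under stated assumptions: the counts are invented, the single binary confounder (smoking) is hypothetical, and the logit (Woolf) interval is only one of several accepted interval methods.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval for a 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled over strata of (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical case-control counts, stratified by smoking status.
nonsmokers = (5, 45, 10, 180)
smokers = (40, 20, 20, 20)

# Crude table: the two strata collapsed, ignoring smoking.
crude = tuple(x + y for x, y in zip(nonsmokers, smokers))
crude_or, crude_lo, crude_hi = odds_ratio_ci(*crude)
adjusted_or = mantel_haenszel_or([nonsmokers, smokers])

print(f"Crude OR = {crude_or:.2f} (95% CI {crude_lo:.2f} to {crude_hi:.2f})")
print(f"Smoking-adjusted (Mantel-Haenszel) OR = {adjusted_or:.2f}")
```

In this invented data set the crude estimate is more than twice the stratum-specific odds ratio of 2.0, which is the kind of confounding that subquestion (c) is meant to catch.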

Principle A-6. The reporting of the study should clearly identify both its strengths and limitations, and the interpretation of its findings should reflect not only an honest consideration of those factors, but also its relationship to the current state of knowledge in the area. The overall study quality should be sufficiently high that it would be judged publishable in a peer-reviewed scientific journal.

a. Were the major results directly related to the a priori hypothesis under investigation?

b. Were the strengths and limitations of the study design, execution, and the resulting data adequately discussed?

c. Are loss to follow-up and non-response documented? Were they minimal? Has any major loss to follow-up or migration out of study been taken into account?

d. Did the study's design and analysis account for competing causes of mortality or morbidity which might influence its findings?

e. Were contradictory or implausible results satisfactorily explained?

f. Were alternative explanations for the results seriously explored and discussed?

g. Were the Bradford Hill criteria (see Appendix B) for judging the plausibility of causation (strength of association, consistency within and across studies, dose response, biological plausibility, and temporality) applied when interpreting the results?

h. What are the public health implications of the results? For example, are estimates of absolute risk given, and is the size of the population at risk discussed?



Proceeding to application of the dose-response principles assumes that the existence of a hazard has been adequately established under the above principles. On the other hand, adequate establishment of hazard, even with a showing of strong and consistent association, does not necessarily mean there are sufficient data for use in dose-response evaluation. The dose-response principles assume that there is a need for dose-response extrapolation because no individual epidemiologic study provides sufficient high-quality information on dose-response to reach conclusions about dose-response at the exposure levels being addressed in the regulatory risk assessment.

These principles also assume that higher quality data are required for dose-response evaluation than for hazard identification, and that data used for dose-response should meet some minimum standards or quality hurdles. In other words, the reviewer and risk assessor should answer the basic question of whether the epidemiologic data, in an individual study or cumulatively, are adequate for use in dose-response evaluation. There is no formula or quantitative weighting scheme prescribed for making this judgment.

The principles address not only the use of epidemiologic data by themselves, but also their use in combination or conjunction with animal and/or biologic data. Consequently, there is an even greater need than in the hazard identification phase for scientists from relevant disciplines other than epidemiology to work with the risk assessor to interpret the data.

If epidemiologic data adequate for dose-response evaluation are not available, and a risk assessment is being developed for use in making an important regulatory decision, and if it is feasible to develop new epidemiologic data, or to extract new data from existing studies, an effort should be made to develop and provide good epidemiologic dose-response data that can be used together with, or in preference to, high-dose animal data.

Principle B-1. Dose-response assessment should include a range of reasonable dose measures, explain why any were rejected, and provide a rationale if any particular dose metric is preferred. In evaluations of both human and animal data, several different measures of dose should be evaluated (if possible).
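To illustrate the "range of reasonable dose measures" in Principle B-1, the sketch below derives several common metrics from one subject's exposure history. The record layout, the units (ppm and years), and the numbers are hypothetical; which metric is biologically appropriate remains a judgment for the assessor and is not settled by the code.

```python
from dataclasses import dataclass


@dataclass
class ExposurePeriod:
    years: float      # duration of the period
    intensity: float  # average concentration during the period, in ppm


def dose_metrics(history):
    """Summarize one exposure history under several alternative dose measures."""
    duration = sum(p.years for p in history)
    cumulative = sum(p.years * p.intensity for p in history)
    return {
        "duration_years": duration,
        "cumulative_ppm_years": cumulative,
        "mean_intensity_ppm": cumulative / duration,
        "peak_intensity_ppm": max(p.intensity for p in history),
    }


# Hypothetical work history: high early exposure, declining over time.
history = [
    ExposurePeriod(years=5, intensity=10.0),
    ExposurePeriod(years=10, intensity=2.0),
    ExposurePeriod(years=5, intensity=0.5),
]
metrics = dose_metrics(history)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Two subjects with the same cumulative exposure can differ sharply in peak or mean intensity, which is one reason the principle asks that rejected measures be explained rather than silently dropped.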

Principle B-2. In the selection of a dose-response model, the greatest weight should be given to models that fit the observed animal and human data and are consistent with the biologically relevant mode(s) of action (genotoxic, nongenotoxic, unclassified). When mechanistic knowledge is uncertain or limited, several plausible dose-response models should be considered and the most plausible ones, based on available data and professional judgment, should generally be used in dose-response evaluation.

Principle B-3. When extrapolating cancer risk to exposure levels below the observable range, mechanistic data should be used to characterize the shape of the dose-response function.

Principle B-4. When the available epidemiologic data are not adequate to perform dose-response analyses, causing low-dose estimates of risk to be derived exclusively from animal data, every effort should still be made to use the available human data in assessing the validity of low-dose risk estimates. To the extent feasible, heterogeneity in the human population should be accounted for. Whenever feasible, human data on metabolic biomarkers and other biological measures should be employed to adjust the risk estimates for known differences between species and between high and low doses. If possible, data on susceptibility should be included.

Principle B-5. When epidemiologic studies are selected for dose-response assessment, higher quality studies should be given preference, especially those with precise and accurate exposure information. The availability of information with respect to timing of exposure and response (time/age of first exposure, intensity of exposure, time to tumor), adjustment for confounding variables, and potential interaction with other effect modifiers is particularly important.

Principle B-6. A properly conducted meta-analysis, or preferably an analysis based on the raw data in the original studies, may be used in hazard identification and dose-response evaluation when such combination includes an evaluation of individual studies and an assessment of heterogeneity. The combined results ought to provide more precise risk estimates, over a wider range of doses, than any single study. Before using these tools, the gains should be judged sufficient to justify potential errors in inference resulting from combining studies of dissimilar design and quality.
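The "assessment of heterogeneity" in Principle B-6 can be sketched with a fixed-effect, inverse-variance pooling of log relative risks, together with Cochran's Q and the I-squared statistic. This is one conventional approach, not a method prescribed by the panel; the study estimates below are invented, and a real analysis would also weigh study design and quality, as the principle requires.

```python
import math


def pool_fixed_effect(log_rrs, ses):
    """Inverse-variance pooled log relative risk with heterogeneity statistics.

    Returns (pooled log RR, its standard error, Cochran's Q, I-squared),
    with I-squared expressed as a fraction between 0 and 1.
    """
    weights = [1 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / total_w
    pooled_se = math.sqrt(1 / total_w)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Hypothetical study results: relative risks and standard errors of log RR.
study_rrs = [1.2, 1.5, 1.9]
study_ses = [0.10, 0.15, 0.25]

pooled, pooled_se, q, i_squared = pool_fixed_effect(
    [math.log(rr) for rr in study_rrs], study_ses
)
low = math.exp(pooled - 1.96 * pooled_se)
high = math.exp(pooled + 1.96 * pooled_se)
print(f"Pooled RR = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f})")
print(f"Cochran's Q = {q:.2f}, I-squared = {100 * i_squared:.0f}%")
```

The pooled standard error is always smaller than that of any single contributing study, which is the gain in precision the principle refers to; a large Q or I-squared is a signal that the studies may be too dissimilar for that gain to be trusted.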

Principle B-7. When epidemiological data are used in dose-response assessment, a quantitative sensitivity analysis should be conducted to determine the potential effects on risk estimates of confounders, measurement error, and other sources of uncontrolled bias in study design.
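One simple form of the quantitative sensitivity analysis described in Principle B-7 is external adjustment for a single unmeasured binary confounder. The sketch below is a minimal example, assuming the confounder multiplies disease risk by the same factor in exposed and unexposed subjects (no effect modification); the observed relative risk and all scenario values are hypothetical.

```python
def confounding_bias_factor(rr_confounder, p_exposed, p_unexposed):
    """Ratio by which an unmeasured binary confounder inflates the observed RR.

    rr_confounder: assumed relative risk of disease for the confounder
    p_exposed:     assumed prevalence of the confounder among the exposed
    p_unexposed:   assumed prevalence of the confounder among the unexposed
    """
    return (p_exposed * (rr_confounder - 1) + 1) / (
        p_unexposed * (rr_confounder - 1) + 1
    )


def externally_adjusted_rr(rr_observed, rr_confounder, p_exposed, p_unexposed):
    """Observed relative risk divided by the assumed confounding bias factor."""
    return rr_observed / confounding_bias_factor(
        rr_confounder, p_exposed, p_unexposed
    )


# Hypothetical observed result and a small grid of confounding scenarios.
rr_observed = 2.0
scenarios = [
    (2.0, 0.4, 0.3),
    (3.0, 0.6, 0.3),
    (5.0, 0.6, 0.3),
]
for rr_c, p1, p0 in scenarios:
    adjusted = externally_adjusted_rr(rr_observed, rr_c, p1, p0)
    print(f"confounder RR={rr_c}, prevalence {p1} vs {p0}: adjusted RR = {adjusted:.2f}")
```

Reporting the adjusted estimate across a range of plausible scenarios, rather than for a single assumed confounder, shows how strong uncontrolled confounding would have to be to change the risk estimate materially.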

Principle B-8. Scientific understanding of differentials in human susceptibility to disease (racial/ethnic/gender/genetic differences, genetic polymorphisms, etc.) should be used to refine the low-dose extrapolation procedures when such phenomena are adequately understood.

Principle B-9. A quantitative analysis should be conducted to identify and characterize the major sources of uncertainty in the final estimate of risk arising from the dose-response assessment, including discussion of the prospects that future research might diminish the various sources of uncertainties.
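A quantitative uncertainty analysis of the kind Principle B-9 calls for can be sketched with a one-at-a-time Monte Carlo simulation. The example below assumes, purely for illustration, that risk is the product of independent lognormal inputs; the input names, medians, and geometric standard deviations are hypothetical, and real assessments often involve dependent inputs and model uncertainty that this sketch ignores.

```python
import math
import random


def one_at_a_time_spread(medians, gsds, n=5000, seed=0):
    """Spread in estimated risk when one input varies and the rest are fixed.

    medians: dict of input name -> median value
    gsds:    dict of input name -> geometric standard deviation (> 1)
    Returns a dict of input name -> ratio of the 95th to the 5th percentile
    of the risk estimate, with all other inputs held at their medians.
    """
    rng = random.Random(seed)
    spreads = {}
    for varied in medians:
        fixed_product = math.prod(
            value for name, value in medians.items() if name != varied
        )
        mu = math.log(medians[varied])
        sigma = math.log(gsds[varied])
        draws = sorted(
            fixed_product * rng.lognormvariate(mu, sigma) for _ in range(n)
        )
        spreads[varied] = draws[int(0.95 * n)] / draws[int(0.05 * n)]
    return spreads


# Hypothetical inputs to a simple multiplicative risk estimate.
medians = {"potency_slope": 0.002, "exposure_level": 5.0, "interspecies_factor": 1.0}
gsds = {"potency_slope": 1.5, "exposure_level": 3.0, "interspecies_factor": 2.0}

spreads = one_at_a_time_spread(medians, gsds)
for name, ratio in sorted(spreads.items(), key=lambda item: -item[1]):
    print(f"{name}: 95th/5th percentile ratio = {ratio:.1f}")
```

Ranking the inputs by the spread they induce points to the sources of uncertainty that dominate the final estimate, and therefore to where further research would be most likely to narrow it.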

EPILOGUE TO PRINCIPLES: Questioning of epidemiologist researchers by risk assessors

  1. Risk assessors' criticisms and major questions about the methods, analyses, data, or interpretation of a published report should be directed, whenever possible, to the epidemiologist(s) responsible for the paper, and they should be given an opportunity to respond.

  2. Risk assessors may want access to the study's data set for other analyses to be used in the risk assessment. This would be done with the consent and cooperation of the study epidemiologist(s).


Recommendation 1. A commitment to collaboration should be made by epidemiologists and risk assessors that includes (a) sharing of raw data where feasible, (b) exchange of protocols and survey instruments, (c) inclusion of epidemiologists in dose-response modeling exercises, and (d) care and fairness by risk assessors in the critique of original epidemiologic studies.

Recommendation 2. Future epidemiologic studies should be funded and designed with the needs of regulatory risk assessors in mind, including (a) richer exposure information (e.g., age-specific exposure histories and measures of key confounders), and (b) ample resources for careful dose-response analyses.

Recommendation 3. Epidemiologic study teams (and the peer review panels that evaluate them for funding) should include multidisciplinary expertise from the fields of medicine, toxicology, industrial hygiene, statistics, and risk assessment, as well as epidemiology.

Recommendation 4. Peer review should be applied to the use of epidemiologic data in risk assessment, including (a) involvement of the original epidemiologic investigator(s) when possible, (b) panels that reflect stature, objectivity, appropriate areas of expertise, and balance in perspective, and (c) opportunity for public comment, such as that used by EPA's Science Advisory Board.

Recommendation 5. Reporting of epidemiologic findings should be responsive, if possible, to the needs of risk assessors, including (a) documentation of rationales for decisions about how data were grouped for analysis purposes, (b) clear distinctions between subjects with small vs. zero exposure, and (c) reporting of extent of pre-testing in multivariate modeling in order to allow better interpretation of classical statistical tests.

Federal Focus, Inc.® is a non-profit 501(c)(3) foundation established in 1986 to engage in research and educational activities pertaining to Federal government policy issues, particularly ones of inter-agency concern. For the last eight years, environmental health issues, especially the development of improved Federal government risk assessment principles and guidance, have been a principal focus of the foundation.

Principles for Evaluating Epidemiologic Data in Regulatory Risk Assessment

Library of Congress Catalogue No. 96-S8998

International Standard Book No. 0-9654148-0-9

Copyright Federal Focus, Inc.® 1996. All rights reserved, except that any person may copy the "Principles, Preambles, and Recommendations" portion of the publication which is printed on light blue pages.

Additional copies of this publication are available from Federal Focus, Inc.® at the address:

Federal Focus, Inc.

11 Dupont Circle, Suite 700

Washington, DC 20036

(202) 797-6368

The previous principles are excerpted with permission from the copyrighted book Principles for Evaluating Epidemiologic Data in Regulatory Risk Assessment, developed by an Expert Panel at a Conference in London, England, October 1995. Federal Focus Inc.

Expert Panel Co-Chairs: Graham, J.D.; Koo, L.C.; Paustenbach, D.J.; Wynder, E.L.

Expert Panel Members: Ashby, J.; Carlo, G.; Cohen, S.M.; Evans, J.S.; Holland, W.; Matanoski, G.M.; North, G.W.; Pershagen, G.; Schlesselman, J.J.; Starr, T.B.; Swenberg, J.A.; Teta, M.J.; Wichmann, E.; Williams, G.M.

Federal Focus participants: Kelly Jr., W.J.; Auchter, T.G.; Landeck, S.; Ploger, W.D. Washington, DC: Federal Focus, Inc. 1996.