

APMIS 109(Suppl. 103): S332-5. 2001

GENERAL DISCUSSION NEW OECD GUIDELINES FOR ANIMAL TOXICITY: ARE THEY ADEQUATE?

Moderators: Paul Foster & L. Earl Gray

Paul Foster (CIIT, USA) I would like to pose the question: "Are current standard testing protocols adequate and appropriate, and do they give us the best chance of detecting compounds which have potential endocrine-like activity?"

Dr. Gray has already indicated at this conference that there are deficiencies in the tests that we are using (E. Gray, p. S302).

L. Earl Gray (EPA, USA) The Environmental Protection Agency (EPA) was given the task of developing a screening and testing programme to cover the 87,000 chemicals which are on the EPA's toxic inventory. Clearly not all of these have been screened or tested, and they go through a priority process, which selects the top few hundred for screening. We have concerns that many testing data gaps exist at present. Most of the 87,000 chemicals do not go through developmental toxicity studies or multigenerational testing, which are reserved for drugs that may be used by women who are pregnant and for food-use pesticides. However, we are not convinced that the developmental toxicity and multigenerational tests are particularly useful for endocrine disrupting chemicals. We are concerned about the multigenerational tests in general, and not just the OECD tests.

There are a number of chemicals classified as endocrine disrupters or endocrine active, which produce reproductive malformations or alterations but are negative in standard teratology studies. This often depends on whether the tests were conducted under the old or new guidelines. In the old guidelines, the period of exposure did not include the period of sexual differentiation. Also, the effects on the ventral prostate are difficult to assess grossly when it is undifferentiated. The reproductive alterations measured included sexual differentiation and testicular differentiation but many other malformations were not assessed. Newer developmental toxicity guidelines have expanded the period of exposure but even then it is difficult to assess many of the malformations that we have looked at. Developmental toxicity studies are negative for most of the phthalates we have tested even when exposure is extended up to gestational day 20 in rats. The only tests in the guideline which include the appropriate exposure evaluation are the multigenerational tests. There are numerous multigenerational tests listed under the old US EPA guidelines but many of the chemicals which we now know to produce malformations in a low percentage of the population were negative under these guidelines. Some endocrine disrupters produce malformations in less than 10% of animal populations. Current guidelines require 20 animals to be tested.

Paul Foster It is only recently that the period of sexual differentiation has been covered by the dosing period. This becomes very important when we assess compounds such as the phthalates, which have a relatively short half-life and are not accumulated in the body. Multigenerational studies are large studies currently costing about $500,000 per test. An example of such a test initially involved 4 groups of animals, with each group expected to achieve 20 litters. The parental generation had extensive exposure lasting for 10 weeks, and the offspring are examined to determine what happened due to exposure during the critical window of sexual differentiation. This illustrated study was originally designed purely to assess the effects on reproduction and development, and it was a good study for these specific endpoints. Shortly after the birth of the first generation offspring, the litter is reduced by culling/standardisation to 4 males and 4 females, which is considered an adequate number for further studies. The offspring are next examined at weaning, at postnatal day 21. The prostates are examined, although this is difficult in 21-day old rats, and any effects on the prostate are unlikely to be detected in weaned rats. It is the effect on the prostate when they become adults which is more meaningful. The weaned litters are again culled, to 1 male and 1 female, which means that in multigenerational studies only a selection of the animals is examined at the various ages. This is in contrast to the developmental toxicity studies described by Dr Earl Gray, where all the pups are examined. If we are looking for malformation rates of 10-15%, which is a good positive effect in teratology studies, these malfunctions will not be identified when we only look at 25% of the animals. We therefore have concern about the relative power of these multigenerational studies, and they should not be used for purposes other than those for which they were originally intended. However, the US food agency where Dr Earl Gray works has indicated that these multigenerational studies are now going to be used for the definitive assessment of whether or not a specific compound is an endocrine disrupter, and they are the only studies which are going to provide dose response information for human risk assessment.
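The concern about examining only a quarter of the animals can be illustrated with a small calculation. The sketch below is not part of the discussion; it assumes, purely for illustration, that animals are affected independently at a fixed incidence (real analyses treat the litter as the statistical unit), that a dose group has 20 litters standardised to 8 pups, and that "detection" means merely observing at least one affected animal, which is far weaker than a statistically significant finding. The function names are hypothetical.

```python
def prob_at_least_one_case(incidence: float, n_examined: int) -> float:
    """Chance of seeing at least one affected animal among n_examined,
    assuming each animal is affected independently with probability `incidence`."""
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must lie between 0 and 1")
    if n_examined < 0:
        raise ValueError("n_examined must be non-negative")
    return 1.0 - (1.0 - incidence) ** n_examined


def expected_cases(incidence: float, n_examined: int) -> float:
    """Expected number of affected animals among those examined."""
    return incidence * n_examined


if __name__ == "__main__":
    litters = 20       # litters per dose group, as in the example study
    incidence = 0.10   # assumed true malformation rate
    # Pups retained per litter: all 8 after standardisation, or 1 male + 1 female.
    for label, pups_kept in (("all pups kept", 8), ("25% kept", 2)):
        n = litters * pups_kept
        print(
            f"{label}: n={n}, "
            f"expected cases={expected_cases(incidence, n):.1f}, "
            f"P(at least one case)={prob_at_least_one_case(incidence, n):.3f}"
        )
```

Under these assumptions the expected number of observed cases falls in proportion to the fraction of animals retained, so a low-incidence effect rests on only a handful of affected animals per group.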

I now pose these questions to you. How good are our regulatory studies at evaluating EACs (Endocrine Active Compounds)? Firstly, are our dosing regimens aligned to the critical windows of development when the effects of these agents are likely to occur? It does not make sense to me to do all the work in adult animals when the most critical events are during pregnancy. Secondly, what are the appropriate endpoints to measure, and are they sufficiently sensitive to show the types of responses which are important and related to human health issues? Thirdly, is there sufficient study power to be able to detect low incidence phenomena such as reproductive tract malfunctions, which may only occur in less than 5% of the population? Fourthly, are there adequate dose response relationships to allow us to evaluate the extent of the human risk appropriately?
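The third question, on study power for low incidence effects, can be made concrete with a rough sample-size estimate. The following is a minimal sketch, assuming independent animals and a deliberately weak criterion (a 95% chance of observing at least one affected animal); it is not a method described by the speakers, and `animals_needed` is a hypothetical helper.

```python
import math


def animals_needed(incidence: float, confidence: float = 0.95) -> int:
    """Smallest number of independently affected animals that must be examined
    so that at least one case is seen with probability >= confidence.

    Solves 1 - (1 - incidence)**n >= confidence for n.
    """
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - incidence))


if __name__ == "__main__":
    for incidence in (0.10, 0.05):
        print(f"incidence {incidence:.0%}: examine at least {animals_needed(incidence)} animals")
```

Even this lenient criterion calls for more animals than the 20 per group mentioned earlier in the discussion; a formal test against a control group would need more still.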

The bottom line is that we use animal models in order to predict what might happen in humans and wildlife.

Daniel Sheehan (Jefferson, USA) The assumptions made in toxicology testing must be explicitly understood by everyone. An important issue is dose and the assumptions made in risk assessment. A threshold is assumed below which no risk exists and no adverse effect occurs, and this assumption is the basis for limiting testing to high doses. Small groups of animals are tested to define a no-effect dose. This is divided by a safety factor to provide an acceptable dose for humans. However, it is very rare to test that acceptable dose because it is not required under the guidelines. We should include low doses in the study design so that the risk from ambient exposure levels can be measured, as opposed to assuming safety at low dose. We never test the ambient concentrations found in the environment. We define a threshold and we assume safety below this defined threshold. We have heard papers describing definite effects of chemicals at ambient exposure levels, but those are not assessed in safety tests.

Earl Gray I must comment on that and take issue with you. This is an integrative process. For pesticides, a threshold is assumed unless data indicate otherwise. I do agree that the reference dose (RfD) is rarely tested for effects because this is assured to be 100x or 1000x below the no observed adverse effect level (NOAEL). When testing substances, if no NOAEL is detected, then smaller doses are tested to find the NOAEL. However, if small effects were present below the NOAEL then they could only be detected if very large samples were to be used. The necessity to include such an approach is an unproven but testable hypothesis. There is little or no evidence that effects exist only at low levels. The low dose xenoestrogen data sets, which have been used as evidence for ultralow dose effects, do not display a U-shaped response. This is also a hypothesis, and is based on the dose-response for a single endpoint. When U-shaped dose response curves do exist for an endpoint, other endpoints typically display a more traditional dose response relationship. Much of this data comes from the studies on dioxins, when it was found that environmental levels were present at relevant doses. You suggest that we are completely missing environmental effects, but I do not think that is true.
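The arithmetic behind the acceptable dose discussed by Dr. Sheehan and Dr. Gray is a simple division of the no-effect dose by a safety factor. The sketch below only illustrates that step: the NOAEL value of 50 mg/kg/day is invented for the example, the factors of 100 and 1000 are the ones mentioned in the discussion, and `reference_dose` is a hypothetical helper rather than any agency's procedure.

```python
def reference_dose(noael_mg_per_kg_day: float, safety_factor: float) -> float:
    """Divide the no observed adverse effect level (NOAEL) by a safety factor
    to obtain an acceptable (reference) dose in the same units."""
    if noael_mg_per_kg_day < 0:
        raise ValueError("NOAEL must be non-negative")
    if safety_factor <= 0:
        raise ValueError("safety factor must be positive")
    return noael_mg_per_kg_day / safety_factor


if __name__ == "__main__":
    hypothetical_noael = 50.0  # mg/kg/day, invented for illustration only
    for factor in (100, 1000):
        rfd = reference_dose(hypothetical_noael, factor)
        print(f"safety factor {factor}: reference dose = {rfd} mg/kg/day")
```

The resulting dose lies well below anything administered in the study itself, which is the point at issue: it is derived by calculation rather than tested directly.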

Finn Bro-Rasmussen (Copenhagen, Denmark) If you are confronted with a series of chemicals such as phthalates or urones with potential endocrine disruptive activity, both of which are commercially available and are found in drinking water, how far do you examine these substances from a molecular structural point of view and relate this (via QSAR studies) to potential similar effects from chemically or structurally related chemicals? Do you perform QSAR (Quantitative Structure-Activity Relationship) studies in order to predict endocrine disruptive activity of structurally related compounds? You mentioned linuron, but have you examined other urone-type herbicides?

Earl Gray If you are asking about extrapolation from chemical structure to mechanisms, this is not addressed in our studies. The structure of our testing protocol is designed to produce a dose response curve which assesses effects rather than mechanisms. Our multigenerational studies are not sufficiently robust to pick up these effects even if they occur in rats.

John Brock (Atlanta, USA) I am primarily interested in humans rather than rats. We should not use animals that are not exposed to anything as a model for background exposure in humans. Instead, we should look at the humans themselves, determine background exposures and use these as a starting point for the animal experiments. We should set priorities for testing substances based on what we know about which chemicals humans are exposed to. How do you select the compounds to test?

Gail Risbridger (Clayton, Australia) I would like to comment on the endpoints used in mouse models, especially when looking at the effects of endocrine disrupters on the prostate. The prostate in rodents is quite different from the human prostate. The rodent prostate is a lobular structure, and many studies are fixated on the ventral prostate, disregarding the lateral and anterior lobes. The hierarchy of sensitivity is quite different from the human. The anterior prostate is much more sensitive to oestrogens whereas the ventral prostate is more sensitive to androgens. These differences have not been built into the endpoints in the model systems used for testing endocrine disrupter activity, and consideration of this may give different results.

Paul Foster Many workers dissect the prostate and weigh the different lobes and examine all of the lobes histologically in the rodents. You must remember that typically rodents do not develop prostatic cancer.

John Ashby (Alderley Park, UK) In response to Dr. Sheehan, there are many chemicals which are being assessed over a large range of doses, such as an ongoing bisphenol A (BPA) study which tests doses from 100s of mg down to pg amounts. Dr. Foster identified 2 potential problems. Firstly, some animals are examined when they are too young to show an effect, for example, on the prostate; and secondly, the culling of 1 out of 4 pups. Which of these 2 factors is the more important? If the animals were all to be kept alive, would a linuron effect be detected? Do you have any data, or are you just speculating?

Paul Foster Dr. McIntyre’s poster shows that our studies on linuron involved keeping every single animal in the rat litters so that we could assess the true litter incidence of abnormal reproductive organ development. This incidence was quite low and is not seen using conventional protocols in which the litter size is culled/standardised.

John Ashby How can we change those protocols with a degree of urgency?

Paul Foster Agencies do not act rapidly!

John Ashby You should justify your concerns to EHP (Environmental Health Perspectives) so that we can start to modify our protocols in advance of regulatory changes in the future. We do not want to devalue half-million-dollar studies.

Paul Foster Changing regulatory protocols is a difficult and time-consuming effort. A letter from scientists to the editor of a journal expressing concern on whether the multigenerational study protocol can meet the challenge posed by providing definite information on endocrine-active agents is an excellent approach.

Shanna Swan (Columbia, USA) I would like to comment on the analytical end of the process. Epidemiologists are now starting to use methods which simultaneously analyse results involving multiple endpoints by using multivariate analysis, and we are moving away from a dependence on the “zero-one” range null hypothesis and p-values. Do you not think that this approach should also be adopted by toxicologists? Simply classifying studies as positive or negative is extremely crude, and this method discards a great deal of information, making it, effectively, obsolete.

Paul Foster We should certainly not be too dependent on statistical analyses, which are supportive rather than illuminative. In our studies on prenatal treatments, statistically we induce “heterogeneity of variance of the variance-covariance matrices”, which invalidates the use of such approaches.

Erica van den Akker (Rotterdam, The Netherlands) There is much information about the endpoints of endocrine disruption affecting the reproductive system in the new OECD guidelines, but there is less emphasis on the adrenal and thyroid glands and other endocrine effects. Are these non-reproductive endpoints also being examined and implemented in the ongoing studies?

Earl Gray In the US EPA screening and testing programme there is a requirement for thyroid evaluation. The screening assays include an assessment of T4, T3, TSH, thyroid histology and growth of the animals. There is a suggestion that these endpoints should also be added to the multigenerational tests. The adrenal function was not designated as one of the key areas of concern, and it was hoped that endocrine abnormalities would not be missed. Other endpoints would be added if any effects were to be identified.

Manolis Kogevinas (Barcelona, Spain) Dr Earl Gray describes interactions between experimental and epidemiological studies, and perhaps we should be looking more at mixtures of compounds and synergism rather than effects of single compounds. Epidemiological studies tend to concentrate on single compounds.

Earl Gray We are studying risk assessments on individual chemicals, and this may not provide sufficient information on the safety of these chemicals in mixtures because of possible combination effects. For example, different antiandrogenic substances have additive effects, so we shall have to look at exposures to multiple compounds for an effect.

Peter Meyer (Charlottesville, USA) What are the plans of the EPA and other agencies for systematic assessment of mixtures which, as we have heard, are key to environmental effects because of additive or synergistic activity?

The Food Quality Protection Act (FQPA) in the USA has only taken a primitive first step towards this issue. What do you recommend?

Earl Gray I work in the research division of EPA and I am not a “policy” person. I believe that we shall soon be regulated under the FQPA in relation to some of the organophosphate pesticides, and they are considering some of the antiandrogen combinations.

Elaine Francis (EPA, Washington, USA) The FQPA mandates that the EPA conduct cumulative risk assessment for mixtures of those pesticides that act by a similar mode of action. There has been concern expressed about the adequacy of the 2-generation and multigenerational studies to identify all potential endocrine disruptive outcomes. However, these tests are not performed in a vacuum. There is a whole battery of other tests used on food-use pesticides, such as developmental toxicity studies, 2-year bioassay tests, carcinogenicity, mutagenicity and neurotoxicity studies. In the screening and testing programme, there are 8 screens and 5 tests including mammalian and avian studies. We should be looking at the adequacy of the entire battery of in vivo and in vitro tests, and not just the adequacy of a single test in isolation.

Earl Gray The addition of the EPA screening and testing programme to the overall database would be very helpful. The developmental toxicity study and the cancer bioassay are not very useful in identifying endocrine disrupter chemicals, and additional relevant data could be obtained from a 90-day chronic study.

Retha Newbold (NIEHS, USA) The National Toxicology Program (NTP) has a large multigenerational study running over 5 generations looking at multiple organ effects of 5 different compounds. The endpoints include behavioural aspects, neuroendocrine aspects and immune functions as well as reproductive endpoints and cancer studies. We are looking at a dose range which extends down to levels comparable to environmentally relevant levels of human exposure.
