I will deal with the issues Mary raised point by point. But first, let’s correct some misunderstandings. Mary claimed I am a *“fluoride promoter”* and had *“sought to discredit the study via his blog posts and tweets.”*

- I do not *“promote fluoride.”* My purpose on this issue has always been to expose the misinformation and distortion of the science surrounding community water fluoridation (CWF). I leave promotion of health policies to the health experts and authorities.
- I have not *“sought to discredit the study.”* The article Mary responded to was a critique of the misrepresentation of that study by Paul Connett – not an attack on the study itself. This might become clear in my discussion below of the study and how it was misrepresented.

The paper we are discussing is:

Bashash, M., Thomas, D., Hu, H., Martinez-Mier, E. A., Sanchez, B. N., Basu, N., … Hernández-Avila, M. (2016). *Prenatal Fluoride Exposure and Cognitive Outcomes in Children at 4 and 6–12 Years of Age in Mexico.* Environmental Health Perspectives, 1, 1–12.

Anti-fluoride activists have leaped on it to promote their cause – Paul Connett, for example, claimed it should lead to the end of community water fluoridation throughout the world! But this is not the way most researchers, including the paper’s authors, see the study. For example, Dr. Angeles Martinez-Mier, co-author and one of the leading researchers, wrote this:

*1. “As an individual, I am happy to go on the record to say that I continue to support water fluoridation”*

*2. “If I were pregnant today I would consume fluoridated water, and that if I lived in Mexico I would limit my salt intake.”*

*3. “I am involved in this research because I am committed to contribute to the science to ensure fluoridation is safe for all.”*

Mary asserts:

“Perrott claims that the results were not statistically significant but his analysis is incorrect.”

**That is just not true. I have never claimed their reported association was not statistically significant.**

I extracted the data they presented in their Figures 2 and 3A and performed my own regression analysis on the data. This confirmed that the associations were statistically significant (something I never questioned). The figures below illustrating my analysis were presented in a previous article (*Maternal urinary fluoride/IQ study – an update*). These results were close to those reported by Bashash et al., (2017).

**For Fig. 2:**

My comment was – *“Yes, a “statistically significant” relationship (p = 0.002) but it explains only 3.3% of the variation in GCI (R-squared = 0.033).”*

**For Fig. 3A:**

My comment was – *“Again, “statistically significant” (p = 0.006) but explaining only 3.6% of the variation in IQ (R-squared = 0.0357).”*

So I in no way disagreed with the study’s conclusions quoted by Mary that:

“higher prenatal fluoride exposure, in the general range of exposures reported for other general population samples of pregnant women and nonpregnant adults, was associated with lower scores on tests of cognitive function in the offspring at age 4 and 6–12 y.”

I agree completely with that conclusion as it is expressed. But what Mary, Paul Connett and all other anti-fluoride activists using this study ignore is the real relevance of this reported association: it explains only about 3% of the IQ variance. I discussed this in the section *“The small amount of variance explained”* in my article.

This is a key issue which should have been clear to any reader or objective attendee of Paul Connett’s meeting where the following slide was presented:

Just look at that scatter. It is clear that the best-fit line explains very little of it. And the 95% confidence interval for that line (the shaded area) does not represent the data as a whole. The comments on statistical significance and confidence intervals relate to the best-fit line; they do not apply to the data as a whole.

Finally, yes I did write (as Mary quotes) in my introductory summary that *“the study has a high degree of uncertainty.”* Perhaps I should have been more careful – but my article certainly makes clear that I am referring to the data as a whole – not to the best fit line that Connett and Mary concentrate on. The regression analyses indicate the uncertainty in that data by the low amount of IQ variance explained (the R squared values) and the standard error of the estimate (about 12.9 and 9.9 IQ points for Fig 2 and Fig 3A respectively).
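To make the two quantities mentioned above concrete, here is a minimal Python sketch of how R-squared and the standard error of the estimate come out of a simple one-variable regression. It uses synthetic data, generated from an assumed slope (-6) and an assumed noise level (standard deviation 13), not the points extracted from the Bashash et al. figures. The function name `fit_line` and all of the numbers are illustrative assumptions.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept, slope, r_squared, see), where `see` is the
    standard error of the estimate (residual standard deviation).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    see = math.sqrt(ss_res / (n - 2))
    return intercept, slope, r_squared, see


# Synthetic illustration only: assumed slope and noise, NOT the study data.
rng = random.Random(0)
xs = [rng.uniform(0.2, 1.6) for _ in range(300)]
ys = [100 - 6 * x + rng.gauss(0, 13) for x in xs]

intercept, slope, r_squared, see = fit_line(xs, ys)
print(f"slope = {slope:.2f}, R-squared = {r_squared:.3f}, SEE = {see:.1f}")
```

With a wide scatter around a shallow slope, a fit of this kind can have a small R-squared together with a standard error of the estimate that is large relative to the slope.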

Despite being glaringly obvious in the scatter, this is completely ignored by Mary, Paul Connett and other anti-fluoride activists using this study. Yet it is important for two reasons:

- It brings into question the validity of the reported statistically significant association
- It should not be ignored when attempting to apply these findings to other situations like CWF in New Zealand and the USA.

Paul Connett actually acknowledged (in a comment on his slides) that I was correct about the association explaining such a small amount of the variance, but argued:

- Other factors will be *“essentially random with respect to F exposure,”* and
- The observed relationship will not be changed by the inclusion of these other factors.

I explained in my article *Paul Connett’s misrepresentation of maternal F exposure study debunked* how both these assumptions were wrong. In particular, I used as one example the ADHD-fluoridation study I have discussed elsewhere (see Perrott, 2017). I hope Mary will refer to my article and discussion in her response to this post.

While ignoring the elephant in the room, the high degree of scatter, Mary and others have limited their consideration to the statistical significance and confidence intervals of the reported association – the association which, despite being statistically significant, explains only 3% of the variation (obvious from the slide above).

For example, Mary quotes from the abstract of the Bashash et al., (2017) paper:

“In multivariate models we found that an increase in maternal urine fluoride of 0.5mg/L (approximately the IQR) predicted 3.15 (95% CI: −5.42, −0.87) and 2.50 (95% CI −4.12, −0.59) lower offspring GCI and IQ scores, respectively.”

I certainly agree with this statement – but please note it refers only to the model they derived, not the data as a whole. Specifically, it applies to the best-fit lines shown in Fig 2 and Fig 3A as illustrated above. The figures in this quote relate to the coefficient, or slope, of the best fit line.

Recalculating from 0.5 mg/L to 1 mg/L, this simply says that 95% of the coefficient values, or slopes, of the best-fit lines resulting from different resampling should be in the range -10.84 to -1.74 GCI points (Fig 2) and -8.24 to -1.18 IQ points (Fig 3A).

[Note – these are close to the CIs produced in my regression analyses described above – an exact correspondence was not expected because digital extraction of data from an image is never perfect and a simple univariate model was used]
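The recalculation from 0.5 mg/L to 1 mg/L is a simple rescaling. Because the model is linear, the point estimate and both confidence limits are multiplied by the same factor. A small sketch, using the figures quoted from the abstract (the helper name `rescale_interval` is just for illustration):

```python
def rescale_interval(estimate, lower, upper, factor):
    """Scale a linear-model coefficient and its confidence limits together."""
    return estimate * factor, lower * factor, upper * factor


# Values per 0.5 mg/L, as quoted from the abstract (negative = lower score)
gci = rescale_interval(-3.15, -5.42, -0.87, factor=2)
iq = rescale_interval(-2.50, -4.12, -0.59, factor=2)

for label, (est, lo, hi) in (("GCI", gci), ("IQ", iq)):
    print(f"{label} per 1 mg/L: {est:.2f} (95% CI {lo:.2f}, {hi:.2f})")
# GCI per 1 mg/L: -6.30 (95% CI -10.84, -1.74)
# IQ per 1 mg/L: -5.00 (95% CI -8.24, -1.18)
```

The rescaled point estimates, about 5 to 6 points per 1 mg/L, are the figures cited in the claims discussed in this post.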

The cited CI figures relate only to the coefficient – not the data as a whole. And, yes, the low p-value indicates the chance of the coefficient, or slope, of the best-fit line being zero is extremely remote. The best fit line is highly significant, statistically. But it is wrong to say the same thing about its representation of the data as a whole.

This best-fit line explains only 3% of the variance in IQ – and a simple glance at the figures shows the cited confidence intervals for that line simply do not apply to the data as a whole.

That brings us back to the problem of misrepresentation. We should draw any conclusions about the relevance of the data in the Bashash et al., (2017) study from the data as a whole – not just from the small fraction of the IQ variance explained by the fitted line.

Paul Connett claimed:

“The effect size is very large (decrease by 5-6 IQ points per 1 mg/L increase in urine F) and is highly statistically significant.”

But this would only be true if the model used (the best-fit line) truly represented all the data. A simple glance at Fig 2 in the slide above shows that any prediction from data with such a large scatter is not going to be “highly statistically significant.” Instead of relying on the CIs for the coefficient, or slope, of the line, Connett should have paid attention to the standard error of the estimate for the data as a whole, given in the Regression statistics of the Summary output. For Fig. 2, this is 12.9 IQ points. Taking roughly twice that standard error as the 95% range would have produced an estimate of *“5-6 ± 26 IQ points which is not statistically significantly different to zero IQ points,”* as I described in my article.

Statistical analyses can be very confusing, even (or especially) to the partially initiated. We should be aware of the specific data referred to when we cite confidence intervals (CIs).

For example, Mary refers to the CI values for the coefficients, or slopes, of the best fit lines.

Figs 2 and 3A in the Bashash et al., (2017) paper include confidence intervals (shaded areas) for the best fit lines (these take into account the CIs of the constants as well as the CIs of the coefficients). That confidence interval describes the region of 95% probability for where the best-fit line will be.

Neither of those confidence intervals applies to the data as a whole, as a simple glance at Figs 2 and 3A will show. In contrast, the “prediction interval” I referred to in my article does. This is based on the standard error of the estimate listed in the Regression statistics. Dr. Gerard Verschuuren demonstrated this in this figure from his video presentation.
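The difference between the two kinds of interval can be sketched with the textbook formulas for simple linear regression. The confidence interval for the line at a point x0 uses the standard error `see * sqrt(1/n + (x0 - mean_x)^2 / Sxx)`, while the prediction interval for a single new observation uses `see * sqrt(1 + 1/n + (x0 - mean_x)^2 / Sxx)`. The code below is an illustration on synthetic data (an assumed slope of -6 and noise standard deviation of 13, not the study data). It uses 1.96 as a large-sample stand-in for the exact t critical value, and the function name `interval_half_widths` is hypothetical.

```python
import math
import random


def interval_half_widths(xs, ys, x0, crit=1.96):
    """Half-widths of the ~95% confidence interval for the fitted line
    and of the ~95% prediction interval for one new observation at x0.

    `crit` = 1.96 is a normal approximation to the t critical value,
    which is an assumption that is reasonable only for large samples.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    see = math.sqrt(ss_res / (n - 2))  # standard error of the estimate
    leverage = 1 / n + (x0 - mean_x) ** 2 / sxx
    ci_half = crit * see * math.sqrt(leverage)      # where the line lies
    pi_half = crit * see * math.sqrt(1 + leverage)  # where a new point lies
    return ci_half, pi_half


# Synthetic illustration only: assumed slope and noise, NOT the study data.
rng = random.Random(0)
xs = [rng.uniform(0.2, 1.6) for _ in range(300)]
ys = [100 - 6 * x + rng.gauss(0, 13) for x in xs]

ci_half, pi_half = interval_half_widths(xs, ys, x0=0.9)
print(f"line CI: +/- {ci_half:.1f}, prediction interval: +/- {pi_half:.1f}")
```

The confidence interval for the line narrows as the sample grows, because the `1/n` term shrinks. The prediction interval cannot fall below roughly `crit * see`, because the scatter of individual points around the line does not depend on sample size.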

Mary is perfectly correct to claim *“it is the average effect on the population that is of interest”* – but that is only half the story as we are also interested in the likely accuracy of that prediction. The degree of scatter in the data is also relevant because it indicates how useful this average is to any prediction we make.

Given the model described by Bashash et al., (2017) explained only 3% of the IQ variance, while the standard error of the estimate was relatively large, it is misleading to suggest any *“effect size”* predicted by that model would be *“highly significant”* as this ignores the true variability in the reported data. When this is considered, the effect size (and 95% CIs) is actually **“5-6 ± 26 IQ points** which is not statistically significantly different to zero IQ points.”

I will leave these for now as they belong more to a critique of the paper itself (all published papers can be critiqued) rather than the misrepresentation of the paper by Mary Byrne and Paul Connett. Mary can always raise them again if she wishes.

So, to conclude, Mary Byrne is correct to say that the model derived by Bashash et al., (2017) predicts that an increase of *“fluoride level in urine of 1 mg/L could result in a loss of 5-6 IQ points”* – on average. But she is wrong to say this prediction is relevant to New Zealand, or anywhere else, because when we consider the data as a whole that loss is **“5-6 ± 26 IQ points.”**

I look forward to Mary’s response.

**Image credit:** BuildGreatMinds.Com