Friday, 14 June 2013

Truths, Glorified Truths and Statistics (II)

(part 2: "To boldly, go...")

First a disclaimer: I love the work on the p-curve and the estimation of effect sizes. I support the disclosure initiatives (4 questions, 21 words) and the call for more quality and less quantity. (However, see also part 1, in which I remind the reader that there are many scientists for whom there has been no life before p-hacking, and that a claim of ignorance on these matters is at the very least disrespectful to these scholars.)



Let me be the one to spoil all the fun: There is no true effect!

It does not exist as an entity in reality, and it is not one of the constituents of the universe. It should be a measurement outcome observed in a measurement context that was predicted by a theory about a specific domain in reality.

As was pointed out by Klaus Fiedler at the Solid Science Symposium: "What does it mean there is an effect?" (I am quoting from memory, so this may be incorrect.)

According to the live tweet feed earlier that day:

Solid Science Symposium Tweet Feed - Excellent!

If you believe this is possible, that a true effect can somehow be discovered, out there in reality, like a land mass across the ocean where everyone said there would be dragons, or a new species of silicon-based life forms at the other end of the worm-hole, then you show one of the symptoms of participating in a failing system of theory evaluation and revision that I dubbed the [intergalactic] explorer delusion.

This refers to the belief expressed by many experimental psychological scientists that the purpose of scientific inquiry is to go where no man has gone before and observe the phenomena that are “out there” in reality waiting to be uncovered by clever experimental manipulation and perhaps some more arbitrary poking about as well.

A laboratory experiment is however not a field study or an excursion beyond the neutral zone. Even if it were, I would argue that wherever you go as a scientist, boldly or otherwise, you will be guided, and quite possibly even blinded, by a theory or a mathematical formalism about reality that is in most cases implicitly present in your theorising.


Let's analyse this delusion by scrutinising a recent paper by Greenwald (2012) entitled: “There is nothing so theoretical as a good method”, which is a reference to the famous quote by a giant of psychological science, Kurt Lewin (1951). This also allows me to comment on what it actually is that Platt meant to say by the term "strong inference" in his 1964 paper.

Greenwald is explicit about his position towards theory; he is not anti-theoretic, as he acknowledges that theories achieve parsimonious understanding and guide useful applications (but he does not specify… of what?). The author is however also skeptical of theory, because he noticed the ability of theory to restrict open-mindedness. This is indeed a proper description of a theory: It is a specific tunnel-vision, but from the perspective of the Structural Realist (forgive me, I will explain this position more precisely in the near future), this tunnel-vision is only temporary.

It will be no surprise I disagree with the following: 
“When alternative theories contest the interpretation of an interesting finding, researchers are drawn like moths to flame. J. R. Platt (1964) gave the approving label “strong inference” to experiments that were designed as crucial empirical confrontations between theories that competed to explain a compellingly interesting empirical result.” (Greenwald, 2012, pp. 99–100, emphasis added)
That is not at all what Platt meant by strong inference, but incidentally we find another symptom of a failing system of theory evaluation, the interpretation fallacy I mentioned in part 1: Theories do not compete for their ability to provide an understandable description or explanation of empirical phenomena. They compete for the ability to predict measurement contexts in which phenomena may be observed and they compete for the accuracy with which measurement outcomes were predicted. And J.R. Platt agrees with this perspective as he describes very clearly:


“Strong inference consists of applying the following steps to every problem in science, formally and explicitly and regularly:


1) Devising alternative hypotheses;

2) Devising a crucial experiment (or several of them), with alternative possible outcomes, each of which will, as nearly as possible, exclude one or more of the hypotheses;

3) Carrying out the experiment so as to get a clean result;

1') Recycling the procedure, making subhypotheses or sequential hypotheses to refine the possibilities that remain; and so on.”

(Platt, 1964, p. 347, emphasis added)
Strong inference starts with devising alternative hypotheses to a problem in science and not with an interesting finding. Platt comments that steps 1 and 2 require intellectual invention, which I take the liberty to translate as ‘theorizing about reality’. That is what you do when you devise a method.

One source of evidence for his argument concerns 13 papers, listed in a table, that started controversies in psychological science on average 44 years ago and that still have no resolution. The author claims that the method of strong inference was applied in order to resolve the controversies, and that it obviously failed. Also, it is claimed that philosophy of science provides no answers to resolve the controversies, because it discusses (apparently endlessly) whether such issues can be resolved empirically in principle. It is clear that Greenwald is referring to the resolution of these controversies as a resolution about the ‘reality’ of the ontology of a theory. This is again a matter of interpretation and is not what formal theory evaluation is about. The constituents of reality posited to exist by a theory are irrelevant in theory evaluation. As long as everything behaves according to the predictions by the theory, we should just accept those constituents as temporary vehicles for understanding. I believe these controversy theories were not properly evaluated for their predictive power and empirical accuracy. I don't know if they can be evaluated in that way; if they cannot, the conclusion must be that the theories are trivial.

The impression that ontology evaluation is the problem here is indeed supported by the descriptions provided for the 13 controversies: It is primarily a list of clashes of ontology, e.g., Spreading activation vs. Compound cueing. Further support comes from the examples provided to argue that even if philosophy had an answer, this would not keep scientists from continuing the debate. The fact that scientists do not do this implies to the author there must be another way than strong inference to resolve controversies in science. This is illustrated by examples in which a scientific community was able to achieve consensus about a problem in their discipline (the classification of Pluto as a dwarf planet, HIV as the cause of AIDS and the influence of human activity on global warming). The author suggests that controversies in psychology could be resolved if only a reasonable consensus could be achieved.

I cannot disagree with the author on his wish for a science that works towards reaching consensus about the phenomena in its empirical record, instead of wasting energy on definite existence proofs for the ontologies of competing theories. Recall the history of the quantum formalism: two very different theoretical descriptions of reality (wave vs. particle ontologies) were found to be the same for all intents and purposes. I am certain that scientists in cosmology, virology and climatology used strong inference to work towards those consensus resolutions, but I did not check it. Strong inference and consensus formalism science go hand in hand.

What I can say is that Platt’s recycling procedure (step 1’) suggests replication attempts should be carried out, and apparently there is somewhat of a problem with replication of phenomena in psychological science. This again makes it very unlikely that any strong inference has been applied to resolve theoretical disputes in psychological science. Indeed, one of the authors listed as having caused a controversy that was unresolved by strong inference recently challenged the discipline to start replicating the ‘interesting findings’ in its empirical record (e.g. Yong, 2012).

(There must be some proverb about dismissing something before its merits have been properly examined...)


As a second source of evidence to support his suspicions about the benefits of theorising, Greenwald examines the descriptions of Nobel Prizes to see whether they were awarded for theoretical or methodological contributions. The [intergalactic] explorer delusion is obvious here; Greenwald highly values the appearance of the word ‘discovery’:
“Most “discovery” citations were for methods that permitted previously impossible observations, but in a minority of these, “discovery” indicated a theoretical contribution.” 
He concludes that theory was important for the development of methods, and that novel methods produced inconceivable results that prompted new theory.

I am quite certain that the inconceivable results referred to were predicted by a theory or considered as an alternative hypothesis. They concern measurement contexts one just does not accidentally stumble upon. If outcomes were surprising given the predicted context, an anomaly to the theory was found, and in that case, naturally, a new theory would have to be created. It was however due to an anomaly to a theoretical prediction, not due to a ‘discovery’ of a phenomenon by a method! The Large Hadron Collider (or any other billion-dollar instrument of modern physics) was not built as a method, a vehicle to seek out previously unknown phenomena like the starship U.S.S. Enterprise. Theory very strongly predicted a measurement context in which a boson should be observable that completed the standard model of particle physics. The methods scientists use for obtaining knowledge about the structure of reality are the result of testing predictions by theories, without exception. Satellites are not sent into space equipped with multi-million dollar X-ray detectors just to see what they will find when they get there.

I conclude by commenting on the way the author describes why Michelson won the Nobel Prize for Physics in 1907. This involves a recurring theme in a paper I am about to submit: the luminiferous Æther. Experimental physicists like Michelson and Morley spent most of their academic careers (and most of their money) on experiments that tested the empirical accuracy of theories that predicted a very specific observable phenomenon called Æther-dragging. Their most famous experiment, reported in “On the Relative Motion of the Earth and the Luminiferous Ether” (Michelson & Morley, 1887), showed very accurately and consistently that there was no such thing as an Æther, or at least, that its influence on light and matter was not as large as the Æther-dragging hypothesis predicted it would be. This of course harmed the precision and accuracy of Æther-based theories of the cosmos, but to hint, as Greenwald seems to do, that the method ‘caused’ Einstein to create special relativity theory is far-fetched.

Michelson won the Nobel Prize for Physics in 1907 for the very consistent null-result (yes psychological science, such things can be important) and for the development of the interferometer instruments that meticulously failed to measure any trace of the Æther (cf. Michelson, 1881). Their commitment to the Æther was adamant though. To be absolutely certain that the minute interferences that were occasionally measured were indeed due to measurement error, instruments of increasing accuracy and sensitivity were built. The largest were many meters wide and placed at high altitude on heavy slabs of marble floating on quicksilver in order to avoid vibrations interfering with the measurement process. Now that is a display of ontological commitment! It was however as much motivated by theoretical prediction as the construction of the Large Hadron Collider. Not a theory-less discovery by some clever poking about.

Greenwald admits that the word theory is often used in Michelson and Morley’s 1887 article, so theory must have played an important role in the design of the instruments. The role was not just 'important'; without the theory there would have been no method at all. In fact, if a theory of special relativity had been published 20 years before 1905 (physicists knew something like relativity was necessary), there would have been no instruments constructed at all because:
"Whether the ether exists or not matters little - let us leave that to the metaphysicians; what is essential for us is, that everything happens as if it existed, and that this hypothesis is found to be suitable for the explanation of phenomena. After all, have we any other reason for believing in the existence of material objects? That, too, is only a convenient hypothesis; only, it will never cease to be so, while some day, no doubt, the ether will be thrown aside as useless." (Poincaré, 1889/1905, p. 211). 
And indeed, the Æther was thrown aside as useless, because a method devised to test a prediction by a theory yielded null results. Strong inference means this repeated null-result has consequences for the credibility of the theory that predicted the phenomenon. Apparently, in psychological science, this is a difficult condition to achieve.

The Structural Realist's take home message is: 

  1. We should believe what scientific theories tell us about the structure of the unobservable world, but
  2. We should be skeptical about what they tell us about the posited ontology of the unobservable world. 
In this quote by Poincaré may lie the answer to Greenwald's interpretation of current practice of psychological science (which is in fact a very accurate description of the problems we have with theory evaluation, I just do not agree with the interpretation): Why does Poincaré reserve a special place for the hypothesis about material objects, which will never cease to be so?


Still believe it is possible to use a method that was not predicted to yield measurement outcomes by a theory about reality? 

Ok.

I'll think of some more examples.



References

Greenwald, A. G. (2012). There Is Nothing So Theoretical as a Good Method. Perspectives on Psychological Science, 7(2), 99–108. doi:10.1177/1745691611434210

Michelson, A. A. (1881). The Relative Motion of the Earth and the Luminiferous Ether. American Journal of Science, 22(128), 120–129. Retrieved from http://www.archive.org/details/americanjournal62unkngoog

Michelson, A. A., & Morley, E. W. (1887). On the Relative Motion of the Earth and the Luminiferous Ether. American Journal of Science, 34(203), 333–345. Retrieved from http://www.aip.org/history/gap/PDF/michelson.pdf

Platt, J. (1964). Strong Inference. Science, 146(3642), 347–353. Retrieved from http://clustertwo.org/articles/Strong Inference (Platt).pdf

Poincaré, H. (1905). Science and Hypothesis. New York: The Walter Scott Publishing Co., LTD. Retrieved from http://www.archive.org/details/scienceandhypoth00poinuoft

Yong, E. (2012). Nobel laureate challenges psychologists to clean up their act. Nature. Retrieved from http://www.nature.com/doifinder/10.1038/nature.2012.11535