Noah van Dongen, Johnny van Doorn, Quinten Gronau, Don van Ravenzwaaij, Rink Hoekstra, Matthias Haucke, Daniel Lakens, Christian Hennig, Richard Morey, Saskia Homer, Andrew Gelman, Jan Sprenger and Eric-Jan Wagenmakers: “Multiple Perspectives on Inference for Two Simple Statistics Scenarios”
When data analysts operate within different statistical frameworks (e.g., frequentist versus Bayesian, emphasis on estimation versus emphasis on testing), how does this affect the qualitative conclusions that are drawn from real data? To study this question empirically, we selected two simple scenarios from the literature, one involving a comparison of two proportions and one involving a Pearson correlation, and asked four teams of statisticians to (A) provide a concise analysis and a qualitative interpretation of the outcome and (B) discuss their own and each other’s results.
Noah van Dongen and Leonie van Grootel: “Systematic Review on NHST criticism”
The null hypothesis significance test (NHST) procedure is used ubiquitously for analyzing experimental and observational data in the social sciences. However, since its infancy it has been criticized for its deficiencies and for the ease with which scientists can misuse it. The body of literature on NHST criticism has grown steadily over the years, and its growth accelerated when talk of a ‘replication crisis’ started in the social sciences. Although many articles have been published on the subject, a clear overview of NHST’s deficiencies, misinterpretations, and misuses is absent, and we hope to correct this via a systematic review.
Michał Sikorski: “Values, Bias and Replicability”
The value-free ideal is the view that scientists should not use their non-epistemic values when justifying their hypotheses. The view has recently become unpopular among philosophers. I defend the ideal by showing that if we accept the uses of non-epistemic values prohibited by it, we are forced to accept as legitimate scientific conduct some of the disturbing phenomena of contemporary science, e.g., sponsorship bias. I will also show how the use of non-epistemic values contributes to the replication crisis.
Noah van Dongen, Matteo Colombo, Felipe Romero, and Jan Sprenger: “Semantic Intuitions – a meta-analysis”
(contact authors for more details)
At the beginning of the twenty-first century, experimental philosophy started to contribute to the debate on theories of reference through Machery, Mallon, Nichols, and Stich’s seminal article Semantics, cross-cultural style (2004). Their empirical results indicated a difference in semantic intuitions between Western and East Asian people. While these results have raised several questions about the evidential support of philosophical theories of the semantics of proper names, they have more generally contributed to challenging the use of intuitions in philosophy.
However, our attempt to replicate these results failed. As part of the X-phi replication project (Cova et al., 2018), we tried to replicate the original results in a high-powered study using a similar design. Our failed replication was considered a curiosity, because the authors of the original study claimed that the effect had been replicated on many occasions.
Therefore, we decided to perform a meta-analysis of all available research on semantic intuitions to obtain a better understanding of the robustness of Machery et al.’s (2004) finding and, more generally, of cross-cultural differences in semantic intuitions.
Felipe Romero and Jan Sprenger: “The Replication Crisis: Social or Statistical Reform?”
(contact authors for more details)
Science is going through a worrisome replication crisis. Many findings in the behavioral sciences don’t replicate: in follow-up experiments, effects often diminish or disappear completely (e.g., Open Science Collaboration, 2015). And albeit less pronounced, the same pattern has been observed for economic and medical sciences. This crisis casts severe doubt on the reliability and trustworthiness of experimental research.
How should we change science to make it more reliable? Statistical reformists hypothesize that the reliability of experiments would be greatly improved by moving away from significance tests (NHST), for instance, by increasing reliance on Bayesian statistics. On the other hand, social reformists hypothesize that changes in inference methods alone do not make science more reliable: we also have to change the social structure of science and the current credit reward scheme.
Using a computer simulation study, we evaluate whether the replication crisis is an artifact of the currently dominant statistical framework, or whether it has deeper causes in the social structure of science. We end up advocating a middle ground between the social and statistical reformists: statistical reform will make science more reliable, but only if combined with social reform.
Matteo Colombo, Georgi Duev, Michèle B. Nuiten and Jan Sprenger: “Statistical Reporting Inconsistencies in Experimental Philosophy”
Experimental philosophy (x-phi) is a young field of research in the intersection of philosophy and psychology. It aims to make progress on philosophical questions by using experimental methods traditionally associated with the psychological and behavioral sciences, such as null hypothesis significance testing (NHST). Motivated by recent discussions about a methodological crisis in the behavioral sciences, questions have been raised about the methodological standards of x-phi. Here, we focus on one aspect of this question, namely the rate of inconsistencies in statistical reporting. Previous research has examined the extent to which published articles in psychology and other behavioral sciences present statistical inconsistencies in reporting the results of NHST. In this study, we used the R package statcheck to detect statistical inconsistencies in x-phi, and compared rates of inconsistencies in psychology and philosophy. We found that rates of inconsistencies in x-phi are lower than in the psychological and behavioral sciences. From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science.
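To illustrate the kind of check involved, the following is a minimal Python sketch of a reporting-consistency test, not the statcheck implementation itself (statcheck is an R package and covers several test types). It assumes a two-sided z-test and a simplified rounding rule; the function names `two_sided_p_from_z`, `is_consistent`, and `is_gross_inconsistency` are hypothetical and chosen only for this example.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Recompute the two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_consistent(z: float, reported_p: float, decimals: int = 2) -> bool:
    """Simplified rule (an assumption of this sketch): the report counts as
    consistent if the recomputed p-value rounds to the reported p-value."""
    return round(two_sided_p_from_z(z), decimals) == round(reported_p, decimals)


def is_gross_inconsistency(z: float, reported_p: float, alpha: float = 0.05) -> bool:
    """Flag cases where the recomputed and reported p-values fall on
    opposite sides of the significance threshold alpha."""
    recomputed = two_sided_p_from_z(z)
    return (recomputed < alpha) != (reported_p < alpha)


if __name__ == "__main__":
    # Hypothetical reported results: (z statistic, reported p-value).
    reports = [(2.20, 0.03), (1.50, 0.04)]
    for z, p in reports:
        print(z, p, is_consistent(z, p), is_gross_inconsistency(z, p))
```

A report is flagged when the p-value recomputed from the test statistic does not match the reported one, and flagged as a gross inconsistency when the mismatch changes the significance decision at the chosen alpha.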
Jan Sprenger: “Conditional Degrees of Belief”
Conditional degree of belief is a fundamental concept in Bayesian inference. Some conditional degrees of belief—the probability of an observation given a statistical hypothesis—are closely aligned with the probability density functions (i.e., “objective chance distributions”) of the corresponding statistical models. Justifying this alignment is, however, a far from trivial task. This paper articulates and defends a suppositional analysis of conditional degree of belief: it is in line with our reasoning practices and explains, unlike other accounts, why degrees of belief often track probability density functions. My account also clarifies the role of chance-credence coordination principles in Bayesian inference. Then, I extend the suppositional analysis and argue that all probabilities in Bayesian inference should be understood as (model-relative) conditional degrees of belief. I conclude with an exploration of how this view affects the relationship between Bayesian models and their target system, and the epistemic significance of Bayes’ Theorem.