Articles

  • October 23, 2024

    Relentlessly headed for another defeat?

    After having her defamation claims against the Data Colada and Harvard defendants dismissed rather decisively (and very pointedly), Gino amends her US$25 million suit against the university to include Title VII discrimination claims.

    Read more

    October 5, 2024

    Don't Stop Believing Though.

    Another day, another retraction involving some of the usual suspects (Alison Wood Brooks, Juliana Schroeder, Jane L. Risen, Francesca Gino, Adam D. Galinsky, Michael I. Norton, Maurice E. Schweitzer).

    Read more

    See Schroeder’s X thread on this matter.

    September 29, 2024

    Scores of papers by Eliezer Masliah, prominent neuroscientist and top NIH official, fall under suspicion

    This promises to be massive. Read the Science report that Mu Yang links to.

    Here and here

    September 28, 2024

    A curious reading of the memorandum with which Judge Joun dismissed Gino’s defamation claims against the Data Colada and Harvard defendants.

    A curious reading of Judge Joun's Memorandum at best. I guess there is a reason why Gino turned off comments on her post.  
     
    What Gino omits is this: 
    "For the reasons below, the Harvard Defendants’ Motion to Dismiss is GRANTED in part and DENIED in part, and the Data Colada Defendants’ Motion to Dismiss is GRANTED." 
     
    Judge Joun's Memorandum makes for fascinating reading. I am not a lawyer but ... (here is AO’s summary): 
    All defamation claims have been dismissed, often in damning and no uncertain terms. What's left for the "Harvard Defendants" are the claims around contract violation and procedural irregularities. I doubt they will go anywhere. Gino and her lawyers have made much of the alleged switch in policies from the 2013 Research Integrity Policy to the 2021 Interim Policy and Procedures for Responding to Allegations of Research Misconduct, the latter apparently based on federal guidelines. Alas, the 2013 policy gives the Dean considerable leeway in the way s/he deals with such allegations (see p. 6 of the Decision), which arguably includes the creation of such an Interim Policy. Time will tell what the good judge makes of it ...

    Memorandum | Gino's post

    September 24, 2024

    Retracted by Nature Human Behaviour: A Study That Was Hailed as a Win for Science Reform

    This retraction had been a long time coming. Bak-Coleman & Devezer asked important questions about it, now published as Bak-Coleman, J. & Devezer, B. Claims about scientific rigour require rigour. Nat. Hum. Behav. (2024). On 11 December 2023 the NHB editors alerted their readers that "this paper is subject to criticisms that are being considered by the editors. …" Jessica Hullman, referring to the then-circulating Bak-Coleman & Devezer critique, contributed on 27 March 2024 a pointed entry to the Gelman blog Statistical Modeling, Causal Inference, and Social Science (https://statmodeling.stat.columbia.edu/).

    • " ... one of the questions raised by Bak-Coleman and Devezer about the published version was about their claim that all of the confirmatory analyses they present were preregistered. There was no such preregistration in sight if you checked the provided OSF link. I remarked back in November that even in the best case scenario where the missing preregistration was found, it was still depressing and ironic that a paper whose message is about the value of preregistration could make claims about its own preregistration that it couldn’t back up at publication time.  
      ... 
      It seems clear that the dishonesty here was in service of telling a compelling story about something. I’ve seen things like this transpire plenty of times: the goal of getting published leads to attempts to find a good story in whatever results you got. Combined with the appearance of rigor and a good reputation, a researcher can be rewarded for work that on closer inspection involves so much post-hoc interpretation that the preregistration seems mostly irrelevant. It’s not surprising that the story here ends up being one that we would expect some of the authors to have faith in a priori.  
      ... 
      What do I care? Why should you?  
      ... 
      There are many lessons to be drawn here. When someone says all the analyses are preregistered, don’t just accept them at their word, regardless of their reputation. Another lesson that I think Andrew previously highlighted is that researchers sometimes form alliances with others that may have different views for the sake of impact but this can lead to compromised standards. Big collaborative papers where you can’t be sure what your co-authors are up to should make all of us nervous. Dishonesty is not worth the citations." 
       
      (Jessica Hullman on the Gelman blog, March 27, 2024)

      On 24 September 2024, she contributed another entry on the Gelman blog in which she identified four reviewers of the Protzko et al. manuscript (Elson, Yarkoni, Lakens, herself). You can sense her disappointment and frustration clearly.

      That very same day Stephanie Lee also posted a must-read in The Chronicle of Higher Education titled “This Study Was Hailed as a Win for Science Reform. Now It’s Being Retracted”.

      On 26 September 2024, Andrew Gelman himself posted a long summary of the events titled "What's the story behind that paper by the Center for Open Science team that just got retracted?"

      In conclusion: "The 2023 paper that claimed, “this high replication rate justifies confidence in rigour-enhancing methods to increase the replicability of new discoveries,” was a disaster. The 2024 retraction of the paper makes it less of a disaster. As is often the case, what appears to be bad news is actually the revelation of earlier bad news; it’s good news that it got reported.

      Confusion remains regarding the different purposes of replication, along with the role of procedural interventions such as preregistration that are designed to improve science.

      We should all be thankful to Bak-Coleman and Devezer for the work they put into this project. I can see how this can feel frustrating for them: in an ideal world, none of this effort would have been necessary, because the original paper would never have been published!

      The tensions within the science reform movement—as evidenced by the prominent publication of a research article that was originally designed to study a supernatural phenomenon, then was retooled to represent evidence in favor of certain procedural reforms, and finally was shot down by science reformers from the outside—can be seen as symbolic of, or representative of, a more general tension that is inherent in science. I’m speaking here of the tension between hypothesizing and criticism, between modeling and model checking, between normal science and scientific revolutions (here’s a Bayesian take on that). I think scientific theories and scientific measurement need to be added to this mix."

      On 14 October 2024, Holly Else provided a summary of the events and the questions they prompt (e.g., about the challenges pre-registration faces).

      Meanwhile, NHB has invited the authors to revise and resubmit their study – how strange is that? A subset of the authors has posted a piece titled “Historical Summary of ‘High replicability of newly discovered social-behavioural findings is achievable’ paper” on OSF.

    July 30, 2024

    Heterogeneity – it’s a thing. A thing that limits the generalizability of published scientific findings.

    A PNAS publication by Holzmeister, Johannesson, Boehm, Dreber, Huber, & Kirchler.

    "In conducting empirical research in the social sciences, the results of testing the same hypothesis can vary depending on the population sampled, the study design, and the analysis. Variation in such choices across studies leads to heterogeneity in results that introduce an additional layer of uncertainty, limiting the generalizability of published scientific findings."

    Read more

    January 15, 2024

    What are pre-registrations good for? (Absolutely nothing!?) 

    This is a review of some relevant references and results. It has been repeatedly updated since its publication, most recently on 25 September 2024.

    Read more

  • To be updated soon.

  • To be updated soon.

  • April 23, 2021

    Unconscious thought theory is dead. As dead as can be.

    “The main article that contains evidential value was published in 2016. Based on these results, I argue that 47 of the 48 articles do not contain credible empirical information that supports the claims in these articles. These articles should not be cited as if they contain empirical evidence.”

    Read more

    April 21, 2021

    A discussion and meta-discussion of statistical modeling, causal inference, and social science

    Andrew Gelman of Columbia has a handy list of giveaways that should make you suspicious about the claims of any paper:

    Here are some quick reasons to distrust a paper (with examples):

    • The claimed effect is implausibly large (beauty and sex ratio, ovulation and voting)
    • The results look too good compared to the noise level (various papers criticized by Gregory Francis)
    • The paper makes claims that are not addressed by the data at hand (power pose)
    • The published numbers just don’t add up (pizzagate)
    • The claim would seem to violate some physical laws (ESP)
    • The fitted model makes no sense (air pollution in China)
    • Garbled data (gremlins)
    • Misleading citation of the literature (sleep dude)

    And lots more.

  • January 4, 2020

    Warne on Mindsets Research

    Growth mindsets can improve academic performance–if you have Carol Dweck in charge of your intervention. Savage.

  • December 31, 2019

    A Toast To Error Detectors

    The tide is turning.

    December 30, 2019

    Wansink on What He Learned

    “In 2017-19, about 18 of my research articles were retracted. These retractions offer some useful lessons to scholars, and they also offer some useful next steps to those who want to publish in the social sciences. Two of these steps include 1) Choose a publishable topic, and 2) have a rough mental roadmap of what the finished paper might look. That is, what’s the positioning, the study, and the possible contribution.”

    Very strange.

    December 26, 2019

    Comparing Meta-Analyses (the gold-standard for many) and Multi-Lab Replications

    “We find that meta-analytic effect sizes are significantly different from replication effect sizes for 12 out of the 15 meta-replication pairs. These differences are systematic and, on average, meta-analytic effect sizes are almost three times as large as replication effect sizes.” Wow. There goes the gold-standard.

    December 23, 2019

    What can we learn from five naturalistic field experiments that failed to shift commuter behavior?

    That nudges are oversold? The evidence accumulates (see here and here)

    December 11, 2019

    Do Elevated Viewpoints Increase Risk Taking?

    Apparently not. Yes, we are as shocked as you are. This (failed) replication was the inaugural one by our friends from Data Colada, introducing a new feature called Data Replicada.

    December 10, 2019

    Data Replicada

    Two of our heroes at Data Colada – they of the 2011 false-positive-psychology hit – have started a new feature called Data Replicada in which they will report the results of “informative” (i.e., properly powered) replication attempts of well-designed studies recently published in two top behavioral marketing journals: the Journal of Consumer Research and the Journal of Marketing Research. All replication attempts will be reported (i.e., there will be no file drawer).

    November 15, 2019

    Do PAPs (Pre-Analysis Plans) work?

    “So, even if improvements in research credibility do not come from every PAP, the growing adoption of PAPs in Political Science and Economics has almost certainly increased the number of credible studies in these fields.” Duh.

    September 1, 2019

    The Title Says It All

    Null results of oxytocin and vasopressin administration across a range of social cognitive and behavioral paradigms: Evidence from a randomized controlled trial.

    July 21, 2019

    Warne On A Big, And In His View Well-Designed UK Mindset Training Study

    Clearly he is not impressed: Mindset training had ZERO impact on any of the dependent variables:

    • Reading achievement
    • Writing mechanics
    • Math achievement
    • Self-worth
    • Self-efficacy
    • Test anxiety
    • Self-regulation

    June 30, 2019

    How Effective is Nudging? Well, It Depends.

    The authors report the results of a quantitative review of nudging based on 100 primary publications comprising 317 effect sizes from different research areas. After having sliced and diced their data in various ways, they find, for example, huge differences in relative effect sizes between environment, finance, health, and privacy on the one hand and energy and policy making on the other. (Take a guess where you find the large effects.) Note that the primary publications are neither meta-analyses nor pre-registered multi-lab studies, which suggests that all effect sizes reported here are over-estimates. See the Kvarven et al. study from later in the year.

  • November 19, 2018

    Too Harsh An Assessment

    Ed Yong’s take on the ManyLabs2 study. We think it’s too harsh an assessment. Certainly, this massive replication attempt provides a roadmap for identifying robust effects. Even in psychology. Another good write-up can be found here.

    November 14, 2018

    Good Advice on How to Conduct a Replication Study

    Annette L. Brown provides useful and sensible advice about what not to do when you try to replicate someone else’s study.

    For example: Don’t present, post or publish replication results without first sharing them with the original authors. Word.

    November 8, 2018

    Replicability Ranking of Eminent Social Psychologists

    Uli Schimmack, of Replicability Index fame (and also a featured speaker at the BIBaP – BizLab 2018 workshop on questionable research practices), has computed a replicability ranking of eminent social psychologists. And why not? Given the prevalence of questionable research practices in social psychology in particular, some such ranking seems more useful than citation indices. Some surprises (positive and negative) there, but see for yourself.

    November 8, 2018

    Pre-Registration is Not For the Birds

    In Mother Jones, Kevin Drum identifies what he calls — sensationalistically — the Chart of the Decade.

    It is actually a study that’s three years old. The authors collected every significant clinical study of drugs and dietary supplements for the treatment or prevention of cardiovascular disease between 1974 and 2012.

    Prior to 2000, questionable research practices were fair game. 13 out of 22 studies (59 percent) showed significant benefits. Once pre-registration was required, only 2 out of 21 did. Surprise! Not!
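
    Taking the quoted counts at face value, here is a minimal sketch (ours, not the original study's analysis) of how unlikely such a drop in "significant benefit" findings would be by chance alone, using a Fisher exact test:

```python
# Minimal sketch (our illustration, not the original study's analysis):
# compare the share of trials reporting significant benefits before and
# after pre-registration became mandatory, using the counts quoted above.
from scipy.stats import fisher_exact

pre_2000  = [13, 22 - 13]   # [significant, non-significant] before pre-registration
post_2000 = [2, 21 - 2]     # [significant, non-significant] after pre-registration

odds_ratio, p_value = fisher_exact([pre_2000, post_2000])
print(f"odds ratio ~ {odds_ratio:.1f}, two-sided p ~ {p_value:.4f}")
```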

    October 9, 2018

    James Heathers On The Wansink Story – How It Began And How It Has Ended, For Now

    After the JAMA family of journals retracted six of Wansink’s articles in one fell swoop, and after Cornell announced his resignation, there was a flurry of comments and reflections. One of the most interesting ones was this. James Heathers was one of the four musketeers who investigated selected pieces of Wansink’s oeuvre. Here James describes how he got involved initially. Chock-full of interesting links. A very readable background story about the motives of the people who engineered Wansink’s downfall.

    September 12, 2018

    What Authorship Means

    Ioannidis, himself a very prolific author, and his co-authors identify about 9,000 authors who published more than 72 papers (the equivalent of one paper every 5 days) in any one calendar year between 2000 and 2016, a figure that many would consider implausibly prolific. Lots of interesting details about where these authors live and what disciplines they work in. A fabulous illustration.

    September 7, 2018

    Talking about Spaceship Earth and the Dinosaurs’ Extinction Can Get Heated

    A very long but very interesting article. Lots of interesting facts about spaceship earth and the five major extinctions it has apparently gone through. The latest one has been cause for a scientific controversy (about whether the last extinction was rapid and caused by an asteroid, or the result of a series of colossal volcanic eruptions) that has gone on unabated for decades, with a viciousness that makes even current debates in psychology look tame. Originally published August 2018.

    August 27, 2018

    Another Day, More Replication Failures

    An international team of researchers, mostly from economics and psychology and known as the Social Sciences Replication Project, published the results of their attempt to replicate 21 studies published in Science and Nature between 2010 and 2015. The results were, in our view, somewhat sobering: the researchers succeeded in only 13 of 21 cases, and where they succeeded the effect sizes were about half those of the originals. Arguably the most interesting finding was that the prediction market the researchers conducted in parallel predicted amazingly well which studies would fail and which would succeed. Good write-ups on this latter finding can be found here (Ed Yong in The Atlantic) and here (Gidi Nave in The Neuroeconomist).

    You might want to play this rather interesting guessing game yourself.

    June 14, 2018

    Retracted and Replaced for No Good Reason?

    The Washington Post reported on a major study on the Mediterranean diet having been retracted and replaced. Was there a good reason for it?

    The lead author of the study told The Washington Post that the causal link is just as strong as in the original report. Which poses the interesting question: if the original study was so problematic that the authors chose to withdraw it entirely, could the new one be trusted?

    June 7, 2018

    Deconstruction and Re-evaluation of (In)famous Experiments

    Both Zimbardo’s (in)famous Stanford Prison Experiment and the equally (in)famous Milgram Experiment have recently been deconstructed and re-evaluated. A good write-up about the developments pertaining to the former can be found here; a good write-up about the developments pertaining to the latter can be found here.

    May 1, 2018

    Thinking Clearly About Correlations and Causations

    Julia Rohrer, who was one of our featured speakers at last year’s BIBaP – BizLab workshop, has two excellent recent papers worth a read. The first is on correlation and causation and the second illustrates the need for specification curves through a study of birth-order effects. Recommended.

    April 17, 2018

    An Upbeat Mood May Boost Your Paper’s Publicity

    Well, maybe not.

    April 7, 2018

    Four Misconceptions About Statistical Power

    Explained by Zad here

    • Misconception 1: Statistical power can only be increased with larger sample sizes
    • Misconception 2: You’ve reached enough statistical power, or your study is underpowered.
    • Misconception 3: The problem with low power is that you’ll miss true effects.
    • Misconception 4: Effect sizes and standard deviations from pilot studies should be used to calculate sample sizes for larger studies.

    A related good read: Statistical tests, P values, confidence intervals, and power: a guide to misinterpretation
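
    For those who want to poke at misconception 1 themselves, here is a small simulation sketch (our illustration, not taken from Zad’s post) showing that, for a two-sample t-test, power depends on the assumed effect size just as much as on the sample size; the effect sizes and group sizes below are arbitrary illustrative choices:

```python
# Small simulation sketch (our illustration): estimated power of a
# two-sample t-test as a function of effect size AND sample size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulated_power(n_per_group, effect_size, alpha=0.05, n_sims=5000):
    """Fraction of simulated two-sample t-tests that reach p < alpha."""
    hits = 0
    for _ in range(n_sims):
        control   = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(effect_size, 1.0, n_per_group)  # true effect = effect_size
        _, p = stats.ttest_ind(treatment, control)
        hits += p < alpha
    return hits / n_sims

for d in (0.2, 0.5, 0.8):            # small, medium, large standardized effects
    for n in (20, 50, 100):
        print(f"d={d:.1f}, n={n:3d} per group: power ~ {simulated_power(n, d):.2f}")
```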

    March 30, 2018

    Keeping Science Honest, One Whistle Blown at a Time

    Wherein whistleblowers Josefin Sundin and Fredrik Jutfelt explain how blowing the whistle on the authors of a widely publicised, sensationalist (fish prefer microplastics to live prey!) but fraudulent research article published in Science in June 2016 ultimately led to vindication and the retraction of the paper, at huge personal and professional cost to them.

    They conclude: “Ideally, whistle-blowing should not be necessary. The scientific community must enforce a culture of honesty. Sometimes that takes courage.”

    True fact. But it should not have to.

    March 28, 2018

    How to Publish Statistically Insignificant Results in Economics

    MIT economist Alberto Abadie makes the case that statistically insignificant results are at least as interesting as significant ones. One of Abadie’s key points (in a deeply reductive nutshell) is that results are interesting if they change what we believe (or “update our priors”). With most public policy interventions, there is no reason to expect the impact to be zero, so there is no reason that the only finding that should change our beliefs is a non-zero finding. This is a very readable write-up about the (less readable) Abadie paper by the Development Impact bloggers at the World Bank.

    March 20, 2018

    How (and Whether) to Teach Undergraduates About the Replication Crisis in Psychological Science

    “We developed and validated a 1-hr lecture communicating issues surrounding the replication crisis and current recommendations to increase reproducibility. Pre- and post-lecture surveys suggest that the lecture serves as an excellent pedagogical tool. Following the lecture, students trusted psychological studies slightly less but saw greater similarities between psychology and natural science fields. We discuss challenges for instructors taking the initiative to communicate these issues to undergraduates in an evenhanded way.” (from the abstract).

    March 19, 2018

    Big Booze, like Big Banks and Other Big Business, is Relentless in Its Pursuit of Profit

    HealthNewsReview.org (now also featured on our useful links page) summarizes a report by the New York Times (and related reports) suggesting that lead researchers on a $100 million NIH study of the effects of moderate alcohol consumption had extensive discussions with the alcohol industry prior to securing the sponsorship. Apparently the NIH’s standards are slipping. The alcohol industry has also tried to buy favourable reporting by journalists. Which makes sense: “After all, if you’re going to invest $100 million in a study, wouldn’t it make sense to cultivate journalists to help put a nice shine on the results?”

    March 9, 2018

    How to Make Replication The Norm

    The enquiring minds of Paul Gertler, Sebastian Galiani, and Mauricio Romero wanted to know. Focusing on economics, political science, sociology and psychology, in which ready access to raw data and software code is crucial to replication efforts, they survey deficiencies in the current system and propose reforms that can both encourage and reinforce better behaviour — a system in which authors feel that replication of software code is both probable and fair, and in which less time and effort is required for replication. Food for thought.

    Here you can find the World Development Report 2015: Mind, Society, and Behavior.

    And here is a critical review of it by one of our own.

    March 8, 2018

    Is There or Is There Not (A Reproducibility Crisis)?

    Somewhat surprisingly (given his own contributions over the last ten years or so), Daniele Fanelli suggests in a recent piece in PNAS that the “narrative of crisis” is mistaken, and that “a narrative of epochal changes and empowerment of scientists would be more accurate, inspiring, and compelling.”

    We have our doubts. So do many others as the relevant discussion on the Facebook Methodological Discussion group shows.

    February 24, 2018

    The New Lancet Study about Antidepressants is Not Exactly News

    A new Lancet study about antidepressants got lots of play in the press. One of our favorite curmudgeons, Neurosceptic, points out that it tells us very little that we didn’t already know, and it has a number of limitations: “The media reaction to the paper is frankly bananas.”

    February 20, 2018

    Brembs on Prestigious Science Journals: Do They Live up to Their Reputations?

    Brembs, well known to the readers of this site through his blog, has just published a review of the evidence that speaks to the issue of the reliability (trustworthiness) of prestigious science journals. He comes to the somewhat distressing conclusion that these journals can’t be trusted any more than lower-ranked journals. In fact, the evidence seems to suggest that science published in lower-ranked journals is more reliable. You read that right.

    February 19, 2018

    Introducing Advances in Methods and Practices in Psychological Science

    Daniel Simons introduces the new journal Advances in Methods and Practices in Psychological Science (AMPPS), which is designed to foster discussions of, and advances in, practices, research design, and statistical methods.

    February 18, 2018

    Gelman on the Replication Crisis and Underpowered Studies: What Have We Learned Since 2004 (or 1984, or 1964)?

    The man behind the Statistical Modeling, Causal Inference, and Social Science website, Andrew Gelman, takes a look back to consider what we have learned from critiques way back when that sound very modern. Apparently a little bit. 

    February 10, 2018

    RCTs are Not Always the Gold Standard: How We Figured Out that Smoking Causes Cancer

    A very good discussion of the limits of RCTs and what it takes to establish causality here.

    (We have added the source of this article to our Useful Links page because it has lots to offer.)

    February 1, 2018

    Replication is Not Enough: The Case for “Triangulation”

    Munafò and Smith on why replication is not enough and why a problem has to be attacked in several ways.

    January 31, 2018

    More on the Cornell Food and Brand Lab Situation

    Wansink’s former students are not amused. And who can blame them?

    Meanwhile… Cornell is still belaboring its internal investigation “in compliance with our internal policies and any external regulations that may apply”. Some scholars are not impressed.

    January 18, 2018

    Replication Studies: A Report from the Royal Netherlands Academy of Arts and Sciences

    The Dutch Academy of Arts and Sciences just published the report of a committee charged with identifying, and addressing, the problems that seem to beset Dutch psychologists in particular (cue Stapel and others). Eric-Jan Wagenmakers was a member of that committee and provides a useful summary of its findings.

    January 15, 2018

    The Ultimate Reading List for a Graduate Course on Reproducibility and Replicability, For Now

    Brent Roberts and Dan Simons have compiled, from the hard labors of many other people, an up-to-date and fairly complete reading list for their current 2018 graduate course on Reproducibility and Replicability, focusing on readings that 1) identify the reasons for the current crisis, and 2) provide ways to fix the problems.

    Subsections include:

    • Definitions of reproducibility & replicability
    • Houston, we have a problem (and by “we” we mean everyone, not just psychologists or social psychologists)
    • The problems that plague us have been plaguing us for a very long time
    • The problems that plague us: low power
    • The problems that plague us: selective publication; bias against the null
    • The problems that plague us: procedural overfitting
    • The problems that plague us: quality control  
    • NHST, P-values, and the like
    • Preregistration
    • Power and power analysis
    • On Replication
    • Open Science
    • Complexities in data availability
    • Informational value of existing research
    • Solutions

    January 5, 2018

    Progress Assessed

    The editor of Science, Jeremy Berg, assesses the progress on reproducibility and is hopeful:

    “Over the past year, we have retracted three papers previously published in Science. The circumstances of these retractions highlight some of the challenges connected to reproducibility policies. In one case, the authors failed to comply with an agreement to post the data underlying their study. Subsequent investigations concluded that one of the authors did not conduct the experiments as described and fabricated data. Here, the lack of compliance with the data-posting policy was associated with a much deeper issue and highlights one of the benefits of policies regarding data transparency. In a second case, some of the authors of a paper requested retraction after they could not reproduce the previously published results. Because all authors of the original paper did not agree with this conclusion, they decided to attempt additional experiments to try to resolve the issues. These reproducibility experiments did not conclusively confirm the original results, and the editors agreed that the paper should be retracted. This case again reveals some of the subtlety associated with reproducibility. In the final case, the authors retracted a paper over extensive and incompletely described variations in image processing. This emphasizes the importance of accurately presented primary data.”

    January 5, 2018

    Should the Bem Feeling-the-Future Article (JPSP 2011) be Retracted?

    Uli Schimmack has done some superb data sleuthing and, based on it, makes a fairly persuasive case for this article (which in a sense involuntarily started the replicability revolution in psychological science) to be retracted.

    A very good read that will teach you, by example, more about questionable research practices than pretty much anything else. That someone like Bem would make the comments attributed to him is hard to believe.

  • December 18, 2017

    The Next Stapel?

    Nick Brown, one of the researchers behind the questions raised about the Cornell Food and Brand Lab (see here and here), has now raised questions about another matter. After having tried for two years to get answers from Nicolas Guéguen, of the Université Bretagne-Sud in France, on numerous papers of that researcher that had obvious problems, he posted on his blog a summary of his findings and questions. Good questions all, as far as we can see.

    One also wonders how any respectable journal could publish some of these papers:

     “As well as the articles we have blogged about, he has published research on such vital topics as whether women with larger breasts get more invitations to dance in nightclubs (they do), whether women are more likely to give their phone number to a man if asked while walking near a flower shop (they are), and whether a male bus driver is more likely to let a woman (but not a man) ride the bus for free if she touches him (we’ll let you guess the answer to this one).”

    Why the next Stapel? See here.

    December 1, 2017

    The Health Benefits of Volunteering May Not Be What They Have Been Made Out to Be

    Leif Nelson, one of the three musketeers behind Data Colada, has a good piece on three alleged scientific benefits that come with volunteering, other than the warm glow. He concludes:

    “I see three findings, all of which are intriguing to consider, but none of which are particularly persuasive. The journalist, who presumably has been unable to read all of the original sources, is reduced to reporting their claims. The readers, who are even more removed, take the journalist’s claims at face value: ‘if I volunteer then I will walk around better, lower my blood pressure, and live longer. Sweet.'”

    November 30, 2017

    Failed

    Uli Schimmack has produced an impressive quantitative review of the “evidence” in Bargh’s book Before you know it: The unconscious reasons we do what we do.

    Concludes he: “There is no clear criterion for inadequate replicability, but Tversky and Kahneman (1971) suggested a minimum of 50%.  Professors are also used to give students who scored below 50% on a test an F.  So, I decided to use the grading scheme at my university as a grading scheme for replicability scores.  So, the overall score for the replicability of studies cited by Bargh to support the ideas in his book is F.”

    November 29, 2017

    Five (Out of a Zillion) Ways to Fix Statistics

    Nature asked influential statisticians to recommend one change to improve science.

    Relatedly, one of the authors (Gelman) has really had it with all that talk about false positives, false negatives, false discoveries, etc.

    Update: December 5, 2017

    Ed Hagen comments on these five comments and stresses that, before better statistics can be effective, it is imperative to change the incentives for researchers to use them. Specifically, “changing the incentives to reward high quality studies rather than sexy results would have enormously positive effects for science.” He makes the case for pre-registration of hypotheses and statistical tests.

    November 28, 2017

    d=2.44? Too Good to be True?

    … so men might not be less likely to help a woman who has her hair tied up in a ponytail or a bun after all. Which right there restores our faith in mankind.

    November 23, 2017

    The Cornell Food and Brand Lab Saga Continues

    No, the magic number is not 42; apparently it is 770.

    November 12, 2017

    Changing the Default p-Value? Why Stop There?

    In the wake of recent proposals to (not) change the p-value threshold, McShane, Gal, Gelman, Robert, and Tackett recommend abandoning the null hypothesis significance testing paradigm entirely, leaving p-values as just one of many pieces of information with no privileged role in scientific publication and decision making.

    That’s of course not a particularly new idea: Gigerenzer, Leamer, Ziliak & McCloskey, and Hubbard have promoted it for decades (something the present authors seem either not to know or prefer not to acknowledge), but it is worth recalling.

    November 10, 2017

    The Cuddy Saga Continues

    Following developments here, here, here, here, here, here and here, Cuddy now claims it was all a great misunderstanding and that power posing effects are about felt power (i.e. mere thinking and feeling) rather than subsequent choices. An interesting refocusing of the narrative…

    P-curving A More Comprehensive Body of Research on Postural Feedback Reveals Clear Evidential Value For “Power Posing” Effects: Reply to Simmons and Simonsohn

    Update: December 6, 2017

    In Data Colada,  Joe Simmons, Leif Nelson, and Uri Simonsohn take on this analysis. You will not be surprised that they come to a different conclusion.

    October 25, 2017

    Replicability and Reproducibility Discussion in Economics

    The Economic Journal (one of the leading economics journals) has a Feature (a collection of articles) on the replicability of economic science. Particularly noteworthy is the article by Ioannidis, Doucouliagos, and Stanley titled The Power of Bias in Economics Research.

    If you have any illusions about the sorry state of economics research, the abstract will disabuse you of them:

    “We investigate two critical dimensions of the credibility of empirical economics research: statistical power and bias. We survey 159 empirical economics literatures that draw upon 64,076 estimates of economic parameters reported in more than 6,700 empirical studies. Half of the research areas have nearly 90% of their results under-powered. The median statistical power is 18%, or less. A simple weighted average of those reported results that are adequately powered (power ≥ 80%) reveals that nearly 80% of the reported effects in these empirical economics literatures are exaggerated; typically, by a factor of two and with one-third inflated by a factor of four or more.”
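
    To see how such low power translates into exaggerated effects, here is a rough simulation sketch (ours, not the paper’s analysis); the true effect, sample size, and number of simulated studies are arbitrary illustrative choices:

```python
# Rough simulation sketch (our illustration, not the paper's analysis):
# when a study is under-powered, the estimates that happen to reach
# p < .05 exaggerate the true effect (the "winner's curse", or Gelman's
# Type M error).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

true_effect, n_per_group, n_studies = 0.2, 30, 20000
significant_estimates = []

for _ in range(n_studies):
    control   = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(true_effect, 1.0, n_per_group)
    _, p = stats.ttest_ind(treatment, control)
    if p < 0.05:
        significant_estimates.append(treatment.mean() - control.mean())

power = len(significant_estimates) / n_studies
exaggeration = np.mean(significant_estimates) / true_effect
print(f"power ~ {power:.2f}; statistically significant estimates overshoot "
      f"the true effect by a factor of ~ {exaggeration:.1f} on average")
```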

    Chris Doucouliagos is one of the key speakers at our 2017 workshop.

    October 25, 2017

    Making Replication Mainstream

    Behavioral and Brain Sciences (the journal) has accepted a new target article by Rolf Zwaan and colleagues on an important issue. Comments are currently being solicited.

    Rolf was one of the speakers at our 2015 workshop.

    October 24, 2017

    Criticising a Scientist’s Work Isn’t (Sexist) Bullying. It’s Science.

    Simine Vazire reflects on the accusations that followed the New York Times piece on Amy Cuddy. Says she …

    “Cuddy’s story is an important story to tell: It is a story of a woman living in a misogynistic society, having to put up with internet bullies … . But it is also a story of a woman experiencing completely appropriate scientific criticism of a finding she published. Conflating those issues, and the people delivering the “attacks,” does a disservice to the fight for gender equality, and it does a disservice to science.”

    October 19, 2017

    Did Power-Posing Guru Amy Cuddy Deserve her Public Shaming?

    In the wake of The New York Times piece on Amy Cuddy, Daniel Engber investigates the accusation that she was dragged through the mud because she is a woman.

    October 19, 2017

    Wansink’s Cornell Food and Brand Lab Saga, Continued

    Last month yet another flawed study by Wansink’s lab was retracted (one of about 50 currently questioned, under review, or already retracted for good), only to be replaced by a corrected study. Unfortunately, this corrected study itself apparently also needs correction. How many rounds will it go? We wonder, too.

    Relatedly,  BuzzFeed has provided email excerpts from Wansink’s attempts to save his reputation; predictably they do not really help his case. You can read about it here.

    October 18, 2017

    Power Poses, Revisited

    The New York Times’s Susan Dominus has written a long piece on When the Revolution Came for Amy Cuddy. The piece was actually as much about her as it was about her collaborator (Carney) on the original power poses paper, Simmons and Simonsohn (who critiqued it), and the culture that led to it and is currently being swept away. A good, if overly sympathetic, look at the travails in which Cuddy now finds herself.

    Relatedly, Gelman (who previously has commented repeatedly on the power poses controversy) published on the same day a reflection on the NYT piece which is worth a read, as is the extensive commentary that follows it (as of October 21 more than 100 pages if you were to print it out).

    October 9, 2017

    You Always Wanted to do This, So Here is Your Chance

    Daniel Lakens now has a course on offer on Coursera on Improving your statistical inferences 

    Recommended.

    October 8, 2017

    (Actually an article from last year which we have become aware of only now)

    Science in the Age of Selfies

    Lots of eminently citable quotes in there which capture the essence of what is becoming, or arguably has become, a serious problem: Albert Einstein remarked that “an academic career, in which a person is forced to produce scientific writings in great amounts, creates a danger of intellectual superficiality”. True fact.

    See also: Simine Vazire’s blog entry below.

    October 1, 2017

    Can We Really Not Avoid Being Blinded By Shiny Results?

    Simine Vazire has some food for thought.

    September 28, 2017

    Cornell Food and Brand Lab Scandal: Update

    BuzzFeed News has a very useful update on the Wansink food lab story; no words are minced. See here.

    September 27, 2017

    Gelman Has Sympathy for Big-Shot Psychologist That He Skewered Earlier

    The beginning of a beautiful new friendship? Find out here.

    September 26, 2017

    James Coyne Doubles Down on His Two Earlier Power Pose Blog Entries, Taking No Prisoners in Typical Coyne Style

    The titles of his two blog entries are, as usual, descriptive. Part 1 is titled Demonstrating that replication initiatives won’t salvage the trustworthiness of psychology and Part 2 is titled Could early career investigators participating in replication initiatives hurt their advancement?

    See whether you agree.

    July 31, 2017

    Changing the Default p-Value?

    A group of mostly fairly well-known economists, psychologists, and others interested in the replication crisis in the social sciences have proposed to change the default p-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005. Much hilarity ensued.

    See, for example, here, here and here.

    July 22, 2017

    Should Journals Be Responsible for Reproducibility?

    The American Journal of Political Science, one of the top journals in the field, wants to know. Very seriously. A good write-up from Inside Higher Ed here.

    July 21, 2017

    Another Day, Another Challenge.

    Our friends from the Open Science Framework have issued a pre-registration challenge:

    “Preregistration increases the credibility of hypothesis testing by confirming in advance what will be analyzed and reported. For the Preregistration Challenge, one thousand researchers will win $1,000 each for publishing results of preregistered research.”

    How tempting is that? We will soon find out.

    The Leaderboard for the Preregistration Challenge can be found here. Take a look.

    The project is closely related to the Replicability Index which we recommend on our resources page.

    July 15, 2017

    McDonaldization of Positive Psychology: On Tour With Marty Seligman

    James C. Coyne commenting and vicariously reporting…

    July 8, 2017

    Psychologist Disputes His Findings Won’t Replicate

    Andrew Gelman recently posted a comment on a new controversy involving findings by German psychologist Fritz Strack, who reported quite a while ago that holding a pen in your mouth in a way that induces a smile makes cartoons seem funnier. This very famous facial-feedback result has repeatedly failed to replicate in recent years. A comment by Strack on the ensuing controversy led to Gelman’s comment and extended commentary on his blog, as well as on various other discussion groups such as the Facebook Psychological Methods discussion group.

    Well-known philosopher of science Deborah G. Mayo has also chipped in on this controversy.

    June 28, 2017

    Power Poses Have No Lasting Power

    The relatively new journal Comprehensive Results in Social Psychology just published an issue on power poses.

    Going back to work done by Dana Carney, Amy Cuddy, and Andy Yap a few years back, an emergent literature on power poses originally suggested that nonverbal expressions of power affect people’s feelings, behaviors, and hormone levels. In particular, they claimed that adopting body postures associated with dominance and power (“power posing”) can increase testosterone, decrease cortisol, increase appetite for risk, and cause better performance in job interviews. Subsequent attempts to replicate the effects failed and last year Dana Carney made it clear that she did not think there was any such effect and that the original study that started it all was methodologically flawed.

    Carney was one of the special-issue editors of Comprehensive Results in Social Psychology’s just published issue on power poses. This issue is of interest both because of its results (no power poses effects) and its methodology (pre-registered replication studies).

    Quite some discussion of this issue also occurred on the Facebook Psychological Methods discussion group.

    On the PLOS Blogs, James Coyne has, in a two-part series (see here and here), also usefully reflected on this controversy.

    1 June 2017

    Quality Uncertainty Erodes Trust in Science

    Of interest to anyone following the recent replicability and credibility crisis in the social sciences is this article outlining the inability of consumers of science to adequately evaluate the strength of scientific studies without appropriate transparency.

    31 May 2017

    Cornell Food and Brand Lab Scandal

    A recent scandal that has played out in the blogosphere (and even the popular press) concerns the questionable goings-on in the Cornell Food and Brand Lab. Here is a recent summary of the controversy and the important questions about questionable research practices it poses.

    Here and here are more relevant discussions.

    31 May 2017

    Priming Research Train Wreck Debate

    In a blog entry dated 2 February 2017, Ulrich Schimmack, Moritz Heene, and Kamini Kesavan wrote that the “priming research” of Bargh and others featured in Kahneman’s book Thinking, Fast and Slow “is a train wreck” and should not be considered “as scientific evidence that subtle cues in their environment can have strong effects on their behavior outside their awareness”. Kahneman, in response, conceded that he had placed too much faith in under-powered studies. Here is a good place to start reading up on the debate.

    28 May 2017

    The Social Cost of Junk Science

    In this blog entry, Andrew Gelman discusses what needs to be done to avoid it.

    7 May 2017

    On Eminence, Junk Science, and Blind Reviewing

    In this blog entry, Andrew Gelman discusses these issues with Lee Jussim and Simine Vazire.

    30 April 2017

    Ujjval Vyas on Evidence-Based Design and Ensuing Discussion

    Andrew Gelman recently posted a comment by Ujjval Vyas as a separate entry on his blog, which has, not surprisingly, drawn huge reactions. Vyas pointed out that evidence-based design abounds with the kind of questionable research practices that make Wansink look scrupulous! A fascinating read which you can find here.