Book review: Science Fictions

Science Fictions was one of my favourite books of 2020. It’s a wonderfully succinct account of the many failures in science, starting with flagrant instances of outright fraud and ending with the more insidious problem of misaligned incentives between funders, researchers, publishers, the media, and the public.

Note: The notes below are more a collection of facts I found interesting than a complete summary. I’ve tried to tidy them up to be legible, but there are probably things in there I wouldn’t completely endorse. If I could find a paper within a few minutes of Googling, I linked it; otherwise I didn’t.

Replication crisis

A shockingly low ~1% of studies are replications at the moment. Up to 2014, the figure for economics was 1 in 1,000.

Two important studies that didn't replicate:

  • Peanut allergies developing when you delay introducing your kids to peanuts

  • Caesarean section improving outcomes for twins over vaginal birth.

Fraud

Stories of fraud

Some stories are so far outside the realm of conceivable human behaviour that I had to double-check them.

As one example of scientific fraud, Paolo Macchiarini invented a novel tracheal transplant and tested it on a series of patients. Seven of the eight subsequently died, either from direct complications of the surgery or from indirect ones. Yet Macchiarini got away with a string of publications simply by not reporting the patients’ deaths as outcomes, something nobody checked. Just a totally compulsive liar and psychopath.

The Karolinska Institute and The Lancet – two of the most prestigious institutions in medicine – kept defending him and refusing to investigate.

In the end, he was only found out because he told the person he was having an affair with that he had treated the Pope, and that the Pope was going to officiate their wedding. (He hadn’t.)

How frequent is fraud?

A survey on frequency of fraud found:

  1. 1 in 50 scientists admitted to having committed fraud themselves.

  2. 15% said they knew someone else who had.

Retracted papers still go on to be cited, and 80% of those citations are apparently credulous.

Bias

A 2014 review of top medical journals showed that a third of meta-analyses didn’t check for publication bias. When they did check, publication bias was found about 20% of the time.
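To make concrete what “checking for publication bias” typically involves, here’s a minimal sketch of Egger’s regression test for funnel-plot asymmetry, one common such check (the review itself doesn’t specify a method, and the effect sizes below are made up purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical meta-analysis: effect estimates and their standard errors
# (made-up numbers, purely for illustration).
effects = np.array([0.42, 0.31, 0.55, 0.12, 0.48, 0.60, 0.25, 0.38])
std_errs = np.array([0.20, 0.15, 0.25, 0.08, 0.22, 0.30, 0.10, 0.18])

# Egger's regression: regress the standardized effects on precision.
# An intercept far from zero suggests funnel-plot asymmetry, the pattern
# you'd expect if small null studies are going unpublished.
z = effects / std_errs
precision = 1.0 / std_errs
fit = sm.OLS(z, sm.add_constant(precision)).fit()

print(f"Egger intercept = {fit.params[0]:.2f} (p = {fit.pvalues[0]:.3f})")
```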

Pre-registration doesn't necessarily fix things. Ben Goldacre found that among registered trials published in the top five medical journals, 20% didn’t report the primary outcome they said they would. Registered reports, where the trial protocol is peer-reviewed before the results are known, are an alternative that might offer less room for non-compliance.

A third of all registered medical trials in the US are funded by pharmaceutical companies, and those industry-funded trials are ~30% more likely to report positive results.

Negligence

Growth in a Time of Debt was a highly influential economics paper used as a justification for austerity measures. The paper argued that when "gross external debt reaches 60 percent of GDP", a country's annual growth declined by two percent, and that "for levels of external debt in excess of 90 percent" GDP growth was "roughly cut in half." The authors accidentally left some countries out of the analysis because of an error in their Excel spreadsheet; when the error was corrected, the threshold effect shrank considerably, and it is now highly disputed.

Even big research findings aren't immune. The original cognitive dissonance paper didn't pass the GRIM test, which checks whether the reported statistics are even possible given the sample size. For example, if a paper reports a mean of 1/3 (0.33) from 10 participants giving whole-number responses, something has gone wrong: any such mean must be a multiple of 0.1. Of articles amenable to GRIM analysis, around half contained at least one impossible value, and 16 contained multiple impossible values.
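As a rough sketch of the idea (my own illustration, not the exact procedure from the GRIM paper): the mean of n integer-valued responses must be a whole-number total divided by n, so you can enumerate the nearby totals and see whether any of them rounds to the reported mean.

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if a reported mean is achievable from n integer-valued responses.

    The mean of n integers must equal some whole-number total divided by n,
    so we check whether any nearby integer total rounds to the reported mean.
    (This ignores rounding-tie edge cases; it's a sketch, not the full test.)
    """
    target = round(reported_mean, decimals)
    approx_total = int(round(reported_mean * n))
    candidate_totals = range(approx_total - 1, approx_total + 2)
    return any(round(total / n, decimals) == target for total in candidate_totals)

# The example from the text: a mean of 0.33 reported for 10 participants is
# impossible, because any mean of 10 integers must be a multiple of 0.1.
print(grim_consistent(0.33, 10))  # False
print(grim_consistent(0.30, 10))  # True
```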

1 in 20 psych studies failed to properly randomise (despite attempting to).

Hype

Growth mindset

Carol Dweck wrote a book about how life-changing it is to tell kids that their abilities aren’t fixed. Subsequent work found that growth mindset might account for only about 1% of the variance in childhood outcomes. So she might be right that there is an effect, but it’s massively overhyped relative to its size.

Does everything cause cancer?

Another problem with hype in science is that only around half of health studies are subsequently confirmed by meta-analyses. Couple that with a selection effect (the studies you read about in the news are more likely to be sensational, and thus more likely to regress to the mean), and it means that fewer than half of the health findings reported in the papers are likely to be subsequently confirmed.

A 2012 Ioannidis study looked at 50 cookbook ingredients and found that 80% of commonly consumed foods had at least one trial claiming they either caused or prevented cancer.

 
[Image: Ioannidis study - does everything cause cancer]
 

Other problems in science

  • Referring friends to act as reviewers

  • Researchers spend ~15% of their time writing grant applications

  • Peer reviews aren't predictive of quality

  • Credit isn't shared well under the current authorship/naming system

  • Elsevier journals can cost 25x more than other journals

Causes of science’s failures

Intense pressure

The scientific literature doubles in size every nine years. As a result, publish-or-perish selection pressures intensify, favouring the signalling of ostentatious results and CVs. Just as with the peacock’s tail, the stronger the (evolutionary) selection pressure, the more extravagant the favoured displays become. This leads to the Goodharting of impact metrics such as the h-index and journal impact factor.

It also leads to other bad outcomes like:

  • Salami-slicing what could be published as one study into multiple papers to get more citations

  • Citation cartels, self-citation, and reviewers coercing citations. Self-citations make up a third of citations in a paper’s first three years.

 
[Cartoon: the publish-or-perish academic publishing ecology]
 

Selection effects, switching, & spin

These are all best illustrated by a 2018 study by de Vries, which followed 100 antidepressant trials through their lifecycle from registration to publication and citation. (The figures below are the rounded numbers used in the book for simplicity.)

  • 50 out of the 100 were negative or null

    • Only half of these (25) were published, so there were large selection effects.

      • 15 had their outcomes switched or were p-hacked into looking positive

      • 10 were spun to seem positive, despite being negative or null results

  • 50 were positive

    • Of the positive trials, 98% were published (49/50).

    • Overall, the trials which had positive results were cited 3x as often.

 
[Figure: the de Vries study]
 

Cures

Hard or not clearly doable

  1. State funds everything or big funders like NIH start caring (somehow?)

  2. Switch to Bayesian stats

Easier wins

  1. Automated error-checking. We could run automated checks like GRIM and statcheck over published papers even if researchers won't share their data (a sketch of the statcheck idea follows this list).

  2. Open publishing and data (and OSF ideas like badges)

  3. Switch peer review for pre-prints and separate reviewers and publishers

  4. Fund scientists instead of projects

  5. Pre-registration of trials. Medical trial success rate in the US dropped from 60% to 10% when pre-registration was enforced in 2000.
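To give a flavour of the statcheck idea from point 1 (a toy version I've written for illustration, not the actual statcheck tool, which parses test statistics straight out of papers): given a reported t-statistic, its degrees of freedom, and the reported p-value, you can recompute the p-value and flag any mismatch.

```python
from scipy import stats

def check_reported_t(t_value: float, df: int, reported_p: float,
                     tolerance: float = 0.005) -> bool:
    """Recompute a two-sided p-value from a reported t-statistic and degrees
    of freedom, and flag it if it disagrees with the reported p-value."""
    recomputed_p = 2 * stats.t.sf(abs(t_value), df)
    return abs(recomputed_p - reported_p) <= tolerance

# A paper reporting "t(28) = 2.20, p = .04" checks out (recomputed p ≈ 0.036)...
print(check_reported_t(2.20, 28, 0.04))  # True
# ...while "t(28) = 2.20, p = .01" would be flagged as inconsistent.
print(check_reported_t(2.20, 28, 0.01))  # False
```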

 

 

Speculations

If I were just trying to maximise the outputs from science, how much of a random cross-section of current research output would I trade to remove the messed-up incentives? My initial reaction is somewhere between a half and a tenth. A half is a lower bound set by the fraction of papers that don't replicate; a tenth is basically made up.

The book has increased my pessimism that we'll see fast improvements to these problems. On an outside view, meaningfully changing a culture is often a struggle of decades or generations, and the small strides outlined in the book have been hard won.

However, I am optimistic that science will get a lot better at solving its many problems over the long run. There are a lot of promising signs in how aware intellectuals are of the replication crisis, and in how much has happened even in the half-decade since it rose to prominence.

Because I don’t expect scientific infrastructure and incentives to improve rapidly, I also expect the economically productive disciplines to be outcompeted by industry (as is arguably happening with the tech sector hoovering up academics in AI and related fields). That would still be a worrying outcome, since a lot of public goods will be missed if research requires a short-term connection to a profit-driven bottom line. But it might be better – from the perspective of scientific innovation – than being connected to a bottom line of citations.

Image by holdentrils from Pixabay.