The trouble with p-values

An annoying T-shirt

An annoying T-shirt

Nature has two pieces this week on how p-values are commonly misused to distort scientific results. I’ve often been annoyed by casual statements like “what’s your p-value?” which is sometimes dropped as a quasi-scientific rebuttal in online discussions. Nature’s editors encourage us all to dive a little deeper into the foundations of statistical methods.

The first piece is an editorial called Number Crunch, issues a call to action for practicing scientists and educators:

The first step towards solving a problem is to acknowledge it. In this spirit, Nature urges all scientists to read the News Feature and its summary of the problems of the P value, if only to refresh their memories.

The second step is more difficult, because it involves finding a solution. Too many researchers have an incomplete or outdated sense of what is necessary in statistics; this is a broader problem than misuse of the P value.

Department heads, lab chiefs and senior scientists need to upgrade a good working knowledge of statistics from the ‘desirable’ column in job specifications to ‘essential’. But that, in turn, requires universities and funders to recognize the importance of statistics and provide for it. Nature is trying to do its bit and to acknowledge its own shortcomings. Better use of statistics is a central plank of a reproducibility initiative that aims to boost the reliability of the research that we publish (see Nature 496, 398; 2013).

The second piece is a more detailed article by Regina Nuzzo, titled “Scientific Methods: Statistical Errors.” Nuzzo gives a straightforward explanation of the problem:

It turned out that the problem was not in the data or in Motyl’s analyses. It lay in the surprisingly slippery nature of the P value, which is neither as reliable nor as objective as most scientists assume. “P values are not doing their job, because they can’t,” says Stephen Ziliak, an economist at Roosevelt University in Chicago, Illinois, and a frequent critic of the way statistics are used.

For many scientists, this is especially worrying in light of the reproducibility concerns. In 2005, epidemiologist John Ioannidis of Stanford University in California suggested that most published findings are false2; since then, a string of high-profile replication problems has forced scientists to rethink how they evaluate results.

P values have always had critics. In their almost nine decades of existence, they have been likened to mosquitoes (annoying and impossible to swat away), the emperor’s new clothes (fraught with obvious problems that everyone ignores) and the tool of a “sterile intellectual rake” who ravishes science but leaves it with no progeny3. One researcher suggested rechristening the methodology “statistical hypothesis inference testing”3, presumably for the acronym it would yield.

The article goes on to dissect several common distortions that result from researcher’s pursuit of results with low p-values. The point is well taken, and is a reminder for us all to spend some quality time examining our foundations.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s