
Correct for multiple testing by controlling the FDR or FWER.


treynr/adjust

adjust


In statistics, the multiple testing problem arises when several hypotheses are tested simultaneously. As the number of tests increases, so does the probability of making at least one type I error (false positive). With as few as 20 independent tests at alpha = 0.05, the chance of finding at least one significant result is ~64% even if no true effect exists [Goldman2008]. adjust corrects for multiple testing by controlling the family-wise error rate (FWER) or the false discovery rate (FDR). It is designed to be simple to use and relatively fast: controlling the FDR for a set of GWAS results (13,549,588 tests, 1.4GB) takes about a minute and a half on a single thread of a Xeon E5-2640 @ 2.50GHz.
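The ~64% figure can be checked with a few lines of Python. This is only an illustrative sketch and is not part of adjust; the function name `family_wise_error_rate` is made up for this example, and the calculation assumes the tests are independent and each is run at the same alpha.

```python
def family_wise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among num_tests
    independent tests when every null hypothesis is true."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them do so with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        print(n, round(family_wise_error_rate(n), 3))
```

For 20 tests this gives roughly 0.64, which matches the figure quoted above.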

Usage

$ adjust [options] <input> <output>

adjust operates on delimiter-separated value (DSV) files. If the file has a header, adjust will attempt to infer which column contains the p-value. For example, the following command controls the FDR at alpha < 0.05 using the Benjamini-Hochberg step-up procedure. All rows that don't meet this criterion are removed:

$ adjust input-stats.tsv output-stats.tsv

Or, if you'd rather use the Bonferroni correction to control the FWER:

$ adjust -b input-stats.tsv output-stats.tsv

adjust can also read from stdin and write to stdout when it is one step in a larger pipeline:

$ cat input-stats.tsv | adjust -i -o > output-stats.tsv
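For readers unfamiliar with the two procedures named above, the following Python sketch shows the textbook form of each. It is not taken from adjust's source (which is built with Stack), the function names are hypothetical, and details such as whether the comparison is strict may differ from what adjust actually does.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg
    step-up procedure at FDR level alpha (textbook form)."""
    m = len(p_values)
    # Rank the hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank k with p_(k) <= (k / m) * alpha.
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Step-up: reject everything ranked at or below the cutoff,
    # even if an individual p-value missed its own threshold.
    return sorted(order[:cutoff])


def bonferroni(p_values, alpha=0.05):
    """Indices of hypotheses rejected by the Bonferroni correction,
    which controls the FWER by testing each p-value at alpha / m."""
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p <= threshold]


if __name__ == "__main__":
    p = [0.01, 0.03, 0.035, 0.5]
    print(benjamini_hochberg(p))  # [0, 1, 2]
    print(bonferroni(p))          # [0]
```

In this toy input the second p-value (0.03) exceeds its own threshold of 0.025 but is still rejected under Benjamini-Hochberg because the third one passes; Bonferroni, being more conservative, keeps only the first.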

Options

  • --adjust: Convert and replace p-values with adjusted p-values
  • -a, --alpha=NUM: Set the alpha (default = 0.05)
  • --fdr: Control the FDR using Benjamini-Hochberg step-up procedure
  • --fwer: Control the FWER using the Bonferroni correction
  • -d, --delim=CHAR: Specify a delimiter to use when parsing the input and writing output.
  • -c, --column=INT: Zero-indexed column containing the p-value (currently disabled)
  • -n, --no-header: Specify that the input does not contain a header row (currently disabled)
  • -r, --remove: Remove rows above the given alpha threshold. This is only relevant when producing adjusted p-values using the --adjust option.
  • -i, --stdin: Read from stdin instead of a file
  • -o, --stdout: Write to stdout instead of a file
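To illustrate what "adjusted p-values" usually means in this context, here is a Python sketch of the standard definitions. It is an assumption that the --adjust option follows these formulas; the function names are hypothetical and the code is not taken from adjust.

```python
def bonferroni_adjusted(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bh_adjusted(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p_(j) * m / j over all ranks j at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.01, 0.03, 0.035, 0.5]
    print(bh_adjusted(p))
    print(bonferroni_adjusted(p))
```

Under these definitions, keeping only the rows whose adjusted p-value is at or below alpha selects the same rows as applying the corresponding procedure directly to the raw p-values.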

Installation

Compilation and installation are done with Stack. Set up GHC:

$ stack setup

Build the application:

$ stack build

If you wish to install it to your $PATH:

$ stack build --copy-bins

Requirements

Refs

[Goldman2008] https://www.stat.berkeley.edu/~mgoldman/Section0402.pdf
