Copyright holder: EMBL-European Bioinformatics Institute (Apache 2 License)
This script is designed to automatically fine-map and highlight the causal variants behind GWAS results by cross-examining GWAS, population genetic, epigenetic and cis-regulatory datasets.
Its original design was based on STOPGAP. It takes as input a disease identifier, extracts associated SNPs via GWAS databases, expands them by LD, then searches an array of regulatory and cis-regulatory databases for gene associations.
Add the lib/ directory to your $PYTHONPATH environment variable.
To install all dependencies, run:
sh install_dependencies.sh
Add the bin directory to your $PATH environment variable.
- Via the FTP site (recommended)
  The following script downloads a set of files into the current working directory ($PWD):
  sh download.sh
  Ideally, save these files in a separate directory, which we will call databases_dir.
  Every time you run POSTGAP, add
  --database_dir /path/to/databases_dir
  to the command line.
- Manually (slow)
  This procedure creates a databases_dir directory for you:
  - Type make download to download the public databases.
  - Type make process to preprocess the databases. Warning: this may take days, as it needs to split the entire 1000 Genomes files by population.
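Because every run needs the --database_dir option, it can be convenient to assemble the command line in one place. The following is a minimal sketch, not part of POSTGAP: the helper name build_postgap_command is hypothetical, and it only builds an argument list from flags shown in this README without running anything.

```python
import shlex


def build_postgap_command(database_dir, extra_args, script="POSTGAP.py"):
    """Return an argv list that runs POSTGAP with --database_dir appended.

    build_postgap_command is a hypothetical helper, not part of POSTGAP.
    """
    command = ["python", script]
    command.extend(extra_args)
    command.extend(["--database_dir", database_dir])
    return command


if __name__ == "__main__":
    argv = build_postgap_command("/path/to/databases_dir", ["--disease", "autism"])
    # Print a shell-quoted version; pass argv to subprocess.run(argv) to execute it.
    print(" ".join(shlex.quote(part) for part in argv))
```

Keeping the arguments as a list rather than a single string avoids quoting problems if the database path contains spaces.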
By default, run the following command from the root directory:
python POSTGAP.py --disease autism
Multiple disease names can be provided.
You can also provide a list of EFOs:
python POSTGAP.py --efos EFO_0000196
Or an rsID:
python POSTGAP.py --rsID rs10009124
Or a manually defined variant:
python POSTGAP.py --coords my_variant 1 1234567
To bypass the GWAS databases and supply your own data from a file:
python POSTGAP.py --summary_stats my_stats.txt
The summary statistics file should be tab-delimited, with the following columns:
- Chromosome (GRCh37)
- Position (GRCh37)
- MarkerName
- Effect_allele
- Non_Effect_allele
- Beta
- SE
- Pvalue
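As an illustration of the layout described above, here is a minimal sketch that writes a tab-delimited file with those eight columns using Python's csv module. The helper name write_summary_stats and the example record are made up for illustration, and whether POSTGAP expects a header row with exactly these names is an assumption.

```python
import csv

# Column order taken from the list above; whether POSTGAP expects this exact
# header row (or any header at all) is an assumption.
COLUMNS = [
    "Chromosome",
    "Position",
    "MarkerName",
    "Effect_allele",
    "Non_Effect_allele",
    "Beta",
    "SE",
    "Pvalue",
]


def write_summary_stats(path, rows):
    """Write rows (dicts keyed by COLUMNS) as a tab-delimited file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)


# Made-up example record, for illustration only (GRCh37 coordinates).
example_rows = [
    {
        "Chromosome": "1",
        "Position": 1234567,
        "MarkerName": "my_variant",
        "Effect_allele": "A",
        "Non_Effect_allele": "G",
        "Beta": 0.12,
        "SE": 0.03,
        "Pvalue": 6.3e-05,
    }
]
```

Calling write_summary_stats("my_stats.txt", example_rows) would produce a file that can then be passed to --summary_stats.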
For an EFO, you can trigger the Bayesian calculations with:
python POSTGAP.py --efos EFO_0000196 --bayesian
In this case, POSTGAP produces an output file, 'postgap_output', which can be displayed as:
python extract_data.py
By default, the script writes a tab-delimited file to standard output.
To write the results to a file instead, use the --output option:
python POSTGAP.py --disease autism --output results.txt
If you want a JSON dump of all the data retrieved by the pipeline:
python POSTGAP.py --disease autism --output results.json --json
python POSTGAP.py --disease autism --json
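Either output format can be loaded back into Python for further analysis. The sketch below is illustrative only: the function names are hypothetical, the tab-delimited reader assumes the first line of the file is a header row, and the JSON reader assumes the file holds a single JSON document whose structure is not specified here.

```python
import csv
import json


def read_tsv_results(path):
    """Read a tab-delimited results file into a list of dicts.

    Assumes the first line is a header row naming the columns.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def read_json_results(path):
    """Load a JSON dump; the structure of the returned object is not assumed."""
    with open(path) as handle:
        return json.load(handle)
```

For example, read_tsv_results("results.txt") returns one dictionary per row, keyed by whatever column names appear in the header.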
python postgap_and_tests.py --database_dir /path/to/postgap/databases --efos EFO_0008263 --output EFO_0008263.txt --GWAS GWAS_Catalog
Check out our Wiki