Pipeline that runs two-sample Mendelian randomization on molecular QTLs and GWAS loci and tests for horizontal pleiotropy.

laleoarrow/TwoSampleMR_pipeline

 
 

TwoSampleMR pipeline

This repository contains the software implementation of a TwoSampleMR (two-sample Mendelian randomization) pipeline. The pipeline uses GWAS and QTL summary statistics to estimate the causal effect of a molecular QTL in a given tissue (the exposure), such as an expression or splicing QTL, on a trait (the outcome). This pipeline was run in the study: Integrating genetic regulation and single-cell expression with GWAS prioritizes causal genes and cell types for glaucoma. Hamel AR, et al. medRxiv 2023 (https://www.medrxiv.org/content/10.1101/2022.05.14.22275022v2), accepted in principle at Nature Communications 2023.

This pipeline runs the TwoSampleMR and MendelianRandomization packages in R (version 4.1.2). Variant-specific MR estimates are calculated as Wald ratios. Where multiple variants constitute the instrument for a candidate gene, the inverse-variance weighted (IVW) method is used as the primary method for pooling the variant-specific estimates. For sensitivity analysis, the pipeline also runs the simple-median, weighted-median, MR-Egger, and MR-PRESSO methods. Horizontal pleiotropy is tested with the Egger-intercept test and the MR-PRESSO global heterogeneity test in cases with 3 or more instrumental variants; P<0.05 is taken to indicate the presence of horizontal pleiotropy. To correct for multiple hypothesis testing, Bonferroni correction or a Benjamini-Hochberg (BH) FDR<0.05 threshold can be applied to the primary IVW/Wald ratio test to identify statistically significant MR results.
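
As a rough illustration of the estimators named above, the base-R sketch below computes a Wald ratio per variant, pools the ratios with a fixed-effect IVW estimator, and applies Bonferroni and BH correction with `p.adjust`. The function names (`wald_ratio`, `ivw_fixed`) and all input numbers are made up for illustration. The pipeline itself relies on the TwoSampleMR and MendelianRandomization packages, whose implementations may differ, for example in how standard errors are computed.

```r
# Wald ratio for each variant: outcome effect divided by exposure effect.
# The standard error uses a first-order approximation that ignores
# uncertainty in the exposure effect (an assumption of this sketch).
wald_ratio <- function(beta_exp, beta_out, se_out) {
  est <- beta_out / beta_exp
  se  <- se_out / abs(beta_exp)
  p   <- 2 * pnorm(-abs(est / se))
  data.frame(estimate = est, se = se, p = p)
}

# Fixed-effect IVW: weighted mean of the Wald ratios with weights 1 / se^2.
ivw_fixed <- function(beta_exp, beta_out, se_out) {
  wr  <- wald_ratio(beta_exp, beta_out, se_out)
  w   <- 1 / wr$se^2
  est <- sum(w * wr$estimate) / sum(w)
  se  <- sqrt(1 / sum(w))
  data.frame(estimate = est, se = se, p = 2 * pnorm(-abs(est / se)))
}

# Hypothetical summary statistics for three independent variants.
beta_exp <- c(0.50, 0.40, 0.30)
beta_out <- c(0.10, 0.09, 0.05)
se_out   <- c(0.02, 0.03, 0.02)

per_variant <- wald_ratio(beta_exp, beta_out, se_out)
pooled      <- ivw_fixed(beta_exp, beta_out, se_out)

# Multiple-testing correction across a set of hypothetical primary p-values.
primary_p  <- c(0.0004, 0.012, 0.03, 0.20)
bonferroni <- p.adjust(primary_p, method = "bonferroni")
bh_fdr     <- p.adjust(primary_p, method = "BH")
```

With a single variant, `ivw_fixed` reduces to the Wald ratio, which matches the way the primary test is described as "IVW/Wald ratio".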

Authors:
Puja Mehta[1,2,3], Skanda Rajasundaram[1,2,3,4]
Segre lab, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA

Affiliations:
[1] Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
[2] Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
[3] Broad Institute of Harvard and MIT, Cambridge, MA, USA
[4] Centre for Evidence-Based Medicine, University of Oxford, Oxford, UK; Faculty of Medicine, Imperial College London, London, UK

For questions or comments regarding this tool, please contact Puja Mehta at pamehta [at] meei [dot] harvard [dot] edu, and Ayellet Segre at ayellet_segre [at] meei [dot] harvard [dot] edu.

Date: November 16, 2023

Repository structure

src: contains the scripts for the software pipeline and for generating results
data: contains the input files required to run TwoSampleMR and MendelianRandomization for GTEx v8 QTLs. These files must be downloaded by the user. Each type of molecular QTL has its own sub-directory (e.g. GTEx_v8_eQTL, GTEx_v8_sQTL)
tmp_data: contains the temporary files generated by the pipeline
results: contains the results files
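
The layout above could be created from R along these lines. This is only a sketch: `project_root` is a placeholder (a temporary directory here so the snippet can be run safely), and the two QTL sub-directory names are the examples given in the list above.

```r
# Placeholder root; point this at your own working directory.
project_root <- file.path(tempdir(), "TwoSampleMR_pipeline")

# Four top-level directories, plus one data sub-directory per QTL type.
dirs <- file.path(
  project_root,
  c("src", "data/GTEx_v8_eQTL", "data/GTEx_v8_sQTL", "tmp_data", "results")
)

for (d in dirs) {
  # recursive = TRUE also creates project_root and data/ as needed.
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}
```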

Dependencies

This pipeline was written in R (version 3.5 or later) and requires the following libraries:

library("data.table")
library("dplyr")
library("tidyr")
library("foreign")
library("tibble")
library("metafor")
library("meta")
library("survival")
library("ggplot2")
library("plyr")
library("gridExtra")
library("gtable")
library("grid")
library("tidyverse")
library("stringr")
library("coloc")
library("devtools")
library("glmnet")
library("MendelianRandomization")
library("TwoSampleMR")
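
Before launching the pipeline, it can help to confirm that every package in this list is available. The check below is a minimal sketch using base R only; it reports missing packages and does not install anything, since the installation source (CRAN or a package's own repository) varies by package.

```r
# Package names copied from the dependency list above.
required_pkgs <- c(
  "data.table", "dplyr", "tidyr", "foreign", "tibble", "metafor", "meta",
  "survival", "ggplot2", "plyr", "gridExtra", "gtable", "grid", "tidyverse",
  "stringr", "coloc", "devtools", "glmnet", "MendelianRandomization",
  "TwoSampleMR"
)

# requireNamespace() returns TRUE or FALSE without attaching the package.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```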

Step by step description for running our TwoSampleMR pipeline

This guide covers preprocessing the input files, running our TwoSampleMR pipeline, and generating results. The instructions assume GTEx v8 data as input, but they can be adapted to QTL datasets from other sources.

  1. Create the repository structure
  2. Format the GWAS summary statistics file. Required columns: chr, pos, SNP, effect_allele, Other_allele, effect, StdErr, gwas_p_value
  3. Download and format the molecular QTL files. Note: the current pipeline is built to work with the GTEx v8 expression and splicing QTL output format (GTEx downloads: https://www.gtexportal.org/home/downloads/adult-gtex#qtl). Required columns: variant_id, gene_id, slope, slope_se, pval_nominal
  4. Generate a space-separated manifest file with the trait name, gene ID, gene symbol, tissue, QTL type, p-value cutoff, file name of the trait, path to the QTL file, and QTL file extension. An example manifest file can be found in: manifest.sh
  5. Run TwoSampleMR (wrap_Manifest.R). An example shell script that runs the manifest file and launches the MR jobs: wrap_manifest.sh. One output results file is written per GWAS/trait, gene, QTL type, and tissue combination; it contains the results of all two-sample MR tests and horizontal pleiotropy tests.
  6. Concatenate all results files across all GWAS/trait, gene, QTL type, tissue combinations into a single file/table (concatenate_results.R).
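
Steps 2, 3 and 6 can be sketched in base R as follows. This is an illustration only, not the code in `src`: the helper names (`check_columns`, `concatenate_results`), the toy tables, and the assumption that results files are tab-delimited `.txt` files with identical columns are all hypothetical. The repository's own concatenate_results.R should be preferred for real runs.

```r
# Required columns as listed in steps 2 and 3.
gwas_required <- c("chr", "pos", "SNP", "effect_allele", "Other_allele",
                   "effect", "StdErr", "gwas_p_value")
qtl_required  <- c("variant_id", "gene_id", "slope", "slope_se",
                   "pval_nominal")

# Stop with a readable message if any required column is absent.
check_columns <- function(df, required, label) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop(sprintf("%s file is missing columns: %s", label,
                 paste(missing_cols, collapse = ", ")))
  }
  invisible(TRUE)
}

# Toy one-row GWAS table (made-up values) that passes the check.
gwas <- data.frame(chr = 1, pos = 1000, SNP = "rs_example",
                   effect_allele = "A", Other_allele = "G",
                   effect = 0.05, StdErr = 0.01, gwas_p_value = 5e-7,
                   stringsAsFactors = FALSE)
check_columns(gwas, gwas_required, "GWAS")

# Step 6 sketch: stack all per-combination results files into one table.
# Assumes tab-delimited files that share the same columns.
concatenate_results <- function(results_dir, pattern = "\\.txt$") {
  files  <- list.files(results_dir, pattern = pattern, full.names = TRUE)
  tables <- lapply(files, read.delim, stringsAsFactors = FALSE)
  do.call(rbind, tables)
}

# Demonstration with two tiny made-up results files in a temporary directory.
demo_dir <- file.path(tempdir(), "mr_results_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (g in c("GENE_A", "GENE_B")) {
  write.table(data.frame(gene = g, method = "IVW", b = 0.2),
              file.path(demo_dir, paste0(g, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}
combined <- concatenate_results(demo_dir)
```

Failing early on a missing column is usually easier to debug than an error raised deep inside an MR job.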

License

Our code is distributed under the terms of the BSD 3-Clause License. See LICENSE.txt file for more details.

Citation

  1. Hamel et al., "Integrating genetic regulation and single-cell expression with GWAS prioritizes causal genes and cell types for glaucoma", medRxiv 2023 (https://www.medrxiv.org/content/10.1101/2022.05.14.22275022v2). Accepted in principle at Nature Communications 2023.
  2. GTEx Consortium, "The GTEx Consortium atlas of genetic regulatory effects across human tissues", Science 369, 1318–1330 (2020).
  3. Hemani et al., "The MR-Base platform supports systematic causal inference across the human phenome", eLife 2018 (https://elifesciences.org/articles/34408).
  4. Hemani et al., "Orienting the causal relationship between imprecisely measured traits using GWAS summary data", PLOS Genetics 2017 (https://doi.org/10.1371/journal.pgen.1007081).
  5. Yavorska and Burgess, "MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data", Int. J. Epidemiol. 46, 1734–1739 (2017).
  6. Bowden et al., "Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression", Int. J. Epidemiol. 44, 512–525 (2015).
  7. Verbanck et al., "Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases", Nat. Genet. 50, 693–698 (2018).

Last updated: November 16, 2023
