Title: | Surrogate Functional False Discovery Rates for Genome-Wide Association Studies |
---|---|
Description: | Pleiotropy-informed significance analysis of genome-wide association studies with surrogate functional false discovery rates (sfFDR). The sfFDR framework adapts the functional FDR (fFDR) to leverage informative data from multiple sets of GWAS summary statistics, increasing power in the study of interest while accounting for linkage disequilibrium. sfFDR provides estimates of key FDR quantities in a significance analysis, such as the functional local FDR and $q$-value, and uses these estimates to derive a functional $p$-value for type I error rate control and a functional local Bayes' factor for post-GWAS analyses (e.g., fine mapping and colocalization). |
Authors: | Andrew Bass [aut, cre], Chris Wallace [aut] |
Maintainer: | Andrew Bass <[email protected]> |
License: | LGPL |
Version: | 1.0.0 |
Built: | 2025-01-10 06:19:49 UTC |
Source: | https://github.com/ajbass/sffdr |
The summary-level data are a subset of independent SNPs from the UK Biobank, for which we performed GWAS of body mass index (BMI), body fat percentage (BFP), cholesterol, and triglycerides. Note that BFP, cholesterol, and triglycerides are conditioning traits, and their summary statistics were calculated using a separate set of individuals from the one used for BMI. See the manuscript for details.
data(bmi)
A list called sumstats containing:
bmi |
Vector of 10,000 p-values for BMI. |
bfp |
Vector of 10,000 p-values for BFP. |
cho |
Vector of 10,000 p-values for cholesterol. |
tri |
Vector of 10,000 p-values for triglycerides. |
# import data
data(bmi)

# separate main p-values and conditioning p-values
p <- sumstats$bmi
z <- as.matrix(sumstats[, -1])

# apply pi0_model to create model
knots <- c(0.005, 0.01, 0.025, 0.05, 0.1)
fmod <- pi0_model(z, knots = knots)

# estimate functional pi0
fpi0_out <- fpi0est(p, z = fmod$zt, pi0_model = fmod$fmod)
fpi0 <- fpi0_out$fpi0

# apply sffdr
# Note: all tests are independent; see the 'indep_snps' argument
# The data has very small p-values, so set epsilon to min of p
sffdr_out <- sffdr(p, fpi0, epsilon = min(p))

# Plot significance results
plot(sffdr_out, rng = c(0, 5e-4))

# Functional P-values, Q-values, and local FDR
fp <- sffdr_out$fpvalues
fq <- sffdr_out$fqvalues
flfdr <- sffdr_out$flfdr
Perform functional fine mapping with a set of functional local FDRs in a region of interest (assuming a single causal variant).
ffinemap(flfdr, fpi0)
flfdr |
A vector of functional local FDRs of a region of interest |
fpi0 |
An estimate of the functional proportion of null tests |
A list of object type "sffdr" containing:
BF |
The functional local Bayes' factors. |
PP |
Posterior probability of a SNP being causal. |
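The relationship between a local FDR, a Bayes' factor, and a single-causal-variant posterior probability can be sketched in base R. This is an illustration only, not the source of ffinemap: the helper name local_bf_sketch is hypothetical, and it assumes that the Bayes' factor is the posterior odds of being non-null divided by the prior odds, and that every SNP in the region has equal prior weight.

```r
# Sketch only: hypothetical helper, not the package's ffinemap() source.
local_bf_sketch <- function(flfdr, fpi0) {
  # Posterior odds of being non-null, implied by the local FDR
  post_odds <- (1 - flfdr) / flfdr
  # Prior odds of being non-null, implied by the proportion of nulls
  prior_odds <- (1 - fpi0) / fpi0
  bf <- post_odds / prior_odds
  # Single causal variant with equal prior weight per SNP (assumption):
  # posterior probabilities are the normalised Bayes' factors
  pp <- bf / sum(bf)
  list(BF = bf, PP = pp)
}

# Toy region of four SNPs
flfdr <- c(0.90, 0.50, 0.05, 0.80)
fpi0  <- c(0.95, 0.95, 0.90, 0.95)
out <- local_bf_sketch(flfdr, fpi0)
```

Under these assumptions the SNP with the smallest local FDR receives most of the posterior mass, and the posterior probabilities sum to one across the region.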
The function fpi0est estimates the functional proportion of null tests given a set of informative variables.
fpi0est(
  p,
  z,
  pi0_model,
  indep_snps = NULL,
  lambda = seq(0.05, 0.9, 0.05),
  method = "gam",
  maxit = 1000,
  pi0.method.control = NULL,
  ...
)
p |
A vector of p-values. |
z |
A vector of informative variables |
pi0_model |
Model formula corresponding to |
indep_snps |
A boolean vector (same size as p) specifying the set of independent tests. Default is NULL and all tests are treated independently. |
lambda |
A vector of tuning values in [0, 1] at which the functional proportion of truly null tests is estimated. |
method |
Either the "gam" (generalized additive model) or "glm" (generalized linear models) approach. Default is "gam". |
maxit |
The maximum number of iterations for "glm" approach. Default is 1000. |
pi0.method.control |
A user specified set of parameters for convergence for either "gam" or "glm". Default is NULL. See |
... |
This code extends the function from the fFDR package to handle multiple informative variables and linkage disequilibrium.
A list of object type "fpi0" containing:
fpi0 |
A table containing the functional proportion of truly null tests. |
tableLambda |
Functional proportion of null tests at the lambda values |
MISE |
MISE values. |
lambda.hat |
The chosen lambda value. |
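To show what the lambda grid controls, here is a sketch of the classical (non-functional) estimator of the null proportion at a threshold lambda, which counts p-values above lambda and rescales by the width of the interval. It is written in base R on simulated data and is not the fpi0est implementation; the helper pi0_at_lambda is hypothetical. The functional version additionally lets this quantity depend on z, which is not reproduced here.

```r
set.seed(1)
m0 <- 8000  # simulated null tests (uniform p-values)
m1 <- 2000  # simulated non-null tests (p-values concentrated near 0)
p <- c(runif(m0), rbeta(m1, 0.2, 5))

# Hypothetical helper: proportion of p-values above each lambda,
# rescaled by the width of the interval (lambda, 1]
pi0_at_lambda <- function(p, lambda) {
  sapply(lambda, function(l) mean(p > l) / (1 - l))
}

lambda <- seq(0.05, 0.9, 0.05)
pi0_grid <- pi0_at_lambda(p, lambda)
```

Small values of lambda tend to give a biased (too large) estimate because non-null p-values still fall above the threshold, while large values give a noisier estimate because few p-values remain. Choosing lambda is a trade-off between the two.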
Andrew J. Bass, David G. Robinson (author of original function)
# import data
data(bmi)

# separate main p-values and conditioning p-values
p <- sumstats$bmi
z <- as.matrix(sumstats[, -1])

# apply pi0_model to create model
knots <- c(0.005, 0.01, 0.025, 0.05, 0.1)
fmod <- pi0_model(z, knots = knots)

# Estimate functional pi0
fpi0_out <- fpi0est(p, z = fmod$zt, pi0_model = fmod$fmod)
fpi0 <- fpi0_out$fpi0

# See relationship of BFP/cholesterol/triglycerides and fpi0
plot(fmod$zt$bfp, fpi0)
plot(fmod$zt$cholesterol, fpi0)
plot(fmod$zt$triglycerides, fpi0)
Calculate functional p-values from functional local FDRs. Internal use.
fpvalues(lfdr, p = NULL)
lfdr |
A vector of functional local FDRs of a region |
p |
A vector of p-values. Default is NULL. |
A list of object type "sffdr" containing:
fp |
Functional p-values. |
fq |
Functional q-values. |
This function is adapted from the fFDR package.
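One common way to turn local FDRs into q-values is to sort the local FDRs and take their running mean, so that each q-value estimates the FDR of the rejection set ending at that test. The base R sketch below illustrates that idea only. The helper qvalue_from_lfdr is hypothetical, and the mapping from functional q-values to functional p-values used by the package is not reproduced here.

```r
# Hypothetical helper: q-values as running means of the sorted local FDRs
qvalue_from_lfdr <- function(lfdr) {
  o <- order(lfdr)
  # Mean local FDR among the k most significant tests, for k = 1, ..., m
  q_sorted <- cumsum(lfdr[o]) / seq_along(o)
  # Return the values in the original order of the tests
  q <- numeric(length(lfdr))
  q[o] <- q_sorted
  q
}

lfdr <- c(0.50, 0.01, 0.20, 0.05)
q <- qvalue_from_lfdr(lfdr)
```

Because the sorted local FDRs are non-decreasing, their running mean is also non-decreasing, so the resulting q-values preserve the ranking of the tests.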
kernelEstimator(
  x,
  transformation = "probit",
  eval.points = x,
  subsample = 1e+07,
  epsilon = 1e-15,
  epsilon.max = 0.999,
  maxk = 10000,
  trim = 1e-15,
  nn = NULL,
  ...
)
x |
Either a vector or a 2-column matrix |
transformation |
Either probit (default), complementary log-log, or identity (not recommended) |
eval.points |
Points at which to evaluate the estimate, default x |
subsample |
Number of points that are randomly subsampled for computing the fit; useful for computational efficiency and for ensuring the density estimation does not run out of memory. NULL means no subsampling, so the fit is performed on all points. |
epsilon |
How close values are allowed to come to 0 |
epsilon.max |
How close values are allowed to come to 1 |
maxk |
maxk argument passed to locfit |
trim |
In one-dimensional fitting, the very edges often have high variance. This parameter fixes the estimate on the intervals (0, trim) and (1 - trim, 1). |
nn |
nearest neighbor parameter |
... |
additional arguments to be passed to lp in locfit, used only if cv=FALSE |
Provide density estimates that are needed by sffdr
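The role of the probit transformation and of epsilon can be illustrated with a one-dimensional sketch: values in (0, 1) are mapped to the real line with qnorm, a density is estimated there, and the estimate is mapped back with the change-of-variables formula. The sketch swaps in base R's density() for the locfit fit that kernelEstimator uses, so it is not the package's estimator. The helper probit_density and the toy data are assumptions.

```r
set.seed(2)
p <- rbeta(5000, 0.5, 2)            # toy values in (0, 1), denser near 0
eps <- 1e-15
p <- pmin(pmax(p, eps), 1 - eps)    # keep values away from 0 and 1 before qnorm()

# Hypothetical helper: density on (0, 1) estimated on the probit scale
probit_density <- function(p, eval.points = p) {
  x <- qnorm(p)                     # map (0, 1) to the real line
  fit <- density(x)                 # base R Gaussian kernel estimate (stand-in for locfit)
  x_eval <- qnorm(eval.points)
  fx <- approx(fit$x, fit$y, xout = x_eval, rule = 2)$y
  # Change of variables: f_p(p) = f_x(qnorm(p)) / dnorm(qnorm(p))
  fx / dnorm(x_eval)
}

grid <- c(0.01, 0.1, 0.5, 0.9)
fhat <- probit_density(p, grid)
```

Truncating at epsilon matters because qnorm(0) is -Inf; the truncation value sets how far into the lower tail the transformed data can extend.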
pi0_model helps generate the model for the proportion of truly null tests. For more details, refer to the vignette.
pi0_model(z, indep_snps = NULL, basis.df = 3, knots = NULL)
z |
|
indep_snps |
|
basis.df |
|
knots |
|
We note that this function is specifically designed for informative p-values and other complex models should be created outside this function.
A list with the following entries:
fmod: model formula
zt: matrix of rank-transformed informative variables
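The two returned pieces, a rank-transformed matrix and a model formula, can be mimicked in base R to show how the knots argument might enter. This sketch assumes a natural cubic spline term per informative variable with knots on the rank scale; that is an assumption about the model form, not a copy of pi0_model, and the trait names are made up.

```r
library(splines)  # ships with R; provides ns()

set.seed(3)
# Toy informative p-values for two hypothetical conditioning traits
z <- cbind(trait1 = runif(1000), trait2 = rbeta(1000, 0.5, 1))

# Rank-transform each column to (0, 1]
zt <- as.data.frame(apply(z, 2, function(col) rank(col) / length(col)))

# One natural cubic spline term per trait, with knots at the lower quantiles
knots <- c(0.005, 0.01, 0.025, 0.05, 0.1)
terms <- sprintf("ns(%s, knots = c(%s))",
                 colnames(zt), paste(knots, collapse = ", "))
fmod <- reformulate(terms)   # one-sided formula: ~ ns(trait1, ...) + ns(trait2, ...)
```

Placing the knots at small quantiles concentrates the flexibility of the spline where informative p-values carry the most signal, which matches the advice in the example for this function.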
Andrew Bass
data(bmi)
p <- sumstats$bmi
z <- as.matrix(sumstats[, -1])

# For p-values, you want to specify the lower quantiles
fmod <- pi0_model(z, knots = c(0.005, 0.01, 0.025, 0.05, 0.1))
Graphical display of the sffdr object
## S3 method for class 'sffdr'
plot(x, rng = c(0, 5e-08), ...)
x |
A sffdr object. |
rng |
Significance region to show. Optional. |
... |
Additional arguments. Currently unused. |
Nothing of interest.
Andrew J. Bass
# import data
data(bmi)

# separate main p-values and conditioning p-values
p <- sumstats$bmi
z <- as.matrix(sumstats[, -1])

# apply pi0_model to create model
knots <- c(0.005, 0.01, 0.025, 0.05, 0.1)
fmod <- pi0_model(z, knots = knots)

# estimate functional pi0
fpi0_out <- fpi0est(p, z = fmod$zt, pi0_model = fmod$fmod)
fpi0 <- fpi0_out$fpi0

# apply sffdr
# Note: all tests are independent; see the 'indep_snps' argument
# The data has very small p-values, so set epsilon to min of p
sffdr_out <- sffdr(p, fpi0, epsilon = min(p))

# Plot significance results
plot(sffdr_out, rng = c(0, 5e-4))
Estimate the functional p-values, q-values, and local false discovery rates given a set of p-values and informative variables. The functional p-value is a mapping from the functional q-value (an FDR-based measure) to a p-value for type I error rate control.
sffdr(
  p.value,
  fpi0,
  surrogate = NULL,
  indep_snps = NULL,
  monotone.window = NULL,
  epsilon = 1e-15,
  nn = NULL,
  fp_ties = TRUE,
  ...
)
p.value |
A vector of p-values. |
fpi0 |
An estimate of the functional proportion of null tests using the |
surrogate |
A surrogate variable that compresses more than one informative variable. Default is NULL. If |
indep_snps |
A boolean vector (same size as p) specifying the set of independent tests. Default is NULL and all tests are treated independently. |
monotone.window |
Enforce monotonicity at specified step size. Default is NULL. |
epsilon |
A numerical value specifying the truncation threshold for the p-values during density estimation. Default is 1e-15. You may want to consider decreasing this value if there are a substantial number of small p-values. |
nn |
A numerical value specifying the nearest neighbor parameter in |
fp_ties |
A boolean specifying whether ties should be broken using the ordering of the p-values when calculating the fp-values. Only impacts the tests when the local FDR is tied. Default is TRUE. |
... |
Additional arguments passed to |
The function fpi0est should be called externally to estimate the functional proportion of null tests given the set of informative variables. The surrogate functional FDR methodology builds on the functional FDR methodology and reuses some of the functions from the fFDR package.
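To make the role of the null proportion and the density estimate concrete, the following base R sketch computes an ordinary (non-functional) local FDR as the null proportion divided by the estimated density of the p-values, using the fact that null p-values are uniform. It is a simplified illustration on simulated data: it ignores the informative variables and the surrogate, uses density() rather than the package's kernel estimator, and treats pi0 as known.

```r
set.seed(4)
m0 <- 9000; m1 <- 1000
p <- c(runif(m0), rbeta(m1, 0.2, 5))   # toy mixture: nulls first, then non-nulls
pi0 <- 0.9                             # treated as known here; estimated in practice

# Density of p estimated on the probit scale, then mapped back to (0, 1)
x <- qnorm(pmax(p, 1e-15))             # truncate tiny p-values before transforming
fit <- density(x)
f_hat <- approx(fit$x, fit$y, xout = x, rule = 2)$y / dnorm(x)

# Null p-values are uniform, so the null density is 1 and lfdr = pi0 / f(p)
lfdr <- pmin(1, pi0 / f_hat)
```

In the functional setting the same ratio is evaluated with a null proportion and a density that both vary with the informative variables, which is why fpi0 is passed in as a vector with one entry per test.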
A list of object type "sffdr" containing:
pvalues |
A vector of the original p-values. |
fpvalues |
A vector of the estimated functional p-values. |
fqvalues |
A vector of the estimated functional q-values. |
flfdr |
A vector of the estimated functional local FDR values. |
pi0 |
A vector of the original functional proportion of null tests. |
density |
An object containing the kernel density estimates from |
Andrew J. Bass
# import data
data(bmi)

# separate main p-values and conditioning p-values
p <- sumstats$bmi
z <- as.matrix(sumstats[, -1])

# apply pi0_model to create model
knots <- c(0.005, 0.01, 0.025, 0.05, 0.1)
fmod <- pi0_model(z, knots = knots)

# estimate functional pi0
fpi0_out <- fpi0est(p, z = fmod$zt, pi0_model = fmod$fmod)
fpi0 <- fpi0_out$fpi0

# apply sffdr
# Note all tests are independent see 'indep_snps' argument
# The data has very small p-values, set epsilon to min of p
sffdr_out <- sffdr(p, fpi0, epsilon = min(p))

# Plot significance results
plot(sffdr_out, rng = c(0, 5e-4))

# Functional P-values, Q-values, and local FDR
fp <- sffdr_out$fpvalues
fq <- sffdr_out$fqvalues
flfdr <- sffdr_out$flfdr