--- title: "lit Package Vignette" author: "Andrew J. Bass and Michael P. Epstein" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{LIT: Latent Interaction Testing} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Overview The `lit` package implements a flexible kernel-based multivariate testing procedure, called Latent Interaction Testing (LIT), to detect latent genetic interactions in a genome-wide association study. In a standard GWAS analysis, one typically attempts to determine which SNPs are associated with one (or many) traits. Another important question is - Do any SNPs demonstrate any interactive effects, e.g., gene-by-gene or gene-by-environment interactions? This question has been very difficult to answer because effect sizes of interactions are likely small, interactive variables are unknown, and there's often a large multiple testing burden from testing many candidate interactive variables. One way to help address some of these issues is to use a variance-based testing procedure which does not require the interactive variable(s) to be specified or observed. These procedures can detect any unequal residual trait variation among genotype categories at a specific SNP (i.e., heteroskedasticity), which could suggest an unobserved (or latent) genetic interaction. However, researchers apply such procedures on a trait-by-trait basis and ignore any biological pleiotropy among traits. In fact, it is simple to show that a latent genetic interaction not only induces a variance effect but also a covariance effect between traits, and these covariance patterns can be harnessed to improve the statistical power. The `lit` package addresses this gap by leveraging both the differential variance _and_ differential covariance patterns to substantially increase power to detect latent genetic interactions in a GWAS. In particular, LIT assesses whether the trait variances/covariances vary as a function of genotype using a kernel-based distance covariance (KDC) framework. LIT often provides substantial increases in power compared to trait-by-trait univariate approaches, in part because LIT uses shared information (i.e., pleiotropy) across tests and does not require a multiple testing correction which negatively impacts power. Note that this package contains the core functionality for the methods described in > Bass AJ, Bian S, Wingo AP, Wingo TS, Culter DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. *Submitted*; 2023. Additional software features will be added in the future. ## Installation ```{r, eval = FALSE} # install development version of package install.packages("devtools") library("devtools") devtools::install_github("ajbass/lit") ``` ## Quick start We provide two ways to use the `lit` package. For small GWAS datasets where the genotypes can be loaded in R, the `lit()` function can be used: ```{r} library(lit) # set seed set.seed(123) # generate SNPs and traits X <- matrix(rbinom(10 * 10, size = 2, prob = 0.25), ncol = 10) Y <- matrix(rnorm(10 * 4), ncol = 4) # test for latent genetic interactions out <- lit(Y, X) head(out) ``` The output is a data frame of $p$-values where the rows are SNPs and the columns are different implementations of LIT to test for latent genetic interactions: the first column (`wlit`) uses a linear kernel, the second column (`ulit`) uses a projection kernel, and the third column (`alit`) maximizes the number of discoveries by combining the $p$-values of the linear and projection kernels. For large GWAS datasets (e.g., biobank-sized), the `lit()` function is not computationally feasible. Instead, the `lit_plink()` function can be applied directly to plink files. To demonstrate how to use the function, we use the example plink files from the `genio` package: ```{r} # load genio package library(genio) # path to plink files file <- system.file("extdata", 'sample.bed', package = "genio", mustWork = TRUE) # generate trait expression Y <- matrix(rnorm(10 * 4), ncol = 4) # apply lit to plink file out <- lit_plink(Y, file = file, verbose = FALSE) head(out) ``` See `?lit` and `?lit_plink` for additional details and input arguments. Note that a marginal testing procedure for latent genetic interactions based on the squared residuals and cross products (Marginal (SQ/CP)) can also be implemented using the `marginal` and `marginal_plink` functions: ```{r, eval = FALSE} # apply Marginal (SQ/CP) to loaded genotypes out <- marginal(Y, X) # apply Marginal (SQ/CP) to plink file out <- marginal_plink(Y, file = file, verbose = FALSE) ```