The RUV-rinv algorithm. Estimates and adjusts for unwanted variation using negative controls.
RUVrinv(Y, X, ctl, Z = 1, eta = NULL, fullW0 = NULL, invsvd = NULL, lambda = NULL, k = NULL, l = NULL, randomization = FALSE, iterN = 1e+05, inputcheck = TRUE)
Y |
The data. An m by n matrix, where m is the number of samples and n is the number of features. |
X |
The factor(s) of interest. An m by p matrix, where m is the number of samples and p is the number of factors of interest. Very often p = 1. |
ctl |
The negative controls. A logical vector of length n. |
Z |
Any additional covariates to include in the model. Either an m by q matrix of covariates, or simply 1 (the default) for an intercept term. |
eta |
Gene-wise (as opposed to sample-wise) covariates. These covariates are adjusted for by RUV-1 before any further analysis proceeds. A matrix with n columns. |
fullW0 |
Can be included to speed up execution. |
invsvd |
Can be included to speed up execution. Generally used when calling RUV(r)inv many times with different values of lambda. |
lambda |
Ridge parameter. If unspecified, an appropriate default will be used. |
k |
When calculating the default value of lambda, a call to RUV4 is made. This parameter specifies the value of k to use in that call. If unspecified, an appropriate default k will be used. |
l |
If lambda and k are both NULL, then k must be estimated using the getK routine. The getK routine only accepts a single-column X. If p > 1, l specifies which column of X should be used in the getK routine. |
randomization |
Whether the inverse-method variances should be computed using randomly generated factors of interest (as opposed to a numerical integral). |
iterN |
The number of random "factors of interest" to generate (used only when randomization=TRUE). |
inputcheck |
Perform a basic sanity check on the inputs, and issue a warning if there is a problem. |
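The shapes described in the arguments above can be checked before calling the function. The following is a minimal sketch in base R; `check_ruv_inputs` is a hypothetical helper written for illustration, not the check that `inputcheck = TRUE` performs internally.

```r
## Hypothetical helper (not part of the package): returns a character
## vector describing any mismatch with the documented argument shapes.
check_ruv_inputs = function(Y, X, ctl) {
  problems = character(0)
  if (!is.matrix(Y)) problems = c(problems, "Y should be an m by n matrix")
  if (!is.matrix(X)) problems = c(problems, "X should be an m by p matrix")
  if (is.matrix(Y) && is.matrix(X) && nrow(X) != nrow(Y))
    problems = c(problems, "X and Y should have the same number of rows (m)")
  if (!is.logical(ctl)) problems = c(problems, "ctl should be a logical vector")
  if (is.matrix(Y) && length(ctl) != ncol(Y))
    problems = c(problems, "ctl should have length n = ncol(Y)")
  problems
}

## Tiny example: 6 samples, 10 features, first 3 features are controls
m = 6
n = 10
Y = matrix(rnorm(m * n), m, n)
X = matrix(rep(c(0, 1), each = m / 2), m, 1)
ctl = rep(FALSE, n)
ctl[1:3] = TRUE

ok = check_ruv_inputs(Y, X, ctl)                          # well-formed inputs
bad = check_ruv_inputs(Y, X[1:4, , drop = FALSE], ctl[1:5]) # wrong m, wrong n
```

Here `ok` is empty, while `bad` reports the row mismatch between X and Y and the wrong length of ctl.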
Implements the RUV-rinv algorithm as described in Gagnon-Bartsch, Jacob, and Speed (2013). This function is essentially just a wrapper to RUVinv, but with a little extra code to calculate the default value of lambda.
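Because the default lambda is only computed when none is supplied, one way to explore its influence is to pass several values explicitly. The sketch below does this on a small simulated data set. The data sizes and the lambda grid are illustrative assumptions, not recommendations, and the call is guarded so that it only runs if the ruv package is installed. The arguments section mentions invsvd for repeated calls of this kind; since this page does not say where that object comes from, the sketch simply recomputes everything on each call.

```r
## Small simulated data set (sizes chosen only to keep the sketch fast)
set.seed(1)
m = 20
n = 500
nc = 100
p = 1
k = 3
ctl = rep(FALSE, n)
ctl[1:nc] = TRUE
X = matrix(c(rep(0, floor(m/2)), rep(1, ceiling(m/2))), m, p)
beta = matrix(rnorm(p*n), p, n)
beta[, ctl] = 0
W = matrix(rnorm(m*k), m, k)
alpha = matrix(rnorm(k*n), k, n)
Y = X %*% beta + W %*% alpha + matrix(rnorm(m*n), m, n)

## Illustrative ridge values (an assumption made for this sketch)
lambdas = c(0.1, 1, 10)

if (requireNamespace("ruv", quietly = TRUE)) {
  ## One fit per ridge value, using only arguments documented above
  fits = lapply(lambdas, function(lam) ruv::RUVrinv(Y, X, ctl, lambda = lam))
  ## Count features with p < 0.05 under each lambda
  n_sig = sapply(fits, function(f) sum(f$p < 0.05))
  print(data.frame(lambda = lambdas, n_sig = n_sig))
}
```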
A list containing
betahat |
The estimated coefficients of the factor(s) of interest. A p by n matrix. |
sigma2 |
Estimates of the features' variances. A vector of length n. |
t |
t statistics for the factor(s) of interest. A p by n matrix. |
p |
P-values for the factor(s) of interest. A p by n matrix. |
multiplier |
The constant by which |
df |
The number of residual degrees of freedom. |
W |
The estimated unwanted factors. |
alpha |
The estimated coefficients of W. |
byx |
The coefficients in a regression of Y on X (after both Y and X have been "adjusted" for Z). Useful for projection plots. |
bwx |
The coefficients in a regression of W on X (after X has been "adjusted" for Z). Useful for projection plots. |
X |
|
k |
|
ctl |
|
Z |
|
fullW0 |
Can be used to speed up future calls of RUV4. |
lambda |
|
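The description of byx above can be made concrete with a short base R sketch: adjust both Y and X for Z, then regress the adjusted Y on the adjusted X. This illustrates the definition only and is not the package's internal code; `resid_on_Z` and `byx_sketch` are names introduced here, and Z is taken to be an intercept column (the `Z = 1` default).

```r
set.seed(1)
m = 20
n = 5
p = 1
X = matrix(rep(c(0, 1), each = m / 2), m, p)
Y = matrix(rnorm(m * n), m, n)
Z = matrix(1, m, 1)  # intercept only, corresponding to Z = 1

## Hypothetical helper: residuals of each column of A after regression on Z
resid_on_Z = function(A, Z) A - Z %*% solve(crossprod(Z), crossprod(Z, A))

Xr = resid_on_Z(X, Z)
Yr = resid_on_Z(Y, Z)

## Coefficients of adjusted Y on adjusted X: a p by n matrix
byx_sketch = solve(crossprod(Xr), crossprod(Xr, Yr))
```

With an intercept-only Z this agrees with the slope row of `coef(lm(Y ~ X))`, which is a convenient cross-check.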
## Create some simulated data
m = 50
n = 10000
nc = 1000
p = 1
k = 20
ctl = rep(FALSE, n)
ctl[1:nc] = TRUE
X = matrix(c(rep(0,floor(m/2)), rep(1,ceiling(m/2))), m, p)
beta = matrix(rnorm(p*n), p, n)
beta[,ctl] = 0
W = matrix(rnorm(m*k),m,k)
alpha = matrix(rnorm(k*n),k,n)
epsilon = matrix(rnorm(m*n),m,n)
Y = X%*%beta + W%*%alpha + epsilon

## Run RUV-rinv
fit = RUVrinv(Y, X, ctl)

## Get adjusted variances and p-values
fit = variance_adjust(fit)