Calculate reference distribution using parametric bootstrap

Calculate reference distribution of likelihood ratio statistic in mixed effects models using parametric bootstrap

PBrefdist(
  largeModel,
  smallModel,
  nsim = 1000,
  seed = NULL,
  cl = NULL,
  details = 0
)

# S3 method for lm
PBrefdist(
  largeModel,
  smallModel,
  nsim = 1000,
  seed = NULL,
  cl = NULL,
  details = 0
)

# S3 method for merMod
PBrefdist(
  largeModel,
  smallModel,
  nsim = 1000,
  seed = NULL,
  cl = NULL,
  details = 0
)

Arguments

largeModel: A linear mixed effects model as fitted with the lmer() function in the lme4 package. This model muse be larger than smallModel (see below).
smallModel: A linear mixed effects model as fitted with the lmer() function in the lme4 package. This model muse be smaller than largeModel (see above).
nsim: The number of simulations to form the reference distribution.
seed: Seed for the random number generation.
cl: Used for controlling parallel computations. See sections 'details' and 'examples' below.
details: The amount of output produced. Mainly relevant for debugging purposes.

Value

A numeric vector

Details

The model object must be fitted with maximum likelihood (i.e. with REML=FALSE). If the object is fitted with restricted maximum likelihood (i.e. with REML=TRUE) then the model is refitted with REML=FALSE before the p-values are calculated. Put differently, the user needs not worry about this issue.

The argument 'cl' (originally short for 'cluster') is used for
controlling parallel computations. 'cl' can be NULL (default),
positive integer or a list of clusters.

Special care must be taken on Windows platforms (described below) but the general picture is this:

The recommended way of controlling cl is to specify the
component \code{pbcl} in options() with
e.g. \code{options("pbcl"=4)}.

If cl is NULL, the function will look at if the pbcl has been set
in the options list with \code{getOption("pbcl")}

If cl=N then N cores will be used in the computations. If cl is
NULL then the function will look for

References

Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/

Author

Søren Højsgaard sorenh@math.aau.dk

Examples


data(beets)
head(beets)
#>   harvest  block  sow yield sugpct
#> 1   harv1 block1 sow3 128.0   17.1
#> 2   harv1 block1 sow4 118.0   16.9
#> 3   harv1 block1 sow5  95.0   16.6
#> 4   harv1 block1 sow2 131.0   17.0
#> 5   harv1 block1 sow1 136.5   17.0
#> 6   harv2 block2 sow3 136.5   17.0
beet0 <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets, REML=FALSE)
beet_no.harv <- update(beet0, . ~ . -harvest)
rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1)
rd
#>  [1] 3.120726e-01 1.530013e+00 7.464388e+00 2.615508e-02 2.557797e-01
#>  [6] 5.717452e-01 1.258622e+00 1.429834e-03 8.870589e-01 2.728823e-02
#> [11] 7.467763e-02 1.621034e-01 2.606911e+00 5.525129e+00 1.010020e-01
#> [16] 7.143766e-01 3.343534e-02 3.257146e+00 4.674691e+00 2.406896e-07
#> attr(,"cl")
#> [1] 1
#> attr(,"ctime")
#> elapsed 
#>   0.624 
#> attr(,"stat")
#>         tobs           df      p.value 
#> 1.291424e+01 1.000000e+00 3.260911e-04 
#> attr(,"samples")
#>        nsim        npos   n.extreme         pPB 
#> 20.00000000 20.00000000  0.00000000  0.04761905 
#> attr(,"class")
#> [1] "refdist"
if (FALSE) {
## Note: Many more simulations must be made in practice.

# Computations can be made in parallel using several processors:

# 1: On OSs that fork processes (that is, not on windows):
# --------------------------------------------------------

if (Sys.info()["sysname"] != "Windows"){
  N <- 2 ## Or N <- parallel::detectCores()

# N cores used in all calls to function in a session
  options("mc.cores"=N)
  rd <- PBrefdist(beet0, beet_no.harv, nsim=20)

# N cores used just in one specific call (when cl is set,
# options("mc.cores") is ignored):
  rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=N)
}

# In fact, on Windows, the approach above also work but only when setting the
# number of cores to 1 (so there is to parallel computing)

# In all calls:
# options("mc.cores"=1)
# rd <- PBrefdist(beet0, beet_no.harv, nsim=20)
# Just once
# rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1)

# 2. On all platforms (also on Windows) one can do
# ------------------------------------------------
library(parallel)
N <- 2 ## Or N  <- detectCores()
clus <- makeCluster(rep("localhost", N))

# In all calls in a session
options("pb.cl"=clus)
rd <- PBrefdist(beet0, beet_no.harv, nsim=20)

# Just once:
rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=clus)
stopCluster(clus)
}