| Title: | Differential Item Functioning Using Robust Scaling |
|---|---|
| Description: | Provides tools for testing differential item functioning (DIF) and differential test functioning (DTF) in two-group item response theory models. The package estimates robust scaling parameters via iteratively reweighted least squares with Tukey's bisquare loss, and supports Wald-type tests of item-level and test-level differences from robust scaling parameters. Inputs can be supplied directly from model parameter/covariance objects or extracted from fitted 'mirt' and 'lavaan' models. Methods are described in Halpin (2022) <doi:10.48550/arXiv.2207.04598>. |
| Authors: | Peter Halpin [aut, cre], Kyle Nickodem [ctb], James Eagle [ctb] |
| Maintainer: | Peter Halpin <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.0 |
| Built: | 2026-05-28 14:57:20 UTC |
| Source: | https://github.com/peterhalpin/robustdif |
Registers S3 methods at load time:
- print for class "rdif"
- plot for class "rdif"
- summary for class "rdif"
.onLoad(libname, pkgname)
libname |
Character string with the path to the package library. |
pkgname |
Character string with the package name. |
No return value, called for side effects when the package loads.
Computes the scaling function a2/a1 for item slopes (a) in groups g = {1, 2}
a_fun(mle, log = FALSE)
mle |
the output of |
log |
logical: should log(a2/a1) be returned? |
The vector of scaling function values.
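The slope scaling function can be illustrated with a minimal base R sketch. The slope values and the helper name slope_ratio are hypothetical and chosen only for illustration; this is not the package's implementation of a_fun, which takes an mle object instead.

```r
# Hypothetical slope estimates for three items (illustrative values only)
a1 <- c(1.0, 1.2, 0.8)  # group 1 slopes
a2 <- c(1.1, 1.2, 1.6)  # group 2 slopes

# Scaling function for slopes: the ratio a2/a1, optionally on the log scale
slope_ratio <- function(a1, a2, log = FALSE) {
  y <- a2 / a1
  if (log) base::log(y) else y
}

slope_ratio(a1, a2)              # ratio scale: 1 means equal slopes
slope_ratio(a1, a2, log = TRUE)  # log scale: 0 means equal slopes
```

On the ratio scale an item with no slope difference has a value of 1; on the log scale it has a value of 0.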
Computes the bi-square weight w(u) = psi(u)/u. If abs(u) > k, w(u) = 0. Else, w(u) = (1 - (u/k)^2)^2.
bsq_weight(u, k = 1.96)
u |
Can be a single value, vector, or matrix. |
k |
The tuning parameter. Can be a scalar or the same dimension as |
The bi-square weights.
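The weight formula above can be written as a short base R sketch. The function name bisquare_weight is hypothetical; the sketch only mirrors the stated formula and is not taken from the package source.

```r
# Bi-square weight: (1 - (u/k)^2)^2 inside [-k, k], zero outside
bisquare_weight <- function(u, k = 1.96) {
  w <- (1 - (u / k)^2)^2
  w[abs(u) > k] <- 0   # observations beyond the tuning parameter get no weight
  w
}

# Weights shrink smoothly from 1 at u = 0 to 0 at abs(u) = k
bisquare_weight(c(0, 0.98, 1.96, 3))
```

Because the operations are vectorized, u can be a scalar, vector, or matrix, and k can be a scalar or an object of the same dimension as u.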
Computes the scaling function to be used from the item thresholds (d) in groups = {1, 2}. The options are:
type = 1: computes d.fun1 = (d2 - d1)/a1
type = 2: computes d.fun2 = (d2 - d1)/a2
type = 3: computes d.fun3 = (d2 - d1)/sqrt((a1^2 + a2^2)/2)
d_fun(mle, type = 3)
mle |
the output of |
type |
a number in |
Computes (d2 - d1)/a for each threshold (d) of each item in groups g = {1, 2}. The parameter 'a' depends on the value of type: a1 for type = 1, a2 for type = 2, and sqrt((a1^2 + a2^2)/2) for type = 3.
The vector of scaling function values.
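The three threshold scaling functions differ only in their denominator, as the following base R sketch shows. The helper name threshold_scaling and the parameter values are hypothetical; the package's d_fun works on an mle object rather than on separate vectors.

```r
# (d2 - d1) divided by a slope-based denominator chosen by `type`
threshold_scaling <- function(a1, a2, d1, d2, type = 3) {
  denom <- switch(as.character(type),
    "1" = a1,                          # group 1 slope
    "2" = a2,                          # group 2 slope
    "3" = sqrt((a1^2 + a2^2) / 2),     # root mean square of both slopes
    stop("type must be 1, 2, or 3")
  )
  (d2 - d1) / denom
}

# Illustrative values for two items with one threshold each
a1 <- c(1.0, 3.0); a2 <- c(1.0, 4.0)
d1 <- c(0.2, -0.5); d2 <- c(0.7, 0.5)
threshold_scaling(a1, a2, d1, d2, type = 1)
threshold_scaling(a1, a2, d1, d2, type = 3)
```

When the slopes are equal across groups, all three types return the same value.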
A Wald test of the difference between the unweighted mean of y_fun and the robust scaling parameter from rdif. Called internally by rdif.
delta_test(object, theta = NULL, k = NULL, fun = "d_fun3")
object |
either the output of |
theta |
the estimated scaling parameter from |
k |
the tuning parameter from |
fun |
one of |
A data.frame that contains the output of the test.
mod <- rdif(mle = rdif.eg)
delta_test(object = rdif.eg, theta = mod$est, k = mod$k)
delta_test(mod)
A Wald test of DIF on each item. Called internally by rdif.
dif_test(object, theta = NULL, fun = "d_fun3")
object |
either the output of |
theta |
the estimated scaling parameter from |
fun |
one of |
A data.frame whose rows contain the results of the test for each item parameter.
mod <- rdif(mle = rdif.eg)
dif_test(object = rdif.eg, theta = mod$est)
dif_test(mod)
Helper function used to format parameter estimates.
format_pars(pars, names.vec, type)
pars |
numeric vector of item parameter estimates |
names.vec |
character vector of item names |
type |
character; are |
data.frame of item parameter estimates
[robustDIF::get_lavaan_params()], [robustDIF::get_mplus_params()]
Extract item parameter estimates and their covariance matrix from lavaan.
get_lavaan_pars(lavaan.object)
lavaan.object |
an object of class |
A three-element list:
vector of parameter names taking the form "item.parameter"
list (one element per group) of vectors of item parameter estimates
list (one element per group) of covariance matrices of item parameter estimates
Extract item parameter estimates and their covariance matrix from mirt.
get_mirt_pars(mirt.object, cluster = NULL)
mirt.object |
a |
cluster |
optional cluster-ID vector used to compute a cluster-robust covariance matrix using Oakes bread, empirical score outer products aggregated by cluster, and a CR1 finite-sample correction. |
A three-element list:
vector of parameter names taking the form "item.parameter"
list (one element per group) of vectors of item parameter estimates
list (one element per group) of covariance matrices of item parameter estimates
Takes a 1-factor model fit or list of 1-factor model fits from mirt or cfa and formats the item parameter estimates and their covariance matrix for use in other robustDIF functions.
get_model_parms(object, cluster = NULL)
object |
model fit from a multigroup analysis or list of model fits for each group for a 1-factor model. See Details. |
cluster |
optional clustering IDs used to compute a cluster-robust covariance matrix for mirt fits.
For |
The function takes a fitted 1-factor multigroup model or a list of fitted 1-factor single-group models. The factor must be standardized (i.e., variance = 1) and the covariance matrix must be asymptotically correct. Currently, the function accepts:
a mirt object of class SingleGroupClass or MultipleGroupClass with SE = TRUE (to return covariance matrix) and itemtype of any combination of "2PL", "graded", or "gpcm".
a lavaan object estimated from cfa with std.lv = TRUE.
When cluster is supplied for mirt fits, the covariance matrix is computed using a cluster-robust sandwich estimator with Oakes bread and cluster-summed empirical scores. A CR1 finite-sample correction is applied: (G/(G-1)) * ((N-1)/(N-p)), where G is the number of clusters, N is the number of observations, and p is the number of free parameters.
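The CR1 correction factor can be computed directly from its formula. The base R sketch below uses a hypothetical helper name, cr1_factor, and made-up sample sizes; it illustrates only the multiplier, not the sandwich estimator itself.

```r
# CR1 finite-sample correction: (G/(G-1)) * ((N-1)/(N-p))
cr1_factor <- function(G, N, p) {
  stopifnot(G > 1, N > p)  # the factor is undefined otherwise
  (G / (G - 1)) * ((N - 1) / (N - p))
}

# Illustrative values: 50 clusters, 1000 observations, 10 free parameters
cr1_factor(G = 50, N = 1000, p = 10)
```

The factor is always greater than 1 when p > 1 and approaches 1 as the number of clusters and observations grows, so it inflates the covariance matrix most in small samples.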
It is possible to use fits from other software with robustDIF functions, but the parameter estimates and their covariance matrices must be formatted identically to what is returned by get_model_parms. For details, see the documentation for the example dataset rdif.eg.
A three-element list:
par.names: list with internal and original parameter names.
est: list (one element per group) of data frames containing item parameters by row (a1, d1, d2, ...).
var.cov: list (one element per group) of covariance matrices for the corresponding parameter vectors.
Compute starting values for rdif.
get_starts(mle, fun = "d_fun3", alpha = 0.05)
mle |
the output of |
fun |
one of |
alpha |
the desired false positive rate for flagging items with DIF. |
A vector containing the median of y_fun, the least trimmed squares estimate of location for y_fun with 50-percent trim rate, and the minimum of rho_grid.
Computes the gradient of a_fun. The gradient is taken with respect to the item parameters and organized to be conformable with Matrix::bdiag(mle$var.cov). When evaluating the gradient under the null hypothesis of no DIF, the optional argument theta can be provided. It replaces the item-specific values of a_fun in the gradient computation.
grad_a(mle, theta = NULL, log = FALSE)
mle |
the output of |
theta |
(optional) the scaling parameter. Replaces item-specific values of a_fun if provided. |
log |
logical: should log(a2/a1) be returned? |
A matrix in which the columns are the gradient vectors of a_fun, for each item.
Computes the gradient of d_fun. The gradient is taken with respect to the item parameters and organized to be conformable with Matrix::bdiag(mle$var.cov). When evaluating the gradient under the null hypothesis of no DIF, the optional argument theta can be provided. It replaces the item-specific values of d_fun in the gradient computation.
grad_d(mle, theta = NULL, type = 3)
mle |
the output of |
theta |
(optional) the scaling parameter. Replaces item-specific values of d_fun if provided. |
type |
a number in |
A matrix in which the columns are the gradient vectors of d_fun, for each item and threshold.
The least trimmed squares (LTS) estimate of location.
lts(y, p = 0.5)
y |
a vector of data points. |
p |
the proportion of data points to trim. |
The LTS estimate of location of y.
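For a single location parameter, an LTS estimate can be found by scanning contiguous windows of the sorted data, since the best-fitting subset is always contiguous in sorted order. The base R sketch below is one way to do this. The function name lts_location and the subset-size rule h = floor(n(1 - p)) + 1 are assumptions made for illustration; the package's lts may use a different convention.

```r
# LTS location: mean of the size-h subset with the smallest sum of squares
lts_location <- function(y, p = 0.5) {
  y <- sort(y)
  n <- length(y)
  h <- min(n, floor(n * (1 - p)) + 1)  # subset size (assumed convention)
  starts <- seq_len(n - h + 1)
  # Sum of squared deviations from the window mean, for each contiguous window
  ss <- vapply(starts, function(i) {
    sub <- y[i:(i + h - 1)]
    sum((sub - mean(sub))^2)
  }, numeric(1))
  best <- which.min(ss)
  mean(y[best:(best + h - 1)])
}

# Four values near zero and one gross outlier
lts_location(c(0.1, 0.0, -0.1, 0.05, 5))
```

The outlier at 5 never enters the selected window, so it has no influence on the estimate.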
If abs(u) > k, psi(u) = 0. Else, psi(u) = u(1 - (u/k)^2)^2.
psi(u, k = 1.96)
u |
Can be a single value, vector, or matrix. |
k |
The tuning parameter. Can be a scalar or the same dimension as |
The bi-square psi function.
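The psi formula translates directly into vectorized R. The sketch below uses the hypothetical name bisquare_psi and follows only the formula stated above, not the package source.

```r
# Bi-square psi: u * (1 - (u/k)^2)^2 inside [-k, k], zero outside
bisquare_psi <- function(u, k = 1.96) {
  out <- u * (1 - (u / k)^2)^2
  out[abs(u) > k] <- 0   # redescending: large residuals contribute nothing
  out
}

u <- seq(-3, 3, by = 0.5)
bisquare_psi(u)
```

The function is odd, so psi(-u) = -psi(u), and it returns to zero for large abs(u), which is what makes the resulting estimator insensitive to items with large DIF.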
If abs(u) > k, psi_prime(u) = 0. Else, psi_prime(u) = (1 - (u/k)^2)^2 - (2u/k)^2 (1 - (u/k)^2).
psi_prime(u, k = 1.96)
u |
Can be a single value, vector, or matrix. |
k |
The tuning parameter. Can be a scalar or the same dimension as |
The bi-square psi-prime function.
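The psi-prime formula is the derivative of psi, which can be checked numerically. The base R sketch below defines both functions under hypothetical names and compares the analytic derivative with a central difference at a few interior points.

```r
# psi and its analytic derivative, following the formulas in the text
bisquare_psi <- function(u, k = 1.96) {
  ifelse(abs(u) > k, 0, u * (1 - (u / k)^2)^2)
}
bisquare_psi_prime <- function(u, k = 1.96) {
  r2 <- (u / k)^2
  out <- (1 - r2)^2 - 4 * r2 * (1 - r2)   # (2u/k)^2 equals 4 * r2
  out[abs(u) > k] <- 0
  out
}

# Central-difference check at points inside (-k, k)
u <- c(-1.5, -0.3, 0, 0.7, 1.2)
h <- 1e-6
numeric_deriv <- (bisquare_psi(u + h) - bisquare_psi(u - h)) / (2 * h)
max_abs_error <- max(abs(numeric_deriv - bisquare_psi_prime(u)))
max_abs_error
```

A small max_abs_error indicates that the two expressions agree up to numerical differentiation error.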
Estimation is designed to use either iteratively re-weighted least squares (IRLS) or Newton-Raphson (NR); currently, only IRLS is implemented. If starting.value = "all", three starting values are computed: the median of y_fun, the least trimmed squares estimate of location for y_fun with 50-percent trim rate, and the minimum of rho_grid. The estimate is computed from each starting value, and the solution with the lowest value of the bi-square objective function is returned. If there are multiple solutions, they are stored in other.solutions.
rdif(mle, fun = "d_fun3", alpha = 0.05, starting.value = "all", tol = 1e-07, maxit = 100, method = "irls")
mle |
the output of |
fun |
one of |
alpha |
the desired false positive rate for flagging items with DIF. |
starting.value |
one of |
tol |
convergence criterion for comparing successive values of the estimate |
maxit |
maximum number of iterations |
method |
one of |
Implements M-estimation of an IRT scale parameter using the bi-square loss function. Also returns the bi-square weights for each item.
An rdif object.
# Item intercepts, using the built-in example dataset "rdif.eg"
rdif(mle = rdif.eg, fun = "d_fun3")
# Item slopes
rdif(mle = rdif.eg, fun = "a_fun1")
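The IRLS idea behind the estimator can be sketched in base R for a generic location problem. Everything in this sketch is an assumption made for illustration: the function name irls_location, the item-specific tuning parameter k = qnorm(1 - alpha/2) * se, the estimating equation sum(psi(y - theta, k)) = 0, and the data values. It is not the package's rdif implementation.

```r
# IRLS for a bi-square location estimate: repeat a weighted mean of y,
# recomputing the bi-square weights at the current estimate each time
irls_location <- function(y, se, alpha = 0.05, start = median(y),
                          tol = 1e-7, maxit = 100) {
  k <- qnorm(1 - alpha / 2) * se           # assumed item-specific tuning values
  bsq <- function(u) ifelse(abs(u) > k, 0, (1 - (u / k)^2)^2)
  theta <- start
  for (iter in seq_len(maxit)) {
    w <- bsq(y - theta)
    if (sum(w) == 0) stop("all weights are zero; try another starting value")
    theta_new <- sum(w * y) / sum(w)       # weighted-mean update
    converged <- abs(theta_new - theta) < tol
    theta <- theta_new
    if (converged) break
  }
  list(est = theta, weights = bsq(y - theta), iterations = iter)
}

# Hypothetical scaling-function values: four similar items and one outlier
y  <- c(0.48, 0.52, 0.50, 0.47, 1.60)
se <- rep(0.1, 5)
fit <- irls_location(y, se)
fit$est
fit$weights
```

In this example the fifth item receives a weight of zero, so the estimate is determined by the first four items only. Because the bi-square objective can have several local solutions, the result depends on the starting value, which is why trying multiple starting values is useful.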
A named list containing the maximum likelihood estimates and their estimated covariance matrix, for the 2PL IRT model fitted to 5 items in two independent groups. The first item has additive bias of .5 applied to the intercept only. The groups have a mean difference of .5 standard deviations on the latent trait. The variances of the latent trait are equal in each group.
rdif.eg
A named list with 4 components:
par0: A data.frame or named list with par0$a containing the item slopes and par0$d containing the item intercepts, for the reference group.
par1: The item parameter estimates of the comparison group. See par0 for formatting.
vcov0: The covariance matrix of par0, formatted as either a data.frame or matrix. The parameters should be organized by item, with the slope parameter coming first and the intercept parameter coming second (e.g., a.item1, d.item1, a.item2, d.item2, ...).
vcov1: The covariance matrix of par1. See vcov0 for formatting.
If abs(u) > k, rho(u) = 1. Else, rho(u) = 1 - (1 - (u/k)^2)^3.
rho(u, k = 1.96)
u |
Can be a single value, vector, or matrix. |
k |
The tuning parameter. Can be a scalar or the same dimension as |
The bi-square rho function.
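The rho formula can be sketched in base R as follows. The name bisquare_rho is hypothetical, and the sketch follows only the formula stated above.

```r
# Bi-square rho: 1 - (1 - (u/k)^2)^3 inside [-k, k], capped at 1 outside
bisquare_rho <- function(u, k = 1.96) {
  out <- 1 - (1 - (u / k)^2)^3
  out[abs(u) > k] <- 1   # bounded loss: every large residual costs exactly 1
  out
}

u <- c(0, 0.5, 1, 1.96, 5)
bisquare_rho(u)
```

The loss is zero at u = 0, increases with abs(u), and is constant beyond k, so a single extreme value cannot dominate the objective function.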
Computes the objective function of the bi-square minimization problem in a location parameter, theta. The theta values are obtained internally by a grid search over the range of y_fun. Used for starting values and graphically diagnosing local solutions.
rho_grid(mle, fun = "d_fun3", alpha = 0.05, grid.width = 0.01)
mle |
the output of |
fun |
one of |
alpha |
the desired false positive rate for flagging items with DIF. |
grid.width |
the width of grid points. |
A named list with theta values and the corresponding Rho values.
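A grid search of this kind can be sketched in base R. The helper name rho_grid_sketch, the use of a summed (rather than averaged) objective, the fixed tuning value k, and the data are all assumptions made for illustration; the package's rho_grid derives these quantities from mle, fun, and alpha.

```r
# Evaluate the summed bi-square loss on a grid over the range of y
rho_grid_sketch <- function(y, k, grid.width = 0.01) {
  rho <- function(u, k) ifelse(abs(u) > k, 1, 1 - (1 - (u / k)^2)^3)
  theta <- seq(min(y), max(y), by = grid.width)
  Rho <- vapply(theta, function(t) sum(rho(y - t, k)), numeric(1))
  list(theta = theta, Rho = Rho)
}

# Hypothetical scaling-function values: four similar items and one outlier
y <- c(0.48, 0.52, 0.50, 0.47, 1.60)
grid <- rho_grid_sketch(y, k = 0.196)
best <- grid$theta[which.min(grid$Rho)]
best
```

Plotting grid$Rho against grid$theta would show a deep minimum near the four similar items and a shallow local minimum at the outlier, which is the kind of local solution the grid is meant to reveal.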
When evaluating the covariance matrix under the null hypothesis of no DIF, the optional argument theta can be provided. It replaces the item-specific scaling functions in the gradient computation. The fun argument should be the same as the one used in y_fun.
vcov_y(mle, theta = NULL, fun = "d_fun3")
mle |
the output of |
theta |
(optional) the scaling parameter. Replaces item-specific scaling functions if provided. |
fun |
one of |
The covariance matrix of y_fun.
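A covariance matrix of this kind is typically obtained with the delta method, t(G) %*% V %*% G, where V is the block-diagonal covariance matrix of the item parameters and G holds one gradient column per scaling-function value. The base R sketch below works this out for the slope ratio a2/a1 with two items. The parameter values, their ordering, and the variable names are hypothetical, and the sketch is not the package's vcov_y.

```r
# Hypothetical slope estimates and covariance matrices for two items
a1 <- c(1.0, 1.2); a2 <- c(1.1, 1.5)
V1 <- diag(c(0.010, 0.012))   # covariance of group 1 slopes
V2 <- diag(c(0.011, 0.015))   # covariance of group 2 slopes

# Independent groups: the joint covariance is block diagonal
zero <- matrix(0, 2, 2)
V <- rbind(cbind(V1, zero), cbind(zero, V2))

# Gradient of a2/a1, one column per item; rows ordered as
# (a1 item 1, a1 item 2, a2 item 1, a2 item 2) to match V
G <- matrix(0, nrow = 4, ncol = 2)
for (i in 1:2) {
  G[i, i]     <- -a2[i] / a1[i]^2   # derivative with respect to a1
  G[2 + i, i] <- 1 / a1[i]          # derivative with respect to a2
}

# Delta-method covariance of the two slope ratios
vcov_ratio <- t(G) %*% V %*% G
vcov_ratio
```

Here the off-diagonal entries are zero because the hypothetical parameter covariances are diagonal; with correlated item parameters they would generally be nonzero.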
Computes a scaling function using the item thresholds (d) and slopes (a) in groups g = {1, 2}. The options are:
"a_fun1": computes a2/a1
"a_fun2": computes log(a2/a1)
"d_fun1": computes (d2 - d1)/a1 for each threshold
"d_fun2": computes (d2 - d1)/a2 for each threshold
"d_fun3": computes (d2 - d1)/sqrt((a1^2 + a2^2)/2) for each threshold
y_fun(mle, fun = "d_fun3")
mle |
the output of |
fun |
one of |
Computes the scaling function specified by fun, for each item.
A vector of scaling function values.