Abstract
Fixed tests and computerized adaptive testing (CAT) coexist in many testing programs and are often used interchangeably on the premise that both testing formats meet the same test specifications. In conventional CAT, however, items are selected through computer algorithms to meet primarily statistical criteria, whereas fixed forms are often created focusing heavily on content, non-statistical, and practical requirements. Founded on the optimal test design framework, the shadow-test approach to CAT and its generalization can allow for constructing fixed and adaptive test forms to the same test specifications with complex sets of constraints. This approach can render a variety of testing formats with different levels of adaptation and relative efficiency. TestDesign is a package implemented in R to allow for constructing both static and adaptive test forms to the same test specifications based on the framework.
The shadow-test approach to CAT provides a flexible framework for adaptive testing solutions requiring a complex set of constraints (van der Linden and Reese 1998).
Utilizing the universal shadow-test assembler framework (van der Linden and Diao 2014), many testing formats can be assembled as a special case of the approach, including fixed, multi-stage, fully-adaptive tests, and their mixtures.
The key advantage is that it enforces the same set of constraints for different testing formats.
TestDesign Package
Based on the shadow-test approach to CAT, TestDesign implements the optimal test design framework (van der Linden and Reese 1998) both to assemble fixed-form tests and to run simulations of their adaptive counterparts, with extensive customization options.
TestDesign is unique in that it can assemble both static and adaptive test forms subject to the same test specification based on the optimal test design framework.
TestDesign supports item pools that include a mixture of dichotomous and polytomous items calibrated according to common IRT models.
TestDesign implements the universal shadow-test assembler framework (van der Linden and Diao 2014), allowing for various levels of adaptivity:
[Figure: Universal Shadow-Test]
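The current version can work with the following open-source and commercial mixed integer programming (MIP) solvers: lpSolve (Berkelaar and others 2020), Rsymphony (Harter, Hornik, and Theussl 2017), lpsymphony (Kim 2019), Rglpk (Theussl and Hornik 2019), and gurobi (Gurobi Optimization, LLC 2019).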
Utilizing the S4 object-oriented programming (OOP) system in R, TestDesign provides a collection of classes and methods for various IRT models.
TestDesign is built on two primary modules:
Static() for fixed-form assembly
Shadow() for adaptive assembly
The primary modules share the same input data components prepared sequentially by the following loading functions:
loadItemPool() to load an item pool
loadItemAttrib() to load item attributes
loadStAttrib() to load optional stimulus attributes
loadConstraints() to load test constraints
The input data components required for test assembly are an item pool containing item parameter estimates, item attributes, and test constraints. Stimulus-based test assembly also requires stimulus attributes. The data components are:
An item pool is prepared as a plain csv file. The following example file contains a mix of 1000 dichotomous and polytomous items.
library(TestDesign)
library(DT) # provides datatable()
itempool_science_data <- read.csv(file.path(find.package("TestDesign"), "extdata", "itempool_science_1000.csv"), header = TRUE)
datatable(itempool_science_data, rownames = FALSE)
An item attribute file is prepared as a plain csv file, containing various item-level characteristics.
itemattrib_science_data <- read.csv(file.path(find.package("TestDesign"), "extdata", "itemattrib_science_1000.csv"), header = TRUE, as.is = TRUE)
datatable(itemattrib_science_data, rownames = FALSE)
A constraints file is prepared as a plain csv file. The following example contains 36 constraints.
constraints_science_data <- read.csv(file.path(find.package("TestDesign"), "extdata", "constraints_science_1000.csv"), header = TRUE, as.is = TRUE)
datatable(constraints_science_data, rownames = FALSE)
The loading functions can load csv files directly or data.frame objects created from them. Using the latter option, the following sections illustrate how to load and create input data components.
loadItemPool()
This creates an item_pool object.
itempool_science <- loadItemPool(itempool_science_data)
summary(itempool_science)
## Item pool
## # of items : 1000
## item_3PL : 918
## item_GPC : 82
## has SE : FALSE
head(itempool_science@parms)
## [[1]]
## Three-parameter logistic model (item_3PL)
## Slope : 0.2961202
## Difficulty : -0.6070774
## Guessing : 0.1987126
##
## [[2]]
## Three-parameter logistic model (item_3PL)
## Slope : 0.9188548
## Difficulty : -2.94024
## Guessing : 0.2708818
##
## [[3]]
## Three-parameter logistic model (item_3PL)
## Slope : 1.136193
## Difficulty : -0.4767972
## Guessing : 0.2061712
##
## [[4]]
## Three-parameter logistic model (item_3PL)
## Slope : 1.1097
## Difficulty : -1.733667
## Guessing : 0.1672703
##
## [[5]]
## Three-parameter logistic model (item_3PL)
## Slope : 0.9155325
## Difficulty : -0.2742345
## Guessing : 0.1377026
##
## [[6]]
## Three-parameter logistic model (item_3PL)
## Slope : 0.8146092
## Difficulty : -1.181495
## Guessing : 0.2755008
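To make the printed parameters concrete, the following base-R sketch evaluates a three-parameter logistic response function for the first item listed above. It assumes the logistic metric without the 1.7 scaling constant, which is an assumption here and is not stated in this text. The helper name p_3pl is hypothetical and is not a TestDesign function.

```r
# Hypothetical helper (not a TestDesign function): 3PL response probability.
# Assumes P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b))), no 1.7 scaling.
p_3pl <- function(theta, a, b, c) {
  c + (1 - c) / (1 + exp(-a * (theta - b)))
}

# Parameters of the first item printed by head(itempool_science@parms)
a1 <- 0.2961202   # slope
b1 <- -0.6070774  # difficulty
c1 <- 0.1987126   # guessing

# Probability of a correct response over a small grid of trait levels
theta_grid <- c(-2, -1, 0, 1, 2)
round(p_3pl(theta_grid, a1, b1, c1), 3)
```

Under this form, the probability at a trait level equal to the difficulty is halfway between the guessing parameter and 1, and it never falls below the guessing parameter.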
loadItemAttrib()
This creates an item_attrib object. The second argument references the item_pool object created above.
itemattrib_science <- loadItemAttrib(itemattrib_science_data, itempool_science)
summary(itemattrib_science)
## Item attributes
## # of attributes : 9
## INDEX : (1000 levels)
## ID : (1000 levels)
## LEVEL : 3 4 5
## STANDARD : 1 2 3 4
## OBJECTIVE : (28 levels)
## DOK : 1 2 3
## TYPE : DRAG EQTN FILL GRAPH HOTS MATCH SRMU SRSI
## PVALUE : (1000 levels)
## PTBIS : (1000 levels)
loadConstraints()
This creates a constraints object. The third argument references the item_attrib object created above.
constraints_science <- loadConstraints(constraints_science_data, itempool_science, itemattrib_science)
datatable(constraints_science@constraints, rownames = FALSE)
We will first illustrate fixed-form assembly problems with Static(), followed by examples of adaptive assembly using Shadow().
Static()
Static() provides three item selection methods:
Maximum information (item_selection = "MAXINFO")
Target information function (item_selection = "TIF")
Target characteristic curve (item_selection = "TCC")
The helper function createStaticTestConfig() allows the user to create a configuration object, config_Static, and to specify the item_selection method of choice and related options.
Here, we illustrate a fixed-form assembly with the Target Information Function option.
cfg_fixed <- createStaticTestConfig(
  item_selection = list(
    method = "TIF",
    target_location = c(-1, 1),
    target_value = c(8, 10)
  )
)
fixed_science <- Static(cfg_fixed, constraints_science)
datatable(fixed_science@selected, rownames = FALSE)
datatable(fixed_science@achieved, rownames = FALSE)
plot(fixed_science)
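The target values in the configuration above are amounts of test information at the target locations. As a rough illustration of what is being matched, the base-R sketch below computes the test information function of a small form of 3PL items and its deviation from the targets. The helper names and the five item parameter sets are made up for illustration, the 3PL form assumes no 1.7 scaling constant, and none of this is TestDesign code.

```r
# Hypothetical helpers (not TestDesign functions).
p_3pl <- function(theta, a, b, c) {
  c + (1 - c) / (1 + exp(-a * (theta - b)))
}

# Fisher information of a 3PL item at theta
info_3pl <- function(theta, a, b, c) {
  p <- p_3pl(theta, a, b, c)
  a^2 * ((1 - p) / p) * ((p - c) / (1 - c))^2
}

# Test information: item information summed over items, per target location
test_info <- function(theta, items) {
  sapply(theta, function(t) sum(info_3pl(t, items$a, items$b, items$c)))
}

# Made-up parameters for a five-item form (illustration only)
items <- data.frame(
  a = c(1.2, 0.9, 1.5, 1.0, 0.8),
  b = c(-1.0, -0.5, 0.0, 0.5, 1.0),
  c = c(0.20, 0.15, 0.25, 0.20, 0.10)
)

target_location <- c(-1, 1)
target_value    <- c(8, 10)

achieved <- test_info(target_location, items)
# Absolute deviation from the targets at each location
abs(achieved - target_value)
```

A five-item form falls far short of targets this large; the point is only to show the quantity that a target-information assembly tries to bring close to target_value at each target_location.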
Here, we illustrate a fixed-form assembly with the Target Characteristic Curve option:
cfg_fixed <- createStaticTestConfig(
  item_selection = list(
    method = "TCC",
    target_location = c(-1, 0, 1),
    target_value = c(10, 15, 20)
  )
)
fixed_science <- Static(cfg_fixed, constraints_science)
datatable(fixed_science@selected, rownames = FALSE)
datatable(fixed_science@achieved, rownames = FALSE)
plot(fixed_science)
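For the characteristic-curve option, the target values are expected test scores at the target locations. The base-R sketch below computes a test characteristic curve for a simulated set of 3PL items and compares it with the targets used above. The simulated pool, the helper names, and the unscaled 3PL form are assumptions made for illustration and are not part of TestDesign.

```r
# Hypothetical helper (not a TestDesign function): 3PL response probability
p_3pl <- function(theta, a, b, c) {
  c + (1 - c) / (1 + exp(-a * (theta - b)))
}

# Test characteristic curve: expected number-correct score at each theta
tcc <- function(theta, items) {
  sapply(theta, function(t) sum(p_3pl(t, items$a, items$b, items$c)))
}

# Simulated 30-item form (illustration only)
set.seed(1)
items <- data.frame(
  a = runif(30, 0.5, 2.0),
  b = rnorm(30),
  c = runif(30, 0.1, 0.3)
)

target_location <- c(-1, 0, 1)
target_value    <- c(10, 15, 20)

expected <- tcc(target_location, items)
# Signed deviation of the expected score from each target
expected - target_value
```

For dichotomous items the curve is bounded below by the sum of the guessing parameters and above by the number of items, so targets must fall within that range to be attainable.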
Here, we illustrate a fixed-form assembly with the Maximum Information option:
cfg_fixed <- createStaticTestConfig(
  item_selection = list(
    method = "MAXINFO",
    target_location = c(-1, 1)
  )
)
fixed_science <- Static(cfg_fixed, constraints_science)
datatable(fixed_science@selected)
datatable(fixed_science@achieved)
plot(fixed_science)
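If all content constraints are ignored and only the test length is fixed, maximizing information reduces to choosing the items with the largest information summed over the target locations. The base-R sketch below shows that reduced case only. It is not the constrained MIP problem that Static() solves, it assumes the objective is an unweighted sum over target locations, and the simulated pool, the test length of 10, and the helper name are hypothetical.

```r
# Hypothetical helper (not a TestDesign function): 3PL Fisher information,
# assuming the logistic form without the 1.7 scaling constant
info_3pl <- function(theta, a, b, c) {
  p <- c + (1 - c) / (1 + exp(-a * (theta - b)))
  a^2 * ((1 - p) / p) * ((p - c) / (1 - c))^2
}

# Simulated 50-item pool (illustration only)
set.seed(2)
pool <- data.frame(
  id = 1:50,
  a  = runif(50, 0.5, 2.0),
  b  = rnorm(50),
  c  = runif(50, 0.1, 0.3)
)

target_location <- c(-1, 1)

# Objective coefficient per item: information summed over target locations
info_matrix <- sapply(target_location, function(t) info_3pl(t, pool$a, pool$b, pool$c))
pool$obj <- rowSums(info_matrix)

# With test length as the only constraint, the top-n items are optimal
test_length <- 10
selected <- pool[order(pool$obj, decreasing = TRUE)[seq_len(test_length)], ]
sum(selected$obj)
```

Once content constraints such as those in the constraints file are added, a simple ranking is no longer sufficient, which is why a MIP solver is used.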
Shadow()
Shadow() provides a flexible mechanism to control the level of adaptivity in CAT to render different test formats with the same test specifications. Although maximum adaptivity is realized in fully adaptive testing whereby the shadow test is reassembled upon administering each item, the freezing/refreshing mechanism (van der Linden and Diao 2014) allows for assembling any conceivable testing format as a special case of the shadow-test approach. A few common test formats with reduced levels of adaptivity include:
Fixed - a single shadow test constructed targeting a specific trait level(s) to be administered in whole to all examinees
LOFT - an individualized shadow test constructed for each examinee targeting a specific location on the ability continuum or the examinee’s score from a previous administration and presented in its entirety
On-the-fly MST - a common shadow test constructed for a group of examinees to be reassembled at some predetermined points into testing (or when the change in trait estimate is greater than a certain threshold or both) to be optimized for each examinee’s updated trait estimate
Any hybrids of the above
cfg_adaptive <- createShadowTestConfig()
adaptive_science <- Shadow(cfg_adaptive, constraints_science, true_theta = c(0, 1))
plot(adaptive_science, type = "audit" , examinee_id = 1)
plot(adaptive_science, type = "audit" , examinee_id = 2)
plot(adaptive_science, type = "shadow", examinee_id = 1, simple = TRUE)
plot(adaptive_science, type = "shadow", examinee_id = 2, simple = TRUE)
cfg_adaptive <- createShadowTestConfig()
cfg_adaptive@refresh_policy$method <- "POSITION"
cfg_adaptive@refresh_policy$position <- c(1, 11, 21)
adaptive_science <- Shadow(cfg_adaptive, constraints_science, true_theta = c(0, 1))
plot(adaptive_science, type = "audit" , examinee_id = 1)
plot(adaptive_science, type = "audit" , examinee_id = 2)
plot(adaptive_science, type = "shadow", examinee_id = 1, simple = TRUE)
plot(adaptive_science, type = "shadow", examinee_id = 2, simple = TRUE)
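With refresh positions 1, 11, and 21 as configured above, the shadow test would be reassembled at those item positions and kept frozen in between, which resembles an on-the-fly multistage design. The base-R sketch below only shows how item positions map to stages under that reading. The test length of 30 is a hypothetical value chosen for illustration; the actual length is determined by the constraints.

```r
# Hypothetical test length (the real value comes from the test constraints)
test_length <- 30

# Positions at which the shadow test is refreshed, as in the configuration above
refresh_position <- c(1, 11, 21)

# Flag the positions where a refresh occurs
position <- seq_len(test_length)
refresh  <- position %in% refresh_position

# Stage index for each position: number of refreshes so far
stage <- cumsum(refresh)

# Number of items administered within each stage
table(stage)
```

Setting the refresh positions to every item position would correspond to the fully adaptive case, and a single refresh at position 1 to a test assembled once and administered in its entirety.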
Delivering test forms online and on demand has become standard practice in educational, psychological, and health outcomes testing in recent years. The optimal test assembly framework using MIP provides a viable solution for online, on-demand test assembly problems with complex test specifications and constraints. TestDesign is available from the Comprehensive R Archive Network (https://CRAN.R-project.org/package=TestDesign) and GitHub (https://github.com/choi-phd/TestDesign).
Berkelaar, Michel, and others. 2020. lpSolve: Interface to Lp_solve V. 5.5 to Solve Linear/Integer Programs. https://CRAN.R-project.org/package=lpSolve.
Gurobi Optimization, LLC. 2019. Gurobi: Gurobi Optimizer 9.0 Interface. http://www.gurobi.com.
Harter, Reinhard, Kurt Hornik, and Stefan Theussl. 2017. Rsymphony: SYMPHONY in R. https://CRAN.R-project.org/package=Rsymphony.
Kim, Vladislav. 2019. Lpsymphony: Symphony Integer Linear Programming Solver in R. https://doi.org/10.18129/B9.bioc.lpsymphony.
Theussl, Stefan, and Kurt Hornik. 2019. Rglpk: R/GNU Linear Programming Kit Interface. https://CRAN.R-project.org/package=Rglpk.
van der Linden, Wim J., and Qi Diao. 2014. “Using a Universal Shadow-Test Assembler with Multistage Testing.” In Computerized Multistage Testing: Theory and Applications, edited by Duanli Yan, Alina A. von Davier, and Charles Lewis. Chapman and Hall/CRC Press. https://doi.org/10.1201/b16858.
van der Linden, Wim J., and Lynda M. Reese. 1998. “A Model for Optimal Constrained Adaptive Testing.” Applied Psychological Measurement 22 (3): 259–70. https://doi.org/10.1177/01466216980223006.