Randomized Response Regression

rrreg is used to conduct multivariate regression analyses of survey data using randomized response methods.

rrreg(formula, p, p0, p1, q, design, data, start = NULL,
h = NULL, group = NULL, matrixMethod = "efficient",
maxIter = 10000, verbose = FALSE, optim = FALSE, em.converge = 10^(-8),
glmMaxIter = 10000, solve.tolerance = .Machine$double.eps)

Arguments

formula	An object of class "formula": a symbolic description of the model to be fitted.
p	The probability of receiving the sensitive question (Mirrored Question Design, Unrelated Question Design); the probability of answering truthfully (Forced Response Design); the probability of selecting a red card from the 'yes' stack (Disguised Response Design). For "mirrored" and "disguised" designs, p cannot equal .5.
p0	The probability of forced 'no' (Forced Response Design).
p1	The probability of forced 'yes' (Forced Response Design).
q	The probability of answering 'yes' to the unrelated question, which is assumed to be independent of covariates (Unrelated Question Design).
design	One of the four standard designs: "forced-known", "mirrored", "disguised", or "unrelated-known".
data	A data frame containing the variables in the model.
start	Optional starting values of coefficient estimates for the Expectation-Maximization (EM) algorithm.
h	Auxiliary data functionality. Optional named numeric vector with length equal to number of groups. Names correspond to group labels and values correspond to auxiliary moments.
group	Auxiliary data functionality. Optional character vector of group labels with length equal to number of observations.
matrixMethod	Auxiliary data functionality. Procedure for estimating the optimal weighting matrix for the generalized method of moments. Either "efficient" (two-step feasible) or "cue" (continuously updating). The default is "efficient". Only relevant if `h` and `group` are specified.
maxIter	Maximum number of iterations for the Expectation-Maximization algorithm. The default is `10000`.
verbose	A logical value indicating whether model diagnostics counting the number of EM iterations are printed out. The default is `FALSE`.
optim	A logical value indicating whether to use the quasi-Newton "BFGS" method to calculate the variance-covariance matrix and standard errors. The default is `FALSE`.
em.converge	A value specifying the satisfactory degree of convergence under the EM algorithm. The default is `10^(-8)`.
glmMaxIter	A value specifying the maximum number of iterations to run the EM algorithm. The default is `10000`.
solve.tolerance	The tolerance passed to the matrix inversion function `solve` when standard errors are calculated. The default is `.Machine$double.eps`.
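
The constraints on the design parameters listed above can be checked before a model is fitted. The following is a minimal sketch in base R. `check_rr_design` is a hypothetical helper, not part of the rr package, and the requirement that `p`, `p0`, and `p1` sum to one under the forced response design is an assumption that treats the three outcomes of the randomizing device as exhaustive.

```r
# Hypothetical helper (not part of rr): validate design parameters
# before passing them to rrreg().
check_rr_design <- function(design, p, p0 = NULL, p1 = NULL, q = NULL) {
  design <- match.arg(design, c("forced-known", "mirrored",
                                "disguised", "unrelated-known"))
  is_prob <- function(x) is.numeric(x) && length(x) == 1 && x >= 0 && x <= 1
  if (!is_prob(p)) stop("p must be a single probability in [0, 1]")

  if (design == "forced-known") {
    if (!is_prob(p0) || !is_prob(p1)) stop("p0 and p1 must be probabilities")
    # Assumption: truthful answer, forced 'yes', and forced 'no' are exhaustive.
    if (!isTRUE(all.equal(p + p0 + p1, 1))) stop("p + p0 + p1 must equal 1")
  } else if (design %in% c("mirrored", "disguised")) {
    # Stated in the description of p: these designs cannot use p = .5.
    if (isTRUE(all.equal(p, 0.5))) stop("p cannot equal .5 for this design")
  } else {
    if (!is_prob(q)) stop("q must be a single probability in [0, 1]")
  }
  invisible(TRUE)
}

# Parameters used in the Examples section.
check_rr_design("forced-known", p = 2/3, p0 = 1/6, p1 = 1/6)
```

Calling the helper with `design = "mirrored"` and `p = 0.5` stops with an error, which mirrors the restriction stated for the `p` argument.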

Value

rrreg returns an object of class "rrreg". The function summary is used to obtain a table of the results. The returned object is a list that contains the following components (whether some components, such as the design parameters, are included depends on the design used):

est

Point estimates for the effects of covariates on the randomized response item.

vcov

Variance-covariance matrix for the effects of covariates on the randomized response item.

se

Standard errors for estimates of the effects of covariates on the randomized response item.

data

The data argument.

coef.names

Variable names as defined in the data frame.

x

The model matrix of covariates.

y

The randomized response vector.

design

The standard design used: "forced-known", "mirrored", "disguised", or "unrelated-known".

p

The p argument.

p0

The p0 argument.

p1

The p1 argument.

q

The q argument.

call

The matched call.
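
The point estimates and standard errors stored in the returned list can be combined into Wald statistics. The following is a minimal sketch in base R. `wald_table` is a hypothetical helper, not part of the rr package, and the normal approximation behind the p-values is an assumption of this illustration.

```r
# Hypothetical helper (not part of rr): Wald z statistics and two-sided
# p-values from point estimates and standard errors.
wald_table <- function(est, se, names = NULL) {
  est <- as.numeric(est)
  se  <- as.numeric(se)
  z   <- est / se
  out <- data.frame(est = est, se = se, z = z,
                    p.value = 2 * pnorm(-abs(z)))
  if (!is.null(names)) rownames(out) <- names
  out
}

# Values copied from the summary output in the Examples section.
tab <- wald_table(est = c(0.07896, -0.55439), se = c(0.04136, 0.16244),
                  names = c("cov.asset.index", "cov.female"))
tab

# With a fitted object `fit`, the same table could presumably be built from
# the components listed above, for example:
#   wald_table(fit$est, sqrt(diag(fit$vcov)), fit$coef.names)
```

Taking the square root of the diagonal of `vcov` is the usual way to obtain standard errors from a variance-covariance matrix, so the commented call relies only on the `est`, `vcov`, and `coef.names` components.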

Details

This function allows users to perform multivariate regression analysis on data collected with the randomized response technique. It accepts four standard designs: mirrored question, forced response, disguised response, and unrelated question. The model is fitted by maximum likelihood (ML) estimation using the Expectation-Maximization (EM) algorithm.
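
To illustrate the likelihood being maximized, the following is a minimal sketch in base R for the forced response design only. It assumes a logistic model for the latent sensitive trait, so that the probability of an observed 'yes' is p * Pr(trait = 1 | x) + p1. The sketch maximizes this observed-data likelihood directly with `optim` instead of running the EM algorithm, so it is not the implementation used by rrreg. All simulated values (`n`, `beta`, the covariate `x`) are made up for the illustration.

```r
set.seed(42)

# Design parameters (same values as in the Examples section).
p  <- 2/3   # probability of answering truthfully
p1 <- 1/6   # probability of forced 'yes'
p0 <- 1/6   # probability of forced 'no'

# Simulate a latent sensitive trait from a logistic model (illustrative values).
n    <- 5000
x    <- rnorm(n)
beta <- c(-1, 0.8)
z    <- rbinom(n, 1, plogis(beta[1] + beta[2] * x))

# Apply the forced response device: truthful with probability p,
# forced 'yes' with probability p1, forced 'no' with probability p0.
device <- sample(c("truth", "yes", "no"), n, replace = TRUE,
                 prob = c(p, p1, p0))
y <- ifelse(device == "truth", z, as.integer(device == "yes"))

# Observed-data negative log-likelihood:
# Pr(Y = 1 | x) = p * plogis(b0 + b1 * x) + p1.
negloglik <- function(b) {
  pr_yes <- p * plogis(b[1] + b[2] * x) + p1
  -sum(y * log(pr_yes) + (1 - y) * log(1 - pr_yes))
}

# Direct maximization; the inverse Hessian approximates the covariance matrix.
fit <- optim(c(0, 0), negloglik, method = "BFGS", hessian = TRUE)
est <- fit$par
se  <- sqrt(diag(solve(fit$hessian)))
cbind(est, se)
```

Because the device bounds the probability of an observed 'yes' between p1 and p + p1, the estimates are noisier than those from a direct question with the same sample size, which is the price paid for protecting respondents.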

References

Blair, Graeme, Kosuke Imai and Yang-Yang Zhou. (2014) "Design and Analysis of the Randomized Response Technique." Working Paper. Available at http://imai.princeton.edu/research/randresp.html.

Examples


data(nigeria)

set.seed(1)

## Define design parameters
p <- 2/3  # probability of answering honestly in Forced Response Design
p1 <- 1/6 # probability of forced 'yes'
p0 <- 1/6 # probability of forced 'no'

## Fit a logistic regression on the randomized response item of whether
## citizen respondents had direct social contacts with armed groups

rr.q1.reg.obj <- rrreg(rr.q1 ~ cov.asset.index + cov.married +
                    I(cov.age/10) + I((cov.age/10)^2) + cov.education + cov.female,
                    data = nigeria, p = p, p1 = p1, p0 = p0,
                    design = "forced-known")

summary(rr.q1.reg.obj)
#> 
#> Randomized Response Technique Regression 
#> 
#> Call: rrreg(formula = rr.q1 ~ cov.asset.index + cov.married + I(cov.age/10) + 
#>     I((cov.age/10)^2) + cov.education + cov.female, p = p, p0 = p0, 
#>     p1 = p1, design = "forced-known", data = nigeria)
#> 
#>                       Est.    S.E.
#> (Intercept)       -0.34017 0.50856
#> cov.asset.index    0.07896 0.04136
#> cov.married       -0.26743 0.25451
#> I(cov.age/10)     -0.35283 0.26423
#> I((cov.age/10)^2)  0.04099 0.02603
#> cov.education     -0.00691 0.04558
#> cov.female        -0.55439 0.16244
#> 
#> Randomized response forced design with p = 0.67, p0 = 0.17, and p1 = 0.17.

## Replicates Table 3 in Blair, Imai, and Zhou (2014)