| Title: | Correcting Misclassified Mediation Analysis |
|---|---|
| Description: | Use three methods to estimate parameters from a mediation analysis with a binary misclassified mediator. These methods correct for the problem of "label switching" using Youden's J criterion. A detailed description of the analysis methods is available in Webb and Wells (2024), "Effect estimation in the presence of a misclassified binary mediator" <doi:10.48550/arXiv.2407.06970>. |
| Authors: | Kimberly Webb [aut, cre] |
| Maintainer: | Kimberly Webb |
| License: | MIT + file LICENSE |
| Version: | 1.1.1 |
| Built: | 2026-05-16 06:09:16 UTC |
| Source: | https://github.com/kimberlywebb/comma |
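
The description above refers to correcting "label switching" with Youden's J statistic, J = sensitivity + specificity - 1. The following is a minimal base-R sketch of that idea, not the package's internal code. The helper names `youden_j` and `fix_label_switching` and the 2 x 2 matrix layout (rows are true categories, columns are observed categories) are assumptions made for illustration.

```r
# Hypothetical helper: Youden's J from a 2 x 2 classification-probability matrix.
# Rows index the true category, columns the observed category; each row sums to 1.
youden_j <- function(class_probs) {
  sensitivity <- class_probs[1, 1]  # P(observed = 1 | true = 1)
  specificity <- class_probs[2, 2]  # P(observed = 2 | true = 2)
  sensitivity + specificity - 1
}

# Hypothetical correction: if J < 0, the labels of the true categories are
# assumed to be switched, so the two rows are swapped.
fix_label_switching <- function(class_probs) {
  if (youden_j(class_probs) < 0) class_probs[2:1, , drop = FALSE] else class_probs
}

switched <- matrix(c(0.2, 0.8,
                     0.9, 0.1), nrow = 2, byrow = TRUE)
corrected <- fix_label_switching(switched)

youden_j(switched)   # negative: the labels look switched
youden_j(corrected)  # positive after swapping the rows
```

A negative J means the observed mediator agrees with the estimated "true" mediator less often than chance would suggest, which is the signature of switched labels.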
Jointly estimate β and γ parameters from the true outcome
and observation mechanisms, respectively, in a binary outcome misclassification
model.

    COMBO_EM_algorithm(
      Ystar, x_matrix, z_matrix, beta_start, gamma_start,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem"
    )

| Argument | Description |
|---|---|
| Ystar | A numeric vector of indicator variables (1, 2) for the observed outcome. |
| x_matrix | A numeric matrix of covariates in the true outcome mechanism. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| beta_start | A numeric vector or column matrix of starting values for the β parameters in the true outcome mechanism. |
| gamma_start | A numeric vector or matrix of starting values for the γ parameters in the observation mechanism. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |

COMBO_EM_algorithm returns a data frame containing four columns. The first
column, Parameter, represents a unique parameter value for each row.
The next column contains the parameter Estimates, followed by the standard
error estimates, SE. The final column, Convergence, reports
whether or not the algorithm converged for a given parameter estimate.
EM-Algorithm Function for Estimation of the Misclassification Model

    COMBO_EM_function(param_current, obs_Y_matrix, X, Z, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order β, γ. |
| obs_Y_matrix | A numeric matrix of indicator variables (0, 1) for the observed outcome. |
| X | A numeric design matrix for the true outcome mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true outcome can take. |

COMBO_EM_function returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
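
To show how a single-iteration update function of this kind is typically driven, here is a toy EM loop in base R. It is a sketch under stated assumptions, not the package's estimation code: the model is a two-component Normal mixture with known components, and `em_update` and `log_lik` are hypothetical helpers. Only the stopping rule mirrors the documented `tolerance` and `max_em_iterations` arguments.

```r
set.seed(1)
# Toy data: mixture of N(0, 1) and N(3, 1) with true mixing proportion 0.3.
n <- 500
from_first <- rbinom(n, 1, 0.3)
y <- ifelse(from_first == 1, rnorm(n, 0, 1), rnorm(n, 3, 1))

# Observed-data log-likelihood as a function of the mixing proportion p.
log_lik <- function(p) sum(log(p * dnorm(y, 0, 1) + (1 - p) * dnorm(y, 3, 1)))

# One EM iteration: E-step weights, then a closed-form M-step for p.
em_update <- function(p) {
  w <- p * dnorm(y, 0, 1) / (p * dnorm(y, 0, 1) + (1 - p) * dnorm(y, 3, 1))
  mean(w)
}

tolerance <- 1e-07
max_em_iterations <- 1500
p_current <- 0.5
converged <- FALSE

for (iteration in seq_len(max_em_iterations)) {
  p_next <- em_update(p_current)
  # Stop when the log-likelihood changes by less than the tolerance.
  if (abs(log_lik(p_next) - log_lik(p_current)) < tolerance) {
    converged <- TRUE
    p_current <- p_next
    break
  }
  p_current <- p_next
}

p_current
converged
```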
Compute E-step for Binary Outcome Misclassification Model Estimated With the EM-Algorithm

    COMBO_weight(ystar_matrix, pistar_matrix, pi_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| ystar_matrix | A numeric matrix of indicator variables (0, 1) for the observed outcome. |
| pistar_matrix | A numeric matrix of conditional probabilities obtained from an internal function. |
| pi_matrix | A numeric matrix of probabilities obtained from an internal function. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the observed outcome matrix, ystar_matrix. |
| n_cat | The number of categorical values that the true outcome can take. |

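COMBO_weight returns a matrix of E-step weights for the EM-algorithm,
computed as follows:
$\sum_{k = 1}^{2} \frac{y^*_{ik} \pi^*_{ikj} \pi_{ij}}{\sum_{\ell = 1}^{2} \pi^*_{ik\ell} \pi_{i\ell}}$,
where $y^*_{ik}$, $\pi^*_{ikj}$, and $\pi_{ij}$ are the entries of ystar_matrix, pistar_matrix, and pi_matrix
for subject i, observed category k, and true category j.
Rows of the matrix correspond to each subject. Columns of the matrix correspond
to the true outcome categories j = 1, ..., n_cat.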
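
As an illustration of E-step weights of this kind, the sketch below applies Bayes' rule to obtain, for each subject, the probability of each true category given the observed category. It is a minimal base-R example under assumptions, not the package's implementation: `e_step_weights` is a hypothetical helper, and the classification probabilities are stored in a single 2 x 2 matrix shared by all subjects, which is simpler than the layout the package uses for pistar_matrix.

```r
# Hypothetical inputs for n = 3 subjects and n_cat = 2 categories.
# ystar_matrix[i, k] = 1 if subject i has observed category k.
ystar_matrix <- matrix(c(1, 0,
                         0, 1,
                         1, 0), ncol = 2, byrow = TRUE)
# pi_matrix[i, j] = P(true category j) for subject i.
pi_matrix <- matrix(c(0.7, 0.3,
                      0.4, 0.6,
                      0.5, 0.5), ncol = 2, byrow = TRUE)
# pistar[k, j] = P(observed category k | true category j); columns sum to 1.
# Assumed constant across subjects to keep the sketch short.
pistar <- matrix(c(0.9, 0.2,
                   0.1, 0.8), nrow = 2, byrow = TRUE)

e_step_weights <- function(ystar_matrix, pistar, pi_matrix) {
  n <- nrow(ystar_matrix)
  n_cat <- ncol(pi_matrix)
  weights <- matrix(0, nrow = n, ncol = n_cat)
  for (i in seq_len(n)) {
    for (k in seq_len(n_cat)) {
      joint <- pistar[k, ] * pi_matrix[i, ]  # P(observed = k, true = j) for each j
      weights[i, ] <- weights[i, ] + ystar_matrix[i, k] * joint / sum(joint)
    }
  }
  weights
}

w <- e_step_weights(ystar_matrix, pistar, pi_matrix)
w
rowSums(w)  # each row sums to 1
```

Each row of `w` is a posterior distribution over the true categories, so the rows sum to 1.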
Generate Bootstrap Samples for Estimating Standard Errors

    COMMA_boot_sample(
      parameter_estimates, sigma_estimate = 1, outcome_distribution,
      interaction_indicator, x_matrix, z_matrix, c_matrix
    )

| Argument | Description |
|---|---|
| parameter_estimates | A column matrix of β, γ, and θ parameter estimates. |
| sigma_estimate | A numeric value specifying the estimated standard deviation. This value is only required if outcome_distribution is "Normal". The default is 1. |
| outcome_distribution | A character string specifying the distribution of the outcome variable, such as "Bernoulli" or "Normal". |
| interaction_indicator | A logical value indicating if an interaction between x and m is included in the outcome mechanism. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |

COMMA_boot_sample returns a list with the bootstrap sample data:

| Element | Description |
|---|---|
| obs_mediator | A vector of observed mediator values. |
| true_mediator | A vector of true mediator values. |
| outcome | A vector of outcome values. |
| x_matrix | A matrix of predictor values in the true mediator mechanism. Identical to that supplied by the user. |
| z_matrix | A matrix of predictor values in the observed mediator mechanism. Identical to that supplied by the user. |
| c_matrix | A matrix of covariates. Identical to that supplied by the user. |

Generate Data to use in COMMA Functions

    COMMA_data(
      sample_size, x_mu, x_sigma, z_shape, c_shape,
      interaction_indicator, outcome_distribution,
      true_beta, true_gamma, true_theta
    )

| Argument | Description |
|---|---|
| sample_size | An integer specifying the sample size of the generated data set. |
| x_mu | A numeric value specifying the mean of x, which is generated from a Normal distribution. |
| x_sigma | A positive numeric value specifying the standard deviation of x. |
| z_shape | A positive numeric value specifying the shape parameter of z, which is generated from a Gamma distribution. |
| c_shape | A positive numeric value specifying the shape parameter of c. |
| interaction_indicator | A logical value indicating if an interaction between x and m should be used to generate the outcome variable. |
| outcome_distribution | A character string specifying the distribution of the outcome variable, such as "Bernoulli" or "Normal". |
| true_beta | A column matrix of β parameters for the true mediator mechanism. |
| true_gamma | A numeric matrix of γ parameters for the observed mediator mechanism. |
| true_theta | A column matrix of θ parameters for the outcome mechanism. |

COMMA_data returns a list of generated data elements:

| Element | Description |
|---|---|
| obs_mediator | A vector of observed mediator values. |
| true_mediator | A vector of true mediator values. |
| outcome | A vector of outcome values. |
| x | A vector of generated predictor values in the true mediator mechanism, from the Normal distribution. |
| z | A vector of generated predictor values in the observed mediator mechanism, from the Gamma distribution. |
| c | A vector of generated covariates. |
| x_design_matrix | The design matrix for the x predictor. |
| z_design_matrix | The design matrix for the z predictor. |
| c_design_matrix | The design matrix for the c covariate. |


    library(COMMA)

    set.seed(20240709)
    sample_size <- 10000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Bernoulli",
                               true_beta, true_gamma, true_theta)

    head(example_data$obs_mediator)
    head(example_data$true_mediator)

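
For readers who want to see what a data-generating process of this shape looks like, the following base-R sketch simulates a true binary mediator, a misclassified observed mediator, and a Bernoulli outcome. It is an illustration under assumptions rather than the body of COMMA_data: the logistic links, the coefficient ordering noted in the comments, and the Gamma draw for c are all assumed here.

```r
set.seed(20240709)
n <- 1000

# Predictors: x from a Normal distribution, z and c from Gamma distributions
# (the Gamma draw for c is an assumption based on the c_shape argument).
x <- rnorm(n, mean = 0, sd = 1)
z <- rgamma(n, shape = 1)
c_cov <- rgamma(n, shape = 1)

# Assumed coefficient ordering, for illustration only.
beta  <- c(1, -2, 0.5)                          # true mediator: intercept, x, c
gamma <- matrix(c(1, 1, -0.5, -1.5), nrow = 2)  # column j: intercept, z slope given true M = j
theta <- c(1, 1.5, -2, -0.2)                    # outcome: intercept, x, m, c

# True mediator, coded 1/2 with category 2 as the reference.
p_m1 <- plogis(beta[1] + beta[2] * x + beta[3] * c_cov)
true_mediator <- ifelse(rbinom(n, 1, p_m1) == 1, 1, 2)

# Observed mediator: P(M* = 1 | M = j, z) depends on the true category j.
p_obs1 <- plogis(gamma[1, true_mediator] + gamma[2, true_mediator] * z)
obs_mediator <- ifelse(rbinom(n, 1, p_obs1) == 1, 1, 2)

# Bernoulli outcome, using an indicator for true mediator category 1.
m_indicator <- as.numeric(true_mediator == 1)
p_y <- plogis(theta[1] + theta[2] * x + theta[3] * m_indicator + theta[4] * c_cov)
outcome <- rbinom(n, 1, p_y)

mean(obs_mediator != true_mediator)  # empirical misclassification rate
```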
Jointly estimate β, γ, and θ parameters from
the true mediator, observed mediator, and outcome mechanisms, respectively,
in a binary mediator misclassification model.

    COMMA_EM(
      Mstar, outcome, outcome_distribution, interaction_indicator,
      x_matrix, z_matrix, c_matrix,
      beta_start, gamma_start, theta_start, sigma_start = NULL,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem"
    )

| Argument | Description |
|---|---|
| Mstar | A numeric vector of indicator variables (1, 2) for the observed mediator. |
| outcome | A vector containing the outcome variables of interest. There should be no missing (NA) values. |
| outcome_distribution | A character string specifying the distribution of the outcome variable, such as "Bernoulli" or "Normal". |
| interaction_indicator | A logical value indicating if an interaction between x and m is included in the outcome mechanism. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| beta_start | A numeric vector or column matrix of starting values for the β parameters in the true mediator mechanism. |
| gamma_start | A numeric vector or matrix of starting values for the γ parameters in the observation mechanism. |
| theta_start | A numeric vector or column matrix of starting values for the θ parameters in the outcome mechanism. |
| sigma_start | A numeric value specifying the starting value for the standard deviation. This value is only required if outcome_distribution is "Normal". The default is NULL. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |

COMMA_EM returns a data frame containing four columns. The first
column, Parameter, represents a unique parameter value for each row.
The next column contains the parameter Estimates, followed by the standard
error estimates, SE. The final column, Convergence, reports
whether or not the algorithm converged for a given parameter estimate.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Bernoulli",
                               true_beta, true_gamma, true_theta)

    # Starting values for the EM algorithm
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    EM_results <- COMMA_EM(Mstar, outcome, "Bernoulli", FALSE,
                           x_matrix, z_matrix, c_matrix,
                           beta_start, gamma_start, theta_start)

    EM_results

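
One common way to use a results table with Estimates and SE columns is to form Wald confidence intervals for the rows that converged. The sketch below does this on a made-up data frame that only borrows the documented column names. The parameter labels, the numbers, and the logical coding of Convergence are assumptions for illustration, not actual COMMA_EM output.

```r
# Hypothetical results with the same column names as the returned data frame;
# the values below are made up for illustration.
results <- data.frame(
  Parameter = c("beta_0", "beta_x", "theta_m"),
  Estimates = c(0.95, -1.90, -2.10),
  SE = c(0.10, 0.15, 0.30),
  Convergence = c(TRUE, TRUE, TRUE)
)

# Keep converged rows and add 95% Wald confidence limits.
converged <- results[results$Convergence, ]
z_crit <- qnorm(0.975)
converged$Lower <- converged$Estimates - z_crit * converged$SE
converged$Upper <- converged$Estimates + z_crit * converged$SE

converged
```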
Estimate Bootstrap Standard Errors using EM

    COMMA_EM_bootstrap_SE(
      parameter_estimates, sigma_estimate = 1, n_bootstrap, n_parallel,
      outcome_distribution, interaction_indicator,
      x_matrix, z_matrix, c_matrix,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem",
      random_seed = NULL
    )

| Argument | Description |
|---|---|
| parameter_estimates | A column matrix of β, γ, and θ parameter estimates. |
| sigma_estimate | A numeric value specifying the estimated standard deviation. This value is only required if outcome_distribution is "Normal". The default is 1. |
| n_bootstrap | A numeric value specifying the number of bootstrap samples to draw. |
| n_parallel | A numeric value specifying the number of parallel cores to run the computation on. |
| outcome_distribution | A character string specifying the distribution of the outcome variable, such as "Bernoulli" or "Normal". |
| interaction_indicator | A logical value indicating if an interaction between x and m is included in the outcome mechanism. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |
| random_seed | A numeric value specifying the random seed to set for bootstrap sampling. The default is NULL. |

COMMA_EM_bootstrap_SE returns a list with two elements: 1)
bootstrap_df and 2) bootstrap_SE. bootstrap_df is a data
frame containing COMMA_EM output for each bootstrap sample. bootstrap_SE
is a data frame containing bootstrap standard error estimates for each parameter.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Bernoulli",
                               true_beta, true_gamma, true_theta)

    # Starting values for the EM algorithm
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    EM_results <- COMMA_EM(Mstar, outcome, "Bernoulli", FALSE,
                           x_matrix, z_matrix, c_matrix,
                           beta_start, gamma_start, theta_start)

    EM_results

    # Bootstrap standard errors (only 3 samples, to keep the example fast)
    EM_SEs <- COMMA_EM_bootstrap_SE(EM_results$Estimates,
                                    sigma_estimate = NULL,
                                    n_bootstrap = 3,
                                    n_parallel = 1,
                                    outcome_distribution = "Bernoulli",
                                    interaction_indicator = FALSE,
                                    x_matrix, z_matrix, c_matrix,
                                    random_seed = 1)

    EM_SEs$bootstrap_SE

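
The relationship between the two returned elements can be sketched in base R: a bootstrap standard error is the standard deviation of a parameter's estimates across bootstrap samples. The example below uses simulated numbers in place of real COMMA_EM output. The column names `Parameter`, `Estimates`, and `Sample` and the parameter labels are assumptions for illustration, and the package may compute its summary differently.

```r
set.seed(1)
# Hypothetical stand-in for bootstrap_df: one row per parameter per bootstrap
# sample, with simulated estimates in place of real model output.
n_bootstrap <- 200
bootstrap_df <- data.frame(
  Parameter = rep(c("beta_0", "beta_x"), times = n_bootstrap),
  Estimates = rnorm(2 * n_bootstrap, mean = c(1, -2), sd = c(0.1, 0.2)),
  Sample = rep(seq_len(n_bootstrap), each = 2)
)

# Bootstrap standard error: standard deviation of the estimates per parameter.
bootstrap_SE <- aggregate(Estimates ~ Parameter, data = bootstrap_df, FUN = sd)
names(bootstrap_SE)[2] <- "SE"

bootstrap_SE
```

With only a handful of bootstrap samples, as in the fast example above, these standard deviations are very noisy, so a larger n_bootstrap is usually preferable in practice.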
Estimate β, γ, and θ parameters from
the true mediator, observed mediator, and outcome mechanisms, respectively,
in a binary mediator misclassification model using an ordinary least squares
correction.

    COMMA_OLS(
      Mstar, outcome, x_matrix, z_matrix, c_matrix,
      beta_start, gamma_start, theta_start,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem"
    )

| Argument | Description |
|---|---|
| Mstar | A numeric vector of indicator variables (1, 2) for the observed mediator. |
| outcome | A vector containing the outcome variables of interest. There should be no missing (NA) values. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| beta_start | A numeric vector or column matrix of starting values for the β parameters in the true mediator mechanism. |
| gamma_start | A numeric vector or matrix of starting values for the γ parameters in the observation mechanism. |
| theta_start | A numeric vector or column matrix of starting values for the θ parameters in the outcome mechanism. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |

Note that this method can only be used for Normal outcome models, and interaction
terms (between x and m) are not supported.
COMMA_OLS returns a data frame containing four columns. The first
column, Parameter, represents a unique parameter value for each row.
The next column contains the parameter Estimates. The third column,
Convergence, reports whether or not the algorithm converged for a
given parameter estimate. The final column, Method, reports
the procedure used to obtain the estimates.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, 2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Normal",
                               true_beta, true_gamma, true_theta)

    # Starting values
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    OLS_results <- COMMA_OLS(Mstar, outcome,
                             x_matrix, z_matrix, c_matrix,
                             beta_start, gamma_start, theta_start)

    OLS_results

Estimate Bootstrap Standard Errors using OLS

    COMMA_OLS_bootstrap_SE(
      parameter_estimates, sigma_estimate = 1, n_bootstrap, n_parallel,
      x_matrix, z_matrix, c_matrix,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem",
      random_seed = NULL
    )

| Argument | Description |
|---|---|
| parameter_estimates | A column matrix of β, γ, and θ parameter estimates. |
| sigma_estimate | A numeric value specifying the estimated standard deviation. The default is 1. |
| n_bootstrap | A numeric value specifying the number of bootstrap samples to draw. |
| n_parallel | A numeric value specifying the number of parallel cores to run the computation on. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |
| random_seed | A numeric value specifying the random seed to set for bootstrap sampling. The default is NULL. |

COMMA_OLS_bootstrap_SE returns a list with two elements: 1)
bootstrap_df and 2) bootstrap_SE. bootstrap_df is a data
frame containing COMMA_OLS output for each bootstrap sample. bootstrap_SE
is a data frame containing bootstrap standard error estimates for each parameter.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, 2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Normal",
                               true_beta, true_gamma, true_theta)

    # Starting values
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    OLS_results <- COMMA_OLS(Mstar, outcome,
                             x_matrix, z_matrix, c_matrix,
                             beta_start, gamma_start, theta_start)

    OLS_results

    # Bootstrap standard errors (only 3 samples, to keep the example fast)
    OLS_SEs <- COMMA_OLS_bootstrap_SE(OLS_results$Estimates,
                                      sigma_estimate = 1,
                                      n_bootstrap = 3,
                                      n_parallel = 1,
                                      x_matrix, z_matrix, c_matrix,
                                      random_seed = 1)

    OLS_SEs$bootstrap_SE

Estimate β, γ, and θ parameters from
the true mediator, observed mediator, and outcome mechanisms, respectively,
in a binary mediator misclassification model using a predictive value weighting
approach.

    COMMA_PVW(
      Mstar, outcome, outcome_distribution, interaction_indicator,
      x_matrix, z_matrix, c_matrix,
      beta_start, gamma_start, theta_start,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem"
    )

| Argument | Description |
|---|---|
| Mstar | A numeric vector of indicator variables (1, 2) for the observed mediator. |
| outcome | A vector containing the outcome variables of interest. There should be no missing (NA) values. |
| outcome_distribution | A character string specifying the distribution of the outcome variable. The example for this function uses "Bernoulli". |
| interaction_indicator | A logical value indicating if an interaction between x and m is included in the outcome mechanism. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| beta_start | A numeric vector or column matrix of starting values for the β parameters in the true mediator mechanism. |
| gamma_start | A numeric vector or matrix of starting values for the γ parameters in the observation mechanism. |
| theta_start | A numeric vector or column matrix of starting values for the θ parameters in the outcome mechanism. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |

Note that this method can only be used for binary outcome models.
COMMA_PVW returns a data frame containing four columns. The first
column, Parameter, represents a unique parameter value for each row.
The next column contains the parameter Estimates. The third column,
Convergence, reports whether or not the algorithm converged for a
given parameter estimate. The final column, Method, reports
that the estimates are obtained from the "PVW" procedure.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Bernoulli",
                               true_beta, true_gamma, true_theta)

    # Starting values
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    PVW_results <- COMMA_PVW(Mstar, outcome,
                             outcome_distribution = "Bernoulli",
                             interaction_indicator = FALSE,
                             x_matrix, z_matrix, c_matrix,
                             beta_start, gamma_start, theta_start)

    PVW_results

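
To give a feel for what "predictive values" mean in this setting, the sketch below computes them with Bayes' rule from a sensitivity, a specificity, and a prevalence, then turns them into per-subject weights on true mediator category 1. This is the simplest marginal version and is only an assumption-laden illustration: the input values are made up, and the package's procedure may also condition on covariates and the outcome.

```r
# Hypothetical inputs: prevalence of true category 1 and the classification
# probabilities of the observed mediator.
prevalence  <- 0.4   # P(M = 1)
sensitivity <- 0.85  # P(M* = 1 | M = 1)
specificity <- 0.90  # P(M* = 2 | M = 2)

# Positive predictive value: P(M = 1 | M* = 1).
ppv <- sensitivity * prevalence /
  (sensitivity * prevalence + (1 - specificity) * (1 - prevalence))

# Negative predictive value: P(M = 2 | M* = 2).
npv <- specificity * (1 - prevalence) /
  (specificity * (1 - prevalence) + (1 - sensitivity) * prevalence)

# Weight on true category 1 for each subject, given the observed mediator.
obs_mediator <- c(1, 2, 2, 1)
weight_m1 <- ifelse(obs_mediator == 1, ppv, 1 - npv)

weight_m1
```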
Estimate Bootstrap Standard Errors using PVW

    COMMA_PVW_bootstrap_SE(
      parameter_estimates, sigma_estimate, n_bootstrap, n_parallel,
      outcome_distribution, interaction_indicator,
      x_matrix, z_matrix, c_matrix,
      tolerance = 1e-07, max_em_iterations = 1500, em_method = "squarem",
      random_seed = NULL
    )

| Argument | Description |
|---|---|
| parameter_estimates | A column matrix of β, γ, and θ parameter estimates. |
| sigma_estimate | A numeric value specifying the estimated standard deviation. The example for this function passes NULL with a Bernoulli outcome. |
| n_bootstrap | A numeric value specifying the number of bootstrap samples to draw. |
| n_parallel | A numeric value specifying the number of parallel cores to run the computation on. |
| outcome_distribution | A character string specifying the distribution of the outcome variable. The example for this function uses "Bernoulli". |
| interaction_indicator | A logical value indicating if an interaction between x and m is included in the outcome mechanism. |
| x_matrix | A numeric matrix of predictors in the true mediator and outcome mechanisms. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| tolerance | A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is 1e-07. |
| max_em_iterations | An integer specifying the maximum number of iterations of the EM algorithm. The default is 1500. |
| em_method | A character string specifying which EM algorithm will be applied. The default is "squarem". |
| random_seed | A numeric value specifying the random seed to set for bootstrap sampling. The default is NULL. |

COMMA_PVW_bootstrap_SE returns a list with two elements: 1)
bootstrap_df and 2) bootstrap_SE. bootstrap_df is a data
frame containing COMMA_PVW output for each bootstrap sample. bootstrap_SE
is a data frame containing bootstrap standard error estimates for each parameter.

    library(COMMA)

    set.seed(20240709)
    sample_size <- 2000

    n_cat <- 2 # Number of categories in the binary mediator

    # Data generation settings
    x_mu <- 0
    x_sigma <- 1
    z_shape <- 1
    c_shape <- 1

    # True parameter values (gamma terms set the misclassification rate)
    true_beta <- matrix(c(1, -2, .5), ncol = 1)
    true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
    true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

    example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                               interaction_indicator = FALSE,
                               outcome_distribution = "Bernoulli",
                               true_beta, true_gamma, true_theta)

    # Starting values
    beta_start <- matrix(rep(1, 3), ncol = 1)
    gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
    theta_start <- matrix(rep(1, 4), ncol = 1)

    Mstar <- example_data[["obs_mediator"]]
    outcome <- example_data[["outcome"]]
    x_matrix <- example_data[["x"]]
    z_matrix <- example_data[["z"]]
    c_matrix <- example_data[["c"]]

    PVW_results <- COMMA_PVW(Mstar, outcome,
                             outcome_distribution = "Bernoulli",
                             interaction_indicator = FALSE,
                             x_matrix, z_matrix, c_matrix,
                             beta_start, gamma_start, theta_start)

    PVW_results

    # Bootstrap standard errors (only 3 samples, to keep the example fast)
    PVW_SEs <- COMMA_PVW_bootstrap_SE(PVW_results$Estimates,
                                      sigma_estimate = NULL,
                                      n_bootstrap = 3,
                                      n_parallel = 1,
                                      outcome_distribution = "Bernoulli",
                                      interaction_indicator = FALSE,
                                      x_matrix, z_matrix, c_matrix,
                                      random_seed = 1)

    PVW_SEs$bootstrap_SE

Function is for cases with $Y \sim \text{Bernoulli}$ and with no interaction term
in the outcome mechanism.

    EM_function_bernoulliY(param_current, obs_mediator, obs_outcome, X, Z,
                           c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_bernoulliY returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
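
As a rough illustration of the flat parameter vector these EM functions take, the sketch below assembles one from starting values shaped like those in the package example. The ordering (beta, then gamma stacked column by column, then theta) is an assumption made for illustration, and `gamma_idx` and `gamma_back` are hypothetical names that do not belong to the package.

```r
# Hypothetical starting values; shapes follow the COMMA_PVW example.
beta_start  <- matrix(rep(1, 3), ncol = 1)            # true mediator mechanism
gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)  # observation mechanism
theta_start <- matrix(rep(1, 4), ncol = 1)            # outcome mechanism

# Assumed layout: beta first, gamma flattened column-wise, theta last.
param_current <- c(c(beta_start), c(gamma_start), c(theta_start))
length(param_current)

# Recover the gamma matrix from its slice of the flat vector.
gamma_idx  <- length(beta_start) + seq_along(gamma_start)
gamma_back <- matrix(param_current[gamma_idx], nrow = nrow(gamma_start))
```

Flattening with `c()` is column-major in R, so rebuilding the matrix with the same `nrow` restores the original layout.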
Function is for cases with $Y \sim \text{Bernoulli}$ and with an interaction term
in the outcome mechanism.

    EM_function_bernoulliY_XM(param_current, obs_mediator, obs_outcome, X, Z,
                              c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_bernoulliY_XM returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
Function is for cases with $Y \sim \text{Normal}$ and with no interaction term
in the outcome mechanism.

    EM_function_normalY(param_current, obs_mediator, obs_outcome, X, Z,
                        c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_normalY returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
Function is for cases with $Y \sim \text{Normal}$ and with an interaction term
in the outcome mechanism.

    EM_function_normalY_XM(param_current, obs_mediator, obs_outcome, X, Z,
                           c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_normalY_XM returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
Function is for cases with $Y \sim \text{Poisson}$ and without an interaction term
in the outcome mechanism.

    EM_function_poissonY(param_current, obs_mediator, obs_outcome, X, Z,
                         c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_poissonY returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
Function is for cases with $Y \sim \text{Poisson}$ and with an interaction term
in the outcome mechanism.

    EM_function_poissonY_XM(param_current, obs_mediator, obs_outcome, X, Z,
                            c_matrix, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_current | A numeric vector of regression parameters, in the order $\beta, \gamma, \theta$. |
| obs_mediator | A numeric vector of indicator variables (1, 2) for the observed mediator $M^*$. |
| obs_outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| X | A numeric design matrix for the true mediator mechanism. |
| Z | A numeric design matrix for the observation mechanism. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X or Z. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

EM_function_poissonY_XM returns a numeric vector of updated parameter
estimates from one iteration of the EM-algorithm.
Compute the conditional probability of each observed mediator category
$M^* = k$ given each latent true mediator category $M = j$,
for each of the n subjects.

    misclassification_prob(gamma_matrix, z_matrix)

| Argument | Description |
|---|---|
| gamma_matrix | A numeric matrix of estimated regression parameters for the observation mechanism. |
| z_matrix | A numeric matrix of covariates in the observation mechanism. |

misclassification_prob returns a data frame containing four columns.
The first column, Subject, represents the subject ID, from 1 to n,
where n is the sample size, or equivalently, the number of rows in z_matrix.
The second column, M, represents a true, latent mediator category $j \in \{1, 2\}$.
The third column, Mstar, represents an observed mediator category $k \in \{1, 2\}$.
The last column, Probability, is the conditional probability $P(M^* = k \mid M = j, Z)$
computed for each subject, observed mediator category, and true, latent mediator category.
    set.seed(123)
    sample_size <- 1000
    cov1 <- rnorm(sample_size)
    cov2 <- rnorm(sample_size, 1, 2)
    z_matrix <- matrix(c(cov1, cov2), nrow = sample_size, byrow = FALSE)
    estimated_gammas <- matrix(c(1, -1, .5, .2, -.6, 1.5), ncol = 2)
    P_Ystar_M <- misclassification_prob(estimated_gammas, z_matrix)
    head(P_Ystar_M)
Example data from the National Vital Statistics System of the National Center for Health Statistics (NCHS), 2022
    NCHS2022_sample

A data frame with 30 columns, including demographic and birth information for a random sample of 20,000 singleton births from nulliparous mothers in the US in 2022.
https://data.nber.org/nvss/natality/inputs/raw/2022/
    ## Not run:
    data("NCHS2022_sample")
    head(NCHS2022_sample)
    ## End(Not run)
Compute Probability of Each True Outcome, for Every Subject
    pi_compute(beta, X, n, n_cat)

| Argument | Description |
|---|---|
| beta | A numeric column matrix of regression parameters for the true outcome mechanism. |
| X | A numeric design matrix. |
| n | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, X. |
| n_cat | The number of categorical values that the true outcome can take. |

pi_compute returns a matrix of true outcome probabilities
for each of the n subjects. Rows of the matrix
correspond to each subject. Columns of the matrix correspond to the true outcome
categories $j = 1, \dots,$ n_cat.
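
The shape of this return value can be illustrated with a small base R sketch. It assumes a binary logit link with the second category as the reference, which is not confirmed here, and `pi_sketch` is a hypothetical helper rather than the package's pi_compute.

```r
# Hypothetical helper illustrating an n x 2 probability matrix.
pi_sketch <- function(beta, X) {
  eta <- as.vector(X %*% beta)   # linear predictor, one value per subject
  p1 <- 1 / (1 + exp(-eta))      # assumed logit link for category 1
  cbind(p1, 1 - p1)              # column j holds the probability of category j
}

set.seed(1)
n <- 5
X <- cbind(1, rnorm(n))               # intercept column plus one covariate
beta <- matrix(c(0.5, -1), ncol = 1)
pi_matrix <- pi_sketch(beta, X)
dim(pi_matrix)       # n rows, one column per category
rowSums(pi_matrix)   # each subject's probabilities sum to 1
```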
Compute Conditional Probability of Each Observed Outcome Given Each True Outcome, for Every Subject
    pistar_compute(gamma, Z, n, n_cat)

| Argument | Description |
|---|---|
| gamma | A numeric matrix of regression parameters for the observed outcome mechanism. |
| Z | A numeric design matrix. |
| n | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix, Z. |
| n_cat | The number of categorical values that the true outcome can take. |

pistar_compute returns a matrix of conditional probabilities
for each of the n subjects. Rows of the matrix
correspond to each subject and observed outcome. Specifically, the probability
for subject $i$ and observed category $1$ occurs at row $i$. The probability
for subject $i$ and observed category $2$ occurs at row $i +$ n.
Columns of the matrix correspond to the true outcome categories $j = 1, \dots,$ n_cat.
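
The stacked row layout is easier to see in a toy example. The sketch below assumes a logit link and one column of gamma per true category; `pistar_sketch` is a hypothetical stand-in, not the package function.

```r
# Hypothetical helper illustrating the stacked 2n x n_cat layout.
pistar_sketch <- function(gamma, Z) {
  p_obs1 <- 1 / (1 + exp(-(Z %*% gamma)))  # n x n_cat: P(observed 1 | true j)
  rbind(p_obs1, 1 - p_obs1)                # rows 1..n: observed category 1
}                                          # rows n+1..2n: observed category 2

set.seed(2)
n <- 4
Z <- cbind(1, rnorm(n))                         # intercept plus one covariate
gamma <- matrix(c(1, 0.5, -1, -0.5), nrow = 2)  # one column per true category
pistar_matrix <- pistar_sketch(gamma, Z)

i <- 3
# For a fixed subject and true category, the two observed-category
# probabilities sit n rows apart and sum to 1.
pistar_matrix[i, ] + pistar_matrix[i + n, ]
```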
Sum Every "n"th Element
    sum_every_n(x, n)

| Argument | Description |
|---|---|
| x | A numeric vector to sum over. |
| n | A numeric value specifying the distance between the reference index and the next index to be summed. |

sum_every_n returns a vector of sums of every nth element of the vector x.
Sum Every "n"th Element, then add 1
    sum_every_n1(x, n)

| Argument | Description |
|---|---|
| x | A numeric vector to sum over. |
| n | A numeric value specifying the distance between the reference index and the next index to be summed. |

sum_every_n1 returns a vector of sums of every nth element of the vector x, plus 1.
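
One plausible reading of these two helpers is that element i of the result adds x[i], x[i + n], x[i + 2n], and so on. The sketch below implements that reading in base R; the `_sketch` functions are hypothetical, and the package's own implementation may differ.

```r
# Hypothetical re-implementation under the reading described above.
sum_every_n_sketch <- function(x, n) {
  stopifnot(length(x) %% n == 0)              # x must split into blocks of n
  colSums(matrix(x, ncol = n, byrow = TRUE))  # add the blocks element-wise
}

sum_every_n1_sketch <- function(x, n) {
  sum_every_n_sketch(x, n) + 1                # same sums, shifted by 1
}

x <- 1:6
res  <- sum_every_n_sketch(x, 3)   # adds 1+4, 2+5, 3+6
res1 <- sum_every_n1_sketch(x, 3)
```

Filling the matrix by row places each block of n consecutive values on its own row, so the column sums pick up elements that are n positions apart.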
Likelihood Function for Normal Outcome Mechanism with a Binary Mediator
    theta_optim(param_start, m, x, c_matrix, outcome, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_start | A numeric vector or column matrix of starting values for the outcome mechanism parameters. |
| m | A vector or column matrix containing the true binary mediator or the E-step weight (with values between 0 and 1). There should be no NA terms. |
| x | A vector or column matrix of the predictor or exposure of interest. There should be no NA terms. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

theta_optim returns a numeric value of the (negative) log-likelihood function.
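
To show how a negative log-likelihood of this kind can be minimized, here is a minimal sketch with a Normal outcome and a mediator weight between 0 and 1. The parameterization (intercept, exposure effect, mediator effect, log standard deviation), the omission of extra covariates, and the name `nll_sketch` are all assumptions for illustration, not the package's theta_optim.

```r
# Hypothetical weighted negative log-likelihood for a Normal outcome.
# params = (theta_0, theta_x, theta_m, log_sigma); w = weight that M is "on".
nll_sketch <- function(params, w, x, y) {
  sigma <- exp(params[4])                          # keeps sigma positive
  mu_m0 <- params[1] + params[2] * x               # mean when mediator is off
  mu_m1 <- mu_m0 + params[3]                       # mean when mediator is on
  -sum(w * dnorm(y, mu_m1, sigma, log = TRUE) +
       (1 - w) * dnorm(y, mu_m0, sigma, log = TRUE))
}

set.seed(3)
n <- 500
x <- rnorm(n)
m <- rbinom(n, 1, 0.4)                 # simulated mediator used as the weight
y <- 1 + 1.5 * x - 2 * m + rnorm(n)

start <- c(0, 0, 0, 0)
fit <- optim(start, nll_sketch, w = m, x = x, y = y, method = "BFGS")
fit$par
```

Passing fractional E-step weights in place of the 0/1 mediator turns the same function into the weighted objective of an M-step.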
Likelihood Function for Normal Outcome Mechanism with a Binary Mediator and an Interaction Term
    theta_optim_XM(param_start, m, x, c_matrix, outcome, sample_size, n_cat)

| Argument | Description |
|---|---|
| param_start | A numeric vector or column matrix of starting values for the outcome mechanism parameters. |
| m | A vector or column matrix containing the true binary mediator or the E-step weight (with values between 0 and 1). There should be no NA terms. |
| x | A vector or column matrix of the predictor or exposure of interest. There should be no NA terms. |
| c_matrix | A numeric matrix of covariates in the true mediator and outcome mechanisms. |
| outcome | A vector containing the outcome variables of interest. There should be no NA terms. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the design matrix. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

theta_optim_XM returns a numeric value of the (negative) log-likelihood function.
Compute the probability of each latent true mediator category $M = j$
for each of the n subjects.

    true_classification_prob(beta_matrix, x_matrix)

| Argument | Description |
|---|---|
| beta_matrix | A numeric column matrix of estimated regression parameters for the true mediator mechanism. |
| x_matrix | A numeric matrix of covariates in the true mediator mechanism. |

true_classification_prob returns a data frame containing three columns.
The first column, Subject, represents the subject ID, from 1 to n,
where n is the sample size, or equivalently, the number of rows in x_matrix.
The second column, M, represents a true, latent mediator category $j \in \{1, 2\}$.
The last column, Probability, is the probability $P(M = j \mid X)$ computed
for each subject and true, latent mediator category.
    set.seed(123)
    sample_size <- 1000
    cov1 <- rnorm(sample_size)
    cov2 <- rnorm(sample_size, 1, 2)
    x_matrix <- matrix(c(cov1, cov2), nrow = sample_size, byrow = FALSE)
    estimated_betas <- matrix(c(1, -1, .5), ncol = 1)
    P_M <- true_classification_prob(estimated_betas, x_matrix)
    head(P_M)
Note that this function should only be used for Binary outcome models.
    w_m_binaryY(mstar_matrix, outcome_matrix, pistar_matrix, pi_matrix,
                p_yi_m0, p_yi_m1, sample_size, n_cat)

| Argument | Description |
|---|---|
| mstar_matrix | A numeric matrix of indicator variables (0, 1) for the observed mediator $M^*$. |
| outcome_matrix | A numeric matrix of indicator variables (0, 1) for the observed outcome. |
| pistar_matrix | A numeric matrix of conditional probabilities obtained from the internal function pistar_compute. |
| pi_matrix | A numeric matrix of probabilities obtained from the internal function pi_compute. |
| p_yi_m0 | A numeric vector of outcome probabilities computed assuming a true mediator value of 0. |
| p_yi_m1 | A numeric vector of outcome probabilities computed assuming a true mediator value of 1. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the observed mediator matrix, mstar_matrix. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

w_m_binaryY returns a matrix of E-step weights for the EM-algorithm.
Rows of the matrix correspond to each subject. Columns of the matrix correspond
to the true mediator categories n_cat.
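
The E-step weights described here can be sketched with Bayes' rule: for each subject, multiply the probability of the observed mediator given each true category, the outcome likelihood given each true category, and the prior probability of that category, then normalize across categories. The function `e_step_sketch` and its three inputs are hypothetical simplifications, not the package's w_m_binaryY signature.

```r
# Hypothetical E-step: posterior weight of each true mediator category.
# Every argument is an n x 2 matrix; column j refers to true category j.
e_step_sketch <- function(p_mstar_given_m, p_y_given_m, p_m) {
  joint <- p_mstar_given_m * p_y_given_m * p_m  # unnormalized posterior
  joint / rowSums(joint)                        # normalize within each subject
}

set.seed(4)
n <- 6
p1 <- runif(n)
p_m <- cbind(p1, 1 - p1)                           # prior over true categories
p_mstar_given_m <- matrix(runif(2 * n), ncol = 2)  # P(observed M* | true M = j)
p_y_given_m <- matrix(runif(2 * n), ncol = 2)      # P(outcome | true M = j)

weights <- e_step_sketch(p_mstar_given_m, p_y_given_m, p_m)
rowSums(weights)   # one weight per category, summing to 1 per subject
```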
Note that this function should only be used for Normal outcome models.
    w_m_normalY(mstar_matrix, pistar_matrix, pi_matrix,
                p_yi_m0, p_yi_m1, sample_size, n_cat)

| Argument | Description |
|---|---|
| mstar_matrix | A numeric matrix of indicator variables (0, 1) for the observed mediator $M^*$. |
| pistar_matrix | A numeric matrix of conditional probabilities obtained from the internal function pistar_compute. |
| pi_matrix | A numeric matrix of probabilities obtained from the internal function pi_compute. |
| p_yi_m0 | A numeric vector of Normal outcome likelihoods computed assuming a true mediator value of 0. |
| p_yi_m1 | A numeric vector of Normal outcome likelihoods computed assuming a true mediator value of 1. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the observed mediator matrix, mstar_matrix. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

w_m_normalY returns a matrix of E-step weights for the EM-algorithm.
Rows of the matrix correspond to each subject. Columns of the matrix correspond
to the true mediator categories n_cat.
Note that this function should only be used for Poisson outcome models.
    w_m_poissonY(mstar_matrix, outcome_matrix, pistar_matrix, pi_matrix,
                 p_yi_m0, p_yi_m1, sample_size, n_cat)

| Argument | Description |
|---|---|
| mstar_matrix | A numeric matrix of indicator variables (0, 1) for the observed mediator $M^*$. |
| outcome_matrix | A numeric matrix of indicator variables (0, 1) for the observed outcome. |
| pistar_matrix | A numeric matrix of conditional probabilities obtained from the internal function pistar_compute. |
| pi_matrix | A numeric matrix of probabilities obtained from the internal function pi_compute. |
| p_yi_m0 | A numeric vector of outcome probabilities computed assuming a true mediator value of 0. |
| p_yi_m1 | A numeric vector of outcome probabilities computed assuming a true mediator value of 1. |
| sample_size | An integer value specifying the number of observations in the sample. This value should be equal to the number of rows of the observed mediator matrix, mstar_matrix. |
| n_cat | The number of categorical values that the true mediator, $M$, and the observed mediator, $M^*$, can take. |

w_m_poissonY returns a matrix of E-step weights for the EM-algorithm.
Rows of the matrix correspond to each subject. Columns of the matrix correspond
to the true mediator categories n_cat.