| Title: | Propensity Score Methods for Survival Analysis |
| Version: | 0.1.0 |
| Description: | Implements propensity score weighting methods for estimating counterfactual survival functions and marginal hazard ratios in observational studies with time-to-event outcomes. Supports binary and multiple treatment groups with average treatment effect on the combined full population (ATE), average treatment effect on the treated or target group (ATT), and overlap weighting estimands. Includes symmetric (Crump) and asymmetric (Sturmer) trimming options for extreme propensity scores. Variance estimation via analytical M-estimation or bootstrap. Methods based on Cheng et al. (2022) <doi:10.1093/aje/kwac043> and Li & Li (2019) <doi:10.1214/19-AOAS1282>. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| URL: | https://github.com/cxinyang/PSsurvival |
| BugReports: | https://github.com/cxinyang/PSsurvival/issues |
| Depends: | R (≥ 3.5.0) |
| Imports: | survival, stats, utils |
| Suggests: | nnet, ggplot2, parallel, testthat (≥ 3.0.0), knitr, rmarkdown |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.3.2 |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-12-02 10:59:23 UTC; chengxinyang |
| Author: | Chengxin Yang [aut, cre], Chao Cheng [aut], Fan Li [aut], Fan Li [aut] |
| Maintainer: | Chengxin Yang <chengxin.yang@duke.edu> |
| Repository: | CRAN |
| Date/Publication: | 2025-12-09 07:40:18 UTC |
Propensity Score Weighting for PSsurvival Package
Description
Functions for calculating propensity score weights with optional trimming for binary and multiple treatment groups. Calculate Inverse Probability Weights (IPW)
Calculates inverse probability weights for binary or multiple treatment groups. For ATE (Average Treatment Effect), weights are 1/e_j(X) for each treatment j. For ATT (Average Treatment Effect on the Treated), weights target a specific treatment group.
Usage
calculate_ipw_weights(
ps_result,
data,
treatment_var,
estimand = "ATE",
att_group = NULL
)
Arguments
ps_result |
A list returned by
|
data |
A data.frame containing the treatment variable. |
treatment_var |
A character string specifying the name of the treatment
variable in |
estimand |
Character string specifying the estimand. One of "ATE" (default) or "ATT" (Average Treatment Effect on the Treated). |
att_group |
For ATT estimation, specifies which treatment group to target. This is MANDATORY when estimand = "ATT". |
Details
For ATE with multiple treatments, weights are:
w_j(X_i) = \frac{1}{e_j(X_i)} \text{ for individual i in treatment j}
For ATT targeting treatment k, weights are:
w_j(X_i) = \begin{cases}
1 & \text{if } j = k \\
\frac{e_k(X_i)}{e_j(X_i)} & \text{if } j \neq k
\end{cases}
Value
A numeric vector of IPW weights with length equal to nrow(data).
Calculate Overlap Weights
Description
Calculates overlap weights using the Li & Li (2019) formula for binary or multiple treatment groups. Overlap weights target the population with the most equipoise and are bounded between 0 and 1.
Usage
calculate_overlap_weights(ps_result, data, treatment_var)
Arguments
ps_result |
A list returned by
|
data |
A data.frame containing the treatment variable. |
treatment_var |
A character string specifying the name of the treatment
variable in |
Details
For multiple treatments, overlap weights are calculated as:
w_j(X_i) = \frac{1/e_j(X_i)}{\sum_{l=1}^{J} 1/e_l(X_i)}
For binary treatment (J=2), this reduces to:
w(X) = 1 - P(Z = observed | X)
which equals the probability of receiving the opposite treatment.
Value
A numeric vector of overlap weights with length equal to nrow(data).
References
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Check Data Structure
Description
Validates that the data structure is appropriate for model fitting. Only stops if models cannot be fit numerically.
Usage
check_data_structure(data, treatment_var, ps_formula, censor_formula)
Arguments
data |
A data.frame containing the analysis data. |
treatment_var |
Character string of treatment variable name. |
ps_formula |
Formula object for propensity score model. |
censor_formula |
Formula object for censoring model. |
Value
Invisible NULL if checks pass; otherwise throws an error.
Check if Variables Exist in Data
Description
Verifies that all variables referenced in formulas exist in the dataset.
Usage
check_variables_exist(data, treatment_var, ps_formula, censor_formula)
Arguments
data |
A data.frame containing the analysis data. |
treatment_var |
Character string of treatment variable name. |
ps_formula |
Formula object for propensity score model. |
censor_formula |
Formula object for censoring model. |
Value
Invisible NULL if all checks pass; otherwise throws an error.
Survival Effect Estimation with Weibull Censoring Scores
Description
Implements Estimator I (AJE 2022, Eq. 2) combining propensity score weighting and inverse probability of censoring weighting to estimate counterfactual survival functions and treatment effects. Compute Etau Normalization Constant
Usage
compute_etau(
estimand,
method,
ps_matrix,
n_levels,
att_group = NULL,
treatment_levels = NULL
)
Arguments
estimand |
Estimand type ("ATE", "ATT", or for overlap). |
method |
Method ("IPW" or "overlap"). |
ps_matrix |
Propensity score matrix [n x J]. |
n_levels |
Number of treatment levels. |
att_group |
Target group for ATT (if applicable). |
treatment_levels |
Vector of treatment levels. |
Value
Numeric scalar Etau.
Estimate Censoring Scores Using Cox Regression
Description
Estimate Censoring Scores Using Cox Regression
Usage
estimate_censoring_score_cox(
data,
time_var,
treatment_var,
formula,
control = list(),
ties = "efron"
)
Arguments
data |
Data frame. |
time_var |
Name of time variable. |
treatment_var |
Name of treatment variable. |
formula |
Censoring model formula. Use |
control |
Control parameters for |
ties |
Tie handling method. Default "efron". |
Details
Fits Cox models within each treatment group. Censoring scores computed as:
K_c^{(j)}(t, X) = \exp(-H_0^{(j)}(t) \cdot \exp(\beta_j' X))
where H_0^{(j)}(t) is cumulative baseline hazard. Baseline hazards
evaluated at nearest time point for each individual.
Value
List with class "censoring_score_cox":
censoring_models |
Fitted |
censoring_scores |
P(C >= T_i | Z_i, X_i) for observed treatment. |
censoring_matrix |
(n x J) matrix of P(C >= T_i | Z=j, X_i). |
n_levels |
Number of treatment levels. |
treatment_levels |
Sorted treatment values. |
model_type |
"cox". |
baseline_hazards |
Baseline cumulative hazards by treatment level. |
coef_list |
Coefficient vectors by treatment level. |
vcov_list |
Variance-covariance matrices by treatment level. |
linear_predictors_matrix |
(n x J) matrix of linear predictors. |
Censoring Score Estimation
Description
Estimate censoring scores P(C >= T | X) using Weibull or Cox models fit separately within each treatment group. Estimate Censoring Scores Using Weibull Regression
Usage
estimate_censoring_score_weibull(
data,
time_var,
treatment_var,
formula,
control = list(maxiter = 350)
)
Arguments
data |
Data frame. |
time_var |
Name of time variable. |
treatment_var |
Name of treatment variable. |
formula |
Censoring model formula. Use |
control |
Control parameters for |
Details
Fits Weibull models within each treatment group. Censoring scores computed as:
K_c^{(j)}(t, X) = \exp(-\exp(X'\theta_j) \cdot t^{\gamma_j})
where \theta_j = -\beta_j/\sigma_j, \gamma_j = 1/\sigma_j.
Value
List with class "censoring_score_weibull":
censoring_models |
Fitted |
censoring_scores |
P(C >= T_i | Z_i, X_i) for observed treatment. |
censoring_matrix |
(n x J) matrix of P(C >= T_i | Z=j, X_i). |
n_levels |
Number of treatment levels. |
treatment_levels |
Sorted treatment values. |
model_type |
"weibull". |
parameters |
Transformed parameters ( |
linear_predictors_matrix |
(n x J) matrix of linear predictors. |
Propensity Score Estimation for PSsurvival Package
Description
Functions for estimating propensity scores for binary and multiple treatment groups. Estimate Propensity Scores
Fits a propensity score model and extracts propensity scores for binary or multiple treatment groups. For binary treatments, uses binomial logistic regression. For multiple treatments (>2 levels), uses multinomial logistic regression to estimate generalized propensity scores.
Usage
estimate_ps(data, treatment_var, ps_formula, ps_control = list())
Arguments
data |
A data.frame containing the analysis data (typically the cleaned data with complete cases). |
treatment_var |
A character string specifying the name of the treatment
variable in |
ps_formula |
A formula object for the propensity score model, of the
form |
ps_control |
An optional list of control parameters to pass to the
model fitting function ( |
Details
Propensity Score Definition: Returns P(Z = observed | X) for each individual, not P(Z=1|X) for all (as in Rosenbaum & Rubin 1983). This definition enables direct use in IPW and extends naturally to multiple treatments.
Binary Treatments (2 levels):
Fits binomial logistic regression via glm(). Treatment is factorized
with levels sorted by sort(): numerically for numeric, alphabetically
for character, by factor level order for factor. Returns P(Z = observed | X).
Multiple Treatments (>2 levels):
Fits multinomial logistic regression via nnet::multinom(). Returns
P(Z = observed | X) for each individual from the generalized PS matrix.
Control Parameters (ps_control):
Binary:
glm.control()parameters (default:epsilon=1e-08, maxit=25)Multiple:
multinom()parameters (default:MaxNWts=10000, maxit=100, trace=FALSE)
Value
A list with the following components:
ps_model |
The fitted propensity score model object (class |
ps |
A numeric vector of propensity scores representing the probability
of receiving the actual treatment each individual received. Length equals
the number of rows in |
ps_matrix |
A numeric matrix of dimension n × K where n is the number of observations and K is the number of treatment levels. Each row contains the predicted probabilities for all treatment levels. Column names correspond to treatment levels. |
n_levels |
An integer indicating the number of treatment levels. |
treatment_levels |
A vector of unique treatment values sorted by
|
Examples
# Example 1: Binary treatment
data(simdata_bin)
ps_bin <- estimate_ps(
data = simdata_bin,
treatment_var = "Z",
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
summary(ps_bin$ps)
table(simdata_bin$Z)
# Example 2: Multiple treatments
data(simdata_multi)
ps_multi <- estimate_ps(
data = simdata_multi,
treatment_var = "Z",
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
head(ps_multi$ps_matrix)
Estimate Propensity Score Weights
Description
Calculates propensity score weights for causal inference with optional trimming. Supports ATE, ATT, and overlap population estimands for binary and multiple treatment groups.
Usage
estimate_weights(
ps_result,
data,
treatment_var,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL
)
Arguments
ps_result |
A list returned by |
data |
A data.frame containing the treatment variable (same data used
in |
treatment_var |
A character string specifying the name of the treatment
variable in |
estimand |
Character string specifying the target population. One of:
|
att_group |
For ATT estimation, specifies which treatment group to target. This is MANDATORY when estimand = "ATT". Ignored for other estimands. |
trim |
Character string specifying the trimming method, or NULL for no trimming (default). Options: "symmetric" (Crump extension) or "asymmetric" (Sturmer extension). Trimming is NOT supported with overlap estimand. |
delta |
Trimming threshold for symmetric trimming in (0, 1/J], where J is the number of treatment levels. If NULL (default), uses recommended values from Yoshida et al. (2019). Ignored unless trim = "symmetric". |
alpha |
Percentile threshold for asymmetric trimming in (0, 0.5). If NULL (default), uses recommended values from Yoshida et al. (2019). Ignored unless trim = "asymmetric". |
Details
Trimming Workflow: When trimming is requested, the function: (1) identifies observations to trim using PS from full data, (2) re-estimates PS on trimmed data, (3) calculates weights from re-estimated PS. This ensures trimming uses the original covariate distribution while weights reflect the overlapping population.
Overlap weights do not support trimming (already bounded in [0,1]).
Value
A list containing:
weights |
Numeric vector of weights (length = nrow(data)). |
trim_summary |
Data frame with trimming summary by treatment group. |
ess |
Named numeric vector of effective sample size by treatment group. |
method |
Character string: "IPW" for ATE/ATT, "overlap" for overlap. |
estimand |
Character string of estimand used. |
att_group |
Target group for ATT (NULL if not applicable). |
trim_method |
Character string of trimming method (NULL if no trimming). |
delta |
Numeric trimming threshold used for symmetric trimming (NULL if not applicable). |
alpha |
Numeric percentile threshold used for asymmetric trimming (NULL if not applicable). |
n_levels |
Number of treatment levels. |
treatment_levels |
Vector of treatment level values. |
ps_result |
PS result object (refitted after trimming if trimming was applied). |
References
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Crump, R. K., et al. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187-199.
Examples
# Example 1: Overlap weighting for binary treatment
data(simdata_bin)
ps_bin <- estimate_ps(
data = simdata_bin,
treatment_var = "Z",
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
weights_ow <- estimate_weights(
ps_result = ps_bin,
data = simdata_bin,
treatment_var = "Z",
estimand = "overlap"
)
summary(weights_ow$weights)
# Example 2: ATT with multiple treatments
data(simdata_multi)
ps_multi <- estimate_ps(
data = simdata_multi,
treatment_var = "Z",
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
weights_att <- estimate_weights(
ps_result = ps_multi,
data = simdata_multi,
treatment_var = "Z",
estimand = "ATT",
att_group = "C"
)
summary(weights_att$weights)
Marginal Cox Model Estimation with Propensity Score Weighting
Description
Fits a weighted marginal Cox proportional hazards model to estimate marginal hazard ratios between treatment groups using propensity score weights. Fit Weighted Marginal Cox Model
Fits a marginal Cox model 'Surv(time, event) ~ treatment' with propensity score weights to estimate marginal hazard ratios. Automatically handles zero weights (from trimming) by subsetting data before fitting.
Usage
fit_marginal_cox(
data,
treatment_var,
time_var,
event_var,
weights,
treatment_levels,
reference_level,
robust = TRUE,
functionality = "main"
)
Arguments
data |
A data.frame containing the complete-case analysis data. |
treatment_var |
A character string specifying the name of the treatment
variable in |
time_var |
A character string specifying the name of the time variable
in |
event_var |
A character string specifying the name of the event variable
in |
weights |
A numeric vector of propensity score weights with length equal
to nrow(data). Returned from |
treatment_levels |
A vector of unique treatment values (sorted). Should
match the levels from |
reference_level |
Which treatment level to use as reference in the Cox model. MANDATORY parameter. |
robust |
Logical. Use robust (sandwich) variance estimator? Default TRUE.
When TRUE, uses |
functionality |
Character string indicating purpose: "main" for main point estimation or "boot" for bootstrap. Default "main". Controls error handling behavior when groups are missing or have no events. |
Details
Functionality Modes:
**"main" mode (point estimation):** - If reference group is missing or has no events after trimming: throws error - If non-reference group is missing: sets its HR and SE to NA, continues - If non-reference group has no events: sets its HR and SE to NA if coxph fails - Does NOT suppress warnings/messages from coxph
**"boot" mode (bootstrap):** - If reference group is missing or has no events: returns hr_estimates with all NA, no error - If non-reference group is missing: sets its HR and SE to NA - Does NOT suppress warnings (this is handled in bootstrap wrapper function)
Model Formula:
Fits Surv(time, event) ~ treatment where treatment is a factor with
k levels. Cox regression automatically creates k-1 dummy variables relative
to the reference level.
Zero Weights: coxph does not accept zero or negative weights. Observations with weight <= 0 are excluded before model fitting. This handles trimming automatically.
Coefficients:
Coefficients represent log hazard ratios. To get hazard ratios, use
exp(hr_estimates). A positive coefficient means higher hazard
(worse survival) compared to reference group.
Robust Variance:
When robust = TRUE, the sandwich variance estimator accounts for
weighting and provides more conservative standard errors. This is recommended
for propensity score weighted analyses.
Value
A list containing:
cox_model |
The fitted coxph model object. NULL if fitting failed in bootstrap mode. |
hr_estimates |
Named numeric vector of log hazard ratios (coefficients). Length = n_levels - 1. Names indicate which group is compared to reference (e.g., "trtB" means group B vs reference). Contains NA for groups that are missing in data or have no events. |
hr_se_robust |
Named numeric vector of robust standard errors for log HRs. Same length and names as hr_estimates. NA for failed groups. |
reference_level |
The treatment level used as reference. |
treatment_levels |
Vector of all treatment levels. |
n_levels |
Number of treatment levels. |
n_per_group_original |
Named numeric vector of sample sizes per group before trimming (from original data). |
n_per_group_used |
Named numeric vector of sample sizes per group actually used in Cox model (after excluding zero weights). |
events_per_group_original |
Named numeric vector of event counts per group before trimming. |
events_per_group_used |
Named numeric vector of event counts per group in fitted Cox model. |
Generate Bootstrap Sample Indices
Description
Helper function to generate bootstrap sample indices for either full sample resampling (observational studies) or stratified resampling within treatment groups (randomized controlled trials with fixed allocation).
Usage
generate_boot_indices(data, treatment_var, boot_level = "full")
Arguments
data |
Data frame to bootstrap from. |
treatment_var |
Name of treatment variable. |
boot_level |
Bootstrap sampling level: "full" (default) samples from entire dataset, "strata" samples within each treatment group preserving group sizes. |
Value
Integer vector of bootstrap indices of length nrow(data).
Marginal Cox Model with Propensity Score Weighting
Description
Main user interface for estimating marginal hazard ratios using propensity score weighting. Supports binary and multiple treatment groups with various weighting schemes (ATE, ATT, overlap) and optional trimming. Variance can be estimated via bootstrap or robust sandwich estimator.
Usage
marCoxph(
data,
ps_formula,
time_var,
event_var,
reference_level,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL,
variance_method = "bootstrap",
boot_level = "full",
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
ps_control = list(),
robust = TRUE
)
Arguments
data |
Data frame containing treatment, survival outcome, and covariates. |
ps_formula |
Formula for propensity score model: |
time_var |
Character string specifying the time-to-event variable name. |
event_var |
Character string specifying the event indicator variable name. Should be coded as 1=event, 0=censored. |
reference_level |
Treatment level to use as reference in Cox model. MANDATORY. Must be one of the treatment levels. |
estimand |
Target estimand: "ATE" (average treatment effect), "ATT" (average treatment effect on the treated), or "overlap" (overlap weighting). Default "ATE". |
att_group |
Target group for ATT. Required if |
trim |
Trimming method: "symmetric" or "asymmetric". Default NULL (no trimming). |
delta |
Threshold for symmetric trimming (e.g., 0.1). Required if |
alpha |
Percentile for asymmetric trimming (e.g., 0.05). Required if |
variance_method |
Variance estimation method: "bootstrap" (default) or "robust".
"bootstrap" resamples the entire analysis pipeline. "robust" uses the sandwich
variance estimator from |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
B |
Number of bootstrap iterations. Default 100. Used only if |
parallel |
Logical. Use parallel bootstrap computation? Default FALSE. |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
ps_control |
Control parameters for propensity score model. Default |
robust |
Logical. Use robust (sandwich) variance in Cox model fitting?
Default TRUE. When TRUE, |
Details
**Analysis Workflow:**
1. Extract treatment variable from ps_formula.
2. Estimate propensity scores using multinomial logistic regression (or logistic
for binary treatment).
3. Calculate propensity score weights based on estimand and optional trim.
4. Fit marginal Cox model Surv(time, event) ~ treatment with weights.
5. Estimate variance via bootstrap (resampling full pipeline) or robust sandwich
estimator.
**Variance Estimation:**
- bootstrap: Resamples data (full or stratified), re-estimates PS and weights,
re-fits Cox model. Provides bootstrap SE for log hazard ratios.
- robust: Uses robust sandwich variance from coxph() directly. No
bootstrap performed (faster but may be less accurate with extreme weights).
**Trimming:** - Symmetric: Crump extension for multiple treatments (Yoshida et al., 2019). - Asymmetric: Sturmer extension for multiple treatments (Yoshida et al., 2019). - Not supported with overlap weights (already bounded [0,1]).
Value
Object of class "marCoxph" containing:
coxph_fitted |
Fitted |
logHR_est |
Named vector of estimated log hazard ratios. Names are formatted as "treatment_var:level" (e.g., "Z:B" for treatment Z, level B vs reference). |
logHR_se_robust |
Named vector of robust standard errors from |
logHR_se_bootstrap |
Named vector of bootstrap standard errors. NULL if
|
n_coxph_fitted |
Named vector of sample sizes per treatment group used in Cox model fitting (after trimming). |
events_coxph_fitted |
Named vector of event counts per treatment group used in Cox model fitting (after trimming). |
variance_method |
Variance method used: "bootstrap-full", "bootstrap-strata", or "robust". |
estimand |
Target estimand used. |
att_group |
Target group for ATT (NULL if not applicable). |
trim_method |
Trimming method (NULL if no trimming). |
delta |
Symmetric trimming threshold (NULL if not applicable). |
alpha |
Asymmetric trimming threshold (NULL if not applicable). |
treatment_var |
Name of treatment variable. |
treatment_levels |
Sorted unique treatment values. |
reference_level |
Reference level used in Cox model. |
n_levels |
Number of treatment groups. |
n |
Number of complete cases used in analysis. |
ps_result |
Propensity score estimation results. |
weight_result |
Weight estimation results. |
boot_result |
Bootstrap results (NULL if |
References
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Examples
# Example 1: Binary treatment with overlap weighting
data(simdata_bin)
result1 <- marCoxph(
data = simdata_bin,
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
time_var = "time",
event_var = "event",
reference_level = "A",
estimand = "overlap"
)
summary(result1)
# Example 2: Multiple treatments with ATT and robust variance
data(simdata_multi)
result2 <- marCoxph(
data = simdata_multi,
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
time_var = "time",
event_var = "event",
reference_level = "C",
estimand = "ATT",
att_group = "C",
variance_method = "robust"
)
summary(result2)
Plot Method for surveff Objects
Description
Plot Method for surveff Objects
Usage
## S3 method for class 'surveff'
plot(
x,
type = "surv",
max_time = NULL,
strata_to_plot = NULL,
strata_colors = NULL,
conf_level = 0.95,
include_CI = TRUE,
curve_width = 1,
CI_alpha = 0.3,
legend_position = "right",
legend_title = NULL,
plot_title = NULL,
...
)
Arguments
x |
A |
type |
Type of plot: "surv" for survival curves or "survdiff" for treatment effect curves. Default "surv". |
max_time |
Maximum time to display on x-axis. If NULL, uses max(eval_times). |
strata_to_plot |
Vector of strata to plot. For |
strata_colors |
Vector of color names/codes for strata. Length must match strata_to_plot. Order matches strata order. If NULL, uses ggplot2 default colors. |
conf_level |
Confidence level for confidence intervals. Default 0.95. |
include_CI |
Logical. Include confidence interval ribbons? Default TRUE. |
curve_width |
Line width for survival/difference curves. Default 1. |
CI_alpha |
Transparency level for CI ribbons (0-1). Default 0.3. |
legend_position |
Position of legend: "right" or "bottom". Default "right". |
legend_title |
Title for legend. If NULL, uses "Treatment" for type="surv" or "Comparison" for type="survdiff". |
plot_title |
Plot title. If NULL, uses default title based on type. |
... |
Additional arguments (ignored). |
Details
Creates publication-ready plots of survival curves or treatment effects over time.
For type = "surv": Plots estimated survival functions with optional
confidence intervals. Y-axis ranges from 0 to 1.
For type = "survdiff": Plots estimated treatment effects (survival differences)
with optional confidence intervals. Y-axis is not constrained to [0,1].
Value
A ggplot2 object.
Print Method for marCoxph Objects
Description
Print Method for marCoxph Objects
Usage
## S3 method for class 'marCoxph'
print(x, max.len = 10, round.digits = 4, ...)
Arguments
x |
A |
max.len |
Maximum number of treatment comparisons to print. Default 10. |
round.digits |
Number of digits for rounding displayed values. Default 4. |
... |
Additional arguments (ignored). |
Value
Invisibly returns the input object x.
Print Method for surveff Objects
Description
Print Method for surveff Objects
Usage
## S3 method for class 'surveff'
print(x, max.len = 6, round.digits = 4, ...)
Arguments
x |
A |
max.len |
Maximum number of rows (time points) to print. Default 6. |
round.digits |
Number of digits for rounding displayed values. Default 4. |
... |
Additional arguments (ignored). |
Value
Invisibly returns the input object x.
Simulated Survival Data with Binary Treatment
Description
A simulated dataset for demonstrating propensity score weighting methods in survival analysis with a binary treatment.
Usage
simdata_bin
Format
A data frame with 1000 observations and 8 variables:
- X1
Continuous covariate (standard normal).
- X2
Continuous covariate (standard normal).
- X3
Continuous covariate (standard normal).
- B1
Binary covariate (0/1).
- B2
Binary covariate (0/1).
- Z
Treatment group: "A" or "B". Distribution is approximately 40:60.
- time
Observed follow-up time (event or censoring), range 0-20.
- event
Event indicator: 1 = event observed, 0 = censored.
Details
The data were generated with the following characteristics:
Treatment assignment depends on X1, X2, and B1 via logistic model.
Survival times follow Weibull distributions with group-specific scales (group A has better survival than group B).
Censoring times follow an exponential distribution depending on X1 and B1.
Administrative censoring occurs at time 20.
Overall censoring rate is approximately 30
See Also
simdata_multi for a dataset with 4 treatment groups.
Examples
data(simdata_bin)
head(simdata_bin)
table(simdata_bin$Z)
Simulated Survival Data with Multiple Treatments
Description
A simulated dataset for demonstrating propensity score weighting methods in survival analysis with four treatment groups.
Usage
simdata_multi
Format
A data frame with 1000 observations and 8 variables:
- X1
Continuous covariate (standard normal).
- X2
Continuous covariate (standard normal).
- X3
Continuous covariate (standard normal).
- B1
Binary covariate (0/1).
- B2
Binary covariate (0/1).
- Z
Treatment group: "A", "B", "C", or "D". Distribution is approximately 20:20:20:35.
- time
Observed follow-up time (event or censoring), range 0-20.
- event
Event indicator: 1 = event observed, 0 = censored.
Details
The data were generated with the following characteristics:
Treatment assignment depends on X1, X2, X3, B1, and B2 via multinomial logistic model.
Survival times follow Weibull distributions with group-specific scales. Survival ordering (best to worst): C > A > B > D.
Censoring times follow an exponential distribution depending on X1 and B1.
Administrative censoring occurs at time 20.
Overall censoring rate is approximately 30
See Also
simdata_bin for a dataset with binary treatment.
Examples
data(simdata_multi)
head(simdata_multi)
table(simdata_multi$Z)
Summary Method for marCoxph Objects
Description
Summary Method for marCoxph Objects
Usage
## S3 method for class 'marCoxph'
summary(object, conf_level = 0.95, round.digits = 4, style = "prints", ...)
Arguments
object |
A |
conf_level |
Confidence level for intervals. Default 0.95. |
round.digits |
Number of digits for rounding displayed values. Default 4.
Only used if |
style |
Output style: "prints" (print formatted tables) or "returns" (return vectors). Default "prints". |
... |
Additional arguments (ignored). |
Details
Confidence intervals are Wald-type intervals calculated as:
Log scale: logHR ± z_crit * SE
Original scale: exp(logHR ± z_crit * SE)
The SE used depends on variance_method from the original marCoxph call:
"robust": Uses
logHR_se_robustfrom sandwich estimator."bootstrap-full" or "bootstrap-strata": Uses
logHR_se_bootstrap.
Value
If style = "prints", returns invisibly. If style = "returns",
returns a list with:
logHR |
Named vector of log hazard ratio estimates. |
logHR_CI_lower |
Named vector of lower CI bounds on log scale. |
logHR_CI_upper |
Named vector of upper CI bounds on log scale. |
SE |
Named vector of standard errors on log scale (from variance_method). |
HR |
Named vector of hazard ratio estimates (original scale). |
HR_CI_lower |
Named vector of lower CI bounds on original scale. |
HR_CI_upper |
Named vector of upper CI bounds on original scale. |
variance_method |
Variance method used. |
conf_level |
Confidence level used. |
n_per_group |
Named vector of sample sizes per group in Cox model. |
events_per_group |
Named vector of event counts per group in Cox model. |
Summary Method for surveff Objects
Description
Summary Method for surveff Objects
Usage
## S3 method for class 'surveff'
summary(
object,
conf_level = 0.95,
max.len = 6,
round.digits = 4,
style = "prints",
...
)
Arguments
object |
A |
conf_level |
Confidence level for intervals. Default 0.95. |
max.len |
Maximum number of rows (time points) to print. Default 6.
Only used if |
round.digits |
Number of digits for rounding displayed values. Default 4.
Only used if |
style |
Output style: "prints" (print formatted tables) or "returns" (return list of matrices). Default "prints". |
... |
Additional arguments (ignored). |
Value
If style = "prints", returns invisibly. If style = "returns",
returns a list with:
survival_summary |
List of matrices, one per treatment group, with columns: Time, Estimate, SE, CI.lower, CI.upper |
difference_summary |
List of matrices, one per contrast, with same columns. NULL if no contrasts estimated. |
Survival Effect Estimation with Cox Censoring Scores
Description
Implements Estimator I (AJE 2022, Eq. 2) combining propensity score weighting and inverse probability of censoring weighting to estimate counterfactual survival functions and treatment effects. Variance estimation uses bootstrap only. Estimate Counterfactual Survival Functions Using Cox Censoring Scores
Usage
surv_cox(
data,
time_var,
event_var,
treatment_var,
eval_times = NULL,
weight_result,
censoring_formula,
censoring_control = list(),
ties = "efron"
)
Arguments
data |
Data frame. |
time_var |
Name of time variable. |
event_var |
Name of event indicator (1 = event, 0 = censored). |
treatment_var |
Name of treatment variable. |
eval_times |
Numeric vector of time points. If NULL, uses all unique event times. |
weight_result |
Output from |
censoring_formula |
Formula for censoring score model. |
censoring_control |
Control parameters for |
ties |
Tie handling method for Cox model. Default "efron". |
Details
Estimates counterfactual survival function for each treatment group j:
S^{(j)}_w(t) = 1 - \frac{\sum_i w_i I(A_i=j) \delta_i I(T_i \leq t) / K_c^{(j)}(T_i, X_i)}{\sum_i w_i I(A_i=j)}
Censoring scores are estimated using Cox proportional hazards models fit separately within each treatment group. Variance estimation for Cox-based survival functions is performed using bootstrap only (no analytical variance available).
Value
List containing:
survival_matrix |
Matrix [time x J] of survival function estimates S^(j)(t). |
eval_times |
Time points evaluated. |
treatment_levels |
Treatment level values. |
n_levels |
Number of treatment levels. |
weights_by_group |
List of weight vectors by treatment group. |
censoring_scores_by_group |
List of censoring score vectors by group. |
method, estimand |
Weighting method and target estimand. |
censoring_result, ps_result, weight_result |
Input objects. |
Z_matrix |
Binary indicator matrix [n x J] for treatment groups. |
time_vec, event_vec |
Original time and event vectors. |
Estimate Counterfactual Survival Functions Using Weibull Censoring Scores
Description
Estimate Counterfactual Survival Functions Using Weibull Censoring Scores
Usage
surv_weibull(
data,
time_var,
event_var,
treatment_var,
eval_times = NULL,
weight_result,
censoring_formula,
censoring_control = list(maxiter = 350)
)
Arguments
data |
Data frame. |
time_var |
Name of time variable. |
event_var |
Name of event indicator (1 = event, 0 = censored). |
treatment_var |
Name of treatment variable. |
eval_times |
Numeric vector of time points. If NULL, uses all unique event times. |
weight_result |
Output from |
censoring_formula |
Formula for censoring score model. |
censoring_control |
Control parameters for |
Details
Estimates counterfactual survival function for each treatment group j:
S^{(j)}_w(t) = 1 - \frac{\sum_i w_i I(A_i=j) \delta_i I(T_i \leq t) / K_c^{(j)}(T_i, X_i)}{\sum_i w_i I(A_i=j)}
Value
List containing:
survival_matrix |
Matrix [time x J] of survival function estimates S^(j)(t). |
eval_times |
Time points evaluated. |
treatment_levels |
Treatment level values. |
n_levels |
Number of treatment levels. |
weights_by_group |
List of weight vectors by treatment group. |
censoring_scores_by_group |
List of censoring score vectors by group. |
method, estimand |
Weighting method and target estimand. |
Etau |
Normalization constant. |
censoring_result, ps_result, weight_result |
Input objects. |
design_matrices |
List with W (PS model) and X (censoring model). |
Z_matrix |
Binary indicator matrix [n x J] for treatment groups. |
time_vec, event_vec |
Original time and event vectors. |
Survival Effect Estimation with Propensity Score Weighting
Description
Main user interface for estimating counterfactual survival functions and treatment effects using propensity score weighting and inverse probability of censoring weighting. Supports binary and multiple treatment groups with various weighting schemes (ATE, ATT, overlap) and optional trimming.
Usage
surveff(
data,
ps_formula,
censoring_formula,
eval_times = NULL,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL,
contrast_matrix = NULL,
censoring_method = "weibull",
variance_method = NULL,
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
censoring_control = NULL,
ties = "efron",
ps_control = list(),
boot_level = "full"
)
Arguments
data |
Data frame containing treatment, outcome, and covariates. |
ps_formula |
Formula for propensity score model: |
censoring_formula |
Formula for censoring model: |
eval_times |
Numeric vector of time points for evaluation. If NULL (default), uses all unique event times. |
estimand |
Target estimand: "ATE" (average treatment effect), "ATT" (average treatment effect on the treated), or "overlap" (overlap weighting). Default "ATE". |
att_group |
Target group for ATT. Required if |
trim |
Trimming method: "symmetric" or "asymmetric". Default NULL (no trimming). |
delta |
Threshold for symmetric trimming (e.g., 0.1). Required if |
alpha |
Percentile for asymmetric trimming (e.g., 0.05). Required if |
contrast_matrix |
Optional matrix for treatment differences in multiple group settings. Each row defines one contrast with exactly two non-zero elements: -1 and 1. Column names must match treatment levels. For binary treatment, always estimates second level minus first level (S1 - S0), ignoring this parameter. |
censoring_method |
Method for censoring score estimation: "weibull" or "cox". Default "weibull". |
variance_method |
Variance estimation method: "analytical" (binary treatment with Weibull censoring only) or "bootstrap". Default "analytical" for binary Weibull, "bootstrap" otherwise. Cox censoring always uses bootstrap. |
B |
Number of bootstrap iterations. Default 100. Used only if |
parallel |
Logical. Use parallel bootstrap computation? Default FALSE. |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
censoring_control |
Control parameters passed to censoring model fitting function.
For Weibull: passed to |
ties |
Tie handling method for Cox models. Default "efron". Ignored for Weibull. |
ps_control |
Control parameters for propensity score model. Default |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
Details
**Variance Estimation:** - Analytical: Binary treatment with Weibull censoring only (M-estimation). - Bootstrap: All settings (resamples entire pipeline). - Cox: Always uses bootstrap.
**Treatment Effects:**
- Binary: S1 - S0 (second level minus first).
- Multiple groups: Requires contrast_matrix for pairwise comparisons.
Value
List containing:
survival_estimates |
Matrix [time x J] of survival function estimates for each group. |
survival_se |
Matrix [time x J] of standard errors for survival functions. |
difference_estimates |
Matrix [time x K] of treatment effect estimates.
For binary treatment: single column with S1-S0. For multiple groups: contrasts
from |
difference_se |
Matrix [time x K] of standard errors for treatment effects. |
eval_times |
Time points evaluated. |
treatment_levels |
Sorted unique treatment values. |
n_levels |
Number of treatment groups. |
n |
Sample size (complete cases after data validation). |
included |
Logical vector [n] indicating inclusion in analysis. TRUE = included, FALSE = excluded due to trimming. |
estimand |
Estimand used. |
censoring_method |
Censoring method used. |
variance_method |
Variance method used. |
contrast_matrix |
Contrast matrix used (NULL if not applicable). |
ps_model |
Fitted propensity score model (glm or multinom object). |
censoring_models |
Named list of fitted censoring models by treatment group. |
weights |
Numeric vector [n] of final weights (0 for trimmed observations). |
trim_summary |
Data frame with trimming summary by treatment group. |
ess |
Named numeric vector of effective sample size by treatment group. |
boot_result |
Bootstrap results (NULL if analytical variance used). |
Examples
# Example 1: Binary treatment with overlap weighting and Weibull censoring model
data(simdata_bin)
result1 <- surveff(
data = simdata_bin,
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
censoring_formula = survival::Surv(time, event) ~ X1 + B1,
estimand = "overlap",
censoring_method = "weibull"
)
summary(result1)
plot(result1)
# Example 2: Multiple treatments with ATE and Cox censoring model
data(simdata_multi)
result2 <- surveff(
data = simdata_multi,
ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
censoring_formula = survival::Surv(time, event) ~ X1 + B1,
estimand = "ATE",
censoring_method = "cox",
variance_method = "bootstrap",
B = 100
)
summary(result2)
Wrapper for Cox Survival Effect Estimation with Variance
Description
High-level wrapper combining propensity score estimation, weighting (with optional trimming), survival function estimation, and variance estimation using bootstrap. Unlike Weibull-based estimation, Cox censoring scores do not support analytical variance; bootstrap is always used.
Usage
surveff_cox(
data,
treatment_var,
ps_formula,
time_var,
event_var,
censoring_formula,
eval_times = NULL,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL,
contrast_matrix = NULL,
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
censoring_control = list(),
ties = "efron",
ps_control = list(),
boot_level = "full"
)
Arguments
data |
Data frame (possibly processed by data validation). |
treatment_var |
Name of treatment variable. |
ps_formula |
Formula for propensity score model. |
time_var |
Name of time variable. |
event_var |
Name of event indicator (1 = event, 0 = censored). |
censoring_formula |
Formula for censoring score model. |
eval_times |
Numeric vector of time points. If NULL, uses all unique event times. |
estimand |
Target estimand: "ATE", "ATT", or "overlap". Default "ATE". |
att_group |
Target group for ATT (required if estimand = "ATT"). |
trim |
Trimming method: "symmetric" or "asymmetric". Default NULL (no trimming). |
delta |
Threshold for symmetric trimming. Required if trim = "symmetric". |
alpha |
Percentile for asymmetric trimming. Required if trim = "asymmetric". |
contrast_matrix |
Optional matrix for treatment differences. Each row represents one contrast with exactly two non-zero elements: -1 and 1. Column names must match treatment levels. For binary treatment, ignored (always estimates S1-S0). For >2 groups, required to estimate differences (otherwise returns NULL). |
B |
Number of bootstrap iterations. Default 100. |
parallel |
Logical. Use parallel bootstrap via |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
censoring_control |
Control parameters for |
ties |
Tie handling method for Cox model. Default "efron". |
ps_control |
Control parameters for PS model. Default |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
Details
This function implements the complete estimation workflow:
**Without trimming:** 1. Estimate propensity scores on full data 2. Estimate weights from PS 3. Estimate censoring scores on full data 4. Estimate survival functions 5. Estimate differences (if applicable) 6. Estimate variances using bootstrap
**With trimming:** 1. Estimate propensity scores on full data (PS_full) 2. Use PS_full to identify observations to trim 3. Re-estimate propensity scores on trimmed data (PS_trimmed) 4. Estimate weights from PS_trimmed 5. Estimate censoring scores on trimmed data 6. Estimate survival functions on trimmed data 7. Estimate differences (if applicable) 8. Estimate variances using bootstrap
**Treatment differences:** - Binary (2 groups): Always estimates S1 - S0 (second level minus first level), ignoring contrast_matrix even if provided. - Multiple groups (>2): Requires contrast_matrix to estimate differences. Returns NULL for difference_* if contrast_matrix not provided.
**Variance estimation for differences:** - Binary: Differences columns in each bootstrap sample, calculates sample variance across B trials. - Multiple groups: Always uses bootstrap method.
Value
List containing:
survival_estimates |
Matrix [time x J] of survival function estimates. |
survival_variances |
Matrix [time x J] of survival function variances. |
difference_estimates |
Matrix [time x K] of treatment differences (NULL for >2 groups without contrast_matrix). |
difference_variances |
Matrix [time x K] of difference variances (NULL for >2 groups without contrast_matrix). |
eval_times |
Time points evaluated. |
treatment_levels |
Treatment level values. |
n_levels |
Number of treatment levels. |
contrast_matrix |
Contrast matrix used (NULL if not applicable). |
surv_result |
Output from |
variance_method |
Always "bootstrap" for Cox-based estimation. |
boot_result |
Bootstrap results. |
Wrapper for Weibull Survival Effect Estimation with Variance
Description
High-level wrapper combining propensity score estimation, weighting (with optional trimming), survival function estimation, and variance estimation (analytical or bootstrap).
Usage
surveff_weibull(
data,
treatment_var,
ps_formula,
time_var,
event_var,
censoring_formula,
eval_times = NULL,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL,
contrast_matrix = NULL,
variance_method = NULL,
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
censoring_control = list(maxiter = 350),
ps_control = list(),
boot_level = "full"
)
Arguments
data |
Data frame (possibly processed by data validation). |
treatment_var |
Name of treatment variable. |
ps_formula |
Formula for propensity score model. |
time_var |
Name of time variable. |
event_var |
Name of event indicator (1 = event, 0 = censored). |
censoring_formula |
Formula for censoring score model. |
eval_times |
Numeric vector of time points. If NULL, uses all unique event times. |
estimand |
Target estimand: "ATE", "ATT", or "overlap". Default "ATE". |
att_group |
Target group for ATT (required if estimand = "ATT"). |
trim |
Trimming method: "symmetric" or "asymmetric". Default NULL (no trimming). |
delta |
Threshold for symmetric trimming. Required if trim = "symmetric". |
alpha |
Percentile for asymmetric trimming. Required if trim = "asymmetric". |
contrast_matrix |
Optional matrix for treatment differences. Each row represents one contrast with exactly two non-zero elements: -1 and 1. Column names must match treatment levels. For binary treatment, ignored (always estimates S1-S0). For >2 groups, required to estimate differences (otherwise returns NULL). |
variance_method |
Variance estimation method: "analytical" (binary only) or "bootstrap". Default "analytical" for binary, "bootstrap" for multiple groups. |
B |
Number of bootstrap iterations. Default 100. Ignored if variance_method = "analytical". |
parallel |
Logical. Use parallel bootstrap via |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
censoring_control |
Control parameters for |
ps_control |
Control parameters for PS model. Default |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
Details
This function implements the complete estimation workflow:
**Without trimming:** 1. Estimate propensity scores on full data 2. Estimate weights from PS 3. Estimate censoring scores on full data 4. Estimate survival functions 5. Estimate differences (if applicable) 6. Estimate variances
**With trimming:** 1. Estimate propensity scores on full data (PS_full) 2. Use PS_full to identify observations to trim 3. Re-estimate propensity scores on trimmed data (PS_trimmed) 4. Estimate weights from PS_trimmed 5. Estimate censoring scores on trimmed data 6. Estimate survival functions on trimmed data 7. Estimate differences (if applicable) 8. Estimate variances
**Treatment differences:** - Binary (2 groups): Always estimates S1 - S0 (second level minus first level), ignoring contrast_matrix even if provided. - Multiple groups (>2): Requires contrast_matrix to estimate differences. Returns NULL for difference_* if contrast_matrix not provided.
**Variance estimation for differences:** - Binary with analytical: Uses influence function approach with proper correlation structure (NOT sum of variances). - Binary with bootstrap: Differences columns in each bootstrap sample, calculates sample variance across B trials. - Multiple groups: Always uses bootstrap method.
Value
List containing:
survival_estimates |
Matrix [time x J] of survival function estimates. |
survival_variances |
Matrix [time x J] of survival function variances. |
difference_estimates |
Matrix [time x K] of treatment differences (NULL for >2 groups without contrast_matrix). |
difference_variances |
Matrix [time x K] of difference variances (NULL for >2 groups without contrast_matrix). |
eval_times |
Time points evaluated. |
treatment_levels |
Treatment level values. |
n_levels |
Number of treatment levels. |
contrast_matrix |
Contrast matrix used (NULL if not applicable). |
surv_result |
Output from |
variance_method |
Method used for variance estimation. |
boot_result |
Bootstrap results (NULL if variance_method = "analytical"). |
Asymmetric Propensity Score Trimming (Sturmer Extension)
Description
Performs asymmetric (percentile-based) trimming using within-group percentile thresholds. Implements the Sturmer extension to multiple treatments as described in Yoshida et al. (2019).
Usage
trim_weights_asymmetric(ps_result, data, treatment_var, alpha = NULL)
Arguments
ps_result |
A list returned by |
data |
A data.frame containing the treatment variable. |
treatment_var |
A character string specifying the name of the treatment
variable in |
alpha |
Numeric percentile threshold in (0, 0.5). Default is NULL, which uses recommended values: 0.05 for binary treatment, 0.033 for 3 groups, 1/(2*J) for J >= 4. |
Details
The asymmetric trimming rule retains observation i if:
e_{ji} \geq F^{-1}_{e_{ji}|A_i=j}(\alpha|j) \text{ for all } j
where F^{-1}_{e_{ji}|A_i=j}(\alpha|j) is the \alpha-percentile of
propensity scores e_{ji} among individuals who actually received
treatment j.
Value
A logical vector of length n, where TRUE indicates the observation should be kept and FALSE indicates it should be trimmed.
References
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Symmetric Propensity Score Trimming (Crump Extension)
Description
Performs symmetric trimming based on minimum propensity score across all treatment groups. Implements the Crump extension to multiple treatments as described in Yoshida et al. (2019).
Usage
trim_weights_symmetric(ps_result, data, treatment_var, delta = NULL)
Arguments
ps_result |
A list returned by |
data |
A data.frame containing the treatment variable. |
treatment_var |
A character string specifying the name of the treatment
variable in |
delta |
Numeric trimming threshold in (0, 1/J] where J is the number of treatment levels. Default is NULL, which uses recommended values: 0.1 for binary treatment, 0.067 for 3 groups, 1/(2*J) for J >= 4. |
Details
The symmetric trimming rule retains observation i if:
\min_j\{e_{ji}\} \geq \delta
For binary treatment (J=2), this reduces to: e(X) \in [\delta, 1-\delta].
Value
A logical vector of length n, where TRUE indicates the observation should be kept and FALSE indicates it should be trimmed.
References
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Validate All Inputs for PSsurvdiff
Description
Umbrella function that calls all validation functions and returns the cleaned dataset with complete cases ready for model fitting.
Usage
validate_PSsurvdiff_inputs(
data,
ps_formula,
censor_formula,
weight_method,
censor_method,
trim_alpha,
trim_q,
time_points,
conf_level,
ps_control,
censor_control,
bootstrap_control
)
Arguments
data |
A data.frame containing the analysis data. |
ps_formula |
Formula object for propensity score model. |
censor_formula |
Formula object for censoring model. |
weight_method |
Character string specifying weighting method. |
censor_method |
Character string specifying censoring method. |
trim_alpha |
Numeric, symmetric trimming threshold. |
trim_q |
Numeric, asymmetric trimming quantile. |
time_points |
Numeric vector or NULL. |
conf_level |
Numeric, confidence level. |
ps_control |
List of PS model control parameters. |
censor_control |
List of censoring model control parameters. |
bootstrap_control |
List of bootstrap control parameters or NULL. |
Value
A list containing:
data_clean |
Data frame with complete cases only |
treatment_var |
Character string of treatment variable name |
time_var |
Character string of time variable name (possibly an expression) |
event_var |
Character string of event variable name (possibly an expression) |
censor_formula |
The validated censoring formula, or NULL if no censoring adjustment was requested (i.e., original formula had RHS of 0) |
n_complete |
Integer, number of complete cases used in analysis |
Validate Censoring Formula
Description
Checks that the censoring formula is valid, uses Surv() notation, and extracts the time and event variable names.
Usage
validate_censor_formula(censor_formula)
Arguments
censor_formula |
A formula object for the censoring model. Should be of the form 'Surv(time, event) ~ covariates'. Use 'Surv(time, event) ~ 0' to indicate no censoring adjustment (all censoring scores set to 1). |
Value
A list containing:
formula |
The validated formula object, or NULL if RHS is 0 (no censoring adjustment) |
time_var |
Character string of the time variable name |
event_var |
Character string of the event variable name |
Validate Data Variables
Description
Performs all variable-specific validation checks on treatment, time, event, and covariates. Only stops execution if critical errors are found.
Usage
validate_data_variables(data, treatment_var, ps_formula, censor_formula)
Arguments
data |
A data.frame containing the analysis data. |
treatment_var |
Character string of treatment variable name. |
ps_formula |
Formula object for propensity score model. |
censor_formula |
Formula object for censoring model. |
Value
Invisible NULL if all checks pass; otherwise throws an error.
Validate Method Arguments
Description
Validates weight_method, censor_method, trimming parameters, time_points, and control lists. Synthesizes all method-specific checks.
Usage
validate_method_arguments(
weight_method,
censor_method,
trim_alpha,
trim_q,
time_points,
conf_level,
ps_control,
censor_control,
bootstrap_control,
max_time
)
Arguments
weight_method |
Character string specifying weighting method. |
censor_method |
Character string specifying censoring method. |
trim_alpha |
Numeric, symmetric trimming threshold. |
trim_q |
Numeric, asymmetric trimming quantile. |
time_points |
Numeric vector or NULL. |
conf_level |
Numeric, confidence level. |
ps_control |
List of PS model control parameters. |
censor_control |
List of censoring model control parameters. |
bootstrap_control |
List of bootstrap control parameters or NULL. |
max_time |
Numeric, maximum observed time (to check time_points range). |
Value
Invisible NULL if checks pass; otherwise throws an error.
Data Validation Functions for PSsurvival Package
Description
Data Validation Functions for PSsurvival Package
Usage
validate_ps_formula(ps_formula)
Arguments
ps_formula |
A formula object for the propensity score model. Should be of the form 'treatment ~ covariates'. |
Value
A list containing:
formula |
The validated formula object |
treatment_var |
Character string of the treatment variable name |
Bootstrap Variance Estimation for Marginal Cox Model
Description
Estimates variance of marginal hazard ratios via bootstrap resampling. Bootstrap Variance for Marginal Cox Model
Performs bootstrap resampling to estimate variance of log hazard ratios from weighted marginal Cox model. Supports full (unstratified) and stratified bootstrap by treatment group.
Usage
var_marginalcox_bootstrap(
data,
treatment_var,
time_var,
event_var,
ps_formula,
treatment_levels,
reference_level,
estimand = "ATE",
att_group = NULL,
trim = NULL,
delta = NULL,
alpha = NULL,
boot_level = "full",
B = 100,
robust = TRUE,
parallel = FALSE,
mc.cores = 2,
seed = NULL
)
Arguments
data |
A data.frame containing the complete-case analysis data. |
treatment_var |
Character string specifying the treatment variable name. |
time_var |
Character string specifying the time variable name. |
event_var |
Character string specifying the event variable name. |
ps_formula |
A formula object for the propensity score model. |
treatment_levels |
Vector of treatment levels (from main fit_marginal_cox). |
reference_level |
Reference treatment level (from main fit_marginal_cox). |
estimand |
Character string: "ATE", "ATT", or "overlap". |
att_group |
For ATT, which group to target. NULL otherwise. |
trim |
Trimming method: NULL, "symmetric", or "asymmetric". |
delta |
Symmetric trimming threshold (NULL uses defaults). |
alpha |
Asymmetric trimming threshold (NULL uses defaults). |
boot_level |
Bootstrap sampling level: "full" (default) samples from entire dataset, "strata" samples within each treatment group preserving group sizes. |
B |
Integer number of bootstrap iterations. Default 100. |
robust |
Logical. Use robust variance in Cox model? Default TRUE. |
parallel |
Logical. If TRUE, use parallel computation via mclapply. Default FALSE. |
mc.cores |
Integer number of cores for parallel processing. Default 2. |
seed |
Optional random seed for reproducibility. Default NULL. |
Details
Bootstrap Workflow: For each bootstrap iteration:
Resample data with replacement (full or stratified by treatment)
Estimate propensity scores on bootstrap sample
Calculate weights (with optional trimming)
Fit marginal Cox model using fit_marginal_cox with functionality="boot"
Record estimates, sample sizes, and event counts
Parallel Processing: Uses parallel::mclapply for parallel bootstrap. Set ncores > 1 to enable. Note: mclapply uses forking (not available on Windows).
Error Handling: Uses fit_marginal_cox(..., functionality="boot") which returns NA for failed estimates instead of throwing errors. This ensures all B trials complete.
Value
A list containing:
boot_samples |
List of length B with hr_estimates from each iteration. |
boot_allocation |
Matrix (B x n_levels) of group sample sizes per trial. |
n_used_boot |
Matrix (B x n_levels) of sample sizes used in Cox model per trial (after trimming). |
events_used_boot |
Matrix (B x n_levels) of event counts used in Cox model per trial (after trimming). |
n_success_by_group |
Named integer vector of successful estimates per group (non-NA across B trials). |
B |
Number of bootstrap iterations. |
boot_level |
Bootstrap method used. |
treatment_levels |
Treatment levels used. |
reference_level |
Reference level used. |
Bootstrap Variance Estimation for Cox Survival Functions
Description
Estimates bootstrap variance for counterfactual survival functions from
surv_cox(). Supports binary and multiple treatment groups.
Usage
var_surv_cox_bootstrap(
data,
surv_result,
treatment_var,
time_var,
event_var,
censoring_formula,
censoring_control = list(),
ties = "efron",
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
boot_level = "full"
)
Arguments
data |
Data frame used in original |
surv_result |
Output from |
treatment_var |
Name of treatment variable. |
time_var |
Name of time variable. |
event_var |
Name of event indicator variable. |
censoring_formula |
Formula for censoring score model. |
censoring_control |
Control parameters for |
ties |
Tie handling method for Cox model. Default "efron". |
B |
Number of bootstrap iterations. Default 100. |
parallel |
Logical. If TRUE, use parallel computation via |
mc.cores |
Number of cores for parallel computation. Default 2. |
seed |
Optional random seed for reproducibility. Ensures identical results across runs and between sequential and parallel execution. Default NULL. |
boot_level |
Bootstrap sampling level: "full" (default) samples from entire dataset, "strata" samples within each treatment group preserving group sizes. |
Details
Each bootstrap iteration resamples observations with replacement and re-estimates propensity scores, weights, and survival functions using the same specifications as the original analysis. Treatment groups may fail independently if not sampled or if models fail to converge.
Value
List containing:
var_matrix |
Matrix [time x group] of bootstrap variances. |
se_matrix |
Matrix [time x group] of bootstrap standard errors. |
boot_samples |
List of length B with survival matrices from each iteration. |
boot_allocation |
Matrix [B x group] of sample sizes for each group in each bootstrap iteration. |
n_success_by_group |
Matrix [time x group] of successful iterations. |
n_failed_by_group |
Matrix [time x group] of failed iterations. |
B |
Total bootstrap iterations. |
Compute Analytical M-Estimation Variance for Binary Treatment Survival Functions
Description
Computes analytical variance estimates using M-estimation for binary treatment. Calculates variances for S^(0)(t), S^(1)(t), and their difference S^(1)(t) - S^(0)(t).
Usage
var_surv_weibull_analytical(surv_result)
Arguments
surv_result |
Output from |
Details
Implements M-estimation variance for binary treatment survival functions. For each group j:
I_j = \frac{1}{E_\tau}(I_{\tau,j} + I_{\theta_j} + I_{\gamma_j} + I_{\beta,j})
Var(S^{(j)}) = \sum I_j^2 / n^2
For the difference:
I_{diff} = \frac{1}{E_\tau}(I_{\tau,diff} + I_{\theta_1} - I_{\theta_0} + I_{\gamma_1} - I_{\gamma_0} + I_{\beta,diff})
Var(S^{(1)} - S^{(0)}) = \sum I_{diff}^2 / n^2
Value
List containing:
var_matrix |
Matrix [time x 3] of variances: [var(S0), var(S1), var(S1-S0)]. |
se_matrix |
Matrix [time x 3] of standard errors: [se(S0), se(S1), se(S1-S0)]. |
influence_components |
List of Itheta and Igamma arrays for delta variance. |
Etau |
Normalization constant. |
n |
Sample size after trimming. |
Bootstrap Variance Estimation for Weibull Survival Functions
Description
Estimates bootstrap variance for counterfactual survival functions from
surv_weibull(). Supports binary and multiple treatment groups.
Usage
var_surv_weibull_bootstrap(
data,
surv_result,
treatment_var,
time_var,
event_var,
censoring_formula,
censoring_control = list(maxiter = 350),
B = 100,
parallel = FALSE,
mc.cores = 2,
seed = NULL,
boot_level = "full"
)
Arguments
data |
Data frame used in original |
surv_result |
Output from |
treatment_var |
Name of treatment variable. |
time_var |
Name of time variable. |
event_var |
Name of event indicator variable. |
censoring_formula |
Formula for censoring score model. |
censoring_control |
Control parameters for |
B |
Number of bootstrap iterations. Default 100. |
parallel |
Logical. If TRUE, use parallel computation via |
mc.cores |
Number of cores for parallel computation. Default 2. |
seed |
Optional random seed for reproducibility. Ensures identical results across runs and between sequential and parallel execution. Default NULL. |
boot_level |
Bootstrap sampling level: "full" (default) or "strata". "full" resamples from entire dataset (observational studies). "strata" resamples within treatment groups preserving group sizes (RCTs). |
Details
Each bootstrap iteration resamples observations with replacement and re-estimates propensity scores, weights, and survival functions using the same specifications as the original analysis. Treatment groups may fail independently if not sampled or if models fail to converge.
Value
List containing:
var_matrix |
Matrix [time x group] of bootstrap variances. |
se_matrix |
Matrix [time x group] of bootstrap standard errors. |
boot_samples |
List of length B with survival matrices from each iteration. |
boot_allocation |
Matrix [B x group] of sample sizes for each group in each bootstrap iteration. |
n_success_by_group |
Matrix [time x group] of successful iterations. |
n_failed_by_group |
Matrix [time x group] of failed iterations. |
B |
Total bootstrap iterations. |