| Type: | Package |
| Title: | Benchmark for Publication Bias Correction Methods |
| Version: | 0.1.0 |
| Maintainer: | František Bartoš <f.bartos96@gmail.com> |
| Description: | Implements a unified interface for benchmarking meta-analytic publication bias correction methods through simulation studies (see Bartoš et al., 2025, <doi:10.48550/arXiv.2510.19489>). It provides 1) predefined data-generating mechanisms from the literature, 2) functions for running meta-analytic methods on simulated data, 3) pre-simulated datasets and pre-computed results for reproducible benchmarks, 4) tools for visualizing and comparing method performance. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 3.5.0) |
| Imports: | stats, metafor, osfr, MASS, numDeriv, pwr, sandwich, clubSandwich, lmtest, puniform, Rdpack |
| Suggests: | RoBMA, testthat (≥ 3.0.0), rmarkdown, knitr, kableExtra, DT, ggplot2, scales, ggdist, rprojroot, desc |
| RdMacros: | Rdpack |
| URL: | https://github.com/FBartos/PublicationBiasBenchmark, https://fbartos.github.io/PublicationBiasBenchmark/ |
| BugReports: | https://github.com/FBartos/PublicationBiasBenchmark/issues |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-11-01 14:51:06 UTC; fbart |
| Author: | František Bartoš |
| Repository: | CRAN |
| Date/Publication: | 2025-11-05 20:00:02 UTC |
PublicationBiasBenchmark: Benchmark for Publication Bias Correction Methods
Description
Implements a unified interface for benchmarking meta-analytic publication bias correction methods through simulation studies (see Bartoš et al., 2025, doi:10.48550/arXiv.2510.19489). It provides 1) predefined data-generating mechanisms from the literature, 2) functions for running meta-analytic methods on simulated data, 3) pre-simulated datasets and pre-computed results for reproducible benchmarks, 4) tools for visualizing and comparing method performance.
Author(s)
Maintainer: František Bartoš f.bartos96@gmail.com (ORCID)
See Also
Useful links:
https://github.com/FBartos/PublicationBiasBenchmark
https://fbartos.github.io/PublicationBiasBenchmark/
Report bugs at https://github.com/FBartos/PublicationBiasBenchmark/issues
Options for the PublicationBiasBenchmark package
Description
A placeholder object and functions for the PublicationBiasBenchmark package.
Usage
PublicationBiasBenchmark.options(...)
PublicationBiasBenchmark.get_option(name)
Arguments
... |
named option(s) to change - for a list of available options, see details below. |
name |
the name of the option to get the current value of - for a list of available options, see details below. |
Details
"resources_directory"Location where the benchmark data/results/measures are stored
"prompt_for_download"Whether each file download should ask for explicit approval
Value
The current value of all available PublicationBiasBenchmark options (after applying any changes specified) is returned invisibly as a named list.
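For illustration, a minimal sketch of setting and querying options follows; the directory path below is arbitrary.
Examples
# Point the package to a custom resources directory (illustrative path)
PublicationBiasBenchmark.options(resources_directory = file.path(tempdir(), "PBB_resources"))
# Query individual options
PublicationBiasBenchmark.get_option("resources_directory")
PublicationBiasBenchmark.get_option("prompt_for_download")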
Calculate sample variance of generic statistic
Description
Calculate sample variance of generic statistic
Usage
S_G_squared(G)
Arguments
G |
Vector of generic statistics |
Value
Sample variance S_G^2
Calculate sample variance of squared errors
Description
Calculate sample variance of squared errors
Usage
S_theta_minus_theta_squared(theta_hat, theta)
Arguments
theta_hat |
Vector of estimates |
theta |
True parameter value |
Value
Sample variance S_(theta_hat - theta)^2
Calculate sample variance of estimates
Description
Calculate sample variance of estimates
Usage
S_theta_squared(theta_hat)
Arguments
theta_hat |
Vector of estimates |
Value
Sample variance S_theta^2
Calculate sample variance of CI widths
Description
Calculate sample variance of CI widths
Usage
S_w_squared(ci_upper, ci_lower)
Arguments
ci_upper |
Vector of upper CI bounds |
ci_lower |
Vector of lower CI bounds |
Value
Sample variance S_w^2
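A small numeric illustration (the interval bounds are arbitrary):
Examples
ci_lower <- c(0.05, 0.10, -0.02)
ci_upper <- c(0.45, 0.52, 0.40)
# Sample variance of the confidence interval widths (ci_upper - ci_lower)
S_w_squared(ci_upper, ci_lower)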
Compare Methods with Multiple Measures for a DGM
Description
This is a high-level wrapper function that computes multiple pairwise comparison measures for a Data-Generating Mechanism (DGM) and saves the results to CSV files. It provides a clean and extensible interface for comparing method performance.
Usage
compare_measures(
dgm_name,
method,
method_setting,
measures = NULL,
verbose = TRUE,
estimate_col = "estimate",
true_effect_col = "mean_effect",
convergence_col = "convergence",
method_replacements = NULL,
n_repetitions = 1000,
overwrite = FALSE,
conditions = NULL
)
Arguments
dgm_name |
Character string specifying the DGM name. |
method |
Character vector of method names |
method_setting |
Character vector of method settings, must be same length as method |
measures |
Character vector of measures to compute. If NULL, computes all standard measures. |
verbose |
Print detailed progress of the calculation. |
estimate_col |
Character string specifying the column name containing parameter estimates. Default is "estimate" |
true_effect_col |
Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect" |
convergence_col |
Character string specifying the column name containing convergence indicators. Default is "convergence" |
method_replacements |
Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with method (character vector of replacement method names) and method_setting (character vector of the corresponding replacement method settings). If multiple elements are specified within the vectors, these replacements are applied consecutively in case the previous replacements also failed to converge. Defaults to NULL. |
n_repetitions |
Number of repetitions in each condition. Necessary for method replacement. Defaults to 1000. |
overwrite |
Logical indicating whether to overwrite existing files.
Defaults to FALSE. |
conditions |
Data frame of conditions from dgm_conditions() |
Value
Invisible list of computed comparison data frames
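As an illustration, the sketch below assumes the pre-computed results for the DGM have already been downloaded (see download_dgm_results()); the methods and settings are chosen only for demonstration.
Examples
# Download pre-computed results for the DGM
dgm_name <- "no_bias"
download_dgm_results(dgm_name)
# Compute all standard pairwise comparison measures for three methods
compare_measures(
  dgm_name = dgm_name,
  method = c("mean", "RMA", "PET"),
  method_setting = c("default", "default", "default")
)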
Compare Methods with a Single Measure for a DGM
Description
This function provides pairwise comparisons of methods for Data-Generating Mechanisms (DGMs). It compares method performance on a condition-by-condition basis using estimates. For each pair of methods, if method A's estimate is closer to the true value than method B's, method A receives a score of 1; if it is further away, it receives 0; and if both are equally close, it receives 0.5.
Usage
compare_single_measure(
dgm_name,
measure_name,
method,
method_setting,
conditions,
estimate_col = "estimate",
true_effect_col = "mean_effect",
convergence_col = "convergence",
method_replacements = NULL,
n_repetitions = 1000,
overwrite = FALSE,
...
)
Arguments
dgm_name |
Character string specifying the DGM name. |
measure_name |
Name of the measure to compute (e.g., "bias", "mse") |
method |
Character vector of method names |
method_setting |
Character vector of method settings, must be same length as method |
conditions |
Data frame of conditions from dgm_conditions() |
estimate_col |
Character string specifying the column name containing parameter estimates. Default is "estimate" |
true_effect_col |
Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect" |
convergence_col |
Character string specifying the column name containing convergence indicators. Default is "convergence" |
method_replacements |
Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with method (character vector of replacement method names) and method_setting (character vector of the corresponding replacement method settings). If multiple elements are specified within the vectors, these replacements are applied consecutively in case the previous replacements also failed to converge. Defaults to NULL. |
n_repetitions |
Number of repetitions in each condition. Necessary for method replacement. Defaults to 1000. |
overwrite |
Logical indicating whether to overwrite existing files.
Defaults to FALSE. |
... |
Additional arguments passed to measure functions |
Value
Data frame with pairwise comparison scores in long format (method_a, method_b, score)
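A minimal sketch of this lower-level interface, assuming the results for the DGM are already available locally:
Examples
# Pairwise comparison of two methods on a single measure
compare_single_measure(
  dgm_name = "no_bias",
  measure_name = "bias",
  method = c("RMA", "PET"),
  method_setting = c("default", "default"),
  conditions = dgm_conditions("no_bias")
)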
Compute Multiple Performance measures for a DGM
Description
This is a high-level wrapper function that computes multiple performance measures for a Data-Generating Mechanism (DGM) and saves the results to CSV files. It provides a clean and extensible interface for computing standard simulation performance measures.
Usage
compute_measures(
dgm_name,
method,
method_setting,
measures = NULL,
verbose = TRUE,
power_test_type = "p_value",
power_threshold_p_value = 0.05,
power_threshold_bayes_factor = 10,
estimate_col = "estimate",
true_effect_col = "mean_effect",
ci_lower_col = "ci_lower",
ci_upper_col = "ci_upper",
p_value_col = "p_value",
bf_col = "BF",
convergence_col = "convergence",
method_replacements = NULL,
n_repetitions = 1000,
overwrite = FALSE,
conditions = NULL
)
Arguments
dgm_name |
Character string specifying the DGM name |
method |
Character vector of method names |
method_setting |
Character vector of method settings, must be same length as method |
measures |
Character vector of measures to compute. If NULL, computes all standard measures. |
verbose |
Print detailed progress of the calculation. |
power_test_type |
Character vector specifying the test type for power computation: "p_value" (default) or "bayes_factor" for each method. If a single value is provided, it is repeated for all methods. |
power_threshold_p_value |
Numeric threshold for power computation with p-values. Default is 0.05 (reject H0 if p < 0.05). |
power_threshold_bayes_factor |
Numeric threshold for power computation with Bayes factors. Default is 10 (reject H0 if BF > 10) |
estimate_col |
Character string specifying the column name containing parameter estimates. Default is "estimate" |
true_effect_col |
Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect" |
ci_lower_col |
Character string specifying the column name containing lower confidence interval bounds. Default is "ci_lower" |
ci_upper_col |
Character string specifying the column name containing upper confidence interval bounds. Default is "ci_upper" |
p_value_col |
Character string specifying the column name containing p-values. Default is "p_value" |
bf_col |
Character string specifying the column name containing Bayes factors. Default is "BF" |
convergence_col |
Character string specifying the column name containing convergence indicators. Default is "convergence" |
method_replacements |
Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:
If multiple elements are specified within the vectors, these replacements are applied consecutively
in case the previous replacements also failed to converge.
Defaults to |
n_repetitions |
Number of repetitions in each condition. Necessary for method replacement. Defaults to 1000. |
overwrite |
Logical indicating whether to overwrite existing results. If FALSE (default), will skip computation for method-measure combinations that already exist |
conditions |
Data frame of conditions from dgm_conditions() |
Value
TRUE upon successful computation of the results file
Examples
# Download DGM results
dgm_name <- "no_bias"
download_dgm_results(dgm_name)
# Basic usage
compute_measures(
dgm_name = dgm_name,
method = c("mean", "RMA", "PET"),
method_setting = c("default", "default", "default"),
measures = c("bias", "mse", "coverage")
)
# With method replacements for non-converged results
method_replacements <- list(
"RMA-default" = list(method = "FMA", method_setting = "default"),
"PET-default" = list(method = c("WLS", "FMA"),
method_setting = c("default", "default"))
)
compute_measures(
dgm_name = dgm_name,
method = c("RMA", "PET"),
method_setting = c("default", "default"),
method_replacements = method_replacements,
measures = c("bias", "mse")
)
Compute Performance Measures
Description
This function provides a modular and extensible way to compute performance measures (PM) for Data-Generating Mechanisms (DGMs). It handles different types of measures and automatically determines the required arguments for each measure function.
Usage
compute_single_measure(
dgm_name,
measure_name,
method,
method_setting,
conditions,
measure_fun,
measure_mcse_fun,
power_test_type = "p_value",
estimate_col = "estimate",
true_effect_col = "mean_effect",
ci_lower_col = "ci_lower",
ci_upper_col = "ci_upper",
p_value_col = "p_value",
bf_col = "BF",
convergence_col = "convergence",
power_threshold_p_value = 0.05,
power_threshold_bayes_factor = 10,
method_replacements = NULL,
n_repetitions = 1000,
overwrite = FALSE,
...
)
Arguments
dgm_name |
Character string specifying the DGM name |
measure_name |
Name of the measure to compute (e.g., "bias", "mse") |
method |
Character vector of method names |
method_setting |
Character vector of method settings, must be same length as method |
conditions |
Data frame of conditions from dgm_conditions() |
measure_fun |
Function to compute the measure |
measure_mcse_fun |
Function to compute the MCSE for the measure |
power_test_type |
Character vector specifying the test type for power computation: "p_value" (default) or "bayes_factor" for each method. If a single value is provided, it is repeated for all methods. |
estimate_col |
Character string specifying the column name containing parameter estimates. Default is "estimate" |
true_effect_col |
Character string specifying the column name in conditions data frame containing true effect sizes. Default is "mean_effect" |
ci_lower_col |
Character string specifying the column name containing lower confidence interval bounds. Default is "ci_lower" |
ci_upper_col |
Character string specifying the column name containing upper confidence interval bounds. Default is "ci_upper" |
p_value_col |
Character string specifying the column name containing p-values. Default is "p_value" |
bf_col |
Character string specifying the column name containing Bayes factors. Default is "BF" |
convergence_col |
Character string specifying the column name containing convergence indicators. Default is "convergence" |
power_threshold_p_value |
Numeric threshold for power computation with p-values. Default is 0.05 (reject H0 if p < 0.05). |
power_threshold_bayes_factor |
Numeric threshold for power computation with Bayes factors. Default is 10 (reject H0 if BF > 10) |
method_replacements |
Named list of replacement method specifications. Each element should be named with the "method-method_setting" combination (e.g., "RMA-default") and contain a named list with:
If multiple elements are specified within the vectors, these replacements are applied consecutively
in case the previous replacements also failed to converge.
Defaults to |
n_repetitions |
Number of repetitions in each condition. Necessary for method replacement. Defaults to 1000. |
overwrite |
Logical indicating whether to overwrite existing results. If FALSE (default), will skip computation for method-measure combinations that already exist |
... |
Additional arguments passed to measure functions |
Value
TRUE upon successful computation of the results file
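A minimal sketch of the lower-level interface, using the exported bias()/bias_mcse() functions as the measure pair and assuming the results for the DGM are already available locally:
Examples
compute_single_measure(
  dgm_name = "no_bias",
  measure_name = "bias",
  method = "RMA",
  method_setting = "default",
  conditions = dgm_conditions("no_bias"),
  measure_fun = bias,
  measure_mcse_fun = bias_mcse
)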
Create standardized empty method result for convergence failures
Description
Create standardized empty method result for convergence failures
Usage
create_empty_result(method_name, note, extra_columns = NULL)
Arguments
method_name |
Character string of the method name |
note |
Character string describing the failure reason |
extra_columns |
Character vector of additional empty columns to add to the table |
Value
Data frame with standardized empty result structure
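A brief illustration (the note text is arbitrary):
Examples
# Standardized empty result row for a method that failed to converge
create_empty_result("RMA", note = "Optimizer failed to converge")
# Include method-specific extra columns queried from the method registry
create_empty_result("RMA", note = "Optimizer failed to converge",
                    extra_columns = get_method_extra_columns("RMA"))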
DGM Method
Description
S3 Method for defining data-generating mechanisms. See simulate_dgm() for
usage and further details.
Usage
dgm(dgm_name, settings)
Arguments
dgm_name |
Character string specifying the DGM type |
settings |
List containing the required parameters for the DGM or numeric condition_id |
Value
A data frame with simulated data following the structure described
in the Output Structure section. This is an S3 generic method that dispatches
to specific DGM implementations based on dgm_name.
Output Structure
The returned data frame follows a standardized schema that downstream functions rely on. Across the currently implemented DGMs, the following columns are used:
- yi (numeric): The effect size estimate.
- sei (numeric): Standard error of yi.
- ni (integer): Total sample size for the estimate (e.g., sum over groups where applicable).
- es_type (character): Effect size type, used to disambiguate the scale of yi. Currently used values are "SMD" (standardized mean difference / Cohen's d), "logOR" (log odds ratio), and "none" (unspecified generic continuous coefficient).
- study_id (integer/character, optional): Identifier of the primary study/cluster when a DGM yields multiple estimates per study (e.g., Alinaghi2018, PRE). If absent, each row is treated as an independent study.
Examples
simulate_dgm("Carter2019", 1)
Alinaghi and Reed (2018) Data-Generating Mechanism
Description
This data-generating mechanism simulates univariate regression studies where a variable X
affects a continuous outcome Y. Each study estimates the coefficient of X, which consists
of a fixed component (alpha1) representing the overall mean effect, and a random component
that varies across studies but is constant within each study. In the "Random Effects"
environment ("RE"), each study produces one estimate, and the population effect
differs across studies. In the "Panel Random Effects" environment ("PRE"), each
study has 10 estimates, modeling the common scenario where multiple estimates per study
are available, with publication selection targeting the study rather than individual estimates.
The description and code are based on Hong and Reed (2021). The data-generating mechanism was introduced in Alinaghi and Reed (2018).
Usage
## S3 method for class 'Alinaghi2018'
dgm(dgm_name, settings)
Arguments
dgm_name |
DGM name (automatically passed) |
settings |
List containing
|
Details
This data-generating mechanism is based on Alinaghi & Reed (2018), who study univariate
regression models where a variable X affects a continuous variable Y. The parameter
of interest is the coefficient on X. In the "Random Effects" environment ("RE"),
each study produces one estimate, and the population effect differs across studies.
The coefficient on X equals a fixed component (alpha1) plus a random component that is
fixed within a study but varies across studies. The overall mean effect of X on Y is
given by alpha1. In the "Panel Random Effects" environment ("PRE"), each study has
10 estimates, modeling the common scenario where multiple estimates per study are
available. In this environment, effect estimates and standard errors are simulated to
be more similar within studies than across studies, and publication selection targets
the study rather than individual estimates (a study must have at least 7 out of 10
estimates that are significant or correctly signed).
A distinctive feature of Alinaghi & Reed's experiments is that the number of effect size estimates is fixed before publication selection, making the meta-analyst's sample size endogenous and affected by the effect size. Large population effects are subject to less publication selection, as most estimates satisfy the selection criteria (statistical significance or correct sign). The sample size of all primary studies is fixed at 100 observations. (Neither the number of estimates nor the sample size of primary studies can be changed in the current implementation of the function.)
Another feature is the separation of statistical significance and sign of the estimated effect as criteria for selection. Significant/correctly-signed estimates are always "published," while insignificant/wrong-signed estimates have only a 10% chance of being published. This allows for different and sometimes conflicting consequences for estimator performance.
Value
Data frame with:
- yi: effect size
- sei: standard error
- ni: sample size
- study_id: study identifier
- es_type: effect size type
References
Alinaghi N, Reed WR (2018).
“Meta-analysis and publication bias: How well does the FAT-PET-PEESE procedure work?”
Research Synthesis Methods, 9(2), 285-311.
doi:10.1002/jrsm.1298.
Hong S, Reed WR (2021).
“Using Monte Carlo experiments to select meta-analytic estimators.”
Research Synthesis Methods, 12(2), 192-215.
doi:10.1002/jrsm.1467.
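A minimal simulation sketch (condition 1 is chosen arbitrarily from the pre-specified condition grid):
Examples
# Inspect the pre-specified conditions and simulate one dataset
head(dgm_conditions("Alinaghi2018"))
simulate_dgm("Alinaghi2018", 1)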
Bom and Rachinger (2019) Data-Generating Mechanism
Description
Simulates univariate regression environments to estimate the effect of X1 on Y (parameter alpha1). Effect heterogeneity is introduced via an omitted variable (X2) correlated with X1, whose coefficient (alpha2) is randomly distributed with mean zero and variance sigma2_h.
The description and code are based on Hong and Reed (2021). The data-generating mechanism was introduced in Bom and Rachinger (2019).
Usage
## S3 method for class 'Bom2019'
dgm(dgm_name, settings)
Arguments
dgm_name |
DGM name (automatically passed) |
settings |
List containing
|
Details
This function simulates univariate regression environments, focusing on estimating the effect of a variable X1 on a dependent variable Y, represented by the parameter alpha1. The simulation introduces variation in the standard errors of estimated effects by allowing sample sizes to differ across primary studies. Effect heterogeneity is modeled through an omitted variable (X2) that is correlated with X1, where the coefficient on the omitted variable, alpha2, is randomly distributed across studies with mean zero and variance sigma2_h.
Publication selection is modeled in two regimes: (1) no selection, and (2) 50% selection. Under 50% selection, each estimate has a 50% chance of being evaluated for inclusion. If selected, only positive and statistically significant estimates are published; otherwise, new estimates are generated until this criterion is met. This process continues until the meta-analyst’s sample reaches its predetermined size.
Value
Data frame with:
- yi: effect size
- sei: standard error
- ni: sample size
- es_type: effect size type
References
Bom PR, Rachinger H (2019).
“A kinked meta-regression model for publication bias correction.”
Research Synthesis Methods, 10(4), 497-514.
doi:10.1002/jrsm.1352.
Hong S, Reed WR (2021).
“Using Monte Carlo experiments to select meta-analytic estimators.”
Research Synthesis Methods, 12(2), 192-215.
doi:10.1002/jrsm.1467.
Carter et al. (2019) Data-Generating Mechanism
Description
This data-generating mechanism simulates primary studies estimating treatment effects using Cohen's d. The observed effect size is modeled as a fixed mean plus random heterogeneity across studies, with sample sizes varying to generate differences in standard errors. The simulation introduces publication bias via a selection algorithm where the probability of publication depends nonlinearly on the sign and p-value of the effect, with regimes for no, medium, and strong publication bias. It also incorporates questionable research practices (QRPs) such as optional outlier removal, selection between dependent variables, use of moderators, and optional stopping.
The description and code are based on Hong and Reed (2021). The data-generating mechanism was introduced in Carter et al. (2019).
Usage
## S3 method for class 'Carter2019'
dgm(dgm_name, settings)
Arguments
dgm_name |
DGM name (automatically passed) |
settings |
List containing
|
Details
This simulation environment is based on the framework described by Carter, Schönbrodt, Gervais, and Hilgard (2019). In this setup, primary studies estimate the effect of a treatment using Cohen's d as the effect size metric. The observed difference between treatment and control groups is modeled as the sum of a fixed effect (alpha1) and a random component, which introduces effect heterogeneity across studies. The degree of heterogeneity is controlled by the parameter sigma2_h. Variability in the standard errors of d is generated by simulating primary studies with different sample sizes.
The simulation incorporates two main types of distortions in the research environment. First, a publication selection algorithm is used, where the probability of a study being "published" depends nonlinearly on both the sign of the estimated effect and its p-value. Three publication selection regimes are modeled: "No Publication Bias," "Medium Publication Bias," and "Strong Publication Bias," each defined by different parameters in the selection algorithm. Second, the simulation includes four types of questionable research practices (QRPs): (a) optional removal of outliers, (b) optional selection between two dependent variables, (c) optional use of moderators, and (d) optional stopping.
Value
Data frame with:
- yi: effect size
- sei: standard error
- ni: sample size
- es_type: effect size type
References
Carter EC, Schönbrodt FD, Gervais WM, Hilgard J (2019).
“Correcting for bias in psychology: A comparison of meta-analytic methods.”
Advances in Methods and Practices in Psychological Science, 2(2), 115-144.
doi:10.1177/2515245919847196.
Hong S, Reed WR (2021).
“Using Monte Carlo experiments to select meta-analytic estimators.”
Research Synthesis Methods, 12(2), 192-215.
doi:10.1002/jrsm.1467.
Stanley, Doucouliagos, and Ioannidis (2017) Data-Generating Mechanism
Description
Simulates two scenarios for meta-analysis studies investigating the effect of a treatment in: (1) Log Odds Ratio scenario, where the outcome is binary and effect heterogeneity is controlled by a random component, and (2) Cohen's d scenario, where the outcome is continuous and effect heterogeneity is introduced through a random component. Both scenarios allow for varying sample sizes and publication selection regimes, affecting the inclusion of study estimates based on their statistical significance and sign.
The description and code are based on Hong and Reed (2021). The data-generating mechanism was introduced in Stanley et al. (2017).
Usage
## S3 method for class 'Stanley2017'
dgm(dgm_name, settings)
Arguments
dgm_name |
DGM name (automatically passed) |
settings |
List containing
|
Details
This function simulates two meta-analysis scenarios to evaluate the effect
of a binary treatment variable (treat = {0, 1}) on study outcomes,
incorporating both effect heterogeneity and publication selection mechanisms.
In the Log Odds Ratio ("logOR") scenario, primary studies assess the
impact of treatment on a binary success indicator (Y = 1). The control
group has a fixed 10% probability of success, while the treatment group's
probability is increased by a fixed effect and a mean-zero random component,
whose variance (sigma2_h) controls effect heterogeneity. Each study estimates a
logistic regression, with the coefficient on treat (alpha1) as the effect of
interest. Study sample sizes vary, resulting in different standard errors
for estimated effects.
In the Cohen's d ("SMD") scenario, the outcome variable is
continuous. The treatment effect is modeled as a fixed effect (alpha1) plus a
random component (variance sigma2_h). Each study computes Cohen's d, the
standardized mean difference between treatment and control groups. Study
sample sizes vary, affecting the standard errors of d.
Publication selection is modeled in two regimes: (1) no selection, and (2) 50% selection. Under 50% selection, each estimate has a 50% chance of being evaluated for inclusion. If selected, only positive and statistically significant estimates are published; otherwise, new estimates are generated until this criterion is met. This process continues until the meta-analyst’s sample reaches its predetermined size.
Value
Data frame with:
- yi: effect size
- sei: standard error
- ni: sample size
- es_type: effect size type
References
Hong S, Reed WR (2021).
“Using Monte Carlo experiments to select meta-analytic estimators.”
Research Synthesis Methods, 12(2), 192-215.
doi:10.1002/jrsm.1467.
Stanley TD, Doucouliagos H, Ioannidis JP (2017).
“Finding the power to reduce publication bias.”
Statistics in Medicine, 36(10), 1580-1598.
doi:10.1002/sim.7228.
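For illustration, a minimal simulation sketch (condition 1 is used only as an example):
Examples
head(dgm_conditions("Stanley2017"))
simulate_dgm("Stanley2017", 1)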
Default DGM handler
Description
Default DGM handler
Usage
## Default S3 method:
dgm(dgm_name, settings)
Arguments
dgm_name |
Character string specifying the DGM type |
settings |
List containing the required parameters for the DGM or numeric condition_id |
Value
Throws an error indicating the DGM type is unknown. This default
method is only called when no specific DGM implementation is found for the
given dgm_name.
Normal Unbiased Data-Generating Mechanism
Description
An example data-generating mechanism to simulate effect sizes without publication bias.
Usage
## S3 method for class 'no_bias'
dgm(dgm_name, settings)
Arguments
dgm_name |
DGM name (automatically passed) |
settings |
List containing
|
Details
Sample sizes of individual effect size estimates are generated from a negative binomial distribution based on the empirical sample size distribution presented in Appendix B of Maier et al. (2023).
Value
Data frame with:
- yi: effect size
- sei: standard error
- ni: sample size
- es_type: effect size type
References
Maier M, Bartoš F, Wagenmakers E (2023). “Robust Bayesian meta-analysis: Addressing publication bias with model-averaging.” Psychological Methods, 28(1), 107-122. doi:10.1037/met0000405.
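A minimal simulation sketch (assuming condition 1 exists in the pre-specified condition grid):
Examples
# Simulate a single unbiased dataset
simulate_dgm("no_bias", 1)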
Return Pre-specified DGM Settings
Description
This function returns the list of pre-specified settings for a given Data Generating Mechanism (DGM).
Usage
dgm_conditions(dgm_name)
get_dgm_condition(dgm_name, condition_id)
Arguments
dgm_name |
Character string specifying the DGM type |
condition_id |
The condition for which the settings should be returned. |
Value
A data frame containing the pre-specified settings including a
condition_id column which maps settings id to the corresponding settings.
Examples
head(dgm_conditions("Carter2019"))
get_dgm_condition("Carter2019", condition_id = 1)
head(dgm_conditions("Alinaghi2018"))
head(dgm_conditions("Stanley2017"))
Download Datasets/Results/Measures of a DGM
Description
This function downloads datasets/results/measures of a specified Data-Generating Mechanism (DGM)
from the OSF repository (https://osf.io/exf3m/). The datasets/results/measures are saved
to the location specified via PublicationBiasBenchmark.get_option("resources_directory").
To set the location permanently, specify the PublicationBiasBenchmark_RESOURCES environment
variable. The data are stored in dgm_name/datasets, dgm_name/results, and dgm_name/measures subfolders.
Usage
download_dgm_datasets(
dgm_name,
overwrite = FALSE,
progress = TRUE,
max_try = 10
)
download_dgm_results(
dgm_name,
overwrite = FALSE,
progress = TRUE,
max_try = 10
)
download_dgm_measures(
dgm_name,
overwrite = FALSE,
progress = TRUE,
max_try = 10
)
Arguments
dgm_name |
Character string specifying the name of the DGM dataset to download. |
overwrite |
Logical indicating whether to overwrite existing files.
Defaults to FALSE. |
progress |
Logical indicating whether to show progress while downloading files.
Defaults to TRUE. |
max_try |
Integer specifying how many times the function should attempt to reconnect to OSF upon failure. |
Value
TRUE if the download was successful, otherwise an error is raised.
Examples
download_dgm_datasets("no_bias")
Performance Measures and Monte Carlo Standard Errors
Description
A comprehensive set of functions for computing performance measures and their Monte Carlo Standard Errors (MCSE) for simulation studies. All functions are based on definitions from Table 3 in Siepe et al. (2024). Winkler interval score is defined in Winkler (1972). Positive and negative likelihood ratios are defined in Huang and Trinquart (2023) and Deeks and Altman (2004). Also see Morris et al. (2019) for additional details. Bias and relative bias were modified to account for possibly different true values across repetitions.
Usage
bias(theta_hat, theta)
bias_mcse(theta_hat)
relative_bias(theta_hat, theta)
relative_bias_mcse(theta_hat, theta)
mse(theta_hat, theta)
mse_mcse(theta_hat, theta)
rmse(theta_hat, theta)
rmse_mcse(theta_hat, theta)
empirical_variance(theta_hat)
empirical_variance_mcse(theta_hat)
empirical_se(theta_hat)
empirical_se_mcse(theta_hat)
coverage(ci_lower, ci_upper, theta)
coverage_mcse(ci_lower, ci_upper, theta)
power(test_rejects_h0)
power_mcse(test_rejects_h0)
mean_ci_width(ci_upper, ci_lower)
mean_ci_width_mcse(ci_upper, ci_lower)
mean_generic_statistic(G)
mean_generic_statistic_mcse(G)
positive_likelihood_ratio(tp, fp, fn, tn)
positive_likelihood_ratio_mcse(tp, fp, fn, tn)
negative_likelihood_ratio(tp, fp, fn, tn)
negative_likelihood_ratio_mcse(tp, fp, fn, tn)
interval_score(ci_lower, ci_upper, theta, alpha = 0.05)
interval_score_mcse(ci_lower, ci_upper, theta, alpha = 0.05)
Arguments
theta_hat |
Vector of parameter estimates from simulations |
theta |
True parameter value |
ci_lower |
Vector of lower confidence interval bounds |
ci_upper |
Vector of upper confidence interval bounds |
test_rejects_h0 |
Logical vector indicating whether statistical tests reject the null hypothesis |
G |
Vector of generic statistics from simulations |
tp |
Numeric with the count of true positive hypothesis tests |
fp |
Numeric with the count of false positive hypothesis tests |
fn |
Numeric with the count of false negative hypothesis tests |
tn |
Numeric with the count of true negative hypothesis tests |
alpha |
Numeric indicating the 1 - coverage level for interval_score calculation |
Details
The package provides the following performance measures and their corresponding MCSE functions:
- bias(theta_hat, theta): Bias estimate
- relative_bias(theta_hat, theta): Relative bias estimate
- mse(theta_hat, theta): Mean Square Error
- rmse(theta_hat, theta): Root Mean Square Error
- empirical_variance(theta_hat): Empirical variance
- empirical_se(theta_hat): Empirical standard error
- coverage(ci_lower, ci_upper, theta): Coverage probability
- mean_ci_width(ci_upper, ci_lower): Mean confidence interval width
- interval_score(ci_lower, ci_upper, theta, alpha): Interval score
- power(test_rejects_h0): Statistical power
- positive_likelihood_ratio(tp, fp, fn, tn): Log positive likelihood ratio
- negative_likelihood_ratio(tp, fp, fn, tn): Log negative likelihood ratio
- mean_generic_statistic(G): Mean of any generic statistic
Value
Each metric function returns a numeric value representing the performance measure. Each MCSE function returns a numeric value representing the Monte Carlo standard error.
References
Deeks JJ, Altman DG (2004).
“Diagnostic tests 4: likelihood ratios.”
BMJ, 329(7458), 168–169.
doi:10.1136/bmj.329.7458.168.
Huang Q, Trinquart L (2023).
“Relative likelihood ratios for neutral comparisons of statistical tests in simulation studies.”
Biometrical Journal, 66(1), 2200102.
doi:10.1002/bimj.202200102.
Morris TP, White IR, Crowther MJ (2019).
“Using simulation studies to evaluate statistical methods.”
Statistics in Medicine, 38(11), 2074–2102.
doi:10.1002/sim.8086.
Siepe BS, Bartoš F, Morris TP, Boulesteix A, Heck DW, Pawel S (2024).
“Simulation studies for methodological research in psychology: A standardized template for planning, preregistration, and reporting.”
Psychological Methods.
doi:10.1037/met0000695.
Winkler RL (1972).
“A decision-theoretic approach to interval estimation.”
Journal of the American Statistical Association, 67(337), 187–191.
doi:10.1080/01621459.1972.10481224.
Examples
# Generate some example data
set.seed(123)
theta_true <- 0.5
theta_estimates <- rnorm(1000, mean = theta_true, sd = 0.1)
# Compute bias and its MCSE
bias_est <- bias(theta_estimates, theta_true)
bias_se <- bias_mcse(theta_estimates)
# Compute MSE and its MCSE
mse_est <- mse(theta_estimates, theta_true)
mse_se <- mse_mcse(theta_estimates, theta_true)
# Example with coverage
ci_lower <- theta_estimates - 1.96 * 0.1
ci_upper <- theta_estimates + 1.96 * 0.1
coverage_est <- coverage(ci_lower, ci_upper, theta_true)
coverage_se <- coverage_mcse(ci_lower, ci_upper, theta_true)
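# Power: proportion of repetitions in which the test rejects H0
# (rejection indicators derived from illustrative uniform p-values)
p_values <- runif(1000)
power_est <- power(p_values < 0.05)
power_se <- power_mcse(p_values < 0.05)
# Interval score (Winkler, 1972) for the intervals above
score_est <- interval_score(ci_lower, ci_upper, theta_true, alpha = 0.05)
score_se <- interval_score_mcse(ci_lower, ci_upper, theta_true, alpha = 0.05)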
Method Method
Description
S3 Method for defining methods. See run_method() for
usage and further details.
Usage
method(method_name, data, settings)
Arguments
method_name |
Character string specifying the method type |
data |
Data frame containing yi (effect sizes) and sei (standard errors) |
settings |
Either a character string identifying a method version or a list containing method-specific settings. An empty input will result in running the default (first implemented) version of the method. |
Value
A data frame with method results following the structure described
in the Output Structure section. This is an S3 generic method that dispatches
to specific method implementations based on method_name.
Output Structure
The returned data frame follows a standardized schema that downstream functions rely on. All methods return the following columns:
- method (character): The name of the method used.
- estimate (numeric): The meta-analytic effect size estimate.
- standard_error (numeric): Standard error of the estimate.
- ci_lower (numeric): Lower bound of the 95% confidence interval.
- ci_upper (numeric): Upper bound of the 95% confidence interval.
- p_value (numeric): P-value for the estimate.
- BF (numeric): Bayes factor for the estimate.
- convergence (logical): Whether the method converged successfully.
- note (character): Additional notes describing convergence issues.
Some methods may include additional method-specific columns beyond these
standard columns. Use get_method_extra_columns() to query which
additional columns a particular method returns.
Examples
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4),
sei = c(0.1, 0.15, 0.08, 0.12)
)
result <- run_method("RMA", data, "default")
AK Method
Description
Implements the Andrews & Kasy (AK) method for publication bias correction in meta-analysis. The AK method categorizes estimated effects into groups with different probabilities of being published. AK1 uses symmetric selection, grouping estimates into significant (|t| >= 1.96) and insignificant (|t| < 1.96) categories. AK2 uses asymmetric selection with four groups based on both significance and sign: highly significant positive/negative effects and marginally significant positive/negative effects, each with different publication probabilities. See Andrews and Kasy (2019) for details.
Usage
## S3 method for class 'AK'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes), sei (standard errors), and study_id (for clustering wherever available) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"Uses AK1 estimator (symmetric selection)
"AK1"Symmetric selection model grouping estimates into significant (|t| >= 1.96) and insignificant (|t| < 1.96) categories with relative publication probabilities of 1 and p1 respectively.
"AK2"Asymmetric selection model with four groups based on t-statistics: (a) t >= 1.96, (b) t < -1.96, (c) -1.96 <= t < 0, and (d) 0 <= t < 1.96, with relative publication probabilities of 1, p1, p2, and p3 respectively.
Value
Data frame with AK results
References
Andrews I, Kasy M (2019). “Identification of and correction for publication bias.” American Economic Review, 109(8), 2766–2794. doi:10.1257/aer.20180310.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply AK method
result <- run_method("AK", data, "default")
print(result)
Endogenous Kink Method
Description
Implements the endogenous kink (EK) method proposed by Bom and Rachinger for publication bias correction in meta-analysis. This method modifies the PET-PEESE approach by incorporating a non-linear relationship between publication bias and standard errors through a kinked regression specification. The method recognizes that when the true effect is non-zero, there is minimal publication selection when standard errors are very small (since most estimates are significant), but selection increases as standard errors grow. The kink point is endogenously determined using a two-step procedure based on the confidence interval of the initial effect estimate. See Bom and Rachinger (2019) for details.
Usage
## S3 method for class 'EK'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (no settings versions are implemented) |
Value
Data frame with EK results
References
Bom PR, Rachinger H (2019). “A kinked meta-regression model for publication bias correction.” Research Synthesis Methods, 10(4), 497-514. doi:10.1002/jrsm.1352.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply EK method
result <- run_method("EK", data)
print(result)
Fixed Effects Meta-Analysis Method
Description
Implements the publication bias-unadjusted fixed effects meta-analysis.
Usage
## S3 method for class 'FMA'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"T-distribution adjustment (
test = "t") and cluster robust standard errors with small-sample adjustment (if converged, otherwise no small-sample adjustment or no cluster robust standard errors) for fixed effects meta-analysis ifstudy_idsis specified in the data
Value
Data frame with FMA results
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply FMA method
result <- run_method("FMA", data)
print(result)
PEESE (Precision-Effect Estimate with Standard Errors) Method
Description
Implements the Precision-Effect Estimate with Standard Errors method for publication bias correction. PEESE regresses effect sizes against standard errors^2 to correct for publication bias. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.
Usage
## S3 method for class 'PEESE'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (no settings versions are implemented) |
Value
Data frame with PEESE results
References
Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply PEESE method
result <- run_method("PEESE", data)
print(result)
PET (Precision-Effect Test) Method
Description
Implements the Precision-Effect Test for publication bias correction. PET regresses effect sizes against standard errors to test for and correct publication bias. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.
Usage
## S3 method for class 'PET'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (no settings versions are implemented) |
Value
Data frame with PET results
References
Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply PET method
result <- run_method("PET", data)
print(result)
PET-PEESE (Precision-Effect Test and Precision-Effect Estimate with Standard Errors) Method
Description
Implements the Precision-Effect Test and Precision-Effect Estimate with Standard Errors (PET-PEESE) method, which conditionally combines PET and PEESE to correct for publication bias: the PEESE estimate is used when PET indicates a non-zero effect, and the PET estimate otherwise. The intercept represents the bias-corrected effect size estimate. See Stanley and Doucouliagos (2014) for details.
Usage
## S3 method for class 'PETPEESE'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"-
(
conditional_alpha = 0.10) determines whether to use PET (PET's effect is not significant at alpha = 0.10 or PEESE estimate (PET's effect is significant at alpha = 0.10)
Value
Data frame with PET-PEESE results
References
Stanley TD, Doucouliagos H (2014). “Meta-regression approximations to reduce publication selection bias.” Research Synthesis Methods, 5(1), 60–78. doi:10.1002/jrsm.1095.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply PETPEESE method
result <- run_method("PETPEESE", data)
print(result)
Random Effects Meta-Analysis Method
Description
Implements the publication bias-unadjusted random-effects meta-analysis.
Usage
## S3 method for class 'RMA'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"Restricted Maximum Likelihood estimator (
method = "REML") with Knapp-Hartung adjustment (test = "knha") for a simple random effects meta-analysis and Restricted Maximum Likelihood estimator (method = "REML") with t-distribution adjustment (test = "t") and cluster robust standard errors with small-sample adjustment (if converged, otherwise no small-sample adjustment or no cluster robust standard errors) for a multilevel random effects meta-analysis ifstudy_idsis specified in the data
Value
Data frame with RMA results
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply RMA method
result <- run_method("RMA", data)
print(result)
Robust Bayesian Meta-Analysis (RoBMA) Method
Description
Implements the robust Bayesian meta-analysis (RoBMA) method that uses Bayesian model-averaging to combine results across several complementary publication bias adjustment methods. See Maier et al. (2023) and Bartoš et al. (2023) for details.
Note that the prior settings are dispatched based on the "es_type" column attached
to the dataset. The resulting estimates are then summarized on the same scale
as the dataset input (for "r", heterogeneity is summarized on Fisher's z).
Important: This method requires JAGS (Just Another Gibbs Sampler) to be installed on your system. Please download and install JAGS from https://mcmc-jags.sourceforge.io/ before using this method.
Usage
## S3 method for class 'RoBMA'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes), sei (standard errors), es_type
(either |
settings |
List of method settings (see Details.) |
Details
The following settings are implemented
"default"RoBMA-PSMA with publication bias adjustment as described in Bartoš et al. (2023). (the MCMC settings was reduced to speed-up the simulations) with the three-level specification whenever
"study_ids"are supplied with the data"PSMA"RoBMA-PSMA with publication bias adjustment as described in Bartoš et al. (2023). (the MCMC settings was reduced to speed-up the simulations) with the three-level specification whenever
"study_ids"are supplied with the data
Value
Data frame with RoBMA results
References
Bartoš F, Maier M, Wagenmakers E, Doucouliagos H, Stanley TD (2023).
“Robust Bayesian meta-analysis: Model-averaging across complementary publication bias adjustment methods.”
Research Synthesis Methods, 14(1), 99–116.
doi:10.1002/jrsm.1594.
Maier M, Bartoš F, Wagenmakers E (2023).
“Robust Bayesian meta-analysis: Addressing publication bias with model-averaging.”
Psychological Methods, 28(1), 107-122.
doi:10.1037/met0000405.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09),
es_type = "SMD"
)
# Apply RoBMA method
result <- run_method("RoBMA", data)
print(result)
SM (Selection Models) Method
Description
Implements selection models for publication bias correction in meta-analysis. The method first fits a random effects meta-analysis model, then applies selection modeling to adjust for publication bias using the metafor package. Selection models account for the probability that studies are published based on their p-values or effect sizes. See Vevea and Hedges (1995) for details.
Usage
## S3 method for class 'SM'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"or"3PSM"3-parameter step function selection model with Maximum Likelihood estimator (
method = "ML") and one step at one-sided p = 0.025 (i.e., selection for significance))"4PSM"4-parameter step function selection model with Maximum Likelihood estimator (
method = "ML") and two steps at one-sided p = 0.025 and p = 0.50 (i.e., selection for significance and direction of the effect)
Value
Data frame with SM results
References
Vevea JL, Hedges LV (1995). “A general linear model for estimating effect size in the presence of publication bias.” Psychometrika, 60(3), 419–435. doi:10.1007/BF02294384.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply SM method
result <- run_method("SM", data, "3PSM")
print(result)
WAAPWLS (Weighted Average of Adequately Powered Studies) Method
Description
Implements the WAAP-WLS method for meta-analysis, which combines WLS and WAAP approaches. First fits a WLS model to all studies, then identifies high-powered studies based on the criterion that the WLS estimate divided by 2.8 is greater than or equal to the standard error. If at least 2 high-powered studies are found, uses WAAP (weighted average of adequate power studies only), otherwise uses the original WLS estimate. See Stanley et al. (2017) for details.
Usage
## S3 method for class 'WAAPWLS'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (no settings versions are implemented) |
Value
Data frame with WAAPWLS results
References
Stanley TD, Doucouliagos H, Ioannidis JP (2017). “Finding the power to reduce publication bias.” Statistics in Medicine, 36(10), 1580-1598. doi:10.1002/sim.7228.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply WAAPWLS method
result <- run_method("WAAPWLS", data)
print(result)
Weighted and Iterated Least Squares (WILS) Method
Description
Implements the weighted and iterated least squares (WILS) method for publication bias correction in meta-analysis. The method is based on the idea of using excess statistical significance (ESS) to identify how many underpowered studies should be removed to reduce publication selection bias. See Stanley and Doucouliagos (2024) for details.
Usage
## S3 method for class 'WILS'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The WILS method has two implementation versions based on Stanley & Doucouliagos (2024). The following settings are implemented
"default"The simulation version (default) uses residuals from the t ~ Precision regression for the first iteration, then switches to individual excess statistical significance (ESS) for subsequent iterations.
"example"The example version consistently uses residuals from the t ~ Precision regression to identify studies to remove across all iterations.
Value
Data frame with WILS results
References
Stanley TD, Doucouliagos H (2024). “Harnessing the power of excess statistical significance: Weighted and iterative least squares.” Psychological Methods, 29(2), 407–420. doi:10.1037/met0000502.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply WILS method
result <- run_method("WILS", data)
print(result)
WLS (Weighted Least Squares) Method
Description
Implements the Weighted Least Squares method for meta-analysis. WLS fits a weighted regression model with effect sizes as the outcome and weights based on the inverse of the squared standard errors. The intercept represents the weighted average effect size estimate.
Usage
## S3 method for class 'WLS'
method(method_name, data, settings = NULL)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (no settings versions are implemented) |
Value
Data frame with WLS results
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply WLS method
result <- run_method("WLS", data)
print(result)
Default method handler
Description
Default method handler
Usage
## Default S3 method:
method(method_name, data, settings = list())
Arguments
method_name |
Character string specifying the method type |
data |
Data frame containing yi (effect sizes) and sei (standard errors) |
settings |
Either a character string identifying a method version or a list containing method-specific settings. An empty input will result in running the default (first implemented) version of the method. |
Value
Throws an error indicating the method type is unknown. This default
method is only called when no specific method implementation is found for the
given method_name.
Mean Method
Description
Implements the unweighted mean method, i.e., the mean of the observed effect sizes.
Usage
## S3 method for class 'mean'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"No settings
Value
Data frame with mean results
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply mean method
result <- run_method("mean", data)
print(result)
pcurve (P-Curve) Method
Description
Implements the p-Curve method which analyzes the distribution of p-values from significant studies to assess whether the significant findings reflect true effects or QRP/publication bias. The method also provides tests for the evidential value, lack of evidential value, and p-hacking. See Simonsohn et al. (2014) for details.
The current implementation does not provide a test against the null hypothesis of no effect and does not produce confidence intervals for the estimate.
Usage
## S3 method for class 'pcurve'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes), sei (standard errors), and ni (sample sizes wherever available, otherwise set to Inf) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"no options
Value
Data frame with P-Curve results
References
Simonsohn U, Nelson LD, Simmons JP (2014). “p-curve and effect size: Correcting for publication bias using only significant results.” Perspectives on Psychological Science, 9(6), 666–681. doi:10.1177/1745691614553988.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply pcurve method
result <- run_method("pcurve", data)
print(result)
puniform (P-Uniform) Method
Description
Implements the p-uniform method for publication bias detection and correction. P-uniform uses the distribution of p-values from significant studies to test for publication bias and estimate the effect size corrected for publication bias. The method assumes that p-values follow a uniform distribution under the null hypothesis of no effect, and uses this to detect and correct for bias. See van Assen et al. (2015) and van Aert and van Assen (2025) for details.
Usage
## S3 method for class 'puniform'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"Default p-uniform analysis settings.
"star"P-uniform star version of the method.
Value
Data frame with P-Uniform results
References
van Aert RCM, van Assen MALM (2025).
“Correcting for publication bias in a meta-analysis with the p-uniform* method.”
Psychonomic Bulletin & Review.
https://osf.io/preprints/metaarxiv/zqjr9/.
van Assen MALM, van Aert RCM, Wicherts JM (2015).
“Meta-analysis using effect size distributions of only statistically significant studies.”
Psychological Methods, 20(3), 293–309.
doi:10.1037/met0000025.
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply puniform method
result <- run_method("puniform", data)
print(result)
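Since "star" is listed among the pre-specified settings above, the p-uniform* variant can be requested by passing the version name as the settings argument; a brief sketch:
# Apply the p-uniform* version of the method
result_star <- run_method("puniform", data, "star")
print(result_star)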
Trim-and-Fill Meta-Analysis Method
Description
Implements the trim-and-fill method for adjusting publication bias in meta-analysis using the metafor package.
Usage
## S3 method for class 'trimfill'
method(method_name, data, settings)
Arguments
method_name |
Method name (automatically passed) |
data |
Data frame with yi (effect sizes) and sei (standard errors) |
settings |
List of method settings (see Details) |
Details
The following settings are implemented
"default"Random effects model fitted with Restricted Maximum Likelihood estimator (
method = "REML") with Knapp-Hartung adjustment (test = "knha"), followed by trim-and-fill using left-side trimming (side = "left") and L0 estimator (estimator = "L0").
Value
Data frame with trim-and-fill results
Examples
# Generate some example data
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4, 0.25),
sei = c(0.1, 0.15, 0.08, 0.12, 0.09)
)
# Apply trimfill method
result <- run_method("trimfill", data)
print(result)
Method Extra Columns
Description
Retrieves the character vector of custom columns for a given method. These are method-specific columns beyond the standard columns (method, estimate, standard_error, ci_lower, ci_upper, p_value, BF, convergence, note) that each method returns.
Usage
get_method_extra_columns(method_name)
method_extra_columns(method_name)
## Default S3 method:
method_extra_columns(method_name)
Arguments
method_name |
Character string of the method name |
Value
Character vector of extra column names, or empty character vector if no extra columns are defined for the method
Examples
# Get extra columns for PET method
get_method_extra_columns("PET")
# Get extra columns for RMA method
get_method_extra_columns("RMA")
Return Pre-specified Method Settings
Description
This function returns the list of pre-specified settings for a given method.
Usage
method_settings(method_name)
get_method_setting(method_name, version_id)
Arguments
method_name |
Character string specifying the method type |
version_id |
Which method version should be used. |
Value
A list containing the pre-specified settings. For most methods, the list is an extension of the function call; however, a more elaborate list of settings that is dispatched within the method call is also possible.
Examples
method_settings("RMA")
get_method_setting("RMA", version_id = "default")
Retrieve a Pre-Simulated Condition and Repetition From a DGM
Description
This function returns a pre-simulated dataset of a given repetition and
condition from a DGM. The pre-simulated datasets must already be stored
locally. See the download_dgm_datasets() function for more guidance.
Usage
retrieve_dgm_dataset(dgm_name, condition_id, repetition_id = NULL)
Arguments
dgm_name |
Character string specifying the DGM type |
condition_id |
Which condition should the dataset be returned for. |
repetition_id |
Which repetition should be returned. The complete
condition can be returned by setting to either |
Value
A data.frame
Examples
# get condition 1, repetition 1
retrieve_dgm_dataset("no_bias", condition_id = 1, repetition_id = 1)
# get condition 1, all repetitions
retrieve_dgm_dataset("no_bias", condition_id = 1)
Retrieve Pre-Computed Performance Measures for a DGM
Description
This function returns pre-computed performance measures for a specified
Data-Generating Mechanism (DGM). The pre-computed measures must already be stored
locally. See the download_dgm_measures() function for more guidance.
Usage
retrieve_dgm_measures(
dgm_name,
measure = NULL,
method = NULL,
method_setting = NULL,
condition_id = NULL,
replacement = FALSE
)
Arguments
dgm_name |
Character string specifying the DGM type |
measure |
Which performance measure should be returned (e.g., "bias", "mse", "coverage").
All measures can be returned by setting to |
method |
Which method(s) should be returned. The complete results are returned by setting to |
method_setting |
Which method setting(s) should be returned. The complete results are returned by setting to |
condition_id |
Which condition(s) should the measures be returned for. |
replacement |
Whether performance measures computed using replacement should be returned. Defaults to |
Value
A data.frame
Examples
# get bias measures for all methods and conditions
retrieve_dgm_measures("no_bias", measure = "bias")
# get all measures for RMA method
retrieve_dgm_measures("no_bias", method = "RMA")
# get MSE measures for PET method in condition 1
retrieve_dgm_measures("no_bias", measure = "mse", method = "PET", condition_id = 1)
Retrieve Pre-Computed Results of a Method Applied to a DGM
Description
This function returns pre-computed results of a given method at a specific
repetition and condition from a DGM. The pre-computed results must already be stored
locally. See the download_dgm_results() function for more guidance.
Usage
retrieve_dgm_results(
dgm_name,
method = NULL,
method_setting = NULL,
condition_id = NULL,
repetition_id = NULL
)
Arguments
dgm_name |
Character string specifying the DGM type |
method |
Which method(s) should be returned. The complete results are returned by setting to |
method_setting |
Which method setting(s) should be returned. The complete results are returned by setting to |
condition_id |
Which condition(s) should the results be returned for. |
repetition_id |
Which repetition should be returned. The complete
condition can be returned by setting to either |
Value
A data.frame
Examples
# get condition 1, repetition 1 for default method setting
retrieve_dgm_results("no_bias", condition_id = 1, repetition_id = 1)
# get condition 1, all repetitions for default method setting
retrieve_dgm_results("no_bias", condition_id = 1)
Generic method function for publication bias correction
Description
This function provides a unified interface to various publication bias
correction methods. The specific method is determined by the first argument.
See
vignette("Adding_New_Methods", package = "PublicationBiasBenchmark")
for details on extending the package with new methods.
Usage
run_method(method_name, data, settings = NULL, silent = FALSE)
Arguments
method_name |
Character string specifying the method type |
data |
Data frame containing yi (effect sizes) and sei (standard errors) |
settings |
Either a character string identifying a method version or a list containing method-specific settings. An empty input results in running the default (first implemented) version of the method. |
silent |
Logical indicating whether error messages from the method should be suppressed. |
Value
A data frame with standardized method results
Output Structure
The returned data frame follows a standardized schema that downstream functions rely on. All methods return the following columns:
- method (character): The name of the method used.
- estimate (numeric): The meta-analytic effect size estimate.
- standard_error (numeric): Standard error of the estimate.
- ci_lower (numeric): Lower bound of the 95% confidence interval.
- ci_upper (numeric): Upper bound of the 95% confidence interval.
- p_value (numeric): P-value for the estimate.
- BF (numeric): Bayes Factor for the estimate.
- convergence (logical): Whether the method converged successfully.
- note (character): Additional notes describing convergence issues.
Some methods may include additional method-specific columns beyond these
standard columns. Use get_method_extra_columns() to query which
additional columns a particular method returns.
Examples
# Example usage with RMA method
data <- data.frame(
yi = c(0.2, 0.3, 0.1, 0.4),
sei = c(0.1, 0.15, 0.08, 0.12)
)
result <- run_method("RMA", data, "default")
Simulate From Data-Generating Mechanism
Description
This function provides a unified interface to various data-generating
mechanisms for simulation studies. The specific DGM is determined by
the first argument. See
vignette("Adding_New_DGMs", package = "PublicationBiasBenchmark")
for details of extending the package with new DGMs.
Usage
simulate_dgm(dgm_name, settings)
Arguments
dgm_name |
Character string specifying the DGM type |
settings |
List containing the required parameters for the DGM or numeric condition_id |
Value
A data frame containing the generated data with standardized structure
Output Structure
The returned data frame follows a standardized schema that downstream functions rely on. Across the currently implemented DGMs, the following columns are used:
- yi (numeric): The effect size estimate.
- sei (numeric): Standard error of yi.
- ni (integer): Total sample size for the estimate (e.g., sum over groups where applicable).
- es_type (character): Effect size type, used to disambiguate the scale of yi. Currently used values are "SMD" (standardized mean difference / Cohen's d), "logOR" (log odds ratio), and "none" (unspecified generic continuous coefficient).
- study_id (integer/character, optional): Identifier of the primary study/cluster when a DGM yields multiple estimates per study (e.g., Alinaghi2018, PRE). If absent, each row is treated as an independent study.
See Also
validate_dgm_setting(),
dgm.Stanley2017(),
dgm.Alinaghi2018(),
dgm.Bom2019(),
dgm.Carter2019()
Examples
simulate_dgm("Carter2019", 1)
simulate_dgm("Carter2019", list(mean_effect = 0, effect_heterogeneity = 0,
bias = "high", QRP = "high", n_studies = 10))
simulate_dgm("Stanley2017", list(environment = "SMD", mean_effect = 0,
effect_heterogeneity = 0, bias = 0, n_studies = 5,
sample_sizes = c(32,64,125,250,500)))
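The standardized columns described above can be inspected on any simulated dataset; a short sketch, assuming the Carter2019 DGM returns the yi, sei, ni, and es_type columns:
# Inspect the standardized structure of a simulated dataset
dat <- simulate_dgm("Carter2019", 1)
str(dat[, c("yi", "sei", "ni", "es_type")])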
Upload Datasets of a DGM
Description
This function uploads datasets of a specified Data-Generating Mechanism (DGM) to the OSF repository at https://osf.io/exf3m/.
This is an internal function intended for the benchmark maintainer.
It requires OSF repository authentication (via osfr::osf_auth())
and repository access.
Usage
upload_dgm_datasets(dgm_name, overwrite = FALSE, progress = TRUE, max_try = 10)
upload_dgm_results(dgm_name, overwrite = FALSE, progress = TRUE, max_try = 10)
upload_dgm_measures(dgm_name, overwrite = TRUE, progress = TRUE, max_try = 10)
Arguments
dgm_name |
Character string specifying the name of the DGM dataset to upload. |
overwrite |
Logical indicating whether to overwrite existing files on OSF.
Defaults to |
progress |
Logical indicating whether to show progress while uploading files. Defaults to |
max_try |
Integer specifying how many times the function should attempt to reconnect to OSF upon failure. |
Value
TRUE if the upload was successful, otherwise an error is raised.
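A not-run sketch of the maintainer workflow, assuming OSF authentication has been configured via osfr::osf_auth() and write access to the repository is available:
## Not run:
# Maintainer-only: authenticate, then upload datasets, results, and measures
osfr::osf_auth()
upload_dgm_datasets("no_bias", overwrite = FALSE)
upload_dgm_results("no_bias")
upload_dgm_measures("no_bias")
## End(Not run)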
Validate DGM Settings
Description
This function validates the settings provided for a given Data Generating Mechanism (DGM).
Usage
validate_dgm_setting(dgm_name, settings)
Arguments
dgm_name |
Character string specifying the DGM type |
settings |
List containing the required parameters for the DGM or numeric condition_id |
Value
Returns TRUE if the settings are valid for the specified DGM; otherwise an error is raised.
Examples
validate_dgm_setting("Carter2019", list(mean_effect = 0,
effect_heterogeneity = 0, bias = "high",
QRP = "high", n_studies = 10))
validate_dgm_setting("Alinaghi2018", list(environment = "FE",
mean_effect = 0, bias = "positive"))
validate_dgm_setting("Stanley2017", list(environment = "SMD",
mean_effect = 0,
effect_heterogeneity = 0, bias = 0, n_studies = 5,
sample_sizes = c(32,64,125,250,500)))
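Because validate_dgm_setting() returns TRUE for valid settings, it can be used to guard a subsequent call to simulate_dgm(); a brief sketch:
# Validate the settings before simulating from the DGM
settings <- list(mean_effect = 0, effect_heterogeneity = 0,
                 bias = "high", QRP = "high", n_studies = 10)
if (isTRUE(validate_dgm_setting("Carter2019", settings))) {
  dat <- simulate_dgm("Carter2019", settings)
}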