Title: | Effect Estimates from All Models |
---|---|
Description: | Estimates and plots effect estimates from models with all possible combinations of a list of variables. It can be used for assessing treatment effects in clinical trials or risk factors in bio-medical and epidemiological research. Like Stata command 'confall' (Wang Z (2007) <doi:10.1177/1536867X0700700203> ), 'allestimates' calculates and stores all effect estimates, and plots them against p values or Akaike information criterion (AIC) values. It currently has functions for linear regression: all_lm(), logistic and Poisson regression: all_glm(), and Cox proportional hazards regression: all_cox(). |
Authors: | Zhiqiang Wang [aut, cre] |
Maintainer: | Zhiqiang Wang <[email protected]> |
License: | GPL-2 |
Version: | 0.2.3 |
Built: | 2024-10-25 04:17:26 UTC |
Source: | https://github.com/cran/allestimates |
Estimates hazard ratios using Proportional Hazards Regression models ("coxph" from survival package) from models with all possible combinations of a list of variables.
all_cox(crude, xlist, data, na_omit = TRUE, ...)
crude |
A formula for the initial model, generally the crude model. However, other variables can also be included here as part of the initial model. The left-hand side of ~ is the outcome of interest, and the variable on the right-hand side of ~ is the exposure of interest (either a treatment or a risk factor). |
xlist |
A vector of variable names. |
data |
Data frame. |
na_omit |
Remove all missing values. Default is TRUE. |
... |
Further optional arguments. |
A list of all effect estimates.
survival
vlist <- c("Age", "Sex", "Married", "BMI", "Education", "Income")
results <- all_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
results
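To make "all possible combinations" concrete, the following base R sketch enumerates every subset of the six candidate variables and builds one formula string per subset. It is only an illustration of the idea, not the package's internal code; the names `subsets` and `formulas` are hypothetical, and no model is fitted.

```r
# Sketch: enumerate every subset of the candidate covariates and build
# one model formula per subset. Base R only; nothing is fitted here.
crude <- "Surv(t0, t1, Endpoint) ~ Diabetes"
vlist <- c("Age", "Sex", "Married", "BMI", "Education", "Income")

# All non-empty subsets of vlist, from size 1 up to length(vlist).
subsets <- unlist(
  lapply(seq_along(vlist), function(k) combn(vlist, k, simplify = FALSE)),
  recursive = FALSE
)

# The crude formula first, then the crude formula plus each subset.
formulas <- c(
  crude,
  vapply(
    subsets,
    function(s) paste(crude, "+", paste(s, collapse = " + ")),
    character(1)
  )
)

length(formulas)        # 2^6 = 64 candidate formulas, crude model included
formulas[c(1, 2, 64)]   # crude, crude + Age, crude + all six variables
```

The number of candidate models doubles with each variable added to the list, so long lists quickly become expensive to fit.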
glm
all_glm estimates odds ratios or rate ratios using generalized linear models (glm) with all possible combinations of a list of variables (potential confounding factors).
all_glm(crude, xlist, data, family = "binomial", na_omit = TRUE, ...)
crude |
A formula for the initial model, generally the crude model. However, other variables can also be included here as part of the initial model. |
xlist |
A vector of variable names (potential confounding factors). |
data |
Data frame. |
family |
Description of the error distribution. Default is "binomial". |
na_omit |
Remove all missing values. Default is TRUE. |
... |
Further optional arguments. |
A list of all effect estimates.
stats
diab_df$Overweight <- as.numeric(diab_df$BMI >= 25)
vlist <- c("Age", "Sex", "Income")
all_glm(crude = "Diabetes ~ Overweight", xlist = vlist, data = diab_df)
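As a rough illustration of what a single entry in such a list represents, the sketch below fits one adjusted logistic model with base R's glm and converts the exposure coefficient to an odds ratio with a Wald confidence interval. The data are simulated (the `sim` data frame and its effect sizes are assumptions made for this example), and this is not necessarily how all_glm computes or labels its output.

```r
# Sketch: one logistic model and the odds ratio for the exposure.
# Simulated data; variable names mirror the example above.
set.seed(1)
n <- 500
sim <- data.frame(
  Overweight = rbinom(n, 1, 0.4),
  Age = round(rnorm(n, 50, 10))
)
# Outcome whose log-odds depend on both Overweight and Age.
sim$Diabetes <- rbinom(
  n, 1, plogis(-2 + 0.7 * sim$Overweight + 0.02 * (sim$Age - 50))
)

fit <- glm(Diabetes ~ Overweight + Age, family = "binomial", data = sim)
est <- summary(fit)$coefficients["Overweight", ]

# Exponentiate the log-odds coefficient to get the odds ratio,
# with a 95% Wald confidence interval.
or <- exp(est[["Estimate"]])
ci <- exp(est[["Estimate"]] + c(-1, 1) * qnorm(0.975) * est[["Std. Error"]])

c(OR = or, lower = ci[1], upper = ci[2], p = est[["Pr(>|z|)"]], AIC = AIC(fit))
```

With family = "poisson" the same exponentiated coefficient would be read as a rate ratio rather than an odds ratio.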
lm
all_lm estimates coefficients of a specific variable using linear models (lm) with all possible combinations of other variables (potential confounding factors).
all_lm(crude, xlist, data, na_omit = TRUE, ...)
crude |
A formula for the initial model, generally the crude model. However, additional variables can also be included here as part of the initial model. |
xlist |
A vector of variable names (potential confounding factors). |
data |
Data frame. |
na_omit |
Remove all missing values. Default is TRUE. |
... |
Further optional arguments. |
A list of all effect estimates.
lm
vlist <- c("Age", "Sex", "Cancer", "CVD", "Education", "Income")
all_lm(crude = "BMI ~ Married", xlist = vlist, data = diab_df)
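The sketch below shows, under stated assumptions, how the coefficient of one exposure can be collected from linear models fitted over every subset of adjustment variables. It uses simulated data and base R only; the helper `fit_one` and the column names of `estimates` are hypothetical and may differ from what all_lm returns.

```r
# Sketch: refit lm() for every subset of adjustment variables and keep
# the exposure coefficient, its p value, and the model AIC.
set.seed(42)
n <- 300
sim <- data.frame(
  Married = rbinom(n, 1, 0.5),
  Age = rnorm(n, 50, 10),
  Sex = rbinom(n, 1, 0.5),
  Income = rbinom(n, 1, 0.5)
)
sim$BMI <- 25 + 0.5 * sim$Married + 0.03 * sim$Age + rnorm(n, sd = 3)

exposure <- "Married"
vlist <- c("Age", "Sex", "Income")

# Every subset of vlist, starting with the empty set (the crude model).
subsets <- c(
  list(character(0)),
  unlist(
    lapply(seq_along(vlist), function(k) combn(vlist, k, simplify = FALSE)),
    recursive = FALSE
  )
)

# Fit one model and return a one-row summary for the exposure.
fit_one <- function(s) {
  f <- reformulate(c(exposure, s), response = "BMI")
  fit <- lm(f, data = sim)
  co <- summary(fit)$coefficients[exposure, ]
  data.frame(
    variables = if (length(s)) paste(s, collapse = ", ") else "Crude",
    estimate = co[["Estimate"]],
    p = co[["Pr(>|t|)"]],
    aic = AIC(fit)
  )
}

estimates <- do.call(rbind, lapply(subsets, fit_one))
estimates   # 2^3 = 8 rows, one per model
```

Comparing the `estimate` column across rows gives a quick sense of how sensitive the exposure coefficient is to the choice of adjustment variables.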
all_plot() generates a scatter plot with effect estimates of all possible models against p values.
all_plot( data, xlabels = c(0, 0.001, 0.01, 0.05, 0.2, 0.5, 1), xlim = c(0, 1), xlab = "P value", ylim = NULL, ylab = NULL, yscale_log = FALSE, title = NULL )
data |
Object from |
xlabels |
Numeric vector of x-axis tick labels. Default is c(0, 0.001, 0.01, 0.05, 0.2, 0.5, 1). |
xlim |
Vector of 2 numeric values for x-axis limits. Default is c(0, 1). |
xlab |
Character string for x-axis name. Default is "P value". |
ylim |
Vector of 2 numeric values for y-axis limits. |
ylab |
Character string for y-axis name. Default depends on original model types. |
yscale_log |
TRUE or FALSE to re-scale y-axis to "log10". Default is FALSE. |
title |
Character string for plot title. Default is NULL. |
A ggplot2 object: scatter plot
vlist <- c("Age", "Sex", "Married", "BMI", "Education", "Income")
results <- all_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
all_plot(results)
all_plot_aic() generates a scatter plot with all effect estimates against AIC.
all_plot_aic(data, xlab = "AIC", ylab = NULL, title = NULL)
data |
Object from |
xlab |
Character string for x-axis name. Default is "AIC". |
ylab |
Character string for y-axis name. Default depends on original model types. |
title |
Character string for plot title. Default is NULL. |
A ggplot2 object: scatter plot
vlist <- c("Age", "Sex", "Married", "BMI", "Education", "Income")
results <- all_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
all_plot_aic(results)
all_plot_aic2() draws multiple scatter plots of all effect estimates against AIC. Each plot indicates if a specific variable is included in the models.
all_plot_aic2(data, xlab = "AIC", ylab = NULL, title = NULL)
data |
Object from |
xlab |
Character string for x-axis name. Default is "AIC". |
ylab |
Character string for y-axis name. Default depends on original model types. |
title |
Character string for plot title. Default is NULL. |
A ggplot2 object: scatter plot.
vlist <- c("Age", "Sex", "Married", "BMI", "Education", "Income")
results <- all_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
all_plot_aic2(data = results)
all_plot2() generates a panel of scatter plots with effect estimates of all possible models against p values. Each plot includes effect estimates from all models including a specific variable.
all_plot2( data, xlabels = c(0, 0.001, 0.01, 0.05, 0.2, 0.5, 1), xlim = c(0, 1), xlab = "P value", ylim = NULL, ylab = NULL, yscale_log = FALSE, title = NULL )
data |
Object from |
xlabels |
Numeric vector of x-axis tick labels. Default is c(0, 0.001, 0.01, 0.05, 0.2, 0.5, 1). |
xlim |
Vector of 2 numeric values for x-axis limits. Default is c(0, 1). |
xlab |
Character string for x-axis name. Default is "P value". |
ylim |
Vector of 2 numeric values for y-axis limits. |
ylab |
Character string for y-axis name. Default depends on original model types. |
yscale_log |
TRUE or FALSE to re-scale y-axis to "log10". Default is FALSE. |
title |
Character string for plot title. Default is NULL. |
A ggplot2 object: scatter plot
vlist <- c("Age", "Sex", "Married", "BMI", "Income")
results <- all_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
all_plot2(results)
To assess treatment effects in clinical trials and risk factors in bio-medical and epidemiological research, we use regression coefficients, odds ratios or hazard ratios as effect estimates. allestimates allows users to quickly obtain effect estimates from models with all possible combinations of a list of variables specified by users. It provides all_lm for linear regression, all_glm for logistic regression, all_speedglm (using speedlm) as a faster alternative to all_glm, and all_cox for Cox Proportional Hazards Models. Users can further use the values in the returned list of results.
all_plot draws scatter plots of all effect estimates against p values, as the Stata confall command does (Wang Z (2007) <doi:10.1177/1536867X0700700203>).
Those plots divide estimates into four categories:
positive and significant: left-top quarter
negative and significant: left-bottom quarter
positive and non-significant: right-top quarter
negative and non-significant: right-bottom quarter
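The four categories can be reproduced from a table of estimates and p values. The sketch below is a minimal illustration under assumptions that the text does not state: a significance threshold of 0.05, and "positive" read as an estimate above the null value (1 on a ratio scale, 0 for linear regression coefficients). The `est` data frame holds made-up values.

```r
# Sketch: label each estimate with the quadrant it would fall in.
# Hypothetical estimates on a ratio scale (e.g. hazard ratios).
est <- data.frame(
  estimate = c(1.8, 0.6, 1.2, 0.9),
  p = c(0.003, 0.01, 0.40, 0.75)
)
alpha <- 0.05     # assumed significance threshold
null_value <- 1   # 1 for ratios; use 0 for linear regression coefficients

est$direction <- ifelse(est$estimate > null_value, "positive", "negative")
est$significance <- ifelse(est$p < alpha, "significant", "non-significant")
est$category <- paste(est$direction, "and", est$significance)

est[, c("estimate", "p", "category")]
```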
all_plot2 draws multiple plots. Each of those plots indicates whether a specific variable is included or not included in the models.
These effect estimates help users better understand confounding effects, the uncertainty of their estimates, and the consequences of inappropriately including variables in the models. This is a tool for calculating and exploring effect estimates from all possible models. Interpretation of the results should be in the context of other analyses and biological knowledge.
?all_speedglm
?all_glm
?all_cox
?all_lm
?all_plot
?all_plot2
A data frame with 2372 rows and 14 variables with diabetes status diabetes and mortality status endpoint. For the purpose of demonstration, assume that we are interested in the association between diabetes and endpoint. Other variables are considered possible confounding factors. The purpose of this dataset is only to illustrate the functions in the chest and allestimates packages. Therefore, we assume it is a cohort design for Cox Proportional Hazards regression, and a case-control design for logistic regression.
diab_df
A data frame with 2372 rows and 14 variables:
diabetes status, 1: with diabetes, 0: without diabetes
mortality status, 1: reached end point, 0: survived
age, in years
sex, 1: male, 2: female
body mass index
marital status, 1: married, 0: not married
smoking status, 1: smoker, 0: non-smoker
cardiovascular disease, 1: yes, 0: no
cancer, 1: yes, 0: no
education, 1: high, 0: low
income, 1: high, 0: low
time (age) at the start of the follow-up
time (age) at the end of the follow-up
matched set id, for conditional logistic regression