---
title: "Change-in-estimate Approach: Assessing Confounding Effects"
output: rmarkdown::html_vignette
author: "Zhiqiang Wang"
vignette: >
  %\VignetteIndexEntry{chest-vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

*Badges* Confounding; Logistic regression; Cox proportional hazards model; Linear regression;

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

The _'chest'_ package systematically calculates and compares effect estimates from models with different combinations of variables. It calculates the change in the effect estimate as variables are added to the model sequentially in a step-wise fashion. Depending on the modelling method, the effect estimates can be regression coefficients, odds ratios or hazard ratios. At each step, only the variable that causes the largest change among the remaining variables is added to the model. The results from all fitted models are summarized in one graph and one data frame. This approach can be used for assessing confounding effects in epidemiological studies and biomedical research, including clinical trials.

## Installation

You can install the released version of chest from [CRAN](https://CRAN.R-project.org) with:

```{r, eval=FALSE}
install.packages("chest")
```

## Getting Started

```{r setup}
library(chest)
library(ggplot2)
names(diab_df)
```

### Data: diabetes and mortality

The data frame 'diab_df' is used to examine the association between `Diabetes` and the mortality outcome `Endpoint`. The data set is included to demonstrate the functions in this package rather than to answer any research question.

### chest_glm: Logistic regression using generalized linear models ('glm')

'chest_glm' is slow. We can use `indicate = TRUE` to monitor the progress.

```{r, chest_glm, eval=FALSE}
vlist <- c("Age", "Sex", "Married", "Smoke", "Education")
results <- chest_glm(crude = "Endpoint ~ Diabetes", xlist = vlist, data = diab_df, indicate = TRUE)
```

### chest_cox: Using Cox Proportional Hazards Models: 'coxph' of 'survival' package

```{r, coxhp, fig.height=5, fig.width = 6.4}
vlist <- c("Age", "Sex", "Married", "Smoke", "Education")
results <- chest_cox(crude = "Surv(t0, t1, Endpoint) ~ Diabetes", xlist = vlist, data = diab_df)
chest_plot(results)
chest_forest(results)
```

### chest_clogit: Conditional logistic regression: 'clogit' of 'survival' package

```{r, clogit, fig.height=3.2, fig.width = 6.4}
results <- chest_clogit(crude = "Endpoint ~ Diabetes + strata(mid)", xlist = vlist, data = diab_df)
```

### chest_lm: linear regression

```{r, lm, fig.height=6, fig.width = 6.4}
vlist <- c("Age", "Sex", "Married", "Cancer", "CVD", "Education", "Income")
results <- chest_lm(crude = "BMI ~ Diabetes", xlist = vlist, data = diab_df)
chest_plot(results)
```

## Notes:

* Because 'chest' fits many models and compares effect estimates, some analyses may take a long time to complete.
* Possible alternative explanations: Although a large change suggests the presence of possible confounding effects, we also need to keep in mind alternative explanations.
    + Different sample sizes: When models are fitted to different numbers of observations because of missing values, this may also contribute to the change in effect estimates. The change can then partly reflect selection bias. Removing all observations with missing values before fitting can help distinguish the two.
    + Intermediate measurements: Some variables in observational studies may be intermediate factors on the causal pathway. This is more a design issue than an analysis issue. The package can be used to identify changes in effect estimates but cannot distinguish confounding factors from intermediate factors.
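
The note on different sample sizes can be handled by restricting the data to complete cases before any model is fitted. The chunk below is a minimal sketch using base R only. The helper `complete_subset()` and the data frame `toy` are hypothetical names introduced for illustration; they are not part of 'chest', and the toy values are made up.

```{r, complete_cases}
# Hypothetical helper (not part of 'chest'): keep only the rows with no
# missing values in any variable named in the crude formula or in xlist.
complete_subset <- function(data, crude, xlist) {
  # all.vars() extracts the variable names from the crude model formula
  vars <- unique(c(all.vars(as.formula(crude)), xlist))
  data[complete.cases(data[, vars, drop = FALSE]), , drop = FALSE]
}

# Toy data, made up purely for illustration
toy <- data.frame(
  BMI      = c(24.1, 31.5, NA, 27.8, 29.0),
  Diabetes = c(0, 1, 0, 1, 0),
  Age      = c(45, 60, 52, NA, 38),
  Sex      = c("F", "M", "F", "M", "F")
)

toy_complete <- complete_subset(toy, crude = "BMI ~ Diabetes",
                                xlist = c("Age", "Sex"))

# Compare the number of rows before and after the restriction
nrow(toy)
nrow(toy_complete)
```

The same idea should carry over to 'diab_df': passing the restricted data frame as `data` means every model in the sequence is fitted to the same rows, so a change in the estimate is not driven by a change in sample size.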