R - Adding a column for the category of glm coefficients in broom results
Is there a way to add a column to the result of the broom package's tidy function that relates the term column back to both the original names used in the formula argument and the columns in the data argument?
For example, if I run the following I get:
library(broom)  # provides tidy()
library(dplyr)

mod <- glm(mpg ~ wt + qsec + as.factor(carb), data = mtcars)
tidy(mod)
#               term     estimate std.error   statistic      p.value
# 1      (Intercept) 21.132995090 7.5756463  2.78959633 1.017187e-02
# 2               wt -4.916303175 0.6747590 -7.28601380 1.584408e-07
# 3             qsec  0.843355538 0.3930252  2.14580532 4.221188e-02
# 4 as.factor(carb)2  0.004133826 1.5321134  0.00269812 9.978695e-01
# 5 as.factor(carb)3 -0.755346006 2.3451222 -0.32209239 7.501715e-01
# 6 as.factor(carb)4 -0.489721798 2.0628564 -0.23739985 8.143615e-01
# 7 as.factor(carb)6 -0.886846134 3.4443957 -0.25747510 7.990068e-01
# 8 as.factor(carb)8 -0.894783610 3.7496630 -0.23863041 8.134180e-01
What I am looking for is this:
#               term     estimate std.error   statistic      p.value term_base
# 1      (Intercept) 21.132995090 7.5756463  2.78959633 1.017187e-02
# 2               wt -4.916303175 0.6747590 -7.28601380 1.584408e-07        wt
# 3             qsec  0.843355538 0.3930252  2.14580532 4.221188e-02      qsec
# 4 as.factor(carb)2  0.004133826 1.5321134  0.00269812 9.978695e-01      carb
# 5 as.factor(carb)3 -0.755346006 2.3451222 -0.32209239 7.501715e-01      carb
# 6 as.factor(carb)4 -0.489721798 2.0628564 -0.23739985 8.143615e-01      carb
# 7 as.factor(carb)6 -0.886846134 3.4443957 -0.25747510 7.990068e-01      carb
# 8 as.factor(carb)8 -0.894783610 3.7496630 -0.23863041 8.134180e-01      carb
I am not bothered whether the first row of the new column is empty, Intercept or 1. I just need a way to match the term column to the original variable names passed to the formula.
Edit
Ideally the solution would not depend on using as.factor in the formula, e.g. it should also work on:
mod <- glm(mpg ~ wt + qsec + carb, data = mtcars %>% mutate(carb = factor(carb)))
tidy(mod)
#          term     estimate std.error   statistic      p.value
# 1 (Intercept) 21.132995090 7.5756463  2.78959633 1.017187e-02
# 2          wt -4.916303175 0.6747590 -7.28601380 1.584408e-07
# 3        qsec  0.843355538 0.3930252  2.14580532 4.221188e-02
# 4       carb2  0.004133826 1.5321134  0.00269812 9.978695e-01
# 5       carb3 -0.755346006 2.3451222 -0.32209239 7.501715e-01
# 6       carb4 -0.489721798 2.0628564 -0.23739985 8.143615e-01
# 7       carb6 -0.886846134 3.4443957 -0.25747510 7.990068e-01
# 8       carb8 -0.894783610 3.7496630 -0.23863041 8.134180e-01
We can use a regex to create the 'term_base' column:
# Keep only the text inside the parentheses, then blank out the intercept
tidy(mod) %>%
  mutate(term_base = sub("Intercept", "", gsub(".*\\(|\\).*", "", term)))
#               term     estimate std.error   statistic      p.value term_base
# 1      (Intercept) 21.132995090 7.5756463  2.78959633 1.017187e-02
# 2               wt -4.916303175 0.6747590 -7.28601380 1.584408e-07        wt
# 3             qsec  0.843355538 0.3930252  2.14580532 4.221188e-02      qsec
# 4 as.factor(carb)2  0.004133826 1.5321134  0.00269812 9.978695e-01      carb
# 5 as.factor(carb)3 -0.755346006 2.3451222 -0.32209239 7.501715e-01      carb
# 6 as.factor(carb)4 -0.489721798 2.0628564 -0.23739985 8.143615e-01      carb
# 7 as.factor(carb)6 -0.886846134 3.4443957 -0.25747510 7.990068e-01      carb
# 8 as.factor(carb)8 -0.894783610 3.7496630 -0.23863041 8.134180e-01      carb
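To see what the two substitutions do, here is a small sketch that applies them to a plain character vector instead of the tidy output. The vector `terms_seen` is made up for illustration; it mixes terms from the as.factor model with one term ("carb2") from a model fitted on a pre-converted factor.

```r
# Illustrative term names only; not taken from a fitted model
terms_seen <- c("(Intercept)", "wt", "qsec", "as.factor(carb)2", "carb2")

# Step 1: drop everything up to "(" and everything from ")" onwards
inner <- gsub(".*\\(|\\).*", "", terms_seen)

# Step 2: blank out the intercept label
term_base <- sub("Intercept", "", inner)
print(term_base)
```

Terms without parentheses pass through unchanged, so "carb2" keeps its level suffix. This pattern therefore relies on the factor being wrapped in a function call inside the formula.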
The as.factor can be removed from 'term' if we mutate 'carb' to a factor before the glm step:
mtcars %>%
  mutate(carb = factor(carb)) %>%
  glm(formula = mpg ~ wt + qsec + carb, data = .) %>%
  tidy(.) %>%
  mutate(term_base = sub("\\(.*\\)|\\d+", "", term))
#          term     estimate std.error   statistic      p.value term_base
# 1 (Intercept) 21.132995090 7.5756463  2.78959633 1.017187e-02
# 2          wt -4.916303175 0.6747590 -7.28601380 1.584408e-07        wt
# 3        qsec  0.843355538 0.3930252  2.14580532 4.221188e-02      qsec
# 4       carb2  0.004133826 1.5321134  0.00269812 9.978695e-01      carb
# 5       carb3 -0.755346006 2.3451222 -0.32209239 7.501715e-01      carb
# 6       carb4 -0.489721798 2.0628564 -0.23739985 8.143615e-01      carb
# 7       carb6 -0.886846134 3.4443957 -0.25747510 7.990068e-01      carb
# 8       carb8 -0.894783610 3.7496630 -0.23863041 8.134180e-01      carb
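Both regexes depend on how the term names happen to be spelled (the second would also strip digits from a variable named, say, x1). As a possible alternative, here is a minimal sketch that reads the mapping from the fitted model itself, using the "assign" attribute of the model matrix and the term labels of the formula. The helper name `term_lookup` is hypothetical and not part of broom. The sketch assumes the rows of tidy() line up with the columns of the model matrix, i.e. there are no aliased coefficients.

```r
# Hypothetical helper (not part of broom): one row per coefficient, with the
# formula term it came from and the data column(s) that term refers to.
term_lookup <- function(mod) {
  mm     <- model.matrix(mod)
  labels <- attr(terms(mod), "term.labels")  # e.g. "wt", "as.factor(carb)"
  # Variables used inside each label, e.g. "as.factor(carb)" -> "carb"
  vars <- vapply(
    labels,
    function(l) paste(all.vars(parse(text = l)), collapse = ":"),
    character(1)
  )
  # "assign" is 0 for the intercept and k for the k-th term label
  idx <- attr(mm, "assign") + 1
  data.frame(
    term       = colnames(mm),
    term_label = c("", labels)[idx],
    term_base  = c("", unname(vars))[idx],
    stringsAsFactors = FALSE
  )
}

# Works with as.factor() in the formula and with a pre-converted factor
mod1 <- glm(mpg ~ wt + qsec + as.factor(carb), data = mtcars)
mod2 <- glm(mpg ~ wt + qsec + carb, data = transform(mtcars, carb = factor(carb)))

lookup1 <- term_lookup(mod1)
lookup2 <- term_lookup(mod2)
print(lookup1)
```

With broom and dplyr loaded, the lookup could then be attached with something like tidy(mod1) %>% left_join(term_lookup(mod1), by = "term"), assuming the term names in the tidy output match the coefficient names.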