Canadareg <-read.csv("~/Desktop/Canada.csv",header=TRUE) #browse to select data "Canada.csv"
male <- ifelse(Canadareg$ITSEX==2,0,1)
Canadareg <- cbind(Canadareg,male)
Canadareg[Canadareg==999999]=NA
Canadareg <- na.omit(Canadareg)
CanadaBMA <- subset(Canadareg, select=c(ASRREA01,male,ASBG04,ASBGSBS,ASBGSMR,ASBGSCR,
ASBR05E,ASBR05F,ASBR05G))
male (1=male)
ASBG04 = # of books in home 1) 0-10, 2) 11-25, 3) 26-100, 4) 101-200 5) More than 200
ASBGSBS = Bullying/teasing at school (higher values mean less bullying/teasing)
ASBGSMR = Students motivated to read (4pt, strongly agree to strongly disagree)
ASBGSCR = Students confidence in their reading (")
ASBR05E = Teacher is easy to understand (")
ASBR05F = Interested in what teacher says (")
ASBR05G = Teacher gives interesting things to read (")
ASRREA01 is the first plausible value reading score
y <- CanadaBMA[,1]
x <- CanadaBMA[,2:9]
BMA
using bicreg
and get plots. The analysis assumes unit information priors
on the parameters and a uniform
prior on the model space.CanadaBMA1 <- bicreg(x,y)
summary(CanadaBMA1)
##
## Call:
## bicreg(x = x, y = y)
##
##
## 5 models were selected
## Best 5 models (cumulative posterior probability = 1 ):
##
## p!=0 EV SD model 1 model 2 model 3
## Intercept 100.0 399.9802 17.2465 400.076 399.998 392.000
## male 9.0 0.5434 2.0692 . 6.006 .
## ASBG04 100.0 8.8294 1.9433 8.817 8.969 8.803
## ASBGSBS 100.0 4.4090 1.1079 4.422 4.247 4.314
## ASBGSMR 100.0 -5.7355 1.0745 -5.754 -5.782 -5.879
## ASBGSCR 100.0 13.1241 1.0580 13.143 13.022 12.927
## ASBR05E 5.3 0.1873 1.0482 . . 3.545
## ASBR05F 4.1 -0.1153 0.8138 . . .
## ASBR05G 4.0 -0.1001 0.7276 . . .
##
## nVar 4 5 5
## r2 0.207 0.209 0.208
## BIC -180.823 -176.526 -175.450
## post prob 0.775 0.090 0.053
## model 4 model 5
## Intercept 404.301 404.171
## male . .
## ASBG04 8.819 8.799
## ASBGSBS 4.544 4.502
## ASBGSMR -5.383 -5.443
## ASBGSCR 13.183 13.191
## ASBR05E . .
## ASBR05F -2.799 .
## ASBR05G . -2.499
##
## nVar 5 5
## r2 0.208 0.208
## BIC -174.952 -174.895
## post prob 0.041 0.040
Notice that 5 models are retained using the Occam’s window criterion. Note that the BICs are all roughly the same, indicating very little difference in the choice among the competing models. However, note that the model with the lowest BIC measure obtains a posterior model probability of 0.775, suggesting some model uncertainty. This model uncertainity is not accounted for if one were to choose only a single model to report. Averaging across all of these models takes into account model uncertainty.
The column labeled EV
provides the model averaged coefficients, summed and weighted by the respective posterior model probabilities. The column labeled p! = 0
, gives the posterior inclusion probability
indicating the percentage of times that the variable appears in the models. So, e.g. male
only appears in 9% of the models while ASBG04
appears across all models. This is an alternative measure of the importance of a variable in the context of model uncertainty.
These are plots of the posterior distribuion of the coefficients produced by model averaging. The posterior probability that the coefficient is zero is represented by a solid line at zero, with heightequal to the probability. The nonzero part of the distribution is scaled so that the maximum height is equal to the probability that the coefficient is nonzero.