Practice: Chp. 13

source("iplot.R")
suppressPackageStartupMessages(library(rethinking))

13E1. Which of the following priors will produce more shrinkage in the estimates?

(a) \(\alpha_{tank} \sim Normal(0,1)\)

(b) \(\alpha_{tank} \sim Normal(0,2)\)

Answer. We could interpret this question in two ways. The first is that the author meant to write:

(a) \(\bar{\alpha} \sim Normal(0,1)\)

(b) \(\bar{\alpha} \sim Normal(0,2)\)

That is, interpretation #1 is that these are priors for the hyperparameter \(\bar{\alpha}\), the average across all tanks, with no new prior specified for the hyperparameter \(\sigma\), the variability among tanks.

The second interpretation is that the author is expressing, in shorthand, new priors for both \(\bar{\alpha}\) and \(\sigma\), giving only the expected value of each and omitting the standard deviation. To use the same terms as the chapter:

(a) \(\alpha_{tank} \sim Normal(\bar{\alpha},\sigma), \bar{\alpha} \sim Normal(0,?), \sigma \sim Exponential(1)\)

(b) \(\alpha_{tank} \sim Normal(\bar{\alpha},\sigma), \bar{\alpha} \sim Normal(0,?), \sigma \sim Exponential(\frac{1}{2})\)

Notice we decrease the rate parameter \(\lambda\) of the exponential distribution to increase its expected value.
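For reference, an exponential prior’s mean is the reciprocal of its rate, so under this interpretation prior (b) expects twice as much variability among tanks:

\[ \operatorname{E}[\sigma] = \frac{1}{\lambda}, \qquad Exponential(1) \Rightarrow \operatorname{E}[\sigma] = 1, \qquad Exponential\left(\tfrac{1}{2}\right) \Rightarrow \operatorname{E}[\sigma] = 2 \]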

Said another way, the first interpretation is that the author meant to specify this prior at the second level of the model, and the second interpretation is that he meant to specify it at the first level. The first interpretation is plausible because this prior already shows up directly in models in the chapter, with little reinterpretation needed. The second interpretation is supported by the author’s choice of variable name (e.g. see the chapter and question 13M3). See also question 13M5.

In both interpretations, prior (a) will produce more shrinkage because it is more regularizing. It has a smaller standard deviation and therefore expects less variability.
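To see the difference concretely, here is a small prior-predictive sketch (not part of the original question) comparing the tank survival probabilities implied by the two priors as literally written; p_narrow and p_wide are names made up for this sketch:

set.seed(13)
p_narrow <- inv_logit(rnorm(1e4, 0, 1)) # prior (a)
p_wide <- inv_logit(rnorm(1e4, 0, 2))   # prior (b)

iplot(function() {
  # The Normal(0, 2) prior piles mass near 0 and 1, expecting extreme tanks;
  # the Normal(0, 1) prior keeps tanks closer to 0.5, i.e. more shrinkage
  # toward the overall mean.
  plot(density(p_narrow), xlim = c(0, 1), lwd = 2,
    main = "Implied prior on survival probability", xlab = "probability"
  )
  lines(density(p_wide), lty = 2, lwd = 2)
  legend("top", legend = c("Normal(0, 1)", "Normal(0, 2)"), lty = c(1, 2), lwd = 2)
})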

13E2. Rewrite the following model as a multilevel model.

\[\begin{split} \begin{align} y_i & \sim Binomial(1,p_i) \\ logit(p_i) & = \alpha_{group[i]} + \beta x_i \\ \alpha_{group} & \sim Normal(0, 1.5) \\ \beta & \sim Normal(0, 0.5) \end{align} \end{split}\]

Answer. The focus of this chapter is on varying intercepts rather than varying effects (see the next chapter), so we’ll only convert the intercept:

\[\begin{split} \begin{align} y_i & \sim Binomial(1,p_i) \\ logit(p_i) & = \alpha_{group[i]} + \beta x_i \\ \alpha_{group} & \sim Normal(\bar{\alpha}, \sigma) \\ \bar{\alpha} & \sim Normal(0, 1.5) \\ \sigma & \sim Exponential(1) \\ \beta & \sim Normal(0, 0.5) \end{align} \end{split}\]
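As a sketch of how this multilevel version could be fit (not part of the question; the simulated data below is arbitrary, and the names sim_dat and m_13e2_sketch are made up just to show the structure):

set.seed(13)
## Simulated stand-in data: binary outcome y, predictor x, 8 groups of 20.
sim_dat <- list(
  y = rbinom(160, size = 1, prob = 0.5),
  x = rnorm(160),
  group = rep(1:8, each = 20)
)
m_13e2_sketch <- ulam(
  alist(
    y ~ dbinom(1, p),
    logit(p) <- a[group] + b * x,
    a[group] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1),
    b ~ dnorm(0, 0.5)
  ),
  data = sim_dat, chains = 4, cores = 4
)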

13E3. Rewrite the following model as a multilevel model.

\[\begin{split} \begin{align} y_i & \sim Normal(\mu_i,\sigma) \\ \mu_i & = \alpha_{group[i]} + \beta x_i \\ \alpha_{group} & \sim Normal(0, 5) \\ \beta & \sim Normal(0, 1) \\ \sigma & \sim Exponential(1) \end{align} \end{split}\]

Answer. As in the last question, we’ll only convert the intercept:

\[\begin{split} \begin{align} y_i & \sim Normal(\mu_i,\sigma_1) \\ \mu_i & = \alpha_{group[i]} + \beta x_i \\ \alpha_{group} & \sim Normal(\bar{\alpha}, \sigma_2) \\ \bar{\alpha} & \sim Normal(0, 5) \\ \sigma_2 & \sim Exponential\left(\frac{1}{2}\right) \\ \beta & \sim Normal(0, 1) \\ \sigma_1 & \sim Exponential(1) \end{align} \end{split}\]

13E4. Write a mathematical model formula for a Poisson regression with varying intercepts.

\[\begin{split} \begin{align} y_i & \sim Poisson(\lambda_i) \\ log(\lambda_i) & = \alpha_{group[i]} \\ \alpha_{group} & \sim Normal(\bar{\alpha}, \sigma) \\ \bar{\alpha} & \sim Normal(0, 5) \\ \sigma & \sim Exponential(1) \end{align} \end{split}\]

13E5. Write a mathematical model formula for a Poisson regression with two different kinds of varying intercepts, a cross-classified model.

\[\begin{split} \begin{align} y_i & \sim Poisson(\lambda_i) \\ log(\lambda_i) & = \alpha_{x[i]} + \alpha_{y[i]} \\ \alpha_{x} & \sim Normal(\mu_x, \sigma_x) \\ \mu_x & \sim Normal(0, 5) \\ \sigma_x & \sim Exponential(1) \\ \alpha_{y} & \sim Normal(0, \sigma_y) \\ \sigma_y & \sim Exponential(1) \end{align} \end{split}\]
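A sketch of how this cross-classified model could be fit with ulam (simulated stand-in data; the names sim_cc, cid, did, and m_13e5_sketch are made up for this sketch). As in the formula above, only one kind of cluster gets a free grand mean, which keeps the intercepts identifiable:

set.seed(13)
## Simulated stand-in data: counts cross-classified by two kinds of clusters.
sim_cc <- list(
  y = rpois(60, lambda = 3),
  cid = rep(1:6, each = 10),   # first kind of cluster
  did = rep(1:10, times = 6)   # second kind of cluster
)
m_13e5_sketch <- ulam(
  alist(
    y ~ dpois(lambda),
    log(lambda) <- a[cid] + g[did],
    a[cid] ~ dnorm(a_bar, sigma_a),
    a_bar ~ dnorm(0, 5),
    sigma_a ~ dexp(1),
    g[did] ~ dnorm(0, sigma_g),
    sigma_g ~ dexp(1)
  ),
  data = sim_cc, chains = 4, cores = 4
)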

13M1. Revisit the Reed frog survival data, data(reedfrogs), and add the predation and size treatment variables to the varying intercept model. Consider models with either main effect alone, both main effects, as well as a model including both and their interaction. Instead of focusing on inferences about these two predictor variables, focus on the inferred variation across tanks. Explain why it changes as it does across models.

ERROR. The predation predictor is actually labeled pred.

ERROR. The author duplicated this question in 13H4, apparently without realizing it; the two versions differ primarily in wording.

Compare the second sentences:

Consider models with either main effect alone, both main effects, as well as a model including both and their interaction.

Consider models with either predictor alone, both predictors, as well as a model including their interaction.

One sentence only exists in 13H4:

What do you infer about the causal inference of these predictor variables?

Compare the last two sentences:

Instead of focusing on inferences about these two predictor variables, focus on the inferred variation across tanks. Explain why it changes as it does across models.

Also focus on the inferred variation across tanks (the \(\sigma\) across tanks). Explain why it changes as it does across models with different predictors included.

We could treat 13H4 as a separate question that expands on 13M1 by adding causal inference, but this answer will combine the two.

Answer. First, let’s reproduce results from model m13.2 in the chapter:

data(reedfrogs)
rf_df <- reedfrogs
rf_df$tank <- 1:nrow(rf_df)

rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank
)

## R code 13.3
m13.2 <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank],
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE
)

iplot(function() {
  plot(precis(m13.2, depth=2), main='m13.2')
}, ar=1.0)
../_images/practice-13_3_0.png

Raw data (preceding plot):

display(precis(m13.2, depth = 2), mimetypes="text/plain")

rf_df$Predator <- as.integer(as.factor(rf_df$pred))
rf_df$Size <- as.integer(as.factor(rf_df$size))
# Treatment index: 1 = no pred/big, 2 = pred/big, 3 = no pred/small, 4 = pred/small
rf_df$Treatment <- 1 + ifelse(rf_df$Predator == 1, 0, 1) + 2*ifelse(rf_df$Size == 1, 0, 1)
      mean         sd        5.5%        94.5%       n_eff    Rhat4    
a[1]   2.123895005 0.8800029  0.88324287  3.60357846 3135.042 0.9993171
a[2]   3.069417613 1.0879816  1.53213514  4.97198488 2880.482 0.9991000
a[3]   1.025435079 0.6859747 -0.01876991  2.11478208 4086.947 0.9992862
a[4]   3.045003704 1.0904417  1.44463933  4.93253177 3026.895 0.9986501
a[5]   2.133584064 0.8865480  0.82569033  3.69185945 3242.085 0.9996229
a[6]   2.129875960 0.8622741  0.85550909  3.51833473 4101.535 0.9983410
a[7]   3.042309196 1.0803602  1.50263737  4.89527573 3300.782 0.9992896
a[8]   2.143396095 0.8774313  0.87912951  3.65976593 4755.805 0.9987193
a[9]  -0.181523811 0.6239130 -1.20279368  0.81605579 3355.371 0.9989467
a[10]  2.160446881 0.8868395  0.90286995  3.69858829 2919.991 1.0002489
a[11]  0.982944595 0.6409224 -0.01787153  2.03912637 4340.708 0.9989579
a[12]  0.587181118 0.6201083 -0.37215354  1.60414730 5487.450 0.9985093
a[13]  1.013060494 0.6513388  0.04314274  2.09267998 4561.709 0.9991331
a[14]  0.208921393 0.5888650 -0.77247363  1.13103679 3773.708 0.9990913
a[15]  2.112968097 0.8671883  0.85145433  3.61051924 4976.691 0.9991278
a[16]  2.153449499 0.9001026  0.84944705  3.64511067 3623.187 0.9985997
a[17]  2.922906230 0.8207311  1.72786343  4.31192056 2727.910 1.0017675
a[18]  2.391108593 0.6594986  1.42438049  3.46904519 3825.827 0.9988642
a[19]  2.016368646 0.5883191  1.14001978  3.00910079 4198.075 0.9993717
a[20]  3.696619684 1.0349830  2.22233524  5.54831067 2697.947 0.9997244
a[21]  2.392176052 0.6338009  1.49698276  3.49281857 3195.078 0.9992764
a[22]  2.380392651 0.6348463  1.44870379  3.44691853 3352.341 0.9986587
a[23]  2.387766858 0.6677886  1.38891610  3.52825066 3481.055 0.9992638
a[24]  1.707655898 0.5241377  0.91766718  2.60417300 5142.443 0.9981182
a[25] -0.991016689 0.4439675 -1.72522983 -0.28233503 4224.010 0.9982751
a[26]  0.162679937 0.3803444 -0.44872939  0.76593787 5110.381 0.9991206
a[27] -1.428071502 0.4897130 -2.22955461 -0.68573472 3306.490 0.9985784
a[28] -0.470549934 0.4166839 -1.15849296  0.19914758 4359.714 0.9992520
a[29]  0.156650932 0.4082734 -0.48325866  0.81498354 3888.688 0.9992465
a[30]  1.441025015 0.4863160  0.70097745  2.24540124 3645.397 0.9990923
a[31] -0.633425518 0.4092424 -1.29825277  0.01788939 3807.050 0.9988602
a[32] -0.307137176 0.4049534 -0.96427001  0.31531593 4200.218 0.9989423
a[33]  3.196570698 0.7447295  2.10154436  4.49980149 2880.864 0.9990879
a[34]  2.712794013 0.6433201  1.74888220  3.78643188 3095.331 0.9988198
a[35]  2.731620944 0.6664324  1.77676015  3.83311873 2992.938 0.9990613
a[36]  2.051060489 0.5000432  1.28411900  2.86453520 3690.270 0.9992633
a[37]  2.065591183 0.5017172  1.30591712  2.92647140 3638.510 0.9989189
a[38]  3.891461657 0.9699415  2.47805339  5.58549440 2735.887 1.0004954
a[39]  2.705455255 0.6495701  1.77860035  3.84518415 2741.298 0.9996879
a[40]  2.354011311 0.5506689  1.50543484  3.26092214 4252.637 0.9991978
a[41] -1.799497041 0.4835684 -2.57131168 -1.07337199 3708.538 0.9989449
a[42] -0.576316697 0.3417616 -1.12795518 -0.03377259 5124.122 0.9981467
a[43] -0.450266837 0.3399222 -0.99046630  0.08640756 5109.116 0.9986930
a[44] -0.333124738 0.3317197 -0.84461886  0.18941723 2979.151 0.9985858
a[45]  0.585451130 0.3402349  0.04255837  1.14995230 4701.920 0.9993146
a[46] -0.569382107 0.3518077 -1.14556430 -0.01996810 6426.916 0.9987633
a[47]  2.070103142 0.5140356  1.30579740  2.93810996 3331.361 0.9987141
a[48] -0.002109234 0.3203171 -0.52018692  0.50742515 3988.177 0.9981734
a_bar  1.346620398 0.2500101  0.96140376  1.74271963 3215.743 0.9989353
sigma  1.610600906 0.2106371  1.30392541  1.97048034 1805.981 1.0002795

The reedfrogs data.frame is small enough to show in its entirety. Notice the new preprocessed columns (Predator, Size, Treatment) added above; this solution will use them in models later:

display(rf_df)
A data.frame: 48 × 9 (density &lt;int&gt;, pred &lt;fct&gt;, size &lt;fct&gt;, surv &lt;int&gt;, propsurv &lt;dbl&gt;, tank &lt;int&gt;, Predator &lt;int&gt;, Size &lt;int&gt;, Treatment &lt;dbl&gt;)

density  pred  size   surv   propsurv  tank  Predator  Size  Treatment
     10  no    big       9  0.9000000     1         1     1          1
     10  no    big      10  1.0000000     2         1     1          1
     10  no    big       7  0.7000000     3         1     1          1
     10  no    big      10  1.0000000     4         1     1          1
     10  no    small     9  0.9000000     5         1     2          3
     10  no    small     9  0.9000000     6         1     2          3
     10  no    small    10  1.0000000     7         1     2          3
     10  no    small     9  0.9000000     8         1     2          3
     10  pred  big       4  0.4000000     9         2     1          2
     10  pred  big       9  0.9000000    10         2     1          2
     10  pred  big       7  0.7000000    11         2     1          2
     10  pred  big       6  0.6000000    12         2     1          2
     10  pred  small     7  0.7000000    13         2     2          4
     10  pred  small     5  0.5000000    14         2     2          4
     10  pred  small     9  0.9000000    15         2     2          4
     10  pred  small     9  0.9000000    16         2     2          4
     25  no    big      24  0.9600000    17         1     1          1
     25  no    big      23  0.9200000    18         1     1          1
     25  no    big      22  0.8800000    19         1     1          1
     25  no    big      25  1.0000000    20         1     1          1
     25  no    small    23  0.9200000    21         1     2          3
     25  no    small    23  0.9200000    22         1     2          3
     25  no    small    23  0.9200000    23         1     2          3
     25  no    small    21  0.8400000    24         1     2          3
     25  pred  big       6  0.2400000    25         2     1          2
     25  pred  big      13  0.5200000    26         2     1          2
     25  pred  big       4  0.1600000    27         2     1          2
     25  pred  big       9  0.3600000    28         2     1          2
     25  pred  small    13  0.5200000    29         2     2          4
     25  pred  small    20  0.8000000    30         2     2          4
     25  pred  small     8  0.3200000    31         2     2          4
     25  pred  small    10  0.4000000    32         2     2          4
     35  no    big      34  0.9714286    33         1     1          1
     35  no    big      33  0.9428571    34         1     1          1
     35  no    big      33  0.9428571    35         1     1          1
     35  no    big      31  0.8857143    36         1     1          1
     35  no    small    31  0.8857143    37         1     2          3
     35  no    small    35  1.0000000    38         1     2          3
     35  no    small    33  0.9428571    39         1     2          3
     35  no    small    32  0.9142857    40         1     2          3
     35  pred  big       4  0.1142857    41         2     1          2
     35  pred  big      12  0.3428571    42         2     1          2
     35  pred  big      13  0.3714286    43         2     1          2
     35  pred  big      14  0.4000000    44         2     1          2
     35  pred  small    22  0.6285714    45         2     2          4
     35  pred  small    12  0.3428571    46         2     2          4
     35  pred  small    31  0.8857143    47         2     2          4
     35  pred  small    17  0.4857143    48         2     2          4

The help is also short:

display(help(reedfrogs))
reedfrogs {rethinking}    R Documentation

Data on reed frog predation experiments

Description

Data on lab experiments on the density- and size-dependent predation rate of an African reed frog, Hyperolius spinigularis, from Vonesh and Bolker 2005

Usage

data(reedfrogs)

Format

Various data with variables:

density

initial tadpole density (number of tadpoles in a 1.2 x 0.8 x 0.4 m tank) [experiment 1]

pred

factor: predators present or absent [experiment 1]

size

factor: big or small tadpoles [experiment 1]

surv

number surviving

propsurv

proportion surviving (=surv/density) [experiment 1]

Source

Vonesh and Bolker (2005) Compensatory larval responses shift trade-offs associated with predator-induced hatching plasticity. Ecology 86:1580-1591

Examples

data(reedfrogs)
boxplot(propsurv~size*density*pred,data=reedfrogs)

[Package rethinking version 2.13 ]

Our first model will add the pred predictor on only the first level:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{pred[i]} \\ \alpha_j & \sim Normal(\bar{\alpha}, \sigma), j = 1..48 \\ \bar{\alpha} & \sim Normal(0, 1.5) \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..2 \end{align} \end{split}\]

Seeing as this is a chapter on multilevel models, and it’s generally advisable to consider a multilevel model when we have exchangeable index variables, should we add a second level of parameters for this predictor? In other words:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{pred[i]} \\ \alpha_j & \sim Normal(\bar{\alpha}, \sigma_{\alpha}), j = 1..48 \\ \bar{\alpha} & \sim Normal(0, 1.5) \\ \sigma_{\alpha} & \sim Exponential(1) \\ \beta_j & \sim Normal(0, \sigma_{\beta}), j = 1..2 \\ \sigma_{\beta} & \sim Exponential(1) \end{align} \end{split}\]

The problem with this model is that there are only 2 (rather than 48) clusters to learn from; that is, we would be adding a second-level parameter to pool information across only two first-level parameters. Said another way, do our clusters have anything to learn from each other? It helps not to forget the last tank of tadpoles when learning about a new one, or the last chimpanzee when learning about a new one, but here each cluster has only one other cluster to learn from: all the predator tanks, or all the non-predator tanks.

We could instead try to remember what we learned about one predator tank when moving to another predator tank. In this approach we would learn one \(\bar{\alpha}\) and one \(\sigma\) for the predator tanks, and another \(\bar{\alpha}\) and \(\sigma\) for the non-predator tanks, which would presumably require reindexing the predator and non-predator tanks from 1..24. Because the original model from the chapter already has an intercept for every individual tank, this approach would likely not differ from it significantly, so we will avoid these complications and only add pred at the first level.

Now let’s fit our new model, with the pred predictor at the first level. As specified above, the model struggles to fit. To avoid Stan warnings about Rhat, Bulk ESS, and Tail ESS, we’re actually going to fit a model with a tighter \(\bar{\alpha}\) prior than the one proposed above:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{pred[i]} \\ \alpha_j & \sim Normal(\bar{\alpha}, \sigma), j = 1..48 \\ \bar{\alpha} & \sim Normal(0, 0.1) \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..2 \end{align} \end{split}\]

See more comments on this tight \(\bar{\alpha}\) prior below.

rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank,
  Predator = rf_df$Predator
)

m_rf_pred_orig <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bPredator[Predator],
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 0.1),
    sigma ~ dexp(1),
    bPredator[Predator] ~ dnorm(0, 1.5)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=2000
)

iplot(function() {
  plot(precis(m_rf_pred_orig, depth=2), main='m_rf_pred_orig')
}, ar=1.0)
../_images/practice-13_11_0.png

Raw data (preceding plot):

display(precis(m_rf_pred_orig, depth = 2), mimetypes="text/plain")
             mean         sd         5.5%       94.5%       n_eff     Rhat4    
a[1]         -0.091053566 0.67733826 -1.1619387  1.00363590 5706.8320 0.9992140
a[2]          0.405284429 0.72231669 -0.6835375  1.63240635 4280.2961 0.9999209
a[3]         -0.893632468 0.65422903 -1.9246907  0.14803434 3835.9654 1.0001504
a[4]          0.396810887 0.73341977 -0.7484879  1.60863213 4610.8823 0.9996852
a[5]         -0.081867389 0.68455884 -1.1372001  1.03591740 5469.9431 1.0000679
a[6]         -0.073562695 0.71342897 -1.2177587  1.09017998 5130.5129 0.9995533
a[7]          0.391493658 0.76835333 -0.7467720  1.67546422 4158.3462 0.9997849
a[8]         -0.077672429 0.71342739 -1.2207131  1.08462669 4612.9119 0.9993553
a[9]         -0.275556310 0.53007489 -1.1456814  0.56789648 4072.1194 0.9993842
a[10]         1.083916853 0.59780561  0.1604030  2.07252886 3372.1984 0.9993962
a[11]         0.515366726 0.55919568 -0.3630818  1.42150703 3304.4540 1.0008314
a[12]         0.256253729 0.54397209 -0.6027094  1.10748121 3376.5464 1.0001392
a[13]         0.507371149 0.54253780 -0.3383925  1.37422264 4165.1205 0.9994616
a[14]        -0.019154027 0.53076770 -0.8700474  0.83096797 3398.9324 1.0000889
a[15]         1.084497824 0.59215692  0.1710215  2.09371321 3286.2663 1.0000800
a[16]         1.083833289 0.60537079  0.1432592  2.10200025 3836.9189 1.0000499
a[17]         0.311359990 0.62295500 -0.6315440  1.32062429 3987.2987 0.9993120
a[18]        -0.016406295 0.59236703 -0.9468339  0.94955229 4037.9169 0.9997998
a[19]        -0.325798059 0.57463583 -1.2267136  0.61010962 4114.0977 1.0001319
a[20]         0.714880887 0.67672986 -0.3007139  1.84490729 4203.5601 0.9998058
a[21]        -0.031969435 0.59121056 -0.9530994  0.93914665 4234.7039 0.9992996
a[22]        -0.028834440 0.58703520 -0.9301436  0.94422696 3964.1567 0.9995044
a[23]        -0.041898444 0.59765311 -0.9853586  0.93718093 4209.5053 0.9993394
a[24]        -0.594958996 0.53980600 -1.4501513  0.27491654 3875.6886 0.9994484
a[25]        -0.922065489 0.43447970 -1.6310744 -0.24752504 2623.0503 0.9997568
a[26]         0.040021306 0.41800846 -0.6349270  0.69762218 2431.4703 1.0006699
a[27]        -1.247234379 0.46782385 -2.0144778 -0.53216369 2516.0727 1.0004558
a[28]        -0.485287300 0.41851536 -1.1705819  0.17420454 2349.9120 1.0001383
a[29]         0.047076151 0.42232253 -0.6337015  0.70075034 2247.7332 1.0000059
a[30]         1.044254424 0.45618381  0.3344624  1.78213795 2637.6238 1.0000159
a[31]        -0.626126045 0.41941549 -1.2956179  0.04965493 1935.7106 1.0002374
a[32]        -0.353929282 0.42071794 -1.0243468  0.29269038 2126.4161 1.0031737
a[33]         0.471450515 0.59385007 -0.4268991  1.43683414 5077.8028 0.9994915
a[34]         0.169549989 0.57957901 -0.7313541  1.12555060 3302.1348 0.9997328
a[35]         0.161643000 0.57092408 -0.7005694  1.10204382 3356.3200 0.9994369
a[36]        -0.348806228 0.51888966 -1.1750501  0.48017641 3660.0056 1.0004628
a[37]        -0.345350105 0.52735836 -1.1802968  0.51616336 3714.7200 0.9993824
a[38]         0.845436525 0.66721207 -0.1544767  1.95980305 3939.9955 0.9998421
a[39]         0.162843299 0.58118370 -0.7135465  1.13912173 3949.5421 0.9996835
a[40]        -0.094286545 0.53314844 -0.9099927  0.77195836 3965.9727 0.9995954
a[41]        -1.568018639 0.45229941 -2.3055081 -0.87131821 2517.4945 1.0004870
a[42]        -0.578110805 0.37654285 -1.1815079  0.01066022 2046.9753 1.0004631
a[43]        -0.472536643 0.37636857 -1.0735304  0.11987256 1900.8481 1.0012754
a[44]        -0.367927755 0.37617701 -0.9810108  0.22073088 1693.1895 1.0016356
a[45]         0.427653043 0.37664412 -0.1665974  1.02235529 1770.0766 1.0015635
a[46]        -0.575108971 0.37146068 -1.1738507  0.01541667 1716.4397 1.0004397
a[47]         1.533185496 0.44959493  0.8328447  2.28363795 2106.3792 1.0005826
a[48]        -0.076281000 0.37718528 -0.6926673  0.52807620 1714.3057 1.0005010
a_bar         0.009485129 0.09955516 -0.1492049  0.16871382 1371.0897 1.0006038
sigma         0.828621783 0.14446428  0.6129801  1.06819666 1609.1266 1.0012639
bPredator[1]  2.638227839 0.26352128  2.2212575  3.06821355 1510.8586 1.0009788
bPredator[2]  0.023562060 0.22059236 -0.3288387  0.37112341  737.4917 1.0041597

As explained above, this model struggles to sample. We’ve managed to avoid the warnings and achieve decent mixing by tightening a prior:

iplot(function() {
  traceplot(m_rf_pred_orig, pars=c("a[41]", "a[30]", "a_bar", "sigma", "bPredator[1]", "bPredator[2]"))
}, ar=2)
iplot(function() {
  trankplot(m_rf_pred_orig, pars=c("a[41]", "a[30]", "a_bar", "sigma", "bPredator[1]", "bPredator[2]"))
}, ar=2)
[1] 2000
[1] 1
[1] 2000
../_images/practice-13_15_1.png ../_images/practice-13_15_2.png

Why are we struggling to sample? To debug this, let’s go back to the chapter on debugging MCMC, in particular section 9.5.4 on non-identifiable parameters. If non-identifiable parameters are a cause of this symptom, what could we check for? Going even further back, to the end of section 6.1, the author suggests checking whether the posterior has changed relative to the prior. This is a good question to ask regularly: which parameters are unchanged relative to their prior? The obvious candidate is \(\bar{\alpha}\), whose prior we already had to tighten. Let’s shift the \(\bar{\alpha}\) prior and see whether the posterior simply follows it:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{pred[i]} \\ \alpha_j & \sim Normal(\bar{\alpha}, \sigma), j = 1..48 \\ \bar{\alpha} & \sim Normal(1.0, 0.1) \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..2 \end{align} \end{split}\]
m_rf_pred_shift <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bPredator[Predator],
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(1.0, 0.1),
    sigma ~ dexp(1),
    bPredator[Predator] ~ dnorm(0, 1.5)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=2000
)

iplot(function() {
  plot(precis(m_rf_pred_shift, depth=2), main='m_rf_pred_shift')
}, ar=1.0)
display_markdown("Raw data (preceding plot):")
display(precis(m_rf_pred_shift, depth = 2), mimetypes="text/plain")
../_images/practice-13_17_0.png

Raw data (preceding plot):

             mean        sd         5.5%        94.5%      n_eff     Rhat4    
a[1]          0.89209569 0.67477533 -0.16745233  2.0011115 4735.4615 0.9992439
a[2]          1.39224094 0.73818963  0.25400126  2.6080315 3957.1503 1.0004964
a[3]          0.10122305 0.66283664 -0.97613390  1.1653688 3418.7183 1.0000106
a[4]          1.38871216 0.73452590  0.28161585  2.5953023 3785.4037 0.9994385
a[5]          0.90554711 0.69830248 -0.16292157  2.0571965 4904.7154 0.9998165
a[6]          0.90176640 0.69104425 -0.19062266  2.0195898 4457.3569 0.9994585
a[7]          1.39936938 0.75992449  0.20574222  2.6583386 4249.0306 0.9996905
a[8]          0.91059656 0.68309855 -0.18988412  2.0056462 4003.0761 1.0012390
a[9]          0.73248137 0.52721299 -0.11813063  1.5496131 3425.8315 0.9995393
a[10]         2.06301112 0.58356879  1.19447321  3.0422058 2793.8910 1.0005541
a[11]         1.49963661 0.52484637  0.66308684  2.3571843 3926.1740 1.0020691
a[12]         1.23113983 0.54391426  0.37062379  2.1014719 3627.1536 1.0003903
a[13]         1.49939343 0.54482885  0.65913121  2.3839408 3168.7937 1.0006681
a[14]         0.98772905 0.52333559  0.15660760  1.8276881 3324.1530 1.0015907
a[15]         2.06758066 0.57084495  1.21328743  3.0248502 3365.5445 0.9997303
a[16]         2.06976816 0.60316648  1.14447531  3.0806674 4120.2195 0.9999741
a[17]         1.29229894 0.60917356  0.37290175  2.2960869 4007.8210 1.0002482
a[18]         0.94007920 0.57718091  0.05406167  1.8824613 3926.6926 0.9997396
a[19]         0.65752522 0.55403428 -0.17795545  1.5459452 3541.3975 1.0012304
a[20]         1.69407162 0.67858631  0.67788785  2.8359920 3233.6798 1.0019009
a[21]         0.95862978 0.58846189  0.01658356  1.9128141 3413.0634 1.0002383
a[22]         0.93698425 0.56500639  0.05393344  1.8602302 4012.3578 1.0001226
a[23]         0.96240309 0.58264693  0.06316729  1.8934749 4503.2347 1.0006012
a[24]         0.39965050 0.53998463 -0.43439589  1.2812131 3192.9299 1.0001324
a[25]         0.07621943 0.43948050 -0.63167357  0.7536801 2181.3490 1.0013726
a[26]         1.03888573 0.41344545  0.36588081  1.6863169 2452.7963 1.0023155
a[27]        -0.24250659 0.47663660 -1.02658207  0.4852501 2222.4989 1.0006004
a[28]         0.49973971 0.41881256 -0.17784275  1.1666101 2290.0176 1.0008810
a[29]         1.03235316 0.40609705  0.36608494  1.6754619 2286.3186 1.0019069
a[30]         2.02159219 0.42916485  1.36954429  2.7244621 2305.3258 1.0005783
a[31]         0.36793186 0.42202497 -0.31798667  1.0220448 2154.6493 1.0015990
a[32]         0.64086201 0.41060626 -0.03934122  1.3016825 1963.1853 1.0011168
a[33]         1.46138803 0.59516254  0.52792464  2.4607093 4247.0558 1.0013806
a[34]         1.14174442 0.56920314  0.26951677  2.0622307 3790.0600 1.0008938
a[35]         1.16039851 0.57498749  0.28519144  2.1135609 3559.2976 1.0015933
a[36]         0.63760672 0.51303529 -0.16677345  1.4736388 3106.9969 1.0003387
a[37]         0.63900192 0.51835456 -0.16468426  1.4894591 3150.7196 1.0003524
a[38]         1.82482034 0.63775234  0.85172586  2.8864206 3479.1211 1.0005003
a[39]         1.15723400 0.57205560  0.28350920  2.0816031 3242.3819 1.0019410
a[40]         0.87573768 0.53712506  0.03494396  1.7547448 3325.7247 1.0000819
a[41]        -0.55852738 0.45148673 -1.32444554  0.1297604 2039.5352 1.0010345
a[42]         0.42040389 0.37351131 -0.18386648  0.9985524 1871.0674 1.0024993
a[43]         0.52256157 0.37881219 -0.09506383  1.1170410 1639.8070 1.0015673
a[44]         0.61999632 0.36585617  0.03341034  1.2006817 1709.6801 1.0023348
a[45]         1.42271809 0.36963628  0.84366835  2.0282822 1999.1441 1.0009946
a[46]         0.41424304 0.37841286 -0.19656934  1.0070465 1839.1323 1.0011048
a[47]         2.51436060 0.45221636  1.83604601  3.2685320 2747.6259 1.0002937
a[48]         0.91928048 0.35897769  0.33980370  1.4889490 1605.9322 1.0022225
a_bar         1.00359144 0.09591719  0.84977652  1.1545523 1255.1540 1.0053490
sigma         0.82101396 0.14649139  0.60758009  1.0741069 1396.0209 1.0006210
bPredator[1]  1.66349494 0.26370106  1.25552431  2.0945184 1281.1362 1.0030336
bPredator[2] -0.96300955 0.21888214 -1.29949688 -0.6091436  708.4835 1.0063713

The \(\bar{\alpha}\) posterior simply follows its prior, even after the prior is shifted. Notice that the predator parameters are the ones that have responded (and, unsurprisingly, all the a parameters). Previously the model had inferred that the ‘lack of predator’ drastically helps survival and that a predator has no effect on survival (treating a predator as the baseline). Now it has inferred that a predator hurts the chances of survival, and that the lack of a predator doesn’t help as much. That is, all these parameters only make sense relative to each other.
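One way to see the non-identifiability directly (a sketch, assuming both fits above are still in memory; post_orig and post_shift are names made up here) is to look at the sum \(\bar{\alpha} + \beta_{pred}\), which is all the likelihood really constrains. It is essentially the same in both fits (roughly 2.6–2.7 for the no-predator group) even though the individual parameters moved:

post_orig <- extract.samples(m_rf_pred_orig)
post_shift <- extract.samples(m_rf_pred_shift)

# The likelihood only identifies the sum of the intercept pieces, so this sum
# should be nearly identical across the two fits even though a_bar and
# bPredator each shifted.
precis(data.frame(
  sum_orig = post_orig$a_bar + post_orig$bPredator[, 1],
  sum_shift = post_shift$a_bar + post_shift$bPredator[, 1]
))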

To avoid this mess we could use an indicator variable:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta P_i \\ \alpha_j & \sim Normal(\bar{\alpha}, \sigma), j = 1..48 \\ \bar{\alpha} & \sim Normal(0, 2) \\ \sigma & \sim Exponential(1) \\ \beta & \sim Normal(0, 3) \end{align} \end{split}\]
rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank,
  Predator = ifelse(rf_df$Predator == 1, 0, 1)
)
m_rf_pred_indicator <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bPredator * Predator,
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 2),
    sigma ~ dexp(1),
    bPredator ~ dnorm(0, 3)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=4000
)

iplot(function() {
  plot(precis(m_rf_pred_indicator, depth=2), main='m_rf_pred_indicator')
}, ar=1.0)
display_markdown("Raw data (preceding plot):")
display(precis(m_rf_pred_indicator, depth = 2), mimetypes="text/plain")
../_images/practice-13_19_0.png

Raw data (preceding plot):

          mean       sd        5.5%       94.5%     n_eff     Rhat4    
a[1]       2.5522648 0.7005179  1.4704404  3.724137 6465.3677 0.9997779
a[2]       3.0204002 0.7509938  1.8883086  4.273045 4801.7676 1.0000717
a[3]       1.7453122 0.6243500  0.7505595  2.729144 5643.3185 0.9999198
a[4]       3.0308241 0.7630791  1.8813480  4.323262 5013.6301 0.9998198
a[5]       2.5490308 0.6825154  1.4863144  3.682729 5641.1540 1.0000314
a[6]       2.5490481 0.6828359  1.4998263  3.666704 6223.8256 0.9999237
a[7]       3.0277615 0.7447970  1.9050250  4.254546 4683.1569 0.9999306
a[8]       2.5523347 0.6957374  1.4755051  3.710511 6337.5569 1.0006373
a[9]       2.3564626 0.5704366  1.4249326  3.246933 2273.3989 0.9999730
a[10]      3.6979365 0.6404354  2.7154496  4.759738 2307.2144 1.0012041
a[11]      3.1357584 0.5848394  2.2227532  4.082508 2218.6606 1.0003584
a[12]      2.8730044 0.5714784  1.9691444  3.807295 2342.4570 0.9998993
a[13]      3.1333766 0.6008392  2.1954840  4.093771 2193.1768 1.0007692
a[14]      2.6121693 0.5773987  1.6854409  3.532968 2250.0685 1.0006250
a[15]      3.7063579 0.6486675  2.7129112  4.778418 2321.4588 1.0004578
a[16]      3.6995097 0.6422066  2.7268487  4.771852 2525.4463 1.0000863
a[17]      2.9315669 0.6131325  1.9981556  3.953931 5284.7219 0.9998453
a[18]      2.6011891 0.5676435  1.7185426  3.537737 5644.0729 0.9998619
a[19]      2.3008860 0.5254164  1.4853700  3.157238 6758.9622 1.0004038
a[20]      3.3397761 0.6888784  2.3057830  4.505656 4574.8774 0.9996914
a[21]      2.6044994 0.5763145  1.7208334  3.553111 5326.6637 1.0001091
a[22]      2.5991773 0.5630963  1.7358827  3.526439 6609.4474 1.0001926
a[23]      2.5974198 0.5701527  1.7355686  3.530736 6394.1215 0.9998074
a[24]      2.0506006 0.5165176  1.2566012  2.894075 6907.7744 1.0001762
a[25]      1.7243139 0.4917251  0.9252327  2.495221 1611.2019 1.0002243
a[26]      2.6686296 0.4623173  1.9255557  3.409024 1480.8194 1.0003777
a[27]      1.3971871 0.5191158  0.5484776  2.186560 1664.7034 1.0000776
a[28]      2.1433253 0.4678318  1.3789682  2.885350 1538.8003 1.0004297
a[29]      2.6707417 0.4714297  1.9044658  3.417621 1534.8110 1.0003517
a[30]      3.6664625 0.5007730  2.8731590  4.493474 1466.1202 1.0003867
a[31]      2.0054734 0.4709779  1.2607800  2.755354 1490.3879 1.0004900
a[32]      2.2792674 0.4685247  1.5218700  3.025201 1588.9496 1.0003599
a[33]      3.1158945 0.5974454  2.2051671  4.128562 5757.0877 0.9996544
a[34]      2.8078013 0.5453006  1.9912577  3.721995 5840.2122 0.9999484
a[35]      2.7949597 0.5440604  1.9722512  3.710830 5590.0865 1.0001520
a[36]      2.2903009 0.4728252  1.5763166  3.079256 6214.4524 1.0000563
a[37]      2.2948037 0.4790739  1.5722842  3.084986 6157.2150 0.9998679
a[38]      3.4649569 0.6816790  2.4613353  4.617711 4279.8047 1.0000297
a[39]      2.7880349 0.5380635  1.9720357  3.683688 5951.3460 0.9999877
a[40]      2.5410647 0.5103372  1.7516598  3.399963 6211.9274 1.0000169
a[41]      1.0668785 0.5055053  0.2238685  1.846239 1695.9204 1.0003529
a[42]      2.0509100 0.4399465  1.3463682  2.753869 1367.0356 1.0003402
a[43]      2.1523915 0.4452148  1.4372461  2.863473 1392.6392 1.0007920
a[44]      2.2617789 0.4340331  1.5606782  2.940569 1261.3145 1.0007757
a[45]      3.0598169 0.4385458  2.3552809  3.754512 1281.9748 1.0002635
a[46]      2.0561885 0.4473343  1.3340787  2.775770 1360.7541 0.9999792
a[47]      4.1475119 0.5009864  3.3612204  4.959540 1600.1446 1.0000179
a[48]      2.5585814 0.4290491  1.8812055  3.243232 1297.0543 1.0004239
a_bar      2.6418635 0.2431518  2.2643128  3.034041  887.3094 1.0005768
sigma      0.8196008 0.1450572  0.6112842  1.064260 2140.9977 1.0006708
bPredator -2.6028713 0.3128245 -3.0963048 -2.096804  731.0829 1.0009372

Notice that in the last model we’ve been able to widen the \(\bar{\alpha}\) prior and actually learn \(\bar{\alpha}\) from the data; that is, the posterior differs from the prior.

Still, an indicator variable isn’t ideal, for at least one of the reasons given in section 5.3.1: it implicitly assumes we are more uncertain about the predator situation than about the non-predator situation, which we are not. The indicator-variable approach also makes it harder to read the a parameters relative to a_bar, since they are no longer centered at zero, and makes it harder to set an a_bar prior.

Instead let’s go back to an index variable and simply fix the mean of the \(\alpha\) prior to zero (dropping \(\bar{\alpha}\) from the model), as suggested in section 13.3.1:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{pred[i]} \\ \alpha_j & \sim Normal(0, \sigma), j = 1..48 \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..2 \end{align} \end{split}\]
rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank,
  Predator = rf_df$Predator
)
m_rf_df_pred <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bPredator[Predator],
    a[tank] ~ dnorm(0, sigma),
    sigma ~ dexp(1),
    bPredator[Predator] ~ dnorm(0, 1.5)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=2000
)

iplot(function() {
  plot(precis(m_rf_df_pred, depth=2), main='m_rf_df_pred')
}, ar=1.0)
display_markdown("Raw data (preceding plot):")
display(precis(m_rf_df_pred, depth = 2), mimetypes="text/plain")
../_images/practice-13_21_0.png

Raw data (preceding plot):

             mean        sd        5.5%       94.5%        n_eff     Rhat4    
a[1]         -0.09697859 0.6841610 -1.1624090  0.978796500 7104.0967 0.9991372
a[2]          0.39051532 0.7118729 -0.7055822  1.585237467 4997.2601 0.9997518
a[3]         -0.88904449 0.6238732 -1.8911480  0.087113890 4115.0420 0.9999485
a[4]          0.41269873 0.7425609 -0.7262197  1.638435915 5028.8425 0.9994130
a[5]         -0.08992742 0.6658583 -1.1304364  0.979097793 4868.8726 1.0002809
a[6]         -0.10079296 0.6726996 -1.1429354  0.978045998 5619.5932 0.9998326
a[7]          0.38246971 0.7259052 -0.7097182  1.584208094 4646.3051 0.9996028
a[8]         -0.07450817 0.6763312 -1.1585558  1.022350219 4882.5723 0.9998955
a[9]         -0.27823505 0.5226344 -1.1208755  0.547098083 3865.5537 0.9999979
a[10]         1.07936183 0.5812063  0.1974903  2.007386701 4001.4846 1.0009419
a[11]         0.49797296 0.5371057 -0.3283804  1.358467547 4659.2236 1.0001918
a[12]         0.23363445 0.5416713 -0.6148350  1.113431663 4300.4459 0.9995509
a[13]         0.51324613 0.5455704 -0.3571918  1.401333472 4995.6051 0.9994275
a[14]        -0.01964218 0.5098752 -0.8471211  0.767138747 3755.1251 1.0003357
a[15]         1.07173121 0.5855436  0.1844398  2.030393592 4504.2158 0.9999240
a[16]         1.07566945 0.5792244  0.1801974  2.023543821 4490.1072 0.9998481
a[17]         0.29275396 0.6396542 -0.6886191  1.391203496 5798.8060 0.9993214
a[18]        -0.04523565 0.5757471 -0.9454994  0.938394996 5220.0649 0.9993111
a[19]        -0.32044369 0.5471879 -1.1695065  0.552227484 3969.2413 1.0002053
a[20]         0.67529437 0.6791460 -0.3326967  1.833778951 4212.5325 0.9996342
a[21]        -0.03522830 0.5792738 -0.9324501  0.908435970 4574.5422 0.9994619
a[22]        -0.03968798 0.5670331 -0.8942994  0.897039310 4903.4849 0.9994307
a[23]        -0.03165309 0.5728453 -0.9131142  0.901236994 4867.8231 1.0000564
a[24]        -0.59880483 0.5256781 -1.4429958  0.226346990 4048.9714 0.9991929
a[25]        -0.92613069 0.4258198 -1.6141886 -0.264702371 3117.3589 0.9995583
a[26]         0.03769847 0.4021599 -0.6019276  0.677847891 3170.7403 1.0012505
a[27]        -1.23936592 0.4522377 -1.9862324 -0.544366300 2848.3630 1.0003908
a[28]        -0.49235389 0.4134558 -1.1593876  0.167492659 2823.6535 1.0003123
a[29]         0.04646079 0.3889926 -0.5785333  0.664016953 2624.6351 1.0015064
a[30]         1.03414342 0.4395880  0.3480782  1.765167828 3529.6936 1.0000845
a[31]        -0.62693058 0.4122822 -1.2835615  0.008595614 3179.6959 0.9996412
a[32]        -0.35547092 0.4005949 -1.0154093  0.274143421 2859.0685 1.0014165
a[33]         0.48630083 0.6185443 -0.4274168  1.521135912 4609.1703 0.9993187
a[34]         0.16418432 0.5472980 -0.6807129  1.085121870 5050.9348 0.9993816
a[35]         0.17030285 0.5570812 -0.6707985  1.076228321 3845.2128 0.9999526
a[36]        -0.34515564 0.5115704 -1.1373833  0.476882477 3764.8778 0.9996538
a[37]        -0.34532324 0.5059938 -1.1250993  0.480615388 4662.9715 0.9994801
a[38]         0.83753669 0.6862194 -0.1984012  1.975697764 5109.4183 0.9995974
a[39]         0.15282231 0.5640697 -0.7004958  1.092202698 5035.6147 0.9997246
a[40]        -0.11355840 0.5107425 -0.9154867  0.726645541 3879.4945 0.9996644
a[41]        -1.56458436 0.4502418 -2.3268471 -0.883605386 2629.1010 1.0007711
a[42]        -0.58326697 0.3730827 -1.1916108  0.011371234 2902.3533 1.0006533
a[43]        -0.47753634 0.3612391 -1.0614049  0.091031124 2393.1816 1.0008698
a[44]        -0.37480105 0.3587195 -0.9656114  0.188737115 2328.3670 1.0013135
a[45]         0.42094747 0.3676015 -0.1570650  1.010639836 2734.5974 1.0007342
a[46]        -0.58616032 0.3700466 -1.1823389 -0.006083808 2481.4581 1.0027914
a[47]         1.52162086 0.4242206  0.8779488  2.224791652 2704.3370 1.0003412
a[48]        -0.07801268 0.3640619 -0.6608928  0.501107049 2658.8869 1.0022556
sigma         0.82165527 0.1394925  0.6132546  1.062145001 1399.6148 1.0003748
bPredator[1]  2.64280559 0.2451182  2.2609478  3.040105815 2013.0146 1.0001101
bPredator[2]  0.03032788 0.1982585 -0.2796168  0.349500599  963.9921 1.0028167

Let’s fit a similar model with only the size predictor:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{size[i]} \\ \alpha_j & \sim Normal(0, \sigma), j = 1..48 \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..2 \end{align} \end{split}\]
rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank,
  Size = rf_df$Size
)
m_rf_df_size <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bSize[Size],
    a[tank] ~ dnorm(0, sigma),
    sigma ~ dexp(1),
    bSize[Size] ~ dnorm(0, 1.5)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=2000
)

iplot(function() {
  plot(precis(m_rf_df_size, depth=2), main='m_rf_df_size')
}, ar=1.0)
display_markdown("Raw data (preceding plot):")
display(precis(m_rf_df_size, depth = 2), mimetypes="text/plain")
../_images/practice-13_23_0.png

Raw data (preceding plot):

         mean        sd        5.5%        94.5%       n_eff     Rhat4    
a[1]      0.97752303 0.8835042 -0.32453635  2.46154161 2430.2235 1.0012180
a[2]      1.86327598 1.0624751  0.30664617  3.65467852 2537.9613 1.0017688
a[3]     -0.10443315 0.7230797 -1.21230383  1.06307623 1940.0259 1.0016095
a[4]      1.87033162 1.0752683  0.27684865  3.71942342 2934.6080 1.0008884
a[5]      0.65949438 0.9296719 -0.76855059  2.22705412 3080.9694 1.0011339
a[6]      0.63542458 0.9206881 -0.73879582  2.14077619 2739.0607 1.0008519
a[7]      1.61518718 1.1494012 -0.05417674  3.53659232 2273.5358 0.9993930
a[8]      0.66273450 0.9182667 -0.70231129  2.23267394 2338.0138 0.9999125
a[9]     -1.28242464 0.6779740 -2.36918377 -0.22965485 1702.2136 1.0005464
a[10]     0.98409873 0.8912791 -0.36259458  2.46901686 2614.4988 1.0006619
a[11]    -0.11020205 0.7321976 -1.23599653  1.10713271 1925.7231 0.9999628
a[12]    -0.52380795 0.6829459 -1.60787257  0.57698456 1771.8878 1.0003157
a[13]    -0.49610098 0.7443987 -1.63784388  0.68404612 2191.9232 1.0010597
a[14]    -1.32474819 0.6885549 -2.42596473 -0.21554533 2186.7153 1.0004160
a[15]     0.65758409 0.9284452 -0.71527759  2.18234668 2422.1029 1.0016655
a[16]     0.64374669 0.9124251 -0.70765879  2.18761902 3129.5615 0.9996408
a[17]     1.76336221 0.7885090  0.55543932  3.06380870 2786.8698 1.0004905
a[18]     1.28464199 0.7152119  0.19101763  2.45304314 1785.5785 1.0010351
a[19]     0.91279649 0.6460401 -0.06653632  1.99398924 1845.2972 1.0016696
a[20]     2.48738601 0.9936424  1.07549218  4.20363364 2617.9282 1.0002907
a[21]     0.88984898 0.7341968 -0.22559476  2.10225237 2080.4659 1.0006731
a[22]     0.88460185 0.7577859 -0.26811851  2.13897933 2145.1575 1.0012040
a[23]     0.86302654 0.7224245 -0.21699469  2.09702860 2403.2826 0.9997079
a[24]     0.17477428 0.6210206 -0.77787076  1.19973949 1740.9190 1.0032064
a[25]    -2.09351668 0.5443075 -2.99248210 -1.25033531 1301.0796 1.0017277
a[26]    -0.92071091 0.5170888 -1.73999848 -0.08758907 1263.7412 1.0032009
a[27]    -2.51160448 0.5779779 -3.49405576 -1.64633955 1390.5255 1.0008773
a[28]    -1.54911141 0.5304216 -2.41313158 -0.71722523 1208.2725 1.0005355
a[29]    -1.38780572 0.5150797 -2.22767237 -0.58240377 1330.5323 1.0026714
a[30]    -0.08334482 0.5934668 -1.01964309  0.85913071 1612.7431 1.0036123
a[31]    -2.17321305 0.5413917 -3.05741826 -1.33792005 1342.1802 1.0034081
a[32]    -1.84227386 0.5343419 -2.73825437 -1.00906277 1359.2655 1.0035962
a[33]     2.05994892 0.7798711  0.90032973  3.37892869 2370.3716 1.0000161
a[34]     1.58991185 0.6802528  0.55802614  2.68983709 1817.1788 1.0005279
a[35]     1.59839736 0.7014364  0.52523148  2.77829088 1607.0835 1.0036347
a[36]     0.95641017 0.5890629  0.01526481  1.89197674 1390.8940 1.0016272
a[37]     0.52103424 0.6031463 -0.42626574  1.51350427 1768.2851 1.0017421
a[38]     2.42872363 1.0374497  0.92145929  4.21005058 2299.1514 1.0003496
a[39]     1.18261013 0.7147161  0.09191460  2.36356186 2141.0947 1.0009511
a[40]     0.82536880 0.6248690 -0.14496913  1.85081628 1862.9680 1.0021163
a[41]    -2.89293403 0.5725528 -3.86403288 -2.01178169 1118.2075 1.0033815
a[42]    -1.65419161 0.4853054 -2.44458463 -0.88896203 1000.3817 1.0016024
a[43]    -1.53368467 0.4764261 -2.31780764 -0.78687128  993.8615 1.0009041
a[44]    -1.42242505 0.4768055 -2.20032668 -0.67186286  938.4543 1.0017574
a[45]    -0.95806009 0.4853153 -1.73491533 -0.19815837 1215.9525 1.0042005
a[46]    -2.11021208 0.4902764 -2.89647469 -1.32872259 1244.7923 1.0033629
a[47]     0.52311701 0.5904331 -0.40693924  1.48374522 1567.3817 1.0021215
a[48]    -1.53744448 0.4682381 -2.31937713 -0.79495099 1164.7406 1.0030632
sigma     1.61312478 0.2142123  1.30124065  1.98828592 1740.8490 1.0025179
bSize[1]  1.06931649 0.3442443  0.53816609  1.62691389  577.5558 1.0049769
bSize[2]  1.55123496 0.3528192  0.99305972  2.12098783  733.4177 1.0064655

Finally, let’s model an interaction term. We’ve already added a Treatment index variable in preprocessing; see the data.frame near the start of this question.

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} + \beta_{Treatment[i]} \\ \alpha_j & \sim Normal(0, \sigma), j = 1..48 \\ \sigma & \sim Exponential(1) \\ \beta_j & \sim Normal(0, 1.5), j = 1..4 \end{align} \end{split}\]
rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank,
  Treatment = rf_df$Treatment
)
m_rf_df_interaction <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank] + bTreatment[Treatment],
    a[tank] ~ dnorm(0, sigma),
    sigma ~ dexp(1),
    bTreatment[Treatment] ~ dnorm(0, 1.5)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, iter=2000
)

iplot(function() {
  plot(precis(m_rf_df_interaction, depth=2), main='m_rf_df_interaction')
}, ar=1.0)
display_markdown("Raw data (preceding plot):")
display(precis(m_rf_df_interaction, depth = 2), mimetypes="text/plain")
../_images/practice-13_25_0.png

Raw data (preceding plot):

              mean         sd        5.5%        94.5%       n_eff    
a[1]          -0.087257569 0.6477772 -1.11234636  0.93996928 4722.5288
a[2]           0.318245478 0.6745471 -0.71460359  1.45023200 4778.9117
a[3]          -0.800963795 0.6008024 -1.77561418  0.13641758 3342.8367
a[4]           0.302265247 0.6518208 -0.69100984  1.40382139 3496.4853
a[5]          -0.035445130 0.6249678 -1.03410501  0.96793664 5036.8068
a[6]          -0.032250608 0.6139588 -1.01072576  0.97492396 5164.0431
a[7]           0.354867448 0.6353728 -0.61346231  1.40583923 4210.0953
a[8]          -0.043945496 0.6151364 -0.98306911  0.95695367 4749.5379
a[9]           0.001957609 0.5208627 -0.83548335  0.82989428 4419.5517
a[10]          1.196095698 0.5760802  0.33357862  2.14055378 3114.5423
a[11]          0.704742593 0.5152675 -0.09071266  1.55676304 2997.9738
a[12]          0.481584350 0.5021899 -0.29572659  1.29737687 4194.2670
a[13]          0.221965804 0.5138389 -0.58784720  1.05716499 3667.6201
a[14]         -0.261465503 0.5124471 -1.10106069  0.53187012 3899.2559
a[15]          0.736553286 0.5634339 -0.12468992  1.68514887 3311.3946
a[16]          0.745341820 0.5492434 -0.07720214  1.67428163 4345.1706
a[17]          0.251020002 0.5774214 -0.63275481  1.20814775 4588.8592
a[18]         -0.051935338 0.5627769 -0.94764556  0.85892117 3756.3156
a[19]         -0.318002893 0.5417412 -1.17314038  0.54672294 3210.6201
a[20]          0.604834487 0.6319848 -0.34227492  1.66173991 3385.2257
a[21]          0.033472764 0.5333835 -0.79554130  0.90861101 4043.5704
a[22]          0.029242947 0.5428392 -0.82458011  0.91021538 4415.5132
a[23]          0.034957911 0.5529902 -0.81975311  0.93719528 3632.9530
a[24]         -0.470096905 0.5063614 -1.28365869  0.33271814 3927.2373
a[25]         -0.517670247 0.4253073 -1.22039146  0.14365757 2979.9141
a[26]          0.395552981 0.4141425 -0.25771856  1.07265705 3025.9498
a[27]         -0.843342418 0.4681718 -1.60698350 -0.12182338 2667.5470
a[28]         -0.114857926 0.4223913 -0.80308511  0.54755944 2796.1127
a[29]         -0.282946147 0.3973980 -0.91493729  0.35204248 2595.8095
a[30]          0.663597991 0.4361002 -0.02320321  1.37190893 2954.9465
a[31]         -0.907729257 0.4230211 -1.57233709 -0.24741158 1978.1305
a[32]         -0.658813684 0.4163325 -1.34268637 -0.01756846 2263.8078
a[33]          0.401378759 0.5762771 -0.47515903  1.35544512 4357.1064
a[34]          0.136819073 0.5360838 -0.70850460  1.01958949 4029.9205
a[35]          0.133440081 0.5397772 -0.68993194  1.05455481 4238.6507
a[36]         -0.326567524 0.4924184 -1.08865937  0.46026124 4043.2871
a[37]         -0.236285314 0.4737229 -0.99774743  0.51750538 3483.7743
a[38]          0.787539951 0.6079236 -0.12172403  1.81772432 2932.5664
a[39]          0.218542419 0.5309329 -0.59404407  1.08255773 4175.5458
a[40]         -0.020655554 0.5090984 -0.82681174  0.80482873 3577.4556
a[41]         -1.129685951 0.4485302 -1.87230867 -0.42836194 2292.7948
a[42]         -0.178347382 0.3754678 -0.79923448  0.40450969 2828.1548
a[43]         -0.071638312 0.3756510 -0.66980700  0.52025454 2681.6744
a[44]          0.019679012 0.3738944 -0.58021129  0.61633966 2643.5542
a[45]          0.065156173 0.3699510 -0.53374719  0.65132274 2468.8254
a[46]         -0.899000138 0.3933015 -1.51634559 -0.27886996 2164.0248
a[47]          1.114395643 0.4469646  0.42623423  1.82967537 2565.2845
a[48]         -0.415071015 0.3848530 -1.04290846  0.18833472 2107.4585
sigma          0.735941763 0.1459044  0.51628227  0.98159040  981.8263
bTreatment[1]  2.661419274 0.3253302  2.15441259  3.18572489 2589.9114
bTreatment[2] -0.443829846 0.2563421 -0.84018144 -0.02537645 1483.9414
bTreatment[3]  2.497303072 0.3161132  2.00347641  2.99792231 2213.3737
bTreatment[4]  0.460283400 0.2527771  0.05946582  0.86930412 1310.9256
              Rhat4    
a[1]          1.0020151
a[2]          1.0006531
a[3]          1.0013384
a[4]          1.0000362
a[5]          0.9997723
a[6]          0.9995127
a[7]          1.0003023
a[8]          0.9994760
a[9]          0.9994705
a[10]         0.9997850
a[11]         1.0004945
a[12]         1.0000306
a[13]         0.9996838
a[14]         0.9999756
a[15]         1.0002805
a[16]         0.9994730
a[17]         0.9994418
a[18]         0.9995753
a[19]         1.0000075
a[20]         1.0006909
a[21]         0.9999656
a[22]         0.9997511
a[23]         0.9999732
a[24]         0.9997563
a[25]         1.0000648
a[26]         1.0000309
a[27]         1.0003747
a[28]         1.0000041
a[29]         0.9994521
a[30]         1.0001314
a[31]         1.0004173
a[32]         0.9999176
a[33]         0.9995793
a[34]         0.9996435
a[35]         0.9996489
a[36]         0.9995608
a[37]         1.0000851
a[38]         1.0009833
a[39]         0.9997947
a[40]         0.9998519
a[41]         1.0006167
a[42]         1.0006921
a[43]         0.9997063
a[44]         1.0003002
a[45]         0.9992996
a[46]         1.0000655
a[47]         1.0003963
a[48]         1.0008080
sigma         1.0046172
bTreatment[1] 1.0005507
bTreatment[2] 1.0007625
bTreatment[3] 1.0000693
bTreatment[4] 1.0008946

Let’s go back to the original question:

Instead of focusing on inferences about these two predictor variables, focus on the inferred variation across tanks. Explain why it changes as it does across models.

The inferred variation across tanks (\(\sigma\)) decreases as we add predictors that actually explain some of the variation between tanks. We see the largest \(\sigma\) in m13.2, where we haven’t tried to explain any of the variation with predictors, and essentially the same \(\sigma\) in the size-only model, because size alone explains little. When we add the predator predictor in m_rf_df_pred, \(\sigma\) drops significantly because there is a lot of predictive power in that variable (or at least more than in size). The model with the interaction (treatment) term has the lowest inferred \(\sigma\) because it indirectly has access to both predictors.
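As a quick check of that claim (a sketch, assuming the fits above are still in memory), we can pull the posterior mean of \(\sigma\) from each model:

# Posterior mean of the across-tank sigma for each model; sigma drops most for
# the models that include the predator information (pred alone or the
# pred/size treatment index), and barely moves for size alone.
sapply(
  list(m13.2 = m13.2, pred = m_rf_df_pred, size = m_rf_df_size,
       treatment = m_rf_df_interaction),
  function(m) mean(extract.samples(m)$sigma)
)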

One question only exists in 13H4:

What do you infer about the causal inference of these predictor variables?

In a typical predator-prey relationship it would be more appropriate to model the two populations with differential equations. Reading the original paper, though, it appears that the presence of predators, tadpole size, and initial density were all experimentally manipulated. If that is the case, the causal diagram is relatively simple: the predictors can only affect the outcome, not each other.

It’s tempting to read, from the difference between bTreatment[1] and bTreatment[3] relative to the difference between bTreatment[2] and bTreatment[4], that being large helps survival in the absence of predators (to withstand the elements) but that in the presence of predators being large makes a tadpole easier to catch (harder to hide) and a more desirable catch.
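We can compute those contrasts directly from the posterior samples (a sketch, assuming m_rf_df_interaction is still in memory; post_trt and the contrast names are made up here):

post_trt <- extract.samples(m_rf_df_interaction)
# Size contrasts (big minus small) without and with predators; Treatment codes
# are 1 = no pred/big, 2 = pred/big, 3 = no pred/small, 4 = pred/small.
contrast_no_pred <- post_trt$bTreatment[, 1] - post_trt$bTreatment[, 3]
contrast_pred <- post_trt$bTreatment[, 2] - post_trt$bTreatment[, 4]
precis(data.frame(contrast_no_pred, contrast_pred))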

The support for an advantage of size in the absence of predators is limited, though, considering the error bars on bTreatment[1] and bTreatment[3]. If we accept only the clear advantage of being smaller in the presence of predators, we would conclude that being small is primarily an advantage in the larval/tadpole stage of the frog’s development, with no explanation for why some tadpoles hatch larger. Based on the original paper, the confounding variable seems to be survival at metamorphosis, which could influence the timing of hatching and therefore the ideal tadpole size: tadpoles that are smaller at metamorphosis may be less likely to survive (or reproduce) than those that are larger.

That is, there are two influences on the ideal tadpole size, survival at each of two life stages:

MetaSurv -> TadpoleSize <- LarvSurv

The species (apparently) has to trade off size to survive both stages.

13M2. Compare the models you fit just above, using WAIC. Can you reconcile the differences in WAIC with the posterior distributions of the models?

Answer. For comparison and to check on warnings, these results include PSIS deviances as well. Notice the model with the predator predictor does well in both comparisons. This predictor significantly improves predictive accuracy at the cost of only one parameter. Generally speaking the model with only the size predictor does not do as well, because (alone) this is not an effective predictor of whether the tadpole will survive to metamorphosis.

In general there are not large differences between these models, however, considering the error bars produced by both WAIC and PSIS.

iplot(function() {
  plot(compare(m13.2, m_rf_df_pred, m_rf_df_size, m_rf_df_interaction))
}, ar=3.5)
display_markdown("Raw data (preceding plot):")
display(compare(m13.2, m_rf_df_pred, m_rf_df_size, m_rf_df_interaction), mimetypes="text/plain")

p_comp = compare(m13.2, m_rf_df_pred, m_rf_df_size, m_rf_df_interaction, func=PSIS)
iplot(function() {
  plot(p_comp)
}, ar=3.5)
display_markdown("Raw data (preceding plot):")
display(p_comp, mimetypes="text/plain")
../_images/practice-13_27_0.png

Raw data (preceding plot):

                    WAIC     SE       dWAIC     dSE      pWAIC    weight   
m13.2               199.0271 7.317398 0.0000000       NA 20.52833 0.4163127
m_rf_df_pred        199.9161 9.232676 0.8890021 5.761711 19.59478 0.2669165
m_rf_df_size        200.9270 7.066293 1.8998788 1.016211 21.21073 0.1610150
m_rf_df_interaction 200.9934 9.684257 1.9662939 6.970610 19.53916 0.1557559
Some Pareto k values are very high (>1). Set pointwise=TRUE to inspect individual points.
Some Pareto k values are very high (>1). Set pointwise=TRUE to inspect individual points.
Some Pareto k values are very high (>1). Set pointwise=TRUE to inspect individual points.
Some Pareto k values are very high (>1). Set pointwise=TRUE to inspect individual points.
../_images/practice-13_27_7.png

Raw data (preceding plot):

                    PSIS     SE        dPSIS     dSE      pPSIS    weight     
m_rf_df_interaction 215.4582 12.018328 0.0000000       NA 26.77156 0.376661627
m13.2               215.7296  8.498153 0.2713941 9.115141 28.87958 0.328865935
m_rf_df_pred        215.9888 11.378194 0.5306113 4.814114 27.63113 0.288888725
m_rf_df_size        223.8812  7.984385 8.4229865 9.625212 32.68783 0.005583713

13M3. Re-estimate the basic Reed frog varying intercept model, but now using a Cauchy distribution in place of the Gaussian distribution for the varying intercepts. That is, fit this model:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha_{tank[i]} \\ \alpha_{tank} & \sim Cauchy(\alpha, \sigma), tank = 1..48 \\ \alpha & \sim Normal(0, 1) \\ \sigma & \sim Exponential(1) \\ \end{align} \end{split}\]

(You are likely to see many divergent transitions for this model. Can you figure out why? Can you fix them?) Compare the posterior means of the intercepts, \(\alpha_{tank}\), to the posterior means produced in the chapter, using the customary Gaussian prior. Can you explain the pattern of differences? Take note of any change in the mean \(\alpha\) as well.

Answer. First, let’s reproduce some of the plots from the chapter. Similar to the approach in the R code 13.22 box and elsewhere, we’ll print the raw output of precis for a model before its plots:

data(reedfrogs)
rf_df <- reedfrogs
rf_df$tank <- 1:nrow(rf_df)

rf_dat <- list(
  S = rf_df$surv,
  N = rf_df$density,
  tank = rf_df$tank
)

## R code 13.3
m13.2 <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank],
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE
)

plot_means <- function(post, plot_main) {
  # compute mean intercept for each tank
  # also transform to probability with logistic
  rf_df$propsurv.est <- logistic(apply(post$a, 2, mean))

  iplot(function() {
    # display raw proportions surviving in each tank
    plot(rf_df$propsurv,
      ylim = c(0, 1), pch = 16, xaxt = "n",
      xlab = "tank", ylab = "proportion survival", col = rangi2,
      main=plot_main
    )
    axis(1, at = c(1, 16, 32, 48), labels = c(1, 16, 32, 48))

    # overlay posterior means
    points(rf_df$propsurv.est)

    # mark posterior mean probability across tanks
    abline(h = mean(inv_logit(post$a_bar)), lty = 2)

    # draw vertical dividers between tank densities
    abline(v = 16.5, lwd = 0.5)
    abline(v = 32.5, lwd = 0.5)
    text(8, 0, "small tanks")
    text(16 + 8, 0, "medium tanks")
    text(32 + 8, 0, "large tanks")
  })
}

display(precis(m13.2, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m13.2, depth=2), main='m13.2')
}, ar=1.0)
post <- extract.samples(m13.2)
plot_means(post, "m13.2")
      mean         sd        5.5%        94.5%         n_eff    Rhat4    
a[1]   2.166532834 0.8415777  0.94352287  3.5504316180 2487.705 0.9994595
a[2]   3.083820555 1.1286709  1.46455255  5.0701086864 2743.009 1.0004070
a[3]   1.016293639 0.6985934 -0.04076748  2.2287308266 3758.880 0.9991277
a[4]   3.043958298 1.0851385  1.49588614  4.9413883041 2912.386 0.9998120
a[5]   2.139423360 0.8904333  0.84651899  3.6387978108 3858.914 0.9997494
a[6]   2.106234009 0.8548032  0.83426377  3.5369092492 4052.181 0.9988760
a[7]   3.074045734 1.0832856  1.43759964  4.9418012232 2633.082 0.9996423
a[8]   2.155537339 0.8840924  0.86141276  3.6540940431 2528.486 0.9995637
a[9]  -0.183009287 0.6181161 -1.18128721  0.7753945948 4670.277 0.9984721
a[10]  2.155175584 0.8775133  0.86118519  3.6029886441 2723.008 0.9996099
a[11]  1.016263791 0.7156066 -0.07593707  2.2010431997 4537.656 0.9989630
a[12]  0.586259955 0.6082589 -0.35700958  1.5598335369 2945.767 0.9991685
a[13]  0.993370216 0.6616831  0.02932024  2.0340490072 3422.754 0.9988980
a[14]  0.200205841 0.5954573 -0.74466849  1.1526114496 6168.260 0.9985134
a[15]  2.160777602 0.8852034  0.89140831  3.7249424638 2769.060 0.9984947
a[16]  2.135269447 0.8728510  0.82302708  3.5372001103 3466.436 0.9996513
a[17]  2.920866171 0.8218896  1.73692434  4.3532783713 3032.486 0.9983675
a[18]  2.389906265 0.6457832  1.45755634  3.5092229445 3162.778 0.9996631
a[19]  2.012956911 0.5895563  1.11296558  2.9988914401 3153.116 0.9986043
a[20]  3.663996701 0.9911659  2.27516564  5.3380689361 2711.051 1.0013611
a[21]  2.395792576 0.6640905  1.42501176  3.5243121341 2953.745 0.9988757
a[22]  2.414117363 0.6710001  1.41611035  3.5427246122 2902.783 0.9986059
a[23]  2.407671181 0.6921338  1.41916362  3.5353627632 2459.663 0.9989985
a[24]  1.704750052 0.5293150  0.90629218  2.5495097822 3651.669 0.9988592
a[25] -0.994319675 0.4532927 -1.75235510 -0.2927055601 4017.570 0.9988534
a[26]  0.164556033 0.3937696 -0.45773453  0.8249454460 4677.469 0.9990814
a[27] -1.428287992 0.4943052 -2.24252972 -0.7033488354 3889.606 0.9990090
a[28] -0.473645576 0.4181870 -1.14362004  0.1987316685 4193.475 0.9988749
a[29]  0.152870077 0.3892231 -0.45690627  0.7788458084 5067.647 0.9987500
a[30]  1.448648952 0.4836664  0.69532944  2.2347934380 4506.236 0.9990361
a[31] -0.645829116 0.4135679 -1.31555328  0.0006253300 4252.301 0.9999317
a[32] -0.298607182 0.3836891 -0.92584145  0.2988842719 4034.821 0.9992044
a[33]  3.168398158 0.7186510  2.10514401  4.3791047937 2653.573 1.0008589
a[34]  2.716821745 0.6388658  1.74062328  3.7972089794 3392.903 0.9988065
a[35]  2.697692077 0.6373798  1.76997299  3.7864685410 3245.377 0.9994348
a[36]  2.071199176 0.5233694  1.31121377  2.9624441747 3371.962 1.0006748
a[37]  2.060716382 0.5081973  1.29818835  2.9219161746 3209.431 0.9985773
a[38]  3.911450602 1.0091420  2.47520448  5.6659437217 2368.681 1.0009199
a[39]  2.721275412 0.6679960  1.72508201  3.8795193702 2366.981 0.9999171
a[40]  2.335837792 0.5590744  1.48314458  3.2869636774 3345.704 0.9987150
a[41] -1.792990832 0.4670499 -2.56949264 -1.0765227758 3350.595 0.9991847
a[42] -0.574124729 0.3529837 -1.14964530 -0.0288867842 5625.134 0.9986725
a[43] -0.452584154 0.3316090 -0.98495100  0.0796395604 3678.316 0.9994217
a[44] -0.337160405 0.3427056 -0.89481695  0.2156299349 3397.883 0.9987282
a[45]  0.578105526 0.3331420  0.04870016  1.1144327713 3446.767 0.9988275
a[46] -0.569168356 0.3490973 -1.14603518 -0.0001102425 4554.960 0.9988415
a[47]  2.053672364 0.5490424  1.21890748  3.0052046403 4200.242 0.9989221
a[48]  0.003217844 0.3466782 -0.55469935  0.5485189019 4947.435 0.9987783
a_bar  1.351105699 0.2543945  0.94780768  1.7627148219 2829.865 1.0009259
sigma  1.619300421 0.2201868  1.30942421  1.9943555871 1471.004 1.0029568
[Figures: coefficient plot of precis(m13.2, depth=2); observed vs. posterior mean survival proportions by tank for m13.2]

As promised, sampling from the model produces divergent transitions:

m_tank_cauchy_orig <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank],
    a[tank] ~ dcauchy(a_bar, sigma),
    a_bar ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE
)
Warning message:
“There were 130 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

Adjusting adapt_delta does little to reduce the number of divergent transitions:

m_tank_cauchy <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank],
    a[tank] ~ dcauchy(a_bar, sigma),
    a_bar ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE, control=list(adapt_delta=0.99)
)
recompiling to avoid crashing R session
Warning message:
“There were 94 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

Let’s examine the pairs() plot suggested in the warning message (for only a few parameters):

sel_pars = c("a_bar", "sigma", "a[41]", "a[38]")
iplot(function() {
  pairs(m_tank_cauchy@stanfit, pars=sel_pars)
})
iplot(function() {
  traceplot(m_tank_cauchy, pars=sel_pars)
}, ar=2)
[Figures: pairs() plot and traceplot of m_tank_cauchy for a_bar, sigma, a[41], a[38]]

For comparison, these are the same plots for the Devil’s Funnel (i.e. Neal’s Funnel):

m13.7 <- ulam(
  alist(
    v ~ normal(0, 3),
    x ~ normal(0, exp(v))
  ),
  data = list(N = 1), chains = 4
)
display(precis(m13.7), mimetypes="text/plain")
iplot(function() {
  plot(precis(m13.7), main='m13.7')
}, ar=4.5)
SAMPLING FOR MODEL '3a5530af995ac1418f5e59b7607ceed3' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 8e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 1: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 1: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 1: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 1: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.067334 seconds (Warm-up)
Chain 1:                0.545005 seconds (Sampling)
Chain 1:                0.612339 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL '3a5530af995ac1418f5e59b7607ceed3' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 4e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 2: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 2: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 2: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 2: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.016038 seconds (Warm-up)
Chain 2:                0.024405 seconds (Sampling)
Chain 2:                0.040443 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL '3a5530af995ac1418f5e59b7607ceed3' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 3: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 3: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 3: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 3: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 3: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 3: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 3: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 3: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 3: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 3: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 3: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.030976 seconds (Warm-up)
Chain 3:                0.033322 seconds (Sampling)
Chain 3:                0.064298 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL '3a5530af995ac1418f5e59b7607ceed3' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 4: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 4: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 4: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 4: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 4: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 4: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 4: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 4: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 4: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 4: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 4: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.022122 seconds (Warm-up)
Chain 4:                0.013651 seconds (Sampling)
Chain 4:                0.035773 seconds (Total)
Chain 4: 
Warning message:
“There were 59 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“There were 262 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“The largest R-hat is 1.2, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”
  mean      sd         5.5%      94.5%     n_eff     Rhat4   
v  1.096171   2.278782  -2.66222  4.626455  15.60993 1.201442
x 22.061459 164.909855 -27.64887 40.783227 100.21564 1.048952
[Figure: coefficient plot of precis(m13.7)]

This model produces warnings even when producing the pairs() plot:

iplot(function() {
  pairs(m13.7@stanfit)
})
iplot(function() {
  traceplot(m13.7)
}, ar=2)
Warning message in KernSmooth::bkde2D(x, bandwidth = bandwidth, gridsize = nbin, :
“Binning grid too coarse for current (small) bandwidth: consider increasing 'gridsize'”
Warning message in KernSmooth::bkde2D(x, bandwidth = bandwidth, gridsize = nbin, :
“Binning grid too coarse for current (small) bandwidth: consider increasing 'gridsize'”
Warning message in KernSmooth::bkde2D(x, bandwidth = bandwidth, gridsize = nbin, :
“Binning grid too coarse for current (small) bandwidth: consider increasing 'gridsize'”
[Figures: pairs() plot and traceplot of m13.7]

Unlike the pairs() plot from the Funnel, the divergent transitions produced by the Cauchy distribution are not associated with steep contours.

Let’s at least attempt to reparameterize the model to confirm whether or not it will help. One way to think about reparameterizing a model is as factoring the quantile function. As explained in Quantile function, a sample from a given distribution may be obtained in principle by applying its quantile function to a sample from a uniform distribution. The quantile function for the Normal distribution is:

\[ Q(p, \mu, \sigma) = \mu + \sigma \sqrt{2} \cdot erf^{-1} (2p - 1) \]

So for the standard normal:

\[ Q_s(p) = \sqrt{2} \cdot erf^{-1}(2p - 1) \]

Imagining p comes from a uniform distribution in both cases, we can write:

\[ Q(p, \mu, \sigma) = \mu + \sigma Q_s(p) \]

Starting from the quantile function for the Cauchy distribution:

\[ Q(p, x_0, \gamma) = x_0 + \gamma \tan\left[\pi\left(p - \frac{1}{2}\right)\right] \]

Define the standard Cauchy distribution quantile function as:

\[ Q_s(p) = \tan\left[\pi\left(p - \frac{1}{2}\right)\right] \]

So that:

\[ Q(p, x_0, \gamma) = x_0 + \gamma Q_s(p) \]
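
As a quick sanity check of this factoring (a sketch using base R, not from the chapter), we can verify that shifting and scaling standard Cauchy quantiles reproduces the general Cauchy quantile function qcauchy; the values of x0 and gamma are arbitrary and purely illustrative:

# Check that Q(p, x0, gamma) = x0 + gamma * Q_s(p) for the Cauchy distribution.
# qcauchy(p, location, scale) is base R's Cauchy quantile function.
set.seed(13)
p <- runif(5)     # uniform draws play the role of p
x0 <- 1.4         # arbitrary location, for illustration only
gamma <- 0.8      # arbitrary scale, for illustration only
all.equal(qcauchy(p, location = x0, scale = gamma), x0 + gamma * qcauchy(p))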

Finally, define a new non-centered model:

\[\begin{split} \begin{align} S_i & \sim Binomial(N_i,p_i) \\ logit(p_i) & = \alpha + \sigma \cdot a_{tank[i]} \\ a_{tank} & \sim Cauchy(0, 1), tank = 1..48 \\ \alpha & \sim Normal(0, 1) \\ \sigma & \sim Exponential(1) \\ \end{align} \end{split}\]

Unfortunately, this model produces the same divergent transitions:

m_tank_noncen_cauchy <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a_bar + sigma * std_c[tank],
    std_c[tank] ~ dcauchy(0, 1),
    a_bar ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE
)
Warning message:
“There were 116 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“The largest R-hat is 1.05, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

What’s more likely going on here is that the energy sanity checks in MCMC (the divergence diagnostics) are too tight for the extreme deviates produced by the Cauchy distribution. Consider the quantile function for the Cauchy distribution above, which is defined in terms of the tangent function. This definition produces extreme negative or positive deviates when p is near zero or one: the tangent term approaches negative infinity as p approaches zero and positive infinity as p approaches one.
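
To illustrate how heavy these tails are (a sketch using base R, not from the chapter), compare Cauchy and Normal quantiles at probabilities near the extremes:

# Cauchy quantiles explode much faster than Normal quantiles near p = 0 or p = 1.
p <- c(0.001, 0.01, 0.5, 0.99, 0.999)
round(rbind(cauchy = qcauchy(p), normal = qnorm(p)), 2)
# The 0.001 and 0.999 Cauchy quantiles are roughly -318 and +318, versus about
# -3.1 and +3.1 for the standard Normal.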

Notice in the pairs() plot above, in the trace plot, and in the number of effective samples that larger parameters such as a[38] are harder to sample than a[41]. Part of the reason may be that proposals in this region of the posterior tend to be rejected as divergent transitions, even on the rare occasions when the sampler reaches it.

In the posterior distribution, parameter a[38] is much less certain than a[41]. In both the original and the Cauchy model, parameter 38 is inferred to be larger than 41. Large parameters are going to be more uncertain under the Cauchy distribution, and in fact under any distribution with a thick tail: observing a large value implies a rather large deviate from the distribution, and any specific large deviate is about as likely as other large deviates because the long tail is relatively flat. Additionally, a large observation is consistent with a wide range of parameter values, because every parameterization has thick tails that allow for the observation. The same observation could be made of the Student-t inferences in the next question. The increase in uncertainty for larger parameters exists even in model m13.2, but it is much more pronounced for thick-tailed distributions.
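
The relative flatness of the thick tails can be seen directly in density ratios; here is a small sketch using base R densities, where dt with two degrees of freedom stands in for a standard Student-t with \(\nu = 2\):

# Ratio of densities at two large values. For the Normal the ratio is essentially
# zero, so a value near 8 is far less plausible than one near 5; for the Cauchy
# and the Student-t (nu = 2) the ratio stays well above zero, so large values are
# only weakly distinguished from even larger ones.
dnorm(8) / dnorm(5)            # ~3e-09
dcauchy(8) / dcauchy(5)        # ~0.4
dt(8, df = 2) / dt(5, df = 2)  # ~0.26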

display(precis(m_tank_cauchy, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m_tank_cauchy, depth=2), main='m_tank_cauchy')
}, ar=1.0)
      mean        sd        5.5%        94.5%        n_eff     Rhat4    
a[1]   1.99647046 0.8401515  0.92399643  3.411510128 1154.0952 0.9997665
a[2]   4.80818116 4.3096068  1.48628478 12.538800150  273.8290 1.0104956
a[3]   1.10831364 0.6388928  0.12340151  2.146160772 1778.2777 1.0000306
a[4]   5.30629433 5.3564101  1.47513932 15.746841449  149.8555 1.0095211
a[5]   2.01941918 0.9213374  0.82533394  3.581618325 1311.4512 1.0019060
a[6]   1.99709160 0.8042129  0.88130554  3.357805723 1253.6649 0.9997312
a[7]   4.62571446 4.2960997  1.42773290 12.903590399  232.4462 1.0208917
a[8]   1.98723785 0.8665473  0.87453012  3.385687794  666.2308 1.0093281
a[9]  -0.11674686 0.6907840 -1.23520967  0.970768429 1864.3198 0.9993282
a[10]  2.00446437 0.8428720  0.84037345  3.472374934 1588.3626 1.0003484
a[11]  1.12839957 0.6229925  0.15387665  2.141774357 1746.7038 1.0015580
a[12]  0.74692410 0.6108493 -0.24639222  1.690552636 3001.9513 0.9989188
a[13]  1.10472578 0.6181940  0.14641442  2.090989992 1948.0771 1.0001995
a[14]  0.34700095 0.6481867 -0.73376079  1.332006978 1932.4623 0.9993175
a[15]  2.02611832 0.8319995  0.89262532  3.456077732 1362.0597 0.9993527
a[16]  1.99018856 0.8141311  0.86684468  3.377906856 1278.3009 0.9996515
a[17]  2.88291700 0.9725162  1.65689708  4.621965258 1106.5265 0.9999320
a[18]  2.28946763 0.6651255  1.33708509  3.496878809 1415.6688 1.0005361
a[19]  1.91066070 0.5397316  1.11289112  2.827356144 2487.7280 0.9992930
a[20]  6.84957079 5.9057550  2.32855017 18.830080271  193.3365 1.0128372
a[21]  2.26143930 0.6775916  1.34902977  3.419180255 1205.6598 1.0018396
a[22]  2.30246763 0.7130372  1.34260846  3.547126606 1331.2158 1.0005971
a[23]  2.25968364 0.6193421  1.39585928  3.306143417 1792.2083 1.0001322
a[24]  1.66404523 0.5033919  0.90690609  2.493801685 1581.4321 0.9997193
a[25] -1.05983114 0.4649516 -1.82969118 -0.356337807 2414.2958 1.0000577
a[26]  0.24965270 0.4140912 -0.41622083  0.911340541 2077.2350 0.9991681
a[27] -1.60081415 0.5933089 -2.57146578 -0.696736300 1492.8493 1.0007407
a[28] -0.47376825 0.4273280 -1.16869084  0.185317259 2332.1189 0.9993965
a[29]  0.23182910 0.4073967 -0.43407257  0.895983369 2132.4610 1.0015407
a[30]  1.45208215 0.4473584  0.76761621  2.183423358 1726.0879 0.9997407
a[31] -0.64793467 0.4393736 -1.37480740  0.036584466 2302.0482 0.9997767
a[32] -0.26509198 0.4124485 -0.94142099  0.411683935 2538.8333 0.9996077
a[33]  3.19639133 0.9039710  2.01616557  4.849090457 1456.2612 1.0019402
a[34]  2.59748876 0.6725988  1.64362812  3.814365409 1576.6584 1.0028913
a[35]  2.57970591 0.6378767  1.69493708  3.691241236 1730.5200 0.9997945
a[36]  1.96987300 0.4711234  1.27146874  2.764291753 1840.7914 1.0000677
a[37]  1.94082591 0.4591843  1.24639942  2.717371014 2527.5515 0.9990169
a[38]  7.59653965 5.6085767  2.75382037 18.801685780  368.1091 1.0046948
a[39]  2.58616652 0.6501753  1.68806860  3.728043530 1734.8279 0.9995514
a[40]  2.25867244 0.5566407  1.46441712  3.217474682 1877.2075 0.9994741
a[41] -1.99566626 0.5398651 -2.91269006 -1.162880133 2408.0692 0.9997810
a[42] -0.57325485 0.3544456 -1.13037844  0.007080425 2565.8816 0.9997211
a[43] -0.43998249 0.3550993 -1.00404870  0.115039569 1883.9716 0.9999241
a[44] -0.31880578 0.3490138 -0.91182391  0.243305851 1782.7352 1.0001195
a[45]  0.64508137 0.3441780  0.08143198  1.201752738 1774.2780 0.9998679
a[46] -0.56317756 0.3714863 -1.16481888 -0.001210898 2442.0758 0.9989250
a[47]  1.96987255 0.4822971  1.26215186  2.798774488 2079.1188 1.0004091
a[48]  0.04917341 0.3455890 -0.53072736  0.592330075 2594.2359 1.0008281
a_bar  1.42980194 0.2966167  0.92918352  1.899325658 1297.2857 1.0004267
sigma  1.01571455 0.2258337  0.69331489  1.413051988 1117.9858 0.9990079
[Figure: coefficient plot of precis(m_tank_cauchy, depth=2)]

The same trends in shrinkage exist in this model as in the original model. Because of the thicker tails, the model doesn’t apply as much shrinkage to extreme observations. Because of the difficulty in sampling, the shrinkage is also more variable.

post <- extract.samples(m_tank_cauchy)
plot_means(post, "m_tank_cauchy")
[Figure: observed vs. posterior mean survival proportions by tank for m_tank_cauchy]
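
To compare the intercept means directly (a sketch, not part of the chapter code), we can plot each tank’s posterior mean under the Gaussian prior against its mean under the Cauchy prior; the tanks with the most extreme raw survival proportions are the ones that escape far above the diagonal:

# Scatter of per-tank posterior means: Gaussian prior (m13.2) vs. Cauchy prior.
post_gauss <- extract.samples(m13.2)
post_cauchy <- extract.samples(m_tank_cauchy)
a_gauss <- apply(post_gauss$a, 2, mean)
a_cauchy <- apply(post_cauchy$a, 2, mean)

iplot(function() {
  plot(a_gauss, a_cauchy,
    xlab = "posterior mean a[tank] (Gaussian prior)",
    ylab = "posterior mean a[tank] (Cauchy prior)",
    col = rangi2, pch = 16, main = "m13.2 vs. m_tank_cauchy"
  )
  abline(a = 0, b = 1, lty = 2) # points above this line are shrunk less by the Cauchy model
})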

13M4. Now use a Student-t distribution with \(\nu = 2\) for the intercepts:

\[ \alpha_{tank} \sim Student(2, \alpha, \sigma) \]

Refer back to the Student-t example in Chapter 7 (page 234), if necessary. Compare the resulting posterior to both the original model and the Cauchy model in 13M3. Can you explain the differences and similarities in shrinkage in terms of the properties of these distributions?

Answer. This model also produces divergent transitions, though fewer than the Cauchy model, likely because the Student-t tails with \(\nu = 2\) are thinner than the Cauchy tails but still thick enough to produce extreme deviates:

m_tank_student_t <- ulam(
  alist(
    S ~ dbinom(N, p),
    logit(p) <- a[tank],
    a[tank] ~ dstudent(2, a_bar, sigma),
    a_bar ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ),
  data = rf_dat, chains = 4, cores = 4, log_lik = TRUE
)
Warning message:
“There were 38 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

Although we won’t address these divergent transitions, let’s at least check how much mixing is occurring:

iplot(function() {
  pairs(m_tank_student_t@stanfit, pars=sel_pars)
})
iplot(function() {
  traceplot(m_tank_student_t, pars=sel_pars)
}, ar=2)

display(precis(m_tank_student_t, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m_tank_student_t, depth=2), main='m_tank_student_t')
}, ar=1.0)
[Figures: pairs() plot and traceplot for the selected parameters]
      mean        sd        5.5%        94.5%       n_eff     Rhat4    
a[1]   2.05346111 0.8798277  0.83265412  3.56529641 1894.7674 0.9995025
a[2]   4.20350410 4.0843261  1.38667541  9.21600134  107.4649 1.0292177
a[3]   1.01892742 0.6158019  0.07690434  2.04176014 2503.2046 0.9997456
a[4]   4.12635920 3.5855378  1.48243761  9.62940287  147.0624 1.0308590
a[5]   2.03154539 0.8197053  0.84061165  3.45192050 1735.9796 1.0001229
a[6]   2.03098082 0.8490672  0.82347559  3.45006273 2207.8372 0.9996984
a[7]   4.04334784 3.0275362  1.47342549  9.55036941  337.6570 1.0060960
a[8]   2.04606209 0.8369052  0.90828960  3.54123093 1882.4850 0.9998630
a[9]  -0.13729933 0.6595361 -1.19020064  0.83071439 3169.4645 0.9994058
a[10]  2.03703508 0.8150551  0.90136601  3.46128118 2164.3847 0.9995474
a[11]  1.02671169 0.6772876 -0.02383362  2.13218408 2175.9477 1.0014099
a[12]  0.64859784 0.6286126 -0.30551849  1.67850140 2388.1551 1.0004375
a[13]  1.03646789 0.6237585  0.08554072  2.07076385 3016.4163 0.9994354
a[14]  0.27977924 0.6359484 -0.74574317  1.29352239 2538.3967 0.9989464
a[15]  2.06188663 0.8814039  0.82801833  3.54209877 1789.8560 1.0014399
a[16]  2.05181595 0.8626070  0.82361803  3.58798723 2255.2025 0.9990352
a[17]  2.87728537 0.9018348  1.67913029  4.42794493 1669.0232 0.9984970
a[18]  2.31642241 0.6505178  1.40508668  3.44151751 2176.2193 1.0009349
a[19]  1.93409433 0.5635859  1.09208926  2.89620308 2489.2512 0.9995286
a[20]  5.62055720 4.4598315  2.31325484 12.07168268  110.2271 1.0266536
a[21]  2.33508291 0.6813357  1.38123332  3.49585048 1312.3687 0.9994553
a[22]  2.31284256 0.6739769  1.35632798  3.49685589 1720.3227 1.0015946
a[23]  2.31846906 0.6656521  1.35422474  3.46551041 1526.6936 1.0013160
a[24]  1.66757814 0.4979423  0.92188589  2.47879730 2566.5212 0.9990095
a[25] -1.03392567 0.4377526 -1.74734747 -0.35294034 2451.6089 1.0012125
a[26]  0.21743198 0.4016209 -0.42224599  0.84061306 2986.4681 0.9992165
a[27] -1.54297844 0.5241730 -2.41338600 -0.72323073 2507.5654 1.0013788
a[28] -0.45910639 0.4106750 -1.13102151  0.17563578 2537.2157 0.9983397
a[29]  0.19308979 0.4060744 -0.44687677  0.84283026 2835.0891 1.0006900
a[30]  1.44396469 0.4795011  0.70458177  2.26577005 2537.2308 1.0009585
a[31] -0.64919615 0.4276299 -1.35542210  0.03176932 3001.9487 1.0000804
a[32] -0.27183247 0.4225807 -0.95164602  0.37888470 3084.7988 0.9996074
a[33]  3.24564052 0.9518985  2.01484314  4.89867158 1173.2678 1.0001080
a[34]  2.64030469 0.6641608  1.69389011  3.79058736 1913.0803 0.9991754
a[35]  2.65270155 0.6597212  1.70469063  3.80608662 1840.8487 0.9995334
a[36]  2.00233095 0.4929151  1.26489902  2.80231183 2761.8043 0.9983534
a[37]  2.00704751 0.4778764  1.29594172  2.81999828 2596.9392 1.0004952
a[38]  6.13837367 4.5887634  2.68258408 13.92001457  271.8254 1.0152496
a[39]  2.63724614 0.6646627  1.69818508  3.79183391 2265.9681 0.9996580
a[40]  2.27877698 0.5619549  1.46347232  3.23645274 2127.5588 0.9997687
a[41] -1.95569134 0.5293822 -2.85168821 -1.15603423 2487.4041 0.9988101
a[42] -0.56751946 0.3550501 -1.14914849 -0.01421364 2918.9527 0.9994021
a[43] -0.42325442 0.3591591 -1.00059159  0.13647542 3016.4850 0.9990144
a[44] -0.31871452 0.3371112 -0.83726287  0.20877479 2574.8550 0.9986924
a[45]  0.61255997 0.3474697  0.07495584  1.16294731 3150.4886 0.9987996
a[46] -0.56930698 0.3664327 -1.17224364  0.02203844 2615.8171 1.0000332
a[47]  2.00375105 0.4919450  1.28091427  2.84422997 2530.8236 0.9990265
a[48]  0.03866316 0.3372636 -0.48603363  0.57295427 2800.7939 0.9990020
a_bar  1.34075975 0.2687579  0.91847535  1.76581758 2095.4535 0.9999049
sigma  1.27364954 0.2198242  0.95907257  1.64193321 1612.9297 1.0009702
[Figure: coefficient plot of precis(m_tank_student_t, depth=2)]

Similar trends in shrinkage exist in this model. Because of the thicker tails, the model doesn’t apply as much shrinkage to extreme observations. Improved sampling relative to the Cauchy distribution seems to have produced slightly more consistent shrinkage.

post <- extract.samples(m_tank_student_t)
plot_means(post, "m_tank_student_t")
[Figure: observed vs. posterior mean survival proportions by tank for m_tank_student_t]
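
To quantify "more consistent shrinkage" a little, here is a rough sketch (under the assumption that average movement away from the raw proportions is a reasonable summary) measuring how far, on average, each model’s posterior mean survival moves from the raw proportion in each tank:

# Average absolute shrinkage on the probability scale, per model.
mean_shrinkage <- function(model) {
  post <- extract.samples(model)
  p_est <- logistic(apply(post$a, 2, mean))
  mean(abs(p_est - rf_df$propsurv))
}
sapply(
  list(gaussian = m13.2, cauchy = m_tank_cauchy, student_t = m_tank_student_t),
  mean_shrinkage
)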

13M5. Modify the cross-classified chimpanzees model m13.4 so that the adaptive prior for blocks contains a parameter \(\gamma\) for its mean:

\[\begin{split} \begin{align} \gamma_j & \sim Normal(\bar{\gamma}, \sigma_{\gamma}) \\ \bar{\gamma} & \sim Normal(0, 1.5) \end{align} \end{split}\]

Compare this model to m13.4. What has including \(\gamma\) done?

Answer. First, let’s reproduce the relevant plots from the chapter:

data(chimpanzees)
d <- chimpanzees
d$treatment <- 1 + d$prosoc_left + 2 * d$condition

dat_list <- list(
  pulled_left = d$pulled_left,
  actor = d$actor,
  block_id = d$block,
  treatment = as.integer(d$treatment)
)

set.seed(13)
m13.4 <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + g[block_id] + b[treatment],
    b[treatment] ~ dnorm(0, 0.5),
    ## adaptive priors
    a[actor] ~ dnorm(a_bar, sigma_a),
    g[block_id] ~ dnorm(0, sigma_g),
    ## hyper-priors
    a_bar ~ dnorm(0, 1.5),
    sigma_a ~ dexp(1),
    sigma_g ~ dexp(1)
  ),
  data = dat_list, chains = 4, cores = 4, log_lik = TRUE
)

display(precis(m13.4, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m13.4, depth=2), main='m13.4')
}, ar=2.0)
Warning message:
“There were 8 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”
        mean        sd        5.5%         94.5%       n_eff    Rhat4   
b[1]    -0.11615950 0.3005730 -0.590557431  0.35487909 341.9669 1.013459
b[2]     0.39798057 0.3082406 -0.091243273  0.89572497 382.5939 1.008442
b[3]    -0.46319143 0.3074857 -0.955775110  0.02294964 442.0306 1.008027
b[4]     0.29072509 0.3022209 -0.198956418  0.77520474 340.5103 1.005458
a[1]    -0.37237534 0.3735216 -0.952143414  0.24461789 360.9980 1.005206
a[2]     4.64362246 1.2404897  3.062224463  6.73789096 677.9918 1.012002
a[3]    -0.68062851 0.3697046 -1.272373182 -0.08899088 286.9193 1.005228
a[4]    -0.67622831 0.3732220 -1.236789673 -0.06893589 371.2960 1.007441
a[5]    -0.35994244 0.3656058 -0.921735209  0.25147471 354.5740 1.007732
a[6]     0.57366442 0.3809162 -0.004217109  1.17408617 325.9938 1.012455
a[7]     2.09973745 0.4622194  1.398033779  2.83875730 431.2037 1.006421
g[1]    -0.18166155 0.2473370 -0.672918642  0.08538534 225.9254 1.014807
g[2]     0.03786340 0.1936058 -0.230668997  0.35845635 410.2488 1.014838
g[3]     0.05187367 0.1948952 -0.216316191  0.38606310 389.5636 1.021056
g[4]     0.01162107 0.1920499 -0.286063273  0.31457637 444.1353 1.010494
g[5]    -0.03372976 0.1875264 -0.327363879  0.23893233 312.3261 1.014771
g[6]     0.11314564 0.2052871 -0.139423515  0.48303072 419.7813 1.017376
a_bar    0.55787921 0.7133251 -0.526962103  1.69749820 557.0130 1.008263
sigma_a  2.03959649 0.6769575  1.216298295  3.40139094 312.7584 1.008526
sigma_g  0.22291426 0.1816101  0.021827743  0.54077837 160.8728 1.024422
[Figure: coefficient plot of precis(m13.4, depth=2)]

When we add the new parameter, the number of divergent transitions increases:

m13.4_extra_param <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + g[block_id] + b[treatment],
    b[treatment] ~ dnorm(0, 0.5),
    ## adaptive priors
    a[actor] ~ dnorm(a_bar, sigma_a),
    g[block_id] ~ dnorm(g_bar, sigma_g),
    ## hyper-priors
    a_bar ~ dnorm(0, 1.5),
    g_bar ~ dnorm(0, 1.5),
    sigma_a ~ dexp(1),
    sigma_g ~ dexp(1)
  ),
  data = dat_list, chains = 4, cores = 4, log_lik = TRUE
)
Warning message:
“There were 385 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“The largest R-hat is 1.55, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

We also see many of the posterior intervals widen:

display(precis(m13.4_extra_param, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m13.4_extra_param, depth=2), main='m13.4_extra_param')
}, ar=2.0)
        mean        sd        5.5%        94.5%       n_eff      Rhat4   
b[1]    -0.17813736 0.2603696 -0.59177498  0.28168289 477.802182 1.020488
b[2]     0.26875222 0.3289717 -0.10305314  0.83007340   5.189463 1.300721
b[3]    -0.58919595 0.3080753 -0.92700257 -0.07270346   6.211768 1.236896
b[4]     0.18286977 0.2936379 -0.16282659  0.69268802  10.554156 1.144322
a[1]    -0.49538280 1.0579117 -2.39711975  0.79275830  10.970626 1.147074
a[2]     4.83839506 1.7132034  2.07716453  6.98893009   6.084632 1.258717
a[3]    -0.64527672 1.1988423 -2.64370490  0.72398162   5.102209 1.310564
a[4]    -0.76862008 1.0822653 -2.67498627  0.44261506   8.237649 1.191341
a[5]    -0.36943430 1.1816639 -2.38146859  1.04030622   5.514536 1.294735
a[6]     0.33962335 0.9951733 -1.49753082  1.71581973  43.629054 1.067636
a[7]     2.01019961 1.0936865  0.01590190  3.31309956   9.062989 1.164873
g[1]     0.07292304 0.9791712 -1.19950901  1.82572130  48.922049 1.080038
g[2]     0.23335141 0.9991374 -0.98702361  2.01905031  16.150120 1.120425
g[3]     0.25257601 1.0100317 -0.94932471  2.01178480  15.123122 1.122954
g[4]     0.21687297 1.0027637 -1.02863119  1.98399979  17.569246 1.114660
g[5]     0.18323217 0.9935132 -1.03603191  1.96190655  19.472040 1.105377
g[6]     0.29531996 1.0254596 -0.88359120  2.13254755  12.467377 1.133592
a_bar    0.34393322 0.9656677 -1.34241524  1.81217244 289.944388 1.023295
g_bar    0.20735506 0.9933946 -1.02400712  1.97511367  17.496528 1.113645
sigma_a  1.97322416 0.5484229  1.27586783  3.04137557 707.275121 1.016221
sigma_g  0.17771181 0.1706753  0.01894145  0.49719091   6.824275 1.220724
[Figure: coefficient plot of precis(m13.4_extra_param, depth=2)]

Notice the posterior for the new parameter \(\bar{\gamma}\) barely moves from its prior. Let’s change the prior to:

\[ \bar{\gamma} \sim Normal(5, 1.5) \]

With this change, we can come up with significantly different inferences:

m13.4_extra_param_new_prior <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + g[block_id] + b[treatment],
    b[treatment] ~ dnorm(0, 0.5),
    ## adaptive priors
    a[actor] ~ dnorm(a_bar, sigma_a),
    g[block_id] ~ dnorm(g_bar, sigma_g),
    ## hyper-priors
    a_bar ~ dnorm(0, 1.5),
    g_bar ~ dnorm(5, 1.5),
    sigma_a ~ dexp(1),
    sigma_g ~ dexp(1)
  ),
  data = dat_list, chains = 4, cores = 4, log_lik = TRUE
)

display(precis(m13.4_extra_param_new_prior, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m13.4_extra_param_new_prior, depth=2), main='m13.4_extra_param_new_prior')
}, ar=2.0)
Warning message:
“There were 42 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.”
Warning message:
“There were 171 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded”
Warning message:
“Examine the pairs() plot to diagnose sampling problems
”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”
        mean       sd        5.5%        94.5%       n_eff    Rhat4   
b[1]    -0.1982002 0.3104625 -0.72790681  0.29152455 647.2476 1.000009
b[2]     0.3332580 0.3027702 -0.15706229  0.80377376 557.5257 1.001001
b[3]    -0.5483467 0.3068423 -1.05140473 -0.05345565 705.2644 1.002405
b[4]     0.2159099 0.3072428 -0.28113841  0.71039389 726.5421 1.000120
a[1]    -3.5701226 1.2542313 -5.56103644 -1.66419461 148.8309 1.029586
a[2]     1.8570795 1.8344063 -0.81761566  5.02372101 275.7184 1.008268
a[3]    -3.8649515 1.2612236 -5.88895760 -1.97150792 149.4719 1.027678
a[4]    -3.8609477 1.2726703 -5.92074520 -1.91772504 149.9519 1.028724
a[5]    -3.5654391 1.2710183 -5.55586118 -1.66868112 146.0176 1.029502
a[6]    -2.6169841 1.2572365 -4.64793820 -0.68266901 145.9102 1.029996
a[7]    -1.0728047 1.2963241 -3.22741473  0.85734865 153.0102 1.025513
g[1]     3.1049763 1.2554391  1.13769846  5.07737529 139.8131 1.034503
g[2]     3.3204328 1.2509203  1.32067650  5.25879070 143.7018 1.028116
g[3]     3.3377147 1.2520669  1.35748335  5.28192803 143.3000 1.027749
g[4]     3.2873947 1.2514220  1.29297229  5.27030263 142.4782 1.030420
g[5]     3.2420809 1.2593901  1.24941038  5.19733086 144.4855 1.031121
g[6]     3.3967067 1.2540103  1.41208240  5.35444757 146.0944 1.026500
a_bar   -1.8202624 1.2145033 -3.79149760  0.10998530 232.1819 1.016298
g_bar    3.2892587 1.2428016  1.33463476  5.24329650 142.9446 1.029420
sigma_a  2.2105176 0.7628012  1.26305682  3.55048611 643.9483 1.007105
sigma_g  0.2233260 0.1670153  0.05083342  0.52403040 374.7944 1.018752
[Figure: coefficient plot of precis(m13.4_extra_param_new_prior, depth=2)]

All these symptoms are indicative of a non-identifiable parameter, as discussed in the chapter at the start of section 13.3.1. We encountered the same issue in question 13M1.
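
Because only the sum of the two mean parameters (together with the varying intercepts) enters the linear model, we’d expect the posterior for \(\bar{\alpha} + \bar{\gamma}\) to be much better behaved than the posterior for either parameter alone. A quick hedged check, not part of the original answer:

# Only the sum of the two means is (softly) identified by the data; its posterior
# should be narrower than either marginal posterior.
post <- extract.samples(m13.4_extra_param)
precis(data.frame(
  a_bar = post$a_bar,
  g_bar = post$g_bar,
  mean_sum = post$a_bar + post$g_bar
))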

ERROR:

Likewise, the tails of distributions strongly influence can (sic) outliers are shrunk or not towards the mean.

Likewise, the tails of distributions strongly influence whether outliers are shrunk towards the mean.

13M6. Sometimes the prior and the data (through the likelihood) are in conflict, because they concentrate around different regions of parameter space. What happens in these cases depends a lot upon the shape of the tails of the distributions. Likewise, the tails of distributions strongly influence can outliers are shrunk or not towards the mean. I want you to consider four different models to fit to one observation at \(y = 0\). The models differ only in the distributions assigned to the likelihood and the prior. Here are the four models:

Model NN:

\[\begin{split} \begin{align} y & \sim Normal(\mu, 1) \\ \mu & \sim Normal(10, 1) \end{align} \end{split}\]

Model TN:

\[\begin{split} \begin{align} y & \sim Student(2, \mu, 1) \\ \mu & \sim Normal(10, 1) \end{align} \end{split}\]

Model NT:

\[\begin{split} \begin{align} y & \sim Normal(\mu, 1) \\ \mu & \sim Student(2, 10, 1) \end{align} \end{split}\]

Model TT:

\[\begin{split} \begin{align} y & \sim Student(2, \mu, 1) \\ \mu & \sim Student(2, 10, 1) \end{align} \end{split}\]

Estimate the posterior distributions for these models and compare them. Can you explain the results, using the properties of the distributions?

Answer. Fit model NN:

m_nn <- ulam(
  alist(
    y ~ dnorm(mu, 1),
    mu ~ dnorm(10, 1)
  ),
  data = list(y = 0), chains = 4
)
flush.console()
SAMPLING FOR MODEL '143b90eac39af089032ce5f1b377f41a' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 7e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 1: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 1: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 1: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 1: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.00475 seconds (Warm-up)
Chain 1:                0.004517 seconds (Sampling)
Chain 1:                0.009267 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL '143b90eac39af089032ce5f1b377f41a' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 3e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 2: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 2: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 2: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 2: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.004598 seconds (Warm-up)
Chain 2:                0.004455 seconds (Sampling)
Chain 2:                0.009053 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL '143b90eac39af089032ce5f1b377f41a' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 3: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 3: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 3: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 3: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 3: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 3: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 3: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 3: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 3: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 3: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 3: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.004563 seconds (Warm-up)
Chain 3:                0.004448 seconds (Sampling)
Chain 3:                0.009011 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL '143b90eac39af089032ce5f1b377f41a' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 4: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 4: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 4: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 4: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 4: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 4: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 4: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 4: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 4: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 4: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 4: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.00456 seconds (Warm-up)
Chain 4:                0.003603 seconds (Sampling)
Chain 4:                0.008163 seconds (Total)
Chain 4: 

Output of precis for model NN:

display(precis(m_nn), mimetypes="text/plain")
      result     
mean    4.9778639
sd      0.6627867
5.5%    3.9302048
94.5%   6.0280244
n_eff 749.2425503
Rhat    1.0038182

Fit model TN:

m_tn <- ulam(
  alist(
    y ~ dstudent(2, mu, 1),
    mu ~ dnorm(10, 1)
  ),
  data = list(y = 0), chains = 4
)
flush.console()
SAMPLING FOR MODEL '63059b348951686cf386007a367cbbb4' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 1: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 1: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 1: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 1: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.003993 seconds (Warm-up)
Chain 1:                0.004455 seconds (Sampling)
Chain 1:                0.008448 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL '63059b348951686cf386007a367cbbb4' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 2e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 2: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 2: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 2: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 2: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.004454 seconds (Warm-up)
Chain 2:                0.004449 seconds (Sampling)
Chain 2:                0.008903 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL '63059b348951686cf386007a367cbbb4' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 3: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 3: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 3: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 3: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 3: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 3: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 3: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 3: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 3: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 3: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 3: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.004611 seconds (Warm-up)
Chain 3:                0.004021 seconds (Sampling)
Chain 3:                0.008632 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL '63059b348951686cf386007a367cbbb4' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 4: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 4: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 4: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 4: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 4: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 4: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 4: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 4: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 4: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 4: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 4: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.004581 seconds (Warm-up)
Chain 4:                0.004276 seconds (Sampling)
Chain 4:                0.008857 seconds (Total)
Chain 4: 
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”

Output of precis for model TN:

display(precis(m_tn), mimetypes="text/plain")
      result    
mean    9.731344
sd      1.029323
5.5%    8.098945
94.5%  11.393812
n_eff 530.009814
Rhat    1.008176

Fit model NT:

m_nt <- ulam(
  alist(
    y ~ dnorm(mu, 1),
    mu ~ dstudent(2, 10, 1)
  ),
  data = list(y = 0), chains = 4
)
flush.console()
SAMPLING FOR MODEL '5190f924283a6d1bedaa2b8ac8053620' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 1: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 1: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 1: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 1: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.004309 seconds (Warm-up)
Chain 1:                0.004242 seconds (Sampling)
Chain 1:                0.008551 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL '5190f924283a6d1bedaa2b8ac8053620' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 4e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 2: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 2: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 2: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 2: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.004606 seconds (Warm-up)
Chain 2:                0.004203 seconds (Sampling)
Chain 2:                0.008809 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL '5190f924283a6d1bedaa2b8ac8053620' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 3: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 3: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 3: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 3: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 3: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 3: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 3: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 3: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 3: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 3: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 3: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.004703 seconds (Warm-up)
Chain 3:                0.004282 seconds (Sampling)
Chain 3:                0.008985 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL '5190f924283a6d1bedaa2b8ac8053620' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 4: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 4: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 4: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 4: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 4: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 4: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 4: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 4: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 4: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 4: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 4: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.004672 seconds (Warm-up)
Chain 4:                0.004342 seconds (Sampling)
Chain 4:                0.009014 seconds (Total)
Chain 4: 

Output of precis for model NT:

display(precis(m_nt), mimetypes="text/plain")
      result     
mean    0.2717644
sd      0.9905853
5.5%   -1.3059760
94.5%   1.9153333
n_eff 713.2714690
Rhat    1.0034458

Fit model TT:

m_tt <- ulam(
  alist(
    y ~ dstudent(2, mu, 1),
    mu ~ dstudent(2, 10, 1)
  ),
  data = list(y = 0), chains = 4
)
flush.console()
SAMPLING FOR MODEL 'e678c0b25fd3bc9742b33a33e8521152' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 1: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 1: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 1: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 1: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.006268 seconds (Warm-up)
Chain 1:                0.006276 seconds (Sampling)
Chain 1:                0.012544 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL 'e678c0b25fd3bc9742b33a33e8521152' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 3e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 2: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 2: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 2: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 2: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.006154 seconds (Warm-up)
Chain 2:                0.004696 seconds (Sampling)
Chain 2:                0.01085 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL 'e678c0b25fd3bc9742b33a33e8521152' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 3: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 3: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 3: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 3: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 3: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 3: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 3: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 3: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 3: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 3: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 3: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.006446 seconds (Warm-up)
Chain 3:                0.004593 seconds (Sampling)
Chain 3:                0.011039 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL 'e678c0b25fd3bc9742b33a33e8521152' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 4: Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 4: Iteration: 200 / 1000 [ 20%]  (Warmup)
Chain 4: Iteration: 300 / 1000 [ 30%]  (Warmup)
Chain 4: Iteration: 400 / 1000 [ 40%]  (Warmup)
Chain 4: Iteration: 500 / 1000 [ 50%]  (Warmup)
Chain 4: Iteration: 501 / 1000 [ 50%]  (Sampling)
Chain 4: Iteration: 600 / 1000 [ 60%]  (Sampling)
Chain 4: Iteration: 700 / 1000 [ 70%]  (Sampling)
Chain 4: Iteration: 800 / 1000 [ 80%]  (Sampling)
Chain 4: Iteration: 900 / 1000 [ 90%]  (Sampling)
Chain 4: Iteration: 1000 / 1000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.005668 seconds (Warm-up)
Chain 4:                0.005175 seconds (Sampling)
Chain 4:                0.010843 seconds (Total)
Chain 4: 
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”

Output of precis for model TT:

display(precis(m_tt), mimetypes="text/plain")
      result    
mean    5.094986
sd      4.624934
5.5%   -0.959102
94.5%  11.001243
n_eff 109.364261
Rhat    1.017138

These four models are best understood in terms of Robust regression (section 7.5.2) and Regularizing priors (section 7.3).

Try not to assume there is a single right answer in these inferences. The \(y = 0\) data point is definitely an outlier with respect to the four priors centered at \(\mu = 10\), but are the priors or the data point correct? It may be that most observations are near \(y = 10\) and this sample is truly an outlier, or that most observations are near \(y = 0\) and the human building the model made a mistake in stating a prior centered on \(\mu = 10\).

Let’s consider regularizing priors. The normal priors on \(\mu\) in the first and second models are much more regularizing than the Student-t priors. In this and any other model, a regularizing prior is one that prevents the model from getting overly excited by the data. Said another way, a regularizing prior is a strong statement of prior belief that should override the data to some extent. The term regularizing usually carries a positive connotation, but more generally it expresses a neutral preference for the prior over the data. In these first two models the inference for \(\mu\) is generally closer to the prior than to the data: the inferences of \(\mu \sim 5\) and \(\mu \sim 10\) are closer to the prior of \(\mu = 10\) than the inferences of \(\mu \sim 0\) and \(\mu \sim 5\) from the models with less regularizing Student-t priors.

Equally important to the final inferences are the likelihood functions. The Student-t likelihoods in the second and fourth models are much more robust to outliers than the normal likelihoods. We say we are doing robust regression when we pick a likelihood function, such as a Student-t distribution, that is not easily surprised by the data. Again, the term robust usually carries a positive connotation, but more generally it expresses a neutral preference against getting excited by any single data point; with only one data point, this translates into a preference for the prior. In the second and fourth models the inference for \(\mu\) is generally closer to the prior than to the data: the inferences of \(\mu \sim 10\) and \(\mu \sim 5\) are closer to the prior of \(\mu = 10\) than the inferences of \(\mu \sim 5\) and \(\mu \sim 0\) from the models with less robust normal likelihoods.
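
To make these combinations concrete, here is a minimal sketch of the robust "TT" style of model described above: a Student-t prior on \(\mu\) paired with a Student-t likelihood, fit to the single observation \(y = 0\). The name m_tt_sketch, the scale of 1, and the choice of \(\nu = 2\) are assumptions for illustration, not necessarily the definitions used for the models fit earlier.

# Sketch only: Student-t prior centered at 10 plus Student-t likelihood, fit to
# the lone observation y = 0. Thick tails in both places mean neither the prior
# nor the data point can easily overrule the other.
m_tt_sketch <- ulam(
  alist(
    y ~ dstudent(2, mu, 1),  # robust likelihood: hard to surprise with one outlier
    mu ~ dstudent(2, 10, 1)  # weakly regularizing prior centered at 10
  ),
  data = list(y = 0), chains = 4, cores = 4
)
display(precis(m_tt_sketch), mimetypes="text/plain")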

Interestingly, the flat prior on model TT has led to some divergent transitions. See section 9.5.3 for how a weakly informative prior, such as the normal prior in model TN, could address this issue. If you look at the pairs() plot and traceplot below, you’ll see the sampling process can’t decide whether the data or the prior is correct: the histogram for mu shows a large number of samples at both \(\mu = 0\) and \(\mu = 10\). The Student-t likelihood doesn’t strongly rule out the possibility that the Student-t prior is correct, and the Student-t prior doesn’t strongly rule out the possibility that the data, seen through the Student-t likelihood, are correct.

iplot(function() {
  pairs(m_tt@stanfit)
})
iplot(function() {
  traceplot(m_tt)
}, ar=2)
../_images/practice-13_77_0.png
../_images/practice-13_77_2.png

13H1. In 1980, a typical Bengali woman could have 5 or more children in her lifetime. By the year 2000, a typical Bengali woman had only 2 or 3. You’re going to look at a historical set of data, when contraception was widely available but many families chose not to use it. These data reside in data(bangladesh) and come from the 1988 Bangladesh Fertility Survey. Each row is one of 1934 women. There are six variables, but you can focus on two of them for this practice problem:

  1. district: ID number of administrative district each woman resided in

  2. use.contraception: An indicator (0/1) of whether the woman was using contraception

The first thing to do is ensure that the cluster variable, district, is a contiguous set of integers. Recall that these values will be index values inside the model. If there are gaps, you’ll have parameters for which there is no data to inform them. Worse, the model probably won’t run. Look at the unique values of the district variable:

R code 13.40

> sort(unique(d$district))

[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[51] 51 52 53 55 56 57 58 59 60 61

District 54 is absent. So district isn’t yet a good index variable, because it’s not contiguous. This is easy to fix. Just make a new variable that is contiguous. This is enough to do it:

R code 13.41

> d$district_id <- as.integer(as.factor(d$district))
> sort(unique(d$district_id))

[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[51] 51 52 53 54 55 56 57 58 59 60

Now there are 60 values, contiguous integers 1 to 60. Now, focus on predicting use.contraception, clustered by district_id. Fit both (1) a traditional fixed-effects model that uses an index variable for district and (2) a multilevel model with varying intercepts for district. Plot the predicted proportions of women in each district using contraception, for both the fixed-effects model and the varying-effects model. That is, make a plot in which district ID is on the horizontal axis and expected proportion using contraception is on the vertical. Make one plot for each model, or layer them on the same plot, as you prefer. How do the models disagree? Can you explain the pattern of disagreement? In particular, can you explain the most extreme cases of disagreement, both why they happen where they do and why the models reach different inferences?

Answer. The help for the bangladesh data.frame, to confirm we aren’t missing anything:

data(bangladesh)
display(help(bangladesh))

bc_df <- bangladesh
bc_df$district_id <- as.integer(as.factor(bc_df$district))
sort(unique(bc_df$district_id))
bangladesh {rethinking}R Documentation

Bangladesh contraceptive use data

Description

Contraceptive use data from 1934 Bangladeshi women.

Usage

data(bangladesh)

Format

  1. woman : ID number for each woman in sample

  2. district : Number for each district

  3. use.contraception : 0/1 indicator of contraceptive use

  4. living.children : Number of living children

  5. age.centered : Centered age

  6. urban : 0/1 indicator of urban context

References

Bangladesh Fertility Survey, 1989


[Package rethinking version 2.13 ]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[51] 51 52 53 54 55 56 57 58 59 60

The head of the bangladesh data.frame, with the new variable suggested by the author:

display(head(bc_df))
A data.frame: 6 × 7
  woman district use.contraception living.children age.centered urban district_id
1     1        1                 0               4      18.4400     1           1
2     2        1                 0               1      -5.5599     1           1
3     3        1                 0               3       1.4400     1           1
4     4        1                 0               4       8.4400     1           1
5     5        1                 0               1     -13.5590     1           1
6     6        1                 0               1     -11.5600     1           1

A summary of the bangladesh data.frame:

display(summary(bc_df))
     woman           district     use.contraception living.children
 Min.   :   1.0   Min.   : 1.00   Min.   :0.0000    Min.   :1.000  
 1st Qu.: 484.2   1st Qu.:14.00   1st Qu.:0.0000    1st Qu.:1.000  
 Median : 967.5   Median :29.00   Median :0.0000    Median :3.000  
 Mean   : 967.5   Mean   :29.35   Mean   :0.3925    Mean   :2.652  
 3rd Qu.:1450.8   3rd Qu.:45.00   3rd Qu.:1.0000    3rd Qu.:4.000  
 Max.   :1934.0   Max.   :61.00   Max.   :1.0000    Max.   :4.000  
  age.centered            urban         district_id   
 Min.   :-13.560000   Min.   :0.0000   Min.   : 1.00  
 1st Qu.: -7.559900   1st Qu.:0.0000   1st Qu.:14.00  
 Median : -1.559900   Median :0.0000   Median :29.00  
 Mean   :  0.002198   Mean   :0.2906   Mean   :29.25  
 3rd Qu.:  6.440000   3rd Qu.:1.0000   3rd Qu.:45.00  
 Max.   : 19.440000   Max.   :1.0000   Max.   :60.00  

Sampling from the fixed effects model:

bc_dat <- list(
  UseContraception = bc_df$use.contraception,
  DistrictId = bc_df$district_id
)

m_bc_fe <- ulam(
  alist(
    UseContraception ~ dbinom(1, p),
    logit(p) <- a[DistrictId],
    a[DistrictId] ~ dnorm(0, 1.5)
  ),
  data = bc_dat, chains = 4, cores = 4, log_lik = TRUE
)
display(precis(m_bc_fe, depth=2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m_bc_fe, depth=2), main='m_bc_fe')
}, ar=0.8)
      mean          sd        5.5%       94.5%        n_eff    Rhat4    
a[1]  -1.0547681689 0.2137850 -1.4113712 -0.717667817 4192.102 0.9986961
a[2]  -0.5855981344 0.4717542 -1.3520859  0.125955260 3223.185 0.9995662
a[3]   1.2550583484 1.1595008 -0.5349966  3.177147224 4249.073 0.9985745
a[4]   0.0052522427 0.3622941 -0.5692936  0.581238249 5704.123 0.9989137
a[5]  -0.5709354259 0.3285286 -1.0951361 -0.065205523 5477.596 0.9998667
a[6]  -0.8749469559 0.2580990 -1.3026629 -0.470909235 4472.230 0.9986760
a[7]  -0.9013248831 0.4971907 -1.7143530 -0.132134003 4350.855 0.9984769
a[8]  -0.4837459936 0.3371450 -1.0304723  0.063470078 4550.952 0.9990496
a[9]  -0.7869813762 0.4492103 -1.5267538 -0.090049801 4186.349 0.9996882
a[10] -1.9505405062 0.7253086 -3.2008269 -0.898679184 3471.160 0.9995382
a[11] -2.9533827816 0.8062259 -4.2617018 -1.796383362 4018.345 0.9989964
a[12] -0.6210314803 0.3654525 -1.2173021 -0.040618588 5205.563 0.9984615
a[13] -0.3282156478 0.4050214 -0.9691728  0.303809055 4008.268 0.9986197
a[14]  0.5115884255 0.1836011  0.2259073  0.810562013 5069.333 0.9995102
a[15] -0.5288315253 0.4306832 -1.2183537  0.156350803 3797.747 0.9985919
a[16]  0.1968089925 0.4434692 -0.4946812  0.934174012 4018.224 0.9991142
a[17] -0.8476844009 0.4400050 -1.5738591 -0.151441182 5606.310 0.9983207
a[18] -0.6513312317 0.3038974 -1.1215344 -0.188963354 3485.557 0.9983464
a[19] -0.4551743881 0.3897660 -1.0916403  0.180574052 5041.504 0.9991072
a[20] -0.3957950452 0.5056605 -1.2007204  0.386312996 3911.199 0.9988558
a[21] -0.4251602284 0.4539119 -1.1326079  0.285516031 4110.442 0.9996135
a[22] -1.2834923909 0.5199184 -2.1771181 -0.499955352 3721.947 1.0008591
a[23] -0.9352796145 0.5577795 -1.8839850 -0.081217902 4427.597 0.9986600
a[24] -2.0332675013 0.7332857 -3.2749123 -0.962836528 3762.332 0.9986260
a[25] -0.2063003707 0.2446207 -0.6033283  0.179821704 4516.826 0.9993179
a[26] -0.4317313211 0.5679176 -1.3411195  0.456665335 4186.062 0.9983912
a[27] -1.4433471879 0.3702887 -2.0342143 -0.866991965 3063.022 0.9993172
a[28] -1.0979455059 0.3270854 -1.6146927 -0.580667649 4831.533 0.9986556
a[29] -0.8966693304 0.4032127 -1.5428497 -0.271057874 4052.135 0.9989059
a[30] -0.0317843359 0.2520086 -0.4375573  0.369123314 5228.458 0.9991807
a[31] -0.1791887496 0.3575725 -0.7542883  0.408519554 5398.637 0.9987135
a[32] -1.2662656010 0.4627407 -2.0387609 -0.549057740 3897.366 0.9986331
a[33] -0.2747499521 0.4961998 -1.0755237  0.533220531 4436.955 0.9990616
a[34]  0.6303869056 0.3324131  0.1125598  1.163565844 3894.392 0.9988002
a[35]  0.0007613289 0.2880611 -0.4555591  0.444080649 4267.746 0.9991929
a[36] -0.5757697892 0.4987013 -1.3928939  0.196232261 4520.144 0.9987159
a[37]  0.1290149596 0.5530372 -0.7615294  1.011827794 5576.749 0.9985675
a[38] -0.8437549988 0.5501183 -1.7407113  0.005903183 5470.352 0.9983552
a[39]  0.0012694957 0.3912894 -0.6153342  0.622156787 4807.861 0.9985472
a[40] -0.1485404405 0.3220110 -0.6692041  0.348270907 4822.785 0.9993728
a[41] -0.0015475140 0.4001965 -0.6261782  0.645662040 5721.190 0.9986209
a[42]  0.1531121107 0.5529579 -0.7528373  1.062774192 3990.717 0.9991634
a[43]  0.1264317841 0.2996213 -0.3451562  0.584802894 4835.870 0.9991298
a[44] -1.2075757695 0.4509205 -1.9588258 -0.506909095 3295.934 0.9988037
a[45] -0.6765711108 0.3299037 -1.2096784 -0.146148167 4165.902 0.9987580
a[46]  0.0906947206 0.2169200 -0.2593299  0.436542049 3520.850 0.9984747
a[47] -0.1270971740 0.4941105 -0.9128387  0.632968795 4878.768 0.9993441
a[48]  0.0896217491 0.3021986 -0.3835983  0.568194371 4464.013 0.9990106
a[49] -1.7263563597 1.0271090 -3.5288819 -0.199839214 3669.684 0.9986273
a[50] -0.0980337730 0.4421999 -0.7889957  0.599278086 5419.794 0.9984798
a[51] -0.1570070979 0.3405839 -0.6962646  0.384063541 5156.160 0.9987228
a[52] -0.2281241353 0.2685612 -0.6726185  0.195834419 4232.174 0.9985204
a[53] -0.3036272976 0.4380177 -1.0075952  0.377601239 4089.384 0.9983238
a[54] -1.2217653559 0.8308343 -2.5974221  0.042572369 4092.345 0.9987841
a[55]  0.3093235995 0.2914719 -0.1561210  0.777189305 4355.213 0.9986629
a[56] -1.3914075160 0.4713755 -2.1549389 -0.665411585 2677.657 0.9993986
a[57] -0.1709203641 0.3370759 -0.7038543  0.357756114 3919.973 0.9989232
a[58] -1.7302958689 0.7613415 -3.0423049 -0.602380814 3831.756 0.9987976
a[59] -1.2147122087 0.4092811 -1.8780669 -0.574033077 3842.336 0.9992560
a[60] -1.2591466051 0.3508944 -1.8513903 -0.719459238 4456.370 0.9985475
../_images/practice-13_85_1.png

Sampling from the varying effects model:

m_bc_ve <- ulam(
  alist(
    UseContraception ~ dbinom(1, p),
    logit(p) <- a[DistrictId],
    a[DistrictId] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = bc_dat, chains = 4, cores = 4, log_lik = TRUE
)
display(precis(m_bc_ve, depth=2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m_bc_ve, depth=2), main='m_bc_ve')
}, ar=0.8)

post_fe <- extract.samples(m_bc_fe)
post_ve <- extract.samples(m_bc_ve)
p_fe_c <- logistic(apply(post_fe$a, 2, mean))
p_ve_c <- logistic(apply(post_ve$a, 2, mean))
      mean         sd         5.5%       94.5%        n_eff     Rhat4    
a[1]  -0.9911365   0.1874488  -1.2841565 -0.697323993 3724.069  0.9986166
a[2]  -0.5973283   0.3613326  -1.1887150 -0.038431148 3555.855  0.9988966
a[3]  -0.2534990   0.4932482  -1.0147419  0.540260179 2360.417  0.9989659
a[4]  -0.1897251   0.2869915  -0.6464395  0.272078861 3940.974  0.9993814
a[5]  -0.5631291   0.3025600  -1.0646529 -0.074426451 3624.225  0.9989869
a[6]  -0.8221548   0.2478595  -1.2324877 -0.438239250 2530.636  1.0003354
a[7]  -0.7547900   0.3680264  -1.3719530 -0.199882225 3268.395  1.0001832
a[8]  -0.5253510   0.2909768  -0.9968847 -0.070098722 4706.451  0.9993295
a[9]  -0.7047649   0.3490570  -1.2743097 -0.146081615 4097.080  0.9995121
a[10] -1.1371790   0.4275107  -1.8456066 -0.506439030 2463.665  0.9989101
a[11] -1.5375451   0.4368121  -2.2604115 -0.868793630 1829.279  1.0001064
a[12] -0.6069719   0.3137430  -1.1179321 -0.106477511 3773.145  0.9994154
a[13] -0.4114001   0.3216827  -0.9278973  0.095592442 4315.792  0.9990882
a[14]  0.3932803   0.1849226   0.1036812  0.698775124 2982.668  0.9999743
a[15] -0.5589397   0.3234493  -1.0862898 -0.066113505 4431.534  0.9993583
a[16] -0.1265695   0.3419858  -0.6723780  0.419188139 3336.782  0.9996718
a[17] -0.7405418   0.3396837  -1.2996079 -0.211062630 3859.886  0.9985521
a[18] -0.6338055   0.2591002  -1.0620451 -0.230075867 4195.865  0.9989785
a[19] -0.5070372   0.3299352  -1.0313660  0.009552186 3797.020  0.9983655
a[20] -0.4832641   0.3854034  -1.1299058  0.125409155 3169.711  1.0007198
a[21] -0.4879549   0.3385939  -1.0269999  0.061414762 2759.459  1.0012865
a[22] -0.9595981   0.3486549  -1.5324380 -0.416809417 3064.734  0.9986252
a[23] -0.7475987   0.3864535  -1.3670480 -0.142692472 3162.536  0.9997868
a[24] -1.1755111   0.4453223  -1.9342545 -0.504894376 2545.720  0.9984940
a[25] -0.2742567   0.2220537  -0.6224019  0.087584267 4007.146  0.9987441
a[26] -0.5118632   0.3762343  -1.1263524  0.078668075 2889.556  1.0004930
a[27] -1.1773818   0.3048968  -1.6911009 -0.697737886 2675.709  0.9986454
a[28] -0.9649865   0.2762108  -1.4220485 -0.532278438 3327.589  0.9992282
a[29] -0.7964959   0.3121289  -1.2936163 -0.303774115 3692.578  0.9984907
a[30] -0.1397838   0.2320535  -0.5186390  0.231159676 3268.204  0.9996337
⋮     ⋮            ⋮          ⋮          ⋮            ⋮         ⋮        
a[33] -0.431074927 0.36078891 -1.0070316  0.13248620  3106.4879 0.9986770
a[34]  0.268804496 0.29313877 -0.1920465  0.76232467  2843.7107 0.9992301
a[35] -0.137373082 0.25346691 -0.5492551  0.27855563  3240.1599 0.9990327
a[36] -0.581503895 0.36028795 -1.1741271 -0.01349598  2987.4340 0.9994724
a[37] -0.222198678 0.36475878 -0.7974002  0.36254161  3493.3820 0.9995120
a[38] -0.712484784 0.38146158 -1.3219789 -0.11020644  3169.2858 0.9984891
a[39] -0.200399139 0.31415541 -0.6947224  0.29080620  3896.6873 0.9991367
a[40] -0.260503446 0.27873644 -0.7014452  0.18362604  4243.6551 0.9998715
a[41] -0.204316194 0.31525666 -0.7159777  0.29098687  3479.3094 0.9991047
a[42] -0.234861181 0.39769542 -0.9014570  0.38553967  2916.5487 0.9990804
a[43] -0.041829872 0.26291963 -0.4550113  0.37371403  3929.3624 0.9986894
a[44] -0.950316894 0.34405861 -1.5367159 -0.43130522  3174.0100 0.9984630
a[45] -0.657105765 0.28732486 -1.1253653 -0.20395486  3256.7921 0.9988105
a[46] -0.004683593 0.20140414 -0.3249970  0.31451062  2844.0269 0.9988385
a[47] -0.339132416 0.36201909 -0.9256789  0.23520108  4097.4534 0.9983989
a[48] -0.084488483 0.26776938 -0.5154784  0.34970546  4163.9774 0.9990185
a[49] -0.863038193 0.46947866 -1.6134919 -0.14994408  2502.9158 1.0006120
a[50] -0.297019559 0.33072976 -0.8254224  0.23944810  2784.5532 0.9994626
a[51] -0.270317309 0.26708038 -0.7004971  0.16030359  4151.5310 0.9992892
a[52] -0.293437787 0.22383035 -0.6455123  0.05050294  4242.8562 0.9988072
a[53] -0.426655424 0.35448550 -1.0019535  0.13610317  4584.2222 0.9990480
a[54] -0.784412276 0.46030201 -1.5210047 -0.07484807  4436.0743 0.9988564
a[55]  0.090184723 0.25630336 -0.3156922  0.49730657  3929.3675 1.0006963
a[56] -1.062681530 0.34762822 -1.6241983 -0.51756486  3111.5607 0.9993370
a[57] -0.300323347 0.29848730 -0.7686184  0.17260736  4306.2543 0.9988506
a[58] -1.010199303 0.44537117 -1.7124701 -0.32710344  3039.5226 0.9987960
a[59] -0.989726024 0.32189011 -1.5071225 -0.47070939  2563.2959 0.9989187
a[60] -1.043555423 0.29207770 -1.5446089 -0.57736213  3585.0769 0.9989461
a_bar -0.538589558 0.08555061 -0.6763500 -0.40304142  1913.3734 1.0012319
sigma  0.513375602 0.08172605  0.3946703  0.65255158   917.0302 1.0023959
../_images/practice-13_87_1.png

The plot suggested by the author in the question:

iplot(function() {
  plot(p_fe_c,
    ylim = c(0, 1), pch = 16, xaxt = "n",
    xlab = "district ID", ylab = "proportion using contraception", col = rangi2,
    main="Predicted proportion using contraception"
  )
  axis(1, at = seq(1, 60, by=2), las=2)

  # overlay posterior means
  points(p_ve_c)

  # mark posterior mean probability across districts
  abline(h = mean(inv_logit(post_ve$a_bar)), lty = 2)
})
../_images/practice-13_89_0.png

The number of observations (women) in every district:

iplot(function() {
  barplot(setNames(table(bc_df$district_id), sort(unique(bc_df$district_id))))
})
../_images/practice-13_91_0.png

There are two factors affecting shrinkage, as discussed under Figure 13.1 in the text. The first is the number of observations in the district; notice that district 3 (with only a few observations) shrinks much more towards the weighted cluster mean (the dashed line) than any other district. The second is the distance from the dashed line. The estimate for district 35, which still has a reasonable number of observations, shrinks significantly towards the dashed line because it starts far from the mean. District 14 starts from a similar distance to the dashed line but shrinks less because it has so many observations.
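
As a quick numerical check of this argument, here is a sketch that lines up, for each district, the number of observations and the two estimates. It assumes the objects bc_df, p_fe_c, and p_ve_c computed above are still in memory.

# Compare per-district sample size with the amount of shrinkage (sketch).
shrink_df <- data.frame(
  district_id = 1:60,
  n_obs = as.integer(table(bc_df$district_id)),  # women observed per district
  fixed = p_fe_c,    # fixed-effects posterior mean proportion
  varying = p_ve_c   # varying-effects (partially pooled) posterior mean
)
shrink_df$shrinkage <- abs(shrink_df$fixed - shrink_df$varying)
# Districts discussed above: 3 (few observations), 14 and 35 (far from the mean)
display(shrink_df[c(3, 14, 35), ], mimetypes="text/plain")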

13H2. Return to data(Trolley) from Chapter 12. Define and fit a varying intercepts model for these data. Cluster intercepts on individual participants, as indicated by the unique values in the id variable. Include action, intention, and contact as ordinary terms. Compare the varying intercepts model and a model that ignores individuals, using both WAIC and posterior predictions. What is the impact of individual variation in these data?

Answer. The head of the Trolley data.frame:

data(Trolley)
t_df <- Trolley
display(head(t_df))
A data.frame: 6 × 12
   case response order     id age male           edu action intention contact story action2
1 cfaqu        4     2 96;434  14    0 Middle School      0         0       1   aqu       1
2 cfbur        3    31 96;434  14    0 Middle School      0         0       1   bur       1
3 cfrub        4    16 96;434  14    0 Middle School      0         0       1   rub       1
4 cibox        3    32 96;434  14    0 Middle School      0         1       1   box       1
5 cibur        3     4 96;434  14    0 Middle School      0         1       1   bur       1
6 cispe        3     9 96;434  14    0 Middle School      0         1       1   spe       1

The help on the Trolley data.frame:

display(help(Trolley))
Trolley {rethinking}R Documentation

Experimental data on ethical dilemmas

Description

These data comprise outcomes of experimental ethical dilemmas that are often called 'trolley' problems. Data kindly provided by Fiery Cushman.

Usage

data(Trolley)

Format

  1. case: a code that combines treatment and story labels

  2. response: participant's rating of appropriateness of action in story

  3. order: integer order story appeared, within participant

  4. id: participant id (factor)

  5. age: participant's age in years

  6. male: participant's gender; 1 for male, 0 for female

  7. edu: participant's highest educational level

  8. action: treatment code for story with action (1) or inaction (0)

  9. intention: treatment code for intent (1) or lack of intent (0)

  10. contact: treatment code for contact action (1) or lack of contact (0)

  11. story: factor label for basic scenario modified by treatments

  12. action2: alternative coding of action that is union of action and contact variables

References

Cushman et al. 2006. Psychological Science 17:1082–1089.


[Package rethinking version 2.13 ]

A summary of the Trolley data.frame:

display(summary(Trolley))
      case         response         order            id            age       
 cfaqu  : 331   Min.   :1.000   Min.   : 1.0   96;434 :  30   Min.   :10.00  
 cfbur  : 331   1st Qu.:3.000   1st Qu.: 9.0   96;445 :  30   1st Qu.:26.00  
 cfrub  : 331   Median :4.000   Median :16.5   96;451 :  30   Median :36.00  
 cibox  : 331   Mean   :4.199   Mean   :16.5   96;456 :  30   Mean   :37.49  
 cibur  : 331   3rd Qu.:6.000   3rd Qu.:24.0   96;458 :  30   3rd Qu.:48.00  
 cispe  : 331   Max.   :7.000   Max.   :32.0   96;466 :  30   Max.   :72.00  
 (Other):7944                                  (Other):9750                  
      male                         edu           action         intention     
 Min.   :0.000   Bachelor's Degree   :3540   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.000   Some College        :2460   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :1.000   Master's Degree     :1410   Median :0.0000   Median :0.0000  
 Mean   :0.574   Graduate Degree     :1050   Mean   :0.4333   Mean   :0.4667  
 3rd Qu.:1.000   High School Graduate: 870   3rd Qu.:1.0000   3rd Qu.:1.0000  
 Max.   :1.000   Some High School    : 420   Max.   :1.0000   Max.   :1.0000  
                 (Other)             : 180                                    
    contact        story         action2      
 Min.   :0.0   box    :1324   Min.   :0.0000  
 1st Qu.:0.0   bur    :1324   1st Qu.:0.0000  
 Median :0.0   spe    : 993   Median :1.0000  
 Mean   :0.2   swi    : 993   Mean   :0.6333  
 3rd Qu.:0.0   aqu    : 662   3rd Qu.:1.0000  
 Max.   :1.0   boa    : 662   Max.   :1.0000  
               (Other):3972                   

Let’s reproduce results for model m12.5 in the text:

t_dat <- list(
  R = t_df$response,
  A = t_df$action,
  I = t_df$intention,
  C = t_df$contact
)
m12.5 <- ulam(
  alist(
    R ~ dordlogit(phi, cutpoints),
    phi <- bA * A + bC * C + BI * I,
    BI <- bI + bIA * A + bIC * C,
    c(bA, bI, bC, bIA, bIC) ~ dnorm(0, 0.5),
    cutpoints ~ dnorm(0, 1.5)
  ),
  data = t_dat, chains = 4, cores = 4, log_lik = TRUE
)
display(precis(m12.5, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m12.5, depth=2), xlim=c(-6,8),main='m12.5')
}, ar=1.8)

plot_cutpoints <- function(model) {
  iplot(function() {
    par(mfrow=c(1,3))
    plot_intention <- function(kA, kC, model) {
      plot(NULL,
        type = "n", xlab = "intention", ylab = "probability",
        xlim = c(0, 1), ylim = c(0, 1), xaxp = c(0, 1, 1), yaxp = c(0, 1, 2),
        main=paste0("action=", kA, ", contact=", kC)
      )

      kI <- 0:1 # values of intention to calculate over
      pdat <- data.frame(A = kA, C = kC, I = kI, id=1)
      phi <- link(model, data = pdat)$phi

      post <- extract.samples(model)
      for (s in 1:50) {
        pk <- pordlogit(1:6, phi[s, ], post$cutpoints[s, ])
        for (i in 1:6) lines(kI, pk[, i], col = grau(0.1))
      }
    }
    plot_intention(0, 0, model)
    plot_intention(1, 0, model)
    plot_intention(0, 1, model)
  }, ar=2.2)
}

plot_hist <- function(model) {
  iplot(function() {
    par(mfrow=c(1,3))
    plot_intent <- function(kA, kC, model) {
        kI <- 0:1 # values of intention to calculate over
        pdat <- data.frame(A = kA, C = kC, I = kI, id=1)
        s <- sim(model, data = pdat)
        simplehist(s, xlab = "response", main=paste0("action=", kA, ", contact=", kC))
    }
    plot_intent(0, 0, model)
    plot_intent(1, 0, model)
    plot_intent(0, 1, model)
  }, ar=2.2)
}

plot_cutpoints(m12.5)
plot_hist(m12.5)
             mean       sd         5.5%       94.5%      n_eff     Rhat4   
bIC          -1.2375932 0.09543187 -1.3890913 -1.0886818 1175.5217 1.001908
bIA          -0.4333412 0.07630972 -0.5513900 -0.3125365 1049.6312 1.005211
bC           -0.3425199 0.06711358 -0.4504975 -0.2386072 1140.6447 1.001802
bI           -0.2912295 0.05512958 -0.3813351 -0.2075464  805.5991 1.004812
bA           -0.4721490 0.05153728 -0.5522278 -0.3904029 1066.9576 1.003966
cutpoints[1] -2.6330227 0.05090999 -2.7158505 -2.5535491  935.4765 1.004076
cutpoints[2] -1.9380016 0.04728174 -2.0140466 -1.8620192  930.2425 1.003504
cutpoints[3] -1.3437476 0.04524626 -1.4155249 -1.2698631  877.9480 1.002820
cutpoints[4] -0.3083380 0.04284013 -0.3767162 -0.2396050  992.4171 1.001940
cutpoints[5]  0.3617297 0.04292332  0.2942512  0.4311708 1034.7950 1.001985
cutpoints[6]  1.2672309 0.04527226  1.1964799  1.3390448 1343.7071 1.001424
../_images/practice-13_99_1.png ../_images/practice-13_99_2.png ../_images/practice-13_99_3.png

We’ll use the following model with id clusters:

\[\begin{split} \begin{align} R_i & \sim OrderedLogit(\phi_i,\kappa) \\ \phi_i & = \alpha_{id[i]} + \ldots \\ \alpha_{id} & \sim Normal(0, \sigma), id = 1..331 \\ \sigma & \sim Exponential(1) \\ \end{align} \end{split}\]

Notice we do not provide a cluster mean \(\bar{\alpha}\); this variable would be non-identifiable with respect to the intercepts in the ordered logit. If you try to add this parameter, you’ll see the divergent transitions and poor sampling characteristic of this problem.
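
For illustration only, here is a sketch of what that problematic variant would look like; it reuses the t_id_dat list constructed just below and is not fit in this document. If you sample from it, expect divergent transitions and poor mixing, because any constant can be traded between a_bar and the cutpoints.

# Sketch of the non-identifiable variant: a_bar is redundant with the cutpoints.
m_trolley_id_bad <- ulam(
  alist(
    R ~ dordlogit(phi, cutpoints),
    phi <- a[id] + bA * A + bC * C + BI * I,
    a[id] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),  # redundant location parameter
    sigma ~ dexp(1),
    BI <- bI + bIA * A + bIC * C,
    c(bA, bI, bC, bIA, bIC) ~ dnorm(0, 0.5),
    cutpoints ~ dnorm(0, 1.5)
  ),
  data = t_id_dat, chains = 4, cores = 4
)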

Sampling from the model with id clusters:

t_id_dat <- list(
  R = t_df$response,
  A = t_df$action,
  I = t_df$intention,
  id = t_df$id,
  C = t_df$contact
)
m_trolley_id <- ulam(
  alist(
    R ~ dordlogit(phi, cutpoints),
    phi <- a[id] + bA * A + bC * C + BI * I,
    a[id] ~ dnorm(0, sigma),
    sigma ~ dexp(1),
    BI <- bI + bIA * A + bIC * C,
    c(bA, bI, bC, bIA, bIC) ~ dnorm(0, 0.5),
    cutpoints ~ dnorm(0, 1.5)
  ),
  data = t_id_dat, chains = 4, cores = 4, log_lik = TRUE, iter=800
)
display(precis(m_trolley_id, depth = 2), mimetypes="text/plain")
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”
             mean        sd         5.5%        94.5%       n_eff     Rhat4    
a[1]         -0.56353317 0.2961871  -1.02180217 -0.09805443  609.4124 1.0005135
a[2]         -1.75769877 0.4366643  -2.46081067 -1.05799696 1374.2609 0.9985736
a[3]         -2.96541068 0.4173908  -3.64774547 -2.31760922 1070.1699 0.9996611
a[4]         -0.37053427 0.4761909  -1.13348728  0.38172148 1209.9746 1.0002442
a[5]         -3.46992394 0.4332231  -4.18261279 -2.80619157 1249.0739 1.0031403
a[6]          0.63332401 0.3174243   0.11606255  1.12686623  750.1932 1.0022227
a[7]         -0.46657182 0.3209316  -0.98671807  0.04852067  721.1306 1.0034259
a[8]          3.85997536 0.4975133   3.09604198  4.67143047 1294.1944 1.0013937
a[9]         -0.87006827 0.3199011  -1.37586817 -0.35114960  589.8566 1.0041920
a[10]         0.77257066 0.3344652   0.21815818  1.28903980  698.9720 1.0080996
a[11]         0.58163790 0.3429877   0.03486364  1.13120264  712.7327 1.0014701
a[12]         2.39164951 0.3897521   1.77621261  3.02176079  973.2805 0.9997558
a[13]        -0.43860210 0.3162860  -0.94455169  0.06649594  633.6931 1.0014437
a[14]        -1.00885293 0.3423811  -1.55262975 -0.46208981  792.1098 1.0029308
a[15]         1.62317172 0.3196033   1.13065485  2.13118339  655.2993 1.0018768
a[16]        -1.85514675 0.4180702  -2.50067525 -1.17784524 1363.9095 1.0039794
a[17]         1.59304146 0.4451567   0.87763670  2.27107819 1204.9260 0.9998765
a[18]        -1.21473989 0.3588845  -1.78193099 -0.65530368  875.4417 1.0012111
a[19]        -1.32521070 0.3157718  -1.86585500 -0.83017463  866.0450 1.0010094
a[20]        -0.52783541 0.2999337  -0.98511706 -0.03742998  528.3042 1.0042488
a[21]        -2.14824735 0.3402819  -2.70621545 -1.59529493  739.2898 1.0023227
a[22]        -0.02678743 0.3661637  -0.60340204  0.56346466  779.7531 1.0015685
a[23]        -5.23144825 0.6241298  -6.32654285 -4.31990665 1223.4801 1.0023636
a[24]        -1.67222594 0.3099048  -2.17495014 -1.17642989  704.4844 1.0031817
a[25]        -0.68463486 0.3240165  -1.20124259 -0.17489539  835.4960 1.0002229
a[26]         0.92698878 0.3511169   0.35895447  1.49579364  949.6574 1.0017567
a[27]         1.09466960 0.4395155   0.39502070  1.78836931 1348.9841 1.0014579
a[28]        -0.82685263 0.3581814  -1.40157906 -0.24647796  960.6689 1.0035368
a[29]         1.17488007 0.3290987   0.65462994  1.71830911  668.6289 1.0037767
a[30]         1.65701666 0.3492145   1.10556087  2.24158374  894.4211 1.0036917
⋮            ⋮           ⋮          ⋮           ⋮           ⋮         ⋮        
a[314]       -1.3542568  0.32945432 -1.86742764 -0.82142138  971.6676 1.0020511
a[315]       -0.8540011  0.34241482 -1.39386953 -0.31436297  624.9605 1.0050265
a[316]        1.4812958  0.32448888  0.93933320  1.99329093  638.4552 1.0002180
a[317]       -0.8066667  0.33949087 -1.37586028 -0.29679957  671.0560 1.0043909
a[318]       -0.8339491  0.32464892 -1.35563812 -0.29373991  689.0683 1.0020395
a[319]       -0.3601915  0.29239934 -0.82979913  0.12130636  714.9171 1.0039653
a[320]       -1.7844106  0.36109067 -2.34479272 -1.20211265  946.4787 1.0025796
a[321]       -1.7523853  0.35275445 -2.30124278 -1.19702351  724.7850 1.0038180
a[322]        4.0188856  0.44091784  3.33944830  4.75967169  970.0613 1.0018672
a[323]        0.2192927  0.33185887 -0.30365605  0.74333970  753.3259 1.0048503
a[324]        6.1394695  0.80943044  4.96839059  7.50448881 1970.4292 0.9983066
a[325]       -1.2142048  0.58523731 -2.13481130 -0.28469151 1863.5302 0.9986617
a[326]        0.6195830  0.33119313  0.09059312  1.13023415  684.4746 0.9996006
a[327]       -0.3058018  0.32691683 -0.81558409  0.21422232  804.3834 1.0021583
a[328]        1.7048587  0.33147722  1.17414277  2.22621687  741.0444 1.0012876
a[329]       -1.2521281  0.32081154 -1.74845916 -0.74499073  585.1115 1.0066785
a[330]       -3.4222229  0.42437841 -4.13034469 -2.74412580  844.9541 1.0041652
a[331]       -0.5978400  0.38576024 -1.22200816  0.03604231  917.3723 1.0020468
sigma         1.9143741  0.07695070  1.79496507  2.04170111 1877.4210 0.9996872
bIC          -1.6686659  0.10200631 -1.83669982 -1.51108365 1004.1502 1.0000917
bIA          -0.5591247  0.08465414 -0.69482550 -0.42699295  990.1649 1.0005712
bC           -0.4516718  0.07158114 -0.56303007 -0.33464898  905.3852 1.0001007
bI           -0.3851002  0.06230474 -0.48105230 -0.28248089  788.8813 0.9994695
bA           -0.6486070  0.05739569 -0.73743063 -0.55605901 1017.0408 0.9987082
cutpoints[1] -3.7188728  0.11839125 -3.91090361 -3.52899919  114.0915 1.0308318
cutpoints[2] -2.7694481  0.11584907 -2.95671145 -2.58437139  109.0320 1.0339041
cutpoints[3] -1.9653730  0.11524119 -2.15138635 -1.78470696  107.3575 1.0335104
cutpoints[4] -0.4676383  0.11422623 -0.65270587 -0.28169906  105.5194 1.0337652
cutpoints[5]  0.5818210  0.11573294  0.39648815  0.76722987  107.4401 1.0316479
cutpoints[6]  1.9903550  0.11933490  1.79906822  2.18108821  113.1717 1.0306061

Although it may seem excessive, here are all the id parameter estimates on the same x-axis scale as the precis plot above for model m12.5:

iplot(function() {
  plot(precis(m_trolley_id, depth=2), xlim=c(-6,8), main='m_trolley_id')
}, ar=0.2)
../_images/practice-13_103_0.png

Notice the individual intercepts are much less certain than sigma, bIC, and the other higher-level parameters. This uncertainty leads to the more variable posterior predictive plot in the first triptych below. These two triptych plots are for a typical individual, specifically the individual with id equal to 1, who did not have a strong positive or negative tendency.

plot_cutpoints(m_trolley_id)
plot_hist(m_trolley_id)

display_markdown("
Comparing the models using WAIC:
")
iplot(function() {
  plot(compare(m12.5, m_trolley_id))
}, ar=4.5)

display_markdown("
Comparing the models using PSIS, to check for overfitting or outliers:
")
iplot(function() {
  plot(compare(m12.5, m_trolley_id, func=PSIS))
}, ar=4.5)
../_images/practice-13_105_0.png ../_images/practice-13_105_1.png

Comparing the models using WAIC:

../_images/practice-13_105_3.png

Comparing the models using PSIS, to check for overfitting or outliers:

Some Pareto k values are high (>0.5). Set pointwise=TRUE to inspect individual points.
../_images/practice-13_105_6.png

The new model has a much lower WAIC/PSIS score because the individual intercepts absorb much of the variation that the previous model left unexplained (notice the large inferred mean for sigma).
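
The comparison plots above omit the underlying numbers; here is a sketch for printing the WAIC and PSIS tables directly, assuming both fitted models are still in memory.

display(compare(m12.5, m_trolley_id), mimetypes="text/plain")
display(compare(m12.5, m_trolley_id, func=PSIS), mimetypes="text/plain")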

13H3. The Trolley data are also clustered by story, which indicates a unique narrative for each vignette. Define and fit a cross-classified varying intercepts model with both id and story. Use the same ordinary terms as in the previous problem. Compare this model to the previous models. What do you infer about the impact of different stories on responses?

Answer. Sampling from the suggested model:

data(Trolley)
t_df <- Trolley

t_story_id_dat <- list(
  R = t_df$response,
  A = t_df$action,
  I = t_df$intention,
  id = t_df$id,
  story = as.integer(t_df$story),
  C = t_df$contact
)
m_trolley_story_id <- ulam(
  alist(
    R ~ dordlogit(phi, cutpoints),
    phi <- a[id] + b[story] + bA * A + bC * C + BI * I,
    a[id] ~ dnorm(0, sigma_a),
    sigma_a ~ dexp(1),
    b[story] ~ dnorm(0, sigma_b),
    sigma_b ~ dexp(1),
    BI <- bI + bIA * A + bIC * C,
    c(bA, bI, bC, bIA, bIC) ~ dnorm(0, 0.5),
    cutpoints ~ dnorm(0, 1.5)
  ),
  data = t_story_id_dat, chains = 4, cores = 4, log_lik = TRUE, iter=800
)
display(precis(m_trolley_story_id, depth = 2), mimetypes="text/plain")
iplot(function() {
  plot(precis(m_trolley_story_id, depth=2), xlim=c(-6,8), main='m_trolley_story_id')
}, ar=0.18)
Warning message:
“The largest R-hat is 1.06, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat”
Warning message:
“Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess”
Warning message:
“Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess”
             mean        sd         5.5%        94.5%       n_eff     
a[1]         -0.56316487 0.3062317  -1.04866012 -0.08303912  700.1520 
a[2]         -1.70264993 0.4463165  -2.38798429 -0.95805328 1269.7956 
a[3]         -3.03816922 0.4212535  -3.70576458 -2.36267427 1221.3509 
a[4]         -0.39840814 0.4607174  -1.11824362  0.33492908 1987.7908 
a[5]         -3.53627392 0.4212684  -4.25853409 -2.90347294 1112.0019 
a[6]          0.67017225 0.3182554   0.17001121  1.18672970  670.9657 
a[7]         -0.40558820 0.3197392  -0.91008963  0.11270959  704.1311 
a[8]          3.95010049 0.4669430   3.24021992  4.73336448 1299.1907 
a[9]         -0.87778821 0.3126278  -1.36629041 -0.37457437  636.1947 
a[10]         0.79689158 0.3387809   0.26862455  1.35358976  905.3176 
a[11]         0.56853755 0.3567595   0.02087721  1.12585481  910.6981 
a[12]         2.43042937 0.3657701   1.84698263  3.00068292 1295.9689 
a[13]        -0.44351859 0.3020161  -0.92608546  0.05152353  646.9133 
a[14]        -0.99103075 0.3281953  -1.51986956 -0.47570117  672.2905 
a[15]         1.72993150 0.3317421   1.19972529  2.26167304  646.7812 
a[16]        -1.89165803 0.4215558  -2.55985563 -1.23807219 1494.2323 
a[17]         1.67720096 0.4735600   0.89432934  2.42044364 1336.5860 
a[18]        -1.17242059 0.3467013  -1.73047967 -0.61884137 1096.5816 
a[19]        -1.36327308 0.3156245  -1.85102495 -0.86337880  668.7573 
a[20]        -0.53714499 0.2937734  -1.01418880 -0.06291586  795.5939 
a[21]        -2.18408098 0.3561207  -2.74291209 -1.58525155  573.6218 
a[22]         0.01515743 0.3743190  -0.60486391  0.60575978  827.9334 
a[23]        -5.38259002 0.5928650  -6.35802119 -4.47518875 2352.7562 
a[24]        -1.74776990 0.3155132  -2.24725506 -1.22288890  789.6467 
a[25]        -0.67922782 0.3345888  -1.21942041 -0.15029700  677.7735 
a[26]         0.91663072 0.3129825   0.42169584  1.40920127  719.7911 
a[27]         1.08895725 0.4545951   0.35963490  1.80504876 1621.3281 
a[28]        -0.82124038 0.3653313  -1.39906499 -0.21555552  826.5497 
a[29]         1.23325809 0.3270985   0.70574788  1.74797048 1281.7369 
a[30]         1.73635245 0.3296675   1.18662216  2.26371802  730.6892 
⋮            ⋮           ⋮          ⋮           ⋮           ⋮         
a[327]       -0.28662965 0.32968612 -0.8397690   0.22969090  889.14997
a[328]        1.73420600 0.32667940  1.2170105   2.26672644 1052.55393
a[329]       -1.29204269 0.31889777 -1.8001379  -0.77876378  782.93440
a[330]       -3.53235839 0.46302658 -4.2753005  -2.82080683 1239.20399
a[331]       -0.59046808 0.37514898 -1.1714589   0.01492163  850.98552
sigma_a       1.96610510 0.08467663  1.8350399   2.10567674 3118.02782
b[1]          0.92967879 0.19845686  0.6396093   1.25019007  106.70089
b[2]         -0.07238916 0.19505092 -0.3564759   0.24083207   86.66171
b[3]          0.08209595 0.18364279 -0.1724035   0.37996989   92.58106
b[4]          0.62321817 0.19218158  0.3449638   0.94013540   94.68768
b[5]          0.03032877 0.18893382 -0.2369821   0.32915908   98.89361
b[6]          0.12069973 0.18969155 -0.1478143   0.43278542  112.35333
b[7]         -0.56876205 0.18869525 -0.8392746  -0.25774645  114.70110
b[8]          0.03602667 0.19545684 -0.2488369   0.35041177  115.67326
b[9]         -0.37852292 0.19095179 -0.6585248  -0.06312420   91.39244
b[10]         0.87280587 0.19511909  0.5830461   1.19022737   90.69660
b[11]        -0.27013388 0.18930543 -0.5473264   0.02937101   99.04665
b[12]        -0.24966498 0.18685623 -0.5181914   0.05921666   94.23037
sigma_b       0.54517184 0.14154137  0.3711658   0.80338096  299.61242
bIC          -1.28787914 0.11266707 -1.4702146  -1.11203088 1201.87531
bIA          -0.52890152 0.08192155 -0.6623173  -0.39963644 1372.01715
bC           -1.07536950 0.09578888 -1.2261345  -0.92467002 1154.81943
bI           -0.45759246 0.06766379 -0.5673367  -0.34981692 1113.66532
bA           -0.89126583 0.06763231 -1.0002158  -0.78328436 1112.15073
cutpoints[1] -3.91752790 0.21080380 -4.2384182  -3.56319039   83.33309
cutpoints[2] -2.94166839 0.20888383 -3.2618960  -2.59017921   81.56924
cutpoints[3] -2.11014952 0.20704331 -2.4246454  -1.76567932   82.26298
cutpoints[4] -0.56232537 0.20579724 -0.8680882  -0.21451455   81.88016
cutpoints[5]  0.51961396 0.20579640  0.2155380   0.85862691   83.20928
cutpoints[6]  1.96266241 0.20728119  1.6549095   2.31161821   84.86356
             Rhat4    
a[1]         1.0043664
a[2]         1.0026881
a[3]         1.0023259
a[4]         0.9991884
a[5]         1.0019290
a[6]         1.0046831
a[7]         1.0035200
a[8]         1.0019186
a[9]         1.0072141
a[10]        1.0045010
a[11]        1.0017051
a[12]        1.0007463
a[13]        1.0035501
a[14]        1.0052138
a[15]        1.0046899
a[16]        1.0016086
a[17]        1.0015974
a[18]        1.0020826
a[19]        1.0041743
a[20]        1.0021651
a[21]        1.0090934
a[22]        1.0040289
a[23]        1.0005327
a[24]        1.0036648
a[25]        1.0011532
a[26]        1.0044393
a[27]        0.9998108
a[28]        1.0036460
a[29]        1.0005073
a[30]        1.0052238
⋮            ⋮        
a[327]       1.0016571
a[328]       1.0009393
a[329]       1.0040270
a[330]       1.0010456
a[331]       1.0061746
sigma_a      0.9989950
b[1]         1.0591266
b[2]         1.0681644
b[3]         1.0614984
b[4]         1.0674420
b[5]         1.0564720
b[6]         1.0560444
b[7]         1.0555663
b[8]         1.0550387
b[9]         1.0578995
b[10]        1.0612750
b[11]        1.0631018
b[12]        1.0590566
sigma_b      1.0269283
bIC          0.9989761
bIA          0.9993492
bC           0.9991989
bI           0.9992121
bA           1.0001422
cutpoints[1] 1.0460215
cutpoints[2] 1.0463412
cutpoints[3] 1.0462888
cutpoints[4] 1.0463179
cutpoints[5] 1.0458839
cutpoints[6] 1.0448288
../_images/practice-13_107_4.png

Comparing the models using WAIC:

iplot(function() {
  plot(compare(m12.5, m_trolley_id, m_trolley_story_id))
}, ar=4.5)
../_images/practice-13_109_0.png

Comparing the models using PSIS, to check for overfitting or outliers:

iplot(function() {
  plot(compare(m12.5, m_trolley_id, m_trolley_story_id, func=PSIS))
}, ar=4.5)
Some Pareto k values are high (>0.5). Set pointwise=TRUE to inspect individual points.
../_images/practice-13_111_1.png

With these new story intercepts, our inferences about bIC, bA, and similar parameters have shifted more noticeably. Presumably the individual stories encode even more detail than the contact, action, and intention indicators, which makes them partially redundant, or even non-identifiable, with respect to the parameters attached to those indicator variables.

Still, this extra detail seems to provide significant predictive power according to WAIC and PSIS, though not to the same degree that the individual (id) intercepts improved prediction.
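
One way to see how the treatment effect estimates shift across the three models is a coefficient table. This is a minimal sketch, assuming the fitted models above are still in memory and that only the slope parameters are of interest.

# Side-by-side treatment effect estimates for the three Trolley models (sketch).
iplot(function() {
  plot(coeftab(m12.5, m_trolley_id, m_trolley_story_id),
    pars = c("bA", "bC", "bI", "bIA", "bIC"))
}, ar = 1.5)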