Sunday, December 3, 2017

A Bayesian Look at the Challenger Disaster

A Bayesian Look at the Challenger Disaster
The NASA space shuttle Challenger exploded on January 28, 1986, just 73 seconds after liftoff, bringing a devastating end to the spacecraft’s 10th mission. The disaster claimed the lives of all seven astronauts aboard, including Christa McAuliffe, a teacher from New Hampshire who would have been the first civilian in space. It was later determined that two rubber O-rings, which had been designed to separate the sections of the rocket booster, had failed due to cold temperatures on the morning of the launch. The tragedy and its aftermath received extensive media coverage and prompted NASA to temporarily suspend all shuttle missions.
Suppose that we could look back in time to examine the Challenger O-rings. We can approximate this time journey by looking at the data availabe to the Shuttle engineers at the time. There were twenty-three previous Space Shuttle flights using four separate vehicles (Challenger, Atlantis, Columbia and Discovery). Several flights were determined to have defective O-rings on post flight inspection. Using that data, I will attempt a look back at the Challenger disaster to see what effect the low temperature at the time of launch (31 degrees F) may have had on the O-rings, leading to failure.
First, load the required packages
#devtools::install_github("rmcelreath/rethinking", force = TRUE)
library(rethinking)
library(ggplot2)
set.seed(123)
The data for this analysis can be downloaded from the UCI database UCI. To make it easier, I have recompiled the data into a data frame(orings) that you can just copy.
o_rings <- rep(6, 23)
distressed <- c(0, 1, 0, 0, 0, 0, 0, 0,
                1, 1, 1, 0, 0, 2, 0, 0,
                0, 0, 0, 0, 2, 0, 1)
l_temp <- c(66, 70, 69, 68, 67, 72, 73, 70,
            57, 63, 70, 78, 67, 53, 67, 75,
            70, 81, 76, 79, 75, 76, 58)
psi <- c(50, 50, 50, 50, 50, 50, 100, 100,
         200, 200, 200, 200, 200, 200, 200, 200,
         200, 200, 200, 200, 200, 200, 200)
flight <- seq(1,23,1)
weight <- c(235415, 241664, 247112, 256744, 249178, 242742,
             247619, 250452, 254254, 263477, 242780, 263324,
             250891, 250891, 246880, 256524, 252855, 262309,
             190400, 243762, 261455, 256003, 121778)
vehicle <- c("Columbia",   "Columbia",   "Columbia",  
              "Challenger", "Challenger", "Challenger",
              "Columbia",   "Challenger", "Challenger",
              "Discovery",  "Challenger", "Discovery", 
              "Discovery",  "Discovery",  "Challenger",
              "Discovery",  "Challenger", "Discovery", 
              "Atlantis",   "Challenger", "Atlantis",  
              "Columbia",   "Challenger")            
orings <- data.frame(o_rings, distressed, l_temp,
                      psi, flight, weight, vehicle)
You can see the distribution of normal and distressed O-rings found on the four Shuttle vehicles.
ggplot(orings, aes(x = l_temp, y = distressed, color = vehicle)) + geom_point() +xlab("Launch Air Temperature") +
  ylab("Distressed O-Rings") + ggtitle("Distressed O-Rings: Pre-Challenger(STS-51-L) Flights")

Only 18 points are visible due to overlap. The normal and distressed O-rings appear to be fairly evenly distributed among the four vehicles.
Below, in the boxplots, lower temperatures are associated with more distressed O-rings. The lowest launch temperature, until the Challenger disaster (31 degrees F), was 53 degrees F.
ggplot(orings, aes(x = as.factor(distressed), y = l_temp)) +
  geom_boxplot() +
  ggtitle(label = "Challenger Disaster",
          subtitle = "DIstressed O-Rings vs. Launch Air Temperature") +
  ylab("Launch Air Temperature") +
  xlab("Distressed O-Rings (per mission)")

In order to model the relationship between launch temperature and the count of distressed O-rings, I used a Bayesian approach. We start out first with a simple model and work up in complexity. The various models are compared. The model having the posterior distribution with the highest entropy (measured by WAIC - widely applicable information criterion) will be selected for predicting O-ring damage at lower temperatures.
I have used weakly informative priors primarily to act as a brake on overfitting. Quadratic approximation will be used for most of the models and MCMC(using Stan) for one other model.
#The simplest model. Coefficients are zero. Only the intercept is included in the model
m1 <- map(
  alist(
    distressed ~ dpois(lambda),
    log(lambda) <- a,
    a ~ dnorm(1, 100)
  ),
  data = orings) 
precis(m1) 
##    Mean StdDev  5.5% 94.5%
## a -0.94   0.33 -1.47 -0.41
Now to increase the complexity of the model by adding launch temperature (bt).
#adding a coefficient for launch temperature
m2 <- map(
  alist(
    distressed ~ dpois(lambda),
    log(lambda) <- a + bt*l_temp,
    a ~ dnorm(0, 1),
    bt ~ dnorm(0,1)
  ),
  data = orings) 
precis(m2, corr = TRUE)
##     Mean StdDev  5.5% 94.5%     a    bt
## a   0.64   0.96 -0.88  2.17  1.00 -0.94
## bt -0.02   0.01 -0.05  0.00 -0.94  1.00
plot(precis(m2))

The plot for model m2 shows that the coefficient for launch temperature is -0.02 and the right tail (at 94.5%) does not go past zero.
Model m3 adds launch weight (bw) as a predictor. I have chosen to scale the weight since it is several orders of magnitude greater than the values for temperature. This will make the computation less complex and will guard against overfitting.
#adding a coefficient for weight
orings$weight_sc <- scale(orings$weight)
m3 <- map(
  alist(
    distressed ~ dpois(lambda),
    log(lambda) <- a + bt*l_temp + bw*weight_sc,
    a ~ dnorm(0, 1),
    bt ~ dnorm(0, 1),
    bw ~ dnorm(0, 1)
  ),
  data = orings) 
precis(m3, corr = TRUE)
##     Mean StdDev  5.5% 94.5%     a    bt    bw
## a   0.59   0.96 -0.94  2.13  1.00 -0.94  0.09
## bt -0.02   0.01 -0.05  0.00 -0.94  1.00 -0.06
## bw -0.07   0.28 -0.52  0.37  0.09 -0.06  1.00
plot(precis(m3))

Model m3 shows that launch tmperature retains its position relative to zero. Launch weight appears to have a negative effect on O-ring damage (more weight, less damage; less weight, more damage), but it has a high degree of uncertainty compared to temperature.
Model m4 adds scaled leak-check pressure as a variable.
#adding scaled leak-check pressure
orings$psi_sc <- scale(orings$psi)
m4 <- map(
  alist(
    distressed ~ dpois(lambda),
    log(lambda) <- a + bt*l_temp + bw*weight_sc + bp*psi_sc,
    a ~ dnorm(0, 1),
    bt ~ dnorm(0, 1),
    bw ~ dnorm(0, 1),
    bp ~ dnorm(0, 1)
  ),
  data = orings) 
precis(m4, corr = TRUE)
##     Mean StdDev  5.5% 94.5%     a    bt    bw    bp
## a   0.56   0.95 -0.95  2.08  1.00 -0.92  0.09 -0.03
## bt -0.02   0.01 -0.05  0.00 -0.92  1.00 -0.08 -0.15
## bw -0.03   0.26 -0.45  0.39  0.09 -0.08  1.00  0.09
## bp  0.57   0.44 -0.13  1.27 -0.03 -0.15  0.09  1.00
plot(precis(m4))

We can use the Shuttle vehicles as a source of group level variation that we can exploit in a multilevel model. Here we are using Stan to produce a Hamiltonian MCMC model for the four types of Shuttle vehicles.
#Multilevel model using integers for vehicles
#Data preparation for Stan
orings$vehic <- ifelse(orings$vehicle == "Challenger", 1,
                       ifelse(orings$vehicle == "Atlantis", 2,
                              ifelse(orings$vehicle == "Columbia", 3, 4)))
orings$vehic <- coerce_index(orings$vehic)
orings_stan <- orings[, c(2,3,10)]
orings_stan$distressed <- as.integer(orings_stan$distressed)
orings_stan$l_temp <- as.integer(orings_stan$l_temp)
m5 <- map2stan(
  alist(
    distressed ~ dbinom(l_temp, p),
    logit(p) <- a_vehic[vehic],
    a_vehic[vehic] ~ dnorm(0,5)
  ),
  data = orings_stan)
precis(m5, depth = 2)
##             Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
## a_vehic[1] -5.51   0.60      -6.35      -4.50   970    1
## a_vehic[2] -4.41   0.74      -5.63      -3.31  1000    1
## a_vehic[3] -6.05   1.00      -7.44      -4.58   750    1
## a_vehic[4] -5.05   0.62      -5.99      -4.10   835    1
plot(precis(m5, depth = 2))

There does not appear to be much variation between the Suttle vehicle groups. This would be expected since the basic design and launch physics of each vehicle would be similar.
A final area to explore is whether there is any interaction between launch temperature and weight to include in our prediction.
m6 <- map(
  alist(
    distressed ~ dpois( lambda ),
    log(lambda) <- Intercept +
      bt*l_temp +
      bw*weight_sc +
      btw*l_temp*weight_sc,
    Intercept ~ dnorm(0,1),
    bt ~ dnorm(0,1),
    bw ~ dnorm(0,1),
    btw ~ dnorm(0,1)
  ),
  data = orings)
precis(m6, corr = TRUE)
##            Mean StdDev  5.5% 94.5% Intercept    bt    bw   btw
## Intercept  0.63   0.96 -0.91  2.16      1.00 -0.94  0.00  0.03
## bt        -0.02   0.01 -0.05  0.00     -0.94  1.00  0.03 -0.05
## bw        -0.16   0.95 -1.67  1.36      0.00  0.03  1.00 -0.96
## btw        0.00   0.02 -0.02  0.03      0.03 -0.05 -0.96  1.00
plot(precis(m6))

There may be a small degree of interaction between temperature and weight, but temperature is the dominant factor here.
Now to compare the various models:
compare(m1, m2, m3, m4, m5, m6)
##     WAIC pWAIC dWAIC weight    SE  dSE
## m2  37.6   1.1   0.0   0.61  7.84   NA
## m1  39.8   1.1   2.2   0.20  7.70 1.42
## m4  41.2   3.5   3.6   0.10  8.33 3.88
## m6  42.7   3.3   5.1   0.05  9.25 3.72
## m3  43.4   3.7   5.8   0.03  9.92 5.00
## m5 117.8   4.3  80.2   0.00 32.73 8.68
Model 2 (m2) has the lowest WAIC, and is given 61% of the weight. It is also the simplest model of the group. It can now be used to predict a probability distribution for O-ring damage at lower temperatures.
The number of distressed O-rings stays near zero at temperatures above 60 drgrees F. At 31 degrees F, up to 4 distressed O-rings are within the probability interval. The mean line (the densest part of the probability interval), shows 1 distressed O-ring at 31 degrees F).

No comments:

Post a Comment