A Bayesian Course with Examples in R and Stan (& PyMC3 & brms & Julia too, see links below)
Most recent set of free lectures: Statistical Rethinking 2023
Second Edition
The second edition is now out in print. Publisher information on the CRC Press page. For more detail about what is new, look here.
Materials
2nd Edition
- Book: CRC Press
- Book sample: Chapters 1 and 2 (2MB PDF)
- Lectures and slides:
* Winter 2022 materials (ongoing)
* Winter 2019 materials
- Code and examples:
* R package: rethinking (github repository)
* R code examples from the book: code.txt
* Book examples in Stan+tidyverse
* brms + tidyverse conversion here
* PyMC3 code examples: PyMC repository
* NumPyro!
* More NumPyro
* TensorFlow Probability notebooks
* Julia & Turing examples (both 1st and 2nd edition)
* Another Julia code translation with clean outline in notebook format
* R-INLA examples
* pyro/pytorch notebooks
* A new PyMC translation for the 3rd edition draft examples
* New PyMC5 translation for 2023 lecture examples
1st Edition
- Code and examples:
* R package: rethinking (github repository)
* Code examples from the book in plain text: code.txt
* 1st edition examples translated to brms syntax: Statistical Rethinking with brms, ggplot2, and the tidyverse
* 1st edition translated to Python & PyMC3
* 1st edition translated to Julia
* 1st edition examples as raw Stan
- 1st edition errata: [view on github]
Overview
Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds your knowledge of and confidence in making inferences from data. Reflecting the need for scripting in today's model-based statistics, the book pushes you to perform step-by-step calculations that are usually automated. This unique computational approach ensures that you understand enough of the details to make reasonable choices and interpretations in your own modeling work.
The text presents causal inference and generalized linear multilevel models from a simple Bayesian perspective that builds on information theory and maximum entropy. The core material ranges from the basics of regression to advanced multilevel models. It also presents measurement error, missing data, and Gaussian process models for spatial and phylogenetic confounding.
The second edition emphasizes the directed acyclic graph (DAG) approach to causal inference, integrating DAGs into many examples. The new edition also contains new material on the design of prior distributions, splines, ordered categorical predictors, social relations models, cross-validation, importance sampling, instrumental variables, and Hamiltonian Monte Carlo. It ends with an entirely new chapter that goes beyond generalized linear modeling, showing how domain-specific scientific models can be built into statistical analyses.
R package
The book is accompanied by an R package, rethinking. The package is available here and on github. The core of the package is two functions, quap and ulam, that allow many different statistical models to be built from standard model formulas. This has the virtue of forcing the user to lay out all of the model's assumptions. The function quap fits a model by quadratic approximation of the posterior, centered on the maximum a posteriori estimate. The function ulam compiles the same kind of formula into a Stan model and fits it by MCMC sampling. Some of the more advanced models in the last chapter are written directly in Stan code, in order to provide a bridge to a more general tool. There is also a technical manual with additional documentation.
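As a sketch of what the formula interface looks like (assuming the rethinking package and its bundled Howell1 height data, which the book uses in Chapter 4), a simple Gaussian regression can be specified once and then fit either with quap or, for MCMC via Stan, with ulam:

```r
# Sketch only: assumes the rethinking package (and, for ulam, Stan) is installed.
library(rethinking)

data(Howell1)                       # height data bundled with the package
d <- Howell1[Howell1$age >= 18, ]   # adults only

# Every assumption is laid out explicitly: likelihood, linear model, priors.
flist <- alist(
  height ~ dnorm(mu, sigma),        # Gaussian likelihood
  mu <- a + b * weight,             # linear model for the mean
  a ~ dnorm(178, 20),               # prior on the intercept (cm)
  b ~ dlnorm(0, 1),                 # log-normal prior keeps the slope positive
  sigma ~ dunif(0, 50)              # prior on the residual standard deviation
)

m_quap <- quap(flist, data = d)     # quadratic approximation of the posterior
precis(m_quap)                      # summarize the posterior

# The same formula fit by Hamiltonian Monte Carlo through Stan:
m_ulam <- ulam(flist, data = d, chains = 4)
```

The point of the shared formula syntax is that moving from the fast quadratic approximation to full MCMC is a one-line change, so the stated assumptions stay identical across both fits.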
Contents
Chapter 1. The Golem of Prague
Statistical golems
Statistical rethinking
Tools for golem engineering
Chapter 2. Small Worlds and Large Worlds
The garden of forking data
Building a model
Components of the model
Making the model go
Chapter 3. Sampling the Imaginary
Sampling from a grid-approximate posterior
Sampling to summarize
Sampling to simulate prediction
Chapter 4. Geocentric Models
Why normal distributions are normal
A language for describing models
Gaussian model of height
Linear prediction
Curves from lines
Chapter 5. The Many Variables & The Spurious Waffles
Spurious association
Masked relationship
Categorical variables
Chapter 6. The Haunted DAG & The Causal Terror
Multicollinearity
Post-treatment bias
Collider bias
Confronting confounding
Chapter 7. Ulysses’ Compass
The problem with parameters
Entropy and accuracy
Golem Taming: Regularization
Predicting predictive accuracy
Model comparison
Chapter 8. Conditional Manatees
Building an interaction
Symmetry of interactions
Continuous interactions
Chapter 9. Markov Chain Monte Carlo
Good King Markov and His island kingdom
Metropolis Algorithms
Hamiltonian Monte Carlo
Easy HMC: ulam
Care and feeding of your Markov chain
Chapter 10. Big Entropy and the Generalized Linear Model
Maximum entropy
Generalized linear models
Maximum entropy priors
Chapter 11. God Spiked the Integers
Binomial regression
Poisson regression
Multinomial and categorical models
Chapter 12. Monsters and Mixtures
Over-dispersed counts
Zero-inflated outcomes
Ordered categorical outcomes
Ordered categorical predictors
Chapter 13. Models With Memory
Example: Multilevel tadpoles
Varying effects and the underfitting/overfitting trade-off
More than one type of cluster
Divergent transitions and non-centered priors
Multilevel posterior predictions
Chapter 14. Adventures in Covariance
Varying slopes by construction
Advanced varying slopes
Instruments and causal designs
Social relations as correlated varying effects
Continuous categories and the Gaussian process
Chapter 15. Missing Data and Other Opportunities
Measurement error
Missing data
Categorical errors and discrete absences
Chapter 16. Generalized Linear Madness
Geometric people
Hidden minds and observed behavior
Ordinary differential nut cracking
Population dynamics
Chapter 17. Horoscopes