While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

"Just get some labelled data"

Neal Lathia

"This is just a small ode to the folks who spend countless hours on the toil that is 'just' getting some labelled data."

Read it!

Prediction intervals for Random Forests

Ando Saabas

Prediction intervals are commonly used for linear models but are often underused for random forests. Leveraging the fact that a random forest can provide a conditional distribution instead of just the conditional mean makes prediction intervals relatively straightforward to use in this context.
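
One way to picture the idea is a minimal sketch that treats the spread of per-tree predictions as a rough conditional distribution and takes percentiles of it. It assumes scikit-learn and NumPy are installed. The synthetic data, the 5th/95th percentile choice, and all variable names are illustrative assumptions, not taken from the linked post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

# Fully grown trees (min_samples_leaf=1) keep each tree's prediction close to
# individual observed responses rather than a smoothed leaf average.
forest = RandomForestRegressor(n_estimators=200, min_samples_leaf=1, random_state=0)
forest.fit(X_train, y_train)

# Collect every tree's prediction: one row per tree, one column per test point.
per_tree = np.stack([tree.predict(X_test) for tree in forest.estimators_])

# Percentiles across trees give a heuristic 90% interval for each test point.
lower, upper = np.percentile(per_tree, [5, 95], axis=0)

# Fraction of held-out targets that fall inside their interval.
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage of the nominal 90% interval: {coverage:.2f}")
```

The nominal level is not guaranteed by this heuristic, so it is worth checking the empirical coverage on held-out data, as the last lines do.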

Read it!

Understanding the beta distribution (using baseball statistics)

David Robinson

"The beta distribution is best for representing a probabilistic distribution of probabilities: the case where we don’t know what a probability is in advance, but we have some reasonable guesses."
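
As a small sketch of "reasonable guesses" in practice: a Beta(alpha, beta) prior over a batting average can be updated by adding observed hits to alpha and misses to beta. The `Beta` class, the prior parameters, and the hit counts below are illustrative assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Hypothetical helper holding the two shape parameters of a beta distribution."""

    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Expected value of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "Beta":
        # Conjugate update for binomial data: add the counts to the parameters.
        return Beta(self.alpha + successes, self.beta + failures)


# Illustrative prior: we guess the batting average is around 0.27.
prior = Beta(alpha=81, beta=219)

# Illustrative season: 100 hits and 200 misses in 300 at-bats.
posterior = prior.update(successes=100, failures=200)

print(round(prior.mean, 3), round(posterior.mean, 3))  # prints: 0.27 0.302
```

The posterior mean lands between the prior guess (0.27) and the raw observed average (100/300, about 0.333), and it moves closer to the observed average as more at-bats accumulate.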

Read it!