While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

Why Correlation Usually ≠ Causation

Gwern

"Despite this admonition, people are overconfident in claiming correlations to support favored causal interpretations and are surprised by the results of randomized experiments, suggesting that they are biased & systematically underestimate the prevalence of confounds / common-causation."

Read it!
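To make the quoted point about confounds concrete, here is a minimal simulation sketch. It is not taken from Gwern's post: the variable names, the noise model, and the sample size are all assumptions chosen for illustration. A hidden common cause z drives both x and y, while x has no effect on y at all.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, standard library only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounding(n=5000, seed=0):
    """Return (observational correlation, randomized correlation).

    Hypothetical setup: z is an unobserved common cause of x and y.
    x never influences y in this toy world.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]

    # Observational data: x and y both inherit variation from z.
    x_observed = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]

    # Randomized experiment: x is assigned independently of z.
    # Because x has no causal effect on y, y is left unchanged.
    x_randomized = [rng.gauss(0, 1) for _ in range(n)]

    return pearson(x_observed, y), pearson(x_randomized, y)


if __name__ == "__main__":
    observational, randomized = simulate_confounding()
    print(f"observational correlation: {observational:.2f}")
    print(f"randomized correlation:    {randomized:.2f}")
```

Under these assumptions the observational correlation has a theoretical value of 0.5, while the correlation under random assignment should hover near zero. That gap is the kind of surprise the quote describes.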

Data scientists work alone and that's bad

Ethan Rosenthal

"The norm is that of a lonely life for the data scientist. Whether they lie near analytics, machine learning, or elsewhere in the large latent space that spans this ill-defined role, just like in the curse of high-dimensionality, they are likely alone."

Read it!

Two faces of overfitting

Zygmunt

Overfitting the validation set occurs when multiple settings are tested and compared using the same validation set repeatedly until satisfactory performance is achieved.

Read it!
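The validation-set effect described above can be sketched in a few lines. This is an illustrative toy, not code from Zygmunt's post: the function names, the number of settings, and the set sizes are assumptions. Every "setting" here is a model that guesses labels at random, so its true accuracy is 50%, yet picking the best validation score makes one of them look skilled.

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def select_on_validation(n_settings=200, n_val=50, n_test=5000, seed=0):
    """Return (best validation accuracy, test accuracy of that winner).

    Hypothetical setup: binary labels are coin flips, and each setting
    predicts by flipping its own coins, so no setting has real skill.
    """
    rng = random.Random(seed)
    val_labels = [rng.randint(0, 1) for _ in range(n_val)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]

    best_val_acc = -1.0
    test_acc_of_best = None
    for _ in range(n_settings):
        val_preds = [rng.randint(0, 1) for _ in range(n_val)]
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]
        val_acc = accuracy(val_preds, val_labels)
        # Reusing the same validation set for every comparison is the
        # step that lets chance masquerade as improvement.
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            test_acc_of_best = accuracy(test_preds, test_labels)
    return best_val_acc, test_acc_of_best


if __name__ == "__main__":
    val_acc, test_acc = select_on_validation()
    print(f"best validation accuracy: {val_acc:.2f}")
    print(f"test accuracy of winner:  {test_acc:.2f}")
```

With these settings the winning validation score should land well above 50%, while the same model scores close to 50% on the untouched test set. A held-out set that is consulted only once is what exposes the difference.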