While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

How much data should you allocate to training and validation?

Francesco Pochetti

To avoid answering "that's what Andrew Ng said" when asked why you chose an 80% training and 20% validation split, consider this explanation.

Read it!

Writing Robust Tests for Data & Machine Learning Pipelines

Eugene Yan

An in-depth analysis of why certain types of tests break more frequently than others, along with suggestions for creating more robust pipeline tests.

Read it!

Confession of a so-called AI expert

Chip Huyen

"Even though I’m one of the beneficiary of this AI craze, I can’t help but thinking this will burst. I don’t know how and when, but I have this belief that the system is currently being rigged in favor of people whose resumes dotted with fancy keywords like mine, and a rigged system can’t be sustainable."

Read it!