Many years ago now, I told a class of summer semester students that one of the lowest-effort, highest-reward things they could do to prepare themselves for working on big data problems was to build familiarity with Linux, the operating system of the cloud. This is probably one of the most prophetic things I have ever said. This was back before Kubernetes existed, and if Docker existed, I’d certainly never seen it used.

I advised them to try switching their personal laptop OS to Linux.

I think this is still decent advice for all Data Scientists today. Linux know-how is a great value add for teams that need to scale up by themselves, without support (or without priority or quality support) from dedicated cloud infrastructure teams.

If you are confident with the Linux ecosystem, you’re not dependent on someone else to ‘productionise’ your work. You can cede as much or as little of that as you want.

It’s also a way easier sell these days. I mean, I play Steam games without a hitch on my personal laptop running Linux. Steam Games! What times we live in!

In the weirdest twist of fate, Microsoft Windows is now a strong contender as a desktop OS for those who want to build Linux skills with the safety net of a commercial OS. The Windows Subsystem for Linux ‘just works’ pretty well, especially when you combine it with VS Code.

On the Apple side of the fence, there look to be some cool projects aiming to create a decent Linux experience on the proprietary Apple chips. This is definitely worth looking into if you’re one of what seems like the 95% of Data Scientists who favour working on a Mac.