Magpies have started sneaking in the back door to steal the kids scraps from under the table and TBH I’m not even mad.
Let’s stop doubly-screwing data science learners
I frequently see tweets that highlight the fact that people learning coding are not taught in depth about fundamental tools or processes like using a linter, or debugging. For example, this blog post from Greg Wilson.
It’s possible that in Data Science land we are doubly-screwing over learners by not only not teaching them fundamental coding knowledge, but also not teaching analogous things in our own domain.
I know of one particularly progressive course in Business Analytics at Monash University that teaches RMardkown for writing analytical documents, and even touches on Shiny for interactive apps. Students are rightfully being taught how to put together a polished looking piece of data driven communication as core coursework.
To me this a really insightful move, because out here in the trenches I have seen and felt the pain first hand of people who think they are on to a winning idea, but can’t make it connect, due to inability to communicate it in a convincing way. My sense is that Monash’s approach is the exception rather than the rule, and that is doing students a disservice.
The big one, the one that I think DS educators are totally sleeping on, is building project pipelines. By that I mean the craft of building out scalable software machines that ingest data from various sources and transmute it into various outputs, probably involving aforementioned presentation layer technology for the final leg.
Tools in this space are becoming mature and ubiquitous. It seems that every big data driven tech company has had to build one, and a few have open sourced them. Examples: Airbnb and Airflow, Spotify and Luigi, Netflix and Metaflow. In the R world we have been very fortunate to have the rOpensci peer-reviewed option in {drake}, and soon we’ll have another peer-reviewed option in {targets}.
I have written at some length about {drake}
, and how its benefits can be felt all the way down to small projects. Recently a colleague of mine who is studying told his lecturer and tutors about our {drake}
workflow, and was invited to teach his class about it. At least some of his peers, data science students, are now using it for their assignments and raving about it.
This confirms to me that pipeline tools, and the principles that underpin them are ready to be incorporated into the canon of core Data Science knowledge. I really hope I hear of more institutions following Monash’s lead, and teaching students modern tools, arising from the data science domain, that can set them up for success in industry.
I’ve been writing a fair bit of Typescript and #rstats in VSCode over the last month and I’m struck by how much confidence the TS type linting gives me to slash at the code base. Editor highlights all the things I’ve broken quite well. Most of the bugs have been in the R code…
Keyboards vs. developer skill and the virtuos loop of productive developers
A bit of nonsense in the Twitterverse this week about developer seniority and usage of the mouse.
I see this as recurrence of the long running thread that rears up now and again about how ‘real’ developers use keyboard-driven editors like Emacs or Vim.
Some thoughts:
There could be a loose correlation between seniority and keyboard driven editors due to:
- Age. These are old tools, and the people who started out when they were cutting edge are now old, and yes senior developers.
- Injuries. Ergonomically, a mouse and standard size keyboard just don’t work long term for a segment of the population. Ergonomic keyboards, and keyboard mappings in keyboard-driven editors are a common solution to this. But you have to be at a mouse and keyboard for a fair amount of time for this to become a pain issue - skill accumulated over that time again probably leads to a loose correlation with developer seniority.
So I think some people might be observing a signal that is real (if weak), but surprise surprise getting themselves snagged in the correlation-causation-conundrum.
I have my own theories about better markers for productive programmers. I think after you gain enough programming skill you reach an inflection point where that skill can be brought to bear not just on the problems you have, but on your processes for solving them. You can write code to make yourself more efficient at writing code. You craft your own tools to fit your own niche problems.
There are examples of people who are known to be highly productive doing this
everywhere. In the R world think about how {knitr}
, {devtools}
, {usethis}
,
{reprex}
and their like came to be. They’re programming/CLI tools intended to
supplement the capabilities of a GUI in a composite interface to the niche
problems of building documents, packages, projects, and examples.
An interesting thing often happens where these things start out as command line things, and become so important to a workflow that they graduate to a keybinding or a GUI button. And so here I think we encounter another loose correlation between preference for keyboard-driven and seniority:
If you’re in the business of crafting the interface to your workflow, keybindings or buttons allow you to reduce the friction of that interface and make it ‘feel’ nicer to use. I guess it’s like the digital equivalent of a wall-mounted pegboard for tools. Having all these for-purpose tools right at your fingertips, you can reach for without thinking, helps you focus on what’s on the bench.
You could array your tools with buttons or menus to be moused-on, but keybindings give you a bit more ‘space’ to work with before things get unweildly - you run out of pixels fast! So there’s a practicality aspect that could be a driver for keybindings and editors that make keybindings easy to execute.
But it’s not creation of buttons or keybindings that is important. What exactly is a ‘low friction’ inteface will vary by person, and is relative to the friction of the task being interfaced with. In fact if you have powerful commands, a sharp memory, and are a fast typist, maybe a CLI already feels friction free.
The important thing - the productity multiplier - is using your skills to shape your tools and the environment that you work in, which in-turn makes your skills more effective. It’s an extremely virtuos loop, and I think possibly what people are really aspiring to, rather than say mastery of the keyboard or a keyboard-driven editor like Vim or Emacs.
Commands, buttons, bindings, foot pedals, voice commands, gesture controls… these are all just implementation options for interfaces created by that virtuos loop.
Howdy eveyrone it’s me Acrobat Reader. I’m a Reader for Pee Dee Effs. Definitely gonna write-lock those suckers tho (haha), so don’t forget to close me down or I’m gonna have to derail the shit out of your rendring pipelines. Seriously, I will DESTROY them. Have a great day!
Can one get a Phd in the fiddly little offset maths involved in inserting text into documents programatically?
Feeling like a bit of an outlier while wistfully throwing name in hat for Github codespaces. Also a Vim checkbox but no Emacs!
I have a feeling this is going to shake things up quite a bit when it drops.
Just ripped through @alexsmann and Co’s ‘The Eleventh’ in a few days: www.abc.net.au/radio/pro…
Great Australian historical storytelling!
We bought a pair of high end noise cancelling headphones, and now my partner is 5m away from me working on her board report while I am doing guitar drills. This is the greatest thing ever!