Magpies have started sneaking in the back door to steal the kids’ scraps from under the table and TBH I’m not even mad.
Let’s stop doubly-screwing data science learners
I frequently see tweets that highlight the fact that people learning coding are not taught in depth about fundamental tools or processes like using a linter, or debugging. For example, this blog post from Greg Wilson.
It’s possible that in Data Science land we are doubly-screwing over learners by not only not teaching them fundamental coding knowledge, but also not teaching analogous things in our own domain.
I know of one particularly progressive course in Business Analytics at Monash University that teaches R Markdown for writing analytical documents, and even touches on Shiny for interactive apps. Students are rightfully being taught how to put together a polished-looking piece of data-driven communication as core coursework.
To me this is a really insightful move, because out here in the trenches I have seen and felt first hand the pain of people who think they are on to a winning idea but can’t make it connect, because they can’t communicate it convincingly. My sense is that Monash’s approach is the exception rather than the rule, and that is doing students a disservice.
The big one, the one that I think DS educators are totally sleeping on, is building project pipelines. By that I mean the craft of building out scalable software machines that ingest data from various sources and transmute it into various outputs, probably involving aforementioned presentation layer technology for the final leg.
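To make the 'machine' idea concrete, here is a toy sketch in TypeScript. Every name in it (`pipelinePlan`, `makeTarget`, the three targets) is made up for illustration, and it is not the API of any real pipeline tool. It only shows the core principle as I understand it: targets declare what they depend on, get built in dependency order, and are not rebuilt if a result is already on hand.

```typescript
// A toy pipeline: each target names its upstream targets and a build step.
// All names here are hypothetical; this is not the API of any real tool.
type PipelineTarget = {
  deps: string[];
  build: (...inputs: unknown[]) => unknown;
};

const pipelinePlan: Record<string, PipelineTarget> = {
  // "Ingest": stands in for reading data from some source.
  raw: { deps: [], build: () => [3, 1, 2] },
  // "Transmute": a cleaning step that depends on the raw data.
  clean: {
    deps: ["raw"],
    build: (raw) => [...(raw as number[])].sort((a, b) => a - b),
  },
  // "Output": stands in for the presentation layer.
  report: {
    deps: ["clean"],
    build: (clean) => `min=${(clean as number[])[0]}`,
  },
};

const builtTargets = new Map<string, unknown>();
const buildLog: string[] = [];

// Build a target after its dependencies, reusing anything already built.
function makeTarget(name: string): unknown {
  if (builtTargets.has(name)) return builtTargets.get(name);
  const target = pipelinePlan[name];
  if (target === undefined) throw new Error(`Unknown target: ${name}`);
  const inputs = target.deps.map((dep) => makeTarget(dep));
  const value = target.build(...inputs);
  builtTargets.set(name, value);
  buildLog.push(name);
  return value;
}

const finalReport = makeTarget("report"); // builds raw, then clean, then report
makeTarget("clean"); // already built, so nothing runs again
```

Real tools go a lot further than this, for example by noticing when code or data has changed and rebuilding only what is out of date. The toy skips all of that.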
Tools in this space are becoming mature and ubiquitous. It seems that every big data-driven tech company has had to build one, and a few have open sourced them. Examples: Airbnb and Airflow, Spotify and Luigi, Netflix and Metaflow. In the R world we have been very fortunate to have the rOpenSci peer-reviewed option in {drake}, and soon we’ll have another peer-reviewed option in {targets}.
I have written at some length about {drake}, and how its benefits can be felt all the way down to small projects. Recently a colleague of mine who is studying told his lecturer and tutors about our {drake} workflow, and was invited to teach his class about it. At least some of his peers, data science students, are now using it for their assignments and raving about it.
This confirms to me that pipeline tools, and the principles that underpin them, are ready to be incorporated into the canon of core Data Science knowledge. I really hope I hear of more institutions following Monash’s lead, and teaching students modern tools, arising from the data science domain, that can set them up for success in industry.
I’ve been writing a fair bit of TypeScript and #rstats in VSCode over the last month and I’m struck by how much confidence the TS type linting gives me to slash at the code base. Editor highlights all the things I’ve broken quite well. Most of the bugs have been in the R code…
Keyboards vs. developer skill and the virtuous loop of productive developers
A bit of nonsense in the Twitterverse this week about developer seniority and usage of the mouse.
I see this as a recurrence of the long-running thread that rears up now and again about how ‘real’ developers use keyboard-driven editors like Emacs or Vim.
Some thoughts:
There could be a loose correlation between seniority and keyboard-driven editors due to:
- Age. These are old tools, and the people who started out when they were cutting edge are now old and, yes, senior developers.
- Injuries. Ergonomically, a mouse and standard size keyboard just don’t work long term for a segment of the population. Ergonomic keyboards, and keyboard mappings in keyboard-driven editors are a common solution to this. But you have to be at a mouse and keyboard for a fair amount of time for this to become a pain issue - skill accumulated over that time again probably leads to a loose correlation with developer seniority.
So I think some people might be observing a signal that is real (if weak), but surprise surprise getting themselves snagged in the correlation-causation-conundrum.
I have my own theories about better markers for productive programmers. I think after you gain enough programming skill you reach an inflection point where that skill can be brought to bear not just on the problems you have, but on your processes for solving them. You can write code to make yourself more efficient at writing code. You craft your own tools to fit your own niche problems.
There are examples of people who are known to be highly productive doing this everywhere. In the R world think about how {knitr}, {devtools}, {usethis}, {reprex} and their like came to be. They’re programming/CLI tools intended to supplement the capabilities of a GUI in a composite interface to the niche problems of building documents, packages, projects, and examples.
An interesting thing often happens where these things start out as command line things, and become so important to a workflow that they graduate to a keybinding or a GUI button. And so here I think we encounter another loose correlation between preference for keyboard-driven and seniority:
If you’re in the business of crafting the interface to your workflow, keybindings or buttons allow you to reduce the friction of that interface and make it ‘feel’ nicer to use. I guess it’s like the digital equivalent of a wall-mounted pegboard for tools. Having all these for-purpose tools right at your fingertips, where you can reach for them without thinking, helps you focus on what’s on the bench.
You could array your tools with buttons or menus to be moused-on, but keybindings give you a bit more ‘space’ to work with before things get unwieldy - you run out of pixels fast! So there’s a practicality aspect that could be a driver for keybindings and editors that make keybindings easy to execute.
But it’s not creation of buttons or keybindings that is important. What exactly is a ‘low friction’ interface will vary by person, and is relative to the friction of the task being interfaced with. In fact if you have powerful commands, a sharp memory, and are a fast typist, maybe a CLI already feels friction free.
The important thing - the productivity multiplier - is using your skills to shape your tools and the environment that you work in, which in turn makes your skills more effective. It’s an extremely virtuous loop, and I think possibly what people are really aspiring to, rather than say mastery of the keyboard or a keyboard-driven editor like Vim or Emacs.
Commands, buttons, bindings, foot pedals, voice commands, gesture controls… these are all just implementation options for interfaces created by that virtuous loop.
Howdy everyone it’s me Acrobat Reader. I’m a Reader for Pee Dee Effs. Definitely gonna write-lock those suckers tho (haha), so don’t forget to close me down or I’m gonna have to derail the shit out of your rendering pipelines. Seriously, I will DESTROY them. Have a great day!
Can one get a PhD in the fiddly little offset maths involved in inserting text into documents programmatically?
Feeling like a bit of an outlier while wistfully throwing name in hat for GitHub Codespaces. Also a Vim checkbox but no Emacs!
I have a feeling this is going to shake things up quite a bit when it drops.
Just ripped through @alexsmann and Co’s ‘The Eleventh’ in a few days: www.abc.net.au/radio/pro…
Great Australian historical storytelling!
We bought a pair of high end noise cancelling headphones, and now my partner is 5m away from me working on her board report while I am doing guitar drills. This is the greatest thing ever!
This Outlook spam caught my eye today. 27 days in the last month where work hasn’t leaked into home time. I’m actually quite proud of this summary statistic!
I do resent the label though. The days are contained but they aren’t that ‘quiet’!
Hey #rstats I recommend giving ‘The Social Dilemma’ on Netflix a watch.
Right now I’m reflecting on how much we are using Twitter to connect on R topics versus how much Twitter is using that pretext as an excuse to jack us in to the money printing machine. 🤔
Me: thingie.map((x) => x.thing.thingo)
JS: Yeaaahhh boi Inline lambda. Nice one!
Me: thingie.map((x) => {x.thing.thingo})
JS: NOPE. I WILL NEVER. HOW COULD YOU POSSIBLY?! GET OUT.
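For anyone bitten by the same thing, a minimal TypeScript sketch of what is going on in the exchange above. The `thingie` data and the `Thingie` type are made up for illustration. Once you add braces, the arrow function body is a statement block, so nothing comes back unless you write `return`. JS doesn't actually refuse; it just quietly hands you `undefined`.

```typescript
// Hypothetical data, shaped like the `thingie` in the post above.
type Thingie = { thing: { thingo: number } };
const thingie: Thingie[] = [
  { thing: { thingo: 1 } },
  { thing: { thingo: 2 } },
];

// Expression body: the value of the expression is returned implicitly.
const implicitReturn = thingie.map((x) => x.thing.thingo);

// Block body: the braces start a statement block, so the expression is
// evaluated and thrown away. Every element comes back as undefined.
const noReturn = thingie.map((x) => { x.thing.thingo; });

// Block body with an explicit return behaves like the expression body.
const explicitReturn = thingie.map((x) => { return x.thing.thingo; });

// Returning an object literal needs parentheses, otherwise the braces
// are read as a block again.
const wrapped = thingie.map((x) => ({ value: x.thing.thingo }));

console.log(implicitReturn, noReturn, explicitReturn, wrapped);
```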
Book week is just the prisoner’s dilemma recast where the snitches are parents who roll out legit costumes. CHANGE MY MIND.
Kid 1 wants to go as ‘The Onceler in his lerkim’ to book week tomorrow. Challenge accepted my dude! 😂😂😂
Look at your console, now back at me, the output is on the clipboard, look again, now it’s on my page, now it’s in my code, THE OUTPUT IS NOW DIAMONDS…
clipr::write_last_clip() is a handy little #rstats fn that can save mousing around in the console.
‘GitHub pull request’ adds a PR and Issue workflow to VSCode. You can make PR comments, code review comments, and execute a merge from within VSCode itself. github.com/microsoft… 🤩
Birthdays
Birthday@25:
- sleep in till 10
- play new videogames (present to self)
- get sloshed on coronas in the driveway
- memory corrupt
Birthday@35:
- sleep in till 6:30
- change shitty nappies
- unclog washing machine filter
- go on trail run (present to self)
- afternoon tea with kids + extended fam
- quiet scotch paired with solving N-Queens in Scheme
Remember when Twitter inexplicably started hiding links to people’s blogs and refused to enter into any correspondence about it? mobile.twitter.com/MilesMcBa…
Your followers aren’t yours. This is their house. You just live here.
How I got utteranc.es working on my rmarkdown distill blog
What is utteranc.es?
I first saw these on Nick Tierney’s site.
The premise: What if your blog comment threads were just GitHub issue threads on your blog source repo? What if they were synchronised between GitHub and the footer of your blog posts? Neat idea hey? You keep control of your data (relatively speaking) and there’s one less site to data mine and track your readers - yes this is a thing in Disqus. Booouuurrrnnns.
How did I get it working?
I didn’t. I fought and fought for some hours with the grid CSS in the Distill template and no matter what I tried my comments iframe always had 0 height. Then I had a big whinge to my colleague Anthony North who defied the grid using an HTML include with a JavaScript payload that injects the Utterances iframe into the end of the article. Very hacky and very cool.
Here’s the HTML file which shouldn’t be too difficult to adapt to your own distill site if you have one.
You need to refer to it in your _site.yaml like:
    output:
      distill::distill_article:
        css: mmstyle.css
        includes:
          in_header: utterances.html
Recover is the apex R debugging method.
I just debugged a ‘non-numeric argument error’ being thrown by this beastie in under 5 minutes with #rstats’ options(error = recover).
A strength of recover over other methods for stuff like this is that all 4(!) loop indices will be set to the values they were on failure. Chef’s kiss
IMHO, contrary to Jenny Bryan’s ranking in her incredible ‘Object of type closure is not subsettable’ talk, this makes recover the apex R debugging method.
Error turned out to be from line 12 due to sticky sf geometry btw. 🥳
edit: I accidentally wrote recover = TRUE when I first posted this, a typo I often make due to wishful thinking perhaps.
I’m experimenting with using a micro blog syndicated to Twitter for my status type posts, part of my latest efforts to turn Twitter into a less noisy social network. micro.blog
Spacemacs for the rest of us
Despite the joy it brings me I’ve always balked at recommending Spacemacs + ESS as a dev environment for #rstats due to the brutal learning curve. However yesterday, thanks to Jack of Some’s YouTube channel, I discovered there is a reasonably faithful port of Spacemacs to VSCode. It’s called VSpaceCode and it’s completely compatible with the R extension!
I gave it a blast today at work and it screams on Windows compared to the Emacs version. The responsiveness just can’t be un-felt, and will definitely be addictive. There’s even a built-in port of magit that was again super slick compared to the slowness of the Emacs native version.
Interestingly, I have noticed the situation is reversed on my personal laptop running Linux, where there seem to be a few performance glitches in VSCode. There’s still no one editor to rule them all it seems, but it feels like a similar enough experience that I could be productive using the faster one on each platform. We’ll see as I rack up more time with VSpaceCode.