Something in prod bombed over the weekend. I had logged the #rstats {targets} build output so I knew which target. I pulled the dependencies from the cache stepped through the target code interactively until I found the bomb. Data source schema change ofcourse. Took about ~10 minutes to pin down the field and record in the cached data and fire off the email to upsream team. Prod without a target graph and a cache? I can’t even.

Setup

In response to a Twitter question from Jared Lander, here is my logging setup:

Top level file is a .cmd - yes we’re on Windows Server Data Centre.

pushd $~dp0
Rscript.exe the_script.R > log.txt 2>&1

Roughly translated to:

  • set the working dir to the scipt’s location.
  • pipe the std output and std error of running the_script.R to log.txt

Here is the_script.R:

capsule::run({
  targets::tar_invalidate(source_file)
  targets::tar_make(output)
})

Which translates to:

  • within my {capsule} ({renv}):
    • invalidate the source data (so it will be refreshed)
    • build the output (this plan has multiple outputs on different schedules)

To do interactive diagnostics with cached targets, I run capusle::repl() to switch my R REPL over to the capsule environment.