Something in prod bombed over the weekend. I had logged the #rstats {targets} build output, so I knew which target. I pulled the dependencies from the cache and stepped through the target code interactively until I found the bomb. Data source schema change, of course. Took about 10 minutes to pin down the field and record in the cached data and fire off the email to the upstream team. Prod without a target graph and a cache? I can’t even.
Setup
In response to a Twitter question from Jared Lander, here is my logging setup:
The top-level file is a .cmd
- yes, we’re on Windows Server Datacenter.
pushd %~dp0
Rscript.exe the_script.R > log.txt 2>&1
Roughly translated to:
- set the working dir to the script’s location.
- redirect the standard output and standard error of running the_script.R to log.txt
Here is the_script.R:
capsule::run({
  targets::tar_invalidate(source_file)
  targets::tar_make(output)
})
Which translates to:
- within my {capsule} ({renv}):
- invalidate the source data (so it will be refreshed)
- build the output (this plan has multiple outputs on different schedules)
To do interactive diagnostics with cached targets, I run capsule::repl() to switch my R REPL over to the capsule environment.
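For illustration, here is a rough sketch of what pulling a cached dependency can look like once you are in that environment. It is not my production plan: the pipeline, the target contents, and the `amount` field are all made up, and it runs in a throwaway directory via `tar_dir()` so it does not touch a real project.

```r
library(targets)

tar_dir({ # work in a temporary directory, not a real project
  # Hypothetical pipeline: `output` fails because a field arrives as text.
  tar_script({
    list(
      tar_target(source_data, data.frame(id = 1:3, amount = c("10", "n/a", "30"))),
      tar_target(output, {
        stopifnot(is.numeric(source_data$amount))
        sum(source_data$amount)
      })
    )
  }, ask = FALSE)

  # source_data builds and is cached; output errors, so wrap the call in try().
  try(tar_make(output), silent = TRUE)

  # Ask the metadata which targets recorded an error.
  print(tar_meta(fields = error, complete_only = TRUE))

  # Load the cached dependency into the session and inspect it.
  tar_load(source_data)
  str(source_data)
})
```

In a real project you would skip the `tar_dir()` and `tar_script()` scaffolding and call `tar_meta()` and `tar_load()` from the project root, then step through the failing target's code by hand with the loaded objects.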