Taking a Tour of the Pirate Ship ‘GitHub DMCA’ with R

| | R, TLAPD
Despite having sailed through the core components of this year’s Talk Like A Pirate Day R post, time has been an enemy of late, so this will be a short post that others can build off of, especially since there’s lots more knife work ground to cover from the data ... Read More

Access the Internet Archive Advanced Search/Scrape API with wayback (+ links to a new vignette & pkgdown site)

| | R
The wayback🔗 package has had an update to more efficiently retrieve mementos and added support for working with the Internet Archive’s advanced search+scrape API. Search/Scrape The search/scrape interface lets you examine the IA collections and download what you are after (programmatically). The main function is ia_scrape() but you can also ... Read More

The Evolution of Data Literacy at the U.S. Department of Energy + Finding Power Grid Cyber Attacks in a Data Haystack

| | data wrangling, R
I was chatting with some cyber-mates at a recent event and the topic of cyber attacks on the U.S. power-grid came up (as it often does these days). The conversation was brief, but the topic made its way into active memory and resurfaced when I saw today’s Data Is Plural ... Read More

Driving Drill Dynamically with Docker and Updating Storage Configurations On-the-fly with sergeant

| | Apache Drill, drill, R
The sergeant🔗 package has a minor update that adds REST API coverage for two “new” storage endpoints that make it possible to add, update and remove storage configurations on-the-fly without using the GUI or manually updating a config file. This is an especially handy feature when paired with Drill’s new, ... Read More

Simplifying World Tile Grid Creation with geom_wtg()

| | ggplot, R
Nowadays (I’ve seen that word used so much in journal articles lately that I could not resist using it) I’m using world tile grids more frequently as the need arises to convey the state of exposure of various services at a global (country) scale. Given that necessity fosters invention it ... Read More
Friday #rstats twofer: Finding macOS 32-bit apps & Processing Data from System Commands

| | Apple, macos, R
Apple has rung the death bell on 32-bit macOS apps and, if you’re running a recent macOS version on your Mac (which you should so you can get security updates), you likely see this alert from time to time: If you’re like me, you click through that and keep working but later ... Read More

Introducing ‘gepetto’ — a Splash-like REST API to Headless Chrome

| | R, web scraping
It’s been over a year since Headless Chrome was introduced and it has matured greatly over that time and has acquired a pretty large user base. The TLDR on it is that you can now use Chrome as you would any command-line interface (CLI) program and generate PDFs, images or ... Read More

In-brief: Using Bro connection logs with Apache Drill

| | Apache Drill
If you’ve got a directory full of Bro NSM logs, it’s easy to work with them in Apache Drill since they’re just tab-separated values (TSV) files by default. The most tedious part is mapping the columns to proper types and hopefully this saves at least one person from typing it ... Read More

Updates to the sergeant (Apache Drill connector) Package & a look at Apache Drill 1.14.0 release

| | Apache Drill, R
Apache Drill 1.14.0 was recently released, bringing with it many new features and a temporary incompatibility with the current rev of the MapR ODBC drivers. The Drill community expects new ODBC drivers to arrive shortly. The sergeant🔗 package is an alternative to ODBC for R users as it provides a dplyr ... Read More

In-brief: splashr update + High Performance Scraping with splashr, furrr & TeamHG-Memex’s Aquarium

| | R, web scraping
The development version of splashr now supports authenticated connections to Splash API instances. Just specify user and pass on the initial splashr::splash() call to use your scraping setup a bit more safely. For those not familiar with splashr and/or Splash: the latter is a lightweight alternative to tools like Selenium ... Read More