web scraping - Tagged - Security Boulevard
Reality Check: Automated Shopping Bots are a Business Problem

Last week, I had the pleasure of participating in a webinar on automated shopping bots with Sandy Carielli, Security and Risk Analyst at Forrester Research. The webinar highlighted two things for me: ...

Audio Recordings Used to Copy Keys, Carnival Ransomware Attack, Social Media Profile Data Exposed

In episode 135 for August 24th 2020: Details on how researchers can use audio recordings of keys being used in locks to create copies, Carnival Cruise Line becomes the victim of a ...

Tales from the Front Lines: Why Simple Attacks Like Content Scraping are the Hardest to Block

Of all of the automated business logic abuse attacks, the simple act of copying and pasting content from one web page to another is the most difficult for any technology to stop ...
DC Court Ruling Reduces Webscraping Risk

In a decision that reduces some risk associated with webscraping, the United States District Court for the District of Columbia ruled that violating a website’s terms of service cannot alone be the ...
Quick Hit: Scraping javascript-“enabled” Sites with {htmlunit}

R, web scraping
I’ve mentioned {htmlunit} in passing before, but did not put any code in the blog post. Since I just updated {htmlunitjars} to the latest and greatest version, now might be a good ...

splashr 0.6.0 Now Uses the CRAN-nascent stevedore Package for Docker Orchestration

R, splash, splashr, web scraping
The splashr package [srht|GL|GH] — an alternative to Selenium for javascript-enabled/browser-emulated web scraping — is now at version 0.6.0 (still in dev-mode but on its way to CRAN in the next 14 ...
‘data:’ Scraping & Chart Reproduction : Arrows of Environmental Destruction

Today’s RSS feeds picked up this article by Marianne Sullivan, Chris Sellers, Leif Fredrickson, and Sarah Lamdan on the woeful state of enforcement actions by the U.S. Environmental Protection Agency (EPA). While ...

More “Scraping Ethics Gone Awry” and “Why Do This When There’s a Free API?”

web scraping
I can’t seem to free my infrequently-viewed email inbox from “you might like!” notices by the content-lock-in site Medium. This one made it to the iOS notification screen (otherwise I’d’ve been blissfully ...

Introducing ‘gepetto’ — a Splash-like REST API to Headless Chrome

R, web scraping
It’s been over a year since Headless Chrome was introduced and it has matured greatly over that time and has acquired a pretty large user base. The TLDR on it is that ...

In-brief: splashr update + High Performance Scraping with splashr, furrr & TeamHG-Memex’s Aquarium

R, web scraping
The development version of splashr now supports authenticated connections to Splash API instances. Just specify user and pass on the initial splashr::splash() call to use your scraping setup a bit more safely ...