The digital entropy of death: link rot

Hot on the heels of a grim blog about digital death comes…another blog about digital death. Except in this case, the recently deceased would be the links that tie the web together, otherwise known as link rot.

Link rot is a weird thing. Say I blog for Puppy Chow and I write an article about the best dog shows. For one of my examples, I link to an article with the URL “fabulous-puppy-show.html.” Since I’m Puppy Chow, my product has a decent shelf-life and my blog sticks around for a while. But now, if readers stumble upon that original Puppy Chow article and click on my example link, they land on a page about “Top 10 laptops of 2018.” What gives?

Over time, websites get taken offline, or companies start to run out of server space and delete old articles, or (arguably worse) they simply put new, unrelated content on old URLs (hence best laptops on a page with “puppy show” in the URL). One suspects some sites replace content on old URLs rather than create new ones because the existing URL already has a good PageRank in search engines. Why reinvent the wheel when you can keep driving traffic to one of your pages, regardless of what visitors expected to find there?

This is, as you would expect, enormously confusing. Even image hosts can cause headaches as old content is switched out without warning [1], [2]. Everything from legal threats and careless website rebuilding to privacy policy alterations or a confused employee breaking some HTML code can cause mayhem.

I’m melting…mellllllting

Not to put too fine a point on it, but portions of the web that we use on a daily basis are slowly, almost imperceptibly dying. That’s one of the reasons why sites such as archive.org exist. Top tip: if you have an online bio or linkdump of any kind for personal projects, save yourself a headache and link to pages on the Archive instead. You’ll probably have to move all your links over to it after around five years of “the page is still there, honest it is” anyway.
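If you do go the Archive route, the links themselves are easy enough to build by hand or in a script. The sketch below assumes the Wayback Machine's usual address pattern (the archive prefix, a 14-digit UTC timestamp, then the original URL), and that the Archive sends you to the capture nearest that timestamp. The function name and the example.com address are made up for illustration, and nothing here checks that a capture actually exists.

```python
from datetime import datetime, timezone

# Assumed address pattern: <prefix>/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for the capture nearest to `when`.

    `when` should be timezone-aware; it is converted to UTC before formatting.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Hypothetical example: the puppy show article as it looked in May 2018.
link = wayback_url(
    "http://example.com/fabulous-puppy-show.html",
    datetime(2018, 5, 1, tzinfo=timezone.utc),
)
print(link)
```

Pinning the timestamp to roughly when you first linked the page is the point: it keeps the puppy show from quietly turning into a laptop roundup.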

As regular readers are aware, we publish a lot of blog posts. We’ve been blogging since 2012, and long may it continue. Those posts naturally will link to all manner of websites and information, and we have absolutely zero control over those sites still being around in the future. In fact, every time someone links to a third-party site, they’re just sort of assuming the thing will still be there tomorrow. Maybe the site owner dies. Maybe it’s been hacked and sends you to Viagra spam. Maybe the city turns into sludge after a Bitcoin frenzy. Who knows; the point is, anything you link to today could be something entirely different tomorrow.

The long and the short of it

This problem is made worse when people use now-defunct link shorteners or similar services that suddenly have all their links pointed somewhere else. A social network uses its own shortened link for stat tracking before sending you to a redirect, which now sends you to something about balloons, which…

…and so on. Most popular link shorteners have some failsafes built in; many state that their links will never expire or change, which is great news. If you’re curious as to how many potential URLs a shortening service might have available before people have to worry about reusing links, a discussion on Stack Overflow will prove handy.
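For a rough feel of the numbers, here is a back-of-the-envelope sketch. It assumes short codes drawn from a 62-character alphabet (a-z, A-Z, 0-9); real services may use a different alphabet or reserve some codes, and the creation rate in the second function is a made-up figure, not a statistic from any actual shortener.

```python
import string

# Assumed alphabet: upper and lower case letters plus digits, 62 characters.
ALPHABET = string.ascii_letters + string.digits


def keyspace(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct short codes of exactly `length` characters."""
    return alphabet_size ** length


def years_until_exhausted(length: int, links_per_day: int) -> float:
    """Years before every code of this length is used, at a steady rate."""
    return keyspace(length) / (links_per_day * 365)


for n in range(4, 8):
    print(f"{n} characters: {keyspace(n):,} possible links")

# Hypothetical rate of one million new links per day, six-character codes.
print(f"{years_until_exhausted(6, 1_000_000):.1f} years")
```

Each extra character multiplies the pool by 62, so a service that runs low can simply lengthen its codes instead of recycling old ones.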

Adding another layer of complexity, there are services that offer time-limited shortened URLs; once they pass the expiration date, the shortened URL will no longer point to the original destination. Again, there is some room for ambiguity here: most services I’ve seen clearly state that the now-defunct shortened URL will not go back into the pool (to prevent unrelated final destinations showing up). At the same time, there are plenty of generic, cookie-cutter sites offering similar services with no readily available information about link reuse.

Ultimately, regardless of the setup used, the link will be permanently broken should the original service go down and not return. If the service does return, but was purchased by someone up to no good, in theory all those links could be reactivated, except now they point to malware or exploits. On top of this, many people conscious of their personal security will choose to avoid shortened links due to not being able to see the final destination. Sure, you can use link lengtheners to see where you’re going, but for many that’s just too much work.
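A link lengthener is not magic, either. At its simplest it asks each server in turn where it would send you, without ever loading the final page. The sketch below is one way to do that with Python's standard library, not a description of how any particular lengthener works; the function names and the hostnames in the usage example are invented for illustration. It only follows HTTP redirects, so JavaScript or meta-refresh tricks will slip past it.

```python
import http.client
from typing import Callable, List, Optional
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head_location(url: str, timeout: float = 5.0) -> Optional[str]:
    """Send one HEAD request; return the redirect target, or None if there is none."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in REDIRECT_CODES:
            location = response.getheader("Location")
            # Location may be relative, so resolve it against the current URL.
            return urljoin(url, location) if location else None
        return None
    finally:
        conn.close()


def redirect_chain(
    url: str,
    fetch: Callable[[str], Optional[str]] = head_location,
    max_hops: int = 10,
) -> List[str]:
    """Return every URL visited, stopping at a non-redirect, a loop, or max_hops."""
    chain = [url]
    while len(chain) <= max_hops:
        target = fetch(chain[-1])
        if target is None or target in chain:
            break
        chain.append(target)
    return chain


# Offline usage with a made-up redirect table instead of real requests.
fake_web = {
    "https://sho.rt/abc": "https://tracker.example/r?id=1",
    "https://tracker.example/r?id=1": "https://example.com/balloons",
}
for hop in redirect_chain("https://sho.rt/abc", fetch=fake_web.get):
    print(hop)
```

Passing the lookup in as `fetch` keeps the chain-walking logic separate from the network, which makes it easy to try out against a table of known redirects before pointing it at anything live.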

One way or another, we’re building a deliberate layer of impermanence over the top of theoretically stable links and content. This is such a problem that Internet Archive created 301Works back in 2009 to combat link rot caused by the flimsy structure we keep packing around the base of the Internet.

A helping hand…for someone else’s PageRank

Some services and opportunities have sprung up in the wake of the web wonkiness of link rot. Over the last year or so, we’ve noticed an uptick in emails from individuals or businesses letting us know that old links on our old blogs are dead. At this point, the oldest dated blog we’ve received a message about was from 2014, in relation to a long-dead Apple phish.

The email typically then goes on to suggest swapping out the old, dead link with their website instead, or (in some cases) offering an additional selection of SEO services for a fee. Some of them are persistent, too; two or three mails will go into the spam box before they stop sending. Here’s a snippet from one mail chain:

[Image: Link fixing]

Generally speaking, you probably don’t want to go adding links to websites you’re not familiar with, as you’ve no idea what you’re directing your readers to. If the URL seems clean (aka not malicious), feel free to check it out and replace an old link in a still-relevant post. (Going through the work of replacing dead links in a practically dead blog hardly seems worth the effort, though.) Most folks likely won’t hire the first passing SEO expert through a random mailshot, either—a wise decision, as your own spam folder can probably attest.

Turning the tide

There are some ways to combat link rot, but ultimately the best defence we have is probably the various archives hoovering up portions of the net. Of course, there’s no guarantee the archives themselves will be online forever—but I have a feeling that when we reach the point where those are going under, we’ll probably have bigger things to worry about.



This is a Security Bloggers Network syndicated blog post authored by Christopher Boyd. Read the original post at: Malwarebytes Labs