Archive.org was once nothing more than a curiosity: an attempt to save the websites of the past for future generations to experience. However, as digital marketing has evolved over the years, Archive.org has become an indispensable tool for the serious SEO. Let's take a look at what it is, who runs it, and how you can use it to avoid some pretty serious SEO disasters.
What Is Archive.org?
Archive.org is the website of the Internet Archive, a nonprofit that was set up in 1996. Its aim is purely historical: the organization wishes to preserve as much of the internet as possible. Websites go down and get recreated with surprising regularity. Being able to access this information once it has been removed or replaced will allow future generations to experience the early internet.
What Information Can I See About A Website In Archive.org?
The organization's crawlers are prolific. Considering the limited size of the nonprofit, they manage to crawl and store a huge amount of data on every website that was ever worth archiving (and many that weren't worth archiving too). Archive.org creates snapshots of individual pages of a website throughout time. Not all of the pages of a website will be added to their index, only the most important ones (unless you have a monster authority site). You can easily click through Archive.org to see how the website has changed over the years.
More popular sites tend to have more snapshots than smaller websites. As a rough rule, the more snapshots a website has in Archive.org, the more popular it is (or was). You can check the first and last dates that a site was saved, plus the total number of times it has been saved (and thus roughly how popular it is) using our tool. More information follows below the tool.
Check the first and last dates a URL was saved, and the number of times it has been saved, for up to 30 URLs at once
Url: The URL in question
First Date: The date that the URL was first captured by the Archive.org crawlers
Last Date: The most recent date that the URL was crawled by Archive.org
Saved Times: The total number of times the URL has been saved
Avg. SPM: 'Average Saves per Month' - our own metric derived from the data above
Popular?: Our own metric, which shows whether the URL has recently been archived and has been saved at least bi-monthly on average
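To make the columns above a little more concrete, here is a minimal Python sketch of how figures like these could be derived from a list of capture timestamps. It is not the tool's actual code. The function name summarize_captures, the 'months between first and last capture' formula for Avg. SPM, and both thresholds (RECENT_DAYS and MIN_AVG_SPM) are our own assumptions for illustration. In particular, 'bi-monthly' is read here as at least one save every two months.

```python
from datetime import datetime, timezone

# Hypothetical thresholds: assumptions for this sketch, not the tool's real values.
RECENT_DAYS = 180   # "recently archived" = last capture within roughly six months
MIN_AVG_SPM = 0.5   # "bi-monthly" read as at least one save every two months


def parse_wayback_timestamp(ts):
    """Parse a 14-digit Wayback-style timestamp (YYYYMMDDhhmmss) as UTC."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def summarize_captures(url, timestamps, now=None):
    """Build one table row (as a dict) from a URL's capture timestamps."""
    if not timestamps:
        raise ValueError("at least one capture timestamp is required")
    now = now or datetime.now(timezone.utc)
    captures = sorted(parse_wayback_timestamp(ts) for ts in timestamps)
    first, last = captures[0], captures[-1]

    # Assumed formula: saves divided by the months between first and last
    # capture, with a floor of one month so a single capture does not divide by zero.
    months = max((last - first).days / 30.44, 1.0)
    avg_spm = len(captures) / months

    popular = (now - last).days <= RECENT_DAYS and avg_spm >= MIN_AVG_SPM
    return {
        "url": url,
        "first_date": first.date().isoformat(),
        "last_date": last.date().isoformat(),
        "saved_times": len(captures),
        "avg_spm": round(avg_spm, 2),
        "popular": popular,
    }
```

Calling summarize_captures("example.com", ["20200101000000", "20200701000000", "20210101000000"]) returns a dict with the same fields as a row in the table. Passing a fixed now value makes the 'Popular?' result reproducible.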
You can choose to download the data in the table above as either a CSV file or an XLSX file by clicking on the icons below.
How Can I Use Archive.org For Whitehat SEO?
Checking the history of a URL is of paramount importance these days. We are living in a time where many domains have been registered and used before. When you are starting a new project and want to buy a new domain, it is important that it has a 'clean' history. Google holds serious grudges when it comes to web spam. Once a domain has been blacklisted, there is little chance of convincing them to give the domain a second chance - even if you had nothing to do with the previous owner.
By looking through Archive.org you will be able to see if the domain has ever been used before, and most importantly - if it has ever been a little bit 'dodgy'. If you see any signs of spam, scams, duplicate content, adult content, pharma, warez, hacking, hate speech, or anything else like that - avoid the domain like the plague. If you don't do this, you could invest countless hours (and dollars) into SEO for a domain that is never going to make it into Google's index because of its dark history. It happens to newbies all the time (and it happens to lazy experts sometimes too).
It takes two minutes to check. So do it. It's a very wise use of your time.
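If you are vetting more than a handful of domains, part of this check can be scripted. The sketch below is one possible approach, not an official recipe. It uses the Wayback Machine's public CDX endpoint to list roughly one successful capture per year for a domain, and turns each into a snapshot link you can open and eyeball for anything dodgy. The helper names (build_cdx_url, snapshot_links, fetch_snapshot_links) and the choice of one capture per year are our own assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=50):
    """Build a CDX query asking for about one HTTP 200 capture per year."""
    params = {
        "url": domain,
        "output": "json",
        "fl": "timestamp,original",   # only the fields needed for a link
        "filter": "statuscode:200",   # skip redirects and error pages
        "collapse": "timestamp:4",    # collapse on the year part of the timestamp
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_links(cdx_json):
    """Turn a CDX JSON response (header row first) into snapshot URLs."""
    rows = json.loads(cdx_json) if cdx_json.strip() else []
    if not rows:
        return []
    header, *captures = rows
    ts_i, orig_i = header.index("timestamp"), header.index("original")
    return [
        f"https://web.archive.org/web/{row[ts_i]}/{row[orig_i]}"
        for row in captures
    ]


def fetch_snapshot_links(domain):
    """Query the CDX endpoint over the network and return snapshot URLs."""
    with urlopen(build_cdx_url(domain), timeout=30) as response:
        return snapshot_links(response.read().decode("utf-8"))
```

Running fetch_snapshot_links on a domain you are considering gives you a short list of links, one per year of history, to click through. A script like this only saves you the clicking. You still need to read the old pages yourself to judge whether the history is clean.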
How Can I Use Archive.org For Blackhat SEO?
The PBN (private blog network) revolution must have given Archive.org a huge amount of extra traffic over recent years. Anyone using a PBN (and plenty of SEOs do) needs to check the history of the domains they are buying for their network, for the reasons we outlined above. A domain can have an excellent backlink profile, but if it's been blacklisted or used for dodgy things - it's worthless. When you consider the fact that these domains can go for several thousand dollars at a time, it's an essential part of domain buying due diligence.
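More nefarious Blackhat SEOs will actually scrape the data from Archive.org and resurrect the website with the old content back on it. We do not recommend this, if only from a legal point of view. We are not lawyers and cannot say for certain whether it's legal, but republishing someone else's content without permission looks a lot like copyright infringement, and we would be surprised if it were allowed.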
So there you have it, a little bit of information on Archive.org and how it can be used to your advantage for SEO purposes. The guys at the nonprofit are doing a great job, and if you have never checked the archive out before - we highly recommend you do. It's amusing to go back 15 years and see how ugly the internet once was. Enjoy!