In the early 2000s, when Google announced an update, it usually meant a significant change in the search results. Google’s Update Florida in 2003 was a giant step forward. This article discusses what Update Florida probably was and how it affects SEO today.
Update Florida happened in November 2003, just before the Christmas shopping season and just before PubCon Florida in Orlando. It was immediately perceived as a change in how links are calculated at Google. Many innocent non-spam sites lost rankings and were labeled “false positives.”
The shake-up continued well into 2004. Matt Cutts actively sought examples of false positives. Many consultants sent example URLs of innocent sites that had been affected.
It wasn’t until sometime in January or February 2004 that rankings began to stabilize and Google had sorted through the false positives. I’m not sure how much in-house testing Google had done prior to the release of Update Florida, but in my experience, it felt as if there was very little pre-release modeling of how it would affect innocent sites.
At the time there were many theories about what Google was doing. I remember some prominent (black hat at the time) SEOs speculating that Google was using OCR to identify the “buy” button on eCommerce sites in order to weed them out of informational queries.
But the predominant view was that the update had something to do with links. I believe that is what Update Florida really was. It was a link analysis algorithm. At the time, nobody, including me, knew much about how link analysis worked.
At a Search Engine Strategies San Jose session about Googlebot, Marissa Mayer revealed that Google devalued links from irrelevant pages. This was important because up until then a high PageRank link could help a site rank, regardless of topic.
Was this what Update Florida was about? I didn’t think so at the time.
In 2005, at Pubcon New Orleans, Google engineers revealed that they were using statistical link analysis to weed out spam sites. It was announced at a super session of ten Google engineers. An engineer spoke about statistical analysis, then opened the floor to informal one-on-one discussions.
That was the first I’d heard of statistical analysis and it was a mind-blowing revelation, even more important than the revelation that Google devalued PageRank from irrelevant sites.
Update Florida was a major disruption, far more than just a simple devaluation of irrelevant links. Google has never disclosed what Update Florida was, but in my opinion, the obvious candidate is statistical analysis.
What is Statistical Analysis?
Statistical analysis for links is the process of plotting the characteristics of a web page or web site on a graph. You can tally up statistics on things like the average number of outbound links per web page, the percentage of outbound links that contain keyword-rich anchor text, and so on.
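To make the idea of tallying link statistics concrete, here is a minimal Python sketch. It is only an illustration, not anything Google has disclosed: the sample pages, the anchor texts, the `KEYWORDS` set, and the helper names `is_keyword_rich` and `link_stats` are all hypothetical.

```python
from statistics import mean

# Hypothetical crawl data: page URL -> list of (target, anchor text) pairs.
# The URLs, anchors, and keyword list are made up for illustration.
pages = {
    "example.com/a": [("site1.example", "click here"), ("site2.example", "cheap widgets")],
    "example.com/b": [("site3.example", "buy widgets online")],
    "example.com/c": [
        ("site4.example", "our partners"),
        ("site5.example", "homepage"),
        ("site6.example", "cheap widgets"),
    ],
}

# Assumed set of commercial terms that make an anchor "keyword rich".
KEYWORDS = {"cheap", "buy", "widgets"}


def is_keyword_rich(anchor_text, keywords=KEYWORDS):
    """Return True if any word in the anchor text is a target keyword."""
    return any(word in keywords for word in anchor_text.lower().split())


def link_stats(pages):
    """Tally simple outbound-link statistics across a set of pages."""
    if not pages:
        return {"avg_outbound_links": 0.0, "pct_keyword_anchors": 0.0}
    counts = [len(links) for links in pages.values()]
    anchors = [anchor for links in pages.values() for _, anchor in links]
    rich = sum(is_keyword_rich(a) for a in anchors)
    pct = 100.0 * rich / len(anchors) if anchors else 0.0
    return {"avg_outbound_links": mean(counts), "pct_keyword_anchors": pct}


stats = link_stats(pages)
print(stats)
```

Run over a large crawl instead of three toy pages, numbers like these are what would be plotted to see what a typical site looks like.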
Google had been researching statistical properties of links since at least 2001. A paper entitled “Who Links to Whom: Mining Linkage between Web Sites” details work on modeling statistical properties of web pages and web sites. The authors also noticed how certain properties seemed to indicate the presence of spam.
One of the authors of this study is Krishna Bharat, who would later go on to found and head Google News. He is a co-author of the famous Hilltop algorithm and a creator of Google’s LocalRank algorithm.
By June of 2004, Microsoft had published the famous research paper “Spam, Damn Spam, and Statistics.” It is this research paper that says out loud what search engines had been developing in secrecy. It reveals the mature vision of statistical analysis for finding spam. If you have never read this research paper, I strongly encourage you to read it. It will give you a good idea about what statistical analysis is in relation to spam fighting and SEO.
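The general idea behind this kind of spam detection is that sites whose statistics sit far outside the normal distribution deserve a closer look. The Python sketch below illustrates that idea with a modified z-score based on the median, which is my own choice of outlier test and not necessarily the method used in the paper or by any search engine. The site names, link counts, the 3.5 threshold, and the function name `flag_outliers` are all assumptions.

```python
from statistics import median

# Hypothetical outbound-link counts per site; the numbers are invented.
outbound_links = {
    "site-a.example": 12,
    "site-b.example": 15,
    "site-c.example": 9,
    "site-d.example": 14,
    "site-e.example": 11,
    "site-f.example": 480,
}


def flag_outliers(values, threshold=3.5):
    """Return names whose value is a statistical outlier.

    Uses the modified z-score: 0.6745 * (x - median) / MAD, where MAD is
    the median absolute deviation. The threshold is an assumed cutoff.
    """
    if not values:
        return []
    med = median(list(values.values()))
    mad = median([abs(v - med) for v in values.values()])
    if mad == 0:
        # No spread around the median, so no meaningful score can be computed.
        return []
    flagged = [
        name
        for name, v in values.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    ]
    return sorted(flagged)


print(flag_outliers(outbound_links))
```

An outlier is not proof of spam. The false positives described earlier are a reminder that a site can look statistically unusual for innocent reasons.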
The timeline for the development of link analysis fits the timeline for when Update Florida happened, in late 2003. Seven months later Microsoft was publishing research papers about it. Google had been researching mining the web graph since at least 2001 (and that paper cites research that was published before 2001).
I believe it is reasonable to assume that Update Florida was Google’s first attempt at using statistical analysis to find spam. It was a bumpy debut, which only highlighted how new and important this algorithm was.
Link analysis changed how we talk about SEO. It ushered in phrases like “looking natural” in relation to linking patterns. Even today, the SEO industry is still worried about “looking natural” and for good reason.
There aren’t many research papers that focus on link analysis these days. It may be because the technology is fully matured.
The emphasis today is on machine learning in the areas of understanding concepts, understanding content and identifying user intent.
Nevertheless, link analysis, whether or not it was a part of Update Florida, may well be a part of Google’s core algorithm. It’s an easy way to catch obvious spam and remove it. The changes Update Florida brought to how we approach the task of SEO are still a part of the SEO vocabulary.